7 changes: 3 additions & 4 deletions configs/DEMO_RISCV.yaml
Expand Up @@ -26,11 +26,10 @@ Queue-Sizes:
Load: 64
Store: 36
Branch-Predictor:
Type: "Perceptron"
BTB-Tag-Bits: 11
Saturating-Count-Bits: 2
Global-History-Length: 10
RAS-entries: 5
Fallback-Static-Predictor: "Always-Taken"
Global-History-Length: 19
RAS-entries: 1
L1-Data-Memory:
Interface-Type: Fixed
L1-Instruction-Memory:
Expand Down
5 changes: 2 additions & 3 deletions configs/a64fx.yaml
Expand Up @@ -29,11 +29,10 @@ Queue-Sizes:
Load: 40
Store: 24
Branch-Predictor:
Type: "Perceptron"
BTB-Tag-Bits: 11
Saturating-Count-Bits: 2
Global-History-Length: 11
Global-History-Length: 19
RAS-entries: 8
Fallback-Static-Predictor: "Always-Taken"
L1-Data-Memory:
Interface-Type: Fixed
L1-Instruction-Memory:
Expand Down
5 changes: 2 additions & 3 deletions configs/a64fx_SME.yaml
Expand Up @@ -31,11 +31,10 @@ Queue-Sizes:
Load: 40
Store: 24
Branch-Predictor:
Type: "Perceptron"
BTB-Tag-Bits: 11
Saturating-Count-Bits: 2
Global-History-Length: 11
Global-History-Length: 19
RAS-entries: 8
Fallback-Static-Predictor: "Always-Taken"
L1-Data-Memory:
Interface-Type: Fixed
L1-Instruction-Memory:
Expand Down
9 changes: 4 additions & 5 deletions configs/m1_firestorm.yaml
Expand Up @@ -25,11 +25,10 @@ Queue-Sizes:
Load: 130
Store: 60
Branch-Predictor:
BTB-Tag-Bits: 11
Saturating-Count-Bits: 2
Global-History-Length: 11
RAS-entries: 8
Fallback-Static-Predictor: "Always-Taken"
Type: "Perceptron"
BTB-Tag-Bits: 11
Global-History-Length: 19
RAS-entries: 8
L1-Data-Memory:
Interface-Type: Fixed
L1-Instruction-Memory:
Expand Down
5 changes: 2 additions & 3 deletions configs/sst-cores/a64fx-sst.yaml
Expand Up @@ -29,11 +29,10 @@ Queue-Sizes:
Load: 40
Store: 24
Branch-Predictor:
Type: "Perceptron"
BTB-Tag-Bits: 11
Saturating-Count-Bits: 2
Global-History-Length: 11
Global-History-Length: 19
RAS-entries: 8
Fallback-Static-Predictor: "Always-Taken"
L1-Data-Memory:
Interface-Type: External
L1-Instruction-Memory:
Expand Down
5 changes: 2 additions & 3 deletions configs/sst-cores/m1_firestorm-sst.yaml
Expand Up @@ -25,11 +25,10 @@ Queue-Sizes:
Load: 130
Store: 60
Branch-Predictor:
Type: "Perceptron"
BTB-Tag-Bits: 11
Saturating-Count-Bits: 2
Global-History-Length: 11
Global-History-Length: 11
RAS-entries: 8
Fallback-Static-Predictor: "Always-Taken"
L1-Data-Memory:
Interface-Type: External
L1-Instruction-Memory:
Expand Down
3 changes: 1 addition & 2 deletions configs/sst-cores/tx2-sst.yaml
Expand Up @@ -27,11 +27,10 @@ Queue-Sizes:
Load: 64
Store: 36
Branch-Predictor:
Type: "Perceptron"
BTB-Tag-Bits: 11
Saturating-Count-Bits: 2
Global-History-Length: 10
RAS-entries: 5
Fallback-Static-Predictor: "Always-Taken"
L1-Data-Memory:
Interface-Type: External
L1-Instruction-Memory:
Expand Down
5 changes: 2 additions & 3 deletions configs/tx2.yaml
Expand Up @@ -27,11 +27,10 @@ Queue-Sizes:
Load: 64
Store: 36
Branch-Predictor:
Type: "Perceptron"
BTB-Tag-Bits: 11
Saturating-Count-Bits: 2
Global-History-Length: 10
Global-History-Length: 19
RAS-entries: 5
Fallback-Static-Predictor: "Always-Taken"
L1-Data-Memory:
Interface-Type: Fixed
L1-Instruction-Memory:
Expand Down
19 changes: 18 additions & 1 deletion docs/sphinx/developer/components/branchPred.rst
Expand Up @@ -31,7 +31,24 @@ Branch Target Buffer (BTB)
If the supplied branch type is ``Unconditional``, then the predicted direction is overridden to be taken. If the supplied branch type is ``Conditional`` and the predicted direction is not taken, then the predicted target is overridden to be the next sequential instruction.

Return Address Stack (RAS)
Identified through the supplied branch type, Return instructions pop values off of the RAS to get their branch target whilst Branch-and-Link instructions push values onto the RAS, for use by a proceeding Return instruction.
Identified through the supplied branch type, Return instructions pop values off of the RAS to get their branch target whilst Branch-and-Link instructions push values onto the RAS, for later use by the Branch-and-Link instruction's corresponding Return instruction.

Static Prediction
Based on the chosen static prediction method of "always taken" or "always not taken", the n-bit saturating counter values in the initial entries of the BTB structure are filled with the weakest variant of taken or not-taken respectively.

Perceptron Predictor
--------------------
The ``PerceptronPredictor`` has the same overall structure as the ``GenericPredictor`` but replaces the saturating counter as a means for direction prediction with a perceptron. The ``PerceptronPredictor`` contains the following logic.

Global History
For indexing relevant prediction structures and for retrieving a direction from the perceptrons, a global history can be utilised. The global history value uses n bits to store the n most recent branch direction outcomes, with the left-most bit being the oldest.

Branch Target Buffer (BTB)
For each entry, the BTB stores the most recent target along with a perceptron for an associated direction. The indexing of this structure uses the lower, non-zero bits of an instruction address XOR'ed with the current global branch history value.

The direction prediction is obtained from the perceptron by taking its dot-product with the global history. The prediction is not taken if this is negative, or taken otherwise. The perceptron is updated when its prediction is wrong or when the magnitude of the dot-product is below a pre-determined threshold (i.e., the confidence of the prediction is low). To update, each ith weight of the perceptron is incremented if the actual outcome of the branch is the same as the ith bit of ``globalHistory_``, and decremented otherwise.

If the supplied branch type is ``Unconditional``, then the predicted direction is overridden to be taken. If the supplied branch type is ``Conditional`` and the predicted direction is not taken, then the predicted target is overridden to be the next sequential instruction.

Return Address Stack (RAS)
Identified through the supplied branch type, Return instructions pop values off of the RAS to get their branch target whilst Branch-and-Link instructions push values onto the RAS, for later use by the Branch-and-Link instruction's corresponding Return instruction.
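
The indexing, prediction, and update rules described for the ``PerceptronPredictor`` can be sketched roughly as follows. This is a minimal illustration rather than the code added by this change: the function names (`btbIndex`, `dotProduct`, `predictTaken`, `train`) are hypothetical, the `>> 2` assumes 4-byte aligned instructions, the last weight is treated as a bias term as in Jimenez and Lin, and the saturation of the `int8_t` weights is a choice made here to avoid overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: XOR the address with the global history and keep the
// low btbBits bits. The ">> 2" assumes 4-byte aligned instructions.
uint64_t btbIndex(uint64_t address, uint64_t history, uint64_t btbBits) {
  assert(btbBits < 64);
  return ((address >> 2) ^ history) & ((uint64_t{1} << btbBits) - 1);
}

// Dot product of a perceptron with the history. A set history bit counts as
// +1 (taken) and a clear bit as -1 (not taken). The final weight is assumed
// to be a bias whose input is always +1.
int64_t dotProduct(const std::vector<int8_t>& weights, uint64_t history) {
  assert(!weights.empty() && weights.size() <= 65);
  const std::size_t historyLength = weights.size() - 1;
  int64_t sum = weights[historyLength];
  for (std::size_t i = 0; i < historyLength; i++) {
    const bool bitSet = (history >> i) & 1;
    sum += bitSet ? weights[i] : -weights[i];
  }
  return sum;
}

// Not taken if the dot product is negative, taken otherwise.
bool predictTaken(const std::vector<int8_t>& weights, uint64_t history) {
  return dotProduct(weights, history) >= 0;
}

// Update on a misprediction or when the output magnitude is below threshold.
void train(std::vector<int8_t>& weights, uint64_t history, bool taken,
           int64_t threshold) {
  const int64_t output = dotProduct(weights, history);
  const bool predictedTaken = output >= 0;
  if (predictedTaken == taken && std::abs(output) >= threshold) return;

  // Saturating step so the int8_t weights cannot wrap (a choice made here).
  auto nudge = [](int8_t& w, bool up) {
    if (up && w < INT8_MAX) {
      w++;
    } else if (!up && w > INT8_MIN) {
      w--;
    }
  };
  const std::size_t historyLength = weights.size() - 1;
  for (std::size_t i = 0; i < historyLength; i++) {
    const bool bitSet = (history >> i) & 1;
    nudge(weights[i], bitSet == taken);  // agree with outcome: increment
  }
  nudge(weights[historyLength], taken);  // bias follows the outcome
}
```

With all weights at zero the dot product is zero, so the first prediction is taken. Repeated calls to `train` with a not-taken outcome push the output negative until its magnitude reaches the threshold, after which the weights stop changing.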
11 changes: 7 additions & 4 deletions docs/sphinx/user/configuring_simeng.rst
Expand Up @@ -145,20 +145,23 @@ The Branch-Prediction section contains those options to parameterise the branch

The current options include:

Type
The type of branch predictor that is used. The options are ``Generic`` and ``Perceptron``. Both types of predictor use a branch target buffer with each entry containing a direction prediction mechanism and a target address. The direction predictor used in ``Generic`` is a saturating counter, and in ``Perceptron`` it is a perceptron.

BTB-Tag-Bits
The number of bits used to denote an entry in the Branch Target Buffer (BTB). For example, a ``bits`` value of 12 could denote 4096 entries with the calculation 1 << ``bits``.
The number of bits used to index the entries in the Branch Target Buffer (BTB). The number of entries in the BTB is obtained from the calculation: 1 << ``bits``. For example, a ``bits`` value of 12 would result in a BTB with 4096 entries.

Saturating-Count-Bits
The number of bits used in the saturating counter value.
Only needed for a ``Generic`` predictor. The number of bits used in the saturating counter value.

Global-History-Length
The number of bits used to record the global history of branch directions. Each bit represents one branch direction.
The number of bits used to record the global history of branch directions. Each bit represents one branch direction. For ``PerceptronPredictor``, this dictates the size of the perceptrons (with each perceptron having Global-History-Length + 1 weights).

RAS-entries
The number of entries in the Return Address Stack (RAS).

Fallback-Static-Predictor
The static predictor used when no dynamic prediction is available. The options are either ``"Always-Taken"`` or ``"Always-Not-Taken"``.
Only needed for a ``Generic`` predictor. The static predictor used when no dynamic prediction is available. The options are either ``"Always-Taken"`` or ``"Always-Not-Taken"``.
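
As a rough illustration of how ``BTB-Tag-Bits`` and ``Global-History-Length`` combine for a ``Perceptron`` predictor, the sketch below computes the number of BTB entries (1 << ``bits``) and the number of weights per perceptron (Global-History-Length + 1). The struct and function names are hypothetical and are not part of SimEng.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of the structures implied by the two options.
struct PerceptronBtbShape {
  uint64_t entries;          // 1 << BTB-Tag-Bits
  uint64_t weightsPerEntry;  // Global-History-Length + 1
  uint64_t totalWeights;     // entries * weightsPerEntry
};

PerceptronBtbShape btbShape(uint64_t btbTagBits,
                            uint64_t globalHistoryLength) {
  assert(btbTagBits < 64);  // keep the shift well defined
  PerceptronBtbShape shape;
  shape.entries = uint64_t{1} << btbTagBits;
  shape.weightsPerEntry = globalHistoryLength + 1;
  shape.totalWeights = shape.entries * shape.weightsPerEntry;
  return shape;
}
```

For example, ``BTB-Tag-Bits: 11`` with ``Global-History-Length: 19`` gives 2048 entries of 20 weights each, or 40960 weights in total.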

.. _l1dcnf:

Expand Down
1 change: 1 addition & 0 deletions src/include/simeng/CoreInstance.hh
Expand Up @@ -8,6 +8,7 @@
#include "simeng/FixedLatencyMemoryInterface.hh"
#include "simeng/FlatMemoryInterface.hh"
#include "simeng/GenericPredictor.hh"
#include "simeng/PerceptronPredictor.hh"
#include "simeng/SpecialFileDirGen.hh"
#include "simeng/arch/Architecture.hh"
#include "simeng/arch/aarch64/Architecture.hh"
Expand Down
88 changes: 88 additions & 0 deletions src/include/simeng/PerceptronPredictor.hh
@@ -0,0 +1,88 @@
#pragma once

#include <deque>
#include <map>
#include <vector>

#include "simeng/BranchPredictor.hh"
#include "simeng/config/SimInfo.hh"

namespace simeng {

/** A Perceptron branch predictor implementing the branch predictor described in
* Jimenez and Lin ("Dynamic branch prediction with perceptrons", IEEE High-
* Performance Computer Architecture Symposium Proceedings (2001), 197-206 --
* https://www.cs.utexas.edu/~lin/papers/hpca01.pdf).
* The following predictors have been included:
*
* - Static predictor based on pre-allocated branch type.
*
* - A Branch Target Buffer (BTB) with a local and global indexing scheme and a
* perceptron.
*
* - A Return Address Stack (RAS) is also in use.
*/

class PerceptronPredictor : public BranchPredictor {
public:
/** Initialise predictor models. */
PerceptronPredictor(ryml::ConstNodeRef config = config::SimInfo::getConfig());
~PerceptronPredictor();

/** Generate a branch prediction for the supplied instruction address,
* branch type, and known branch offset. The offset defaults to 0, meaning it
* is not known. Returns a branch direction and branch target address. */
BranchPrediction predict(uint64_t address, BranchType type,
int64_t knownOffset = 0) override;

/** Updates appropriate predictor model objects based on the address and
* outcome of the branch instruction. */
void update(uint64_t address, bool taken, uint64_t targetAddress,
BranchType type) override;

/** Provides RAS rewinding behaviour. */
void flush(uint64_t address) override;

private:
/** Returns the dot product of a perceptron and a history vector. Used to
* determine a direction prediction. */
int64_t getDotProduct(const std::vector<int8_t>& perceptron,
uint64_t history);

/** The length in bits of the BTB index; BTB will have 2^bits entries. */
uint64_t btbBits_;

/** A 2^bits length vector of pairs containing a perceptron with
* globalHistoryLength_ + 1 inputs, and a branch target.
* The perceptrons are used to provide a branch direction prediction by
* taking a dot product with the global history, as described
* in Jimenez and Lin. */
std::vector<std::pair<std::vector<int8_t>, uint64_t>> btb_;

/** The previous hashed index for an address. */
std::map<uint64_t, uint64_t> btbHistory_;

/** An n-bit history of previous branch directions where n is equal to
* globalHistoryLength_. */
uint64_t globalHistory_ = 0;

/** The number of previous branch directions recorded globally. */
uint64_t globalHistoryLength_;

/** The magnitude of the dot product of the perceptron and the global history,
* below which the perceptron's weights must be updated. */
uint64_t trainingThreshold_;

/** A return address stack. */
std::deque<uint64_t> ras_;

/** RAS history with instruction address as the keys. A non-zero value
* represents the target prediction for a return instruction and a 0 entry for
* a branch-and-link instruction. */
std::map<uint64_t, uint64_t> rasHistory_;

/** The size of the RAS. */
uint64_t rasSize_;
};

} // namespace simeng
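
The roles of `ras_`, `rasHistory_`, and `rasSize_` declared above, together with the rewinding that `flush` is documented to provide, might be sketched like this. It is an illustration under assumptions, not the implementation in `PerceptronPredictor.cc`: the class and method names are hypothetical, dropping the oldest entry when the stack is full is a choice made here, and return addresses are assumed to be non-zero so that 0 can mark a branch-and-link.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>

// Hypothetical bounded return address stack with per-instruction history.
class ReturnAddressStack {
 public:
  explicit ReturnAddressStack(uint64_t maxSize) : maxSize_(maxSize) {
    assert(maxSize > 0);
  }

  // Branch-and-link at `address`: push the return address and record 0.
  void pushCall(uint64_t address, uint64_t returnAddress) {
    if (ras_.size() >= maxSize_) ras_.pop_front();  // drop oldest (assumed)
    ras_.push_back(returnAddress);
    history_[address] = 0;
  }

  // Return at `address`: pop the predicted target, or give 0 if empty.
  uint64_t popReturn(uint64_t address) {
    if (ras_.empty()) return 0;
    const uint64_t target = ras_.back();
    ras_.pop_back();
    history_[address] = target;  // non-zero marks a return
    return target;
  }

  // Rewind the speculative effect of the instruction at `address`.
  void flush(uint64_t address) {
    auto it = history_.find(address);
    if (it == history_.end()) return;
    if (it->second != 0) {
      ras_.push_back(it->second);  // undo a pop
    } else if (!ras_.empty()) {
      ras_.pop_back();  // undo a push
    }
    history_.erase(it);
  }

  std::size_t size() const { return ras_.size(); }

 private:
  std::deque<uint64_t> ras_;
  std::map<uint64_t, uint64_t> history_;
  uint64_t maxSize_;
};
```

Flushing a return re-pushes the target it consumed, so a later prediction for the same return sees the same stack contents as before the mis-speculation.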
1 change: 1 addition & 0 deletions src/lib/CMakeLists.txt
Expand Up @@ -42,6 +42,7 @@ set(SIMENG_SOURCES
FlatMemoryInterface.cc
GenericPredictor.cc
Instruction.cc
PerceptronPredictor.cc
RegisterFileSet.cc
RegisterValue.cc
SpecialFileDirGen.cc
Expand Down
9 changes: 7 additions & 2 deletions src/lib/CoreInstance.cc
Expand Up @@ -219,8 +219,13 @@ void CoreInstance::createCore() {
arch_ = std::make_unique<arch::aarch64::Architecture>(kernel_);
}

// Construct branch predictor object
predictor_ = std::make_unique<GenericPredictor>();
std::string predictorType =
config_["Branch-Predictor"]["Type"].as<std::string>();
if (predictorType == "Generic") {
predictor_ = std::make_unique<GenericPredictor>();
} else if (predictorType == "Perceptron") {
predictor_ = std::make_unique<PerceptronPredictor>();
}

// Extract the port arrangement from the config file
auto config_ports = config_["Ports"];
Expand Down
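
In the `createCore` hunk above, `predictor_` is left unset if ``Type`` matches neither string. The diff does not show whether configuration validation elsewhere already rejects other values. If it does not, one possible way to make the selection fail loudly is sketched below. The `Predictor`, `GenericStandIn`, and `PerceptronStandIn` types and the `makePredictor` function are hypothetical stand-ins, not SimEng classes.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for BranchPredictor and its two implementations.
struct Predictor {
  virtual ~Predictor() = default;
  virtual std::string name() const = 0;
};
struct GenericStandIn : Predictor {
  std::string name() const override { return "Generic"; }
};
struct PerceptronStandIn : Predictor {
  std::string name() const override { return "Perceptron"; }
};

// Select a predictor by its configured type, rejecting unknown values
// instead of returning a null pointer.
std::unique_ptr<Predictor> makePredictor(const std::string& type) {
  if (type == "Generic") return std::make_unique<GenericStandIn>();
  if (type == "Perceptron") return std::make_unique<PerceptronStandIn>();
  throw std::invalid_argument("Unknown Branch-Predictor Type: " + type);
}
```

A caller would pass the string read from the ``Branch-Predictor`` section and either use the returned object or report the exception message to the user.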