Perceptron-based Coherence Predictors

Naveen R. Iyer

Publication: Perceptron-based Coherence Predictors. D. Ghosh, J.B. Carter, and H. Duame. In the Proceedings of the 2nd Workshop on Chip Multiprocessor Memory Systems and Interconnects (CMP-MSI), June 2008, in conjunction with ISCA-35 (35th International Symposium on Computer Architecture).


Coherence Misses

Coherence misses are misses due to invalidations when using an invalidation-based cache coherence protocol.

Assumption: a simple MSI directory-based cache coherence protocol among L1 caches (private split caches), and a CMP architecture consisting of 4, 8, or 16 in-order, single-issue processors.

With increasing cache sizes and number of cores in future CMPs, coherence misses will account for a larger fraction of all cache misses.


Perceptron Learning Model

• Perceptron: a computer model capable of recognizing many classes of patterns. Perceptrons are simple to implement, have no tunable parameters (learning rate, norm, etc.), and tend to work well empirically.

• Represented by a vector whose elements are the current weights associated with specific features that are used to predict a particular outcome.

• The output y of a perceptron is binary and is given by the sign of the dot product of a weights vector w1..n and an input features vector x1..n: y = sign(w1x1 + w2x2 + … + wnxn).


Perceptron Learning Model (Contd..)

• If the value of y is positive, the perceptron is said to predict a positive outcome. A feedback-based learning algorithm is used to train a perceptron to identify positive correlations between its inputs and the desired output.

• The learning algorithm continually adjusts the weights vector based on whether (and the degree to which) the perceptron’s previous prediction was correct.

• The following pseudocode illustrates a simple unbiased prediction algorithm using additive updates to adjust weights:
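The pseudocode itself did not survive this transcription; a minimal Python sketch of what such an unbiased, additive-update perceptron could look like (function names are illustrative, not taken from the paper):

```python
# Minimal sketch, assuming +1/-1 feature values x_i and a +1/-1 target t.

def predict(w, x):
    # Output is the sign of the dot product of weights and features;
    # a non-positive dot product is treated as a negative prediction.
    dot = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if dot > 0 else -1

def train(w, x, t):
    # Additive update: adjust weights only when the prediction was wrong,
    # moving each weight toward (t = +1) or away from (t = -1) its feature.
    if predict(w, x) != t:
        for i in range(len(w)):
            w[i] += t * x[i]
    return w

w = [0, 0, 0]
w = train(w, [1, -1, 1], 1)   # initial prediction is wrong, so w -> [1, -1, 1]
```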


Design and Implementation

The goal of the coherence predictor is to decide, at the time of each write, whether that write is the last one by a producer and whether it is likely to be consumed by another node.

If so, a write-update is performed by downgrading the writer to shared mode and pushing the dirty data to a set of predicted consumers.

Likely consumers are predicted using a simple heuristic described later.


Perceptron Predictor Block Diagram


Feature Set (Input Set)

• A feature is an observable value that can be used to make a prediction.

• Features can include data such as the last writer of a cache line or whether the last access was a read or write.

• The number of features that we track per cache line is a design parameter that can be adjusted.

• To minimize space requirements, a data memory access is characterized by two simple attributes: (i) the requesting processor ID and (ii) whether the access was a READ or a WRITE.
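Under this two-attribute characterization, one plausible encoding (an assumption for illustration, not the paper's exact format) is a bit vector with a one-hot processor ID followed by a READ/WRITE bit:

```python
def encode_access(proc_id, is_write, num_procs=4):
    # Hypothetical layout: one-hot processor ID followed by one
    # READ/WRITE bit, using +1/-1 values as perceptron inputs.
    bits = [-1] * num_procs
    bits[proc_id] = 1
    bits.append(1 if is_write else -1)
    return bits

encode_access(2, True)   # -> [-1, -1, 1, -1, 1]: processor 2, WRITE
```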


Update-Predict Algorithm

At the time of a memory write, we are able both to determine whether the last prediction was correct and to make a prediction about whether this write should perform a write-update.

Consequently, no training is needed at the time of a memory read.

Each time the perceptron makes a prediction, its weights are updated based on whether or not the prediction was correct.


Update-Predict Algorithm (Contd..)

To determine whether the last prediction was correct, a record is kept of what that prediction was (PUSH or NO-PUSH) and the set of nodes to which data was pushed (if any).

Let S0 (consumer copyset) and S1 (current copyset) denote the complete set of nodes that have a shared copy at the time of the previous and current write operation, respectively.

If PUSH was predicted, the prediction is accurate iff at least one node in S1 is also in S0, not including the node that performed the previous write. If NO-PUSH was predicted, the prediction is accurate iff no node in S1 is in S0.

Otherwise, the prediction was incorrect.


Update-Predict Algorithm (Contd..)

On a write access to a coherence block, the following steps are taken:

1. Fetch: The corresponding perceptron weights W, copyset C, and block access history H are fetched.

2. Determine Truth: Determine whether any of the nodes that had a shared copy before the last write (S0) was also a sharer before the current write operation (S1), not including the last writer. If so, the truth (correct outcome) was that we should have pushed after the last write (t = 1); otherwise the truth was not to push (t = −1).


Update-Predict Algorithm (Contd..)

3. Update Weights (based on previous prediction): Compare the correct outcome with the prediction made at the time of the last write (p). The training algorithm updates the weights in W as described in Section 2: if the prediction was correct, do nothing; if it was incorrect, increase the weights if the truth was positive (W -> W + H) or decrease them if the truth was negative (W -> W − H).


Update-Predict Algorithm (Contd..)

4. Predict: y (the prediction for this write) is computed as the dot product of W and the block access history H (y = W · H). If y is positive, we predict PUSH; otherwise we predict NO-PUSH.

5. Manage History: H is updated with the current write access information by shifting the oldest feature set out of H and adding the current feature set, i.e., the bit vector that encodes whether this was a read or a write and which processor is performing the operation. Further, copyset S0 is set to S1, and S1 is reset to null.
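Taken together, the write-access steps might be sketched as follows. The per-block state layout, function names, and feature encoding are assumptions for illustration; step 1 (Fetch) corresponds to looking up `entry` for the block.

```python
def encode(proc_id, is_write, num_procs=4):
    # Hypothetical feature set: one-hot processor ID plus a R/W bit (+1/-1).
    bits = [-1] * num_procs
    bits[proc_id] = 1
    bits.append(1 if is_write else -1)
    return bits

def on_write(entry, writer_id):
    W, H = entry["W"], entry["H"]
    # Step 2 (Determine Truth): a node sharing before both the previous and
    # the current write, excluding the last writer, means we should have pushed.
    repeat_sharers = (entry["S1"] & entry["S0"]) - {entry["last_writer"]}
    t = 1 if repeat_sharers else -1
    # Step 3 (Update Weights): adjust only when the last prediction p was wrong.
    if entry["p"] != t:
        for i in range(len(W)):
            W[i] += t * H[i]
    # Step 4 (Predict): sign of the dot product W . H; positive means PUSH.
    y = sum(wi * hi for wi, hi in zip(W, H))
    entry["p"] = 1 if y > 0 else -1
    # Step 5 (Manage History): shift the oldest feature set out and this
    # write's features in; roll the copysets (S0 <- S1, S1 <- empty).
    f = encode(writer_id, True)
    entry["H"] = H[len(f):] + f
    entry["S0"], entry["S1"] = entry["S1"], set()
    entry["last_writer"] = writer_id
    return entry["p"]
```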


Update-Predict Algorithm (Contd..)

On a read access to a coherence block, the following steps are taken:

1. The reading node is added to the current copyset (S1).

2. H is updated with this read access information, i.e., we shift the oldest feature set out of H and shift in a bit vector that encodes this read operation.
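The read path is much simpler, since no training happens on reads. A sketch under the same kind of assumed per-block state and feature encoding (both illustrative, not the paper's exact format):

```python
def encode(proc_id, is_write, num_procs=4):
    # Hypothetical feature set: one-hot processor ID plus a R/W bit (+1/-1).
    bits = [-1] * num_procs
    bits[proc_id] = 1
    bits.append(1 if is_write else -1)
    return bits

def on_read(entry, reader_id):
    # 1. The reading node joins the current copyset S1.
    entry["S1"].add(reader_id)
    # 2. Shift the oldest feature set out of H and this read's features in;
    #    no weight update is performed on reads.
    f = encode(reader_id, False)
    entry["H"] = entry["H"][len(f):] + f
```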


Tracking Consumers

An extremely simple mechanism was employed to predict likely consumers of a particular write: just use the set of most recent readers, as tracked in the copyset.

The copyset is a bitmap representing the readers since the last write to that block; it is reset after each write operation. Subsequent readers continue to add themselves to this list until the next write operation, at which point this new copyset is used to predict the consumers, if a push is predicted.

For example, let S0 and S1 be the set of nodes that have a shared copy at the time of the previous and current write, respectively. On a PUSH prediction, we send updates to the set of nodes given by (S1 ∩ S0) ∪ (S1 − S0).

Despite its simplicity, this mechanism was found to do a good job of predicting likely consumers without causing a high number of useless updates.
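As a sanity check on the set expression above, note that (S1 ∩ S0) ∪ (S1 − S0) reduces to S1 itself: every reader since the last write is a predicted consumer. A small sketch (function name is illustrative):

```python
def predicted_consumers(S0, S1):
    # Repeat readers (in both copysets) plus new readers (only in S1).
    # (S1 & S0) | (S1 - S0) simplifies to S1: all readers since the
    # last write are predicted consumers.
    return (S1 & S0) | (S1 - S0)

predicted_consumers({1, 2}, {2, 3})   # -> {2, 3}, i.e. exactly S1
```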


Evaluation/Results


Evaluation/Results (Contd..)


Evaluation/Results (Contd..)


Conclusions

In this paper, we have introduced a new class of coherence predictors that uses perceptrons to identify opportunities for sending useful write-updates.

Results demonstrate that perceptrons can eliminate a significant number (on average 30%) of coherence misses on a wide range of benchmarks. When coupled with a simple consumer set prediction heuristic, over 87% of speculative updates generated by our predictor are usefully consumed.


Limitations/Improvements

• Lesser emphasis on tracking likely consumers.

• Optimization of feature set (input set) format.

• Inability of perceptrons to learn linearly inseparable functions.

• Explored the use of only one online learning scheme, perceptrons.


References

Perceptron-based Coherence Predictors. D. Ghosh, J.B. Carter, and H. Duame. In the Proceedings of the 2nd Workshop on Chip Multiprocessor Memory Systems and Interconnects (CMP-MSI), June 2008, in conjunction with ISCA-35 (35th International Symposium on Computer Architecture).

C. Bienia, S. Kumar, J. P. Singh, and K. Li. The PARSEC benchmark suite: Characterization and architectural implications. Princeton University Technical Report TR-811-08, January 2008.

I. Burcea, S. Somogyi, A. Moshovos, and B. Falsafi. Predictor virtualization. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, 2008.


Thank You

Any Questions?