TY - JOUR
T1 - Issue mechanism for embedded simultaneous multithreading processor
AU - Zang, Chengjie
AU - Imai, Shigeki
AU - Frank, Steven
AU - Kimura, Shinji
PY - 2008
Y1 - 2008
N2 - Simultaneous Multithreading (SMT) technology enhances instruction throughput by issuing multiple instructions from multiple threads within one clock cycle. With an in-order pipeline for each thread, SMT processors can issue a number of instructions close to, or even surpassing, that of an out-of-order pipeline. In this work, we present an efficient issue logic for predicated instruction sequences with a parallel flag in each instruction, where predicate-register-based issue control is adopted and consecutive instructions with a parallel flag of '0' are executed in parallel. The flag is set in advance by the compiler. Instructions from different threads are issued in round-robin order. We also introduce an Instruction Queue skip mechanism for a thread whose queue is empty. Using this kind of issue logic, we designed a 6-thread, 7-stage, in-order pipeline processor. Based on this processor, we compare the round-robin issue policy (RR(T1-Tn)) with two other policies: one in which thread one always has the highest priority (PR(T1)), and one in which thread one and thread n have the highest priority in turn (PR(T1-Tn)). The results show that the RR(T1-Tn) policy outperforms the others and that PR(T1-Tn) is almost the same as RR(T1-Tn) in terms of issued instructions per cycle.
AB - Simultaneous Multithreading (SMT) technology enhances instruction throughput by issuing multiple instructions from multiple threads within one clock cycle. With an in-order pipeline for each thread, SMT processors can issue a number of instructions close to, or even surpassing, that of an out-of-order pipeline. In this work, we present an efficient issue logic for predicated instruction sequences with a parallel flag in each instruction, where predicate-register-based issue control is adopted and consecutive instructions with a parallel flag of '0' are executed in parallel. The flag is set in advance by the compiler. Instructions from different threads are issued in round-robin order. We also introduce an Instruction Queue skip mechanism for a thread whose queue is empty. Using this kind of issue logic, we designed a 6-thread, 7-stage, in-order pipeline processor. Based on this processor, we compare the round-robin issue policy (RR(T1-Tn)) with two other policies: one in which thread one always has the highest priority (PR(T1)), and one in which thread one and thread n have the highest priority in turn (PR(T1-Tn)). The results show that the RR(T1-Tn) policy outperforms the others and that PR(T1-Tn) is almost the same as RR(T1-Tn) in terms of issued instructions per cycle.
KW - Balance round-robin policy
KW - Parallel flag
KW - Simultaneous multithreading
UR - http://www.scopus.com/inward/record.url?scp=78049330656&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78049330656&partnerID=8YFLogxK
U2 - 10.1093/ietfec/e91-a.4.1092
DO - 10.1093/ietfec/e91-a.4.1092
M3 - Article
AN - SCOPUS:78049330656
VL - E91-A
SP - 1092
EP - 1100
JO - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
JF - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
SN - 0916-8508
IS - 4
ER -