Show simple item record

A scalable instruction queue design for exploiting parallelism.

dc.contributor.author: Raasch, Steven Earl
dc.contributor.advisor: Reinhardt, Steven K.
dc.date.accessioned: 2016-08-30T15:35:04Z
dc.date.available: 2016-08-30T15:35:04Z
dc.date.issued: 2004
dc.identifier.uri: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:3137927
dc.identifier.uri: https://hdl.handle.net/2027.42/124297
dc.description.abstract: To maximize the performance of wide-issue superscalar out-of-order microprocessors, the issue stage must be able to extract as much instruction-level parallelism (ILP) as possible from the dynamic instruction stream. This dissertation examines several approaches to increasing available ILP while minimizing the impact on cycle time. First, I describe and evaluate a novel instruction queue design (the Segmented Instruction Queue) that eliminates the correspondence between IQ size and cycle time. The 512-entry Segmented IQ achieves between 58% and 98% of the performance of a similarly sized idealized instruction queue of conventional design, though the latency of the latter is approximately 256 times larger. The Segmented IQ can be used as a component of a clustered architecture, another approach to reducing cycle-time penalties in wide-issue machines. The dependence tracking mechanism used by the Segmented IQ can be applied to the problem of instruction placement in clustered architectures. By changing the mix of instructions present in the IQ, simultaneous multithreading (SMT) can also be used to increase the amount of available ILP. Under SMT, partitioning schemes are needed to distribute resources among threads; however, some of these schemes, clustered architectures in particular, can significantly reduce SMT workload performance. If an SMT machine is to use a clustered microarchitecture, the choice of instruction placement policy must be carefully evaluated to avoid performance degradation. Experiments show that naively allocating clusters to individual threads, eliminating the dynamic sharing that is the core of SMT, can reduce workload performance on a four-cluster architecture by as much as 26% versus a simple load-balancing scheme. This dissertation presents data that characterizes the performance of SMT workloads in clustered architectures using both conventional instruction queues and segmented instruction queues.
Individually, these mechanisms represent viable approaches to increasing available ILP. When the Segmented IQ is used in an SMT processor design, workload performance achieves an average of 80% and 86% of the idealized performance for two- and four-thread workloads, respectively, indicating that these approaches can be combined into an effective means of increasing processor utilization and performance.
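The abstract's 26% degradation figure stems from pinning threads to clusters and losing SMT's dynamic resource sharing. As an illustrative sketch only (this is not code from the dissertation; the policy names, workload, and cluster model are invented for illustration), the following contrasts naive per-thread cluster pinning with a least-loaded placement policy on a skewed two-thread workload:

```python
# Hypothetical sketch: two cluster-assignment policies for an SMT
# machine with four execution clusters. "Naive" pins each thread to
# one cluster; "balanced" sends each instruction to the least-loaded
# cluster, preserving the dynamic sharing that SMT relies on.
from collections import Counter

NUM_CLUSTERS = 4

def naive_placement(instr_stream):
    """Pin every instruction of thread t to cluster t % NUM_CLUSTERS."""
    load = Counter({c: 0 for c in range(NUM_CLUSTERS)})
    for thread_id in instr_stream:
        load[thread_id % NUM_CLUSTERS] += 1
    return load

def balanced_placement(instr_stream):
    """Send each instruction to the currently least-loaded cluster."""
    load = Counter({c: 0 for c in range(NUM_CLUSTERS)})
    for _ in instr_stream:
        target = min(load, key=load.get)
        load[target] += 1
    return load

# A skewed two-thread workload: thread 0 issues three times as many
# instructions as thread 1 (300 vs. 100 instructions total).
stream = [0, 0, 0, 1] * 100

naive = naive_placement(stream)
balanced = balanced_placement(stream)

# Under naive pinning, cluster 0 absorbs all 300 of thread 0's
# instructions while clusters 2 and 3 sit idle; balancing spreads
# all 400 instructions evenly across the four clusters.
print(max(naive.values()), max(balanced.values()))  # 300 100
```

The toy model captures only load imbalance, not the inter-cluster communication costs the dissertation also weighs, but it shows why a static per-thread allocation leaves hardware idle under asymmetric workloads.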
dc.format.extent: 159 p.
dc.language: English
dc.language.iso: EN
dc.subject: Design
dc.subject: Exploiting
dc.subject: Instruction
dc.subject: Multithreading
dc.subject: Parallelism
dc.subject: Queue
dc.subject: Queueing
dc.subject: Scalable
dc.subject: Scheduling
dc.title: A scalable instruction queue design for exploiting parallelism.
dc.type: Thesis
dc.description.thesisdegreename: PhD (en_US)
dc.description.thesisdegreediscipline: Applied Sciences
dc.description.thesisdegreediscipline: Computer science
dc.description.thesisdegreegrantor: University of Michigan, Horace H. Rackham School of Graduate Studies
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/124297/2/3137927.pdf
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)



