Group Members: Clara, Sean
VLIW
Decent overview. http://www.semiconductors.philips.com/acrobat_download/other/vliw-wp.pdf
Short outline at http://www.free-definition.com/VLIW.html
Another good intro http://www.byte.com/art/9604/sec8/art3.htm
And another http://www.digit-life.com/articles2/vliw/
Vector vs. scalar and VLIW architectures http://iram.cs.berkeley.edu/papers/2002.MICRO35.comparison.pdf
VLIW in Transmeta's Crusoe http://www.transmeta.com/crusoe/vliw.html
PC Processor guide -- not sure how good it is. http://www.x86.org/articles/computalk/help.htm
Outline
- Why use VLIW?
- Lots of hardware in processors (execution units, etc.). How can we maximize the amount of it that we're using at a time?
- Pipelining -- already do this
- Multiple processors -- only good for a limited set of applications?
- Superscalar implementation -- applies to all applications. "Fetch, issue to execution units, and complete more than one instruction at a time." Preserves architectural compatibility (e.g., works with the x86 architecture). Hardware looks for opportunities where it can do this.
- Multiple, independent operations per instruction (VLIW) -- goals very similar to those of superscalar. However, hardware isn't responsible for discovering the parallelism -- it is already encoded in the long instruction word. Keeps hardware much simpler than superscalar. Very cheap parallel implementation! Think of it as a simplified superscalar processor.
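Rough sketch of the "discovery" step mentioned above: deciding whether two operations are independent enough to run at the same time. A superscalar processor makes this kind of check in hardware at run time; for VLIW it is made ahead of time and the result is encoded in the instruction word. The `Op` type, the register names, and the `independent` helper are made up for illustration, and the check is deliberately conservative (real machines may allow some of these cases within one word).

```python
from typing import NamedTuple


class Op(NamedTuple):
    """One operation, described only by the registers it touches (hypothetical model)."""
    text: str
    reads: frozenset
    writes: frozenset


def independent(first: Op, second: Op) -> bool:
    """Return True if `second` (later in program order) could issue alongside `first`.

    Conservative: any read-after-write, write-after-read, or
    write-after-write overlap counts as a dependence.
    """
    raw = first.writes & second.reads    # second needs a result of first
    war = first.reads & second.writes    # second overwrites an input of first
    waw = first.writes & second.writes   # both write the same register
    return not (raw or war or waw)


# Illustrative operations; the register names are assumptions.
add = Op("add r1, r2, r3", frozenset({"r2", "r3"}), frozenset({"r1"}))
mul = Op("mul r4, r5, r6", frozenset({"r5", "r6"}), frozenset({"r4"}))
sub = Op("sub r7, r1, r4", frozenset({"r1", "r4"}), frozenset({"r7"}))

add_and_mul_ok = independent(add, mul)  # no shared registers
add_and_sub_ok = independent(add, sub)  # sub reads r1, which add writes
```

Here `add` and `mul` can go together, while `sub` has to wait for `add` because it reads `r1`.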
Architecture -- these are the tools available to the programmer and what he/she has to work with: instruction formats, instruction semantics, registers, how you address memory, etc. (Hardware --> implementation)
- [see comparison table]
- Like RISC but longer. Each instruction word has three blocks (slots): block 1 = branch, block 2 = ALU, block 3 = load/store. The compiler fills the slots as efficiently as possible.
- [code fragment example]
- CISC: three cycles
- RISC: three (faster) cycles
- VLIW: equivalent of one cycle, if the instruction word is fully packed.
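A minimal sketch of the slot-filling idea, assuming the three-block word described above (branch, ALU, load/store). This is not how any particular compiler does it: the greedy `pack` routine, the tuple layout `(slot, dest, sources)`, and the register names are all assumptions for illustration. An operation joins the current word only if its slot is still free and it has no register dependence on anything already in that word; otherwise a new word is started.

```python
def conflicts(op, word):
    """True if `op` cannot be placed in `word` (a dict mapping slot -> op)."""
    slot, dest, sources = op
    if slot in word:                      # that block of the word is already used
        return True
    for _, other_dest, other_sources in word.values():
        if other_dest is not None and other_dest in sources:
            return True                   # op reads a result produced in this word
        if dest is not None and (dest == other_dest or dest in other_sources):
            return True                   # op overwrites something this word uses
    return False


def pack(ops):
    """Greedily pack ops (in program order) into three-slot instruction words."""
    words, current = [], {}
    for op in ops:
        if conflicts(op, current):
            words.append(current)         # close the word and start a new one
            current = {}
        current[op[0]] = op
    if current:
        words.append(current)
    return words


# Three unrelated operations, one per block: fits in a single word.
independent_ops = [
    ("mem", "r1", ("r2",)),          # load r1 from address in r2
    ("alu", "r3", ("r4", "r5")),     # r3 = r4 + r5
    ("branch", None, ("r6",)),       # branch on r6
]

# A chain where each operation needs the previous result: one word each.
dependent_ops = [
    ("mem", "r1", ("r2",)),          # load r1
    ("alu", "r3", ("r1", "r5")),     # r3 = r1 + r5 (needs the load)
    ("branch", None, ("r3",)),       # branch on r3 (needs the add)
]

packed_independent = pack(independent_ops)
packed_dependent = pack(dependent_ops)
```

With these inputs the independent operations share one fully packed word, while the dependent chain spreads over three words with two empty slots each, which is why the "1 cycle" figure only holds when the word is fully packed.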