

| Situation | Example code sequence | Action |
|---|---|---|
| No dependence | LD <b>R1</b>,45(R2)<br>DADD R5,R6,R7<br>DSUB R8,R6,R7<br>OR R9,R6,R7 | No hazard possible because no dependence exists on R1 in the immediately following three instructions. |
| Dependence requiring stall | LD <b>R1</b>,45(R2)<br>DADD R5,<b>R1</b>,R7<br>DSUB R8,R6,R7<br>OR R9,R6,R7 | Comparators detect the use of R1 in the DADD and stall the DADD (and DSUB and OR) before the DADD begins EX. |
| Dependence overcome by forwarding | LD <b>R1</b>,45(R2)<br>DADD R5,R6,R7<br>DSUB R8,<b>R1</b>,R7<br>OR R9,R6,R7 | Comparators detect use of R1 in DSUB and forward result of load to ALU in time for DSUB to begin EX. |
| Dependence with accesses in order | LD <b>R1</b>,45(R2)<br>DADD R5,R6,R7<br>DSUB R8,R6,R7<br>OR R9,<b>R1</b>,R7 | No action required because the read of R1 by OR occurs in the second half of the ID phase, while the write of the loaded data occurred in the first half. |
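
The four situations in the table can be read as a rule on the distance between the load and the first instruction that uses its result. The sketch below is only an illustration of that rule, not the hardware's comparator logic; the function name `load_interlock_action` and the tuple encoding of source registers are assumptions made for the example.

```python
def load_interlock_action(load_dest, next_sources):
    """Classify the pipeline action for a load followed by up to three instructions.

    load_dest    -- register written by the load, e.g. "R1"
    next_sources -- source-register tuples of the following instructions, in order
    """
    for distance, sources in enumerate(next_sources[:3], start=1):
        if load_dest in sources:
            if distance == 1:
                return "stall"      # loaded value is not available before EX
            if distance == 2:
                return "forward"    # loaded value is forwarded to the ALU
            return "in order"       # register file is written before it is read
    return "no dependence"


# The four rows of the table: LD R1,45(R2) followed by DADD, DSUB, OR.
rows = {
    "no dependence": [("R6", "R7"), ("R6", "R7"), ("R6", "R7")],
    "stall":         [("R1", "R7"), ("R6", "R7"), ("R6", "R7")],
    "forward":       [("R6", "R7"), ("R1", "R7"), ("R6", "R7")],
    "in order":      [("R6", "R7"), ("R6", "R7"), ("R1", "R7")],
}
for expected, following in rows.items():
    print(expected, "->", load_interlock_action("R1", following))
```

Each key in `rows` names the action the table gives for that sequence, so the loop prints matching pairs.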

UC. Colorado Springs

Adapted from ©UCB97 & ©UCB03

CS420/520 pipeline.2



Software (compiler) Static Scheduling / ILP

° Software scheduling: the goal is to exploit ILP by preserving program order only where it affects the outcome of the program

Try producing fast code for

    a = b + c;
    d = e - f;

assuming a, b, c, d, e, and f in memory.

| Slow code | Fast code |
|---|---|
| LW Rb,b | LW Rb,b |
| LW Rc,c | LW Rc,c |
| ADD Ra,Rb,Rc | LW Re,e |
| SW a,Ra | ADD Ra,Rb,Rc |
| LW Re,e | LW Rf,f |
| LW Rf,f | SW a,Ra |
| SUB Rd,Re,Rf | SUB Rd,Re,Rf |
| SW d,Rd | SW d,Rd |
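
To see why the second ordering is faster, the sketch below counts load-use stalls in both sequences. It assumes a classic five-stage pipeline with full forwarding and a one-cycle load delay, which the slide does not state explicitly. Under that assumption an instruction stalls only when it reads the register loaded by the instruction immediately before it. The function name `count_load_stalls` and the `(opcode, dest, sources)` encoding are hypothetical.

```python
def count_load_stalls(program):
    """Count one-cycle stalls caused by using a loaded value in the very next instruction.

    program -- list of (opcode, dest, sources) tuples in program order
    """
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, cur_sources = cur
        if prev_op == "LW" and prev_dest in cur_sources:
            stalls += 1
    return stalls


slow = [
    ("LW", "Rb", ()), ("LW", "Rc", ()), ("ADD", "Ra", ("Rb", "Rc")), ("SW", None, ("Ra",)),
    ("LW", "Re", ()), ("LW", "Rf", ()), ("SUB", "Rd", ("Re", "Rf")), ("SW", None, ("Rd",)),
]
fast = [
    ("LW", "Rb", ()), ("LW", "Rc", ()), ("LW", "Re", ()), ("ADD", "Ra", ("Rb", "Rc")),
    ("LW", "Rf", ()), ("SW", None, ("Ra",)), ("SUB", "Rd", ("Re", "Rf")), ("SW", None, ("Rd",)),
]

print("slow code stalls:", count_load_stalls(slow))
print("fast code stalls:", count_load_stalls(fast))
```

In the slow code, ADD and SUB each directly follow the load of one of their operands. The fast code places an independent instruction in each of those slots.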



Data Dependences

° Data dependence
  - Instruction <i>i</i> produces a result that may be used by instruction <i>j</i>
  - Instruction <i>j</i> is data dependent on instruction <i>k</i>, and instruction <i>k</i> is data dependent on instruction <i>i</i> (a chain of dependences)

        Loop: LD    F0, 0(R1)
              DADD  F4, F0, F2
              SD    F4, 0(R1)
              DADDI R1, R1, -8
              BNE   R1, R2, Loop
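
The dependences in the loop body can be found mechanically by pairing each source register with the most recent earlier instruction that wrote it. The sketch below does this for a single iteration only and ignores dependences carried around the loop. The function name `true_dependences` and the tuple encoding are assumptions for illustration.

```python
def true_dependences(program):
    """Return (producer_index, consumer_index, register) for each RAW dependence.

    program -- list of (opcode, dest, sources); dest is None for stores and branches
    """
    last_writer = {}  # register -> index of the latest instruction that wrote it
    deps = []
    for idx, (_, dest, sources) in enumerate(program):
        for reg in sources:
            if reg in last_writer:
                deps.append((last_writer[reg], idx, reg))
        if dest is not None:
            last_writer[dest] = idx
    return deps


loop = [
    ("LD",    "F0", ("R1",)),        # 0
    ("DADD",  "F4", ("F0", "F2")),   # 1
    ("SD",    None, ("F4", "R1")),   # 2
    ("DADDI", "R1", ("R1",)),        # 3
    ("BNE",   None, ("R1", "R2")),   # 4
]

for producer, consumer, reg in true_dependences(loop):
    print(f"{loop[consumer][0]} depends on {loop[producer][0]} through {reg}")
```

The first two results form the chain LD, DADD, SD; the third links DADDI to BNE.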

° Name dependence (not a true data hazard)
  - Occurs when two instructions use the same register or memory location, called a <i>name</i>, but there is no flow of data between the instructions associated with that name
  - Remember: do not be restricted to the 5-stage pipeline!
  - Anti-dependence (WAR): <i>j</i> writes a register or memory location that <i>i</i> reads

| Instruction | Question |
|---|---|
| ADD \$1, \$2, \$4 | What if SUB executes earlier than ADD? |
| SUB \$4, \$5, \$6 | Is there a data flow? |

  - Output dependence (WAW): <i>i</i> and <i>j</i> write the same register or memory location

| Instruction | Question |
|---|---|
| SUB \$4, \$2, \$7 | What if SUBI executes earlier than SUB? |
| SUBI \$4, \$5, 100 | Is there a data flow? |

How many ways for data to flow between instructions?
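
The definitions above reduce to three set-membership checks on an instruction pair. The sketch below is a minimal illustration; the function name `classify` and the `(dest, sources)` encoding are hypothetical, and it looks only at registers, not memory locations.

```python
def classify(i, j):
    """Return the dependence kinds between i and j, where i precedes j in program order.

    Each instruction is encoded as (dest, sources).
    """
    i_dest, i_sources = i
    j_dest, j_sources = j
    kinds = set()
    if i_dest is not None and i_dest in j_sources:
        kinds.add("RAW")   # true data dependence: j reads what i writes
    if j_dest is not None and j_dest in i_sources:
        kinds.add("WAR")   # anti-dependence: j writes what i reads
    if i_dest is not None and i_dest == j_dest:
        kinds.add("WAW")   # output dependence: both write the same name
    return kinds


add_ = ("$1", ("$2", "$4"))    # ADD  $1, $2, $4
sub_ = ("$4", ("$5", "$6"))    # SUB  $4, $5, $6
sub2 = ("$4", ("$2", "$7"))    # SUB  $4, $2, $7
subi = ("$4", ("$5",))         # SUBI $4, $5, 100

print(classify(add_, sub_))
print(classify(sub2, subi))
```

Neither pair yields RAW, so no data flows between the instructions; only the name `$4` is shared.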

° Data hazards may be classified, depending on the order of read and write accesses in the instructions

° RAR (read after read) is not a hazard, nor a name dependence

° RAW (read after write):
  - <i>j</i> tries to read a source before <i>i</i> writes it, so <i>j</i> incorrectly gets the old value; most common type: true data hazards
  - Example?

Data Hazards - WAW

° WAW (write after write):
  - Output dependence of name hazards: <i>j</i> tries to write an operand before it is written by <i>i</i>.
  - Can you nominate an example? Short/long pipelines:

        MULTF F4, F5, F6
        LD    F4, 0(F1)

  - Is WAW possible in the MIPS classic five-stage integer pipeline? Why?





| | Instruction | Operands | Notes |
|---|---|---|---|
| Before: | DDIV | F0, F2, F4 | |
| | DADD | F6, F0, F8 | // anti-dependence with DSUB on F8 (WAR); output dependence with DMUL on F6 (WAW) |
| | SD | F6, 0(R1) | |
| | DSUB | F8, F10, F14 | // How many true data dependences? |
| | DMUL | F6, F10, F8 | |
| After: | DDIV | F0, F2, F4 | |
| | DADD | <mark>S</mark>, F0, F8 | |
| | SD | <mark>S</mark>, 0(R1) | |
| | DSUB | <mark>T</mark>, F10, F14 | |
| | DMUL | F6, F10, <mark>T</mark> | |

What dependences are there?

Which dependences disappear? And which are still there?

What to do with the subsequent use of F8?

Finding the subsequent use of F8 requires compiler analysis or hardware support.
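
The renaming step can be sketched as a single pass that gives every written register a fresh name and redirects later reads to it. This is an illustrative simplification. The slide renames only F6 to S and F8 to T, whereas the sketch renames every destination. The names `T0`, `T1`, ... and the functions `rename` and `name_dependences` are hypothetical.

```python
from itertools import count


def name_dependences(program):
    """List (kind, earlier_index, later_index, register) for WAR and WAW pairs."""
    found = []
    for i, (_, i_dest, i_sources) in enumerate(program):
        for j in range(i + 1, len(program)):
            j_dest = program[j][1]
            if j_dest is None:
                continue
            if j_dest in i_sources:
                found.append(("WAR", i, j, j_dest))
            if j_dest == i_dest:
                found.append(("WAW", i, j, j_dest))
    return found


def rename(program):
    """Rename every destination; return the new program and the final name map."""
    fresh = (f"T{n}" for n in count())
    current = {}  # architectural register -> latest renamed register
    renamed = []
    for op, dest, sources in program:
        new_sources = tuple(current.get(reg, reg) for reg in sources)
        new_dest = None
        if dest is not None:
            new_dest = next(fresh)
            current[dest] = new_dest
        renamed.append((op, new_dest, new_sources))
    return renamed, current


before = [
    ("DDIV", "F0", ("F2", "F4")),
    ("DADD", "F6", ("F0", "F8")),
    ("SD",   None, ("F6", "R1")),
    ("DSUB", "F8", ("F10", "F14")),
    ("DMUL", "F6", ("F10", "F8")),
]
after, final_map = rename(before)

print("name dependences before:", name_dependences(before))
print("name dependences after: ", name_dependences(after))
print("later reads of F8 must use:", final_map["F8"])
```

The final name map is what a compiler or the hardware has to track so that any later read of F8 is redirected to the renamed register.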



























° The API is defined in ANSI/IEEE POSIX 1003.1-1995
  - Naming conventions: all identifiers in the library begin with pthread_
  - Three major classes of subroutines
    - Thread management, mutexes, condition variables

| Routine Prefix | Functional Group |
|---|---|
| pthread_ | Threads themselves and miscellaneous subroutines |
| pthread_attr_ | Thread attributes objects |
| pthread_mutex_ | Mutexes |
| pthread_mutexattr_ | Mutex attributes objects |
| pthread_cond_ | Condition variables |
| pthread_condattr_ | Condition attributes objects |











