single Pragma
• Suppose we only want to see the output once.
• The single pragma directs the compiler that only a single thread should execute the block of code the pragma precedes.
• Syntax:
    #pragma omp single

Use of single Pragma

    #pragma omp parallel private(i,j)
    for (i = 0; i < m; i++) {
        low = a[i];
        high = b[i];
        if (low > high) {
    #pragma omp single
            printf ("Exiting (%d)\n", i);
            break;
        }
    #pragma omp for
        for (j = low; j < high; j++)
            c[j] = (c[j] - a[i])/b[i];
    }

nowait Clause
• The compiler puts a barrier synchronization at the end of every parallel for statement.
• In our example, this is necessary: if a thread leaves the loop and changes low or high, it may affect the behavior of another thread.
• If we make these private variables, then it would be okay to let threads move ahead, which could reduce execution time.

Use of nowait Clause

    #pragma omp parallel private(i,j,low,high)
    for (i = 0; i < m; i++) {
        low = a[i];
        high = b[i];
        if (low > high) {
    #pragma omp single
            printf ("Exiting (%d)\n", i);
            break;
        }
    #pragma omp for nowait
        for (j = low; j < high; j++)
            c[j] = (c[j] - a[i])/b[i];
    }

Functional Parallelism
• To this point all of our focus has been on exploiting data parallelism.
• OpenMP also allows us to assign different threads to different portions of code (functional parallelism).

Functional Parallelism Example

    v = alpha();
    w = beta();
    x = gamma(v, w);
    y = delta();
    printf ("%6.2f\n", epsilon(x,y));

[Dependence graph: alpha and beta feed gamma; gamma and delta feed epsilon.]
May execute alpha, beta, and delta in parallel.

parallel sections Pragma
• Precedes a block of k blocks of code that may be executed concurrently by k threads.
• Syntax:
    #pragma omp parallel sections

section Pragma
• Precedes each block of code within the encompassing block preceded by the parallel sections pragma.
• May be omitted for the first parallel section after the parallel sections pragma.
• Syntax:
    #pragma omp section

Example of parallel sections

    #pragma omp parallel sections
    {
    #pragma omp section   /* Optional */
        v = alpha();
    #pragma omp section
        w = beta();
    #pragma omp section
        y = delta();
    }
    x = gamma(v, w);
    printf ("%6.2f\n", epsilon(x,y));

Another Approach

[Dependence graph: execute alpha and beta in parallel, then execute gamma and delta in parallel.]

sections Pragma
• Appears inside a parallel block of code.
• Has the same meaning as the parallel sections pragma.
• If there are multiple sections pragmas inside one parallel block, this may reduce fork/join costs.

Use of sections Pragma

    #pragma omp parallel
    {
    #pragma omp sections
        {
            v = alpha();
    #pragma omp section
            w = beta();
        }
    #pragma omp sections
        {
            x = gamma(v, w);
    #pragma omp section
            y = delta();
        }
    }
    printf ("%6.2f\n", epsilon(x,y));
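For reference, here is a minimal, self-contained sketch of the sections-based program above that compiles and runs as written. The function bodies and return values are placeholders invented for illustration (they are not part of the original example), and gamma_ is used instead of gamma only to avoid a clash with the gamma function some C libraries predeclare.

    #include <stdio.h>
    #include <omp.h>

    /* Placeholder functions, invented for this sketch only */
    static double alpha(void)  { return 1.0; }
    static double beta(void)   { return 2.0; }
    static double delta(void)  { return 3.0; }
    static double gamma_(double v, double w)   { return v + w; }
    static double epsilon(double x, double y)  { return x * y; }

    int main(void)
    {
        double v, w, x, y;   /* shared by default inside the parallel region */

    #pragma omp parallel
        {
            /* First sections region: alpha and beta are independent */
    #pragma omp sections
            {
                v = alpha();
    #pragma omp section
                w = beta();
            }   /* implicit barrier: v and w are ready */

            /* Second sections region: gamma_ and delta are independent */
    #pragma omp sections
            {
                x = gamma_(v, w);
    #pragma omp section
                y = delta();
            }
        }
        printf("%6.2f\n", epsilon(x, y));
        return 0;
    }

Compile with an OpenMP-capable compiler (e.g. gcc -fopenmp). With two or more threads, alpha and beta, and later gamma_ and delta, may execute concurrently, matching the dependence graph sketched above.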
Summary (1/3)
• OpenMP is an API for shared-memory parallel programming.
• Its shared-memory model is based on fork/join parallelism.
• Data parallelism:
    - parallel for pragma
    - reduction clause

Summary (2/3)
• Functional parallelism (parallel sections pragma)
• SPMD-style programming (parallel pragma)
• Critical sections (critical pragma)
• Enhancing performance of parallel for loops:
    - Inverting loops
    - Conditionally parallelizing loops
    - Changing loop scheduling

Summary (3/3)

    Characteristic                           OpenMP   MPI
    Suitable for multiprocessors             Yes      Yes
    Suitable for multicomputers              No       Yes
    Supports incremental parallelization     Yes      No
    Minimal extra code                       Yes      No
    Explicit control of memory hierarchy     No       Yes
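The summary mentions the reduction clause and loop scheduling, which this excerpt does not demonstrate. As a closing illustration, here is a minimal sketch of a parallel for loop that uses both; the array, its contents, and the static schedule are arbitrary choices for the example, not taken from the slides.

    #include <stdio.h>

    #define N 1000

    int main(void)
    {
        double a[N], sum = 0.0;
        int i;

        for (i = 0; i < N; i++)      /* arbitrary example data */
            a[i] = i * 0.5;

        /* reduction(+:sum) gives each thread a private copy of sum and
           combines the partial sums when the loop ends, avoiding a race.
           schedule(static) is one of the loop-scheduling options named
           in the summary. */
    #pragma omp parallel for reduction(+:sum) schedule(static)
        for (i = 0; i < N; i++)
            sum += a[i];

        printf("sum = %f\n", sum);
        return 0;
    }

As above, this compiles with any OpenMP-capable compiler (e.g. gcc -fopenmp); the loop index i is made private automatically by the parallel for construct.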