
Shared-Memory Programming (edited draft)

2024-09-25 14:16
 

[Article summary] ...wer should be

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Race Condition Time Line
[Figure: time line with columns "Thread A", "Thread B", and "Value of area". Values of area shown: 11.667, 11.667, 15.432, 15.230. Increments shown: + 3.765 and + 3.563. Note that 11.667 + 3.765 = 15.432 and 11.667 + 3.563 = 15.230, so both updates start from the same old value and one of them is lost.]

critical Pragma
- Critical section: a portion of code that only one thread at a time may execute
- We denote a critical section by putting the pragma
      #pragma omp critical
  in front of a block of C code

Correct, But Inefficient, Code
    double area, pi, x;
    int i, n;
    ...
    area = 0.0;
    #pragma omp parallel for private(x)
    for (i = 0; i < n; i++) {
        x = (i + 0.5)/n;
    #pragma omp critical
        area += 4.0/(1.0 + x*x);
    }
    pi = area / n;

Source of Inefficiency
- Update to area is inside a critical section
- Only one thread at a time may execute the statement; i.e., it is sequential code
- Time to execute the statement is a significant part of the loop
- By Amdahl's Law we know speedup will be severely constrained

Reductions
- Reductions are so common that OpenMP provides support for them
- May add a reduction clause to the parallel for pragma
- Specify reduction operation and reduction variable
- OpenMP takes care of storing partial results in private variables and combining partial results after the loop

reduction Clause
- The reduction clause has this syntax:
      reduction(op : variable)
- Operators
    +     Sum
    *     Product
    &     Bitwise and
    |     Bitwise or
    ^     Bitwise exclusive or
    &&    Logical and
    ||    Logical or
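
The operator list above covers more than sums. The following is a minimal sketch, not taken from the slides, of the reduction clause used with `*`, `&&`, and `|`. The function names (`factorial_reduction`, `all_positive_reduction`, `or_reduction`) are hypothetical and chosen only for illustration. Each reduction variable is initialized to the identity value of its operator, which is what a sequential version of the loop would need as well.

```c
#include <stdbool.h>

/* Hypothetical helper: computes 1 * 2 * ... * n with a product reduction. */
long long factorial_reduction(int n)
{
    long long product = 1;          /* identity value for '*' */
    int i;
#pragma omp parallel for reduction(*:product)
    for (i = 1; i <= n; i++) {
        product *= i;               /* each thread updates its private copy */
    }
    return product;                 /* private copies are combined after the loop */
}

/* Hypothetical helper: true when every element of a[0..n-1] is positive. */
bool all_positive_reduction(const int *a, int n)
{
    int ok = 1;                     /* identity value for '&&' */
    int i;
#pragma omp parallel for reduction(&&:ok)
    for (i = 0; i < n; i++) {
        ok = ok && (a[i] > 0);
    }
    return ok != 0;
}

/* Hypothetical helper: bitwise OR of all elements of a[0..n-1]. */
unsigned or_reduction(const unsigned *a, int n)
{
    unsigned bits = 0u;             /* identity value for '|' */
    int i;
#pragma omp parallel for reduction(|:bits)
    for (i = 0; i < n; i++) {
        bits |= a[i];
    }
    return bits;
}
```

If the compiler is not asked to enable OpenMP (for example, GCC without `-fopenmp`), the pragmas are ignored and the functions run sequentially with the same results, which makes it easy to compare the parallel and sequential answers.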
π-finding Code with Reduction Clause
    double area, pi, x;
    int i, n;
    ...
    area = 0.0;
    #pragma omp parallel for \
        private(x) reduction(+:area)
    for (i = 0; i < n; i++) {
        x = (i + 0.5)/n;
        area += 4.0/(1.0 + x*x);
    }
    pi = area / n;

Performance Improvement 1
- Too many fork/joins can lower performance
- Inverting loops may help performance if
    - Parallelism is in the inner loop
    - After inversion, the outer loop can be made parallel
    - Inversion does not significantly lower cache hit rate

Performance Improvement 2
- If a loop has too few iterations, fork/join overhead is greater than the time savings from parallel execution
- The if clause instructs the compiler to insert code that determines at runtime whether the loop should be executed in parallel; e.g.,
      #pragma omp parallel for if(n > 5000)

Performance Improvement 3
- We can use the schedule clause to specify how iterations of a loop should be allocated to threads
- Static schedule: all iterations allocated to threads before any iterations are executed
- Dynamic schedule: only some iterations allocated to threads at the beginning of the loop's execution. Remaining iterations are allocated to threads that complete their assigned iterations.

Static vs. Dynamic Scheduling
- Static scheduling
    - Low overhead
    - May exhibit high workload imbalance
- Dynamic scheduling
    - Higher overhead
    - Can reduce workload imbalance

Chunks
- A chunk is a contiguous range of iterations
- Increasing chunk size reduces overhead and may increase cache hit rate
- Decreasing chunk size allows finer balancing of workloads

schedule Clause
- Syntax of schedule clause
      schedule(type[, chunk])
- Schedule type required, chunk size optional
- Allowable schedule types
    - static: static allocation
    - dynamic: dynamic allocation
    - guided: guided self-scheduling
    - runtime: type chosen at runtime based on value of environment variable OMP_SCHEDULE

Scheduling Options
- schedule(static): block allocation of about n/t contiguous iterations to each thread
- schedule(static,C): interleaved allocation of chunks of size C to threads
- schedule(dynamic): dynamic one-at-a-time allocation of iterations to threads
- schedule(dynamic,C): dynamic allocation of C iterations at a time to threads
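
The scheduling options above can be tried on a loop whose iterations have uneven cost. The following is a minimal sketch under stated assumptions, not code from the slides: `is_prime`, `count_primes_static`, and `count_primes_dynamic` are hypothetical names, and trial-division prime counting is chosen only because later iterations cost more than earlier ones. The dynamic version also carries the `if(n > 5000)` clause from the earlier slide; the threshold 5000 is the slide's example value, not a tuned constant.

```c
/* Hypothetical helper: trial-division primality test.
   The work per call grows with k, so loop iterations are unbalanced. */
static int is_prime(int k)
{
    int d;
    if (k < 2) return 0;
    for (d = 2; (long long)d * d <= k; d++) {
        if (k % d == 0) return 0;
    }
    return 1;
}

/* Counts primes in [0, n) with block allocation:
   each thread gets about n/t contiguous iterations up front. */
int count_primes_static(int n)
{
    int count = 0;
    int i;
#pragma omp parallel for schedule(static) reduction(+:count)
    for (i = 0; i < n; i++) {
        count += is_prime(i);
    }
    return count;
}

/* Counts primes in [0, n) with dynamic allocation of `chunk` iterations
   at a time (chunk must be >= 1). Small loops stay sequential because
   of the if clause. */
int count_primes_dynamic(int n, int chunk)
{
    int count = 0;
    int i;
#pragma omp parallel for if(n > 5000) schedule(dynamic, chunk) reduction(+:count)
    for (i = 0; i < n; i++) {
        count += is_prime(i);
    }
    return count;
}
```

Both functions return the same count for any schedule; only the assignment of iterations to threads changes. Timing them with different chunk sizes on a large n is one way to observe the trade-off between scheduling overhead and workload balance described above.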