Recursive Decomposition: Example

The following algorithm illustrates this. We first start with a simple serial loop for computing the minimum entry in a given list:

procedure SERIAL_MIN (A, n)
begin
    min := A[0];
    for i := 1 to n - 1 do
        if (A[i] < min) min := A[i];
    endfor;
    return min;
end SERIAL_MIN

Recursive Decomposition: Example

We can rewrite the loop as follows:

procedure RECURSIVE_MIN (A, n)
begin
    if (n = 1) then
        min := A[0];
    else
        lmin := RECURSIVE_MIN (A, n/2);
        rmin := RECURSIVE_MIN (&(A[n/2]), n - n/2);
        if (lmin < rmin) then
            min := lmin;
        else
            min := rmin;
        endelse;
    endelse;
    return min;
end RECURSIVE_MIN

Recursive Decomposition: Example

The code in the previous foil can be decomposed naturally using a recursive decomposition strategy. We illustrate this with the following example of finding the minimum number in the set {4, 9, 1, 7, 8, 11, 2, 12}. The task dependency graph associated with this computation is as follows:

[Figure: task dependency graph for the recursive minimum computation]

Data Decomposition

• Identify the data on which computations are performed.
• Partition this data across various tasks.
• This partitioning induces a decomposition of the problem.
• Data can be partitioned in various ways; this critically impacts the performance of a parallel algorithm.

Output Data Decomposition

• Often, each element of the output can be computed independently of the others (but simply as a function of the input).
• A partition of the output across tasks decomposes the problem naturally.

Output Data Decomposition: Example

Consider the problem of multiplying two n x n matrices A and B to yield matrix C. The output matrix C can be partitioned into four tasks as follows:

Task 1: C1,1 = A1,1 B1,1 + A1,2 B2,1
Task 2: C1,2 = A1,1 B1,2 + A1,2 B2,2
Task 3: C2,1 = A2,1 B1,1 + A2,2 B2,1
Task 4: C2,2 = A2,1 B1,2 + A2,2 B2,2

Output Data Decomposition: Example

A partitioning of output data does not result in a unique decomposition into tasks. For example, for the same problem as in the previous foil, with identical output data distribution, we can derive the following two (other) decompositions:

Decomposition I:
Task 1: C1,1 = A1,1 B1,1
Task 2: C1,1 = C1,1 + A1,2 B2,1
Task 3: C1,2 = A1,1 B1,2
Task 4: C1,2 = C1,2 + A1,2 B2,2
Task 5: C2,1 = A2,1 B1,1
Task 6: C2,1 = C2,1 + A2,2 B2,1
Task 7: C2,2 = A2,1 B1,2
Task 8: C2,2 = C2,2 + A2,2 B2,2

Decomposition II:
Task 1: C1,1 = A1,1 B1,1
Task 2: C1,1 = C1,1 + A1,2 B2,1
Task 3: C1,2 = A1,2 B2,2
Task 4: C1,2 = C1,2 + A1,1 B1,2
Task 5: C2,1 = A2,2 B2,1
Task 6: C2,1 = C2,1 + A2,1 B1,1
Task 7: C2,2 = A2,1 B1,2
Task 8: C2,2 = C2,2 + A2,2 B2,2

Output Data Decomposition: Example

Consider the problem of counting the instances of given itemsets in a database of transactions. In this case, the output (itemset frequencies) can be partitioned across tasks.

[Figure: transaction database with itemset counts partitioned across tasks]

Output Data Decomposition: Example

From the previous example, the following observations can be made:

• If the database of transactions is replicated across the processes, each task can be independently accomplished with no communication.
• If the database is partitioned across processes as well (for reasons of memory utilization), each task first computes partial counts. These counts are then aggregated at the appropriate task.

Input Data Partitioning

• In many cases, this is the only natural decomposition because the output is not clearly known a priori (e.g., the problem of finding the minimum in a list, sorting a given list, etc.).
• A task is associated with each input data partition. The task performs as much of the computation as possible with its part of the data. Subsequent processing combines these partial results (see the sketch below).
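The following is a minimal sketch (not from the slides) of this input-partitioning pattern, using the minimum-finding problem from earlier. The chunk count NTASKS and the helper partial_min are illustrative choices; the tasks run one after another here for clarity, whereas a real parallel program would map each to its own process or thread.

/* Sketch of input data partitioning: the input array is split into
   chunks, one task per chunk computes a partial minimum, and a
   subsequent step combines the partial results. */
#include <stdio.h>

#define NTASKS 4

/* One task: computes the minimum of its assigned input partition. */
static int partial_min(const int *chunk, int len) {
    int min = chunk[0];
    for (int i = 1; i < len; i++)
        if (chunk[i] < min) min = chunk[i];
    return min;
}

int main(void) {
    int A[] = {4, 9, 1, 7, 8, 11, 2, 12};   /* input set from the slides */
    int n = sizeof A / sizeof A[0];
    int partial[NTASKS];

    /* Phase 1: each task works only on its own partition of the input. */
    for (int t = 0; t < NTASKS; t++) {
        int lo = t * n / NTASKS;
        int hi = (t + 1) * n / NTASKS;
        partial[t] = partial_min(&A[lo], hi - lo);
    }

    /* Phase 2: subsequent processing combines the partial results. */
    int min = partial[0];
    for (int t = 1; t < NTASKS; t++)
        if (partial[t] < min) min = partial[t];

    printf("minimum = %d\n", min);
    return 0;
}

Running this prints minimum = 1, the same value the recursive decomposition produces; only the way the work is divided differs.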
Input Data Partitioning: Example

In the database counting example, the input (i.e., the transaction set) can be partitioned. This induces a task decomposition in which each task generates partial counts for all itemsets. These are combined subsequently for aggregate counts.

Partitioning Input and Output Data

Often input and output data decomposition can be combined for a higher degree of concurrency. For the itemset counting example, the transaction set (input) and itemset counts (output) can both be decomposed as follows:

[Figure: combined decomposition of the transaction set and the itemset counts]

Intermediate Data Partitioning

• Computation can often be viewed as a sequence of transformations from the input to the output data.
• In these cases, it is often beneficial to use one of the intermediate stages as a basis for decomposition.

Intermediate Data Partitioning: Example

Let us revisit the example of dense matrix multiplication. We first show how we can visualize this computation in terms of intermediate matrices D.

[Figure: dense matrix multiplication viewed in terms of intermediate matrices D]

Intermediate Data Partitioning: Example

A decomposition of the intermediate data structure leads to the following decomposition into 8 + 4 tasks:

Stage I:
Task 01: D1,1,1 = A1,1 B1,1
Task 02: D2,1,1 = A1,2 B2,1
Task 03: D1,1,2 = A1,1 B1,2
Task 04: D2,1,2 = A1,2 B2,2
Task 05: D1,2,1 = A2,1 B1,1
Task 06: D2,2,1 = A2,2 B2,1
Task 07: D1,2,2 = A2,1 B1,2
Task 08: D2,2,2 = A2,2 B2,2

Stage II:
Task 09: C1,1 = D1,1,1 + D2,1,1
Task 10: C1,2 = D1,1,2 + D2,1,2
Task 11: C2,1 = D1,2,1 + D2,2,1
Task 12: C2,2 = D1,2,2 + D2,2,2

Intermediate Data Partitioning: Example

The task dependency graph for the decomposition (shown in the previous foil) into 12 tasks is as follows:

[Figure: task dependency graph for the 12-task decomposition]

The Owner Computes Rule

• The owner computes rule generally states that the process assigned a particular data item is responsible for all computation associated with it.
• In the case of input data decomposition, the owner computes rule implies that all computations that use the input data are performed by the process.
• In the case of output data decomposition, the owner computes rule implies that the output is computed by the process to which the output data is assigned (see the sketch below).
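As a closing illustration, here is a minimal sketch (not from the slides) of the owner computes rule applied to the output decomposition of matrix multiplication. Each of the four tasks owns one block Ci,j and performs all computation associated with it. For brevity the blocks are reduced to scalars; a real code would operate on submatrices and map each task to its own process.

/* Sketch of the owner computes rule for the output decomposition of
   matrix multiplication: the task that owns C[i][j] is the only one
   that writes it, computing Ci,j = Ai,1 B1,j + Ai,2 B2,j in full. */
#include <stdio.h>

int main(void) {
    double A[2][2] = {{1, 2}, {3, 4}};
    double B[2][2] = {{5, 6}, {7, 8}};
    double C[2][2];

    /* Four tasks, one per output block; task (i,j) owns C[i][j]. */
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            C[i][j] = A[i][0] * B[0][j] + A[i][1] * B[1][j];

    for (int i = 0; i < 2; i++)
        printf("%6.1f %6.1f\n", C[i][0], C[i][1]);
    return 0;
}

Because no two tasks ever write the same output element under this ownership discipline, Decompositions I and II above differ only in the order in which each owner accumulates its two partial products, not in which process touches which block of C.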