The ASC Language Part 2 of Associative Computing
Contents • References • The MST example and background • Variables and Data Types • Operator Notation • Input and Output • Mask Control Statements • Loop Control Statements • Accessing Values in Parallel Variables • Performance Monitor • Subroutines and other topics • Basic Program Structure • Software Location & Execution Procedures • An Online ASC Program & Data File to Execute • ASC Code for the MST algorithm for a directed graph • The Shortest Path homework problem
References • “ASC Primer” by Professor Jerry Potter is the primary reference • Copy is posted on lab website under “software” • Lab website is at www.cs.kent.edu/~parallel/ • ASC Primer is the reading assignment for this slide set • “Associative Computing” book by Jerry Potter has a lot of additional information. • Both references use a directed-graph version of the Minimal Spanning Tree as an important example.
Features of Potter’s MST Algorithm • Both versions of MST are based on Prim’s sequential MST algorithm • See one of Baase’s algorithm books in references • A drawback of this algorithm is that it requires 1 PE for each graph edge, which in the worst case is n^2 – n = Θ(n^2) • Unlike the earlier MST algorithm, this is not an “optimal cost” parallel algorithm • An advantage is that it works for undirected graphs • The earlier MST algorithm can probably be extended to work for directed graphs. • Uses less memory for most graphs than the earlier algorithm • True especially for sparse graphs • Often will require a total of only O(n) memory locations, since the memory required for each PE is a small constant. • In the worst case, at most O(n^2) memory locations are needed. • The earlier algorithm always requires Θ(n^2) memory locations, as each PE stores a row of the adjacency matrix.
Representing the Graph in a Procedural Language • We need to find edges that are incident to a node of the graph. What kind of data structure could be used to make this easy? Typically there are two choices: • An adjacency matrix • Label the rows and columns with the node names. • Put the weight w in row i and column j if i is incident to j with weight w. • Doing this we would use the representation of the graph in the problem as follows ...
[Figure: Graph Example for MST. An undirected graph on nodes A through I with integer edge weights.]
An Adjacency Matrix For the Graph in the Problem [Table: adjacency matrix with rows and columns labeled A through I.]
An Alternative Representation That Is Useful for Sequential Algorithms • Another possibility is to use adjacency lists, which can allow some additional flexibility for this problem in representing the rest of the data – namely the sets V1, V2, and V3. • It is in these types of representations that we see pointers or references play a role. • We link off of each node all of the nodes that are incident to it, keeping them in increasing order by label.
Adjacency Lists for the Graph in the Problem [Figure: the adjacency list linked off each node.] G, H, and I will have 4 entries; all others have 3. In each list, the nodes are in increasing order by node label. Note: if the node label ordering is clear, the A, B, ... need not be stored.
Adding the Other Information Needed While Finding the Solution • Consider one of the states during the run, right after the segment AG is selected. How will this data be maintained? [Figure: partial tree over nodes A, B, C, F, G, H, and I, partitioned into sets V1 and V2.]
We need to know: • Which set each node is in. • What each node’s parent is in the tree formed below by both collections. • A list of the candidate nodes. [Figure: the same partial tree over nodes A, B, C, F, G, H, and I, partitioned into sets V1 and V2.]
A Typical Data Structure Used For This Problem Is Shown • V2 elements are linked via yellow entries with V2lnk the head and the tail: I H C F • Light blue entries appeared in earlier states, but are no longer in use. • Red entries say what set the node is in. • Green entries give the parent of the node and orange entries give edge weights. • The adjacency lists are not shown, but are linked off to the right. [Figure: color-coded table with one entry per node A through I; V2lnk = I.]
I Is Now Selected and We Update • I is now in V1, so change its set value to 1. • Look at the nodes adjacent to I (E, F, G, H) and add them to V2 if they are in V3: E is added ... [Figure: the table for nodes A through I; V2lnk = I.]
E Was Just Added to V2 • Store I’s link H in E’s position and E in V2lnk. This makes I’s entry unreachable. • So V2 is now E H C F. • Now we have to add relevant edges for I to any node in V2. [Figure: the table for nodes A through I; V2lnk = E.]
Add Relevant Edges For I to V2 Nodes • Walk I’s adjacency list: E F G H. • E was just added, so select EI with weight 2. • wgt(FI) = 5 < wgt(FA) = 7, so drop FA and add FI (see black blocks). • G is in V1, so don’t add GI. • wgt(HI) = 4 > wgt(HG) = 3, so no change. • This is now ready for the next round. [Figure: the table for nodes A through I; V2lnk = E.]
Complexity Analysis (Time and Space) for Prim’s Sequential Algorithm • Assume • the preceding data structure is used. • The number of nodes is n • The number of edges is m • Space used is 4n plus the space for adjacency lists. • The adjacency lists are Θ(m), which in the worst case is Θ(n^2) • This data structure sacrifices space for time. • Time is Θ(n^2) in the worst case. • The adjacency list of each node is traversed only once, when the node is added to the tree. The total work of comparing weights and updating the chart during all of these traversals is Θ(m). • There are n-1 rounds, as one tree node is selected each round. • Walking the V2 list to find the minimum could require n-1 steps the first round, n-2 the second, etc., for a maximum of Θ(n^2) steps.
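The time bound above can be made concrete with a minimal Python sketch of Prim's algorithm that scans the candidate set V2 for its minimum each round. The function name `prim_mst`, the dictionary-based adjacency lists, and the four-node graph are illustrative assumptions, not taken from the slides or from the graph in the problem.

```python
def prim_mst(adj, start):
    """Prim's algorithm with a linear scan of the candidate set each round.

    adj maps each node to a list of (neighbor, weight) pairs.
    Returns the tree edges as (parent, node, weight) in selection order.
    """
    in_tree = {start}   # V1: nodes already in the tree
    best = {}           # V2: candidate node -> (weight, parent)
    for nbr, w in adj[start]:
        if nbr not in best or w < best[nbr][0]:
            best[nbr] = (w, start)

    tree = []
    while best:
        # Linear scan of V2 for the cheapest candidate: up to n-1 steps per round.
        node = min(best, key=lambda v: best[v][0])
        w, parent = best.pop(node)
        in_tree.add(node)
        tree.append((parent, node, w))
        # The adjacency list of the newly added node is walked exactly once.
        for nbr, w2 in adj[node]:
            if nbr in in_tree:
                continue
            if nbr not in best or w2 < best[nbr][0]:
                best[nbr] = (w2, node)
    return tree


# Hypothetical four-node graph, used only to exercise the sketch.
adj = {
    "A": [("B", 2), ("C", 5)],
    "B": [("A", 2), ("C", 1), ("D", 4)],
    "C": [("A", 5), ("B", 1), ("D", 3)],
    "D": [("B", 4), ("C", 3)],
}
tree = prim_mst(adj, "A")
total = sum(w for _, _, w in tree)
```

The scan inside the `while` loop is the Θ(n^2) term; the inner `for` loop contributes Θ(m) in total across all rounds.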
Alternate ASC Implementation of Prim’s Algorithm using this Approach • After setting up a data structure for the problem, we now need to code it by manipulating each state as we did on the preceding slides. • The ASC model provides an easier approach. • Recall that ASC does NOT support pointers or references. • Associative searching replaces the need for these. • Recall that we collectively think of the PE memories as a rectangular structure consisting of multiple records. • We will next introduce basic features of the ASC language in order to implement this algorithm.
Structuring the MST Data for ASC • There are 15 bidirectional edges in the graph, or 30 directed edges in total. • Each directed edge will have a head and a tail. • So, the bidirectional edge AB will be represented twice – once as having head A and tail B and once as having head B and tail A • We will use 30 processors, and in each PE’s memory we will store an edge representation as a record with the fields tail, head, weight, and state. • State is 0, 1, 2, or 3 and will be explained shortly.
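As a rough illustration of this layout, the following Python sketch expands a list of bidirectional edges into one record per directed edge, which is the role each PE plays. The class name `EdgeRecord`, the helper `directed_records`, the initial `state` of 0, and the three sample edges are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class EdgeRecord:
    """One PE's worth of data: a single directed edge."""
    tail: str
    head: str
    weight: int
    state: int = 0  # assumed initial value; the slides explain state later


def directed_records(undirected_edges):
    """Store each bidirectional edge twice, once in each direction."""
    records = []
    for u, v, w in undirected_edges:
        records.append(EdgeRecord(tail=u, head=v, weight=w))
        records.append(EdgeRecord(tail=v, head=u, weight=w))
    return records


# Three sample edges only; the full graph would give 15 edges and 30 records.
sample = [("A", "B", 2), ("A", "G", 5), ("B", "C", 4)]
pes = directed_records(sample)
```

Each element of `pes` corresponds to the memory of one PE, so the number of PEs needed is twice the number of bidirectional edges.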
ASC Data Types and Variables • ASC has nine data types: • int (i.e., integer), real, hex (i.e., base 16), oct (i.e., base 8), bin (i.e., binary), card (i.e., cardinal), char (i.e., character), logical, index. • Card is used for unsigned integer data. • Variables can be either scalar or parallel.
ASC Parallel Variables • Parallel variables reside in the memory of individual processors. • Consequently, tail, head, weight, and state will be parallel variables. • In ASC, parallel variables are declared using an array-like notation, with $ in index: char parallel tail[$], head[$]; int parallel weight[$], state[$];
ASC Scalar and Index Variables • Scalar variables in ASC reside in the IS (i.e., the front end computer), not in the PEs’ memories. • They are declared as: char scalar node; • Index variables in ASC are used to manipulate the index (i.e., the choice of an individual processor) of a field. For example, graph[xx] • They are declared as: index parallel xx[$]; • They occupy 1 bit of space per processor
Logical Variables and Constants • Logical variables in ASC are boolean variables. They can be scalar or parallel. • ASC does not formally distinguish between the index parallel and logical parallel variables • The correct type should be selected, based on usage. • If you prefer to work with the words TRUE and FALSE, you can define logical constants by deflog (TRUE, 1); deflog (FALSE, 0); • Constant scalars can be defined by define (identifier, value);
Logical Parallel Variables needed for MST • These are defined as follows: logical parallel nextnod[$], graph[$], result[$]; • The use of these will become clear in later slides. • For the moment, recognize they are just bit variables, one for each PE.
Array Dimensions • A parallel variable can have up to 3 dimensions • The first dimension is “$”, the parallel dimension • The array numbering is zero-based, so the declaration int parallel A[$,2] creates the following 1-dimensional variables: A[$,0], A[$,1], A[$,2]
Mixed Mode Operations • Mixed mode operations are supported and their result has the “natural” mode. For example, given declarations int scalar a, b, c; int parallel p[$], q[$], r[$], t[$,4]; index parallel x[$], y[$]; then • c = a + b is a scalar integer • q[$] = a + p[$] is a parallel integer variable • a + p[x] is an integer value • r[$] = t[x,2]+3*p[$] is a parallel integer variable • x[$] = p[$] .eq. r[$] is an index parallel variable • More examples are given on pages 9-10 of the ASC Primer
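The “natural” result modes can be mimicked in plain Python, with a list standing in for a parallel variable (one element per PE) and an int for a scalar. This is only a simulation sketch: the helper `first_responder`, the data values, and the assumption that every PE is active are not from the slides.

```python
a = 10                # scalar
p = [1, 2, 3, 4]      # parallel: one value per PE
r = [1, 5, 3, 0]      # parallel
x = [0, 1, 0, 1]      # index parallel: one bit per PE


def first_responder(bits):
    """Index of the lowest-numbered PE whose bit is 1 (assumed pick-one rule)."""
    return bits.index(1)


# q[$] = a + p[$]: scalar combined with parallel gives a parallel result.
q = [a + pi for pi in p]

# a + p[x]: indexing a parallel variable by an index variable gives a scalar.
s = a + p[first_responder(x)]

# x[$] = p[$] .eq. r[$]: a parallel comparison gives one bit per PE.
eq = [int(pi == ri) for pi, ri in zip(p, r)]
```

The scalar `a` is effectively broadcast to every PE in the parallel expression, while `p[x]` collapses the parallel variable to a single value.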
The Memory Layout for MST • As with most programming languages, the order of the declarations determines the order in which the variables are laid out in memory. • To illustrate, suppose we declare for MST char parallel tail[$], head[$]; int parallel weight[$], state[$]; int scalar node; index parallel xx[$]; logical parallel nextnod[$], graph[$], result[$]; • The layout in the memories is given on the next slide • Integers default to the word size of the machine, so ours would be 32 bits.
The Memory Layout for MST [Figure: one row per PE (0, 1, 2, 3, 4, ..., p-1, p) with the columns tail, head, weight, state, xx, nxt, gr, res.] The last 4 are bit fields. The last 3 are named nxtnod, graph, and result.
Operator Notation
Relational and Logical Operators • The original syntax came from FORTRAN and the examples in the ASC Primer use that syntax. • However, the more modern syntax is supported:
.lt.   <        .not.   !
.gt.   >        .or.    ||
.le.   <=       .and.   &&
.ge.   >=       .xor.   --
.eq.   ==
.ne.   !=
Arithmetic Operators: addition +, multiplication *, division /
Parallel Input in ASC • Input for parallel variables can be interactive or from a data file in ASC. • We will run in a command window so file input will be handled by redirection • If you are not familiar with command window handling or Linux (Unix), this will be shown. • In either case, the data is entered in columns just like it will appear in the read command. • Do not use tabs. • THE LAST LINE MUST BE A BLANK LINE!
Parallel read and Associate Command • The format of the parallel read statement is read parvar1, parvar2,... in <logical parallel var> • The command only works with parallel variables, not scalars. • Input variables must be associated with a logical parallel variable before the read statement. • The logical variable is used to indicate which PEs were used on input. After the read statement, the logical parallel variable will be true (i.e., 1) for all processors holding input values.
Parallel Input in ASC • The associate command and the read command for MST would be: associate head[$], tail[$], weight[$], state[$] with graph[$]; read tail[$], head[$], weight[$] in graph[$]; • Blanks can be used rather than commas, as indicated by the MST example on pg 35 of the Primer. • Commenting Code: /* This is the way to comment code in ASC */
Input of Graph • Suppose we were just entering the data for AB, AG, AF, BA, BC, and BG. • Order is not important, but the data file would look like:
A B 2
A G 5
A F 9
B A 2
B C 4
B G 6
(blank line)
and memory would look like:
tail  head  weight  graph
A     B     2       1
A     G     5       1
A     F     9       1
B     A     2       1
B     C     4       1
B     G     6       1
-     -     -       0
-     -     -       0
-     -     -       0
-     -     -       0
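The effect of the parallel read on memory can be imitated with a short Python sketch: one input line goes to each PE in turn, and the logical variable is set to 1 for every PE that received data. The function name `parallel_read`, the choice of 10 PEs, and the use of `None` for unused fields are assumptions of this simulation, not ASC behavior documented in the slides.

```python
def parallel_read(text, num_pes):
    """Simulate reading tail, head, weight into successive PEs.

    Returns four lists of length num_pes; graph[i] is 1 where PE i holds input.
    A blank line ends the input, mirroring the required trailing blank line.
    """
    tail = [None] * num_pes
    head = [None] * num_pes
    weight = [None] * num_pes
    graph = [0] * num_pes
    for pe, line in enumerate(text.splitlines()):
        if not line.strip():
            break                       # blank line terminates the data
        if pe >= num_pes:
            raise ValueError("more input records than PEs")
        t, h, w = line.split()          # columns are separated by blanks
        tail[pe], head[pe], weight[pe] = t, h, int(w)
        graph[pe] = 1
    return tail, head, weight, graph


data = "A B 2\nA G 5\nA F 9\nB A 2\nB C 4\nB G 6\n\n"
tail, head, weight, graph = parallel_read(data, 10)
```

After the call, the first six PEs hold one edge each and the remaining four keep `graph` at 0, matching the memory picture above.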
Scalar variable input • Static input can be handled in the code. • Also, define or deflog statements can be used to handle static input. • Dynamic input is currently not supported directly, but can be accomplished as follows: • Reserve a parallel variable dummy (of the desired type) for input. • Reserve a parallel index variable used. • Values to be stored in scalar variables are first read into dummy using a parallel read and then transferred using get or next to the appropriate scalar variable. • Example: read dummy[$] in used[$]; get x in used[$] scalar-variable = dummy[x]; endget x;
Input Summary • Scalar input is not directly supported. • Scalars can be set as constants or can be set during execution using various commands. • We will see this shortly • We will be able to output scalar variables • This will also be handy for debugging purposes. • The main problem on input is remembering to include the blank line at the end. • I suggest always printing your input data initially so you can see that it is going in properly.
Parallel Variable Output • Format for parallel print statement is print parvar1, parvar2,... in <logical parallel var> • Again, variables to be displayed must be associated with a logical parallel variable first. • You can use the same association as for the read command: associate tail[$], head[$], weight[$] with graph[$]; read tail[$], head[$], weight[$] in graph[$]; print tail[$], head[$], weight[$] in graph[$]; • You can use a logical parallel variable that has been set with another statement, like an IF statement, to control which PEs will output data.
MST Example • Suppose state[$] holds information about whether a node is in V1, V2, etc. • Then, you could set up an association by if (state[$] == 1) then result[$] = TRUE; endif; • You can print with this association as follows: print tail[$], head[$], weight[$] in result[$]; • Only those records where state[$] == 1 would be printed.
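A small Python simulation may help show how an association restricts output. The helper `parallel_print` and the four sample records are assumptions; the sketch only mirrors the idea that a logical parallel variable selects which PEs print.

```python
tail = ["A", "A", "B", "B"]
head = ["B", "G", "A", "C"]
weight = [2, 5, 2, 4]
state = [1, 0, 1, 2]

# if (state[$] == 1) then result[$] = TRUE; endif;
# result is assumed to start at 0 in every PE.
result = [0] * len(state)
for pe, s in enumerate(state):
    if s == 1:
        result[pe] = 1


def parallel_print(mask, *fields):
    """Return the rows of the given fields for PEs whose mask bit is 1."""
    return [tuple(f[pe] for f in fields) for pe, bit in enumerate(mask) if bit]


# print tail[$], head[$], weight[$] in result[$];
rows = parallel_print(result, tail, head, weight)
```

Only the records whose `state` is 1 appear in `rows`; the other PEs are skipped without any explicit loop over conditions at print time.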
Output Using “msg” • The msg command • Used to display user text messages. • Used to display values of scalar variables. • Used to display a dump of the parallel variables. • The entire contents of the parallel variable are printed • The status of active responders or association variables is ignored • Format: msg “string” list; • Example: msg “The answers are” max BB[X] B[$]; • See pages 13-14 of the ASC Primer
Assignment Statements • Assignment can be made with compatible expressions using the equal sign with • scalar variables • parallel variables • logical parallel variables • The data types normally have to be the same on both sides of the assignment symbol – i.e. don’t mix scalar and parallel variables. • A few special cases are covered on the next slide
Some Assignment Statement Special Cases • Declarations for examples: int scalar k; int parallel aa[$], b[$]; index parallel xx[$]; • If xx is an index variable with a 1 in at least one of its components, then the following is valid: k = aa[xx] + 5; • Here, the component of aa used is one where xx is 1. • While the selection is arbitrary (e.g., pick-one), this implementation selects the smallest index where xx[$] is 1. • The assignment of integer arithmetic expressions to integer parallel variables is supported: b[xx] = 3 + 5; • This statement assigns an 8 to the “xx component” of b. • The component selected is identified by the first “1” in xx. • See pgs 9-10 of the Primer for more examples.
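The pick-one behavior described above can be sketched in Python as follows. The helper `pick_one` and the data values are hypothetical; the sketch follows the slide's statement that this implementation selects the smallest index where the index variable is 1.

```python
aa = [4, 9, 6, 1]
b = [0, 0, 0, 0]
xx = [0, 1, 1, 0]   # index parallel variable with two responders


def pick_one(bits):
    """Return the smallest PE index whose bit is 1."""
    for pe, bit in enumerate(bits):
        if bit:
            return pe
    raise ValueError("index variable has no 1 in any component")


# k = aa[xx] + 5;
k = aa[pick_one(xx)] + 5

# b[xx] = 3 + 5;
b[pick_one(xx)] = 3 + 5
```

Although `xx` has two responders, only PE 1 is read from and written to, since it is the first component holding a 1.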
Example: aa[$] = b[$] + c[$]; (see ASC Primer, pgs 29-30 and 39)
Before:
mask  aa[$]  b[$]  c[$]
1     2      3     4
1     3      5     3
0     2      4     -3
0     6      4     1
1     2      -3    -6
After:
mask  aa[$]  b[$]  c[$]
1     7      3     4
1     8      5     3
0     2      4     -3
0     6      4     1
1     -9     -3    -6
Note: “a” is a reserved word in ASC and so it can’t be used as a variable name.
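The masked parallel assignment in this example can be reproduced with a few lines of Python, using the same values as the Before and After tables. The helper name `masked_assign` is an assumption; it simply applies the new value in PEs whose mask bit is 1 and leaves the others untouched.

```python
def masked_assign(mask, dest, values):
    """Return dest with values written only where the mask bit is 1."""
    return [v if m else d for m, d, v in zip(mask, dest, values)]


mask = [1, 1, 0, 0, 1]
aa = [2, 3, 2, 6, 2]
b = [3, 5, 4, 4, -3]
c = [4, 3, -3, 1, -6]

# aa[$] = b[$] + c[$];
aa = masked_assign(mask, aa, [bi + ci for bi, ci in zip(b, c)])
```

The two inactive PEs keep their old `aa` values of 2 and 6, even though `b + c` would have given 1 and 5 there.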
Setscope Mask Control Statement • Format: setscope <logical parallel variable> body endsetscope; • Resets the parallel mask register • setscope jumps out of the current mask setting to the new mask given by its logical parallel variable. • One use is to reactivate currently inactive processors. • Also allows an immediate return to a previously calculated mask, such as an association. • Is an unstructured command, like go-to, and jumps from the current environment to a new environment. • Use sparingly • endsetscope resets the mask to the preceding setting.
Example: logical parallel used[$]; ... used[$] = aa[$] == 5; setscope used[$] tail[$] = 100; endsetscope;
Before setscope:
mask  aa  used  tail
1     5   1     7
1     22  0     6
1     5   1     9
0     41  0     7
After setscope (mask = used):
aa  mask  tail
5   1     100
22  0     6
5   1     100
41  0     7
After endsetscope:
aa  mask  tail
5   1     100
22  1     6
5   1     100
41  0     7
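The save-and-restore behavior of setscope maps naturally onto a Python context manager. The `Machine` class, its `assign` method, and the use of `contextlib` are assumptions made for this simulation; the data values are the ones in the example tables.

```python
from contextlib import contextmanager


class Machine:
    """Minimal stand-in for the PE array: it only tracks the mask register."""

    def __init__(self, mask):
        self.mask = list(mask)

    @contextmanager
    def setscope(self, new_mask):
        saved = self.mask               # remember the preceding mask
        self.mask = list(new_mask)      # jump to the new mask
        try:
            yield
        finally:
            self.mask = saved           # endsetscope restores the old mask

    def assign(self, dest, value):
        """Write value into dest for every active PE."""
        for pe, bit in enumerate(self.mask):
            if bit:
                dest[pe] = value


m = Machine([1, 1, 1, 0])
aa = [5, 22, 5, 41]
tail = [7, 6, 9, 7]

# used[$] = aa[$] == 5;  (computed only in currently active PEs)
used = [int(bit and a == 5) for bit, a in zip(m.mask, aa)]

with m.setscope(used):
    inside = list(m.mask)
    m.assign(tail, 100)                 # tail[$] = 100;
```

Inside the block the mask equals `used`; once the block ends the original mask is back in place, as in the After endsetscope table.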
The Scalar IF Statement • Scalar IF – similar to what you have used before – i.e. a branching statement – with the else part optional. Example: int scalar k; ... if k == 5 then sum =0; else b = sum; endif;
The Parallel IF Mask Control Statement • Looks like scalar IF except instead of a scalar logical expression, a parallel logical expression is encountered. • Format: if <logical parallel expression> then <body of then> [ else <body of else> ] endif • Although it looks similar, the execution is considerably different. • The parallel version normally executes both “bodies”, each for the appropriate processors • Useful as a parallel search control statement
Operation Steps of Parallel IF • Step 1: Save the mask bit of processors that are currently active. • Step 2: Broadcast code to the active processors to calculate the IF boolean expression. • Step 3: If the boolean expression is true for an active processor, set its individual cell mask bit to TRUE; otherwise set its mask bit to FALSE. • Step 4: Broadcast code for the “then” portion of the IF statement and execute it on the (TRUE) responders. • Step 5: Complement the mask bits for the processors that were active at Step 1. (Others remain FALSE) • Step 6: Broadcast code for the “else” portion of the IF statement and execute it on the active processors. • Step 7: Reset the mask to the original mask from Step 1.
Example: if (b[$] == 1) then b[$] = 2; else b[$] = -1; endif;
Before:
b  mask
1  1
7  1
2  1
1  1
1  0
After:
b   then mask  else mask
2   1          0
-1  0          1
-1  0          1
2   1          0
1   0          0
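The operation steps of the parallel IF can be traced with a short Python simulation that reproduces this example. The function `parallel_if` and the closure `set_b` are hypothetical names; the masks are plain lists of 0 and 1, and the condition is evaluated once before either body runs.

```python
def parallel_if(mask, cond, then_body, else_body):
    """Simulate a parallel IF; returns (restored mask, then mask, else mask)."""
    saved = list(mask)                                          # save the mask
    then_mask = [int(m and c) for m, c in zip(saved, cond)]     # responders
    then_body(then_mask)                                        # run "then"
    else_mask = [int(m and not t) for m, t in zip(saved, then_mask)]
    else_body(else_mask)                                        # run "else"
    return saved, then_mask, else_mask                          # mask restored


b = [1, 7, 2, 1, 1]
mask = [1, 1, 1, 1, 0]


def set_b(value):
    """Build a body that stores value into b on the active PEs."""
    def body(active):
        for pe, on in enumerate(active):
            if on:
                b[pe] = value
    return body


# if (b[$] == 1) then b[$] = 2; else b[$] = -1; endif;
cond = [bi == 1 for bi in b]
restored, then_mask, else_mask = parallel_if(mask, cond, set_b(2), set_b(-1))
```

Both bodies run, each on its own subset of the originally active PEs, and the last PE is untouched because it was inactive before the IF.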
IF-NOT-ANY Format if <logical parallel expression> then body of if elsenany body of not any endif; • Note this is an “if” statement with an embedded ELSENANY clause. • The “any”, “elsenany”, and “nany” commands provide support for the “AnyResponders” associative property in ASC.
The IF-NOT-ANY Mask Control Statement • Only one part of the IF statement is executed. • Useful as a parallel search control statement • Steps • Evaluate the conditional statement. • If there are one or more active responders, execute the “then” block. • If there are no active responders, the ELSE-NOT-ANY (ELSENANY) block is executed. • When executing the ELSENANY part, the original mask is used – i.e. the one prior to the IF-NOT-ANY statement.
Example: if aa[$] > 1 && aa[$] < 4 then /* sets mask */ if b[$] == 12 then c[$] = 1; /* search for b == 12 */ elsenany c[$] = 9; /* action if no b is 12 */ endif; endif;
Before:
aa  b   c
1   17  0
2   13  0
2   8   0
3   12  0
2   9   0
4   67  0
0   0   0
0   12  0
After:
mask1  mask2  aa  b   c
0      0      1   17  0
1      0      2   13  0
1      0      2   8   0
1      1      3   12  1
1      0      2   9   0
0      0      4   67  0
0      0      0   0   0
0      0      0   12  0
Recall: the “then” block uses the set mask (mask2).
Example: if aa[$] > 1 && aa[$] < 4 then /* sets mask */ if b[$] == 12 then c[$] = 1; /* search for b == 12 */ elsenany c[$] = 9; /* action if no b is 12 */ endif; endif;
Before:
aa  b   c
1   17  0
2   13  0
2   8   0
3   4   0
2   9   0
4   67  0
0   0   0
0   12  0
After:
mask1  mask2  aa  b   c
0      0      1   17  0
1      0      2   13  9
1      0      2   8   9
1      0      3   4   9
1      0      2   9   9
0      0      4   67  0
0      0      0   0   0
0      0      0   12  0
Recall: the ELSENANY block uses the original mask (mask1).
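Both IF-NOT-ANY examples can be checked with a small Python simulation. The names `if_elsenany` and `run` are assumptions, all PEs are assumed active before the outer IF, and the outer IF has no else part, so it only narrows the mask.

```python
def if_elsenany(mask, cond, then_body, nany_body):
    """Run then_body on the responders, or nany_body on the original mask."""
    responders = [int(m and c) for m, c in zip(mask, cond)]
    if any(responders):
        then_body(responders)       # at least one responder: "then" block only
    else:
        nany_body(mask)             # no responders: use the mask before the IF
    return responders


def run(aa, b):
    """Execute the nested example and return (mask1, mask2, c)."""
    c = [0] * len(aa)

    def assign(value):
        def body(active):
            for pe, on in enumerate(active):
                if on:
                    c[pe] = value
        return body

    # Outer IF: aa[$] > 1 && aa[$] < 4 sets the mask for the inner statement.
    mask1 = [int(1 < x < 4) for x in aa]
    mask2 = if_elsenany(mask1, [v == 12 for v in b], assign(1), assign(9))
    return mask1, mask2, c


aa = [1, 2, 2, 3, 2, 4, 0, 0]
found = run(aa, [17, 13, 8, 12, 9, 67, 0, 12])    # one masked PE has b == 12
missing = run(aa, [17, 13, 8, 4, 9, 67, 0, 12])   # no masked PE has b == 12
```

In the second run the last PE does hold b == 12, but it lies outside mask1, so it does not count as a responder and the ELSENANY block executes on the four PEs of mask1.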