
an API for Asynchronous Parallel Programming User's Guide


Contents

1-9. [Rotated source listings; text not recoverable from the scan.]
10. [Rotated source listing; text not recoverable from the scan.]

Figure 6: Execution graph corresponding to a = 0, b = 1 and h = ...

9.3 Scalar Product

This example shows the use of a cumulative shared and of an array of parameters. Parameter arrays are recursively split until their sizes are 1. Then the result is accumulated in a shared data.

[Rotated source listing; text not recoverable from the scan.]
11-12. [Rotated source listings; text not recoverable from the scan.]
13. [Rotated source listing; text not recoverable from the scan.]

9.2 Adaptive Quadrature: Integration by Newton-Cotes

The
14. The declaration of a global data has the following prototype:

    a1_mem_cc<T> x(T* pval);   // for causal consistency
    a1_mem_pc<T> x(T* pval);   // for processor consistency
    a1_mem_ac<T> x(T* pval);   // for asynchronous consistency

The type T must be communicable, and pval is a pointer to the initial value of the object. This pointer can be null. It is entirely managed by the system; that is, the programmer must consider the pointer lost once it has been passed. This declaration is permitted anywhere in the code. An object of type a1_mem can be used as a parameter of a task or declared globally. If received as a parameter, the scope of the variable is limited to the procedure's body, whereas it has the scope of the entire code if it is declared globally.

8.3 Registration

Due to some implementation characteristics, global data have to be registered, which links all the representatives (one per processor) to the same global data. If the object is used in task parameters, this registration is made automatically. Otherwise, if the object is declared globally, the registration must be performed manually by invoking the register(pval) method on each object during the initialization phase, between the a1_system::init() and a1_system::init_commit() invocations. The pval parameter is a pointer to the initial value taken by the object. This pointer can be null and is e
15. [Rotated source listing; text not recoverable from the scan.]
16. [Rotated source listing; text not recoverable from the scan.]

11 Frequently Asked Questions

This section contains a list of frequently asked questions about ATHAPASCAN and some attempts at answering them. Please feel free to send us any questions that would enable us to enlarge this section.

Q: On which systems does ATHAPASCAN run?
A: Currently ATHAPASCAN has been test
17.

        a1::Fork<add>()(i, 3);
        return 0;
    }

    int main(int argc, char** argv) {
        // MAIN METHOD AS PREVIOUSLY DEFINED IN THE API CHAPTER
        return 0;
    }

Figure 2: Demonstration of associativity and commutativity of cumulative mode.

It may seem as though the program was implemented according to the sequential depth-first algorithm:

    a1::Fork<add>()(i, 2);
    a1::Fork<add>()(i, 1);

This is not the case. Naturally, the above code is semantically correct as well and could produce the same result as that program. It is therefore important to realize that, since the function F is associative and commutative, the precise manner in which the reductions are performed cannot be predicted, even in the case where initial values are known.

7.3 Shared Access Modes

In order to improve the parallelism of a program when only a reference to a value is required, and not the value itself, ATHAPASCAN refines its access rights to include access modes. An access mode categorizes data by restricting certain types of access to the data. By default, the access mode of a shared data object is immediate, meaning that the task may access the object using any of the write, read, access, or cumul methods during its execution. An access is said to be postponed (access right suffixed by p) if the procedure will not directly perform an access on the shared data but will instead create other tasks that may ac
18.

    1. Athapascan-0 is the communication layer, based upon MPI and POSIX threads; the extension that is independent from the transport library is called INUKTITUT.
    2. ATHAPASCAN-1 is the user-end API.
    3. ATHAPASCAN also contains visualization tools for debugging purposes.

ATHAPASCAN is a high-level interface in the sense that no reference is made to the execution support. The synchronization, communication, and scheduling of operations are fully controlled by the software.

ATHAPASCAN is an explicit parallelism language: the programmer indicates the parallelism of the algorithm through ATHAPASCAN's two easy-to-learn template functions, Fork and Shared. The programming semantics are similar to those of a sequential execution, in that each read executed in parallel returns the value it would have returned had the read been executed sequentially.

ATHAPASCAN is implemented by an easy-to-use C++ interface. Therefore, any code written in either the C or C++ languages can be directly recycled in ATHAPASCAN.

The ATHAPASCAN interface provides a data-flow language. The program execution is data driven and determined by the availability of the shared data, according to the access made. In other words, a task requesting a read access on shared data will wait until the previous task processing a write operation to this data has ended.

ATHAPASCAN is portable and efficient. The portability is inherited from the Athapascan-0 communication layer of the environment, whi
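The data-flow rule stated above (a read waits for the preceding write) can be illustrated without ATHAPASCAN itself. The following is only a minimal sketch in standard C++, assuming that a task can be modeled as a pair of closures: a readiness test and a body. The names Task and run_dataflow are hypothetical and are not part of the ATHAPASCAN API. The loop deliberately scans the tasks from last to first, so the reader is examined before the writer, yet it only runs once the datum has been written.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical model of a task; this is not an ATHAPASCAN type.
struct Task {
    std::function<bool()> ready;  // true once every input is available
    std::function<void()> run;    // body of the task
};

// Runs one writer task and one reader task on a single datum and
// returns the value observed by the reader.
int run_dataflow(int written_value) {
    bool written = false;  // availability flag of the shared datum
    int datum = 0;         // the shared datum itself
    int seen = -1;         // what the reader observed

    std::vector<Task> tasks;
    // Program order: the write comes first, the read second.
    tasks.push_back(Task{[] { return true; },
                         [&] { datum = written_value; written = true; }});
    tasks.push_back(Task{[&] { return written; },
                         [&] { seen = datum; }});

    std::vector<bool> done(tasks.size(), false);
    std::size_t remaining = tasks.size();
    while (remaining > 0) {
        bool progressed = false;
        // Scan in reverse so the reader is considered before the writer.
        for (std::size_t k = tasks.size(); k-- > 0;) {
            if (!done[k] && tasks[k].ready()) {
                tasks[k].run();
                done[k] = true;
                --remaining;
                progressed = true;
            }
        }
        if (!progressed) break;  // no runnable task: stop rather than spin
    }
    return seen;
}
```

Calling run_dataflow(7) hands 7 to the writer; the reader is skipped on the first pass because its input is not yet available, and it observes the written value on the second pass. Execution order follows data availability, not the order in which the scheduler happens to look at the tasks.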
19.

    Shared<myArray<T> >&) *this, (a1::Shared<myArray<T> >&) t2);
        }

        void swap(int i1, int i2) {
            a1::Fork<swap_shared_array<T> >()((a1::Shared<myArray<T> >&) *this, i1, i2);
        }
    };

    // ostream operator
    template <class T>
    ostream& operator<<(ostream& out, const shared_array<T>& z) {
        a1::Fork<ostream_shared_array<T> >()((a1::Shared<myArray<T> >) z);
        return out;
    }

The following main file tests the shared class. As you can see, there is no more reference to specific parallel code.

    #include "athapascan-1.h"
    #include "sharedArray.h"
    #define SIZE 100

    int doit(int argc, char** argv) {
        shared_array<int> t1(10), t2(20);
        myArray<int> tab(SIZE);

        // fill the array
        for (int i = 0; i < SIZE; i++)
            tab.elts[i] = i;

        // resize the shared array to test the methods
        t1.resize(SIZE);
        // move the data to the shared array
        t1 = tab;
        // try to swap a data
        t1.swap(2, 27);
        // append another shared array
        t2 = tab;
        t1.append(t2);

        cout << t1 << endl;
        return 0;
    }

    int main(int argc, char** argv) {
        // MAIN METHOD AS PREVIOUSLY DEFINED
        return 0;
    }

8 Other Global Memory Paradigm

Access to shared data involves task synchronization: tasks are unable to perform side effects. In some applications li
20.

    i + t2.read().size;
            t1.access().resize(k);
            for (int j = i; j < k; j++)
                t1.access().elts[j] = t2.read().elts[j - i];
        }
    };

    // swap two elements of a shared array
    template <class T>
    struct swap_shared_array {
        void operator()(a1::Shared_r_w<myArray<T> > tab, int i1, int i2) {
            T temp;
            temp = tab.access().elts[i1];
            tab.access().elts[i1] = tab.access().elts[i2];
            tab.access().elts[i2] = temp;
        }
    };

    // print the data of a shared array to standard output
    template <class T>
    struct ostream_shared_array {
        void operator()(a1::Shared_r<myArray<T> > tab) {
            unsigned int size = tab.read().size;
            for (unsigned int i = 0; i < size; i++)
                cout << tab.read().elts[i] << " ";
        }
    };

    template <class T>
    class shared_array : public a1::Shared<myArray<T> > {
    public:
        // constructors
        shared_array() : a1::Shared<myArray<T> >(new myArray<T>()) {}
        shared_array(unsigned int size) : a1::Shared<myArray<T> >(new myArray<T>(size)) {}

        void resize(unsigned int newSize) {
            a1::Fork<resize_shared_array<T> >()((a1::Shared<myArray<T> >&) *this, newSize);
        }

        void operator=(const myArray<T>& a) {
            a1::Fork<equal_shared_array<T> >()((a1::Shared<myArray<T> >&) *this, a);
        }

        void append(shared_array& t2) {
            a1::Fork<append_shared_array<T> >()((a1::
21. [Rotated source listing; text not recoverable from the scan.]
22. 6.3.3 Example 3: Resizable Array

A simple example of a dynamic structure is a mono-dimensional array with two fields: a size, size, and a pointer to an array with size number of elements.

    #include <iostream.h>
    #include <stdlib.h>
    #include <string.h>
    #include "athapascan-1.h"

    /* class myArray is an Athapascan-1 communicable class implementing
       a resizable array. NB: T has to be communicable too. */
    template <class T>
    class myArray {
    public:
        unsigned int size;  // the size of the myArray
        T* elts;            // ith entry is accessed by elts[i]

        // empty constructor
        myArray() : size(0), elts(0) {}

        // constructor
        myArray(unsigned int k) : size(k) {
            if (size == 0) elts = 0;
            else elts = new T[size];
        }

        // copy constructor
        myArray(const myArray<T>& a) {
            size = a.size;
            elts = new T[size];
            for (unsigned int i = 0; i < size; i++) elts[i] = a.elts[i];
        }

        // destructor
        ~myArray() { delete[] elts; }

        // resize the myArray
        void resize(unsigned int newSize) { ... }
    };

    // Packing operator
    template <class T>
    a1_ostream& operator<<(a1_ostream& out, const myArray<T>& z) {
        out << z.size;
        for (int i = z.size - 1; i >= 0; i--) out << z.elts[i];
        return out;
    }

    // Unpacking operator
    template <class T>
    a1_istream& operator>>(a1_istream& in, myArray<T>& z) {
        in >> z.size;
        z.elts = new T[z.size];
        for (int
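A communicable class relies on its packing and unpacking operators mirroring each other: the size is written first, and the elements are read back in the same order in which they were written. The following is a rough sketch of that invariant in standard C++, using std::stringstream as a stand-in for a1_ostream and a1_istream. The names IntArray and round_trip are hypothetical, and a text stream with space separators is an assumption made only so that the example runs on its own.

```cpp
#include <cstddef>
#include <sstream>
#include <vector>

// Hypothetical stand-in for a communicable array of int.
struct IntArray {
    std::vector<int> elts;
};

// Packing: write the size first, then every element in index order.
std::ostream& operator<<(std::ostream& out, const IntArray& z) {
    out << z.elts.size();
    for (std::size_t i = 0; i < z.elts.size(); ++i) out << ' ' << z.elts[i];
    return out;
}

// Unpacking: read the size, allocate, then read the elements in the
// same order in which the packing operator wrote them.
std::istream& operator>>(std::istream& in, IntArray& z) {
    std::size_t n = 0;
    in >> n;
    z.elts.assign(n, 0);
    for (std::size_t i = 0; i < n; ++i) in >> z.elts[i];
    return in;
}

// Packs a into a buffer and unpacks it into a fresh object.
IntArray round_trip(const IntArray& a) {
    std::stringstream buffer;
    buffer << a;
    IntArray b;
    buffer >> b;
    return b;
}
```

A round trip through the buffer should return an array equal to the original; if the two operators disagreed on element order or omitted the size, the copy received by a remote task would differ from the one that was sent.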
23. [Rotated source listing; text not recoverable from the scan.]
24. [Rotated source listing; text not recoverable from the scan.]

10 Culminating Example: lifegame.cpp

The lifegame program was developed to provide a visualization of the asynchronous task execution performed by ATHAPASCAN. This program serves as an example for most of the concepts covered
25. [Rotated source listing; text not recoverable from the scan.]
26. bad implementation, as it has an exponential complexity, O(2^n), as opposed to the linear time complexity of other algorithms. However, this approach is easy to understand and to parallelize.

Sequential implementation

First, let's have a glance at the regular recursive sequential program:

    #include <iostream.h>
    #include <stdlib.h>

    int fibonacci(int n) {
        if (n < 2)
            return n;
        else
            return fibonacci(n - 1) + fibonacci(n - 2);
    }

    int main(int argc, char** argv) {
        // make sure there is one parameter
        if (argc < 2) {
            cout << "usage: fibonacci <N>" << endl;
            exit(1);
        }
        cout << "result: " << fibonacci(atoi(argv[1])) << endl;
        return 0;
    }

Parallel implementation

We assume that you wish to make this program parallel. An easy way to do it would be to use the same (recursive) algorithm. Well, it is not that easy, for two reasons:

1. in the sequential program we used a function, while ATHAPASCAN only supports procedures (void functions);
2. shared data can be accessed only from a task having this data as a parameter (for example, you cannot display the value of a Shared from the main).

To write this parallel program we will then need:

• a task doing the same job the sequential function was doing (recursive);
• a task to add the results of the two recursive calls to fibonacci;
• a task to print the result to stdout.

    #include <athapascan-1>
    #include <iostream
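Before bringing in ATHAPASCAN, it may help to see the first restriction on its own: every function becomes a void procedure that returns its result through a parameter. The following is a purely sequential sketch in standard C++ of that reshaping. It uses no ATHAPASCAN call, and the names fibonacci_task, sum_task and fibonacci_value are hypothetical. In the parallel version, the int& parameters would become shared data and each call would become a forked task.

```cpp
#include <cassert>

// Plays the role of the task that adds the two partial results.
void sum_task(int& result, int a, int b) {
    result = a + b;
}

// Procedure form of fibonacci: the value comes back through an
// output parameter instead of a return value.
void fibonacci_task(int n, int& result) {
    assert(n >= 0);
    if (n < 2) {
        result = n;
        return;
    }
    int r1 = 0;
    int r2 = 0;
    fibonacci_task(n - 1, r1);  // first recursive call
    fibonacci_task(n - 2, r2);  // second recursive call
    sum_task(result, r1, r2);   // combine the partial results
}

// Convenience wrapper so the result can be inspected from ordinary code.
int fibonacci_value(int n) {
    int result = 0;
    fibonacci_task(n, result);
    return result;
}
```

The two recursive calls do not depend on each other, only sum_task depends on both; this is the dependency structure that the three tasks listed above express.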
27. [Rotated source listing; text not recoverable from the scan.]
28. in this manual: passing and declaration of Shared data, Forking, user-defined structures, the communicability of user-defined classes, and the internal scheduling of tasks by ATHAPASCAN.

The program as a whole can be divided into two parts: the simulation (lifegame.cpp, Message.cpp, Message.h) and the visualization (SappeJuggler.cpp, SappeJuggler.h, NJSocket.cpp, NJSocket.h, GOLApp.cpp, GOLApp.h). The simulation, in particular lifegame.cpp, uses ATHAPASCAN to parallelize the code. The Message class defined by the other two files sends the information needed for the visualization through the sockets it creates. The visualization portion of this project contains no parallel code and is only used for receiving the messages sent by Message.cpp and generating a graphic output with OpenGL from the information received.

lifegame.cpp creates a matrix of cells characterized by a boolean state, an integer time, and two integer coordinates (x, y), as defined in the cell_state class. Given the state of the current cell and the current state of the cells surrounding it, the program calculates a new state for the current cell, updates the time variable, and sends this information as a message through a socket to the visualization. The visualization receives this message and displays the matrix of cells. Each cell is displayed with a color corresponding to the time at which the cell was updated and the information sent. When running this program in ATHAPASCAN's differen
29. object gets the method T& access() that returns a reference on the data contained by the shared referenced by x. Note that &x.access() is constant during all the execution of the task and cannot be changed by the user.

Example:

    struct incr {
        void operator()(a1::Shared_r_w<int> n) {
            n.access() = n.access() + 1;
        }
    };

7.2.4 Accumulation Right: Shared_cw

a1::Shared_cw<F, T> is the type of a parameter whose value can only be accumulated with respect to the user-defined function class F. F is required to have the prototype:

    struct cumul_fn {
        void operator()(T& x, const T& y) {
            ...  // body to perform x <- accu(x, y)
        }
    };

Example:

    struct add {
        void operator()(int& x, const int& y) { x += y; }
    };

The resulting value of the concurrent write is an accumulation of all other values written by a call to this function. After the first accumulation operation is executed, the initial value of x becomes either the previous value or remains the current value, depending on the lexicographic access order. If the shared object has not been initialized, then no initial value is considered. Since an accumulation enables a concurrent update of some shared object, the accumulation function F is assumed to be both associative and commutative. Note that only the accumulations performed with respect to a same law F can be concurrent. If different accumulation functions are used on a single shared datum, the seq
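The requirement that F be associative and commutative is what lets the runtime apply concurrent accumulations in any order. The following is a small standard C++ sketch of that property, assuming the add functor from the example above; the helpers accumulate_in_order and order_independent are hypothetical and do not belong to ATHAPASCAN. It applies the same contributions in every possible order and checks that the final value never changes.

```cpp
#include <algorithm>
#include <vector>

// Same shape as the accumulation functor required by Shared_cw.
struct add {
    void operator()(int& x, const int& y) const { x += y; }
};

// Applies F to the contributions one after another, in the given order.
template <class F>
int accumulate_in_order(int initial, const std::vector<int>& contributions) {
    int x = initial;
    F f;
    for (int y : contributions) f(x, y);
    return x;
}

// Returns true if every ordering of the contributions yields the same
// final value under add.
bool order_independent(int initial, std::vector<int> contributions) {
    std::sort(contributions.begin(), contributions.end());
    const int reference = accumulate_in_order<add>(initial, contributions);
    do {
        if (accumulate_in_order<add>(initial, contributions) != reference)
            return false;
    } while (std::next_permutation(contributions.begin(), contributions.end()));
    return true;
}
```

A functor such as subtraction, which is neither associative nor commutative, would fail the same check, which is why such a law cannot safely be used for concurrent accumulation.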
30. [Rotated source listing; text not recoverable from the scan.]
31.

    self_node() == 0) {
            a1::Shared<int> i(new int(1));
            a1::Fork<my_write>()(i, 1);  // line A
            a1::Fork<my_read>()(i);      // line B
            a1::Fork<my_write>()(i, 2);  // line C
            a1::Fork<my_read>()(i);      // line D
            a1::Fork<my_write>()(i, 3);  // line E
            a1::Fork<my_read>()(i);      // line F
        }
        a1_system::terminate();
        return 0;
    }

    int main(int argc, char** argv) {
        // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
        return 0;
    }

It is possible that the operations in line E and then in line F will execute before the preceding lines, because the rule described above is not broken. So do not be surprised if the output of line F appears on the screen before that of lines B and D.

Keep this in mind, especially when measuring the time of computations. In that case, consider adding some extra synchronization variables to the code. But be careful, because this can decrease the efficiency with which the program runs.

7.2.3 Update Right: Shared_r_w

a1::Shared_r_w<T> is the type of a parameter whose value can be updated in place; the related value can be read and/or written. Such an object represents a critical section for the task. This mode is the only one where the address of the physical object related to the shared object is available. It enables the user to call sequential codes working directly with this pointer. In the prototype of a task, the related type is:

    a1::Shared_r_w<T> x

Such an
32. soy sseT uorqounz e oqur UOTANTOAS yo uorqeTnsdeouq HT ewes egr etdurg peep qeqs ogur T gt p lt Karsuequr T 99 JT este arTe qeqs ogur lt qTsu quT T BR peep qeqs ogur FT gOVdS3HVN LAN 4007 OV e edseueu y T 22107 qasuo5 uoranpoae proa 42109 YITIDNCUA LIADOS WAN urj p ATT qeqas ogur qsuoo ATT ST Tooq add q qs TT SPNTIUTE lt u u3eu gt epn ourg 4 A Au oyu x x u ogur qur x qur 892 9PpnTOUT JO TT urTur T ue sedeq3e epnpourg 91895 qrur qeqass peep 99895 qTuT Tey dd meS yiT fout qeaS TT qeas Ied TVNIDINO uomo rpueg IT22 sseT 4 qeqs 1195 55505 DNE phish Kut ox futo oura peep ese f aTTe qe3s Z XYW ANYU lt Om qe ANNE ee NIDIA eei tx Au qur NN 108 BOBO ata 5 t y 9984S TT 5 sseT 2 lt TT 2 gt p zeqg ye y lt 20105 M T uorqe1gequr mo peieug e proa 1sez uoraed2e3ur 220125 yse e oqur jo uorae gnsdeoug s Peep 1004 qauoo eni4 aATTe Tooq 3Ssuo 4118
33. time of compilation, use the command make CXXFLAGS=... in each of the folders Inuktitut and Athapascan JIT 1 3.

4 Getting Started API

This chapter presents an overview of ATHAPASCAN's API and demonstrates how to build ATHAPASCAN programs through simple examples.

NB: The source codes of the examples presented in this tutorial are available online on our web site.

4.1 Overview of ATHAPASCAN

4.1.1 Starting an ATHAPASCAN Program

The execution of an ATHAPASCAN program is handled by a community. A community restructures a group of nodes (Inuktitut processes) so that they can be distributed to the different parallel machines. Therefore, prior to the declaration of any ATHAPASCAN object or task, a community must be created. Currently, this community only contains the group of nodes defined at the start of the application. A community is created by executing the instruction:

    a1::Community com = a1::System::create_initial_community(argc, argv);

Once the community has been created, the following methods are available:

• com.is_leader() returns true on one and only one of the nodes (processes) of that community.
• com.sync() waits for all of the created tasks to finish executing on the collection of nodes before execution on this node resumes.
• com.leave() indicates to the current process to leave that community.

Usually a community is defined in th
34. ureu qur f0 uznq x qnop4as ysnT FF 22eutuxa2 3s 8_urn f ueepo 2o1d n utn ooxd m f euoz meu pueu n z SIU9AJXI8919 9593 UTM x OoTseq Te qs ion Je T T T T seanqtz33 ypeuog popueu s 10 4 f tz mz 0 Q x uotgex ura 418 fu dnox8 4 negep yes Te dag fu dno13 ee93s xy1om ye Ipue gt gt x Z gt gt n W Z gt gt u X gt gt M Z gt gt h Tpu gt gt AT Z gt gt u pue gt gt Z gt gt n pue gt gt mod z Z42 2Z PTOYS9IYA n gt gt Tpu gt gt 9ZTS euT gt gt UOTJLA9IT Xew A gt gt uoz u gt gt gt gt pue gt gt 1189 f SI uorsroead aieo y Of adue z errun dpeu pueu m a RIEN aC Roch amp al 54 Kex xeu 4suoo pesseud ex pueu ura qur t Too urnq x oo xeu x O Z 0 9 0 T 2trd TT aur entq ro2 Too xeux x O Z 0 9 0 T1 ord aur Uuse13 Too pez oo TOD 1O0TODX f Too xeu x O Z 6 0 Z Z O T rd TT aur flo qu 2 eqqnop x e qnop 196399 0o xeu qur gt qur IOTONXZTO2 puew ura a cover p o qoad Tenqara xeuaQ E Ng s sez 1 8 q e X 40 X X 1 6 P 6 P d x x este 10 uanger C pidex d x FF f 8 g d x 100
35. wezeqg proa t Teosd asuoo eueu yder3 aeq 3suoo Teosd f pue gt gt ureU UT gt gt MMO T t ABIe Ieo 931e qur 4Top qut qt s 5 y isuo q aur qsuoo fe Haut proa I 2r qnd t ppe sser fTpu gt gt gt gt y i gt gt TRA gt gt WQ qrrpas epn ourg peq gt gt 3no grt PS MG lt y OTP35 gt epnpourg fTpu gt gt pe r x gt gt y u gt gt TRA gt gt 0 u UOTin x YSL gt gt epou j es ue4s s Te gt gt UO gt gt MO pue gt gt Jaaa 3nqep gt gt 3noo o Te sd X lt qut gt z pezeys Tea qur proa Teosd o x ux ueelo FT19A urnq x o reosd Teosd 4suoo eueu yder3 aeq qsuo Teosd ITe ATHAPASCAN 220128 PIIJS3VW TV erriexew f pan 50 Roch amp al Figure 7 Execution graph corresponding to pscal 3 execution 9 4 Mandelbrot Set This example intends to show the possible interaction between an ATHAPASCAN program and a X server The following code results in a visualization of the Mandelbrot set on a X window The algorithm is standard the size of the image is split in
36. which type of access it will perform on them (on-the-fly detection of a task's precedence). Therefore, an ATHAPASCAN task cannot perform side effects. All manipulated shared data must be declared in the prototype of the task. Moreover, to detect the synchronizations between tasks according to the lexicographic semantics, any shared parameter of a task t is tagged in the prototype of t according to the access performed by t on it. This tag indicates what kind of manipulation the task (and, due to the lexicographic order semantics, all the sub-tasks it will fork) is allowed to perform on the shared object. This tag is called the access right; it appears in the prototype of the task as a suffix of the type of any shared parameter. Four rights can be distinguished and are presented below: read, write, update and accumulation.

7.2.1 Read Right (Shared_r)

a1::Shared_r<T> is the type of a parameter whose value can only be read. This reading can be concurrent with other tasks referencing this shared object in the same mode. In the prototype of a task, the related type is a1::Shared_r<T> x. Such an object provides the method

    const T& read() const;

that returns a constant reference to the value related to the shared object x. For example, using the class complex defined elsewhere in this guide:

    class print {
        void operator()(a1::Shared_r<complex> z) {
            cout << z.read().x << "i" << z.read().y;
        }
    };

7.2.2 Write Ri
37. 1eggnq dnges 0 50 Y Bapro m Sex pro 0 0 287 yynq ynq 4dp esxyldooy f mad p y 8ex pro n Sex pro 4ooan dp dewxtgeqyeerpy dewxtg 3017 391 pro uot8ez qur YIP QUI oZISOI x pueu UTM qut 14 8 q qut fe qur uru qur orqeqs t f0 uangaz fu 3317 u euoz nou fm7 8o17 n euoz mou f T U 3 x OK Te st uoz pTo T euoz pro uoz M u f A I XI x X Te s uoz p1o TX uoz p o jX uoz n u Ax Teteos euoz pro T T euoz p o I uoz n u fx7 1 X Teos uoz pro IX euoz plo IX euoz mau f uoz pro euoz mau M iere aed a ATHAPASCAN 57 B nn ae g m Ir m zn rinda Figure 8 Mandelbrot Set visualization main and mapping windows The execution was on two nodes each having three virtual processors 9 5 Matrix Multiplication This example shovvs the use of ATHAPASCAN for implementing a parallel application on matrix operations Matrix product and addition are implemented by classical bi dimensional block parallel algorithms Roch amp al 58 98e qur ureu qur f0 uznq x 01 Y u aeyexenbgreq O aeyTeooT n u c 3ejre oT gt p zeuS f t u O aeTeooT n u c 3ejre oT gt p zeuS f t V f OI gt 0 f aur s
38. INRIA
INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE

ATHAPASCAN: an API for Asynchronous Parallel Programming
User's Guide

Jean-Louis Roch, Thierry Gautier, Rémi Revire

Theme 1: Networks and systems
APACHE project
Technical report no. 0276, February 2003
INRIA Rhône-Alpes

Abstract: ATHAPASCAN is a macro data-flow application programming interface (API) for asynchronous parallel programming. The API allows the user to define the concurrency between computational tasks, which synchronize through their accesses to objects in a global distributed memory. The parallelism is explicit and functional; the detection of the synchronizations is implicit. The semantics are sequential, and an ATHAPASCAN program is independent of the target parallel architecture (cluster or grid). The execution of a program relies on an interpretation step that builds a macro data-flow graph. The graph is directed and acyclic (DAG), and it encodes the computation and the data dependencies (read and write). It is used by the runtime support and the scheduling algorithm to compute a schedule of tasks and a mapping of data onto the architecture. The implementation is based on the use of lightweight processes (threads) and one s
39. 3108b 3404 Te uosb proa fosso y 4os ozis p Aexre7poreus zis r peuSisun ferre poreys tu L 4Erre p rens 4exre7poreus Aur onqnd lt L gt erre poreys o1qnd sser lt L sse p e3ej dure3 OT 09 p1e3s 1 s110 pearruoy isy sso00e y 1 ozis gt r 0 1 gur 103 1 1 31x0 Hpu gt gt e znos yo ezrs gt 4s p ezrs doo guy gt gt 1199 AexryAmezis lt 9718 Jt 040p 2012094 ybnoua pasvys fo ozis st j043u00 40442 pearurog Aenry Nezis QUI 09 2 6 qui 44 17818 QUI uo lt p 4enuy Aur 4 peadeug Te lt lt L gt Auyiu gt M I pereys Te J103e1edo proa amp eure7pereus ur7z4doo 32na3s lt L sse p e3e dure3 fiviso paivys fu 0 03 fivsso pasoys fiu v fo quod fidoo 4 pe s syo wo 1 syjo lt duray ezis gt r 0 1 qui doy zis p meu dui lt L gt Aerry Aur 4 T 31x0 pus gt gt 1199 fezzysu yo ezts gt Aezze pezeys ezts doo ug gt gt 1199 amp errv AIN ZIS lt zIS Jt faa wolf pop 2012024 07 uyDnouo biq pauoys fo st og 1o4yuoo 40449 azis WOI Kerry KIN zIS qur zis qui 317018 qur Wo p 4ezny ur lt lt pL amp euyfur A peaeus pe 103e1edo proa
40. 3s0 uinjei weizZ0id ayy ure x Karsu qur y gt gt 1750 14 V 99107 qsuo so e ri0 Te gt gt queei390 Te 4 OI 4404 uTy pi fIpue pas TT9DaUut4danadng u gt gt an0d pas OI HONDA FePUFT f e3e4s Te2 goezrs 2 iroaeoej 3nd4no 1 Sres T1T 9 RYOSYaYELY uoa SNVAULS OI f qe3s TT go zrs ogur sseooe 9o2 sty 1 1xe oa ug 3nd4no f pue p3s TT paurzdanadn0 gt gt 3noo pys 24 1129 sty lt TT 5 gt m 1 p zeugS Ie 4oqez do proa peers e2107 Tuumo Tteoqurzdinqdng qo nzqs 2 lt TT O2 gt a peaeug e y lt 99107 M T uorqerSequi gt M2 pereug e ta Ozogezedo seq uora3e18o3ur 9210J f Tpu p3s gt gt a3ru andang gt gt 3n02 p3s 4 Ku xu aru 8syroaug 3nd4no 0 p Oeatte st zognqriquoo Kqarsu qur L33008 WAN e3noog3ex og andano xonqr quoo 92 isuoo 9240g 0210g eSessej meu 3nd3no Tpu p38 gt gt 3ru andang gt gt 3n02 p3s y Ku qur xu qur proa aesed ssoooe 18ggnq lt lt rego gt 10998a1 gt m 1 p zeugS e Ku qur xu qur 3oqez do proa sSseo22e j uoranpoae ss o e TT sTu3 aturandang 220228 K Ku ogur sseooe Teo STU x x fu ogur sseooe e2 styy 4
41. 908 FT gt Qe9so 3ex ogfN qur 10 trun fxeoiq gt 3uesqN Ft 3uegqN PUSSOLAN 0 puegoIqN x qur jnq yoos pu s qu SqN fquegqN zequtr jnq gt i PuegolqN errum p T nog x iq zequr jnq ZTS pu SolqN 4uT 10 qu SqN qur tz qur jnq x83 3S 92 Qzeasegsr FT pr eAur uznq xz 0 yoos FT 9ZTS qur nqx zeuo ezrs qur qeqs 92 pues 3eX2oSg N qur ZTS no uorxeuuoo eT sandnoo TS 0 zTs pj 224205 ET INS ToAu zefoaue UO0TIDUO A x f0 FTpue qTHOM WHOO IdW FLAT IdW ezrs ynq seog IdW IdWON F2PUFT t x fZ enduox qs uorxeuuoo eT gt O i pPeeuyolqN 99 0 gt pe uqy FF mno z e mog nb ertrun fxeoiq gt PeeuqN zt peeuqN PeeuoldN 0 peeYyoLqn X qur ynq x208 a2821 peeuqN fpeeuqN zequtr jnq gt zi pe yolqy Trun x uor4deoex ep T nog x iq z qur jnq ZTS peegoIlqN 4uT 10 peeyqn aur z quT jnq x qe4s 92 z iseysT IT pr eAur uznq xz 0 223208 FT zTs qur gnq Teyd eZTS qur nq xe3e3s 92 A281 38X2O0S N IUT fepoo epo2 qaod xneA1es S L
42.
        // AS PREVIOUSLY DEFINED IN API CHAPTER
        return 0;
    }

7 Shared Memory: Access Rights and Access Modes

Shared memory is accessed through typed references. One possible type is Shared. The consistency associated with shared memory access is that each read sees the last write according to the lexicographic order.

Tasks do not make any side effects on the shared memory of type Shared. Therefore, they can only access the shared data on which they possess a reference. This reference comes either from the declaration of some shared data or from some effective parameter. A reference to some shared data is an object with the following type:

    a1::Shared_RM<T>

The type T of the shared data must be communicable (see the section on communicable types). The suffix RM indicates the access right on the shared object (read r, write w, or cumul c) and the access mode (local, or postponed p). RM can be one of the following suffixes: r, rp, w, wp, cw, cwp, r_w and rp_wp. Access rights and access modes are described in the following sections.

7.1 Declaration of Shared Objects

If T is a communicable type, the declaration of an object of type a1::Shared<T> creates a new shared datum in the shared memory managed by the system and returns a reference to it. Depending on whether the shared object is initialized or not, t
43. T92 gjoezrs e ueAe p segeur estioqne uo axeu2 goezrs xeuy2 goezrs xeuy2 goezrs xeuy2 goezrs Ipue pas Q3AISO3U NAAT HAVH ST149 40 XO018 IVILINI Eg f9r Au STL os 92 xu Te x0 pues lt yoos x0 pues lt yoos x0 pues lt yoos x0 pues lt yoos x0 pues lt yoos x0 pues lt yoos xyo3 pues 3 os x0 pues lt yoos TVNID T T T 190 K Ku xo x Ku yo eut qo T 40 yo qeqas Teo f 1 3Tx p3s O gt 2989S TT 52 7o ztrs sIT O2 qeqs I92 A283 X2OS JT I atrxe pas O eaeas T I92 goezrs Ku xu s 92 1eq2 AD2eric 34208 IT 91895 TT 5 meu S 99 3xeu O TT TT 1 meu ex Ku arurgsu Ku a rurgsu SIT 92 fu xu f I arxe pas 0 gt 1TUT3SU JO9ZTS ITUT SUJ 9I8ISTTT9D AD9AX lt HDO0S JT I atrxe pas 0 gt 1TUI3SU JO9ZTS ITUT SUZ IBUYI AD9I lt HDOS JT faturSsu o3e3s TT 92 f T arxe pas 0 SvCv xe3sn o pepou 32euuooc x209 FT 0 SvCv 193sn 2 gepou 32euuo2c 3208 FT 0 gt SP3H 1 191SNTO Z9POU ID9UUOD lt HD0S FT 05 92109 SENBOX 1D9UUOI lt HD05 FF 0 SZI09 erTe180 32euuoo x208 IT 0 gt 93109 UTMbuoSTE 199UUOI lt HD0S FT np WON TINN iX905 IT f aexoogrN meu TIAN yoos yoos n eSessey x Ty euueigoid uorxeuuoo ep
44. TF xX X gt x z Tqnop qsuo2 p e qnop qsuoo d atqnop 3suoo otd e qnop 2484S feuoz pro uxnger 91313 erara aes mod euoz pro a pX Z 5 z 195 JorqTepuey 2TIT2 jaurads 110011913T3 uoz zeuoz meu feuoz mou euoz pro uoz gr euoz mou pueu ura uoz fTpue pue gt gt peteyues euoz zenbs peuooz e eurgep g uo33ng i pue gt gt euoz saenbs peuooz e eurgep Z uo mg m pue gt gt 9uoz ie ng8ue42e1i peuooz e eurgep I uoqqng u Tpu gt gt iMOpur amp oaql puey ur Surpurq Tpu gt gt Mb b ou IPue gt gt dpeu sty qutad q oh pue gt gt 9uoz 19981q seur3 9ATJ uo meapei Tpu gt gt 9UOZ Z TTeus s ur3 9AIj UO MeIpad Tpu gt gt n uoz uo s pue gt gt eanjotd 1 u 2 gt gt gt gt gt gt gt gt gt gt gt gt gt gt gt gt gt gt gt gt TPUS gt gt 10T PToused3u3 L gt gt Tpu gt gt 01 PTOYS9IYI 3 mu gt gt Tpu gt gt uT 201Q 9PUey Zemod H A gt gt Tpu gt gt al 201q 9puey Zemod u gt gt Tpu gt gt 001 SUOTIRBASIT u gt gt Tpu gt gt HOT SUOTIBADIT P gt gt Tpu gt gt OOT SUOTIBASIT N ou gt gt pue gt gt OT SUOTIRASFT N A gt gt IPue gt gt Mopura 3oiqpepuey ur Surp
45.
        // ...THOD AS PREVIOUSLY DEFINED IN API CHAPTER
        return 0;
    }

Remark: encapsulating a C procedure. Obviously, a C procedure (i.e. a C function with void as return type) can be directly encapsulated in a C++ function class and thus parallelized with ATHAPASCAN. Here is an example:

    void f(int i) {
        printf("what could I compute with %d\n", i);
    }

    struct f_encapsulated {
        void operator()(int i) {   // i is some formal parameter
            f(i);
        }
    };

    int doit() {
        int a = 3;
        f(a);                            // Sequential call to the function f
        f_encapsulated()(a);             // Sequential call to the function class f_encapsulated
        a1::Fork<f_encapsulated>()(a);   // Asynchronous call to the function class f_encapsulated
        return 0;
    }

    int main(int argc, char** argv) {
        // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
        return 0;
    }

5.2 Allowed Parameter Types

The system has to be able to perform the following with a task in order to Fork it:

1. detect its interactions with the shared memory, in order to be able to determine any synchronizations required by the semantics;
2. move it to the node where its execution will take place.

Here are the different kinds of parameters allowed for a task:

• T, to designate a classical C++ type that does not affect shared memory. However, this type must be communicable (see the chapter entitled Shared Memory for a definition of communicable types).
• Shared<T>, to designate a para
46. W qrIP3s epnpourg Pam oer Roch amp al 66 X T3uoyo epnpoutgt lt u ut qeutieu gt SPNTIUTE q 3ex20s s s epn ourg lt u u s sKs gt epnp ourg lt y odt s s gt epn ourg lt y sedfy sfs gt epnpourg lt Y P1STUN gt epnpourg S3 y oos s p ro due T Inod Y O i PUSSOLAN 39 0 gt FUSSAN FF Konu e ynog nb uotqeot tizen x eTrun fxeoiq gt 3uesqN Ft fquegqn puegolqN 0 puegoIqN ynq pg pues qu SqN q orpas epn ourg fqueSqu zequt jnq cu S UTIIS gt epnpourg 9Seq SOTITRIQTT 0 i pu solqy TTun p e onog x 407994 urj p fynq 407994 J purr fezrs puegoIqN aur 10 qu SqN qur T ueosedeuay euwex3oad ye INT euuexSord exque sefonue seSesseu sep uorjrurgeq fx quT ynqr 09898 IT22 u 3ess M eTIJN fosep p JUL F sz e AT sev TWNIDINO 777 Ind qur AVG ezis aur jnq eqs TT z foaux S ss y dur e1ez A zts no uorxeuuo2 ep rs Q Addvs yefoad G 3Tx f uu avezzoout s zq uezed s p roaug f anr53sy goezrs inTp3su ezeq o z foauq gt TYNIDIYO ae3s TT yo zrs qnTHSsyR 99e4S T92 axeKfoaug FT x UOTIRSTTENSTA ep
47. Z xeaseysr U PYANA PY SY 320uuo2 2 yaurad 904208 prpeaur 0 xoos 0 epoo FT JT pue tpa 8840018 Idhi i WHOO IdW poyI 4sew INT IdW 1 9poo aseog IdW FF IdNON F2PUFT a20d aneAxes S elle 183S8NST UNISVOS PY SY 199UUO2 9 ya3urad i fT Y20s PITeA ST 4208 eua fq y este fp 2po2 f uorxeuuoo suep I uou gyoezrs uoug x Ippeyoos ADOS JD9UUOD E ape d x20s 32euuoo uorxeuuoo epueue 310d suoqq 210d urs uou LHNI dV TTuey urs wou f q33u T u du appe urs uou appe u du fdooq zreqeurqs p 3ex2os ep essegdpe ep uorqezred rd este fg epos IN AI S QUN nuuoout 99TS 5 gt TION zneaxes eueufqasou3e8 dy IT este np q uz qur ss zpe T ep euoieuoey t fg epoo uerqrssodur 3exoos ep 9 NVIULS MOOS LAINI 4V 3ex2os yoos IT To20301d adfy uTeuop a3 y os q y oos eT ep uorqe 0 2 dd ddy 105 grpueg HSN TA Ongqdfa gt gt Tpue gt gt Oarur ddy s rn rizeq u gt gt O TIW 9aqafa pnaaqla fyoos qur Ontut dd 109 proa f poqd seu qur pe32e3041d S
48. aaqa jTes ueqs s Te y dnoz3 Surddeu qe or qnd uos fu sseT O podu 3185 weqsfs pe p xry Surddeu ye T 0T 0 1 seanqzaaaypeu g cuorSea fepdstp sixoj f Toorqu Too z x uorgex e3nduoo ooad pueu 01d UTA fpueu pueu UTM 5rqeqs f UCI k M I TO9 lt aur pereug gt ex1e wered y uy Z gt at 31 q s di epnpourg ul pueu 2oqd ura epnpourg To qu qur Z euoz x uorgeir ura proa 4U pueu ura epn ourg 4U UTA epnpoutg 4 Orepueu 4U 3seS UTA epnpourg Tepuew uznq z 3suoo eueu gdeig reyo 34suoo Tepuew 32nz3s lt y uessedey3e gt epn ourg lt q yyew gt epnpourg lt U OpPIS epn outg T097 lt U QTTPIS gt f eaeegy aseS urn q ueeijs epn ourg f epou x mezp oo1d n f Too x mezp pueu a o Tepuew f xequey 3seS UTM pesa 1 T09 T Too FTpue Q ZTS ToooT 0 T qut 4oy OZIS TO2 aur seu TO2 JUL 14 52 eR Ud i S 175947 172437 xe duoo b gt 3suo5 3xe duoo suo 102exad0 xatdwuoas xe duoo Txt II uanger 4suo2 sqe x Tduo5 Tqnop i suo uTr x Tduo atqnop 17 uanq x asuoo ex xe duoo 0 wt ex yz ur eTqnop xe duoo xepduoo o COOT CO T O x Tduo x Tduo suo x
49. ageaur 02103 X104 Te t 4u FT C 1 se2207 f F pze0q YOO lt ASe 02103 3104 e 0 if 38 T Au it FT C 1 se2207 F T pre0q 490 lt ASe uorqexBoqur 02103 X104 e 0 il zt C 1 se2207 f 1 pze0q 100 ASe UoTyesBoqur 02103 X104 e osi D 8 02i D FT EF Er r Pzeoq 1 se2207 E 1 pze0q 100 Ase uoTyesaBoqur 02103 X104 e O it IT TT yore yo sinoqu8tou eq4 uo doo7 C r Pzeoq TT 5 eua jo znoqugreu yore Wo eua uoraeanduo paxeoq gt C xu gt f fo f aur 107 SAUT fO T QUI 107 Tpu p38 gt gt 3 gt gt 8UI gt gt 1no2 p3s ay zoy TTe jo uoraeznduog I pue pas gt gt 2ufs seady gt gt 3n02 p3s oufs uoo pue pas gt gt 2u s queay gt gt qnoo p3s eU ed i E 892920 zts PJ 3ex2os ANS fT ITOABD I UOI32UO4 x 65 qnos y os soT 893n029 p 4 y oos eT op f oTaz s yoos osep ape3 IPeg Ippeyoos qoniqs e3no2e 3208s 3deo2e IPESTR 1IPeg Ippey X F 3 apeST d zpe d y os ad oe 39X20S eer2osse eTIJ T suep quepu d uorxeuuo un p UOTIDRAJXH f xpe zoezts IpeST 224208 VT ep T TTTEL 5 01 e
50. aim of this very simple divide and conquer strategy is to compute the integration of the function f on the interval a b according ot the Newton Cotes method a b b faz rat f fd 2 b V b a lt h g a b fdx M iere aed a 47 ATHAPASCAN ON Ox ur uee o O ON ON ON TELE PHIIJ33VW TV errjexew t f0 uangal WildVHO IdV NI 4 1440 ATSNOIAZYd SV GOHLAW NIVW o8ze qur ureu qur 10 uangal Ipue gt gt 9euop SJP HD gt gt MO sex sex qurad xioj f s r y q e fIpue gt gt uorj3ej3nduoo ma 4184S MO gt gt 2002 4u weu ydea3 y 521 eueu yders du3 Tqnop neu u lt erqnop gt pezeus etqnop n u s z lt eTqnop gt pezeys fdu lt lt q lt lt e lt lt uro f q deas mi pue q fe eu 8AT9 gt gt 2002 fdu4 q fe Tqnop A8ze 9318 QUI 2TOP qur eqqnop a pezeys lt aTqnop gt m pax1euys 14 gt gt ss oS5e s r gt gt Sel gt gt MO sex lt rqnop gt m p zeug zogezedo proa sex 3urad qonzqs 54 1 Zs x lt uns gt x104 t gs z Y q z qre lt omduoo gt xzog f Ts x y z q e e eanduoo x1oj4 ugsex eueu gde18 zsex sor eueu yde8 sex 1 meu gzsex
51. anooe yoos u qsTT Soquepuod uorx uuo ep xeu iquou qu 2eAe qu x2os ue4SIT Uorxeuuoo ep sepueuep SAT qd ooe nb u qsKs ne euB s x S quepu d suoTrx uuo ep TTy ap UOTIBIAD t G 3Tx rqrssodur 3ex2os ep UOSTBTT UOTIBDA 119p9S jqurad uu r 98 P rer uot Du pas autid I 4 19 ADOS qnos y oos FT 194208 eun p np qaod ne qu u uoeqqe ed a 310d QUI H20OS19919 3ex2os eT Uorjeel f xpe joezts ApeST qur peuStsun 294208 ep esseipe T ep TTTEL f pe ur appeyoos 39n195 124208 ep esseidpy qut eqnooqqyeyo0g eBessay pro e3nooe p 224208 ep uorqe d40 x os p np roaueg f azod uts uou syoqu 210dx TeooT uoTqequ s id z es e z rqu un p Neeser Sesseq i p arx e qrssodur ep wou np uorqu qq0 4ozz d O i ANenSuoTR uoug qoniqs Sep eueux2os4e23 IT 8174 ape d x os eueux2os4e8 ossozpe T ep uorqerednoey f uou goezrs ATHAPASCAN A 294208 eT ep zinenguoT t t g atxo erqrssodur 3939058 ep euoy 4 Oz i u
52. ation of a Fibonacci number algorithm.

NB: The parallel programs we present here are just for educational purposes. The granularity of the tasks executed on remote nodes is small, while the number of tasks is high. If you try to run these programs on different architectures in order to compare performance, you will realize that it can take even more time to execute in parallel than sequentially. This is normal behavior for these kinds of programs.

    #include "athapascan-1.h"
    #include <iostream.h>
    #include <stdlib.h>

    struct add {
        // this method is instantiated by the cumul method of the
        // concurrent write Shared data (see fibonacci)
        void operator()(int& x, const int& a) {
            x += a;
        }
    };

    // sequential fibonacci
    int fibo_seq(int n) {
        if (n < 2) return n;
        else return fibo_seq(n - 1) + fibo_seq(n - 2);
    }

    struct fibonacci {
        void operator()(int n, int threshold, a1::Shared_cw<add, int> r) {
            if (n < threshold) r.cumul(fibo_seq(n));
            else {
                a1::Fork<fibonacci>()(n - 2, threshold, r);
                fibonacci()(n - 1, threshold, r);
            }
        }
    };

    struct print {
        // This procedure writes to stdout the result of fibo(n),
        // where n is an int in the shared memory
        void operator()(int n, a1::Shared_r<int> res) {
            cout << "Fibonacci(" << n << ") = " << res.read() << endl;
        }
    };

    int doit(int argc, char** argv) {
        a1::Shared<int
53. cess the shared data. In functional languages, such parallelism appears when handling a reference to a future value.

With this refinement of the access rights, ATHAPASCAN is able to decide with greater precision whether or not two procedures have a precedence relation. A procedure requiring a shared parameter with a direct read access (r) has a precedence relation with the last procedure to take this same shared parameter with a write access. However, a procedure taking some shared parameter with a postponed read access (rp) has no precedence relation: the access mode guarantees that no access to the data will be made during the execution of this task. The precedence is delayed to a sub-task created with a type r.

In essence, the type Shared can be seen as a synonym for the type a1::Shared_rp_wp<T>: it denotes a shared datum with a read-write access right, but on which no access can be locally performed. An object of such a data type can thus only be passed as an argument to another procedure.

7.3.1 Conversion Rules

When forking a task t with a shared object x as an effective parameter, the access right required by the task t has to be owned by the caller. More precisely, the accompanying figure enumerates the compatibility, at task creation, between a reference on a shared object type and the formal type required by the task procedure declaration. Note that this is available only for task creation and not for standard function calls wh
54. ch may be installed on all platforms where a POSIX threads kernel and an MPI communication library have been configured.

The efficiency with which ATHAPASCAN runs has been both tested and theoretically proven. The ATHAPASCAN programming interface is related to a cost model that enables an easy evaluation of the cost of a program in terms of work (number of operations performed), depth (minimal parallel time) and communication volume (maximum number of accesses to remote data). The execution time on a machine can be related to these costs.

ATHAPASCAN has been developed in such a way that one does not have to worry about specific machine architecture or optimization of load balancing between processors. Therefore, it is much simpler and faster to use ATHAPASCAN to write parallel applications than it would be to use a lower-level library such as MPI.

2.2 Reading this Document

This document is a tutorial designed to teach one how to use ATHAPASCAN's API. Its main goal is not to explain the way ATHAPASCAN is built.

If you are new to ATHAPASCAN, it is recommended to read all of the remaining text. However, if the goal is to immediately begin writing programs with ATHAPASCAN, feel free to skip the next two chapters. They simply provide an overview of:

• how to install ATHAPASCAN's libraries, include files and scripts;
• how to test the installation performed;
• the API.

The other sections will delve de
55. ddeu st Sutmerq fOeuwexzqezqut 1 f 4 yo Sutmerp x STY2 09 ABTTUTS Tt doo Sutmezp SUL suor3oung doo7 Sutmeiq eueu eso D3xequoo proa 93382 suep noTTes p TJpuedg eoznossey Seq ryye p eun p e ep sioT T dde Ipu d0 x quo un p ep qu u qerp uur T ddy 1ru axequo2 proa suep eeno e Tpu dO0 eoznossey aqzeano p e13eueg enbeuo anod stor eun fez Teddy 0uedg qx quo5 un p T ep SIOT 3ueuejerpeuur T ddy 1 HSnT4 Dnguqfa gt gt Au Qaruprde ddysepnorazeq gt gt 0 T Y 9aqfa onaaala Oaturrde proa Tenqata egeqorgge p T noq eoueuuoo eu nb quese spew IdY T p 3ueueoue seide soltessooou 9IND9XH fOYTUT PTOA Tenqata seoTAZeS sop uorqesrIerqruI Id T ep 3ueueoue quene seltesso2ou suorjesr er3rur se eqnoexy 0 Oddv109 Tengata an qon4aqas p un p anofy 5 uz y ddym fa uxex peuxey a ddy105 Qina42ni3suoo un p ynoly OTT Qn ddytpfa ottqnd ddy 799 sse se noraged s p uorqesrTenstAa 3ue43euxed uotzeottddy tK Ku qur tx Au qur t urq qur t qeqas sev d T
56. • A good way to write ATHAPASCAN applications is to hide ATHAPASCAN code in the data structures. Proceeding that way allows you to keep your main program free from parallel instructions, making it easier to write and understand.

In Chapter 5.3.3 we wrote a communicable class implementing a resizable communicable array called myArray. We are now going to write a shared data structure on top of this class.

    #include "athapascan-1.h"
    #include "resizeArray.h"

    /* class shared_array is a class hiding Athapascan code so that the
       main code of the application could be written as if it was
       sequential. It is based upon the resizable array class called
       myArray. */

    // resize the shared array
    template <class T>
    struct resize_shared_array {
        void operator()(a1::Shared_r_w<myArray<T> > tab, unsigned int size) {
            tab.access().resize(size);
        }
    };

    // assign a local myArray to a shared_array
    template <class T>
    struct equal_shared_array {
        // NB: we use a read-write access because we need the size
        void operator()(a1::Shared_r_w<myArray<T> > shtab, myArray<T> tab) {
            myArray<T>* t = new myArray<T>(tab);
            t->resize(shtab.access().size());
            shtab.access() = *t;
        }
    };

    // append a shared array to a shared array
    template <class T>
    struct append_shared_array {
        void operator()(a1::Shared_r_w<myArray<T> > t1, a1::Shared_r<myArray<T> > t2) {
            int i = t1.access().size();
            int k
57. e we need to define a Shared data in order to hold the result and to create a task calling the Fibonacci function. The fibonacci task works the same way the sequential function works: first we check if n < 2; if not, we make recursive calls to get F(n-1) and F(n-2).

The difference here is that we cannot directly access Shared data. Thus we cannot add the two results directly, and therefore need temporary Shared variables and a specific task to add them.

Another parallel implementation, using concurrent write

The parallel implementation we have just studied runs correctly but is not very efficient. This program will run faster if we use one of ATHAPASCAN's features called concurrent write. This kind of Shared data is designed so that every task can access the data and perform a concurrent cumul operation. This means less synchronization needs to take place, and thus less time is wasted waiting for other tasks to complete.

The granularity of the tasks we wrote is not ideal either: it is small, and forking a task can sometimes consume more CPU time than executing the same work sequentially. The best way to increase the speed of the computation is then to Fork enough tasks for each processor to be busy, and then let them carry on sequentially to avoid excessive communication. For that purpose we introduce a threshold variable. As indicated in the text, the threshold is user defined. Now examine the second parallel implement
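The threshold idea can be tried out without ATHAPASCAN using only the C++ standard library. The sketch below is an analogy, not the guide's implementation: `std::async` stands in for Fork, an atomic addition stands in for the concurrent-write cumul operation, and the name `fibo_par` is hypothetical. It needs a toolchain with thread support (with GCC or Clang, typically the `-pthread` flag).

```cpp
#include <atomic>
#include <functional>
#include <future>

// Sequential Fibonacci, used below the threshold.
long fibo_seq(int n) {
    return n < 2 ? n : fibo_seq(n - 1) + fibo_seq(n - 2);
}

// Adds fib(n) into `result`. Above the threshold one branch is handed to
// another thread while the caller computes the other one. Leaves accumulate
// with an atomic add, which plays the role of the cumul operation here.
void fibo_par(int n, int threshold, std::atomic<long>& result) {
    if (n < threshold || n < 2) {
        result.fetch_add(fibo_seq(n));
        return;
    }
    auto branch = std::async(std::launch::async, fibo_par,
                             n - 2, threshold, std::ref(result));
    fibo_par(n - 1, threshold, result);
    branch.get();  // wait for the forked branch before returning
}
```

A caller initializes a `std::atomic<long>` to zero, calls `fibo_par(n, threshold, result)` and reads the total afterwards. Raising the threshold creates fewer, larger units of work; lowering it creates more threads, each doing less, which mirrors the granularity trade-off described above.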
58. e main method of the program. To ensure proper creation of a community, it is also necessary to catch any exceptions that might be thrown by the initialization procedures. The skeleton of an ATHAPASCAN program should resemble the following block of code:

    int doit(int argc, char** argv) {
        ...
        return 0;
    }

    int main(int argc, char** argv) {
        try {
            a1::Community com = a1::System::create_initial_community(argc, argv);
            std::cout << "count argc" << argc << std::endl;
            if (argc != N) {
                for (int i = 0; i < argc; ++i)
                    std::cout << argv[i] << std::endl;
                std::cout << "usage: " << argv[0] << " PROPER USAGE" << std::endl;
                return 0;
            }
            doit(argc, argv);
            com.leave();
        }
        catch (const a1::InvalidArgument& E) {
            std::cout << "Catch invalid arg" << std::endl;
        }
        catch (const a1::BadAlloc& E) {
            std::cout << "Catch bad alloc" << std::endl;
        }
        catch (const a1::Exception& E) {
            std::cout << "Catch ";
            E.print(std::cout);
            std::cout << std::endl;
        }
        catch (...) {
            std::cout << "Catch unknown exception" << std::endl;
        }
        return 0;
    }

The function doit should contain the code to be executed in parallel. The function main, in this case, simply creates the community, executes doit and catches any exceptions thrown. The variable N represents the constant number of inputs needed to run the program; when coding this line, re
59. ed on:

• IBM SPx running AIX 4.2, using LAM MPI or a dedicated switch, with xlC 4.2 C++ compiler
• Sparc or Intel multiprocessor and network of workstations, using LAM MPI, with CC 4.2 C++ compiler

Athapascan-0 is currently supported on:

• AIX 3.2.5 and IBM MPI-F
• IBM SP with AIX 4.2 and IBM MPI
• DEC Alpha with OSF/1 4.0 and LAM MPI 6.1
• HP 9000 with HP-UX 10.20 and LAM MPI 6.1
• SGI MIPS with IRIX 6.3 and LAM MPI 6.1
• SGI MIPS with IRIX 6.4 and SGI MPI 3.1
• Sparc or Intel with Solaris 2.5 and LAM MPI 6.3
• Intel with Linux 2.0.25, MIT threads 1.60.6 and LAM MPI 6.3

Q: How do I get a copy of ATHAPASCAN?
Q: Where can I comment about ATHAPASCAN?
Q: How do I get up-to-date information?
A: There is a web page dealing with ATHAPASCAN at http://www-apache.imag.fr. The ATHAPASCAN distribution, the manual (the document you are reading) and some other related papers are also available from this web page.

Q: The compilation failed: A1_MAKEFILE not known. What do I do?
A: Check that you properly set up your environment by sourcing the appropriate setup files (the ones for Athapascan-0 and ATHAPASCAN).

Q: The option a1_trace_file has no effect at execution. Why could this be?
A: Make sure you are using a program compiled with an appropriate ATHAPASCAN library (one compiled to generate dynamic graph visualization information).

Q: An ATHAPASCAN internal error occurs at execution. What can I do to correct this error?
A: If you are usi
60. emantics as pointer assignment.

Example:

    a1::Shared<int> x;               // x is just a reference, not initialized
    a1::Shared<int> a(new int(0));   // a is a shared object, initialized with the value 0
    x = a;                           // x points to the same shared object as a

The following operations are allowed on an object of type Shared:

• Declaration in the stack, as presented above.
• Declaration in the heap, using the operator new to create a new shared object. In the current implementation, the link between a task and a shared data version is made through the C++ constructor and destructor of the shared object. So each construction must be matched by a destruction, otherwise deadlock may occur. Therefore, in the case of allocation in the heap, the delete operator corresponding to the already executed new has to be performed.
• Assignment: a shared object can be assigned from one to another. This assignment is symbolic, having the same semantics as pointer assignment. The real shared object is then accessed through two distinct references.

7.2 Shared Access Rights

In order to respect the sequential consistency (lexicographic order semantics), ATHAPASCAN has to identify the value related to a shared object for each read performed. Parallelism detection is easily possible provided that any task specifies the shared data objects that it will access during its execution (on-the-fly detection of independent tasks) and
61. emory:
• The stack: a local memory private to the task. This local memory contains the parameters and local variables; it is the classical C or C++ stack. The stack is automatically deallocated at the end of the task.
• The heap: the local memory of the node (Unix process) that executes the task. Objects are allocated in, or deallocated from, the heap directly using the C/C++ primitives malloc, free, new and delete. Therefore all tasks executed on a given node share one common heap; consequently, if a task does not properly deallocate the objects it placed on the heap, future tasks may run short of memory on this node.
• The shared memory, accessed concurrently by different tasks. The shared memory in ATHAPASCAN is a non-overlapping collection of physical objects of communicable types, managed by the system.

1 Caution: this is verified neither at compile time nor at execution time. The user has to take care not to include any reference to the shared memory in classical types.
2 In the current implementation, the execution of a task on a node is supported by a set of threads that share the heap of a single heavyweight process representing the node.

6 Communicable Type

Using a distributed architecture means handling data located in shared memory (mapping, migration, consistency). To make ATHAPASCAN able to do this, the data must be serialized. This serialization has to be explicitly done by the p
62. eper into ATHAPASCAN's API so that the user can benefit from all of its functionalities. They explain:
• the concepts of tasks and shared memory (Chapters 5 and 7 respectively);
• how to write the code of desired tasks (Chapter 5);
• how to make shared types communicable to other processors (Chapter 6);
• which type of access right to shared memory should be used (Section 7.2);
• how to design parallel programs through complex examples (Chapter 9);
• how to select the proper scheduling algorithm for specific programs (see the corresponding appendix);
• how to debug programs using special debugging and visualizing tools (see the corresponding appendix).

3 Installation of Athapascan-1 version 2.0 Beta

ATHAPASCAN is easy to install. This chapter only covers the installation of Athapascan-1 version 2.0 Beta on a UNIX system. The latest releases of the ATHAPASCAN software are available for download on APACHE's web site: http://www-apache.imag.fr/software/ath1/archives

3.1 Installation of Inuktitut and Athapascan

The entire installation is based upon the configure/makefile pair. There is a makefile at the top level of both the Inuktitut (the execution support) and the Athapascan JIT 1.3 folders that provides sound settings for the installation. Modify the settings at the beginning of the file in order to direct the installation to the desired folder. Next, execute:

    make config

In order to define certain variables at the
63. ere the C++ standard rules have to be applied.

    type of formal parameter   | required type for the effective parameter
    ---------------------------+------------------------------------------------------------
    a1::Shared_r[p]<T>         | a1::Shared_r[p]<T>, a1::Shared_rp_wp<T>, a1::Shared<T>
    a1::Shared_w[p]<T>         | a1::Shared_w[p]<T>, a1::Shared_rp_wp<T>, a1::Shared<T>
    a1::Shared_cw[p]<F,T>      | a1::Shared_cw[p]<F,T>, a1::Shared_rp_wp<T>, a1::Shared<T>
    a1::Shared_rp_wp<T>        | a1::Shared_rp_wp<T>, a1::Shared<T>
    a1::Shared_r_w<T>          | a1::Shared_rp_wp<T>, a1::Shared<T>

Figure 3: Compatibility rules to pass a reference on some shared data as a parameter to a task.

7.3.2 Shared Type Summary

Figure 4 summarizes the basic properties of references on shared data.

Figure 4: Available types and possible usages for references on shared data. A • stands for a direct property and a ◦ for a postponed one; "formal P" denotes formal parameters (type given at task declaration) and "effective P" denotes effective ones (type of object given at the task creation); "concurrent" means that more than one task may refer to the same shared data version.

7.4 Example: A Class Embedding ATHAPASCAN Cod
65. four until a given threshold has been reached. The visualization is made during the computation, so the visualization threads have to execute on the X server site (a special scheduling policy is used for these threads).
66. ght Shared_w

a1::Shared_w<T> is the type of a parameter whose value can only be written. This writing can be concurrent with other tasks referencing this shared data in the same mode. The final value is the last one according to the reference order. In the prototype of a task, the related type is:

    a1::Shared_w<T> x

Such an object provides the method:

    void write(T* address)

which assigns the value pointed to by address to the shared object. No copy is made: the data pointed to by address must be considered as lost by the programmer. Further accesses via this address are no longer valid; in particular, the pointer must not be deallocated by the program, since this will be performed by the system itself.

Example:

    class read_complex {
    public:
      void operator()(a1::Shared_w<complex> z) {
        complex* a = new complex;
        cin >> a->x >> a->y;
        z.write(a);   // a now belongs to the system
      }
    };

Note: to clarify the rule that each read sees the last write according to the lexicographic order, follow the example below.

    #include "athapascan-1.h"
    #include <stdio.h>

    struct my_read {
      void operator()(a1::Shared_r<int> x) {
        printf("x = %i\n", x.read());
      }
    };

    struct my_write {
      void operator()(a1::Shared_w<int> x, int val) {
        x.write(new int(val));
      }
    };

    int doit(int argc, char** argv) {
      a1_system::init(argc, argv);
      a1_system::init_commit();
      if (a1_system
67. >
    #include <stdlib.h>

    struct add {
      // This procedure sums two integers in shared memory
      // and writes the result in shared memory.
      void operator()(a1::Shared_r<int> a, a1::Shared_r<int> b, a1::Shared_w<int> c) {
        c.write(new int(a.read() + b.read()));
      }
    };

    struct fibonacci {
      // This procedure recursively computes fibonacci(n), where n is an int,
      // and writes the result in the shared memory.
      void operator()(int n, a1::Shared_w<int> res) {
        if (n < 2) {
          res.write(new int(n));
        } else {
          a1::Shared<int> res1(0);
          a1::Shared<int> res2(0);
          a1::Fork<fibonacci>()(n - 1, res1);
          a1::Fork<fibonacci>()(n - 2, res2);
          a1::Fork<add>()(res1, res2, res);
        }
      }
    };

    struct print {
      // This procedure writes to stdout the result of fibo(n), where n is an int,
      // read from the shared memory.
      void operator()(int n, a1::Shared_r<int> res) {
        cout << "Fibonacci(" << n << ") = " << res.read() << endl;
      }
    };

    int doit(int argc, char** argv) {
      a1::Shared<int> res((int*) 0);
      a1::Fork<fibonacci>()(atoi(argv[1]), res);
      a1::Fork<print>()(atoi(argv[1]), res);
      return 0;
    }

    int main(int argc, char** argv) {
      // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
      return 0;
    }

Explanation: The purpose of this exercise is to share the Fibonacci computation with other processors. Therefor
68. > res((int*) 0);
      a1::Fork<fibonacci>()(atoi(argv[1]), atoi(argv[2]), res);
      a1::Fork<print>()(atoi(argv[1]), res);
      return 0;
    }

    int main(int argc, char** argv) {
      // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
      return 0;
    }

As you reach this point, you should be able to write, compile and run a simple ATHAPASCAN-based program. If you wish to write a real-life, more complex application, you must further study the ATHAPASCAN library. There are many useful concepts that have not yet been presented.

5 Tasks

The granularity of an algorithm is explicitly given by the programmer through the creation of tasks that will be asynchronously executed. A task is an object representing the association of a procedure and some effective parameters. Tasks are dynamically created at run time. A task (creation and execution) in ATHAPASCAN can be seen as a standard procedure call. The only difference is that the execution of the created task is fully asynchronous: the creator does not wait for the created task to finish before continuing its own execution. An ATHAPASCAN program can be seen as a set of tasks scheduled by the system and distributed among nodes for execution.

5.1 Task's Procedure Definition

A task corresponds to the execution of a C++ function object, i.e. an object from a class having the void operator() defined:

    struct user_task {
      void opera
69. he thread (an integer from 0 to a1_system::thread_count() - 1).

4.1.5 Compilation and Execution

The compilation of an ATHAPASCAN program is performed by the make command, using the Makefile created upon installation. Be sure to modify the Makefile as needed so that it points to the folders containing the include and library files.

An ATHAPASCAN program is executed in the same manner as any other program from the command line. For example, a common execution may resemble:

    sh> program_name inputs

4.2 Basic Examples

This section is a brief tutorial on how to build simple ATHAPASCAN programs; it proceeds by teaching through examples. The two examples presented in this section are getInfoTask.cpp, a program that demonstrates how to retrieve system information, and Fibonacci.cpp, a program which computes the Fibonacci number of a given input. More thorough examples are offered in Chapter 9, but they use concepts that have not been discussed thus far.

4.2.1 Simple Example 1: getInfoTask.cpp (1 Fork, 0 Shared)

Let's start with an example using the ATHAPASCAN keyword Fork. We will see later how to use Shared. Assume we want to print to the console some data about the task execution, for example which processor is involved, which node number, etc. Here is a basic example code:

    #include "athapascan-1.h"

    struct getInfoTask {
      void operator()(int i) {
        cout << "Task number: " << i << endl;
        cout << "N
70. hree kinds of declarations are allowed:

• a1::Shared<T> x(new T(...));
  The reference x is initialized with the value pointed to by the constructor parameter. Note that the memory being pointed to will be deallocated by the system and should not be accessed anymore by the program. x cannot be accessed by the task that creates it: it is only possible to Fork other tasks with x as an effective parameter.
  Example:

    a1::Shared<int> x(new int(3));    // x is initialized with the value 3
    double* v = new double(3.14);
    a1::Shared<double> sv(v);         // sv is initialized with the value *v;
                                      // v cannot be used anymore in the program
                                      // and will be deleted by the system

• a1::Shared<T> x(0);
  The reference x is declared but not initialized. Thus the first task that is forked with x as a parameter has to initialize it using a write statement. Otherwise, if a task receives this reference as a parameter and attempts to read a value from it, deadlock will occur.
  Example:

    a1::Shared<int> a(new int(0));    // a is a shared object initialized with the value 0
    a1::Shared<int> x(0);             // x is a NON-initialized shared object

• a1::Shared<T> x;
  The reference x is only declared as a reference, with no related value. x therefore has to be assigned another shared object before forking a task with x as a parameter. Such an assignment is symbolic, having the same s
71. i = z.size - 1; i >= 0; i--)
        in >> z.elts[i];
      return in;
    }

    template <class T>
    void myArray<T>::resize(unsigned int newSize) {
      // erasing the data
      if (newSize == 0) {
        size = 0;
        if (elts != NULL) delete[] elts;
        elts = 0;
        return;
      }
      // the new myArray is smaller
      if (newSize < size) {
        T* newtab = new T[newSize];
        memmove(newtab, elts, newSize * sizeof(T));
        delete[] elts;
        elts = newtab;
        size = newSize;
        return;
      }
      // otherwise the new myArray is bigger
      T* newtab = new T[newSize];
      memmove(newtab, elts, size * sizeof(T));
      delete[] elts;
      elts = newtab;
      size = newSize;
    }

    // test tasks to see if the class is communicable
    struct myTaskW {
      void operator()(a1::Shared_r_w<myArray<int> > x) {
        unsigned int z = x.access().size;
        for (unsigned int i = 0; i < z; i++)
          x.access().elts[i] = 2 * i;
      }
    };

    struct myTaskR {
      void operator()(a1::Shared_r_w<myArray<int> > x) {
        cout << "size of the array: " << x.access().size << endl;
      }
    };

    struct resizeTab {
      void operator()(a1::Shared_r_w<myArray<int> > x, unsigned int n) {
        x.access().resize(n);
      }
    };

    int doit(int argc, char** argv) {
      a1::Shared<myArray<int> > x(new myArray<int>(100));
      a1::Fork<myTaskW>()(x);
      a1::Fork<myTaskR>()(x);
      a1::Fork<resizeTab>()(x, 10);
      a1::Fork<myTaskR>()(x);
      return 0;
    }

    int main(int argc, char** argv) {
      // MAIN METHOD
72. ide communication (active message). This report presents the C++ library of the API of ATHAPASCAN.

Key words: parallel programming, macro data flow, scheduling, cluster and grid computing.

MdC INPG, leader of the ATHAPASCAN team, jean-louis.roch@imag.fr
CR INRIA, thierry.gautier@inrialpes.fr
PhD student (MENSR), remi.revire@imag.fr

Unité de recherche INRIA Rhône-Alpes

ATHAPASCAN: an Interface for Asynchronous Parallel Programming
User's Guide

Abstract: ATHAPASCAN is a macro data-flow interface for asynchronous parallel programming. This interface allows the description of parallelism between computation tasks that synchronize on accesses to objects through a distributed global memory. The parallelism is explicit and of functional type; the detection of synchronizations is implicit. The semantics is sequential, and a program written in ATHAPASCAN is independent of the parallel architecture (cluster or grid). The execution is based on an interpretation of the program that builds a macro data-flow graph. This directed acyclic graph describes the computations and the data dependencies (reads and writes); it is used by the execution support to control the scheduling of the computation tasks and the placement of the data on the resources of the architecture. The implementation relies on the use of threads and one-sided communications
73. ided to override this default definition.

    // class complex is an Athapascan-1 communicable class (complex z = x + iy).
    // NB: This class implements only the methods needed by Athapascan-1.
    class complex {
    public:
      double x;
      double y;

      // empty constructor
      complex() { x = 0; y = 0; }
      // copy constructor
      complex(const complex& z) { x = z.x; y = z.y; }
      // destructor
      ~complex() {}
    };

    // packing operator
    a1::Ostream& operator<<(a1::Ostream& out, const complex& z) {
      out << z.x << z.y;
      return out;
    }

    // unpacking operator
    a1::Istream& operator>>(a1::Istream& in, complex& z) {
      in >> z.x >> z.y;
      return in;
    }

Figure 1: Making the user-defined class complex communicable.

6.3.2 Example 2: Basic List with Pointers

Let's go a bit deeper into serialization and find out how to write a communicable class implementing a dynamic data structure based upon a list of pointers.

NB: Since the container classes from the STL are optimized and have already been made communicable, a hand-written class like this one is of little use in a real-life ATHAPASCAN application. The class in this example is therefore not optimized; it is just an example.

This class implements a chain structure using pointers. When running a parallel application on a cluster of machines, it is meaningless to communicate addresses; therefore we just communicate values. The following class implements a chain structu
74. ioles, BP 93, 06902 Sophia Antipolis Cedex (France)

Publisher: INRIA, Domaine de Voluceau, Rocquencourt, BP 105, 78153 Le Chesnay Cedex (France)
http://www.inria.fr
ISSN 0249-0803
75. ke Branch & Bound, it is convenient to share a variable with all other tasks: data that can be read and written by anybody. This variable usually contains the value of a reached minimum or maximum. No information with respect to another task's activity is associated with this variable.

We are currently in the process of finishing the implementation of global variables for Athapascan-1: variables that can be both read and written on the collection of nodes in a community, without the data dependency constraints that exist for Shared objects. Please bear with us, as this project is still in development.

8.1 Memory Consistencies

The system offers three different consistencies on this memory:
• Causal Consistency, where the data consistency is maintained along the paths of the precedence graph. That is to say, if the task T1 precedes the task T2 in the precedence graph, then the modifications of the memory made by T1 will be seen by T2.
• Processor Consistency, where the data consistency is maintained among the virtual processors of the system. That is to say, the order of the modifications of the memory made on a virtual processor P1 is the same as the order of modifications seen on any other virtual processor P2.
• Asynchronous Consistency, where the data consistency is maintained on the system as a whole. That is to say, each modification made on one virtual processor will eventually be seen on the other virtual processors.

8.2 Declaration
76. m objects. The complex type is communicable and has been previously defined in Chapter 6.

    a1_mem_cc<int> x(0);
    a1_mem_pc<complex> z(0);

    // Keeps the smaller of a and b in a; returns 1 if a was modified, 0 otherwise.
    int min(int& a, const int& b) {
      if (a < b) return 0;
      a = b;
      return 1;
    }

    task T1(a1_mem_ac<double> f) {
      f.fetch_and_op(&min, 0.01);
    }

    task T2() {
      if (x.read() > 5)
        z.write(new complex(1.2, 2.5));
    }

    int a1_main(int argc, char** argv) {
      a1_system::init();
      x.register(new int(1));
      z.register(new complex());
      a1_system::init_commit();
      a1_system::terminate();
      return 0;
    }

Figure 5: Basic usage of a1_mem objects.

8.7 Consideration on Global Data

Global data permit side effects to occur; therefore the guarantee of a sequential execution is not maintained if these data are used.

9 Examples

In this chapter we present several complete examples of ATHAPASCAN programs. These examples are simple enough to be extensively presented within the confines of this chapter, and complex enough to demonstrate the use of ATHAPASCAN in the context of real-world applications. All these examples come with the library distribution.

9.1 Simple Parallel Quicksort

The aim of this implementation is to sort an array of data on two processors using an algorithm based upon a pivot. This implementation uses the class myArray (a resizable array defined earlier) as well as the Shared array class as defined i
77. meter that is a reference to a shared object located in the shared memory. T is the type of this object; it must be communicable. In the case of a classical class T, the type T should not refer to the shared memory. For example, when initializing a shared object a1::Shared<T> s(d), where d points to an object of type T, this pointer should no longer be used in the program.¹

5.3 Task Creation

To create an ATHAPASCAN task, the keyword a1::Fork must be added to the usual procedure call. Here user_task is a function class as described in the previous section:

    C++ function object call:    user_task()( args );
    ATHAPASCAN task creation:    a1::Fork<user_task>()( args );

The newly created task is managed by the system, which will execute it on a virtual processor of its choice (cf. Appendix).

5.4 Task Execution

The task execution is ensured by the ATHAPASCAN system. The following properties are respected:
• the task execution respects the synchronization constraints due to shared memory accesses;
• every created task is executed once and only once;
• no synchronization can occur in the task during its execution.

Hence, for most (but not all) implementations of ATHAPASCAN programs, the system guarantees that every shared data item accessed for either reading or updating is available in the local memory before the execution of the task begins.

5.5 Kinds of Memory a Task Can Access

Each ATHAPASCAN task can access three levels of m
78. n page. What we wish to show here is how to embed parallel instructions in the classes representing a user's data structures. Programming this way makes the main application code a lot simpler to understand and to write.

The idea of the algorithm is:
1. to split the array in two parts;
2. to sort each part in parallel, using qsort;
3. to split each sorted part in two around a pivot (elements ≤ pivot and elements > pivot);
4. to merge the two parts holding the smaller elements, and likewise the two parts holding the larger elements;
5. to append the second merged array to the first one.
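The five steps above can be sketched sequentially in standard C++. This is only an illustration under stated assumptions, not the listing of the guide: the function name two_part_sort is hypothetical, std::vector stands in for the myArray and Shared array classes, std::sort replaces qsort, and the pivot is supplied by the caller. In an ATHAPASCAN version, the two independent sorts of step 2 are the natural candidates for a1::Fork.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper (not part of the ATHAPASCAN API): sorts `data`
// by following the five steps of the text with a caller-supplied pivot.
std::vector<int> two_part_sort(const std::vector<int>& data, int pivot) {
    // 1. Split the array in two parts.
    const std::size_t half = data.size() / 2;
    std::vector<int> left(data.begin(), data.begin() + half);
    std::vector<int> right(data.begin() + half, data.end());

    // 2. Sort each part. The two sorts touch disjoint data, so they are
    //    independent and could run as two parallel tasks.
    std::sort(left.begin(), left.end());
    std::sort(right.begin(), right.end());

    // 3. Split each sorted part around the pivot:
    //    [begin, cut) holds elements <= pivot, [cut, end) holds elements > pivot.
    std::vector<int>::iterator left_cut =
        std::upper_bound(left.begin(), left.end(), pivot);
    std::vector<int>::iterator right_cut =
        std::upper_bound(right.begin(), right.end(), pivot);

    // 4. Merge the two low runs together and the two high runs together.
    std::vector<int> low, high;
    std::merge(left.begin(), left_cut, right.begin(), right_cut,
               std::back_inserter(low));
    std::merge(left_cut, left.end(), right_cut, right.end(),
               std::back_inserter(high));

    // 5. Append the second merged array to the first one.
    low.insert(low.end(), high.begin(), high.end());
    return low;
}
```

Because every element of the low array is at most the pivot and every element of the high array is greater, the concatenation of step 5 is sorted whatever pivot is chosen; the pivot only affects how evenly the merge work of step 4 is balanced.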
79. ng LAM-MPI, please clean up and reboot LAM before executing your ATHAPASCAN program. If the problem persists, please follow the instructions on the ATHAPASCAN web page.

Q: The compiler does not find a task corresponding to my a1::Fork instruction. Why could this be?
A: Make sure that all the shared modes and rights are compatible.
A: Make sure the procedure does not have too many arguments. If it does, recompile your library after having increased the authorized number of parameters at configuration (option nbp of the configure script).

Q: I have tried all the previous suggestions and I still have some errors. What shall I do?
A: Send an e-mail to Jean-Louis.Roch@imag.fr stating your problem.

Unité de recherche INRIA Rhône-Alpes, 655 avenue de l'Europe, 38330 Montbonnot-St-Martin (France)
Unité de recherche INRIA Futurs: Domaine de Voluceau, Rocquencourt, BP 105, 78153 Le Chesnay Cedex (France)
Unité de recherche INRIA Lorraine: LORIA, Technopôle de Nancy-Brabois, Campus scientifique, 615 rue du Jardin Botanique, BP 101, 54602 Villers-lès-Nancy Cedex (France)
Unité de recherche INRIA Rennes: IRISA, Campus universitaire de Beaulieu, 35042 Rennes Cedex (France)
Unité de recherche INRIA Rocquencourt: Domaine de Voluceau, Rocquencourt, BP 105, 78153 Le Chesnay Cedex (France)
Unité de recherche INRIA Sophia Antipolis: 2004 route des Luc
80. ntirely managed by the system, that is to say it must be considered as lost by the programmer. The order of invocation must be the same on all virtual processors.

5 The result of registration is to associate a unique identifier with the object. This identifier results from the incrementation of a variable local to the virtual processor. So two objects are considered identical if their identifiers are equal, that is to say if they have been registered at the same rank.

8.4 Usage

Three operations are allowed on an a1_mem object x representing data of type T:
• The call x.read() returns a constant reference to the data located in x.
• The call x.write(pval) writes into x the value pointed to by pval. The pointer pval, of type T*, must be considered as lost by the programmer.
• The call x.fetch_and_op(f, val), where f has type int (*)(T& a, const T& b) and designates the function to be performed, and where the object val is of type const T&. The first parameter of this function will be the data stored in x and the second will be val. The result of this function should be 1 if the data have been modified; otherwise it should be 0.

8.5 Destruction

The destruction of a1_mem objects is managed by the system and occurs:
• when no task possesses it any more, for objects used as parameters;
• at the end of the program execution, for objects that have been registered.

8.6 Example

The following example (Figure 5) shows the basic usage of a1_me
82. ode number: " << a1_system::self_node() << " out of " << a1_system::node_count() << endl;
        cout << "Thread number: " << a1_system::self_thread() << " out of " << a1_system::thread_count() << endl;
      }
    };

    int doit(int argc, char** argv) {
      for (int i = 0; i < 10; i++)
        a1::Fork<getInfoTask>()(i);
      return 0;
    }

    int main(int argc, char** argv) {
      // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
      return 0;
    }

This program is very simple and shows you how to write a task. Fork is instantiated with the class getInfoTask and will execute the code of the method overloading operator().

Compiling: Recall that compiling an ATHAPASCAN program is done by executing the make command from the command line. For the getInfoTask example, execute:

    sh> make getInfoTask

To execute this newly created program, enter the following on the command line:

    sh> getInfoTask

NB: To run a program built upon LAM-MPI (like the ATHAPASCAN library or Inuktitut), you have to configure your cluster of machines so that they can run rsh to each other.

4.2.2 Simple Example 2: Fibonacci.cpp (multiple Fork and Shared)

Algorithm: The Fibonacci series is defined as:

    $F(0) = 0$
    $F(1) = 1$
    $F(n) = F(n-1) + F(n-2) \quad \forall n \ge 2$

There are different algorithms to compute Fibonacci numbers, some having a linear time complexity $O(n)$. The algorithm we present here is a recursive method. It is a very
83. oid write(T*); prototype: T& access();
• accumulation write access: a1::Shared_cw<F,T> x
  use: x.cumul(v)
  prototype: void cumul(const T&);
  NB: The call x.cumul(v) accumulates v into the shared data according to the accumulation function F. A copy of v may be made during the first accumulation, if the data present is not yet valid.

Example:

    struct add {
      // accumulation function: a receives a + b
      void operator()(int& a, const int& b) const { a += b; }
    };

    struct addToShared {
      void operator()(a1::Shared_cw<add, int> T, int i) {
        T.cumul(i);
      }
    };

    int doit(int argc, char** argv) {
      a1::Shared<int> myShared(new int(10));
      a1::Fork<addToShared>()(myShared, 5);
      return 0;
    }

    int main(int argc, char** argv) {
      // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
      return 0;
    }

Conversion Rules

Because there are several types of Shared data, each with specific reading and writing capabilities, there are restrictions on how Shared objects may be passed with respect to their access rights. Since this material is rather extensive, Chapter 7 is devoted to the study of the Shared object, and it is therefore not covered in this chapter.

Adding thread information

Some scheduling policies benefit from thread information. ATHAPASCAN uses four variables to determine the best method of execution. Each datum has a default value, but a better scheduling may be obtained if the programmer assigns significant values. The variable information
84. or received as effective parameters. The access to a shared object can be:
• read-only: a1::Shared_r<class T>
• write-only: a1::Shared_w<class T>
• cumulative write: a1::Shared_cw<class F, class T>
• read/write: a1::Shared_r_w<class T>

NB1: A shared object can only hold objects of communicable classes.
NB2: A pointer given as a parameter has to be considered as lost by the programmer: all further access through this pointer is invalid.

Declaration
• a1::Shared<T> x; the reference x must be initialized before use.
• a1::Shared<T> x(0); the reference x can be used, but no initial data is associated with it.
• a1::Shared<T> x(new T(...)); the reference x can be used and possesses an initial value.

NB: Be aware that using non-initialized shared data is a common programming error that gives no compile-time warning.

Access Rights and Methods

Each task specifies, as parameters, the shared data objects that will be accessed during its execution and which type of access will be performed on them. According to the requested access right, tasks should use these methods:
• read-only access: a1::Shared_r<T> x
  use: x.read()
  prototype: const T& read() const;
• write-only access: a1::Shared_w<T> x
  use: x.write(p)
  prototype: void write(T*);
  NB: Deallocation is made by ATHAPASCAN.
• read/write access: a1::Shared_r_w<T> x
  use: x.write(p) or x.access()
  prototype: v
86. ove it and return its value
      T pop_front() {
        if (!next) return T();   // empty list: nothing to pop
        myList<T>* x = next;
        T ret = next->value;
        next = x->next;
        x->next = 0;             // detach x so that its destructor frees only x
        delete x;
        return ret;
      }
    };

    // packing operator: the number of values first, then the values
    template <class T>
    a1::Ostream& operator<<(a1::Ostream& out, const myList<T>& z) {
      myList<T>* x = const_cast<myList<T>*>(&z);
      out << x->size();
      while (x->next != 0) {
        x = x->next;
        out << x->value;
      }
      return out;
    }

    // unpacking operator
    template <class T>
    a1::Istream& operator>>(a1::Istream& in, myList<T>& z) {
      int size;
      T temp;
      in >> size;
      for (int i = 0; i < size; i++) {
        in >> temp;
        z.push_back(temp);
      }
      return in;
    }

    // test tasks to see if the class is communicable
    struct myTaskW {
      void operator()(a1::Shared_r_w<myList<int> > x) {
        for (int i = 0; i < 100; i++)
          x.access().push_back(i);
      }
    };

    struct myTaskR {
      void operator()(a1::Shared_r_w<myList<int> > x) {
        myList<int> z = x.access();
        while (z.next != 0)
          cout << z.pop_front() << " ";
      }
    };

    int doit(int argc, char** argv) {
      a1::Shared<myList<int> > x(new myList<int>());
      a1::Fork<myTaskW>()(x);
      a1::Fork<myTaskR>()(x);
      return 0;
    }
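The packing and unpacking operators above rely on a simple wire format: the number of values first, then the values, and never the pointers. The following sketch reproduces that round trip in standard C++ so that it can be tried without the library. It is an illustration under assumptions: std::ostream and std::istream stand in for a1::Ostream and a1::Istream, std::list<int> stands in for myList<int>, and the names pack_list, unpack_list and round_trip are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <list>
#include <ostream>
#include <sstream>

// Stand-in for the packing operator: writes the element count, then the
// values. No address is ever written to the stream.
void pack_list(std::ostream& out, const std::list<int>& values) {
    out << values.size();
    for (std::list<int>::const_iterator it = values.begin();
         it != values.end(); ++it) {
        out << ' ' << *it;
    }
}

// Stand-in for the unpacking operator: reads the count, then rebuilds the
// chain on the receiving side with freshly allocated local nodes.
std::list<int> unpack_list(std::istream& in) {
    std::size_t count = 0;
    in >> count;
    std::list<int> values;
    for (std::size_t i = 0; i < count; ++i) {
        int v = 0;
        in >> v;
        values.push_back(v);
    }
    return values;
}

// Hypothetical helper: sends a list through an in-memory "wire" and back.
std::list<int> round_trip(const std::list<int>& values) {
    std::stringstream wire;
    pack_list(wire, values);
    return unpack_list(wire);
}
```

Sending the count first lets the receiver know how many values to read before it has rebuilt any node, which is why the unpacking loop needs no end-of-list marker.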
88. place N with the desired number. If there are not enough inputs specified at run time, the program should terminate and output a message describing its proper usage. It is necessary to execute the com.leave() function to facilitate the termination of the ATHAPASCAN application. Calling doit() from main() simplifies the code, making it easier to read.

Note: From this point on, all examples contained in this document define a method doit(). It is to be assumed that the program is executed with a main() as defined above. That is to say, the main method will not be shown in the examples.

4.1.2 Fork

Fork is the keyword used by ATHAPASCAN to define tasks that are to be parallelized. To Fork a task, one must:

• write the code to be parallelized: overload the operator() of the class to be Forked. Syntax:

    struct my_Task {
      void operator()(formal parameters) {
        task body
      }
    };

• invoke the task: call the method Fork. Syntax:

    a1::Fork<my_Task>()(effective parameters);

Example:

    struct PrintHello {
      void operator()(int id) {
        printf("Hello world from task number %d\n", id);
      }
    };

    int doit(int argc, char** argv) {
      a1::Fork<PrintHello>()(0);
      return 0;
    }

    int main(int argc, char** argv) {
      // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
      return 0;
    }

NB: All the formal parameters must be made communicable (Cf. Chapter ).

• parameters: ta
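The two steps described in section 4.1.2 (write a function class, then hand it to the runtime together with its effective parameters) can be illustrated with a toy, single-threaded stand-in. Everything below is an assumption made for illustration: `ToyRuntime`, `fork`, `run_all` and `RecordHello` are hypothetical names, and the real library executes tasks asynchronously rather than from a simple queue.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy, single-threaded stand-in for a task runtime (hypothetical).
class ToyRuntime {
public:
    // Mimics the shape of Fork<Task>()(args...): the effective parameters
    // are bound now, and Task::operator() runs later.
    template <class Task, class... Args>
    void fork(Args... args) {
        pending_.push_back([args...]() { Task()(args...); });
    }

    // Executes the pending tasks in creation order.
    void run_all() {
        // Index loop: a running task could fork new tasks and grow the vector.
        for (std::size_t i = 0; i < pending_.size(); ++i) {
            std::function<void()> task = pending_[i];
            task();
        }
        pending_.clear();
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::vector<std::function<void()>> pending_;
};

// A task written in the guide's style: a struct whose operator() is the body.
// It records its message in a log instead of printing it, so it can be checked.
struct RecordHello {
    void operator()(int id, std::vector<std::string>* log) const {
        log->push_back("Hello world from task number " + std::to_string(id));
    }
};
```

With a `ToyRuntime rt` and a `std::vector<std::string> log`, calling `rt.fork<RecordHello>(0, &log)` only records the task; the message appears in `log` once `rt.run_all()` is called. This separation between creating a task and executing it is the point of the sketch.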
89. re using pointers. When running a parallel application on a cluster of machines, it is meaningless to communicate addresses. Therefore only values are communicated:

• a1::Ostream: we first send the number of values, then the values themselves, but we do not send the pointers;
• a1::Istream: we first receive the number of values, then we insert the values in the chain using local pointers.

    #include <iostream.h>
    #include <stdlib.h>
    #include "athapascan-1.h"

    /* class myList is an Athapascan-1 communicable class.
       We use a chain structure to store the values.
       NB: T has to be communicable too. */
    template<class T>
    class myList {
    public:
      T value;
      myList* next;

      // empty constructor
      myList() : value(), next(0) {}

      // constructor
      myList(T v, myList* n) : value(v), next(n) {}

      // copy constructor
      myList(const myList<T>& d) {
        value = d.value;
        if (d.next != 0) next = new myList<T>(*d.next);
        else next = 0;
      }

      // destructor
      ~myList() { if (next != 0) delete next; }

      // return the size of the list
      int size() {
        int s = 0;
        myList<T>* x = this;
        while (x->next != 0) {
          s++;
          x = x->next;
        }
        return s;
      }

      // we push the data at the end of the list
      void push_back(T newval) {
        myList<T>* x = this;
        while (x->next != 0) x = x->next;
        x->next = new myList(newval, 0);
      }

      // we pop the first data from the list, rem
90. -directional, active messages. This report presents the use of the ATHAPASCAN API as a C++ library.

Keywords: parallel programming, macro data flow, scheduling, cluster and grid computing.

1 Information and contacts

The ATHAPASCAN project is still under development. We do our best to produce documentation and software that are as good as possible. Please inform us of any bug, malfunction, question, or comment that may arise.

More information about ATHAPASCAN and the APACHE project can be found online at http://www.apache.imag.fr.

The user can subscribe to the following mailing lists:

• http://listes.imag.fr/wws/info/id-a1-hotline: to get help from the ATHAPASCAN group about installation, programming pitfalls, or bug reports;
• http://listes.imag.fr/wws/info/id-a1-devel: to reach the developers of the ATHAPASCAN group about implementation details, questions, remarks, and design.

The authors thank all the people who have worked on this project.

PhD students:
• François Galilée
• Mathias Doreille
• Gerson Cavalheiro
• Nicolas Maillard

Engineers, students:
• Arnaud Defrenne
• Jo Hecker

2 Introduction

ATHAPASCAN-1 is the C++ application programming interface of ATHAPASCAN. It is a library designed for the programming of parallel applications.

2.1 About ATHAPASCAN

ATHAPASCAN is built on a multi-layered structure
91. retained is:

• The cost of the thread: a C++ double.
• The locality of the thread: a C++ int.
• The priority of the thread: a C++ int. Low values denote higher priorities.
• An extra attribute: a C++ double that represents whatever the scheduler wants it to represent.

This information is given at the thread creation:

    a1::Fork<user_task>(SchedAttributes(infos))(<parameters>);

where infos represents the list of the four possible thread attributes. Here is an example of how to use the scheduling attributes:

    a1::Fork<user_task>(SchedAttributes(prio, cost, loc, extra))(<parameters>);

Note that if both a specific scheduler and some information are given, the order in which they are passed is not important:

    a1::Fork<user_task>(SchedAttributes(prio, cost, loc, extra), sched_group)(<parameters>);
    a1::Fork<user_task>(sched_group, SchedAttributes(prio, cost, loc, extra))(<parameters>);

4.1.4 System Information

It is possible to get the following runtime information:

• a1_system::node_count() returns the number of nodes.
• a1_system::self_node() returns the identification number of the node on which the function is invoked (an integer from 0 to a1_system::node_count() - 1).
• a1_system::thread_count() returns the number of a0 threads dedicated to thread execution on the virtual processor.
• a1_system::self_thread() returns the identification number of the a0 thread that hosts the execution of t
92. rogrammer to suit the specific needs of the program.

NB: All the classes and types used as task parameters must be made communicable.

6.1 Predefined Communicable Types

The following types are communicable:

• the following C++ basic types: short, unsigned short, int, unsigned int, long, unsigned long, char, unsigned char, float, double, long double, char*, void*;
• all types from the STL.

Note that two generic functions for packing and unpacking an array of contiguous elements are provided:

    a1::Ostream& a1_pack(a1::Ostream& out, const T* x, int size);
    a1::Istream& a1_unpack(a1::Istream& in, T* x, int size);

Both functions require the number of elements. They are specialized for basic C++ types.

6.2 Serialization Interface for Classes Stored in Shared Memory

A communicable type T must have the following functions:

• The empty constructor: T()
• The copy constructor: T(const T&)
  NB: the copy shall not have any overlap with the source.
• The destructor: ~T()
  NB: only one task executes the deallocation at a time.
• The sending operator: a1::Ostream& operator<<(a1::Ostream& out, const T& x)
  puts into the output stream out the information needed to reconstruct an image of x using the operator >>.
• The receiving operator: a1::Istream& operator>>(a1::Istream& in, T& x)
  takes from the input stream in the information needed to construct the object x; it allocate
93. [Code listing illegible in this copy.]
94. s and initializes x with the value related to the information. Note that the system always calls this function with an object x that has been constructed with the empty constructor.

6.3 Examples

This section teaches, through examples, how to define a class or type as being communicable. The three examples provided in this section are:

• Complex Numbers, which creates a simple communicable class;
• Basic List with Pointers, which generates a singly linked list that behaves as a queue data structure;
• Resizable Array, which generates a class for the creation of a list of dynamic size.

6.3.1 Example 1: Complex Numbers

Let us consider the complex numbers type. This type can be made communicable by simply implementing the four communication operators.

NB: The C++-provided defaults can often be used to implement the empty constructor, the copy constructor, and the destructor. In the case of the complex type, the default operators are used (refer to a C++ guide to learn more about these default constructors).

Footnotes:

You must be careful when communicating pointers: if your program is executed on several nodes (option a0n 2, for example), the communication can be performed between the nodes, and a pointer is often meaningless on other nodes.

By default, the system considers that all types possess iterators that run all over the data; this is the case of STL types. For all others, the necessary functions have to be prov
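As a rough illustration of the complex numbers example, the sketch below gives a small complex type a sending and a receiving operator. It is not the guide's listing: `Complex` is a hypothetical type written for this sketch, and `std::ostream` / `std::istream` stand in for the library's stream classes so that the code can be compiled without the library.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical complex type. The copy constructor and the destructor are the
// compiler-provided defaults; the empty constructor is written out only
// because a two-argument constructor is also declared.
struct Complex {
    double re;
    double im;
    Complex() : re(0.0), im(0.0) {}
    Complex(double r, double i) : re(r), im(i) {}
};

// Sending operator: writes what is needed to rebuild an image of z.
std::ostream& operator<<(std::ostream& out, const Complex& z) {
    return out << z.re << ' ' << z.im;
}

// Receiving operator: fills an object that was built by the empty constructor.
std::istream& operator>>(std::istream& in, Complex& z) {
    return in >> z.re >> z.im;
}
```

Writing a `Complex` into a `std::stringstream` and reading it back into a default-constructed `Complex` restores both components, which mirrors the contract between the two operators described in section 6.2.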
95. sk parameter can be:

1. a regular object or variable, ex: a1::Fork<myTask>()(class T), with T communicable;
2. a Shared data that can be used by different tasks, ex: a1::Fork<myTask>()(a1::Shared_r<myClass> T), with myClass communicable. A Shared data can have different access rights (Cf. Chapter ).

NB: Shared data must be initialized. A runtime error occurs if this is not done.

Example:

    struct print_A_Shared {
      void operator()(a1::Shared_r<int> T) {
        printf("The Shared data parameter has the value %d\n", T.read());
      }
    };

    int doit(int argc, char** argv) {
      a1::Shared<int> myShared(new int(10));
      a1::Fork<print_A_Shared>()(myShared);
      return 0;
    }

    int main(int argc, char** argv) {
      // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
      return 0;
    }

• communicable type: Only serialized classes can be communicated. The standard classes and types (short, unsigned short, int, unsigned int, long, unsigned long, char, unsigned char, float, double, long double, char*, void*) and the standard container classes of the STL are already communicable by default. User-defined classes may be complex. It is therefore necessary to explain to the library how these classes should be serialized. These classes and types have to be explicitly made communicable by implementing specific class methods (Cf. Chapter ).

4.1.3 Shared Object

A task can access only the objects it has declared
96. [Code listing illegible in this copy.]
97. t modes except sequential, the asynchronous task execution is clearly displayed by the color variance from cell to cell.

[The code listing that follows on this page (the cell evolution task) is illegible in this copy.]
98. tor()(parameters) { body } };

A sequential (hence not asynchronous) call to such a function class is written in C++:

    user_task()(effective parameters);   // user_task is executed according to depth-first ordering

Pay attention to the type of the effective parameters when performing a sequential call of a struct. The type of the effective parameters must exactly match the type of the corresponding formal parameters. For example, a Shared parameter cannot be used where a Shared_r parameter is required. However, when Forking a task, this behavior is allowed (see page to learn more about passing Shared parameters in ATHAPASCAN).

A task is an object of a user-defined type that is instantiated with Fork:

    a1::Fork<user_task>()(effective parameters);   // user_task is executed asynchronously

The synchronizations are defined by the access on the shared objects; the semantic respects the

Example: The task hello_world displays a message on the standard output.

    struct hello_world {
      void operator()() {   // No parameters here
        cout << "Hello world" << endl;
      }
    };

    int doit() {
      hello_world()();             // immediate call (not asynchronous)
      a1::Fork<hello_world>()();   // Creation of a task executed asynchronously
      a1::Fork<hello_world>()();   // Creation of another task executed asynchronously
      return 0;
    }

    int main(int argc, char** argv) {
      // MAIN ME
99. [Code listing illegible in this copy.]
100. uence of resulting values obeys the lexicographic order semantics.

In the prototype of a task, the related type is a1::Shared_cw<F, T> x. Such an object gets the method void cumul(T& v), which accumulates v, according to the accumulation function F, in the shared data referenced by x. For the first accumulation, a copy of v may be taken if the shared data version does not yet contain valid data.

Example:

    // generic function class that performs the multiplication of two values
    template<class T>
    class multiply {
    public:
      void operator()(T& x, const T& val) { x = x * val; }
    };

    // task that multiplies by 2 a shared object
    class mult_by_2 {
    public:
      void operator()(a1::Shared_cw<multiply<int>, int> x) {
        x.cumul(new int(2));
      }
    };

Note: Keep in mind that a program written in ATHAPASCAN can benefit at run time from the associative and commutative properties of the accumulation function F. It is therefore possible that the execution of the code in Figure will result in:

    x=3, val=2
    x=5, val=1

    #include "athapascan-1.h"
    #include <stdio.h>

    struct F {
      void operator()(int& x, const int& val) {
        printf("x=%i, val=%i\n", x, val);
        x += val;
      }
    };

    struct add {
      void operator()(a1::Shared_cw<F, int> x, int val) {
        x.cumul(val);
      }
    };

    int doit(int argc, char** argv) {
      a1::Shared<int> i(new int(1));
      a1::Fork<add>()(i, 2
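The remark about associative and commutative accumulation functions can be checked with plain C++. The sketch below is an illustration only: `accumulate_in_order`, `AddInto` and `SubtractFrom` are hypothetical names, and the accumulation orders are replayed by hand rather than chosen by a runtime.

```cpp
#include <cassert>
#include <vector>

// Accumulation function in the guide's style: folds val into x.
struct AddInto {
    void operator()(int& x, const int& val) const { x += val; }
};

// A function that is neither associative nor commutative, for contrast.
struct SubtractFrom {
    void operator()(int& x, const int& val) const { x -= val; }
};

// Hypothetical helper: folds the contributions into start, in the given order.
template <class F>
int accumulate_in_order(int start, const std::vector<int>& contributions) {
    int x = start;
    F f;
    for (int v : contributions) f(x, v);
    return x;
}
```

Starting from 1 and accumulating 2 then 3 gives the same total as starting from a copy of 3 and accumulating 2 then 1, which is the order suggested by the trace in the note above. With `SubtractFrom` the two orders give different results, which is why the accumulation function is expected to be associative and commutative.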
101. [Code listing illegible in this copy.]
