User Manual for the Pochoir Stencil Compiler
1. embedded in the base language C++ with the Cilk multithreading extensions [12].

The remainder of this manual is organized as follows. Section 2 describes how the Pochoir system can be installed. Section 3 illustrates how a 2D heat equation can be specified using the Pochoir specification language. Section 4 provides a full specification of the Pochoir embedded language. Section A contains a complete list of error messages generated by the Pochoir system. If you encounter bugs in the Pochoir system, please email pochoir@csail.mit.edu.

2 INSTALLATION AND USE

This section describes how to acquire, install, and use Pochoir on your Linux system. The Pochoir compiler has been tested mainly under Ubuntu 10.04 and 11.04. Other Linux systems should also work but have not yet been tested. If you wish to port Pochoir to another operating system, please let us know by emailing pochoir@csail.mit.edu.

(This work was supported in part by a grant from Intel Corporation and in part by the National Science Foundation under Grants CCF-0937860 and CNS-1017058.)

Preliminaries

Before you start, you need the following tool suite in your environment:
• Intel C++ Compiler: available with C++ Composer XE 2011 for Linux (version 12.0.0 or later, with the Cilk Plus extension).
• The Glasgow Haskell Compilation System (version 6.12.1 or later) and Parsec (2.1.0.1 or later), if you want to compile the Pochoir compiler for your system. By default, the system carr
2. Pochoir_Shape<1> {{1, 0}, {0, 1}, {0, -1}, {0, 0}}

The Pochoir template library checks for shape-mismatch errors on temporal as well as spatial dimensions.

Pochoir array registration error

If you run a Pochoir object without registering any Pochoir arrays, the Pochoir template library reports:

    Pochoir registration error: You forgot to register Pochoir array.

Pochoir array access error

Accessing a Pochoir array before registering it with a Pochoir object causes the Pochoir template library to report:

    Pochoir array access error: A Pochoir array is accessed without being registered with a Pochoir object.

Pochoir array size mismatch error

Multiple Pochoir arrays can participate in one stencil computation, that is, be registered with the same Pochoir object. If the user registers Pochoir arrays of different sizes with the same Pochoir object, the Pochoir template library reports:

    Pochoir array size mismatch error: Registered Pochoir arrays have different sizes.

Pochoir illegal access by boundary function error

The Pochoir compiler assumes that any access to a Pochoir array made inside a boundary function through the get method falls within the domain. Suppose that an off-boundary access occurs in a boundary function, such as:

    Pochoir_Boundary_2D(heat_bv_2D, arr, t, i, j)
        return arr.get(t, -1, -1);
    Pochoir_Boundary_End

The Pochoir template library reports: Pochoir illegal access by bound
3. S. Ramachandran. Cache-oblivious algorithms. In FOCS, pages 285-297. IEEE, 1999.
[9] M. Frigo and V. Strumpen. Cache oblivious stencil computations. In ICS, pages 361-366. ACM, 2005.
[10] M. Frigo and V. Strumpen. The cache complexity of multithreaded cache oblivious algorithms. Theory of Computing Systems, 45(2):203-233, 2009.
[11] P. Hudak. Building domain-specific embedded languages. ACM Computing Surveys, 28, December 1996.
[12] Intel Corporation. Intel Cilk Plus Language Specification, 2010. Document Number 324396-001US. Available from http://software.intel.com/sites/products/cilk-plus/cilk_plus_language_specification.pdf.
[13] S. Kamil, K. Datta, S. Williams, L. Oliker, J. Shalf, and K. Yelick. Implicit and explicit optimizations for stencil computations. In Workshop on Memory System Performance and Correctness, pages 51-60. ACM, 2006.
[14] S. Kamil, P. Husbands, L. Oliker, J. Shalf, and K. Yelick. Impact of modern memory subsystems on cache optimizations for stencil computations. In Workshop on Memory System Performance, pages 36-43. ACM, 2005.
[15] S. Krishnamoorthy, M. Baskaran, U. Bondhugula, J. Ramanujam, A. Rountev, and P. Sadayappan. Effective automatic parallelization of stencil computations. In PLDI, 2007.
[16] M. Mernik, J. Heering, and A. M. Sloane. When and how to develop domain-specific languages. ACM Computing Surveys, 37:316-344, December 2005.
[17] A. Nakano, R. Kalia, and P. Vashishta. Multiresolution molecular dynamics algorithm for realistic materials mo
4. Cylindrical: nonperiodic in x with constant value 0 and periodic in y. (d) Dirichlet: varying with time. (e) Neumann: constrained first derivative 0.

the multicore machines we used for benchmarking.

The static information about a Pochoir stencil computation, such as the computing kernel, the boundary conditions, and the stencil shape, is stored in a Pochoir object, which is declared as follows:

• Pochoir_dimD name(shape);
This statement declares name as a Pochoir object with dim spatial dimensions and computing shape shape, where dim is a small positive integer and shape is an array of arrays that describes the shape of the stencil, as elaborated below.

We now itemize the remaining Pochoir constructs and explain the semantics of each.

• Pochoir_Shape_dimD name[] = {cells}
This statement declares name as a Pochoir shape that can hold shape information for dim spatial dimensions. The Pochoir shape is equivalent to an array of arrays, each of which contains dim + 1 integers. These numbers represent the offset of each memory footprint in the stencil kernel relative to the space-time grid point (t, x, y). For example, suppose that the computing kernel employs the following update equation:

\[
u_t(x, y) = u_{t-1}(x, y)
  + \frac{\alpha \Delta t}{\Delta x^2} \bigl( u_{t-1}(x-1, y) + u_{t-1}(x+1, y) - 2 u_{t-1}(x, y) \bigr)
  + \frac{\alpha \Delta t}{\Delta y^2} \bigl( u_{t-1}(x, y-1) + u_{t-1}(x, y+1) - 2 u_{t-1}(x, y) \bigr)
\]

The shape of this stencil is {{0, 0, 0}, {-1, 1, 0}, {-1, 0, 0}, {-1, -1, 0}, {-1, 0, 1}, {-1, 0, -1}}.
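As a rough illustration of what a shape encodes, the sketch below stores the example shape as a plain C++ array of offsets and checks whether a given kernel access is one of the declared cells. It assumes, as the update equation implies, that every cell other than the home cell has time offset -1. The names `Cell`, `kExampleShape`, and `in_shape` are hypothetical and are not part of Pochoir; this is not how the Pochoir template library is implemented.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One shape cell: {time offset, x offset, y offset} relative to (t, x, y).
// Hypothetical type for illustration only; Pochoir uses Pochoir_Shape_dimD.
using Cell = std::array<int, 3>;

// The example shape: the home cell plus five cells at the previous time step
// (assumed time offset -1, following the update equation).
const std::array<Cell, 6> kExampleShape = {{
    {{0, 0, 0}},
    {{-1, 1, 0}}, {{-1, 0, 0}}, {{-1, -1, 0}},
    {{-1, 0, 1}}, {{-1, 0, -1}},
}};

// Returns true if the offset (dt, dx, dy) is one of the declared cells.
template <std::size_t N>
bool in_shape(const std::array<Cell, N>& shape, int dt, int dx, int dy) {
    for (const Cell& c : shape) {
        if (c[0] == dt && c[1] == dx && c[2] == dy) return true;
    }
    return false;
}
```

A kernel that read the grid point two cells away in x, for instance offset (-1, 2, 0), would fail this membership test, which is the kind of mismatch a shape declaration is meant to expose.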
5. HOIR SPECIFICATION LANGUAGE

This section describes the formal syntax and semantics of the Pochoir language, which was designed to offer as much expressiveness as possible without violating the Pochoir Guarantee. Since we wanted to allow third-party developers to implement their own stencil compilers that could use the Pochoir specification language, we have avoided, to the extent possible, making the language too specific to the Pochoir compiler, the Intel C++ compiler, and

    #define mod(r, m) ((r) % (m) + ((r) < 0 ? (m) : 0))

    Pochoir_Boundary_2D(zero_bdry, a, t, x, y)
        return 0;
    Pochoir_Boundary_End
    (a)

    Pochoir_Boundary_2D(toroidal, a, t, x, y)
        return a.get(t,
                     mod(x, a.size(1)),
                     mod(y, a.size(0)));
    Pochoir_Boundary_End
    (b)

    Pochoir_Boundary_2D(cylindrical, a, t, x, y)
        if ((x < 0) || (x >= a.size(1)))
            return 0;
        return a.get(t, x, mod(y, a.size(0)));
    Pochoir_Boundary_End
    (c)

    Pochoir_Boundary_2D(dirichlet, a, t, x, y)
        return 100 + 0.2 * t;
    Pochoir_Boundary_End
    (d)

    Pochoir_Boundary_2D(neumann, a, t, x, y)
        int xx(x), yy(y);
        if (x < 0) xx = 0;
        if (x >= a.size(1)) xx = a.size(1) - 1;
        if (y < 0) yy = 0;
        if (y >= a.size(0)) yy = a.size(0) - 1;
        return a.get(t, xx, yy);
    Pochoir_Boundary_End
    (e)

Figure 3: Pochoir specifications for a variety of boundary conditions. (a) Grid: nonperiodic with constant value 0. (b) Toroidal: periodic in both x and y. (c)
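The index arithmetic behind the periodic and Neumann boundary functions of Figure 3 can be sketched in plain C++ as follows. The helper names `wrap_index` and `clamp_index` are hypothetical and do not appear in Pochoir. Note that `wrap_index` is a variant of the manual's `mod` macro rather than a copy of it: the extra `% m` is an assumption made here so that exact negative multiples of the extent also map into range.

```cpp
#include <cassert>

// Wraps index r into [0, m) for a periodic dimension of extent m > 0.
// Variant of the manual's mod macro; the second "% m" also maps exact
// negative multiples of m (for example r == -m) to 0.
inline int wrap_index(int r, int m) {
    return ((r % m) + m) % m;
}

// Clamps index r into [0, m - 1], which is what a Neumann-style boundary
// function does when it returns the nearest in-domain value.
inline int clamp_index(int r, int m) {
    if (r < 0) return 0;
    if (r >= m) return m - 1;
    return r;
}
```

For accesses that stray only one cell past the edge, as in a five-point stencil, both `wrap_index` and the manual's macro give the same result.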
6. The first cell in the shape is the home cell, whose spatial coordinates must all be 0. During the computation, this cell corresponds to the grid point being updated. The remaining cells must have time offsets that are smaller than the time coordinate of the home cell, and the corresponding grid points are read-only during the computation. The depth of a shape is the time coordinate of the home cell minus the minimum time coordinate of any cell in the shape. The depth corresponds to the number of time steps on which a grid point depends. For our example stencil, the depth of the shape is 1, since a point at time t depends on points at time t - 1. If a stencil shape has depth k, the programmer must initialize all Pochoir arrays for time steps 0, 1, ..., k - 1 before running the computation.

• Pochoir_Array_dimD(type, depth) name(size_{dim-1}, ..., size_1, size_0)
This statement declares name as a Pochoir array of type type with dim spatial dimensions and a temporal dimension. The size of the ith spatial dimension, where i = 0, 1, ..., dim - 1, is given by size_i. The temporal dimension has size k + 1, where k is the depth of the Pochoir shape, and its time slots are reused modulo k + 1 as the computation proceeds. The user may not obtain an alias to the Pochoir array or its elements.

• Pochoir_Boundary_dimD(name, array, idx_t, idx_{dim-1}, ..., idx_1, idx_0)
      <definition>
  Pochoir_Boundary_End
This construct defines a boundary function called name that will be invo
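The definition of depth and the modulo reuse of time slots can be made concrete with a small sketch. Everything here is hypothetical illustration code: `Cell1D`, `shape_depth`, and `time_slot` are names invented for this example, and the actual storage layout of a Pochoir array may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// A shape cell for a 1D stencil: {time offset, x offset}. Hypothetical type.
struct Cell1D { int dt; int dx; };

// Depth = time coordinate of the home cell (the first cell) minus the
// minimum time coordinate of any cell in the shape.
inline int shape_depth(const std::vector<Cell1D>& shape) {
    int min_t = shape.front().dt;
    for (const Cell1D& c : shape) min_t = std::min(min_t, c.dt);
    return shape.front().dt - min_t;
}

// Storage slot for logical time step t when only depth + 1 time slices
// are kept and reused modulo depth + 1.
inline int time_slot(int t, int depth) {
    return t % (depth + 1);
}
```

With a depth-1 shape, time steps 0, 1, 2, 3 map to slots 0, 1, 0, 1, so two time slices suffice no matter how many steps are run.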
7. User Manual for the Pochoir Stencil Compiler

Charles E. Leiserson, Yuan Tang
MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139, USA

1 INTRODUCTION

Pochoir (pronounced "PO-shwar") [22] is a compiler and runtime system for implementing stencil computations on multicore processors. A stencil defines the value of a grid point in a d-dimensional spatial grid at time t as a function of neighboring grid points at recent times before t. A stencil computation [1, 2, 4, 5, 9, 10, 13, 15, 17, 19, 21, 23] computes the stencil for each grid point over many time steps.

Stencil computations are conceptually simple to implement using nested loops, but looping implementations suffer from poor cache performance. Cache-oblivious [8, 20] divide-and-conquer stencil codes [9, 10] are much more efficient, but they are difficult to write, and when parallelism is factored into the mix, most application programmers do not have the programming skills or patience to produce efficient multithreaded codes.

The Pochoir stencil compiler achieves a substantial performance improvement over a straightforward loop parallelization for typical stencil applications. Pochoir allows programmers to write a simple functional specification for arbitrary d-dimensional stencils, and then it automatically produces a highly optimized, cache-efficient, parallel implementation. The Pochoir language can be viewed as a domain-specific language [3, 11, 16]
8. ary function error: Out of range access by boundary function at index (0, -1, -1)

B REFERENCES

[1] R. Bleck, C. Rooth, H. Dingming, and L. Smith. Salinity-driven thermocline transients in a wind- and thermohaline-forced isopycnic coordinate model of the North Atlantic. Journal of Physical Oceanography, 22(12):1486-1505, 1992.
[2] K. Datta, M. Murphy, V. Volkov, S. Williams, J. Carter, L. Oliker, D. Patterson, J. Shalf, and K. Yelick. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. In SC, pages 4:1-4:12, Austin, TX, Nov. 15-18, 2008.
[3] A. van Deursen, P. Klint, and J. Visser. Domain-specific languages: An annotated bibliography. SIGPLAN Not., 35(6):26-36, June 2000.
[4] H. Dursun, K. Nomura, L. Peng, R. Seymour, W. Wang, R. K. Kalia, A. Nakano, and P. Vashishta. A multilevel parallelization framework for high-order stencil computations. In International Euro-Par Conference on Parallel Processing, pages 642-653, 2009.
[5] H. Dursun, K. Nomura, W. Wang, M. Kunaseth, L. Peng, R. Seymour, R. K. Kalia, A. Nakano, and P. Vashishta. In-core optimization of high-order stencil computations. In PDPTA, pages 533-538, 2009.
[6] J. F. Epperson. An Introduction to Numerical Methods and Analysis. Wiley-Interscience, 2007.
[7] H. Feshbach and P. Morse. Methods of Theoretical Physics. Feshbach Publishing, 1981.
[8] M. Frigo, C. E. Leiserson, H. Prokop, and
9. ary function for this periodic stencil performs calculations modulo the dimensions of the spatial grid. (Figure 3 shows how nonperiodic stencils can be specified, including how to specify Dirichlet and Neumann boundary conditions [7].) Line 10 associates the boundary function heat_bv with the Pochoir array u. Each Pochoir array has exactly one boundary function to supply a value when the computation accesses grid points outside of the computing domain. Line 11 registers the Pochoir array u with the heat Pochoir object. A Pochoir array can be registered with more than one Pochoir object, and a Pochoir object can have multiple Pochoir arrays registered.

Lines 12-14 define a kernel function heat_fn, which specifies how the stencil is computed for every grid point. This kernel can be an arbitrary piece of code, but accesses to the registered Pochoir arrays must respect the declared shape(s).

Lines 15-17 initialize the Pochoir array u with values for time step 0. If a stencil depends on more than one prior step, as indicated by the Pochoir shape, multiple time steps may need to be initialized. Line 18 executes the stencil object heat for T time steps using kernel function heat_fn. Lines 19-21 print the result of the computation by reading the elements u(T, x, y) of the Pochoir array. In fact, Pochoir overloads the << operator so that the Pochoir array can be pretty-printed by simply writing cout << u.

4 THE POC
10. deling on parallel computers. Computer Physics Communications, 83(2-3):197-214, 1994.
[18] A. Nitsure. Implementation and optimization of a cache oblivious lattice Boltzmann algorithm. Master's thesis, Institut für Informatik, Friedrich-Alexander-Universität Erlangen-Nürnberg, 2006.
[19] L. Peng, R. Seymour, K. Nomura, R. K. Kalia, A. Nakano, P. Vashishta, A. Loddoch, M. Netzband, W. R. Volz, and C. C. Wong. High-order stencil computations on multicore clusters. In IPDPS, pages 1-11. IEEE, 2009.
[20] H. Prokop. Cache-oblivious algorithms. Master's thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, June 1999.
[21] A. Taflove and S. Hagness. Computational Electrodynamics: The Finite-Difference Time-Domain Method. Artech House, Norwood, MA, 2000.
[22] Y. Tang, R. Chowdhury, B. C. Kuszmaul, C.-K. Luk, and C. E. Leiserson. The Pochoir stencil compiler. In SPAA. ACM, 2011. To appear.
[23] S. Williams, J. Carter, L. Oliker, J. Shalf, and K. Yelick. Lattice Boltzmann simulation optimization on leading multicore platforms. In IPDPS, pages 1-14. IEEE, 2008.
11. f domain memory access occurs. Each Pochoir array is associated with exactly one boundary function at any given time, but the programmer can change boundary functions by registering a new one.

• name.Run(T, kern)
This function call runs the stencil computation on the Pochoir object name for T time steps using the computing kernel function kern. After running the computation for T steps, the results can be accessed by indexing its Pochoir arrays at time T + k - 1, where k is the depth of the stencil shape. The programmer may resume the stencil after examining the result by calling name.Run(T', kern), where T' is the number of additional steps to execute. The result of the computation is then in the computation's Pochoir arrays, indexed by time T + T' + k - 1.

APPENDIX

A ERROR MESSAGES

The Pochoir template library used during Phase 1 compilation reports as many bugs as possible to assist debugging. This section documents the error messages. If your code compiles and runs correctly during Phase 1, it should not encounter any compilation or runtime problems during Phase 2. If it does, please report the bug to pochoir@csail.mit.edu.

List of Pochoir error messages

Compile-time errors:
• Pochoir environment variable not set

Run-time errors:
• Pochoir off-shape access error
• Pochoir array registration error
• Pochoir array access error
• Pochoir array size mismatch error
• Pochoir i
12. forming a source-to-source translation into a postsource C++ program that employs the Cilk extensions. The postsource is then compiled with the Intel compiler to produce the optimized binary executable.

[Figure 1 diagram; panel labels: Intel C++ Compiler, Pochoir Template Library, Pochoir Compiler, Postsource Cilk Code, Optimized Parallel Code.]

Figure 1: Pochoir's two-phase compilation strategy. (a) During Phase 1, the programmer uses the normal Intel C++ compiler to compile his or her code with the Pochoir template library. Phase 1 verifies that the programmer's stencil specification is Pochoir-compliant. (b) During Phase 2, the programmer uses the Pochoir compiler, which acts as a preprocessor to the Intel C++ compiler, to generate optimized multithreaded Cilk code.

The Pochoir compiler makes the following promise:

The Pochoir Guarantee: If the stencil program compiles and runs with the Pochoir template library during Phase 1, no errors will occur during Phase 2 when it is compiled with the Pochoir compiler or during the subsequent running of the optimized binary.

Usage of the Pochoir compiler

The shell command pochoir without arguments causes the Pochoir compiler to output its basic usage, as follows:

    Usage: pochoir [OPTION]... filename
    Try `pochoir --help' for more options

    Usage: pochoir [OPTION]... fi
13. ies a Pochoir compiler for Intel 64 architecture for your convenience.

Acquire the software

Please send email to pochoir@csail.mit.edu requesting a copy of the Pochoir system. You will receive a tarball with the software.

Set up the Pochoir environment

Suppose that the home directory for the Pochoir package is $pochoir, which is the directory containing all the pochoir_xxx.hpp and .hs files. Typing

    make pochoir

in directory $pochoir (provided the Haskell compilation system and Parsec package are correctly installed) will produce the Pochoir compiler, named pochoir.

Compiling a program using Pochoir

By default, the Pochoir compiler assumes the Intel C++ compiler can be accessed via the name icpc. Before compilation, set up the environment variable by typing

    export POCHOIR_LIB_PATH=$pochoir

You can also add this line to your start-up shell script.

As shown in Figure 1, the Pochoir system operates in two phases, only the second of which involves the Pochoir compiler itself. For the first phase, the programmer compiles the source program with the ordinary Intel C++ compiler using the Pochoir template library, which implements Pochoir's linguistic constructs using unoptimized but functionally correct algorithms. This phase ensures that the source program is Pochoir-compliant. For the second phase, the programmer runs the source through the Pochoir compiler, which acts as a preprocessor to the Intel C++ compiler, per
14. ked to supply a value when the stencil computation accesses a point outside the domain of the Pochoir array array. The Pochoir array array has dim spatial dimensions, and idx_{dim-1}, ..., idx_1, idx_0 are the spatial coordinates of the given point outside the domain of array. The coordinate in the time dimension is given by idx_t. The function body <definition> is C++ code that defines the values of array on its boundary. A current restriction is that this construct must be declared outside of any function, that is, the boundary function must be declared at global scope.

• Pochoir_Kernel_dimD(name, array, idx_t, idx_{dim-1}, ..., idx_1, idx_0)
      <definition>
  Pochoir_Kernel_End
This construct defines a kernel function named name for updating a stencil on a spatial grid with dim spatial dimensions. The spatial coordinates of the point to update are idx_{dim-1}, ..., idx_1, idx_0, and idx_t is the coordinate in the time dimension. The function body <definition> may contain arbitrary C++ code to compute the stencil. Unlike boundary functions, this construct can be defined in any context.

• name.Register_Array(array)
A call to this member function of a Pochoir object name informs name that the Pochoir array array will participate in its stencil computation.

• name.Register_Boundary(bdry)
A call to this member function of a Pochoir array name associates the declared boundary function bdry with name. The boundary function is invoked to supply a value whenever an of
15. lename
    Run the Pochoir stencil compiler on filename.
    -auto-optimize : Let the Pochoir compiler automatically choose the best optimizing level. (Default)
    -debug : Perform Phase 1 compilation by running the ordinary C++ compiler with the Pochoir template library.
    -split-macro-shadow : Use macro tricks to split the interior and boundary regions.
    -split-pointer : Split the interior and boundary regions, and use ordinary C-style pointers to optimize the base case.

Phase 1 compilation

To compile a stencil application named stencil.cpp for debugging, supply the -debug option to the Pochoir compiler, as follows:

    CC = pochoir
    stencil : stencil.cpp
        ${CC} -o stencil -O0 -g -debug stencil.cpp

In this phase, the code is compiled against a template library that provides a functionally correct, serial, looping implementation of the stencil algorithm. Also in Phase 1, the template library tries to catch as many bugs as possible, as listed in Appendix Section A. The command-line option -debug tells the pochoir compiler that this is a Phase 1 compilation.

Phase 2 compilation

To compile your stencil specification for Phase 2 optimization, remove the -debug option from the command line and pass to Pochoir any other optimization options you would normally pass to the icpc compiler:

    CC = pochoir
    stencil : stencil.cpp
        ${CC} -o stencil -O3 stencil.cpp

Examples

The directory Examples in the distributi
16. llegal access by boundary function error

Pochoir environment variable not set

When the Pochoir compiler is invoked, it automatically checks the value of the environment variable POCHOIR_LIB_PATH and uses it as the path to the Pochoir template library. If the environment variable is not set up properly, the Pochoir compiler reports:

    Pochoir environment variable not set: POCHOIR_LIB_PATH

Pochoir off-shape access error

This error message reports a mismatch between the shape specified via Pochoir_Shape, as registered in a Pochoir object declaration, and the shape the user actually used in the Pochoir_Kernel_dimD. For example, suppose that the user writes the following piece of stencil code, as in Figure 4:

    Pochoir_Shape_1D heat_shape_1D[] = {{1, 0}, {0, 1}, {0, -1}, {0, 0}};
    Pochoir_1D heat_1D(heat_shape_1D);
    Pochoir_Array_1D(double) a(N_SIZE);
    a.Register_Boundary(heat_bdry);
    heat_1D.Register_Array(a);
    Pochoir_Kernel_1D(heat_1D_fn, t, i)
        a(t+1, i) = 0.125 * (a(t, i+1) - 2.0 * a(t, i) + a(t, i-2));
    Pochoir_Kernel_End
    // Initialization
    heat_1D.Run(T_STEP, heat_1D_fn);

Figure 4: The Pochoir stencil source code for a periodic 1D heat equation.

After the user compiles and runs it with the normal C++ compiler, the Pochoir template library reports the following errors at runtime:

    Pochoir off-shape access error: Pochoir array index (0, 999), Shape index (0, -2)
    Input
17. on contains many examples illustrating the usage of Pochoir, including Conway's Game of Life, the 2D heat equation, the 3D wave equation, RNA secondary structure alignment, etc.

3 TUTORIAL

To illustrate how to use Pochoir, consider how the 2D heat equation [6]

\[
\frac{\partial u_t(x, y)}{\partial t} = \alpha \left( \frac{\partial^2 u_t(x, y)}{\partial x^2} + \frac{\partial^2 u_t(x, y)}{\partial y^2} \right)
\]

on an X × Y grid, where u_t(x, y) is the heat at a point (x, y) at time t and α is the thermal diffusivity, might be solved using a stencil computation. By discretizing space and time, this partial differential equation can be solved approximately by using the following Jacobi-style update equation:

\[
u_{t+1}(x, y) = u_t(x, y)
  + \frac{\alpha \Delta t}{\Delta x^2} \bigl( u_t(x-1, y) + u_t(x+1, y) - 2 u_t(x, y) \bigr)
  + \frac{\alpha \Delta t}{\Delta y^2} \bigl( u_t(x, y-1) + u_t(x, y+1) - 2 u_t(x, y) \bigr)
\]

Figure 2 shows the Pochoir source code for the periodic 2D heat equation. Line 7 declares the Pochoir shape of the stencil, and line 8 creates the 2-dimensional Pochoir object heat having that shape. The Pochoir object will contain all the state necessary to perform the computation. Each triple in the array 2D_five_pt corresponds to a relative offset from the space-time grid point (t, x, y) that the stencil kernel (declared in lines 12-14) will access. The compiler cannot infer the stencil shape from the kernel, because the kernel can be arbitrary code, and accesses to the grid points can be hidden in subroutines. The Pochoir template library complains during Phase 1, however, if an access to a grid point during
18. the kernel computation falls outside the region specified by the shape declaration.

     1  #define mod(r, m) ((r) % (m) + ((r) < 0 ? (m) : 0))
     2  Pochoir_Boundary_2D(heat_bv, a, t, x, y)
     3    return a.get(t, mod(x, a.size(1)), mod(y, a.size(0)));
     4  Pochoir_Boundary_End
     5  int main(void) {
     6    const int X = 1000, Y = 1000, T = 1000;
     7    Pochoir_Shape_2D 2D_five_pt[] = {{1,0,0}, {0,1,0}, {0,-1,0}, {0,0,0}, {0,0,-1}, {0,0,1}};
     8    Pochoir_2D heat(2D_five_pt);
     9    Pochoir_Array_2D(double) u(X, Y);
    10    u.Register_Boundary(heat_bv);
    11    heat.Register_Array(u);
    12    Pochoir_Kernel_2D(heat_fn, t, x, y)
    13      u(t+1, x, y) = CX * (u(t, x+1, y) - 2 * u(t, x, y) + u(t, x-1, y)) + CY * (u(t, x, y+1) - 2 * u(t, x, y) + u(t, x, y-1)) + u(t, x, y);
    14    Pochoir_Kernel_End
    15    for (int x = 0; x < X; ++x)
    16      for (int y = 0; y < Y; ++y)
    17        u(0, x, y) = rand();
    18    heat.Run(T, heat_fn);
    19    for (int x = 0; x < X; ++x)
    20      for (int y = 0; y < Y; ++y)
    21        cout << u(T, x, y);
    23    return 0;
    24  }

Figure 2: The Pochoir stencil source code for a periodic 2D heat equation. Pochoir keywords are boldfaced.

Line 9 declares u as an X × Y Pochoir array of double-precision floating-point numbers representing the spatial grid. Lines 2-4 define a boundary function that will be called when the kernel function accesses grid points outside the computing domain, that is, if it tries to access u(t, x, y) with x < 0, x >= X, y < 0, or y >= Y. The bound
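For comparison with the Pochoir specification in Figure 2, the same periodic update can be written directly as nested loops over two buffers. The sketch below is a hypothetical stand-alone version, not Pochoir's template library or generated code: the function `heat_loops`, the helper `wrap`, and the row-major `std::vector` layout are assumptions made for illustration, and `cx` and `cy` play the role of the constants CX and CY in the figure.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Periodic index wrap for a dimension of extent m > 0.
inline int wrap(int r, int m) { return ((r % m) + m) % m; }

// Advances a periodic X-by-Y heat grid by `steps` Jacobi updates using two
// buffers. cx and cy stand for alpha*dt/dx^2 and alpha*dt/dy^2.
std::vector<double> heat_loops(std::vector<double> u, int X, int Y,
                               int steps, double cx, double cy) {
    std::vector<double> next(u.size());
    // Reads grid g at (x, y), wrapping out-of-range indices periodically.
    auto at = [&](const std::vector<double>& g, int x, int y) {
        return g[static_cast<std::size_t>(wrap(x, X)) * Y + wrap(y, Y)];
    };
    for (int t = 0; t < steps; ++t) {
        for (int x = 0; x < X; ++x) {
            for (int y = 0; y < Y; ++y) {
                const double c = at(u, x, y);
                next[static_cast<std::size_t>(x) * Y + y] =
                    c + cx * (at(u, x + 1, y) - 2 * c + at(u, x - 1, y))
                      + cy * (at(u, x, y + 1) - 2 * c + at(u, x, y - 1));
            }
        }
        std::swap(u, next);  // the newly computed step becomes the current one
    }
    return u;
}
```

A loop nest like this sweeps the whole grid once per time step, which is the access pattern the introduction describes as having poor cache performance; it is useful mainly as a reference against which an optimized result can be checked.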