Home

SparseX User's Guide

image

Contents

1. void spx_vec_scale spx_vector_t v1 spx vector t v2 spx value t num void spx vec scale add spx vector t v1 spx vector t v2 spx vector txv3 spx value t num void spx vec scale add part spx vector t v1l spx vector t v2 spx vector _txv3 spx value tnum spx index t start spx index t end void spx vec add spx vector t xv1 spx vector t v2 spx vector t xv3 void spx vec add part spx vector t v1 spx vector t v2 spx vector t v3 spx index t start spx index t end void spx vec sub spx vector t v1 spx vector t v2 spx vector t v3 void spx vec sub part spx vector t v1 spx vector t v2 spx vector t v3 spx index t start spx index t end spx value t spx vec mul const spx vector t v1 const spx vector t v2 spx value t spx vec mul part const spx vector t v1 const spx vector t xv2 spx index t start spx index t end spx error t spx vec reorder spx vector t v spx perm t xp spx error t spx vec inv reorder spx vector t v spx perm t xp void spx vec copy const spx vector t v1 spx vector t xv2 int spx vec compare const spx vector t v1 const spx vector t v2 void spx vec print const spx vector t v void spx vec destroy spx vector t v void spx timer clear spx timer txt void spx timer start spx timer txt void spx timer pause spx timer txt double spx timer get secs spx timer txt void spx log disable all void spx log error console void spx log warning console
2. void spx log info console void spx log verbose console void spx log debug console void spx log error file void spx log warning file void spx log info file void spx log verbose file void spx log debug file 29 30 void spx log all console void spx log all file const char xfile void spx log set file const char xfile spx errhandler t spx err get handler void spx err set handler spx errhandler t new handler A 2 Routine documentation A 2 1 void spx init Library initialization routine A 2 2 void spx finalize Library shutdown routine A 2 3 spx input t spx input load csr spx index t rowptr spx index t colind spx value t x values spx index t nr rows spx index t nr cols Creates and returns a valid tunable matrix object from a Compressed Sparse Row CSR representation Parameters in rowptr array rowptr of the CSR format in colind array colind of the CSR format in values array values of the CSR format in nr rows number of rows of the matrix in nr cols number of columns of the matrix in argument that specifies the indexing either SPX IND EX ZERO BASED or SPX INDEX ONE BASED Returns a handle to the input matrix or SPX INVALID INPUT in case an error occurs Possible error conditions 1 SPX ERR ARG INVALID any of the input arguments are invalid 2 SPX ERR INPUT MAT the input data a
3. lt athena cslab ece ntua gr gt Past contributors include e Kornilios Kourtis lt kkourt cslab ece ntua gr gt e Thodoris Gountouvas lt thgoud cslab ece ntua gr gt 51 Bibliography J Ankit pOSKI An Extensible Autotuning Framework to Perform Optimized SpMVs on Multicore Architectures 2008 S Balay J Brown K Buschelman V Eijkhout W D Gropp D Kaushik M G Knepley L C McInnes B F Smith and H Zhang PETSc users manual Technical Report ANL 95 11 Revision 3 4 Argonne National Laboratory 2013 URL http www mcs anl gov petsc R Boisvert R Pozo and Karin Remington The matrix market exchange formats Initial design Technical Report NISTIR 5935 National Institute of Standards and Technol ogy December 1996 E Cuthill and J McKee Reducing the bandwidth of sparse symmetric matrices In Proceedings of the 1969 24th National Conference ACM 1969 A Elafrou Sparsex A library for the optimization of the spmv kernel on multicore architectures Master s thesis National Technical University of Athens 2013 Intel Coorporation Intel Math Kernel Library Intel Coorporation 2013 URL http software intel com en us intel mkl V Karakasis T Gkountouvas K Kourtis G Goumas and N Koziris An Extended Compression Format for the Optimization of Sparse Matrix Vector Multiplication IEEE Transactions on Parallel and Distributed Systems TPDS 24 10 1930 1940 2013 IEEE K Kourtis V Kar
4. s integration in the numerical software stack would have an important impact on several scientific applications whose overall performance is sensitive to that of the SpMV kernel To this end we provide the means to facilitate this process through a low level interface Key goals and aspects of our interface sum up to the following Provide simple and clear semantics Every well designed API should be easy to use correctly and difficult to use incorrectly A user friendly syntax reduces the time 1 1 GETTING STARTED and intellectual overhead required to develop user software as well as making the development process less error prone Of course this is a universal objective in interface design and achieving it depends to a significant degree in minimizing the number of things the user must remember in order to effectively use the interface In the present context this implies the number of function names the user must remember should be small e to the extent possible the information that functions require as input pa rameters should be limited to information that the user would necessarily know Our interface tries to reflect the above guidelines as well as being as intuitive as possible that is usage of the interface should follow the user s natural train of thought on solving the problem This objective is usually complicated by the desire to serve users with different levels of expertise Facilitate integration t
5. spx partition t x spx mat get partition spx matrix t xA spx perm t spx mat get perm const spx matrix t xA spx error tspx matvec mult spx value t alpha const spx_matrix_t xA spx vector t xx spx vector t xy spx error tspx matvec kernel spx value t alpha const spx_matrix_t xA spx vector t xx spx value t beta spx vector t xy spx error t spx matvec kernel csr spx matrix t KA spx index t nr rows spx index t nr cols spx index t xrowptr spx index t colind spx value t values spx value t alpha spx vector t xx spx value t beta spx vector t y spx error t spx mat destroy spx matrix t xA spx partition t x spx partition csr spx index t rowptr spx index t nr rows size_t nr threads spx error t spx partition destroy spx partition t xp void spx option set const char option const char string spx vector tx spx vec create size_t size spx partition t xp spx vector t spx vec create from buft spx value t buff spx value t tuned size_t size spx partition t p spx vecmode t mode spx vector t spx vec create random size_t size spx partition t xp void spx vec init spx vector t xv spx value t val void spx vec init part spx vector t v spx value t val spx index t start spx index t end void spx vec init rand range spx vector t xv spx value t max spx value _t min spx error tspx vec set entry spx_vector_t v spx index tidx spx_value _t val
6. and returns a vector object whose values are randomly filled Parameters in size the size of the vector to be created in p a partitioning handle Returns a valid vector object or SPX INVALID VEC in case of an error A 2 27 void spx_vec_init spx vector t v spx value t val Initializes the vector object v with val Parameters in v a valid vector object in val the value to fill the vector with A 2 28 void spx vec init part spx vector t v spx value t val spx index t start spx index t end Initializes the start end range of the vector object v with val Parameters in v a valid vector object 41 in val the value to fill the vector with in start starting index in end ending index A 2 29 void spx vec init rand range spx vector tx v spx value t max spx value t min Initializes the vector object v with random values in the range min max Parameters in v a valid vector object in max maximum value of initializing range in min minimum value of initializing range A 2 30 spx error t spx vec set entry spx vector t v spx index t idx spx value t val Sets the element at index idx of vector v to be equal to val Default indexing is zero based but it can be overidden through the optional flag Parameters in v a valid vector ob
7. const spx vector t v1 const spx vector t v2 Returns the product of the input vectors v1 and v2 Parameters in vi a valid vector object in v2 a valid vector object Returns the product of the input vectors 44 A 2 39 spx value t spx vec mul part const spx vector tx v1 const spx vector t x v2 spx index t start spx index t end Returns the product of the range start end of the input vectors v1 and v2 Parameters in v1 a valid vector object in v2 a valid vector object in start starting index in end ending index Returns the product of the input vectors A 2 40 spx error t spx vec reorder spx vector tx v spx perm t p Reorders the input vector v according to the permutation p leaving the original vec tor intact Parameters in v a valid vector object in p a permutation Returns the permuted input vector or SPX INVALID VEC in case an error occurs Possible error conditions 1 SPX ERR ARG INVALID any of the input arguments is invalid A 2 41 spx error t spx vec inv reorder spx vector t v spx perm t p Inverse reorders the permuted input vector v according to the permutation p Parameters in out v the vector object to be inverse reordered in p a permutation Returns an error code SPX SUCCESS or SPX FAILURE Possible error conditions 1 SPX ERR ARG INVALID any
8. input vectors v1 and v2 and places the result in v3 i e v3 v1 v2 Parameters in v1 a valid vector object in v2 a valid vector object in v3 a valid vector object A 2 35 void spx vec add part spx vector t x v1 spx vector t v2 spx vector t v3 spx index t start spx index t end Adds the range start end of the input vectors v1 and v2 and places the result in v3 Le v3 start end vi start end v2 start end Parameters in vi a valid vector object in v2 a valid vector object 43 in v3 a valid vector object in start starting index in end ending index A 2 36 void spx vec sub spx vector tx v1 spx vector t v2 spx vector t v3 Subtracts the input vector v2 from v1 and places the result in v3 i e v3 v1 v2 Parameters in v1 a valid vector object in v2 a valid vector object in v3 a valid vector object A 2 37 void spx vec sub part spx vector t v1 spx vector t v2 spx vector t v3 spx index t start spx index t end Subtracts the input vector v2 from v1 in the range start end and places the result in v3 i e v3 start end vi start end v2 start end Parameters in v1 a valid vector object in v2 a valid vector object in v3 a valid vector object in start starting index in end ending index A 2 38 spx value t spx vec mul
9. of the input arguments is invalid 45 A 2 42 void spx vec copy const spx vector tx v1 spx vector tx v2 Copies the values of v1 to v2 Parameters in v1 a valid vector object in v2 a valid vector object A 2 43 int spx_vec_compare const spx_vector_t v1 const spx_vector_t v2 Compares the values of v1 and v2 If they are equal it returns 0 else 1 Parameters in v1 a valid vector object in v2 a valid vector object A 2 44 void spx vec print const spx_vector_t v Prints the input vector v Parameters in v a valid vector object A 2 45 spx error t spx_vec_destroy spx vector t v Destroys the input vector v Parameters in v a valid vector object Returns an error code SPX SUCCESS or SPX FAILURE A 2 46 void spx timer clear spx timer tx t Initialises a timer object Parameters in t timer object to be initialized A 2 47 void spx timer start spx timer txt Starts a timer object 46 Parameters in t timer object to be launched A 2 48 void spx timer pause spx_ timer txt Pauses a timer object Parameters in t timer object to be paused A 2 49 double spx timer get secs spx timer txt Returns the elapsed time in seconds Parameters in t a timer object Returns the elapsed seconds A 2 50 void spx log disable_all Disables logging in SparseX A 2 51 v
10. or SPX INVALID MAT in case an error occurs Possible error conditions 1 SPX ERR FILE invalid filename argument or file read error 2 SPX ERR TUNED MAT reproducing the matrix failed A 2 11 spx index t spx mat get nrows const spx matrix tx Returns the number of rows of the matrix 34 Parameters in A the tuned matrix handle Returns the number of rows Possible error conditions 1 SPX ERR ARG INVALID the matrix handle is invalid A 2 12 spx index t spx mat get ncols const spx matrix tx A Returns the number of columns of the matrix Parameters in A the tuned matrix handle Returns the number of columns Possible error conditions 1 SPX ERR ARG INVALID the matrix handle is invalid A 2 13 spx index t spx mat get_nnz const spx matrix tx A Returns the number of nonzero elements of the matrix Parameters in A the tuned matrix handle Returns the number of nonzeros Possible error conditions 1 SPX ERR ARG INVALID the matrix handle is invalid A 2 14 spx partition t spx mat get partition spx matrix tx A Returns a partitioning object for the given matrix 35 Parameters in A the tuned matrix handle Returns a valid partitioning object or SPX INVALID PART in case an error occurs A 2 15 spx perm tx spx mat get perm const spx matrix t A Returns the permutation computed for the supplied matrix by applying the Reverse Cuthill Mc
11. rowptr nrows nr parts Create vectors from arrays spx vector t x view spx vec create from buff x NULL ncols parts SPX VEC AS IS spx vec create from buff y NULL ncols parts SPX VEC AS IS spx vector t y view Declare a tuned matrix handle spx matrix t A SPX INVALID MAT Run 128 iterations of the SpMV kernel y lt alpha A x beta y for i 0 i lt 128 i I spx matvec kernel csr amp A nrows ncols rowptr colind values alpha x view beta y view Cleanup interface objects spx_mat_destroy A spx vec destroy x view spx vec destroy y view spx partition destroy parts Shutdown library spx finalize 23 4 EXAMPLES 4 3 Example 3 This example illustrates the use of reorderng with SparseX The matrix in this example is loaded from a file in the MMF format Multithreaded execution is selected with 2 threads and a cpu affinity of 0 1 The symmetric variant of CSX is selected and tun ing is performed explicitly with sampling enabled but this time using sampling win dows of a fixed size and detecting only delta units Reordering is enabled through the SPX MAT REORDER option of the spx mat tune routine We must observe that the x and y vectors must be explicitly reordered before executing the kernel and inverse reordered after the execution 1 Initialize library spx init 4 Load matrix from file 5 spx input t in
12. tuned matrix or SPX INVALID MAT in case an error occurs 1 SPX ERR ARG INVALID the input matrix handle is invalid 2 SPX ERR TUNED MAT conversion to the CSX format failed A 2 7 spx error t spx mat get entry const spx matrix t A spx index t row spx index t column spx value tx value Returns the value of the corresponding nonzero element in row column where row and column can be either zero or one based indexes Default indexing is zero based but it can be overidden through the optional flag If the element exists its value is returned in value Parameters in A the tuned matrix handle in row the a row of the element to be retrieved in column the column of the element to be retrieved out value the value of the element in row column in argument that specifies the indexing either SPX IND EX ZERO BASED or SPX INDEX ONE BASED Returns an error code SPX SUCCESS or SPX FAILURE Possible error conditions 1 SPX ERR ARG INVALID any of the input arguments is invalid 2 SPX OUT OF BOUNDS the element is out of range 3 SPX ERR ENTRY NOT FOUND the element doesn t exist i e is not nonzero A 2 8 spx error t spx mat set entry spx matrix t x A spx index t row spx index t column spx value t value Sets the value of the corresponding element in row column where row and column can be either zero or one based indexes Default indexing is
13. zero based but it can be overidden through the optional flag Parameters in A the tuned matrix handle in row the row of the element to be set in column the column of the element to be set in value the new value of the element in row column in argument that specifies the indexing either SPX IND EX ZERO BASED or SPX INDEX ONE BASED Returns an error code SPX SUCCESS or SPX FAILURE Possible error conditions 33 1 SPX ERR ARG INVALID any of the input arguments is invalid 2 SPX OUT OF BOUNDS the element is out of range 3 SPX ERR ENTRY NOT FOUND the element doesn t exist i e is not nonzero A 2 9 spx error t spx mat save const spx matrix t A const char x filename Stores the matrix in the CSX format into a binary file Parameters in A the tuned matrix handle in filename the name of the binary file where the matrix will be dumped Returns an error code SPX_SUCCESS or SPX_FAILURE Possible error conditions 1 SPX ERR ARG INVALID the matrix handle is invalid Possible warnings 1 SPX_WARN_CSXFILE a filename hasn t been supplied so the default csx_file bin will be used A 2 10 spx matrix tx spx mat restore const char filename Reconstructs the matrix in the CSX format from a binary file Parameters in filename the name of the file where the matrix is stored Returns a handle to the tuned matrix
14. Kee algorithm Parameters in A the tuned matrix handle Returns a handle to the permutation object or SPX INVALID PERM in case an error occurs Possible error conditions 1 SPX ERR ARG INVALID the matrix handle is invalid or the matrix has not been reordered A 2 16 spx error t spx matvec mult spx value t alpha const spx matrix tx A spx vector t x spx vector t y Performs a matrix vector multiplication of the following form y alpha A z A 1 where alpha is a scalar xand y are vectors and is a sparse matrix in the CSX format Parameters in A the tuned matrix handle in alpha a scalar in x the input vector in out y the output vector Returns an error code SPX SUCCESS or SPX FAILURE Possible error conditions 1 SPX ERR ARG INVALID any of the input arguments is invalid 2 SPX ERR DIM matrix and vector dimensions are not compatible 36 A 2 17 spx error t spx matvec mult spx value t alpha const spx matrix t A spx vector t x spx vector t y Performs a matrix vector multiplication of the following form y alpha A zx A2 where alpha is a scalar xand y are vectors and A is a sparse matrix in the CSX format Parameters in A the tuned matrix handle in alpha a scalar in x the input vector in out y the output vector Returns an error code Possible error conditions 1 SPX ERR ARG INVALID any of the input a
15. SparseX User s Guide by Athena Elafrou June 2014 Contents Contents 1 A B Getting started LL About SparseX ss ee eas poy ow yale we lah eee tea we eae 1 2 Goals and motivation weba oa ti SG ea Installation 2 1 Installation requirements 4 5 a 2 int ara a kee de hat dis 2 2 How to install SparseX a Bes Ae AA se 23 Linking against SparseX 60 be wk REM AEE SA Vad Interface Dil eee apen ag Aion ots Sr Bok Ser Je eger 3 1 1 Naming convention pw EA 3 1 2 Basie scalar types and objects 5 1 5 ds Ga a 32 Auxiliary TOUS 64 4 Aa 3 2 1 Loading input abc A o 322 Tuning to the CSX format yr ds 3 23 Changing matrix nonzero values 3 2 4 Storing and retrieving CSX a 3 25 Creating and modifying vector objects 326 Reordering 5 4 iaoa ei i eee onde Ee eR MDA re 3 3 Computational routines vg nors akser a 34 Timing ro tin s 0 set Edo 4 20 ue Ever wig we kl 4 50 Logging TOURN ES A a kaoset Ask AE Ge AE 4 3 6 Error handling in SparseX saa ao x Sok a TAKET Examples AL sExample T eus EA ad ed ia AAA 42 EXMplel auge a ear ARA NE TA A id 45 Examples scote da bh e a eo eee ee dre de AA Example 4 ees DA oh AP p a eaa NE TARN ae S Bathe As C bindings reference Acknowledgements REN SI anA 21 21 22 24 25 27 51 CONTENTS Bibliography ii 53 Getting started 1 1 About SparseX The SparseX library is a collection of low level pr
16. akasis G Goumas and N Koziris CSX An extended compression format for SpMV on shared memory systems In 16th ACM SIGPLAN Annual Sympo sium on Principles and Practice of Parallel Programming PPoPP 11 San Antonio TX USA 2011 ACM R Vuduc J W Demmel and K A Yelick OSKI A library of automatically tuned sparse matrix kernels volume 16 2005 53
17. ally in case of NUMA architectures where the performance of the SpMV kernel is sensitive to the correct placement of the involved data on the system s memory nodes information is given on the success or failure of the corresponding memory allocations Table 3 9 gives an overview of the available routines for configuring the logging process 3 6 Error handling in SparseX In general the SparseX interface distinguishes three types of errors a fatal errors generated by the operating system b logical errors i e created by the user that are fatal to the program execution and c logical errors that do not lead to program failure The first category may include memory and file related errors The second category may include errors due to invalid arguments supplied to one of the interface s routines Finally non fatal logical errors may occur for example when trying to set an option to an invalid value The interface handles errors with the use of error handling routines and error codes or invalid handle returns When an error condition is detected within a SparseX rou tine it is treated as following The routine calls the current error handler e Regardless of what action the error handler performs the routine returns an error code or an invalid handle depending on the routine The default error handler uses the logging framework described in the previous section to output messages of the following form prefix message s
18. be explicitly initialized Parameters in size the size of the vector to be created in p apartitioning handle Returns a valid vector object or SPX INVALID VEC in case of an error A 2 25 spx vector t spx vec create from buff spx value t buff spx value t x tuned size_t size spx partition t x p spx vecmode t mode Creates and returns a valid vector object whose values are mapped to a user defined array If SPX VEC AS IS is set then the input buffer will be shared with the library and modifications will directly apply to it In this case pointers buff and tuned point to the same memory location If SPX VEC TUNE is selected the buffer provided by the user might be copied into an optimally allocated buffer depending on the platform and tuned might point to this buffer Thus the user must always check whether buff equals tuned If the buffer is actually tuned then it should be used instead of the original The common free function applies to this buffer and will have to be explicitly called by the user 40 Parameters in buff the user supplied buffer in tuned the tuned buffer in size the size of the buffer in p a partitioning handle in mode the vector mode either SPX VEC AS IS or SPX_ VEC TUNE Returns a valid vector object or SPX INVALID VEC in case of an error A 2 26 spx vector t spx vec create random size_t size spx partition t p Creates
19. brary and enables information hiding Once created a matrix or vector is referenced only by its handle The available handles for manipulating matrices and vectors are defined by the following data types e spx input t which represents a handle to the input matrix e spx matrix t which represents a handle to the matrix in the CSX format e spx vector t which represents a handle to a dense vector object We must note here that since our API aims at exporting utilities for the CSX format alone matrices stored in different formats are only treated as input matrix represen tations hence the distinction between the spx input t and spx matrix t handles Complementary handle types include the spx partition t type which describes a partitioning object that is required for multithreaded execution the spx perm t for defining permutations in case the user wants to apply reordering to improve the sparsity structure of the matrix and the spx timer t which descibes a timer object These types will be described in more detail in the following sections 3 2 Auxiliary routines 3 2 1 Loading input matrices The user creates an input matrix object of type spx input t from a valid sparse matrix in one the following formats Matrix Market File MMF format In dealing with issues of I O SparseX is cur rently designed to support reading a sparse matrix from a file in the MMF format Boisvert et al 1996 File input is embedded as another form of a
20. e modified later 3 2 4 Storing and retrieving CSX Although significantly optimized the preprocessing cost of CSX is still non negligible This motivated us to implement an I O feature that will allow the user to avoid tuning a matrix that has already been converted to the CSX format in a previous session This feature involves serializing the tuned matrix handle and storing it in a binary file so it can be used in a future session to reconstruct an equivalent handle and directly perform the SpMV kernel This achieves a considerable speedup over the optimized preprocessing phase of CSX Our user API provides the spx_mat_save routine for storing the matrix and the spx_mat_restore routine for loading it from the binary file 3 2 5 Creating and modifying vector objects Dense vector objects of type spx_vector_t can be either wrappers of user defined arrays or they can be created and initialized explicitly by the user In a multithreaded scenario a vector might need to be partitioned among threads This is done trans parently during the creation process and thus cannot be controlled by the user The 14 3 2 Auxiliary routines only responsibility of the user is to provide the split points to the vector creation rou tine through an object of type spx partition t This object can be constructed ei ther implicitly during the tuning phase of CSX and subsequently extracted with the spx mat get partition routine or explicitly throught
21. ecting different data type precisions The SparseX library is defined in terms of two basic types spx index t for indexing data and spx value t for floating point values These are bound by default to int and double respectively but may be overidden at library build time through the index type lt TYPE gt and value type lt TYPE gt options of configure Benchmarking The SparseX distribution also includes a benchmarking framework in order to evaluate the SPMV performance This framework also supports other popular libraries that implement the SpMV kernel including Intel s Math Kernel Library Intel MKL Intel Coorporation 2013 a well established commercial product and the parallel Optimized Sparse Kernel Interface pOSKI library Vuduc et al 2005 Ankit 2008 which has been developed by the Berkeley Benchmark ing and Optimization BeBOP group http bebop cs berkeley edu You can enable this feature through the enable bench option of configure If one of the above libraries is installed on your system you can use the with mkl lt DIR gt or with poski lt DIR gt options respectively to enable them After you have successfully compiled the benchmarking framework the final executable named bench_spmv will be placed inside the src bench directory of the build directory The bench_spmv executable may be invoked as follows ENV lt value gt bench spmv f lt matrix fi
22. elements when loading the matrix Thus the format has been extended in order to also sup port row wise ordering of the elements and zero base indexing by introducing two optional fields in the header line called indexing which can be either 0 base or 1 base and ordering which can be either column or row Furthermore a sim plified version of the MMF format is supported which drops the header line and includes only the size line and data lines In this case however the data must be sorted both row and column wise Compressed Sparse Row CSR format An input handle can also be created from an existing user allocated pre assembled matrix in the CSR format The CSR data structures rowptr colind and values are shared in this case with the li brary thus the user must guarantee they will not be freed or reallocated before the destruction of the input handle Both zero based and one based indexing is supported by setting the appropriate argument in the corresponding routine Table 3 1 summarizes the available routines for loading input matrices 3 2 2 Tuning to the CSX format The user may convert the input matrix into the CSX format simply by calling the spx mat tune routine Even though the parameters of the preprocessing phase 11 3 INTERFACE have been expertly tuned the user can also experiment with the capabilities of CSX through the spx set option routine There are options for controlling a the runtime enviro
23. elta units can be detected Thus when the number of participating threads is large the user is advised to set a smaller number of samples especially in case of small matrices The user might of course achieve a similar effect by increasing the sampling portion We plan to address this issue by further refining the heuristic employed for computing the window size From the last group of options the most important for the user is the spx matrix symmetric option which enables the use of the symmetric variant of the CSX format The rest of the options are configured by default according to a best practice policy which also takes into account the underlying architecture 3 2 3 Changing matrix nonzero values The nonzero pattern of the input matrix fixes the nonzero pattern ofthe spx matrix t handle i e the matrix in the CSX format This means that insertion and deletion of elements is not supported once the matrix has been stored in the CSX format Modify ing existing nonzero values however is allowed through the spx mat set entry routine The spx mat get entry routine is also available for retrieving obtaining a single value Both routines assume zero base indexing in the supplied coordinates of the matrix element by default but one based indexing is also supported by providing the optional argument SPX INDEXING ONE BASED If the input matrix contains explicit zeros the library treats them as logical nonze ros whose values can b
24. et follows the look and feel of the BLAS interface in cluding e sparse matrix by vector multiplication e vector operations scale add subtract multiply The following sections proceed with a thorough presentation of the available rou tines both auxiliary and computational and describe essential ingredients of the user interface including the major objects that are intergal parts of it Some source code snippets are provided to give a more practical view of how the API may be used More elaborate examples are provided in th following chapter For complete C bindings of the available routines see Appendix A 3 1 1 Naming convention The naming convention for the public interface routines of the SparseX library has the following form return value type spx_object_function Every routine starts with the spx prefix which stands for SparseX followed by the name of the object it is associated with usually in a condensed form and a name that describes its functionality 3 INTERFACE 3 1 2 Basic scalar types and objects The SparseX library is defined in terms of two basic types spx index t for indexing data and spx value t for floating point values These are bound by default to int and double respectively but may be overidden at library build time as described in Chapter 2 Matrices and vectors are represented by handles in our interface The use of handles complements the object oriented approach of the core li
25. he CSX format and every subsequent call will use the previously tuned matrix handle Parameters in A either a pointer to an invalid matrix handle or a tuned matrix handle If A is equal to SPX INVALID MAT then the matrix in the CSR format is first converted to Co SX Otherwise the valid previously tuned matrix handle is used to perform the multiplication in nr_rows number of rows of the matrix A in nr_cols number of columns of the matrix A in rowptr array rowptr of the CSR format in colind array colind of the CSR format in values array values of the CSR format in alpha a scalar in x the input vector in beta a scalar in out y the output vector Returns an error code Possible error conditions 1 SPX_ERR_ARG_INVALID any of the input arguments is invalid 2 SPX ERR INPUT MAT the input data arrays do not correspond to a valid CSR representation 3 SPX_ERR_DIM matrix and vector dimensions are not compatible A 2 20 spx error t spx mat destroy spx matrix tx A Releases any memory internally used by the tuned matrix handle A 38 Parameters in A the tuned matrix handle Returns an error code SPX SUCCESS or SPX FAILURE Possible error conditions 1 SPX ERR ARG INVALID the matrix handle is invalid 2 SPX ERR MEM FREE deallocation failed A 2 21 spx partition t spx partition csr spx index t x rowptr spx index t nr rows size
26. he spx partition csr routine when loading a matrix from the CSR format In a typical scenario the user loads the input matrix converts it to the CSX format in this phase the split points are auto matically computed and then creates the necessary vector objects as in the following sequence spx input t input spx input load mmff file spx matrix t A spx mat tune input spx_partition_t parts spx mat get partition A spx vector t x spx vec create size parts When the vector is created as a wrapper of a user defined array with spx vec cre ate from buff andthe SPX VEC TUNE option has been set the library may select to optimize data allocation and return a tuned array for further use Example 4 4 in Chapter 4 demonstrates use of tuned vectors Table 3 3 summarizes the available routines for creating and modifying vector ob jects while table 3 4 shows the routines involved in partitioning 3 2 6 Reordering When the matrix structure is very irregular resulting in an equally irregular access pat tern in the right hand side vector a significant amount of cache misses is introduced Additionally when the matrix suffers from an heterogeneous sparsity pattern a varying flop byte ratio is exhibited among the participating threads due to fluctuations in the nonzero density across the matrix leading to significant load imbalances Reordering the matrix using a bandwidth reduction technique has been proposed as a solution t
27. he user must not forget to cleanup all objects associated to our interface and shutdown the library before exiting It is also important to note that vector objects must be created after the tuning has taken place after retrieving the partitioning handle through the spx mat get parts routine 1 Initialize library 2 spx_init 4 Load matrix from MMF file 5 spx input t input spx input load mmf argv 1 7 Transform to CSX 8 spx matrix t A spx mat tune input 10 Create random x and y vectors 11 spx partition t parts spx mat get partition A 12 spx vector t xx spx vec create random spx mat get ncols A 21 4 EXAMPLES 13 parts 14 spx vector t y spx vec create random spx mat get nrows A 15 parts 17 Run 128 loops of the SpMV kernel 18 spx timer t t 19 double elapsed time 20 int i 22 spx timer clear amp t 23 spx timer start amp t 24 for i 0 i lt 128 i I 25 spx matvec kernel alpha A x beta y 26 27 spx timer pause amp t 28 elapsed time spx timer get secs amp t 29 printf Elapsed time Alf secs n elapsed time 31 Cleanup 32 spx input destroy input 33 spx_mat_destroy A 34 spx partition destroy parts 35 spx vec destroy x 36 spx vec destroy y 38 Shutdown library 39 spx finalize 4 2 Example 2 This example illustrates use of the higher level multiplication routine spx matvec ke rne
28. imitives written in the C C pro gramming languages that provides the means to developers of solver libraries and of scientific and engineering applications to easily attain high performance of the Sparse Matrix by Vector multiplication kernel SpMV on modern multicore architectures 1 2 Goals and motivation Most sparse iterative solver libraries for linear systems only support standard sparse matrix storage formats such as the CSR and COO formats thus exhibiting poor perfor mance in many cases However some recent projects that focus on high performance computing HPC including the open source Portable Extensible Toolkit for Scientific computation PETSc Balay et al 2013 and Intel s commercial Math Kernel Library MKL Intel Coorporation 2013 have expanded their suite of sparse matrix rep resentations with more elaborate formats such as the BCSR format which are much more efficient for particular types of problems The Compressed Sparse eXtended CSX format along with its symmetric variant are currently among the most highly optimized sparse matrix storage formats for per forming matrix by vector multiplications on multicore architectures especially in the context of iterative methods for the solution of large scale linear systems Kourtis et al 2011 Karakasis et al 2013 Its unique performance stability across a wide variety of problems makes CSX an excellent candidate for HPC applications and thus we believe CSX
29. is routine is equivalent to the BLAS Level 2 routine for matrix by vector multiplica tion and follows the BLAS convention in parameter ordering The last routine in Table 3 6 forms a higher level implementation of the kernel that hides the preprocessing phase of CSX by accepting a matrix in the CSR format as input This last routine can be efficiently used in a loop since only the first call will convert the matrix into the CSX format and every subsequent call will use the previously tuned matrix handle Refer to Example 4 2 of the following chapter for efficient use of this routine A wide variety of vector operations is also provided including addition subtraction multiplication and scaling We must note here that every routine in Table 3 7 is also available in a version that operates on a specific range of the input vectors These variants could be useful in a multithreaded scenario For a complete reference of the available vector operations see Appendix A 3 4 Timing routines Our interface also provides basic utilities for timing your code A timer in SparseX is represented by the spx timer t handle type and can be manipulated with one of the routines in Table 3 8 17 3 INTERFACE Routine spx matvec mult spx matvec kernel Description Performs the SpMV kernel y alpha A x with a tuned matrix handle as input Performs the kernel y alpha A x beta y with a tuned matrix handle as input spx_matvec_kernel_c
30. ject in idx an index inside the vector in val the value to be set in argument that specifies the indexing either SPX IND EX ZERO BASED or SPX INDEX ONE BASED A 2 31 void spx vec scale spx vector t v1 spx vector t v2 spx value t num Scales the input vector v1 by a constant value num and places the result in vector v2 Le v2 num v1 Parameters in v1 a valid vector object in v2 a valid vector object in num the constant by which to scale v1 A 2 32 void spx vec scale add spx_vector_t v1 spx_vector_t v2 spx_vector_t v3 spx_value_t num Scales the input vector v2 by a constant value num adds the result to vector v1 and places the result in vector v3 i e v3 v1 num v2 42 Parameters in v1 a valid vector object in v2 a valid vector object in v3 a valid vector object in num the scalar by which to scale v1 A 2 33 void spx vec scale add part spx_vector_t vi spx_vector_t v2 spx_vector_t v3 spx_value_t num spx_index_t Start spx_index_t end v3 start end vi start end num x v2 start end Parameters in v1 a valid vector object in v2 a valid vector object in v3 a valid vector object in num the scalar by which to scale v1 in start starting index in end ending index A 2 34 void spx vec add spx vector t v1 spx_vector_t v2 spx vector t v3 Adds the
31. l csr In this case our input will be in the CSR format and the conversion to the CSX format will be hidden in the multiplication routine First we explicitly parti tion the matrix in the CSR format through the spx partition csr routine and subsequently the x and y vector views are created from the user defined arrays The SPX VEC AS IS option indicates that no tuning will be performed to the input array so the second argument of the routine can be NULL Finally a loop of 128 iterations of the SpMV kernel is executed and the library objects are destroyed 1 Define CSR data structures 2 spx index t rowptr 0 5 6 10 15 18 22 24 29 33 38 3 spx index t colind 0 1 2 3 8 7 0 1 6 9 0 1 3 5 9 0 1 4 9 0 1 5 9 2 3 2 3 4 5 7 2 3 4 5 2 5 3 4 5 9 6 spx value t values 1 2 3 4 5 6 7 8 9 10 11 12 13 14 22 4 2 Example 2 11 12 13 14 16 17 19 20 22 23 25 26 27 29 30 31 32 33 34 35 37 38 40 41 42 43 44 45 47 48 49 50 51 53 54 15 16 17 18 19 20 21 22 23 24 25 26 26 1 26 2 27 28 29 29 1 29 2 30 31 31 1 31 2 32 Define vector arrays spx value t x 1 2 3 4 5 6 7 8 9 10 spx value t yl 10 19 0 28 0 31 0 42 1 32 2 64 0 75 2 36 0 91 1 spx index t nrows ncols spx value t alpha beta nrows 10 ncols 10 alpha 0 42 beta 0 10 Initialize library spx_init Partition matrix spx partition t parts spx partition csr
32. le gt 1 lt library name gt where lt library name gt can be one of SparseX MKL pOSKI Execution is controlled by a set of environment variables namely OUTER LO OPS LOOPS and NUM THREADS OUTER LOOPS defines the number of repe titions LOOPS the number of SpMV iterations and NUM THREADS the number of threads that will be used For the SparseX and MKL libraries you can also use CPU AFFINITY and GOMP CPU AFFINITY respectively to set the thread affinity The pOSKI library does not support explicit configuration of thread affinity 2 3 Linking against SparseX If you ever happen to want to link against the library you must either use libtool and specify the full pathname of the library or use the LLIBDIR flag during linking and do at least one of the following e add LIBDIR to the LD_LIBRARY_PATH environment variable during execution e add LIBDIR to the LD RUN PATH environment variable during linking e use the W1 rpath WLLIBDIR linker flag 2 INSTALLATION e have your system administrator add LIBDIR to etc ld so conf Interface 3 1 Overview The API primitives of the SparseX library fall into two broad categories the auxiliary routines and the computational routines The auxiliary routine set includes e sparse matrix and vector creation and update sparse matrix tuning into the CSX format e sparse matrix and vector reordering e CSX update CSX I O The computational routine s
33. nce especially on NUMA platforms In this case the user must supply a pointer to the tuned data as the second argument of the spx vec create from buff routine see x tuned andy tuned at line 11 If the data is ac tually tuned by our library the pointers will point to different memory regions How ever since the data is selectively tuned the user must always check the pointers for equality If the pointers differ then the tuned buffer should be used instead of the orig inal from this point on The common free function applies to this buffer and will have to be explicitly called by the user 10 11 12 13 14 15 16 17 18 20 21 22 Initialize library spx_init Load matrix from file spx input t input spx input load mmf argv 1 Transform to CSX spx matrix t A spx mat tune input Create x and y vector views from the corresponding buffers spx value t x tuned y tuned spx partition t parts spx mat get partition A spx vector t x view spx vec create from buff x amp x tuned ncols parts SPX VEC TUNE spx vector t y view spx vec create from buff y amp y tuned nrows parts SPX VEC TUNE Run 128 loops of the SpMV kernel spx value t alpha 0 8 beta 0 42 const size t nr loops 128 25 4 EXAMPLES 24 for i 0 i lt nr loops i 4 25 spx matvec kernel alpha A x view beta y view 26 gt 28 From this point on the user can u
34. nment b the preprocessing phase of CSX and c the CSX format it self The available options with their default values for every category are given in Table 3 2 The options for setting the runtime environment include the number of participat ing threads spx rt nr threads and the processor affinity spx rt cpu affinity If these options are not explicitly set by the user the library automatically detects the optimal configuration i e the number of threads is set to the number of available cores and the CPU affinity is adjusted according to a share nothing core filling pol icy which assigns threads to cores so that the least resource sharing is achieved The user can also control different aspects of the preprocessing phase For one she may select specific substructure types to be encoded through the spx preproc xform option The h v d ad br bc values correspond to horizontal vertical diagonal anti diagonal row aligned block and column aligned block substructures respectively In the case of none only delta units will be encoded She can also select the heuristic that will determine the scale of compression ratio will focus solely on maximizing the compression ratio while cost will also try to minimize the computational cost of the kernel The first heuristic is the default option in SMP architectures where lower memory bandwidths require a minimal memory footprint while the second heuristic is the default opti
35. o the aforementioned problems This involves applying row and column permutations in order to bring the nonzero elements of the matrix as close as possible to the main diag onal This is beneficial to a typical SpMV implementation since the homogenization of the nonzero element distribution leads to a better access pattern and load balance For the symmetric version of the kernel using SparseX with the spx matrix symmetric option enabled the obvious effect of this nonzero rearrangement in a mutlithreaded execution is the minimization of the reduction phase overhead Our user API currently provides an option for applying the Reverse Cuthill McKee RCM reordering algorithm that has been proposed for structurally symmetric matri ces Cuthill and McKee 1969 Reordering is performed by providing the optional argu ment SPX MAT REORDER to the spx mat tune routine The spx mat get perm and spx vec reorder routines allow the user to extract and apply the correspond 15 3 INTERFACE Routine Description spx vec create Creates a vector object spx vec create from buff Creates a vector object as a wrap per of a user defined array The in put array can be either left unmod ified or tuned SPX VEC AS IS SPX VEC TUNED spx vec create random Creates a randomly filled vector object spx vec init Initializes the vector object with a fixed value spx vec init rand range Initializes the vector object with randomly generated value
36. o large scale sparse solver libraries Even though the library can be used directly in applications that involve sparse matrix by vector multi plications its ultimate goal is to integrate readily into application level libraries that provide high level sparse kernel support including iterative solution meth ods of sparse linear systems such as the aforementioned PETSc library Inte gration in such systems has a number of advantages including the ability to hide data structure details from the user and of course the large potential user base that will assist in the better dissemination of the CSX format Transparently adjust to the target platform The SparseX library currently supports symmetric shared memory SMP and non uniform memory access NUMA mul ticore architectures Any architecture specific optimization is performed trans parently during the installation process of the library eliminating any need for the user to provide information on the hardware platform Allow for user inspection and control of the tuning process Tuning refers to the preprocessing phase of the CSX format i e converting the original sparse matrix into the CSX format A number of parameters can be explicitly set by the user in order to control different aspects of this phase such as the number of threads used defining specific substructure types to search for in the matrix selecting between CSX or CSX sym its symmetric variant as the target format and s
37. o on This adds a lot of flexibility to the tuning process and also affects performance in a direct manner For example if the user is aware of the sparsity structure of the matrix e g consisting mainly of blocks she can guide the detection process by selecting the block substructure types reducing the execution time of this phase Of course a poor selection may result in a significant performance degrada tion thus the user is advised to rely on the autotuning capabilities of CSX and 1 2 Goals and motivation only override an option when prior knowledge is available as in the example described previously For a detailed discussion on the basic elements of the library design and some gen eral implementation details refer to Elafrou 2013 Installation 2 1 Installation requirements In order to install and run SparseX your system must meet the following requirements e A fairly recent Linux OS e gcc g 4 6 gcc g 4 7 gcc g 4 8 e LLVM 3 0 and Clang e Boost Library gt 1 48 regex serialization system thread e numactl library gt 2 0 7 Currently SparseX has been tested on the following environments e Architectures Intel Xeon X7460 Intel Xeon X5560 Intel Xeon X5650 Intel Xeon E5 4620 C C Compilers gcc g 4 6 2 gcc g 4 7 2 gec g 4 8 2 e OS Linux kernel 3 7 10 2 2 How to install SparseX Step 1 Download and extract In order to use SparseX you must either explicitly download and e
38. oid spx log error console Activates logging of the Error level on stderr A 2 52 void spx log warning console Activates logging of the Warning level on stderr A 2 53 void spx log info console Activates logging of the Info level on stderr A 2 54 void spx log verbose console Activates logging of the Verbose level on stderr A 2 55 void spx log debug console Activates logging of the Debug level on stderr 47 A 2 56 void spx log error file Activates logging of the Error level on a file The file name should be previously provided through the spx log set file routine If not a default sparsex log file will be created A 2 57 void spx log warning file Activates logging of the Warning level on a file The file name must be previously provided through the spx log set file routine If not a default sparsex log file will be created A 2 58 void spx log info file Activates logging of the Info level on a file The file name must be previously pro vided through the spx log set file routine If not a default sparsex log file will be created A 2 59 void spx log verbose file Activates logging of the Verbose level on a file The file name must be previously provided through the spx log set file routine If not a default sparsex log file will be created A 2 60 void spx log debug file Activates logging of the Debug level on a file The file name must be previously
39. on for NUMA architectures where increased memory bandwidths might expose computationally intensive loads Furthermore the user can enable the use of statistical sampling in the detection process spx preproc sampling The preprocessing cost of CSX can be significantly halved by enabling the use of sam pling There are two sampling methods available portion and window In the first case the user can provide the portion of the matrix that she wishes to be sam pled spx preproc sampling portion and the number of sampling windows used per thread spx preproc sampling nr samples and the autotuning capabilities of CSX will automatically adjust the window size according to the following equation sampling portion nr_nZerOSper thread window size nr samples _ sampling portion nr_nzeroS otal 3 1 nr samples nr threads In the second case the user will have to explicitly set the window size to a mean ingful number of nonzero elements spx preproc sampling window size In both cases if the number of samples is not explicitly set the default number will be used Equation 3 1 designates that the window size has an inverse relation to the number of threads for fixed values of the sampling portion and the number of samples As the number of threads increases it is possible that the sampling window will become too small and restrict the detection capabilities of CSX For example ifthe window contains 12 3 2 Auxiliar
40. ourcefile lineno function where prefix can be either ERROR or WARNING depending on the error type When an er ror of the first category occurs the default error handler also outputs the error message generated by the operating system and subsequently exits the program with a nonzero exit code In case of logical errors on the other hand returning error codes instead of exiting seemed more appropriate since most routines make requests on available resources and their failure needs to be recoverable 19 3 INTERFACE Routine spx log all console spx log all file spx log set file spx log disable all spx log error console spx log error file spx log warning console spx log warning file spx log info console spx log info file spx log debug console spx log debug file Description Activates all logging levels and redirects output to stderr Activates all logging levels and redirects output to a logfile Sets a logging file Disables logging of all levels Activates logging of the Error level on stderr Activates logging of the Error level on a file Activates logging of the Warning level on stderr Activates logging of the Warning level on a file Activates logging of the Info level on stderr Activates logging of the Info level on a file Activates logging of the Debug level on stderr Activates logging of the Debug level on a file Table 3 9 Available routines to control logging If
41. provided through the spx log set file routine If not a default sparsex log file will be created A 2 61 void spx log all console Activates all logging levels and redirects output to stderr A 2 62 void spx log all file const char file Activates all logging levels and redirects output to a file The file name must be previ ously provided through the spx log set file routine If not a default sparsex log file will be created If the file already exists it will be overwritten Parameters 48 in file a filename A 2 63 void spx log set file const char file Sets the file that will be used when logging is redirected to a file If the file already exists it will be overwritten Parameters in file a filename A 2 64 spx errhandler t spx err get handler Returns a pointer to the current error handler either default or user defined Returns the current error handling routine A 2 65 void spx err set handler spx errhandler t handler This function allows the user to change the default error handling policy with a new one which must conform to the signature provided above Parameters in handler user defined routine 49 Acknowledgements The SparseX library was developed in the Computing Systems Laboratory of the Na tional Technical University of Athens NTUA and is actively maintained by e Vasileios Karakasis lt bkk cslab ece ntua gr gt e Athena Elafrou
42. put spx input load mmf file 7 Set tuning options 8 spx option set spx rt nr threads 2 9 spx option set spx rt cpu affinity 0 1 10 spx option set spx matrix symmetric true 11 spx option set spx preproc xform none 12 spx option set spx preproc sampling window 13 spx option set spx preproc sampling window size 600 15 Transform to CSX 16 spx matrix t A spx mat tune input SPX MAT REORDER 18 Create randomly filled vectors 19 spx index t ncols spx mat get ncols A 20 spx index t nrows spx mat get nrows A 21 spx partition t parts spx mat get partition A 22 spx vector t x spx vec create random ncols parts 23 spx vector t y spx vec create random nrows parts 25 Reorder vectors 26 spx perm t p spx mat get perm A 27 spx vec reorder x p 28 spx vec reorder y p 30 Run the SpMV kernel 31 y lt alpha A x beta y 32 spx matvec kernel alpha A x beta y 34 Inverse reorder the output vector 35 spx vec inv reorder y p 24 4 4 Example 4 37 38 39 40 41 42 44 45 Cleanup interface objects spx input destroy input spx_mat_destroy A spx_vec_destroy x spx_vec_destroy y spx partition destroy parts Shutdown library spx finalize 4 4 Example 4 Our last example illustrates use of the vector tuning option of our library which can be used to extract higher performa
43. rguments is invalid 2 SPX_ERR_DIM matrix and vector dimensions are not compatible A 2 18 spx_error_t spx matvec kernel spx value t alpha const spx matrix tx A spx_vector_t x spx value t beta spx vector t y Performs a matrix vector multiplication of the following form y alpha A x beta y A 3 where alpha and beta are scalars x and y are vectors and is a sparse matrix in the CSX format Parameters in A the tuned matrix handle in alpha a scalar in x the input vector in beta a scalar in out y the output vector Returns an error code Possible error conditions 1 SPX ERR ARG INVALID any of the input arguments is invalid 2 SPX ERR DIM matrix and vector dimensions are not compatible 37 A 2 19 spx error t spx matvec kernel csr spx matrix t xx A spx index t nr rows spx index t nr cols spx index t x rowptr spx index t colind spx value t values spx value t alpha spx vector t x spx value t beta spx vector t y Performs a matrix vector multiplication of the following form y alpha x beta y A 4 where alpha and beta are scalars x and y are vectors and A is a sparse matrix The matrix is originally given in the CSR format and converted internally into the CSX format This higher level routine hides the preprocessing phase of CSX It can be efficiently used in a loop since only the first call will convert the matrix into t
44. rrays do not correspond to a valid CSR representation A 2 4 spx input t spx input load mmf const char x filename Creates and returns a valid tunable matrix object from a file in the Matrix Market File Format MMF Parameters 31 in filename Returns Possible error conditions name of the file where the matrix is stored a handle to the input matrix or SPX INVALID INPUT in case an error occurs 1 SPX ERR FILE the file does not exist or an error occured while trying to read it possibly not in valid MMF format A 2 5 spx error t spx input destroy spx input tx input Releases any memory internally used by the sparse matrix input handle input Parameters in input Returns Possible error conditions the input matrix handle an error code SPX SUCCESS or SPX FAILURE 1 SPX ERR ARG INVALID the input matrix handle is invalid A 2 6 spx matrix t spx mat tune spx input t input Converts the input matrix into the CSX format by applying all the options previously set with the spx option set routine In case no options have been explicitly set the default values are used see Table 3 2 of the User s Guide Parameters in input the input matrix handle in optional flag that indicates whether the matrix should be reordered by applying the Reverse Cuthill McKee algo rithm SPX MAT REORDER Returns Possible error conditions 32 a handle to the
45. s spx vec get entry Retrieves the value of a single en try spx_vec_set_entry Sets the value of a single entry spx vec destroy Destroys a vector object Table 3 3 Available routines for creating and modifying vector objects oftype spx vector t Routine Description spx mat get partition Retrieves a partitioning object from a tuned matrix spx partition csr Creates a partitioning type object from a matrix in the CSR format spx partition destroy Destroys a partitioning object Table 3 4 Available routines for handling partitioning objects of type spx partition t 16 3 3 Computational routines Routine Description spx mat get perm Retrieves the permutation applied to the matrix after applying the RCM algorithm spx vec reorder Reorders the vector according to the supplied permutation spx vec inv reorder Inverse reorders a permuted vector according to the supplied permuta tion Table 3 5 Routines involved in reordering ing permutation on vector objects For an example that illustrates the use of reordering see Example 4 3 in Chapter 4 3 3 Computational routines This module includes three variations of the SpMV kernel see Table 3 6 and some vec tor operations see Table 3 7 The spx matvec mult routine implements a simple multiplication y alpha A x while a more generic version of the kernel is also provided i e y alpha A x beta y through the spx_matvec_kernel routine Th
46. se the tuned buffers 29 if x tuned x I 30 x x tuned 31 33 if y tuned y I 34 y y tuned 35 37 aar 38 39 41 Cleanup 42 spx_input_destroy input 43 spx_mat_destroy A 44 spx partition destroy parts 45 spx vec destroy x view 46 spx vec destroy y view 48 The user can apply the free function on the tuned buffers 49 free x 50 free y 52 Shutdown library 53 spx_finalize 26 C bindings reference Generated by Doxygen 1 8 6 A 1 28 Available routines void spx init void spx finalize spx input t spx input load csr spx index t rowptr spx index t colind spx value t values spx index t nr rows spx index tnr cols spx input t spx input load mmf const char filename spx error t spx input destroy spx input t input spx matrix t spx mat tune spx input t input spx error tspx mat get entry constspx matrix t A spx index trow spx index t column spx value t value spx error tspx mat set entry spx matrix t A spx index trow spx index _t column spx value t value spx error t spx mat save const spx matrix t A const char filename spx matrix t spx mat restore const char filename spx index t spx mat get nrows const spx matrix t xA spx index t spx mat get ncols const spx matrix t A spx index t spx mat get nnz const spx matrix t A
47. sparse matrix constructor A MMF file consists of four parts 1 Header line this is the first line in the file and contains an identifier and four text fields in the following form MatrixMarket object format field symmetry where object is either matrix this is the case we will consider here or vec tor format can be either coordinate for sparse matrices or array for dense matrices field is either real double complex integer or pattern and fi nally symmetry is either general symmetric skew symmetric or hermi tian 10 3 2 Auxiliary routines Routine Description spx input load mmf Creates an input object from a file on disk in the MMF format spx input load csr Creates an input object from a ma trix in the CSR format spx input destroy Destroys an input object Table 3 1 Available routines for creating and destroying input matrix objects of type spx input t 2 Comment lines begin with a percent sign and allow a user to store infor mation and comments 3 Size line specifies the number of rows columns and nonzero elements in the matrix 4 Data lines specify the location of the matrix entries and their values When the matrix is sparse the location of the matrix entries is given in the coor dinate format using one base indexing in a column wise ordering Since CSX operates on the elements of a sparse matrix in a row wise order the column wise ordering of the MMF format creates the need to sort the
48. sr Performs the kernel y alpha A x beta y with a matrix in the CSR for mat as input Table 3 6 Available routines for sparse matrix by vector multiply Routine Description spx_vec_add spx_vec_sub spx_vec_mul spx_vec_scale spx_vec_scale_add Adds two vectors Subtracts two vectors Returns the product of two vectors Scales the input vector by a constant value Scales the input vector by a constant value Table 3 7 Available vector operations Routine spx_timer_start spx_timer_pause spx_timer_clear Description Starts the timer Pauses the timer Re initializes the timer spx_timer_get_secs Returns the elapsed time in seconds Table 3 8 Timing utilities 18 3 5 Logging routines 3 5 Logging routines Logging is a critical technique for troubleshooting and maintaining software systems The logging framework of SparseX was designed to be typesafe threadsafe at line level flexible and as light weight as possible It adopts the look and feel of C streams and can be enabled or disabled both at compile time and runtime By default only the Error and Warning logging levels are enabled However the user may find the Info level particularly helpful in understanding the preprocessing phase of CSX while evaluating program performance For example Info activates the printing of statistics about the detected and encoded substructures during the con version to the CSX format Addition
49. t nr threads Creates a partitioning object for the matrix in the Compressed Sparse Row CSR format This routine should be used in conjunction withthe spx matvec kernel csr multiplication routine Parameters in rowptr array rowptr of the CSR format in nr rows number of rows of the matrix in nr threads number of partitions of the matrix Returns a partitioning object or SPX INVALID PART in case an error occurs A 2 22 spx error t spx partition destroy spx partition t p Releases any memory internally used by the partitioning handle p Parameters in p the partitioning handle Returns an error code SPX SUCCESS or SPX FAILURE Possible error conditions 1 SPX ERR ARG INVALID the partitioning handle is invalid 2 SPX ERR MEM FREE deallocation failed 39 A 2 23 spx error tspx option set const char option const char x string Sets the option according to the description in string for the tuning process to follow For available tuning options and how to set them refer to Table 3 2 of the User s Guide Parameters in option the option to be set in string a description of how to set the option Returns an error code SPX SUCCESS or SPX FAILURE Possible error conditions 1 SPX ERR MEM FREE deallocation failed A 2 24 spx vector t spx vec create size_t size spx partition t p Creates and returns a vector object whose values must
50. the absolute path to the LLVM configuration script Ilvm config Similarly for the Boost library you can use the with boostdir option cd SPARSEX_DIR configure options 3 Type make to compile the package You can speed up the compilation process by using multiple tasks with the j flag of make make j8 4 Type make install to install the library and any data files and documentation When installing into a prefix owned by root it is recommended that the package be configured and built as a regular user and only the make install phase ex ecuted with root privileges By default make install installs the library under usr local 1lib and include files under usr local include You can spec ify an installation prefix other than usr local by giving configure in step 1 the option prefix PREFIX where PREFIX must be an absolute directory path make install Customizing the SparseX build using configure The configure step can be customized in many ways The most commonly used options are discussed below For a complete list run configure help and also refer to the INSTALL document available in the SparseX directory 6 2 3 Linking against SparseX Selecting a different build directory The SparseX library may be built in a different directory simply by creating the directory and running the commands in Step 2 of the previous section from that directory Sel
51. the user wishes to handle errors in a different manner she may set her own error handling routine by calling spx err set handler as long as it conforms to the signature set by our interface For more details see Appendix A A couple of macros are used to make the error handling a bit more convenient These macros are used throughout the interface and can be employed by the application programmer as well When an error is first detected one should set it by calling SETERROR O error code or SETERROR 1 error code message Both macros call the current error handler while the second also supplies a custom message 20 Examples This chapter introduces the SparseX interface through a series of examples Every ex ample highlights different aspects of our API in order to familiarize the user with most of the available routines These code examples can be located in the src examples subdirectory of the installation directory They could be used to determine e Whether SparseX is working on your system e How you should call the library e How to link the library 4 1 Example 1 In our first minimal example the input matrix is loaded from a file in the MMF format Depending on the number of available cores of the system serial or multithreaded execution is automatically selected Random vectors are created and finally a loop of 128 iterations of the SpMV kernel is executed and timed through the available timing routines of our interface T
52. xtract its source code from the associated git link or clone the git repository For the first option a tarball of the latest release can be downloaded from http research cslab ece ntua gr sparsex and subsequently extracted by typing tar xzf sparsex x y z tar gz These commands unpack the SparseX distribution into a subdirectory named sparsex x y z Replace the x y z string with the latest available version of SparseX ma jor minor patch software versioning scheme For the second option cd to a directory of your choosing and simply type git clone git github com cslab ntua sparsex git 5 2 INSTALLATION Note that there is no need to create a new directory sparsex as cloning the repository via git will do that for you Step 2 Compile The simplest way to compile this package includes the following steps 1 If the library has been downloaded in means of cloning the git repository or downloading the tarball available on the github link you must first run autoreconf vi in order to remake all of the configure scripts 2 cd to the directory containing the package s source code e g sparsex x y z and type configure to configure the package for your system If you have installed LLVM Clang in a non standard location that is not in your path you can instruct the compilation process to use your preferred LLVM installation through the with llvm configuration option by providing
53. y routines Option Default value Description spx rt nr threads cores Number of threads spx rt cpu_affinity 0 4cores 1 Thread affinity spx preproc xform all Substructure types that h v d will be detected br bc none spx preproc heuristic ratio SMP Heuristic that will try to cost NUMA maximize the compres sion ratio or minimize the computational cost spx preproc sampling none Use of statistical sam portion pling window spx preproc sampling nr samples 10 Number of sampling windows per thread spx preproc sampling portion 0 01 Portion of the matrix that will be sampled spx matrix symmetric false Use the symmetric vari ant of CSX spx matrix split blocks true Use of the split blocks optimization that im proves the evaluation of block types during the selection phase spx matrix full colind true NUMA Use of variable sized in false SMP tegers for the column in dex of each substructure spx matrix min unit size 4 Minimum number of nonzero elements re quired for a substructure unti to be valid spx matrix max unit size 255 Maximum number of nonzero elements re quired for a substructure unit to be valid spx matrix min coverage 0 1 Minimum percentage of matrix coverage required for a substruc ture instantiation not to be filtered out of the selection phase 13 Table 3 2 Available options for configuring the preprocessing phase of CSX 3 INTERFACE only a single row then only horizontal and d

Download Pdf Manuals

image

Related Search

Related Contents

取扱説明書 電気衣類乾燥機 家庭用 品番 AQD-K45  Ethan Frome  ASRock 3D Owner's Manual  Powermaster High Traffic Commercial Gate Operator UL 325 User's Manual  CAM MCOLD1 12  LauraStar Plusboard  Lincoln Electric SVM157-A User's Manual  Asus V3-P5P43 Specifications  Manual de instrucciones  Referenzhandbuch - CONRAD Produktinfo.  

Copyright © All rights reserved.
Failed to retrieve file