Home

Lib Lib - Fraunhofer SCAI

1. Syntax matrix_t hlib_matrix_build_coeff const blockcluster_t const coeff_fn_t void k const lrapx_t const double const int int matrix_t hlib_matrix_build_ccoeff const blockcluster_t const ccoeff_fn_t void k const lrapx_t const double const int int Arguments bct Block cluster tree over which the H matrix shall be built Coefficient function defining dense matrix arg Optional argument which will be passed to the coefficient function lrapx Defines the type of low rank approximation used in each admissible matrix block and can be one of LRAPX SVD use singular value decomposition LRAPX_ACA use adaptive cross approximation LRAPX_ACAPLUS use advanced adaptive cross approximation LRAPX_ACAFULL use adaptive cross approximation with full pivot search Since an H matrix is usually not an exact representation of the dense matrix but only an approximation the accuracy for this approximation has to be specified In H Lib this is always done in a block wise fashion That means that for a given dense matrix D and the corresponding H matrix A build out of D with an accuracy of gt 0 this accuracy only holds for all block indexsets defined by leaves t x s in the block cluster tree Dlexs Altxsl lt but not necessarily for the matrices itself This is a general property for all matrix operations in the context of H matrices As an example for building an H matrix with adaptive cross approximation we c
2. Syntax void hlib_solve_precond const solver_t solver const matrix_t A const matrix_t W vector_t X const vector_t b int info Arguments W Preconditioner to the linear equation system 3 6 Miscellaneous Functions The following functions are mostly included in H LiB for convenience and are usually available by other libraries or even the operating system itself 37 3 6 Miscellaneous Functions 3 6 1 Quadrature Rules Since H LiB is capable of discretising integral equations different quadrature rules are imple mented as part of the computations These rules are also exported so that external routines can benefit from them 3 6 1 1 Gaussian Quadrature Quadrature rules for Gaussian quadrature of order n over the interval 0 1 are constructed by the function Syntax void hlib_gauss_quadrature_1d const unsigned int order double points double weights int info Arguments n Order of the quadrature points Array of size order where the quadrature points will be stored weights Array of size order where the quadrature weights will be stored 3 6 1 2 Quadrature Rules for Triangles H LiB also provides quadrature rules for the integration over a pair of triangles e g when a integral equations on a surface grid These rules were developed by Stefan Sauter see SS04 There different rules apply to different cases of triangle interaction O M DY same triangle comm
3. argv int info void hlib_done int info Arguments argc Number of command line arguments argv array of strings containing the command line arguments info to return error status The function hlib_init expects the command line parameters argc and argv for the pro gram as arguments and initialises H LiB The values of argc and argv might be modified by hlib_init Accordingly hlib_done finishes H LiB The meaning of the argument info will be discussed in the next section Normally H LiB does not produce any output at the console This behaviour can be changed with Syntax void hlib_set_verbosity const unsigned int verb Here larger values correspond to more output e g error messages timing information algo rithmic details 3 1 2 Error Handling Almost all functions in the C interface of H Lib expect a pointer to an integer usually named info which is used to indicate the status of the function i e whether an error occurred and what kind of error this was If info points to NULL it will not be accessed and no information about errors will be delivered to the user The following list contains all error codes of H Lib C Language Bindings NO_ERROR No error occurred ERR_ARG invalid argument ERR_MEM insufficient memory available ERR_NULL null pointer encountered ERR_INIT not initialised ERR_COMM communication error ERR_DIV_ZERO division by zero ERR_INF inf
4. H Lib v0 11 User Manual by Ronald Kriemann il Contents 1 Preface 2 Installation 2 1 Prerequisites s 4 24 Ae Ree RR PRA ee ERE E 2 1 1 Operating System and Compiler 212 LAPACR s nauhaa a ec D a dt Zla Misc Tools s o act an sh o has ar ee shape 22 Configuration 25 2045 24 bP ee ee he eS bon eG Bi 2 3 Compilation s s aos p a a a Ri g a ee ep ae eA ke e eS ee He a 2A Final Installation Step 6 22 60 6266 62 88S RAO eee E k 3 C Language Bindings dul 3 2 3 3 3 4 3 5 3 6 3 7 General Functions and Data types 3 1 1 Initialisation and Finalisation 3 1 2 Error Handling is eca e e a E 4 oiled Data types ss 14 3 e se 225 oe ae aoe PANNE a aa 3 1 4 Reference Counting s s co pk Peek eee id ge path dla BS Cluster Trees and Block Cluster Trees 3 2 1 Cluster Trees s ea Le dos su a bises Dada semer a 3 2 2 Block Cluster Trees eses ne ae a EE Luna VECLOLS Ge a e da OR ae AR M ee ee Re ai es 3 3 1 Creating and Accessing Vectors 3 3 2 Vector Management Functions 3 3 3 Algebraic Vector Functions 33A Vector sk ote ee ass ee Se eo oe ew IMAREIGRS ye LS ee te we a ewe a we ee a a 3 4 1 Importing Matrices from Data structures 342 Building H Matric s ca cse eead
5. A typical example for the usage of the configuration system might be configure enable opt disable debug with cpuflags which enables an optimised compilation without debugging information and the usage of cpuflags to get the correct compiler flags Beside these H Lib related options the following parameters can be used to change the behaviour of the configuration system itself c check Turns on checking of used tools and libraries e g if the C compiler is capable of producing object files h help Prints a detailed description of all options for the configuration system q quiet Do not print any information during configuration s show Just print the current options of the configuration system without creating makefiles v verbose Print additional information during configuration All settings which where changed by the user will be written to the file config cache where one can optionally edit the parameters with a text editor 2 3 Compilation After H Lib was configured you can compile it by make Parallel compilation e g via j is also supported After compiling a library should reside in the 1ib directory and examples for the usage of H LiB should have been generated in the example subdirectory If CLAPACK was chosen as the LAPACK implementation a corresponding library can be found in the 1ib directory Installation 6 2 4 Final Installation
6. In example bemid c a preconditioner based on Gaussian elimination is computed in addi tion to the one due to LU factorisation presented here Finally one has to release all resources allocated in the example 48 hlib_vector_free x amp info CHECK_INFO 49 hlib_vector_free b amp info CHECK_INFO 50 hlib_vector_free one amp info CHECK_INFO 51 hlib_matrix_free A amp info CHECK_INFO 52 hlib_bct_free bct amp info CHECK_INFO 53 hlib_ct_free ct amp info CHECK_INFO 54 hlib_solver_free gmres amp info CHECK_INFO 55 free x_arr free b_arr 56 for i 0 i lt n i free vertices i 57 free vertices 58 hlib_done amp info CHECK_INFO Please note the manual freeing of the C arrays x_arr and b_arr which are associated with the vectors x and b This has to be done by the user since H Lib only releases the additional resources allocated for internal vector management C Language Bindings 48 Bibliography Beb00 BG05 BGH03 DD91 Fra GHO03 GL96 Gra01 Hac93 Hac99 Mat SS86 SS04 vdV92 M Bebendorf Approximation of boundary element matrices Numerische Mathe matik 86 565 589 2000 S B rm and L Grasedyck Hybrid cross approximation of integral operators Nu merische Mathematik 2 221 249 2005 S B rm L Grasedyck and W Hackbusch Hierarchical Ma
7. Syntax matrix_t hlib_matlab_load_ matrix const char const char int hlib_matlab_save_ matrix const matrix_t const char const char int Arguments M Sparse or dense matrix to save to filename filename Name of Matlab file to load save matrix from to matname Name of the matrix in the Matlab file 3 4 5 3 H Lib Since H matrices can not be stored in an efficient way in current matrix formats H Lib defines it s own file format for matrices This format also supports sparse and dense matrices The corresponding functions to load and save matrices from to files are Syntax matrix_t hlib_hformat_load_matrix const char int void hlib_hformat_save_matrix const matrix_t const char int Arguments M Matrix to save to filename filename Name of file to load save matrix from to 3 4 5 4 PostScript Matrices in H Lib can also be printed in PostScript format Here various options are available to define the kind of data in the resulting image MATIO_SVD print singular value decomposition in each block MATIO_ENTRY print each entry of matrix MATIO_PATTERN print sparsity pattern non zero entries These options can also be combined by boolean or e g to print the SVD and the sparsity pattern the combination MATIO_SVD MATIO PATTERN is used By default only the structure of the matrix is printed There different kind of blocks are marked by different colours 31 3 5 Algebra En dense matrix block
8. H matrices Of course inside H Lib this distinction is done and the appropriate or expected type is checked in each function To use matrices in H Lib three different ways are possible import a matrix given by some data structures build a matrix or load a matrix from a file The first method usually applies to sparse and dense matrices whereas H matrices are normally build by H LiB These different methods will be discussed in the following sections 3 4 1 Importing Matrices from Data structures Sparse Matrices Before using sparse matrices in H LiB they have to be imported to the internal representation For this the sparse matrix is expected to be stored in compressed row storage or CRS format The CRS format consists of three arrays colind coeffs and rowind The array colind holds the column indices of each entry in the sparse matrix ordered according to the row e g at first all indices for the first row then all indices for the second row and so forth Here the indices itself are numbered beginning from 0 In the same way the array coeffs holds the coefficients of the corresponding entries in the same order Both array have dimension nnz i e the number of non zero entries in the matrix The last array rowind has dimension n 1 where n is the dimension of the matrix It stores at position 1 the index to the colind and coeffs array for the ith row e g the entries for the ith row have the column indices 21 3 4
9. Step If you have not chosen a specific installation directory e g with the prefix option the installation is complete Otherwise you can install H LiB by using make install which copies the H Lib library and all include files to the corresponding directory It also copies the script hlib config which was generated by configure to the bin subdirectory This script can be used to gather the correct compilation flags when including H LiB in your projects Options to hlib config are prefix Returns to installation directory of H LiB version Prints the version of H Lib Iflags Prints correct flags to links programs with H LiB cflags Prints correct flags to compile programs with H LiB So to compile and link a program one would use the following statement CC o program program cc hlib config cflags lflags 3 C Language Bindings Although H LiB is programmed using C the functionality is also available in a C environ ment The corresponding functions for the C interface will be described in this chapter 3 1 General Functions and Data types 3 1 1 Initialisation and Finalisation Before using any functions H LiB has to be initialised to set up internal data In the same way when H Lib is no longer needed it should be finished Both is done by the following functions Syntax void hlib_init int argc char
10. double D 4 matrix _t M 3 4 2 Building H Matrices Before any H matrix can be built a corresponding block cluster tree has to be available which describes the partitioning of the block indexset of the matrix and defines admissible matrix blocks Please refer to Section 3 2 2 on how to get a suitable block cluster tree object 3 4 2 1 Sparse Matrices Sparse matrices are converted into H matrices by using Syntax matrix_t hlib_matrix_build_sparse const blockcluster_t bct const matrix_t S int info Arguments bct Block cluster tree defining partitioning of the block indexset of the sparse matrix Sparse matrix to be converted to an H matrix Usually the resulting H matrix is an exact copy of the given sparse matrix Although due to efficiency reasons this equality is only valid with respect to machine precision 3 4 2 2 Dense Matrices Since dense matrices involve an unacceptable memory overhead an H matrix approximation should not be built out of a given dense matrix but by constructing the data sparse approxi mation directly This is accomplished by various algorithms Attention The algorithms for computing a H matrix approximation of a given dense matrix only work for certain classes of matrices e g coming from integral equations They might not work for general matrices Be sure to check the applicability of the algorithms before using a certain routine Adaptive Cross Approximation Adaptive
11. latter is computed by singular value decomposition with n complexity and therefore only practical for small problem sizes 22 if n lt 2000 28 matrix t B hlib matrix _build_coeff bct kernel amp h LRAPX_SVD le 16 1 amp info CHECK_INFO 24 hlib_matrix_print_ps B bemid_A_svd ps MATIO_SVD amp info CHECK_INFO 25 printf A A _F A _F 4e n hlib_matrix_norm_spectral_diff A B amp info CHECK_INFO 26 hlib_matrix_free B amp info CHECK_INFO Beside the changed construction algorithm LRAPX SVD the blockwise accuracy was also in creased to machine precision at line 23 The relative difference with respect to the spectral norm is computed with hlib matrix norm spectral diff Since it is no longer used matrix B is released at line 26 Before the equation system can be solved the right hand side has also to be discretised and represented by a corresponding vector The above described ansatz leads to the following function for the right hand side b h pi x f x dx where f is chosen such that for the solution u 1 holds C Language Bindings 46 double rhs int i int n double a double i double n double b double i 1 0 double n double value 1 5 b a if fabs b gt le 8 value 0 5 b b log b if fabs a gt le 8 value 0 5 ax a log a if fabs 1 0 b gt le 8 value 0 5 1 0 b 1 0 b log 1 0 b
12. of the inverse operator Since not the real inverse is computed the decomposition can not be used for matrix arithmetic e g matrix multiplication Function hlib_matrix_inv_lu computes the LU decomposition and overwrites A with a matrix representing the inverse of the factors e g LU Therefore evaluating A afterwards corresponds to A and not A 39 3 5 Algebra Syntax void hlib_matrix_inv_lu matrix_t A const double eps int info Arguments A Dense or H matrix to be decomposed using LU factorisation A will be overwritten with the inverse of the factorisation result eps Block wise accuracy of the H arithmetic during the LU decomposition Remark When using LU decomposition to compute the inverse of an H matrix based upon a sparse matrix nested dissection for the cluster tree construction see Section 3 2 1 2 leads to an improved computational efficiency of the algorithm 3 5 2 Solving Linear Systems Various iterative methods are available to solve a linear system Richardson CG BiCGStab and GMRES iteration see Hac93 vdV92 and SS86 Furthermore all of these methods can be preconditioned with an appropriated matrix e g by the inverse of a matrix Before a system can be solved a solver object has to be created For the Richardson iteration this is accomplished by the function Syntax solver_t hlib_solver_rich const int maxit const double abs_red const double rel_red int ite
13. 2 algebra 31 BiCG Stab iteration 34 binary space partitioning 10 CG iteration 34 cluster tree 9 construction 10 management 10 data types 8 dot product 18 error handling 7 finalisation 7 Frobeniusnorm 27 GMRES 40 45 GMRES iteration 34 initialisation 7 inversion 33 41 LU factorisation 33 40 46 matrix 20 addition 32 inversion 33 41 LU decomposition 33 multiplication 33 norm 27 multiplication 33 nested dissection 10 40 norm Frobenius 27 matrix 27 50 spectral 27 vector 19 quadrature 36 Gaussian 36 triangles 37 reference counting 9 singular value decomposition 24 44 spectral norm 27 SVD 24 time CPU 38 wall clock 38 timing 38 vector 15 dot product 18 norm 19 Function and Datatype Index adm_t 16 blockcluster_t 15 bsp_t 13 ccoeff_fn_t 25 cluster_t 11 coeff_fn_t 25 complex_t 11 hlib_bct_bytesize 17 hlib_bct_free 16 hlib_bct_print_ps 17 hlib_cputime 40 hlib_ct_build_alg 15 hlib_ct_build_alg_nd 15 hlib_ct_print_ps 15 hlib_done 9 hlib_error_desc 10 hlib_gauss_quadrature_ld 38 hlib_init 9 hlib_matlab_load_vector 21 hlib_matlab_save_vector 21 hlib_matrix_build_ccoeff 26 hlib_matrix_build_coeff 26 hlib_samg_load_rhs 21 hlib_samg_load_sol 21 hlib_samg_save_rhs 21 hlib samg save sol 21 hlib_ sauter _quadrature_edge 39 hlib_ sauter quadrature_ eq 39 hlib_sauter_quadrature_sep 40 hlib sauter qua
14. H Lib are used The vector one contains the exact solution As in the previous example this unpreconditioned iteration is usually inefficient Therefore matrix inversion is used to speed up the process Since only matrix vector multiplication with the inverse is needed for the GMRES iteration LU factorisation is the method of choice 47 3 7 Examples 39 matrix_t LU hlib_matrix_copy_eps A le 4 amp info CHECK_INFO 40 hlib_matrix_inv_lu LU 1e 4 amp info CHECK_INFO 41 hlib_matrix_print_ps LU bemid_LU ps MATIO_SVD info CHECK_INFO 42 printf inversion error a hlib_matrix_norm_inv_approx A LU amp info CHECK_INFO 43 hlib_solve_precond gmres A LU x b amp info DE CHECK_INFO 44 hlib_vector_axpy x 1 0 one amp info CHECK_INFO 45 error hlib_vector_norm2 x amp info CHECK_INFO 46 printf error of solution 4e n error 47 hlib_matrix_free LU amp info CHECK_INFO To obtain a LU decomposition the matrix A is copied into LU Since only a preconditioner is needed this copy is not exact but with a reduced blockwise accuracy of 1074 The same accuracy is then used for the LU factorisation at line 40 of which the result is checked at line 42 There the largest eigenvalue of A LU is computed Solving the preconditioned system and computing the resulting error is analogous to the unpreconditioned case Remark
15. Matrices colind i 1 colind 1 1 and the coefficients coeffs i 1 coeffs 1 1 The value rowind n holds the number of non zero entries As an example the matrix int rowind 5 int colind 10 double coeff 10 To import a sparse matrix in CRS format into H LiB the following two functions can be used Syntax matrix_t hlib_matrix_import_crs const int const int const int const int const int const double const int int matrix_t hlib_matrix_import_ccrs const int const int const int const int const int const complex_t const int int Arguments rows cols Number of rows and columns of the sparse matrix nnz Number of non zero entries in the sparse matrix rowind colind coeffs Arrays containing the sparse matrix in CRS format sym If sym is non zero the sparse matrix is assumed to be symmetric The two routines only differ by the coefficient type of the sparse matrix which can either be real or complex valued To finish the above described example by creating a matrix object from the constructed arrays one has to add a call to the corresponding function C Language Bindings 22 int rowind 5 0 2 5 8 10 int colind 10 T2 I 2 Ss 2 3 double coeff 10 PA D A A A ARES matrix_t S Il o p o S hlib_import_crs 4 10 rowind colind coeffs Dense Matrices Although due to their high memory and computation overhead not t
16. _build_sparse bct S amp info CHECK_INFO 33 hlib_matrix_print_ps A crsalg_A ps MATIO_PATTERN amp info CHECK_INFO 34 hlib_matrix_inv A le 4 amp info CHECK_INFO 35 hlib_matrix_print_ps A crsalg_I ps MATIO_SVD info CHECK_INFO 36 printf size of Inverse 2f T double hlib_matrix_bytesize A info 1024 0 1024 0 Again A is overwritten by its inverse matrix Both matrices are also printed to PostScript files whereby for the inverse matrix the singular value decomposition of each matrix block is shown The linear equation system is then solved as in the previous case E hlib_solve_precond gmres S A x b info CHECK_INFO And also the deletion of all created objects is done by the same functions i 38 hlib_matrix_free A amp info CHECK_INFO 39 hlib_bct_free bct amp info CHECK_INFO 40 hlib_ct_free ct amp info CHECK_INFO Finalisation Finally all other objects which where created at the beginning should be released and H Lib be finished 41 hlib_vector_free x amp info CHECK_INFO 42 hlib_vector_free b amp info CHECK_INFO 43 hlib_matrix_free S info CHECK_INFO 44 hlib_solver_free gmres amp info CHECK_INFO 45 hlib_done amp info CHECK_INFO 3 7 2 Integral Equation The following e
17. _bytesize and printed at line 24 The above equation system can now be solved with the inverse of the LU factorisation of A as a preconditioner 24 hlib_solve_precond gmres S A x b amp info CHECK_INFO Again the GMRES iteration is used Finally the locally created objects e g the cluster tree the block cluster tree and the H matrix can be deleted 25 hlib_matrix_free A amp info CHECK_INFO 26 hlib_bct_free bct amp info CHECK_INFO 27 hlib_ct_free ct amp info CHECK_INFO Matrix Inversion for Preconditioning An alternative procedure for solving the above system fast is using matrix inversion Again before the matrix can be inverted a cluster tree and a block cluster tree have to be constructed Nested dissection as it was used for LU decomposition is not a suitable technique for matrix inversion see Section 3 2 1 2 Hence standard algebraic partitioning functions are called C Language Bindings 42 28 cluster_t ct hlib_ct_build_alg S amp info CHECK_INFO 29 hlib_ct_print_ps ct crsalg_ct ps amp info CHECK_INFO 30 blockcluster_t bct hlib_bct_build ct ct amp info CHECK_INFO 31 hlib_bct_print_ps bct crsalg_bct ps amp info CHECK_INFO Converting the sparse matrix into an H matrix is done by the same function as before Only the matrix inversion is now performed by hlib_matrix_inv 32 matrix_t A hlib_matrix
18. _vector_copy const vector_t x int info which returns a new vector object with it s own data If the original vector v corresponds to a C array the newly created vector does not represent this array but uses a new array The size of a vector can be determined by the function Syntax unsigned int hlib_vector_size const vector_t x int info Similarly the memory size of a vector in bytes is obtained by C Language Bindings 18 Syntax unsigned long hlib_vector_bytesize const vector_t x int info Finally vectors are released by using Syntax void hlib_vector_free vector_t x int info which frees all local memory of a vector This does not apply to associated C arrays e g if the vector was constructed with hlib vector import array There the array has to be deleted by the user 3 3 3 Algebraic Vector Functions A complete set of function for standard algebraic vector operations is available in H Lib In contrast to the vector copy function above the following routine does not create a new vector but copies the content of x to the vector y For this both vectors have to be of the same type Syntax void hlib_vector_assign vector_t y const vector_t x int info Arguments y Destination vector of the assignment Source vector of the assignment Scaling a vector e g the multiplication of each element with a constant is performed by Syntax void hlib_vector_sca
19. able H LiB contains a modified version of CLAPACK But it should be noted that this might result in a reduced performance of H Lib 2 1 3 Misc Tools To use the configuration system a Perl interpreter is needed and for the makefiles GNU make is mandatory Installation 4 2 2 Configuration If the right operation mode is chosen the configuration system of H LiB can be used to create all the makefiles needed to compile the package For this simply type configure which uses default settings for your operating system and compiler and chooses appropriate options for the compilation e g include directives libraries etc All available options for the configuration system can be printed by configure help which will result in the following list prefix dir Set the prefix for installing H Lib to the directory dir By default the local directory is chosen objdir dir Define a prefix for all object files during compilation e g tmp By default object files will be created in the same directory where the source files reside cc CC CXx CXKX Use CC and CXX as the C and C compiler By default appropriate settings for the considered operating system are used e g gcc and g for Linux ld LD ar AR Set the dynamic linker to LD and the static linker to AR By default operating system depen dent settings will be used ranlib RANLIB Define the
20. ated block indexset allows a low rank approximation in the matrix Block cluster trees are represented in H Lib by objects of type typedef blockcluster_s blockcluster_t 3 2 2 1 Block Cluster Construction Due to the definition of a block cluster tree two cluster trees are needed for the construction The actual building is performed by the routine C Language Bindings 14 Syntax blockcluster_t hlib_bct_build const cluster_t rowct const cluster_t colct int info Arguments rowct Cluster tree representing the row indexset of the block cluster tree colet Cluster tree representing the column indexset of the block cluster tree The cluster trees given to hlib_bct_build can be identical Usually the admissibility condition responsible for detecting admissible nodes in the block cluster tree is chosen automatically based on the given cluster trees The strategy can be changed by the user with the function Syntax void hlib_set_admissibility const adm_t adm const double eta Arguments adm Define admissibility condition to be either ADM_AUTO automatic choice ADM STD MIN standard admissibility with minimal cluster diameter ADM STD MAX standard admissibility with maximal cluster diameter or ADM_WEAK weak admissibility eta Scaling parameter for the distance between clusters in the admissibility condition Attention Since changes to the admissibility condition can lead to a failure of the H matrix app
21. ch function call to H LiB and tests whether an error occured The definition of CHECK_INFO is as follows define CHECK_INFO if info NO_ERROR char buf 1024 hlib_error_desc buf 1024 printf n s n n buf exit 1 There the complete error message is copied into the string buf and printed to the standard output Afterwards the program is aborted Coming back to the linear system since the complete data is now available one could solve it For this a solver object see Section 3 5 2 has to be created which is in this case a GMRES solver with a restart after 10 iteration steps and at most 1000 iterations Furthermore the residual should be reduced until a norm of 1078 was reached The system can then be solved with hlib_solve 13 solver_t gmres hlib_solver_gmres 10 1000 1e 8 0 amp info CHECK_INFO 14 hlib_solve gmres S x b amp info CHECK_INFO LU Factorisation for Preconditioning Since the standard iteration process is usually far to costly a suitable preconditioner based on the H matrix technique is constructed to speed up the process At first this is accom plished by using LU factorisation in combination with nested dissection see Section 3 2 1 2 and Section 3 5 1 4 But before the matrix can be factorised it has to be converted to an H matrix For this one needs a cluster tree and a block cluster tree Since no geometrical data is available in this 41 3 7 Exa
22. cross approximation or ACA is a technique which constructs an approximation to a dense matrix by successively adding rank 1 matrices to the final approximation For this only the matrix coefficients of the dense matrix are needed These coefficients are given by the user in the form of a coefficient function which evaluates certain parts of the global dense matrix The definition of these functions is as follows C Language Bindings 24 Syntax typedef void coeff_fn_t int n int rowcl int m int colcl double matrix void arg typedef void ccoeff_fn_t int n int rowcl int m int colcl complex_t matrix void arg Arguments n rowcl Number of row indices and the row indices in the for of an array at which the matrix shall be evaluated m colel Number of column indices and the column indices in the for of an array at which the matrix shall be evaluated matrix Array of dimension n m at which the computed coefficients at the positions defined by rowcl and colcl shall be stored in column major format arg Optional argument to the coefficient function see below Different variants of ACA are available in H LiB each of them with it s own advantages and disadvantages From a computation point of view interesting are the original formulation see Beb00 and advanced ACA see BG05 ACA for short These two have a linear complexity in the dimension of the matrix and a quadratic complexity i
23. d by H LiB to free objects Attention To repeat it again never directly release an object e g via free void since it might be shared by other objects leading to an undefined behaviour of the program The usage of the H Lib routines also has the advantage that some further checks are performed to test whether an object was already released or not This means that instead of an undefined program behaviour an error is generated if you want to access an object previously freed 3 2 Cluster Trees and Block Cluster Trees H matrices are based on two basic building blocks cluster trees and block cluster trees A cluster tree defines a hierarchical decomposition of an indexset whereas a block cluster tree represents a decomposition of a block indexset Both objects have to be created before building an H matrix 3 2 1 Cluster Trees Inside H Lib cluster trees are represented by objects of type typedef struct cluster_s cluster_t and can be created in various ways according to the type of data supplied to H LiB C Language Bindings 10 3 2 1 1 Cluster Tree Management Functions To safely free all resources coupled with a cluster tree the following function can be used Syntax void hlib_ct_free cluster_t ct int info The object ct and all coupled resources are freed from memory unless the cluster tree is used by another object Of interest is also the amount of memory used by the cluster tree Thi
24. data is available an algebraic algorithm can be used which is based only on the connectivity described by a sparse matrix as it results from finite difference or finite element 13 3 2 Cluster Trees and Block Cluster Trees methods Since this does not always reflect the real data dependency in a grid the resulting clustering usually leads to a less efficient matrix arithmetic than the geometrical approach As in the previous case the algebraic method can be combined with nested dissection Syntax cluster_t hlib_ct_build_alg matrix_t S int info cluster_t hlib_ct_build_alg_nd matrix_t S int info Arguments S Sparse matrix defining connectivity between indices Due to the simple structure of the previous examples the resulting cluster trees using alge braic clustering would be identical 3 2 1 3 Cluster Tree I O The tree structure of cluster trees can be exported in the PostScript format to a file by Syntax void hlib_ct_print_ps const cluster_t cb const char filename int info Arguments ct Cluster tree to be printed filename Name of the PostScript file to which ct shall be printed to 3 2 2 Block Cluster Trees The next building block for H matrices are block cluster trees which represent a hierarchical partitioning of the block indexset over which matrices are defined They are constructed by multiplying two cluster trees and choosing admissible nodes e g nodes where the associ
25. ding crucial aspects of the topic are displayed as Preface 2 Examples for a specific function or algorithm are enclosed by 2 Installation In this section the installation procedure of H LiB is discussed This includes the various tools and libraries need for H Lib to compile and work the configuration system of H LiB and the usage of the supplied tools for compiling programs with H Lib 2 1 Prerequisites 2 1 1 Operating System and Compiler H Lib was tested on a variety of operating systems and is known to work on the following environments Linux Solaris AIX HP UX Darwin Tru64 and FreeBSD In particular each POSIX conforming system should be fine The more crucial part plays the compiler H Lib demands a C compiler which closely follows the C standard The following compiler versions are known to work GCC Version 3 4 and above including 4 x Intel Compiler Version 5 and above Portland Compiler Version 2 x Sun Forte Studio Version 5 3 and above IBM VisualA ge Version 6 HP C Compiler Version 3 31 and above Compaq C Version 6 5 and above 2 1 2 LAPACK Internally H LiB uses LAPACK see DD91 for most arithmetic operations Therefore an implementation of this library is needed for H LiB Most operating systems provide a LAPACK library optimised for the corresponding processor e g the Intel Math Kernel Library or the Sun Performance Library If no such implementation is avail
26. drature vtx 40 hlib_set_admissibility 16 hlib_set_bsp_type 13 hlib_set_verbosity 9 hlib_vector_alloc_cscalar 18 hlib_vector_alloc_scalar 18 hlib_vector_assign 20 hlib_vector_axpy 20 hlib_vector_bytesize 19 hlib_vector_caxpy 20 hlib_vector_cdot 20 hlib_ vector _cfill 19 hlib_vector_copy 19 hlib_vector_cscale 20 hlib_vector_dot 20 hlib_vector_entry_cget 18 hlib_vector_entry_cset 18 hlib_vector_entry_get 18 hlib_vector_entry_set 18 hlib_vector fill 19 hlib_vector_fill rand 19 hlib_vector free 20 hlib_vector_import_array 17 hlib_vector_import_carray 17 hlib_vector_norm2 21 hlib_vector_scale 20 hlib_vector_size 19 hlib_walltime 40 lrapx_t 26 matop_t 33 matrix_t 22 vector_t 17 51
27. eof double coord i 0 double i 0 5 double n ct hlib_ct_build_bsp n 1 coord amp info bct hlib_bct_build ct ct amp info A hlib_matrix_build_coeff bct coeff_fn amp n LRAPX_ACAPLUS le 4 1 info Remark The above example is also implemented in the file examples bemid c There also source code for the function integrate_a is available Hybrid Cross Approximation To be done 27 3 4 Matrices 3 4 3 Matrix Management Using the following two function one can get the number of rows and columns of a specific matrix Syntax unsigned int hlib_matrix_rows const matrix_t A int info unsigned int hlib_matrix_columns const matrix_t A int info To obtain a copy of a block cluster tree associated with a matrix the function Syntax blockcluster_t hlib_matrix_bct const matrix_t A int info is available In contrast to the copy operations so far e g for cluster trees and block cluster trees copying a matrix always creates a new matrix The copy can be either exact or up to a given accuracy Both operations are done with the functions Syntax matrix_t hlib_matrix_copy const matrix_t A int info matrix_t hlib_matrix_copy_eps const matrix_t A double eps int info Arguments eps Block wise accuracy of the copy compared to A Deallocating a matrix is accomplished with Syntax void hlib_matrix_free mat
28. guments tril_pts tri2_pts weights Arrays of size order 39 3 7 Examples 3 6 2 Measuring Time Two types of time can be measured by H Lib the CPU time and the wall clock time The first corresponds to the time spent by the program actually computing things This type of time has the advantage that the load on the machine does not have any influence on the value of the time In contrast to this the wall clock time is the actual time as measured by a real clock This type of time is dependent on the load on the computer system and therefore might be different between two runs of the program The actual functions to obtain both types are Syntax double hlib_walltime double hlib_cputime The absolute value of these functions is normally not usable Only the difference between two measurements returns the passed time 3 7 Examples In the following two typical example for the usage of H Lib are discussed in more detail 3 7 1 Sparse Linear Equation System In this section a linear equation system Sx b with a sparse matrix S and a right hand side b is considered Such a system usually occures in the context of finite difference or finite element discretisations The linear system itself is provided in the form of a SAMG dataset with the basename samg matrix see Section 3 3 4 1 and Section 3 4 5 1 Remark The complete example with additional output statements and timing of each function is a
29. h double i h 2 0 12 hlib_set_bsp_type BSP_CARD 13 ct hlib_ct_build_bsp n 1 vertices info CHECK_INFO 14 hlib_ct_print_ps ct bemid_ct ps info CHECK_INFO 15 hlib_set_admissibility ADM_STD_MIN 2 0 16 bct hlib_bct_build ct ct amp info CHECK_INFO 7 hlib_bct_print_ps bct bemid_bct ps amp info CHECK_INFO C Language Bindings 44 The dimension of the problem is defined by the variable n which also defines the stepsize h of the discretisation Evaluationg info is again done by the macro CHECK_INFO which is defined as in the previous section After initialising H LiB the coordinate information is created at the lines 8 to 11 At this the middle of each interval 2 Ht was chosen as the coordinate for the corresponding vertex i With this the cluster tree can be created as it is done at line 13 Here the clustering strategy was explicitly set to cardinality balanced clustering see Section 3 2 1 2 The tree is afterwards printed to the file bemid_ct ps in PostScript format Finally one can construct the block cluster tree Again the type of construction is explicitly set in the form of a strong admissibility condition with a scaling parameter of size 2 see also Section 3 2 2 1 Attention Choosing the clustering strategy as well as the admissibility condition is only suggested if the user is absolutely sure that the corresponding options real
30. he preferred type of matrix H LiB is also capable of handling dense matrices As for sparse matrices they have to be imported for further usage When using dense matrices the user has to keep in mind a very important aspect of their storage Attention In contrast to the standard way in which C addresses dense matrices H Lib expects them to be in column major format e g stored column wise 4 For example the matrix O I has to be stored in an array as follows double D 4 1 3 2 4 The reason for this is the usage of LAPACK inside H LiB LAPACK is originally written in Fortran and therefore uses column major format for all matrices To import a dense matrix into H Lib the following two functions for real and complex valued matrices are available Syntax matrix_t hlib_matrix_import_dense const int rows const int cols const double D const int sym int info matrix_t hlib_matrix_import_cdense const int rows const int cols const complex_t D const int syn int info Arguments rows cols The number of rows and columns of the dense matrix D Array of dimension rows cols in column major format containing the matrix coefficients sym If sym is non zero the matrix is assumed to be symmetric 23 3 4 Matrices Importing the above defined matrix D is accomplished by 1 Si 2 Le hlib_matrix_import_dense 2 2 D 0 info
31. if fabs 1 0 a gt 1e 8 value 0 5 1 0 a 1 0 a log 1 0 a return value The actual vectors for b and the solution stored in x are constructed out of C arrays A _ ___ z_z z z gt o rr 27 double x_arr double malloc n sizeof double 28 double b_arr double malloc n sizeof double 29 vector_t x 30 vector_t b CHECK_INFO CHECK_INFO hlib_vector_import_array x_arr n amp info hlib_vector_import_array b_arr n amp info 3 31 for i 0 i lt n i b_arr i rhs i n Please note that the access to the elements of the vector b at line 31 is done via the array Barr The solution x with a reduction of the norm of the residual to 1078 is afterwards computed using the GMRES iteration 32 solver_t gmres hlib_solver_gmres 10 1000 1e 8 O amp info CHECK_INFO 33 hlib_solve gmres A x b amp info CHECK_INFO Since the solution is known one can compute the error x 1 2 34 vector_t one hlib_vector_copy x amp info CHECK_INFO 35 hlib_vector_fill one 1 0 amp info CHECK_INFO 36 hlib_vector_axpy x 1 0 one info CHECK_INFO 37 double error hlib_vector_norm2 x amp info CHECK_INFO 38 printf error of solution 4e n error Here the linear algebra functions for vectors in
32. inity occurred ERR_NAN not a number occurred ERR_NOT_IMPL functionality not implemented ERR_FOPEN could not open file ERR_FCLOSE could not close file ERR_FWRITE could not write to file ERR_FREAD could not read from file ERR_FSEEK could not seek in file ERR_FNEXISTS file does not exists ERR_GRID_FORMAT ERR_GRID_DATA ERR_CT_INVALID ERR_CT_INCOMP ERR_CT_SPARSE ERR_BCT_INVALID ERR_VEC_INVALID ERR_VEC_INVALID ERR_MAT_INVALID ERR_MAT_NSPARSE matrix not a sparse matrix ERR_MAT_NHMAT matrix not an H matrix ERR_MAT_INCOMP_TYPE matrices with incompatible type ERR_MAT_INCOMP_CT matrices with incompatible cluster tree ERR_SOLVER_INVALID invalid solver ERR_GRID_INVALID invalid grid ERR_LRAPX_INVALID invalid low rank approximation type invalid format of grid file invalid data in grid file invalid cluster tree given cluster trees are incompatible missing sparse matrix for given cluster tree invalid block cluster tree invalid vector wrong vector type invalid matrix To get detailed information about the error e g where it occurred inside H LiB you can use the following function Syntax hlib_error_desc char desc int size Arguments desc Character array to copy description to size Size of character array desc in bytes which returns a corresponding string By default only the error codes will be returned by functions in H LiB To also produce a corresponding message at
33. ion 29 3 4 Matrices Syntax double hlib_matrix_norm_inv_approx const matrix_t const matrix_t int info 3 4 5 Matrix I O Various matrix file formats are supported by H Lib which will be discussed in this section 3 4 5 1 SAMG The SAMG package see Fra defines a matrix file format for sparse matrices and vectors in a linear equations system namely the solution and the right hand side Im and exporting these objects is supported by H Lib Unfortunately H Lib is not able to handle coupled systems saved in this format A special property of the SAMG format is the distribution of the data onto several files Therefore not the exact name of the file storing a matrix has to be supplied to H LiB but the basename e g without the file suffix Syntax matrix_t hlib_samg_load_matrix const char int void hlib_samg_save_matrix const matrix_t const char int Arguments S Sparse matrix to store in SAMG format basename Name of the matrix file without suffix 3 4 5 2 Matlab For now H LiB only supports the Matlab V6 file format see Mat e g without compression for dense and sparse matrices All other Matlab data types e g cells and structures are not supported Furthermore H matrices can not be saved in the Matlab format Since a matrix in the Matlab file format is associated with a name this name has to be supplied to the corresponding I O functions C Language Bindings 30
34. k cluster tree When summing up H matrices this is done up to a given accuracy As usual this accuracy is block wise Sparse and dense matrices are always added exactly 33 8 5 Algebra Syntax void hlib_matrix_add const double const matrix_t const double matrix_t const double int void hlib_matrix_cadd const complex_t const matrix_t const complex_t matrix_t const double int Arguments A B Matrices to be added The result will be stored in B alpha beta Additional scaling factors for both matrices eps Block wise accuracy of the addition in the case of H matrices 3 5 1 3 Matrix Multiplication Matrix multiplication can only be performed with dense and H matrices Furthermore if the arguments are H matrices they have to have compatible cluster trees e g for the product the column cluster tree of A and the row cluster tree of B must be identical This also applies to the row cluster trees of and C as well as for the column cluster trees of B and C Otherwise the function exits with a corresponding error code The multiplication itself is defined as the update to a matrix C with additional scaling arguments C aAB BC The matrices A and B can also be transposed or conjugate transposed in the case of complex valued arithmetic C Language Bindings 34 Syntax void hlib_matrix_ mul const double const matop_t const matrix_t A const matop_t const matrix_t B const double matrix_t G const do
35. le vector_t x const double f int info void hlib_vector_cscale vector_t x const complex_t f int info Summing up to vectors is implemented in the more general form yY y Ax with vectors x and y and the constant a This operation is performed by the functions Syntax void hlib_vector_axpy vector_t const double const vector_t int i void hlib_vector_caxpy vector_t const complex_t const vector_t int ji Real and complex valued dot products can be computed with 19 3 3 Vectors Syntax double hlib_vector_dot const vector_t x const vector_t y int info complex_t hlib_vector_cdot const vector_t x const vector_t y int info And the euclidean norm of a vector is returned by the function Syntax double hlib_vector_norm2 const vector_t x int info 3 3 4 Vector I O In this section functions for reading and saving vectors from to files are discussed Beside it s own format H Lib supports several other vector formats 3 3 4 1 SAMG A special property of the SAMG format see Fra is the distribution of the data onto several files Therefore not the exact name of the file storing a vector has to be supplied to H Lib but the basename e g without the file suffix Furthermore only special vectors in a linear equations system are stored namely the right hand side and if available the solution of the system To read these vectors the following functions are a
36. ly apply to the problem under consideration Otherwise the H matrix technique might fail e g a desired accuracy is unreachable or the computational complexity is to high When in doubt use the standard settings After the partitioning of the block indexset in the form of the block cluster tree is computed the actual H matrix can be build For this adaptive cross approximation or ACA see Sec tion 3 4 2 is used in this example ACA needs a matrix coefficient function which computes the matrix entries for given index pairs In our case this function is given by 3 1 which after evaluating the integral translates into the following code void kernel int n int rowcl int m int colcl double matrix void arg int rowi colj double h double arg for colj 0 colj lt m colj int idxi colcl colj for rowi 0 rowi lt n rowi int idx0 rowcl rowi double value double d h fabs double idx0 idx1 1 0 if idx0 idxi value 1 5 h h h xh log h else value 1 5 h h 0 5 d 2 0 h d 2 0 h log d 2 0 h d 1 0 h d 1 0 h log d 1 0 h if fabs d gt 1e 8 value 0 5 d d log d matrix colj n rowi value 45 3 7 Examples The arguments n rowcl m and colc define a submatrix of dimension nxm with row and column indices stored in rowcl and colcl The coefficients should be stored in column wise ordering in ma
37. mples example both objects are constructed algebraically with the functions hlib_ct_build_alg_nd and hlib bct build 15 cluster_t ct hlib_ct_build_alg_nd S amp info CHECK_INFO 16 hlib_ct_print_ps ct crsalg_ct_nd ps amp info CHECK_INFO 17 blockcluster_t bct hlib_bct_build ct ct amp info CHECK_INFO 18 hlib_bct_print_ps bct crsalg_bct_nd ps amp info CHECK_INFO At line numbers 17 and 19 the two trees are printed to the files crsalg_ct_nd ps and crsalg bct nd ps respectively Now the sparse matrix S can be converted to an H matrix which then is factorised into LU factors with a blockwise precision of 1074 19 matrix _t A hlib_ matrix _build_sparse bct S info CHECK_INFO 20 hlib_matrix_print_ps A arsalg_A_nd ps MATIO_PATTERN info CHECK_INFO 21 hlib_matrix_inv_lu A 1e 4 amp info CHECK_INFO 22 hlib_matrix_print_ps A crsalg_LU_nd ps MATIO_SVD amp info CHECK_INFO 23 printf size of LU factor 2f MB n double hlib_matrix_bytesize A amp info 1024 0 1024 0 The matrix A is overwritten with its LU factorisation or to be precise with the inverse of its LU factorisation Both matrices are printed to PostScript files at the lines 21 and 23 In the case of the factorised matrix the singular value decomposition of the matrix is printed as chosen by the parameter MATIO_SVD The size of the LU factors is determined by hlib_matrix
38. n the rank of the approximation This reduced complexity is possible since only a minor part of all coefficients is used inside the algorithms Unfortunately this sometimes leads to errors in the approximation and hence both methods represent a heuristic approach In practise however at least ACA works quite well To guarantee a certain approximation one has to look at all coefficients of the matrix which then leads to a quadratic complexity in the size of the matrix block This algorithm is called ACAFull see BGH03 Although the approximation can be guaranteed the resulting rank due to ACAFull might not be minimal leading to an increase in the memory usage of the resulting H matrix Since an non optimal rank might also happen with ACA or ACA each approximation is truncated afterwards to ensure minimal memory overhead This truncation procedure has a computational complexity linear in the size of the matrix block and quadratic in the rank An optimal rank right from the beginning can be achieved with singular value decomposition or SVD Although this algorithm is not directly related to adaptive cross approximation it is included in H LiB Unfortunately SVD has a cubic complexity in the size of the matrix block and is therefore only applicable for small matrices After defining a coefficient function the following two routines can be used to construct a H matrix approximation to the corresponding dense matrix 25 3 4 Matrices
39. on edge common vertex separated triangles The quadrature points are build for each triangle individually where the triangle itself is the standard 2d simplex 0 1 0 0 1 0 Therefore you have to transform the computed coordinates to your triangles Computing the quadrature rules for equal triangles in done with C Language Bindings 38 Syntax void hlib_sauter_quadrature_eq const unsigned int order double tril_pts 2 double tri2_pts 2 double weights int info Arguments order Order of the quadrature tril pts tri2 pts Array where the 2d quadrature coordinates for both triangles are stored The array have to be of size 6 order weights Array of size 6 order holding the quadrature weights Similar defined are the functions for triangles with a common edge a common vertex or separated triangles Syntax void hlib_sauter_quadrature_edge const unsigned int order double trii_pts 2 double tri2_pts 2 double weights int info Arguments tril_pts tri2_pts weights Arrays of size 5 order Syntax void hlib_sauter_quadrature_vtx const unsigned int order double trii_pts 2 double tri2_pts 2 double weights int info Arguments tril_pts tri2_pts weights Arrays of size 2 order Syntax void hlib_sauter_quadrature_sep const unsigned int order double trii_pts 2 double tri2_pts 2 double weights int info Ar
40. onsider the integral equation 1 i log a yluly dy f a 0 1 with a suitable right hand side f 0 1 R We are looking for the solution u 0 1 R A standard Galerkin discretisation with constant ansatz functions y 0 lt i lt n 1 ze i al pita 42 TE lea O otherwise C Language Bindings 26 leads to a linear equation system with the matrix coefficients Qij 1 1 oa los le ue dite iti it J log x yldydx i j n integrate a i j n where integrate_a denotes a function which evaluates the integral All of this is put together in the following example There the coordinates of the indices are set at the centre of the support of each basis function The function coeff_fn evaluates integrate a at the given indices and writes the result into the given dense matrix Please note the access to D in column major form The actual H matrix is built using ACA which is the recommended variant of adaptive cross approximation void coeff_fn int n int rowcl int m int colcl double D void arg int de int n int arg for i 0 i lt n i for j 0 j lt m j D j n i integrate_al rowcllil colcl j n Hee void build_matrix int i info int n 1024 double x coord n double malloc sizeof double n cluster_t ct blockcluster_t bct matrix_t A for i 0 i lt n i coord i double malloc siz
41. or decomposing matrices e g Cholesky and LU factorisation and for solving linear equation systems with the Richardson CG BiCG Stab or GMRES iteration Furthermore it con tains methods for directly converting a dense operator into an H matrix without constructing the corresponding dense matrix by either using adaptive cross approximation or hybrid cross approximation see BG05 In addition to these mathematical functionality H LiB also has routines for importing and exporting matrices and vectors from to various formats e g SAMG Matlab PLTMG and the H Lib format Audience This documentation is intended for end users of H LiB e g users with basic knowledge in the field of H matrices who are interested in using H matrices for solving specific problems without caring about certain aspects of the implementation It is not intended for programmers who want to change the internal arithmetic of H matrices are want to implement new algorithms For this please refer to the H LiB Developer Man ual Conventions The following typographic conventions are used in this documentation CODE For functions and other forms of source code appearing in the document TYPES For data types e g structures pointers and classes FILES For files programs and command line arguments Furthermore several boxes signal different kind of information A remark to the correspond ing subject is indicated by Important information regar
42. or without nested dis section 11 3 2 Cluster Trees and Block Cluster Trees Syntax cluster_t hlib_ct_build_bsp const int const int double int cluster_t hlib_ct_build_bsp_ nd const int const int double kk const matrix_t int Arguments n Number of vertices dim Spatial dimension of the vertex coordinates coord Array of vectors of dimension dim containing the coordinates of the vertices Sparse matrix defining connectivity of the vertices The type of binary space partitioning can be changed by Syntax void hlib_set_bsp_type const bsp_t bsp Arguments bsp Defines the strategy used by the binary space partitioning algorithm and can be one of BSP AUTO Automatically decide suitable strategy This is the default BSP GEOM Partition such that sub clusters have an equal geometrical size BSP CARD Partition such that sub clusters have an almost equally sized indexset Consider the following example of a 1d problem over the interval 0 1 with 8 vertices 0 7 0 1 The cluster tree for this example is now build using standard binary space partitioning For this the coordinates of the indices have to be defined before the corresponding function is called double pos 8 1 0 0 0 125 0 25 0 375 0 5 0 625 0 75 0 875 1 0 double coord 8 amp pos 0 amp pos 1 amp pos 2 amp pos 3 amp pos 4 amp pos 5 amp pos 6 amp pos 7 cluster_t ctl hlib_ct_build_bs
43. p 8 1 coord amp info The resulting tree would then look as follows C Language Bindings 12 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 OD 43 65 6D 0 0 Q 3 4 5 6 7 Now the cluster tree shall be computed with binary space partitioning and nested dissection Here beside the coordinates the connectivity of the indices is also needed For this we assume that geometrically neighboured indices are also algebraically connected which would result in the following sparsity pattern of a corresponding matrix Ok x OX x OX x K xk For the definition of such a matrix please refer to Section 3 4 1 The corresponding source for this example is double pos 8 0 0 0 125 0 25 0 375 0 5 0 625 0 75 0 875 1 0 double coord 8 pos 0 pos 1 amp pos 2 pos 3 amp pos 4 amp pos 5 amp pos 6 amp pos 7 matrix_t S see chapters below cluster_t ct2 hlib_ct_build_bsp_nd 8 1 coord S info Due to the interface nodes chosen by the nested dissection part of the algorithm the result ing cluster tree has mostly a ternary structure In the following tree the coloured nodes correspond to the interface vertices 0 1 2 3 4 5 6 7 0 1 2 4 5 6 7 O 0 67 Functions for Algebraic Clustering If no geometrical
44. pecial data type typedef struct vector_s vector_t One reason for this is the usage of special vector types in parallel environments Furthermore this data type gives H LiB additional information to check the correctness of vectors e g the size 3 3 1 Creating and Accessing Vectors Although vectors in H Lib are not equal to arrays they can be defined by such data Syntax vector_t hlib_vector_import_array double unsigned int int vector_t hlib_vector_import_carray complex_t unsigned int int Arguments arr Array of size size size Size of the array C Language Bindings 16 The arrays are directly used by the vector data type i e changing the content of the arrays also changes the content of the vector This gives the possibility to efficiently access the vector coefficients as it is shown in the following example unsigned int n 1024 double arr double malloc sizeof double n vector_t X hlib_vector_import_array arr n amp info for i 0 i lt n i arr i i 1 The coefficient of the vector x would also equal 1 as does the array element arr i 1 Such kind of vectors e g scalar vectors can also be created directly by H Lib Syntax vector_t hlib_vector_alloc_scalar unsigned int size int info vector_t hlib_vector_alloc_cscalar unsigned int size int info Arguments size Size of the scalar vector Since no external C a
45. proximate method and due to efficiency the computational effort for computing the norm is restricted in H Lib the result of this procedure does not necessarily represent the exact spectral norm of the matrix Although in practise the convergence behaviour for most matrices is quite well Computing the Frobenius and the spectral norm for a single matrix is done by the following two functions Syntax double hlib_matrix_norm_frobenius const matrix_t A int info double hlib_matrix_norm_spectral const matrix_t A int info Furthermore if the matrix A is not ill conditioned the spectral norm of 47 can be obtained with Syntax double hlib_matrix_norm_spectral_inv const matrix_t A int info To compute the norm of the difference A Bl between two matrices A and B in the Frobenius or the spectral norm the functions Syntax double hlib_matrix_norm_frobenius_diff const matrix_t const matrix_t int info double hlib_matrix_norm_spectral_diff const matrix_t A const matrix_t B int info are available Here in the case of the Frobenius norm both matrices have to be of the same type and if they are H matrices defined over the same block cluster tree This does not apply to the spectral norm since it only relies on matrix vector multiplication Finally the approximation of the inverse of a matrix by another matrix B e g the norm Z ABI can be computed by the funct
46. r Arguments maxit Maximal number of iterations abs_red Absolute reduction of the l2 norm of the residual or negative if this reduction shall not be checked rel_red Relative reduction of the l2 norm of the residual compared to the initial norm of the residual or negative if this reduction shall not be checked Similarly for the CG and the BiCG Stab iteration the functions are Syntax solver_t hlib_solver_cg const int const double const double int solver_t hlib_solver_bicgstab const int const double const double int C Language Bindings 36 In the case of the GMRES Iteration an additional parameter is expected which describes the dimension of the local Krylov subspace e g when to restart Syntax solver_t hlib_solver_gmres const int restart const int maxit const double abs_red const double rel_red int iter Arguments restart Defines the number of iteration steps after which a restart is performed during the GMRES iteration e g the dimension of the constructed Krylov subspace This solver object can than be used with the actual solution functions Using the above iteration methods to solve without preconditioning is done by Syntax void hlib_solve const solver_t solver const matrix_t A vector_t dE const vector_t D int info Arguments solver Defines iteration method Xy D Define linear equation system If a preconditioner is available one can use
47. ranlib command to bless static archives enable debug disable debug enable opt disable opt enable prof disable prof Enable or disable compilation of H LiB with debugging optimisation or profiling flags Any combination of the above is possible By default H LiB will be compiled with debugging options cflags FLAGS cxxflags FLAGS Define the compiler flags for C and C files with cpuflags PATH Enable the usage of cpuflags which will determine appropriate compiler flags for the combi nation of compiler operating system and processor The optional argument PATH defines the correct path to cpuflags By default the supplied copy of cpuflags will be used lapack FLAGS Defines the linking flags for the LAPACK implementation needed by H LiB By default a modified copy of CLAPACK is used which can also be defined by specifying CLAPACK 5 2 3 Compilation with out x DIR x11 cflags FLAGS Turns on off X11 support in H LiB The optional argument DIR specifies the location of the X11 environment e g includes and libraries Alternatively you can define the compilation flags for X11 directly with out zlib DIR zlib cflags FLAGS zlib lflags FLAGS Turns on off zlib support in H LiB The optional argument DIR specifies the location of the zlib environment e g includes and libraries Alternatively you can define the compilation flags for zlib directly
48. rix_t A int info Again please remember to only use this function and not free to prevent undefined behaviour The memory usage of a specific function can be obtained by Syntax unsigned long hlib_matrix_bytesize const matrix_t A int info In some situations it might be necessary to access single matrix coefficients For this the following functions are available to return the entry Aij Syntax double hlib_matrix_entry_get const matrix_t const unsigned const unsigned int complex_t hlib_matrix_centry_get const matrix_t const unsigned const unsigned i int C Language Bindings 28 Remark For H matrices obtaining a single coefficient has the complexity O klog n where k is the maximal rank in the matrix and n the number of rows columns of A It should therefore only be used when absolutely necessary 3 4 4 Matrix Norms H LiB supports two basic norms for matrices the Frobenius and the spectral norm The Frobenius norm plays a crucial role in the approximation of each matrix block where the local accuracy is always meant with respect to the Frobenius norm of the local matrix The spectral norm e g the largest eigenvalue gives a better overview of the global approximation e g how good the computed approximate inverse compares to the exact inverse In the case of the spectral norm the largest eigenvalue is computed by using the Power iteration see GL96 Since this is also only an ap
49. rox imation or to a reduced computational efficiency of the H matrix arithmetic only change the default behaviour if you really know what you are doing To access the row and column cluster trees of a given block cluster tree one can use the functions Syntax cluster_t hlib_bct_row_ct const blockcluster_t bct int info cluster_t hlib_bct_column_ct const blockcluster_t bct int info which will return a copy of the corresponding cluster tree objects 3 2 2 2 Block Cluster Tree Management Functions A block cluster tree object is released by the function Syntax void hlib_bct_free blockcluster_t bct int info 15 3 3 Vectors which frees all resources associated with it This includes the row and column cluster trees if no other object uses them The memory footprint of an object of type block cluster tree can be determined by Syntax unsigned long hlib_bct_bytesize const blockcluster_t bct int info which returns the size in bytes 3 2 2 3 Block Cluster Tree I O The partitioning of the block index set over which a block cluster tree lives can be written in PostScript format by Syntax void hlib_bct_print_ps const blockcluster_t bct const char filename int info Arguments bct Block cluster tree to be printed filename Name of the PostScript file bct will be printed to 3 3 Vectors Instead of representing vectors by standard C arrays H LiB uses a s
50. rray is available to access the elements of the vectors this is accom plished by H Lib functions To get a specific element of a vector the following two functions can be used Syntax double hlib_vector_entry_get const vector_t unsigned int int complex_t hlib_vector_centry_get const vector_t unsigned int int Arguments x Vector to get element from Position of the element in the vector Similarly the setting of a vector element is defined 17 3 3 Vectors Syntax void hlib_vector_entry_set const vector_t unsigned int const double f int info void hlib_vector_centry_set const vector_t x unsigned int i const complex_t f int info Arguments X Vector to modify element in Position of the element to modify New value of the i th element in x Two more functions are available to change a complete vector First all elements can be set to a given constant value with Syntax void hlib_vector_fill vector_t x const double f int info void hlib_vector_cfill vector_t x const complex_t f int info Arguments x Vector to be filled with constant value Value to be assigned to all elements of the vector Furthermore the vector can be initialised to random values with Syntax void hlib_vector_fill_rand vector_t x int info 3 3 2 Vector Management Functions A copy of a vector is constructed by using the function Syntax vector_t hlib
51. s admissible low rank matrix blocks non admissible low rank matrix blocks see sparse matrix blocks no matrix block exists In addition special information about each block is printed For dense and sparse blocks the dimension is shown in the lower left corner For low rank blocks there the rank of the matrix is printed If SVD is chosen to be printed for each block the actual rank of the matrix is printed at the centre of each block For low rank matrices this is only shown if the SVD rank differs from the rank of the matrix An example for a sparse matrix and a dense matrix with and without SVD looks like Tha a Bi bo Mia bel E 12 n H 19 12 E 12 ta The actual PostScript image is finally produced by the function Syntax void hlib_matrix_print_ps const matrix_t const char const int int A filename options info Arguments A Matrix to be printed filename Name of the PostScript file A shall be printed to options Combination of MATIO options or 0 3 5 Algebra 3 5 1 Basic Algebra Functions 3 5 1 1 Matrix Vector Multiplication The multiplication of a matrix A with a vector x is implemented in H LiB as the update of the destination vector y as y Py aAz The special cases with 8 0 or a 0 are also handled efficiently Furthermore the multipli cation with the adjoint matrix 4 is supported by H LiB The type of opera
52. s information can be obtained by the function Syntax unsigned long hlib_ct_bytesize const cluster_t ct int info This function returns the size of the memory footprint in bytes 3 2 1 2 Cluster Tree Construction Two different methods are available to build a cluster tree The first algorithm is based on geometrical data associated with each index e g the position of the unknown and uses binary space partitioning to decompose the indexset If no geometry information is available the connectivity information between indices defined by a sparse matrix can be used in a purely algebraic method Furthermore both algorithms can be combined with nested dissection which introduces another level of separation between neighbouring indexsets and is especially suited for LU decomposition methods see Section 3 5 1 4 Functions for Geometrical Clustering Geometrical clustering is based on binary space partitioning which either is performed with respect to the cardinality or the geometrical size of the resulting sub clusters The detection of a separating interface between two neighbouring indexsets by the nested dissection technique is accomplished by the connectivity information defined by a sparse matrix and therefore does not need geometrical information Remark For the geometrical clustering only the coordinates of each index are needed not the grid itself The following two functions perform the geometrical clustering with
53. the console the verbosity level has to be increased to be at least 1 3 1 3 Data types Most data types in H Lib are defined as handles implemented by pointers to the actual data This applies to cluster trees Section 3 2 1 block cluster trees Section 3 2 2 matrices 9 3 2 Cluster Trees and Block Cluster Trees Section 3 4 and solvers Section 3 5 2 Although they are pointers a user must not use standard C functions like malloc or free to allocate or deallocate the associated memory see also Section 3 1 4 Since H Lib is also capable of handling complex arithmetic a special data type typedef struct double re im complex _t is introduced to allow the exchange of information between an application and H Lib Here re represents the real part of a complex number whereas im corresponds to the imaginary part Real valued data is represented by double 3 1 4 Reference Counting Most objects in H Lib e g cluster trees block cluster trees and matrices might be referenced by more than one variable As an example a block cluster tree always stores the defining row and column cluster trees see Section 3 2 2 To efficiently handle these references inside H Lib reference counting is used i e each object stores the number of references to it Due to this a copy operation is done by just increasing this reference counter without any further overhead This also means that you must use the functions provide
54. tion is chosen by a parameter of type C Language Bindings 32 typedef enum OP_NORM OP_ADJ matop_t where OP_NORM corresponds to Ax and OP_ADJ to A a The following two functions above computation Both routines can handle real and complex valued matrices and vectors The difference in the name only applies to the type of the constant factors Syntax void hlib_matrix_mulvec const double const matrix_t const vector_t const double vector_t const matop_t int i void hlib_matrix_cmulvec const complex_t const matrix_t const vector_t const complex_t vector_t const matop_t int Arguments A Sparse dense or H matrix to multiply with Argument vector of dimension hlibmatrix columns A if Az is performed and hlib matrix rows A in the case of A x Result vector of the multiplication of dimension hlib matrix rows A if Ax is performed and hlib matrix columns A in the case of A x alpha beta Scaling factors for the matrix vector product and the destination vector matop Defines multiplication with A or adjoint matrix of A 3 5 1 2 Matrix Addition The sum of two matrices A and B in H LiB is defined as B aA pB with a 6 R or a 6 C depending on using real or complex arithmetic The two matrices for the matrix addition have to be compatible i e of the same format For example it is not possible to add a sparse and a H matrix Furthermore H matrices have to be defined over the same bloc
55. trices Technical report Lecture note 21 MPI Leipzig 2003 J Dongarra and J Demmel LAPACK a portable high performance numerical li brary for linear algebra Supercomputer 8 33 38 1991 Fraunhofer SCAI http www scai fraunhofer de SAMG file format specification L Grasedyck and W Hackbusch Construction and Arithmetics of H Matrices Computing 70 295 334 2003 G H Golub and C F Van Loan Matriz Computations The Johns Hopkins University Press 3rd edition 1996 L Grasedyck Theorie und Anwendungen Hierarchischer Matrizen Dissertation University of Kiel 2001 W Hackbusch Iterative L sung gro er schwachbesetzter Gleichungssysteme B G Teubner Stuttgart 1993 W Hackbusch A sparse matrix arithmetic based on H matrices I Introduction to H matrices Computing 62 2 89 108 1999 The MathWorks http www mathworks com MAT File Format Version 7 Y Saad and M H Schultz GMRES generalized minimal residual algorithm for solving nonsymmetric linear systems SIAM J Comput 7 3 856 869 1986 S Sauter and C Schwab Randelementmethoden Analysen Numerik und Imple mentierung schneller Algorithmen Teubner Stuttgart 2004 H van der Vorst Bi CGSTAB A fast and smoothly converging variant of Bicg for the solution of nonsymmetric linear systems SIAM J Sci Stat Comput 13 631 644 1992 49 Index ACA 23 43 ACA 24 ACAFull 24 adaptive cross approximation 23 43 addition 3
56. trix The additional argument arg contains the h stepwidth of the discretisation Equipped with the coefficient function the code for construction an H matrix looks like 18 matrix_t A hlib_matrix_build_coeff bct kernel h LRAPX_ACAPLUS le 8 1 amp info CHECK_INFO 19 hlib_matrix_print_ps A bemid_A ps MATIO_SVD amp info CHECK_INFO 20 long bytesize hlib_matrix_bytesize A amp info 21 printf compression ratio 2f 2f MB compared to 2f MB n 100 0 double bytesize double n double n double sizeof double double bytesize 1024 0 1024 0 double n double n double sizeof double 1024 0 1024 0 ACA is chosen by the option LRAPX_ACAPLUS which defines the advanced version of ACA The coefficient function together with the stepsize form the second and third parameter to hlib matrix build_coeff Furthermore the blockwise accuracy of the H matrix approxima tion is set to 1078 Finally the second last parameter indicates the symmetry of the matrix A singular value decomposition of each matrix block is printed to the bemid_A ps at line 19 To determine the efficiency of the H matrix approximation in terms of memory usage the consumption of the matrix is calculated at line 20 and compared to a dense matrix at line 21 One could also compare the quality of the approximation due to the H matrix technique with the best approximation The
57. uble int info void hlib_matrix_cmul const complex_t const matop_t const matrix_t A const matop_t const matrix_t B const complex_t matrix_t C const double int info Arguments A B Dense or H matrices used as factors for the matrix multiplication matop_A matop_B Defines multiplication with A B or the corresponding adjoint matrices A and B Dense or H matrix containing the result of the matrix multiplication alpha beta Additional scaling arguments for the product and the destination matrix eps Block wise accuracy of the H arithmetic during the matrix multiplication 3 5 1 4 Matrix Inversion In H LiB the inverse of a matrix is computed using Gaussian elimination This method is implemented for dense and H matrices e g sparse matrices can not be inverted The following functions computes the corresponding inverse to the given matrix A which will be overwritten by the result Syntax void hlib_matrix_inv matrix_t iA const double eps int info Arguments A Dense or H matrix to be inverted A will be overwritten by the result eps Block wise accuracy of the H arithmetic during the inversion The quality of the computation can be checked by computing the spectral norm of J AB with B being the approximate inverse of A by using function hlib matrix inv approx see Section 3 4 4 An alternative algorithm is the LU decomposition of a matrix which allows the fast evalu ation
58. um Re eh ee ie 8 4 3 4 3 Matrix Management 344 Matrix Norms 0 44 4 4 4 a 4 ou du ue mue ha S45 Mam 6 yee A A A D ee ee Algebra e whe oe a BS ee ee ee ee EN a ee ee 3 0 1 Basic Algebra Functions s iia 64 64 kaa oa bee ER Re dure 3 5 2 Solving Linear Systems Miscellaneous Functions 3 6 1 Quadrature Rules 3 0 2 Measuring Time 245 dis onu ake e Oe tue Examples 4 4 fee ese ge EER RA eat pates Re BR ee ee 3 7 1 Sparse Linear Equation System _ OorwWwwww Co WON NNN 111 Contents iv 3 1 2 Integral Equation 4 sor da 2 62244 a be dada RG da 4 42 Bibliography 49 1 Preface H Lib is a software library implementing hierarchical matrices or H matrices for short This type of matrices first introduced in Hac99 provide a technique to represent various full matri ces in a data sparse format and furthermore allow usual matrix arithmetic e g matrix vector multiplication matrix multiplication and inversion with almost linear complexity Examples of matrices which can be represented by H matrices stem from the area of partial differential or integral equations For a detailed introduction into the topic of H matrices please refer to Hac99 Gra01 and GHO03 Beside the standard arithmetic mentioned above H LiB also provides additional algorithms f
59. vailable Syntax vector_t hlib_samg_load_rhs const char basename int info vector_t hlib_samg_load_sol const char basename int info Syntax void hlib_samg_save_rhs const vector_t const char int void hlib_samg_save_sol const vector_t const char int Due to the restrictions of the SAMG format the vector has to be of a scalar type 3 3 4 2 Matlab For now H LiB only supports the Matlab V6 file format see Mat e g without compression for dense and sparse vectors All other Matlab data types e g cells and structures are not supported Both types of vectors will be converted to scalar vector types upon reading e g sparse vectors become dense Conversely only scalar vectors can be saved in the Matlab format Since a vector in the Matlab file format is associated with a name this name has to be supplied to the corresponding I O functions C Language Bindings 20 Syntax vector_t hlib_matlab_load_vector const char const char int void hlib_matlab_save_vector const vector_t const char const char int Arguments v Scalar vector to be saved in Matlab format filename Name of Matlab file containing the vector vecname Name of the vector in the Matlab file 3 3 4 3 HLIB To be done 3 4 Matrices All matrices in H LiB are represented by the type typedef struct matrix_s matrix_t No difference is made between special matrix types e g sparse matrices dense matrices or
60. vailable in examples crsalg c A similar example with geometrical clustering can be found in examples crsgeom c Initialisation and Data Import At first in the example below H LiB is initialised and the data is imported from the corre sponding files C Language Bindings 40 1 include lt hlib c h gt 2 int 3 main int argc char argv 4 matrix t 5 5 vector_t b x 6 int n info 7 char mtx_datal samg_ matrix 8 hlib_init amp argc amp argv info CHECK_INFO 9 S hlib_samg_ load matrix mtx_data amp info CHECK_INFO 10 hlib_matrix_print_ps S crsalg_S ps MATIO_PATTERN amp info CHECK_INFO 11 b hlib_samg_load_rhs mtx_data amp info CHECK_INFO 12 x hlib_matrix_col_vector S amp info CHECK_INFO The sparse matrix is imported into the variable S at line 9 There only the basename of the SAMG file is provided without any suffix This matrix is afterwards printed to the file crsalg S ps in PostScript format see Section 3 4 5 4 As the option MATIO_PATTERN is given only the pattern of non zero matrix elements is printed After importing the matrix the right hand side b is read from the SAMG file at line 11 Again only the basename is given to identify the corresponding file The solution vector x is then constructed by asking the matrix S for a suitable vector see Section The macro CHECK_INFO looks at the value of the variable info after ea
61. xample was already used in Section 3 4 2 to demonstrate the conversion of dense matrices to H matrices Now it should be discussed in more detail in the context of a complete example 43 3 7 Examples Remark Source code for the complete example with additional output statements and timing of each function can be found in examples bemid c Remember that the following integral equation is considered 1 I log x ylu ydy f x 2 0 1 Here the function u 0 1 R for a given right hand side f 0 1 R The Galerkin discretisation uses constant ansatz functions y 0 lt i lt n x i itl TEI efi O otherwise This leads to a linear equations system with the matrix A defined by 1 1 aig f eo melo alestadduda 3 1 f f log s yldyas 2 E In order to represent the system matrix in the H matrix format one has to create a cluster tree and a block cluster tree For the cluster tree coordinate informations are necessary This leads to the following code include lt hlib c h gt int main int argc char argv 1 int info 2 int n 1024 3 double h 1 0 7 double n 4 cluster t Ct 5 blockcluster_t bct 6 hlib_init amp argc amp argv amp info CHECK_INFO 7 hlib_set_verbosity 2 8 double vertices double malloc n sizeof double 9 for int i 0 i lt n i 10 vertices i double malloc sizeof double 11 vertices i 0

Lib Lib - Fraunhofer SCAI

Contents

Download Pdf Manuals

Related Search

Related Contents