ROSE User Manual - ROSE compiler infrastructure

Contents

1. [8.5 Examples]

           // Functions required by the tree traversal mechanism
           MyInheritedAttribute EVALUATE_INHERITED_ATTRIBUTE_FUNCTION(
               SgNode* astNode,
               MyInheritedAttribute inheritedAttribute);

           MySynthesizedAttribute EVALUATE_SYNTHESIZED_ATTRIBUTE_FUNCTION(
               SgNode* astNode,
               MyInheritedAttribute inheritedAttribute,
               SubTreeSynthesizedAttributes synthesizedAttributeList);
       };

       // Functions required by the tree traversal mechanism
       MyInheritedAttribute
       MyTraversal::EVALUATE_INHERITED_ATTRIBUTE_FUNCTION(
           SgNode* astNode,
           MyInheritedAttribute inheritedAttribute)
       {
           // Note that any constructor will do
           MyInheritedAttribute returnAttribute;
           return returnAttribute;
       }

       MySynthesizedAttribute
       MyTraversal::EVALUATE_SYNTHESIZED_ATTRIBUTE_FUNCTION(
           SgNode* astNode,
           MyInheritedAttribute inheritedAttribute,
           SubTreeSynthesizedAttributes synthesizedAttributeList)
       {
           // Note that any constructor will do
           MySynthesizedAttribute returnAttribute;
           return returnAttribute;
       }

       int main(int argc, char* argv[])
       {
           // Main Function for default example ROSE Preprocessor.
           // This is an example of a preprocessor that can be built with ROSE.
           // This example can be used to test the ROSE infrastructure.
           SgProject* project = frontend(argc, argv);

           MyTraversal treeTraversal;
           MyInheritedAttribute inheritedAttribute;

           // Ignore the return value since we don't need it
           treeTraversal.traverseInputFiles(project, inheritedAttribute);

           return backen
2. [Chapter 2: Getting Started]

                               --with-boost-filesystem=boost_filesystem-gcc-mt
       --with-boost-wave[=special-lib]
                               use the Wave library from boost - it is possible to
                               specify a certain library for the linker e.g.
                               --with-boost-wave=boost_wave-gcc-mt-d-1_33_1
       --with-sqlite3=[ARG]    use SQLite 3 library [default=yes], optionally
                               specify the prefix for sqlite3 library
       --with-mysql=[ARG]      use MySQL client library [default=yes], optionally
                               specify path to mysql_config
       --with-alternate_backend_Cxx_compiler=<compiler name>
                               Specify an alternative C++ back-end compiler
       --with-alternate_backend_C_compiler=<compiler name>
                               Specify an alternative C back-end compiler
       --with-alternate_backend_fortran_compiler=<compiler name>
                               Specify an alternative fortran back-end compiler
       --with-xml-prefix=PFX   Prefix where libxml is installed (optional)
       --with-xml-exec-prefix=PFX
                               Exec prefix where libxml is installed (optional)
       --with-valgrind         Run uninitialized field tests that use Valgrind

   Figure 2.2: Example output from configure --help in ROSE directory (Part 2).

   Chapter 3 Writing a Source-To-Source Translator

   This chapter contains information about how to build ROSE translators. Numerous specific examples are in the ROSE Tutorial, a separate document from this ROSE User Manual.

   3.1 ROSE Tutorial

   The ROSE Tutorial contains additional
3. [Chapter 6: Query Library; 6.4 User Defined Functions]

   Functions that visit only the nodes in the memory pool that is specified in a VariantVector and return a std::list<SgNode*>:

       template<typename NodeFunctional>
       queryMemoryPool(NodeFunctional, VariantVector* = NULL);
       queryMemoryPool(roseFunctionPointerOneParameter, VariantVector* = NULL);
       queryMemoryPool(SgNode*, roseFunctionPointerTwoParameters, VariantVector* = NULL);
       queryMemoryPool(TypeOfQueryTypeOneParameter, VariantVector* = NULL);
       queryMemoryPool(SgNode*, TypeOfQueryTypeTwoParameters, VariantVector* = NULL);

   6.3 Predefined Queries

   For the convenience of the user, some common functions are preimplemented and can be invoked through an enum variable. There are two types of preimplemented queries: a TypeOfQueryTypeOneParameter and a TypeOfQueryTypeTwoParameters.

       enum TypeOfQueryTypeOneParameter {
           VariableDeclarations,
           VariableTypes,
           FunctionDeclarations,
           MemberFunctionDeclarations,
           ClassDeclarations,
           StructDeclarations,
           UnionDeclarations,
           Arguments,
           ClassFields,
           StructFields,
           UnionFields,
           StructDefinitions,
           TypedefDeclarations,
           AnonymousTypedefs,
           AnonymousTypedefClassDeclarations
       };

   A TypeOfQueryTypeTwoParameters requires an extra parameter of SgNode* type; for instance, TypeOfQueryTypeTwoParameters::ClassDeclarationNames takes a SgName* which represents the class name to look for.

       enum TypeOfQueryTyp
4. [5.15 How to Control Language-Specific Modes: C++, C, C99, UPC]

   .c filename extension). This implies conformance with the C89 standard (violations are issued as warnings).
      (c) ANSI C strict mode (-edg:A): This is the mode used when compiling with -edg:A in addition to the --edg:c or -rose:C_only options (file must have a .c filename extension). This implies conformance with the C89 standard (violations are issued as errors).
   3. C99 mode
      (a) ANSI C99 default mode: This is the mode used when compiling with --edg:c99 (file must have a .c filename extension). This implies conformance with the C99 standard. This is the same as using -rose:C99_only.
      (b) ANSI C99 strict mode: This is the mode used when compiling with -edg:a in addition to the --edg:c99 or -rose:C99_only options (file must have a .c filename extension). This implies conformance with the C99 standard (violations are issued as errors). Note that in ANSI C99, flexible array structures cannot be data members of other structures. See test2005_189.c for an example.
   4. UPC mode: This is the mode used when compiling with UPC-specific modifiers; use --edg:upc. Note that we have modified the EDG front end to support this mode for both C and C++ programs. The generated code does not support calls to a UPC runtime system at present, so this is just the mode required to support building the translator for C or C++ which would introduce the transformations required to cal
5. • Loop Optimizations (called by ROSE users)
   • Abstract Syntax Tree (AST) Attributes (tree decoration)
   • A family of AST Traversals (as used with Attribute Grammars)
   • AST Rewrite mechanisms
   • AST Query Mechanisms
   • C and C++ code generation from SAGE III
   • AST File I/O
   • Significant robustness for C, C99, and C++ (handles large DOE applications)
   • AST Visualization
   • and more ...

   SAGE III is an automatically generated body of software that uses ROSETTA, a tool we wrote and distribute with ROSE. ROSETTA is an IR generator that, as its largest and most sophisticated test, generates SAGE III.

   The connection code that was used to translate EDG's AST to SAGE II was derived loosely from the EDG C++ source generator and has formed the basis of the SAGE III translator from EDG to SAGE III's IR. Under this license we exclude the EDG source code and the translation from the EDG AST in distributions, and make available only a binary of those parts which use EDG (front end and AST translation) and the source to all of ROSE which does not depend on EDG. No part of the EDG work is visible to the user of ROSE. We can make the EDG source available only to those who have the free EDG research license.

   We want to thank the developers at Edison Design Group (EDG) for making their work so widely available under their research license program.

   Markus Schordan was the first post-doctorate researcher on the ROSE proj
6. [Chapter 7: AST Processing]

       #include "rose.h"

       class MyIndentLevel {
       public:
           MyIndentLevel() : level(0) {}
           unsigned int level;
       };

       class MyIndenting : public AstTopDownProcessing<MyIndentLevel> {
       public:
           MyIndenting();
           MyIndenting(unsigned int ts);
       protected:
           // Declared with the same signature as the definition in MyIndenting.C
           virtual MyIndentLevel evaluateInheritedAttribute(SgNode* astNode, MyIndentLevel inh);
       private:
           unsigned int tabSize;
       };

   Figure 7.4: Headerfile MyIndenting.h

       #include "MyIndenting.h"

       MyIndenting::MyIndenting() : tabSize(4) {}
       MyIndenting::MyIndenting(unsigned int ts) : tabSize(ts) {}

       MyIndentLevel
       MyIndenting::evaluateInheritedAttribute(SgNode* node, MyIndentLevel inh) {
           // Each basic block increases the nesting level passed down to its children
           if (dynamic_cast<SgBasicBlock*>(node)) {
               inh.level = inh.level + 1;
           }
           printspaces(inh.level * tabSize);
           cout << node->get_class_name() << endl;
           return inh;
       }

   Figure 7.5: Implementation file MyIndenting.C

       #include "rose.h"
       #include "MyIndenting.h"

       int main(int argc, char* argv[]) {
           SgProject* astNode = frontend(argc, argv);
           MyIndenting v;
           MyIndentLevel inh;   // initial inherited attribute (level 0)
           v.traverseInputFiles(astNode, inh);
       }

   Figure 7.6: Example main program MyIndentingMain.C

   7.6 AstBottomUpProcessing

   This class allows the use of synthesized attributes. The user needs to implement the function evaluateSynthesizedAttribute to compute a single return value from a list of synthesized attributes. Each element in the list is the result computed at one of the child nodes in the AST. The return value is the synthesized attribute value computed
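The two attribute directions described in the excerpt above (inherited attributes flowing down the tree, synthesized attributes computed from the children's results) can be sketched without ROSE. This is a minimal illustration under stated assumptions: `Node`, `printIndented`, `countNodes`, and `sampleTree` are hypothetical names that stand in for SgNode and the traversal classes; none of them is part of the ROSE API.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an AST node (not a ROSE type).
// A std::vector of the enclosing, still incomplete type requires C++17.
struct Node {
    std::string name;
    std::vector<Node> children;
};

// Top-down pass: the inherited attribute (the indentation level) is updated
// at a node and then handed to its children, as in the MyIndenting example.
void printIndented(const Node& node, unsigned level, unsigned tabSize, std::ostream& out) {
    assert(tabSize > 0 && "a zero tab size would hide the nesting");
    if (node.name == "SgBasicBlock") {
        level = level + 1;
    }
    out << std::string(level * tabSize, ' ') << node.name << '\n';
    for (const Node& child : node.children) {
        printIndented(child, level, tabSize, out);
    }
}

// Bottom-up pass: the synthesized attribute of a node (here, the size of its
// subtree) is computed from the values returned by its children.
std::size_t countNodes(const Node& node) {
    std::size_t total = 1;
    for (const Node& child : node.children) {
        total += countNodes(child);
    }
    return total;
}

// A three-node tree used for illustration only.
Node sampleTree() {
    Node ret{"SgReturnStmt", {}};
    Node block{"SgBasicBlock", {ret}};
    return Node{"SgFunctionDefinition", {block}};
}
```

Calling `printIndented(sampleTree(), 0, 4, std::cout)` prints the function definition at column 0 and the block and return statement indented by one level; `countNodes(sampleTree())` returns the subtree size. In ROSE itself the recursion is performed by the traversal classes, and only the per-node evaluation function is user code.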
7. 12.5.2 Modifications to the ELF File Header
   12.6 Usage
   14.2 Usage
   14.3 Variables
   14.4 Execution Walkthrough
   14.4.4 Create QMTest test and Execute Backend
   15.3 Easy Mistakes to Make (How to Ruin Your Day as a ROSE Developer)
   16.4 Check In Process
   16.5 Adding New SAGE III IR Nodes (Developers Only)
   16.10 Maintaining the ROSE Email List (casc-rose@llnl.gov)
   16.12 Avoidin
8. levels of loops simultaneously
   14.1 Backend and ROSE argument construction block
   14.4 Create .qmt and Execute Backend
   14.5 makefile before editing
   14.6 makefile after editing
   find . -name "*.qmt" output

   List of Tables

   Chapter 1 Introduction

   1.1 Why you should be interested in ROSE

   ROSE is a tool for building source-to-source translators. You should be interested in ROSE if you want to understand or improve any aspect of your software. ROSE makes it easy to build tools that read and operate on source code from large-scale applications (millions of lines). Whole projects may be analyzed and even optimized using tools built using ROSE.

   1.2 Problems that ROSE can address

   ROSE is a mechanism to build source-to-source analysis or optimization tools that operate directly on the source code of large-scale applications. Example tools that have been built include:
   • Array Class abstraction optimizer
   • Source-to-source instrumenter
   • Loop analyzer
   • Symbolic complexity analyzer
   • Code coverage tools
   • and many more ...

   Example tools that can be built include:
   • Custom optimization tools
   • Custom documentation generators
   • Custom analysis tools
   • Code pattern recognition tool
9. [Chapter 5: SAGE III Intermediate Representation]

   appears in the AST (and in the unparsed, generated code) as int x; int y; int z;. This feature could be changed at some point, but it has not been a high priority (and may be more desirable than the alternative).

   • Typedef template arguments are expressed in terms of their base type. This is not something that we can fix or change: EDG simply represents at least some (and maybe all) template arguments with their types normalized to strip away all typedefs. Fixing this would allow generation of code that is easier to verify visually. This may receive some attention in the future.

   • Comments within templates. Comments within templates are ignored and not reproduced in the generated source code. This is because the template code is held in the AST as a string generated by EDG, and EDG ignores the comments. We currently output the comments at either the top or bottom of the template declaration. Later, when the template declaration is represented as an AST, the comments will be folded into place where they belong.

   • Member functions of template instantiations. Member functions of template instantiations use the same IR node as templated member functions of templated classes and templated member functions of non-templated classes. This is because a SgTemplateInstantiationMemberFunctionDecl exists to store the pointer to the SgTemplateDeclaration, and there is only one of thes
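The note on typedef template arguments in the excerpt above concerns only how the generated code is spelled. A typedef is an alias, so stripping it from a template argument does not change which instantiation is named. The sketch below checks this with the standard library; `Index` and `sameInstantiation` are hypothetical names chosen for illustration, not names from ROSE or EDG.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical typedef used as a template argument.
typedef int Index;

// std::vector<Index> and std::vector<int> are one and the same type, so code
// generated with the typedef stripped still refers to the same instantiation.
constexpr bool sameInstantiation() {
    return std::is_same<std::vector<Index>, std::vector<int>>::value;
}

static_assert(sameInstantiation(), "a typedef does not create a new type");
```

What is lost by the normalization is readability, not meaning: the generated `std::vector<int>` compiles identically but no longer documents the author's intent that the elements are indices.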
10. [Chapter 12: Binary Analysis: Support for the Analysis of Binary Executables]

    NOTE: Do not modify section offsets and sizes by modifying the section table entries. Changes to these values will be overwritten with actual, current section offsets and sizes when the section table is unparsed.
    • SgAsmElfSectionTableEntry::set_sh_offset
    • SgAsmElfSectionTableEntry::set_sh_size
    • SgAsmElfSectionTableEntry::set_sh_addr

    NOTE: Do not modify segment offsets and sizes by modifying the segment table entries. Changes to these values will be overwritten with actual, current segment offsets and sizes when the segment table is unparsed.
    • SgAsmElfSegmentTableEntry::set_offset
    • SgAsmElfSegmentTableEntry::set_filesz
    • SgAsmElfSegmentTableEntry::set_vaddr
    • SgAsmElfSegmentTableEntry::set_memsz

    2a. Section/Segment resizing (low-level)

    The size of an ELF Section or Segment can be modified by calling SgAsmGenericSection::set_size (for file size) and set_mapped_size (for mapped memory). However, this is a low-level interface that doesn't take into account other sections in the same file. The preferred way to resize a section is with SgAsmGenericFile::shift_extend.

    NOTE: For many kinds of sections, making the section larger will create an unreferenced area ("internal hole") at the end of the section. Other sections will automatically do something with the new address space (e.g., SgAsmElfStringSection will add the new address space to its free list). 2
11. [Bibliography]

    Published in Proceedings of the SIAM Parallel Conference, Minneapolis, MN, March 1997.

    Balsara, D., Quinlan, D., "Parallel Object-Oriented Adaptive Mesh Refinement," Published in Proceedings of the SIAM Parallel Conference, Minneapolis, MN, March 1997.

    Quinlan, D., "AMR++: A Design for Parallel Object-Oriented Adaptive Mesh Refinement," Published in Proceedings of the IMA Workshop on Structured Adaptive Mesh Refinement, Minneapolis, MN, March 1997.

    Brislawn, K. D., D. L. Brown, G. S. Chesshire, W. D. Henshaw, K. I. Pao, D. J. Quinlan, W. J. Rider, and J. S. Saltzman, "An Object-Oriented Approach to Grid Generation and PDE Computations Using Adaptive, Moving, Overlapping Grids," Presented at 1996 Parallel Object-Oriented Methods and Applications Conference, Santa Fe, NM, February 29, 1996, and also at the Fifth International Conference on Numerical Grid Generation in Computational Field Simulations, Mississippi State, April 4, 1996.

    Bradley, Brislawn, Quinlan, Zhang, Nuri, "Wavelet subband coding of computer simulation output using the A++ array class library," Proceedings of Space Earth Science Data Compression Workshop, Snowbird, UT, March 1995.

    Parsons, R., and Quinlan, D., "A++/P++ Array Classes for Architecture Independent Finite Difference Computations," Proceedings of the Second Annual Object-Oriented Numeric
12.     #pragma pack()
        struct G {
        #pragma pack(2)
            unsigned short a;
        #pragma pack()
        };
        struct H {
        #pragma pack(4)
            unsigned short a;
        #pragma pack()
        };
        #pragma pack(4)
        struct I {
            unsigned short a;
        };
        #pragma pack()
        struct J {
            unsigned short a;
        };

    19. Expressions in C++ typeid() construct: Expressions within typeid() are sometimes normalized. This is an example of input code using the typeid() operator:

        #include <iostream>
        #include <typeinfo>
        using namespace std;

        struct A { virtual ~A() { } };
        struct B : A { };
        struct C { };
        struct D : C { };

        void foo() {
            B bobj;
            A* ap = &bobj;
            A& ar = bobj;
            cout << "ap: " << typeid(*ap).name() << endl;
            cout << "ar: " << typeid(ar).name() << endl;

            D dobj;
            C* cp = &dobj;
            C& cr = dobj;
            cout << "cp: " << typeid(*cp).name() << endl;
            cout << "cr: " << typeid(cr).name() << endl;

            cout << "expression: " << typeid(true && false).name() << endl;

            bool t, f;
            cout << "expression: " << typeid(t && f).name() << endl;

            int less, more;
            cout << "expression: " << typeid(less < more).name() << endl;
            // The operators in the next two operands were lost in extraction.
            cout << "expression: " << typeid(less more).name() << endl;
            cout << "expression: " << typeid(less more).name() << endl;
        }

    This is the associated output code using the typeid() operator (with some reformattin
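To make the typeid() operands in the input code above concrete, the following sketch checks which type each kind of operand reports. It is an illustration added under assumptions, not ROSE output: it assumes the stripped punctuation declared B as derived from A (polymorphic) and D as derived from C (non-polymorphic), and the three helper function names are hypothetical.

```cpp
#include <cassert>
#include <typeinfo>

struct A { virtual ~A() {} };   // polymorphic base
struct B : A {};
struct C {};                    // non-polymorphic base
struct D : C {};

// For a glvalue of polymorphic class type, typeid is evaluated at run time
// and reports the dynamic type of the object.
bool reportsDynamicType() {
    B bobj;
    A* ap = &bobj;
    A& ar = bobj;
    assert(ap != nullptr);      // typeid(*ap) on a null pointer would throw std::bad_typeid
    return typeid(*ap) == typeid(B) && typeid(ar) == typeid(B);
}

// For a non-polymorphic class, typeid reports the static type of the
// expression; the operand is not evaluated.
bool reportsStaticType() {
    D dobj;
    C* cp = &dobj;
    C& cr = dobj;
    return typeid(*cp) == typeid(C) && typeid(cr) == typeid(C);
}

// For other expressions, typeid reports the type of the expression itself.
bool reportsExpressionType() {
    bool t = true, f = false;
    int less = 1, more = 2;
    return typeid(t && f) == typeid(bool)
        && typeid(less < more) == typeid(bool)
        && typeid(less + more) == typeid(int);
}
```

Because only the type of a non-polymorphic operand matters, a front end may rewrite such an operand into an equivalent expression of the same type without changing the program's meaning, which is consistent with the normalization the excerpt describes.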
13. [Listing: AST memory pool statistics, memory consumption per IR node type; the node names and values are not recoverable.]
14. Christian Iwainsky (University of Erlangen, Germany), Lingxiao Jiang (University of California at Davis), Alin Jula (Texas A&M), Han Kim (University of California at San Diego), Milind Kulkarni (Cornell University), Markus Kowarschik (University of Erlangen, Germany), Gary Lee (University of California at Berkeley and Purdue University), Chunhua Liao (University of Houston), Ghassan Misherghi (University of California at Davis), Peter Pirkelbauer (Texas A&M), Bobby Philip (University of Colorado), Radu Popovici (Cornell University), Robert Preiss (xxx, Austria), Andreas Saebjornsen (University of Oslo, Norway), Sunjeev Sikand (University of California at San Diego), Andy Stone (Colorado State University at Fort Collins), Danny Thorne (University of Kentucky), Nils Thuerey (University of Erlangen, Germany), Ramakrishna Upadrasta (Colorado State University at Fort Collins), Christian Wiess (Munich University of Technology, Germany), Jeremiah Willcock (Indiana University), Brian White (Cornell University), Gary Yuan (University of California at Davis), and Yuan Zhao (Rice University).

    Friendly Users: Paul Hovland (Argonne National Laboratory), Brian McCandless (Lawrence Livermore National Laboratory), Brian Miller (Lawrence Livermore National Laboratory), Boyana Norris (Argonne National Laboratory), Jacob Sorensen (University of California at San Diego), Michelle Strout (Colorado State University), Bronis de Supinski (Lawrence Livermore
15. 5.7 Compiling ROSE-generated Code Using ROSE

    These are a few notes about parts that might be difficult if they are encountered in code generated by ROSE (meaning that they had to first appear in an application's source code and the user wanted to run the generated code through ROSE again; I can't imagine why). It is a rare but interesting possibility. There are only a few cases where we generate code that might be a problem to compile using ROSE. When compiling for g++ (default), ROSE generates code that will avoid specific bugs in g++.

    1. static const data members defined in the class definition (floats only). EDG accepts static and g++ supports const, and neither accepts what the other considers correct. ROSE generates code specific for the back end, and so the back end must be specified when running configure for ROSE. We don't currently support EDG as a back end (though we support Intel C++ as a back end, and they use EDG, so this should work).

    5.8 Correctness of AST

    When processing the AST (traversing it or rewriting it), it is useful to understand why things are the way they are in the AST's implementation. This section attempts to outline the properties that constitute the correctness of the AST.

    1. Null pointers in the AST. In general, any null-valued pointer is an error in the AST. This is a policy in SAGE III and is dramatically different from SAGE II. Our push for this policy has been incremen
16. The alternative is to build the AST fragments to be inserted directly using SAGE III and the constructors for its objects. In general, the direct construction of AST fragments is exceedingly tedious, and while aspects can be automated, the most extreme example of this automation is the AST construction from source code strings. A disadvantage is that the generation of the AST fragment from strings is slower, but it is only a compile-time issue.

    8.3.2 Using SAGE III Directly to Specify Transformations

    It is possible to build AST fragments directly using SAGE III and insert these into the AST. This alternative to the use of strings is more complex and is only briefly referenced in this section. The constructors for each of the SAGE III objects form the user interface required to build up the AST fragments. The documentation for these appears in the reference chapter of this manual. A few notes:

    1. Use the Sg_File_Info* Sg_File_Info::generateDefaultFileInfoForTransformationNode() static member function to generate the Sg_File_Info object required for each of the constructor calls. This marks the IR nodes as being part of a transformation and signals that they should be output within code generation (unparsing).

    8.4 AST Rewrite Traversal of the High-Level Interface

    The AST Rewrite Mechanism uses a traversal of the AST, similar to the design of a traversal using the AST Processing (Chapter 7) part of ROSE. The example code specifically shows a
17. because they contain keywords from C++ as variable names, etc. In order to compile these C language programs, their files must use .c (lower-case c) as a filename extension (suffix). This is an EDG issue related to the front-end parsing and the language rules that are selected, seemingly independent of the options specified to EDG and based partly on the filename suffix. Fortunately, most C language programs already use the lower-case c as a filename extension. Test code test2006_110.c demonstrates an example where the .c suffix is required.

    Chapter 6 Query Library

    6.1 Introduction

    This chapter presents techniques defined in ROSE for simple queries on the AST that don't require an explicit traversal of the AST to be defined. As a result, these AST queries are only a single function call and can be composed with one another to define composite queries using function composition. Built-in queries are defined to return AST IR nodes (node queries), strings (name queries), or numbers (number queries).

    Any query can optionally execute a user-defined function on a SgNode. This makes it easier to customize a query over a large set of nodes. Internally, these functions accumulate the results from the application of the user-defined function on each IR node and return them as an STL list (std::list<SgNode*>).

    There are three different types of queries in the NodeQuery mechan
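The accumulation step described in the query library introduction above (apply a user-defined function to every IR node and concatenate the lists it returns) can be sketched on a toy tree. This is an assumption-laden illustration, not the ROSE implementation: `Node`, `NodeList`, `queryTree`, and `functionDeclarations` are hypothetical names, and the real interface operates on SgNode pointers.

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR node (not a ROSE type); needs C++17
// because the vector's element type is still incomplete here.
struct Node {
    std::string kind;
    std::string name;
    std::vector<Node> children;
};

using NodeList = std::list<const Node*>;

// Applies userFunction to every node of the subtree in pre-order and
// concatenates the lists it returns into a single result list.
NodeList queryTree(const Node& root,
                   const std::function<NodeList(const Node&)>& userFunction) {
    assert(userFunction && "an empty std::function cannot be called");
    NodeList result = userFunction(root);
    for (const Node& child : root.children) {
        NodeList fromChild = queryTree(child, userFunction);
        result.splice(result.end(), fromChild);   // O(1) concatenation
    }
    return result;
}

// Example user-defined function: selects function declarations only.
NodeList functionDeclarations(const Node& node) {
    NodeList found;
    if (node.kind == "FunctionDeclaration") {
        found.push_back(&node);
    }
    return found;
}
```

A call such as `queryTree(tree, functionDeclarations)` returns pointers to every matching node in visiting order. Because the result is an ordinary list, a second query can be run on each returned node, which is one way to read the composition of queries mentioned in the text.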
18. .c as the only valid filename extension for the C language and .cc, .cp, .c++, .cpp, .cxx as the valid filename extensions for the C++ language. On case-insensitive systems, ROSE handles .c and .C as valid filename extensions for the C language, and .cc, .cp, .c++, .cpp, .cxx, .CC, .CP, .C++, .CPP, .CXX as valid filename extensions for C++. There are some inconsistencies in the filename handler, such as (1) not recognizing .CC, .CP, .C++, .CPP, .CXX as valid filename extensions for the C++ language on case-sensitive systems, and (2) not recognizing .CxX, .cPp, etc. as valid filename extensions for the C++ language on case-sensitive systems. The sole reason for the inconsistency is compatibility with GNU as well as EDG.

    15.5 IR Memory Consumption

    The Internal Representation is used to build the AST, and for large programs it can translate into a large number of IR nodes. Typically the total number of IR nodes is about seven times the number of lines of code (this seems to be a general rule, perhaps a bit more when templates are used more dominantly). The memory consumption of any one file is not very significant, but within support for whole-program analysis the size of the AST can be expected to be quite large. Significant sharing of declarations is made possible via the AST merge mechanisms. C and C++ have a One Definition Rule (ODR) that requires definitions to be the same across separate compilations of files intended to be linked into a single application. ODR
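The extension rules listed above can be written down as a small classifier. This sketch models only the lists in the excerpt and is not the ROSE filename handler; it assumes that the entry whose punctuation was stripped in extraction is `c++`, and the names `SourceLanguage` and `classifyExtension` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

enum class SourceLanguage { C, Cxx, Unknown };

// Classifies a filename extension given without its leading dot.
// On a case-insensitive system an all-upper-case spelling is treated like
// the lower-case one; mixed-case spellings (CxX, cPp) stay unrecognized.
SourceLanguage classifyExtension(std::string ext, bool caseSensitiveSystem) {
    assert(ext.find('.') == std::string::npos && "pass the extension without the dot");

    if (!caseSensitiveSystem) {
        const bool allUpper = std::all_of(ext.begin(), ext.end(), [](unsigned char ch) {
            return !std::isalpha(ch) || std::isupper(ch);
        });
        if (allUpper) {
            std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char ch) {
                return static_cast<char>(std::tolower(ch));
            });
        }
    }

    if (ext == "c") {
        return SourceLanguage::C;
    }
    // Assumed list: the "c++" entry is a reconstruction of a garbled token.
    static const char* const cxxExtensions[] = {"cc", "cp", "c++", "cpp", "cxx"};
    for (const char* candidate : cxxExtensions) {
        if (ext == candidate) {
            return SourceLanguage::Cxx;
        }
    }
    return SourceLanguage::Unknown;
}
```

For example, `classifyExtension("CPP", false)` yields `SourceLanguage::Cxx`, while the same extension on a case-sensitive system yields `SourceLanguage::Unknown`, which mirrors inconsistency (1) in the text.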
19. is not clear that it is required to observe namespaces, since the instantiation directive could reference fully qualified type names. This should be sufficient to resolve type ambiguity.

    5.6.5 Templates Details

    There are several details to enumerate:

    1. Comments in templates are removed. They are saved in the SAGE III AST, but likely in incorrect positions, and not within the template (before or after the template declaration). They are not lost, since they are retrieved using a separate lex pass internally within ROSE. When template declarations appear in AST form, they will be placed into the correct positions in the generated code.

    2. Options specific to templates can be classified as follows:
       • No transformations on templates. This is the first case to get working, and it is the easiest case (to some extent). Template instantiation can be handled entirely by the vendor compiler, making life simple for ROSE. We also don't have to generate any template specializations.
       • Transformations on templates. This case can be separated into two separate cases. The second is harder to handle than the first.
         - Transformations on template functions (including member functions). This case will force transformations to happen as the templated functions are instantiated (and could not happen at any earlier phase). The instantiation can happen at an earlier stage than prelinking if we force auto-instantiation of templates (often triggered automatically
20.     time (sec) = 0.567450
        fixup defining and non-defining declarations: time (sec) = 0.686581
        fixup symbol tables: time (sec) = 0.547633
        fixup global symbol table: time (sec) = 0.000000
        fixup local symbol tables: time (sec) = 0.547604
        fixup templateHandlingOptions: time (sec) = 0.546708
        mark transformations for output: time (sec) = 0.529240
        check the isModifiedFlag in each IR node: time (sec) = 0.130703
        AST Comment Processing: time (sec) = 0.020377
        AST Consistancy Tests: time (sec) = 9.429836
        AST Object Code Generation (backend): time (sec) = 0.756793
        AST Code Generation (unparsing): time (sec) = 0.009177
        AST Backend Compilation (SgProject): time (sec) = 0.744890
        AST Object Code Generation (compile output): time (sec) = 0.743146

    Chapter 16 Developer's Appendix

    16.1 Building ROSE from the Source Code Repository Checkout (for developers only)

    The instructions for building ROSE from SVN are a little more complex. A few GNU software build tools are required (they are not required for the user ROSE distribution, e.g. ROSE-0.9.3a.tar.gz). Required tools (note current dependencies):
    • autoconf: Autoconf version 2.53 or higher is required. Newer versions of Autoconf introduce experimental features that could also be problematic. The Autoconf development has not been particularly good at verifying compatibility with previous releases of their work. Some users have reported having to install version 2.53 specifically to use ROSE. Check the ROSE C
21.                             use the Thread library from boost - it is possible to
                                specify a certain library for the linker e.g.
                                --with-boost-thread=boost_thread-gcc-mt
        --with-boost-date-time[=special-lib]
                                use the Date_Time library from boost - it is possible
                                to specify a certain library for the linker e.g.
                                --with-boost-date-time=boost_date_time-gcc-mt-d-1_33_1
        --with-boost-regex[=special-lib]
                                use the Regex library from boost - it is possible to
                                specify a certain library for the linker e.g.
                                --with-boost-regex=boost_regex-gcc-mt-d-1_33_1
        --with-boost-program-options[=special-lib]
                                use the program options library from boost - it is
                                possible to specify a certain library for the linker
                                e.g.
                                --with-boost-program-options=boost_program_options-gcc-mt-1_33_1
        --with-boost-serialization[=special-lib]
                                use the Serialization library from boost - it is
                                possible to specify a certain library for the linker
                                e.g.
                                --with-boost-serialization=boost_serialization-gcc-mt-d-1_33_1
        --with-boost-system[=special-lib]
                                use the System library from boost - it is possible to
                                specify a certain library for the linker e.g.
                                --with-boost-system=boost_system-gcc-mt
        --with-boost-filesystem[=special-lib]
                                use the Filesystem library from boost - it is possible
                                to specify a certain library for the linker e.g.
22. 1 Introduction

    The loop transformation package implements all the algorithms published by Yi and Kennedy B6, including the transitive dependence analysis algorithm by Yi, Adve, and Kennedy. These algorithms automatically optimize the loop structures of applications for better performance. For now, the implementation aims only to improve the cache locality of applications running on a single-processor machine. In the future, it can be expanded to optimize parallel applications by maximizing the parallelism and minimizing the communication cost of loop structures 32 81 28.

    To optimize applications for better cache locality, this package applies the following loop transformations: interchange, fusion, fission (or distribution), and blocking (or tiling). The implementation can successfully optimize arbitrary loop structures, including complex, non-perfect loop nests such as the one from LU factorization with no pivoting in Figure. The following examples illustrate the effect of applying the transformations.

    Figure uses pseudo code of matrix multiplication to illustrate the effect of applying the package to optimize perfect loop nests. The original code is in (a). After performing dependence analysis on this loop nest, the package applies the loop interchange transformation to improve the data reuse in caches (note that in C/C++ the matrix is stored in row-major order). The transformed code is shown in (b). The cache locality of this code can be
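Since the figure with the matrix multiplication code did not survive in this excerpt, the effect of loop interchange on a row-major matrix can be sketched as follows. The loop orders shown (i, j, k before and i, k, j after) are an assumption chosen for illustration and are not necessarily the orders used in the manual's figure; `Matrix`, `multiplyIJK`, and `multiplyIKJ` are hypothetical names, and the code is hand-written rather than produced by the loop transformation package.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// An n x n matrix in row-major order: element (i, j) lives at index i * n + j.
using Matrix = std::vector<double>;

// Loop order (i, j, k): the innermost loop reads b down a column, so
// consecutive iterations touch addresses that are n elements apart.
Matrix multiplyIJK(const Matrix& a, const Matrix& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
    return c;
}

// After interchanging the j and k loops (order i, k, j): the innermost loop
// walks b and c along a row with stride 1, which reuses each cache line.
Matrix multiplyIKJ(const Matrix& a, const Matrix& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
    return c;
}
```

Both versions add the products for a given element of c in the same order of k, so they return identical results; only the memory access pattern differs. Establishing that such a reordering is legal is the job of the dependence analysis mentioned in the text.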
23. 3. execute the reconstructed original backend compile line of the project's build system. The script qm.sh exits with the same code as the exit status of the backend process.

    14.5 Example

    The following example edits a trivial makefile and builds QMTest files with qm.sh. By inserting the qm.sh wrapper before each instance of g++ (in this case), it is possible to generate .qmt test files. The modified makefile is shown below. After the edits have taken place, it is evident that qm.sh wraps around each compile line of the makefile. The arguments to qm.sh are themselves encompassed by the variable MYCC, leaving minimal edits to the makefile itself. The makefile may now be run with make, and the project will be made along with all the QMTest .qmt files.

    14.6 Running the Tests

    This section describes how to collect and run the tests created by qm.sh after building the project with an edited build system. When the project has completed building, the QMTest files will most likely be scattered across

        CXX = g++
        CFLAGS = -g -Wall

        CPU.out: main.o registers.o reader.o decoder.o
                $(CXX) $(CFLAGS) -o CPU.out reader.o registers.o decoder.o main.o

        main.o: main.c registers.h reader.h decoder.h instruction.h
                $(CXX) $(CFLAGS) -c main.c -o main.o

        registers.o: registers.c registers.h main.h
                $(CXX) $(CFLAGS) -c registers.c -o registers.o

        reader.o: reader.c reader.h instruction.h
                $(CXX) $(CFLAGS) -c reader.c
24. 96 (needed ONLY for development versions). Automake is a tool for automatically generating Makefile.in files compliant with the GNU Coding Standards.
    • Libtool version > 1.5.6 (needed ONLY for development versions).

    To simplify the descriptions of the build process, we define:
    • Source Tree: Location of the source code directory (there is only one source tree).
    • Compile Tree: Location of compiled object code and executables. There can be many compile trees, representing either different:
        - configure options,
        - compilers used to build ROSE and ROSE translators,
        - compilers specified as backends for ROSE (to compile ROSE-generated code), or
        - architectures.

    We strongly recommend that the Source Tree and the Compile Tree be different. This avoids many potential problems with the make clean rules. Note that the Compile Tree will be the same as the Source Tree if the user has not explicitly generated a separate directory in which to run configure and compile ROSE. If the Source Tree and Compile Tree are the same, then there is only one combined Source/Compile Tree. Alternatively, numerous different Compile Trees can be used from a single Source Tree. More than one Compile Tree allows ROSE to be generated on different platforms from a single source (either a generated distribution or checked out from SVN). ROSE is developed and tested internally using separate Compile Trees.

    Use of Optional Software: More functionality wi
25. [Table residue: columns of an "AST Memory Pool Statistics" listing from the section on IR memory consumption; the values are not recoverable from this fragment.]
26. Analysis
Global call graphs are available; examples are in the ROSE Tutorial.

9.1.2 C++ Class Hierarchy Graph Analysis
Class hierarchy graphs are available; examples are in the ROSE Tutorial.

9.1.3 Control Flow Graphs
Control flow graphs exist in two forms: one that is closely tied to the AST and another that is separate from the AST. See the ROSE Tutorial for examples of how to use these.

9.1.4 Dependence Analysis
Complete use-def chains are available; the ROSE Tutorial shows examples of how to access this information.

CHAPTER 9 PROGRAM ANALYSIS

9.1.5 Open Analysis
The Open Analysis project provides a connection to ROSE and permits the use of its pointer analysis with ROSE. More details on Open Analysis, and a reference, are given later.

9.1.6 More Program Analysis
Current work and collaborations will hopefully support a significant expansion of the program analysis supported within ROSE. We are working with a number of groups on pointer analysis, abstract interpretation, etc.

9.2 Database Support for Global Analysis
The purpose of database support in ROSE is to provide a persistent place for the accumulation of analysis results. The database used within ROSE is the publicly available SQLite relational database. Work has been done to provide a simple and extensible interface to SQLite. The demonstration and testing of the ROSE database mechanism has been supported through the construction of the call graph and class hierarchy
27. Executing a translator built with ROSE is just like running a compiler, with the compiler name changed to the name of the translator executable. All the command line arguments except ROSE-specific and EDG-specific options are internally handed to the backend compiler; additional command line options required for the EDG front end are specified for the frontend along with any EDG-specific options (e.g. --edg:no_warnings). All ROSE- and EDG-specific options are stripped from the command line that is passed to the backend compiler for final compilation of the ROSE-generated code, so as not to confuse the backend compiler. Figure 3.4.2 shows the execution of a test code through an example translator.

3.4.3 Example Output from a ROSE Translator
The figure shows the output of the processing through the translator.

Simple Makefile To Compile exampleT
Example Makefile for ROSE users. This makefile is provided as an example of how to use ROSE when ROSE is installed (using "make install"). This makefile is tested as part of the "make distcheck" rule (run as part of tests before any CVS checkin). The test of this makefile can also be run by using the "make installcheck" rule (run as part of "make distcheck").
Location of include directory after "make install": ROSE INCLUDE_DIR home liao6 daily test rose 20081014_
28. However, a traversal of the AST will include all the items in the include files, and an alternative traversal will allow you to traverse only the input file and skip all other files (e.g. header files). We don't have a database, unless you consider the AST as a database in memory. For the case of iostream this will be large, but that is what your program really is, so that is how it has to be represented; such details are important for type analysis, and that trickles into every other part of analysis (especially for C++). The sharing is part of the support for whole-program analysis (global analysis), and it prevents redundant parts of the code (e.g. declarations, namespaces, etc.) from being represented more than once when handling many files across translation units (tens, hundreds, or thousands).

Preprocessing: you mention that ROSE can refer to code locations as they are before preprocessing, although it inputs preprocessed files. So where exactly do you get the fine-grained row/column info from, if you only see the preprocessed files? I assume you use #line directives, but is this really enough (e.g. in the presence of whitespace removal by some preprocessors)?

The frontend of EDG includes CPP, and thus it reports source code positions before the CPP translation; thus we get and save this information. For Fortran we have to handle the CPP translation more explicitly, and so we only have the source position after translation, but Fortran is always a bit
29. II introduced the use of the EDG front end and dropped the handling of Fortran, but its work was incomplete.

5.1.2 Difference Between SAGE II and SAGE III
The SAGE III IR is now completely generated using the ROSETTA IR generator tool (a source code generation tool), which we developed to support our work within ROSE. Initial versions of SAGE II were well done, but not complete. Numerous details were addressed in the work on SAGE II as part of its preparation for use within ROSE. We are very thankful to the initial developers of SAGE II for all their work. SAGE III hopefully fulfills a number of the goals behind their work. SAGE III continues to use the EDG frontend, has updated the versions of EDG in use over SAGE II, and has separated out the EDG work so that the connection of SAGE III to EDG is easier to maintain and update in the future (with new versions of EDG).

CHAPTER 5 SAGE III INTERMEDIATE REPRESENTATION

5.1.3 Differences Between SAGE III and ROSE
ROSE uses SAGE III internally and adds numerous more sophisticated mechanisms. For example, ROSE adds:
• Attribute mechanisms for use within traversals (ideas borrowed from attribute grammars).
• A sophisticated AST rewrite mechanism to simplify the development of transformations.
• A more sophisticated persistent attribute mechanism.
• Loop analysis and optimization (loop fusion, fission, blocking, etc.).
• Operators for conversion of AST subtrees to strings and
30. Make in Parallel
ROSE uses general Makefiles and is not dependent on GNU Make. However, GNU Make has an option to permit compilation in parallel, and we support this. Thus you may use make with the -j<n> option if you want to run make in parallel (a good value for n is typically twice the number of processors in your computer). We have paid special attention to the design of the ROSE makefiles to permit parallel make to work; we also use it regularly within development work.

2.2.6 Installing ROSE
Installation (using make install) is optional, but suggested. Users can simplify their use of ROSE by using it from an installed version of ROSE. This permits compilation using a single include directory and the specification of only two libraries, as in:

    g++ -I<install dir>/include -o executable executable.C -L<install dir>/lib -lrose -ledg -lm $(RT_LIBS)

See the example makefile in ROSE/exampleTranslators/documentedExamples/simpleTranslatorExamples/exampleMakefile in Section for exact details of building a translator on your machine (set up by configure and tested by make installcheck). Note that the tutorial example codes are also tested by make installcheck, and the example makefile there can also serve as an example. autoconf uses /usr/local as the default location for all installations. Only root has write privileges to that directory, so you will likely get an error if you have not overridd
31. National Laboratory), Chadd Williams (University of Maryland), Beata Winnicka (Argonne National Laboratory), Ramakrisna xxx (Colorado State University at Fort Collins), and Andy Yoo (Lawrence Livermore National Laboratory).
Support: Steve Ashby, David Brown, Bill Henshaw, Bronis de Supinski, and CASC management.
Funding: Fred Johnson (Department of Energy, DOE) and Mary Zosel (Lawrence Livermore National Laboratory).
To be clear, nobody is to blame for the poor state of the current version of the ROSE documentation but myself.

Contents (only the entries that survived extraction are listed)
1 Introduction
1.2 Problems that ROSE can address
1.6 ROSE as a Compiler Framework
1.9 About This Manual
2 Getting Started
2.1 ROSE Documentation and Where to Find It
2.2.3 Building ROSE From a Distribution (ROSE-0.9.3a.tar.gz)
2.2.5 Running GNU Make in Parallel
2.2.6 Installing ROSE
32. ROSE_DEPRECATED_FUNCTION after the function declaration and before the closing ";". As in:

    void old_great_idea_function() ROSE_DEPRECATED_FUNCTION;

• Deprecated data members: Use the macro ROSE_DEPRECATED_VARIABLE to specify that a data member or variable is to be deprecated. This is difficult to do because data members of the IR are all automatically generated and thus can't be edited in this way. Where a data member of the IR is to be deprecated, it should be specified explicitly in the documentation for that specific class (in the ROSE/docs/testDoxygen directory, which is the staging area for all IR documentation; definitely not in the ROSE/src/frontend/SageIII/docs directory, which is frequently overwritten). See details on how to document ROSE (Doxygen Related Pages).

    void old_great_idea_data_member ROSE_DEPRECATED_VARIABLE;

16.8 Code Style Rules for ROSE
I don't want to constrain anyone from being expressive, but we have to maintain your code after you leave, so there are a few rules:
1. Document your code. Explain every function, and use variable names that clearly indicate the purpose of the variable. Explain what the tests are in your code (and where they are located). Write test codes to test your code (these are assembled in the ROSE/tests directory or subdirectories of ROSE/tests/roseTests). Use assertions liberally; use boolean-valued arguments to ROSE_ASSERT(<expression>). Use of ROSE_ASSERT true false for er
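The placement of the deprecation macros described above can be illustrated with a small stand-alone sketch. The SKETCH_* macro names and their definitions are assumptions made for illustration only (this excerpt does not show how ROSE_DEPRECATED_FUNCTION and ROSE_DEPRECATED_VARIABLE are actually defined); the sketch assumes a GCC-compatible compiler, where a deprecated attribute placed after the declarator produces a warning at each use.

```cpp
#include <cassert>

// Hypothetical stand-ins for the ROSE macros; the real definitions may differ.
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_DEPRECATED_FUNCTION __attribute__((deprecated))
#  define SKETCH_DEPRECATED_VARIABLE __attribute__((deprecated))
#else
#  define SKETCH_DEPRECATED_FUNCTION
#  define SKETCH_DEPRECATED_VARIABLE
#endif

// The macro goes after the declaration and before the closing ';'.
int old_great_idea_function() SKETCH_DEPRECATED_FUNCTION;
int new_great_idea_function();

struct ToyIRNode {
    // Same placement for a data member.
    int old_great_idea_data_member SKETCH_DEPRECATED_VARIABLE;
    int replacement_data_member;
};

int new_great_idea_function() { return 42; }

// The deprecated function simply forwards to its replacement.
int old_great_idea_function() { return new_great_idea_function(); }
```

Callers of old_great_idea_function() still compile but are warned, which gives users time to migrate before the function is removed.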
34. Whole Program Analysis Support

11.1 Introduction
The AST merge support in ROSE is a mechanism to generate a single binary file representing the AST for a whole program that could consist of thousands of files. A focus in this work has been on the scaling required to handle realistic large-scale laboratory applications.

11.2 Usage
See the tutorial for an example.

CHAPTER 11 AST MERGE: WHOLE PROGRAM ANALYSIS SUPPORT

Chapter 12 Binary Analysis: Support for the Analysis of Binary Executables

12.1 Introduction
ROSE supports the disassembly and analysis of binary executables for x86 and ARM instruction sets. ROSE implements this support as part of general research work to support combining analysis for source code and analysis for binaries.

12.2 The Binary AST

12.2.1 The Binary Executable Format
ROSE handles Linux and Windows binary formats; thus the ELF format for Linux and the PE, NE, LE, and DOS formats for Windows. The details of each format are represented in IR nodes in the AST, using structures common to the representation of such low-level data. About 60 IR nodes have been added to ROSE to support the binary executable formats; this support allows the analysis of any Linux, Windows, OS/2, or DOS binary. The binary executable file format can be analyzed separately from the instructions using the command line option -rose:read_executable_file_format_only; this allows graphs generated using the ROSE visualization mechanis
35. [Figure residue: AST graph with nodes such as SgFunctionParameterList, SgFunctionDefinition, SgBasicBlock, SgVariableDeclaration, SgWhileStmt, and SgReturnStmt; the node layout is not recoverable.]
CHAPTER 7 AST PROCESSING
Figure 7.11: Numbers at nodes show the order in which the function evaluateSynthesizedAttribute is called in a bottom-up processing.
7.11 VISUALIZATION
[Figure residue: a second AST graph rooted at SgGlobal; its caption is not part of this fragment.]
36. at this node and passed upwards in the AST.

    template <class SynthesizedAttributeType>
    class AstBottomUpProcessing {
    public:
        SynthesizedAttributeType traverse(SgNode* node);
        SynthesizedAttributeType traverseWithinFile(SgNode* node);
        void traverseInputFiles(SgProject* projectNode);
        typedef ... SynthesizedAttributesList;
    protected:
        virtual SynthesizedAttributeType evaluateSynthesizedAttribute(SgNode* astNode, SynthesizedAttributesList synList) = 0;
        virtual SynthesizedAttributeType defaultSynthesizedAttribute();
    };

The type SynthesizedAttributesList is an opaque typedef that in most cases behaves like a Standard Template Library (STL) vector of objects of type SynthesizedAttributeType; in particular, it provides iterators and can be indexed like a vector. The main difference from vectors is that no operations for inserting or deleting elements, or otherwise resizing the container, are provided. These should not be necessary, as the list of synthesized attributes is only meant to be read, not modified. Using an iterator to operate on the list is necessary when the number of child nodes is arbitrary. For example, in a SgBasicBlock the number of SgStatement nodes that are child nodes ranges from 0 to n, where n = synList.size(). For AST nodes with a fixed number of child nodes, these values can be accessed by name, using enums defined for each AST node class. The naming scheme for attribute access is <CLASSNAME>_<MEMBERVARIABLENAME>. The met
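The bottom-up interface described above can be illustrated with a toy, stand-alone sketch. Nothing here is the ROSE implementation: ToyNode, ToyBottomUpProcessing, NodeCounter, and countNodesInSampleTree are hypothetical names, and a std::vector stands in for the opaque SynthesizedAttributesList. The sketch only shows the idea that each node's synthesized attribute is computed from the attributes already computed for its children.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy stand-in for an AST node; this is NOT the ROSE SgNode class.
struct ToyNode {
    std::vector<std::unique_ptr<ToyNode> > children;
};

template <class SynthesizedAttributeType>
class ToyBottomUpProcessing {
public:
    // Assumption: a plain vector replaces the opaque list type.
    typedef std::vector<SynthesizedAttributeType> SynthesizedAttributesList;
    virtual ~ToyBottomUpProcessing() {}

    // Visit the children first, then evaluate the node itself.
    SynthesizedAttributeType traverse(const ToyNode* node) {
        if (node == nullptr) return defaultSynthesizedAttribute();
        SynthesizedAttributesList synList;
        for (const auto& child : node->children)
            synList.push_back(traverse(child.get()));
        return evaluateSynthesizedAttribute(node, synList);
    }

protected:
    virtual SynthesizedAttributeType evaluateSynthesizedAttribute(
        const ToyNode* node, const SynthesizedAttributesList& synList) = 0;
    virtual SynthesizedAttributeType defaultSynthesizedAttribute() {
        return SynthesizedAttributeType();
    }
};

// Example attribute: the number of nodes in the subtree.
class NodeCounter : public ToyBottomUpProcessing<int> {
protected:
    int evaluateSynthesizedAttribute(const ToyNode*,
                                     const SynthesizedAttributesList& synList) override {
        int count = 1;  // this node
        // Iterate, since the number of children is arbitrary.
        for (SynthesizedAttributesList::const_iterator it = synList.begin();
             it != synList.end(); ++it)
            count += *it;
        return count;
    }
};

// Builds a root with two children and one grandchild, then counts the nodes.
int countNodesInSampleTree() {
    ToyNode root;
    root.children.push_back(std::unique_ptr<ToyNode>(new ToyNode()));
    root.children.push_back(std::unique_ptr<ToyNode>(new ToyNode()));
    root.children[0]->children.push_back(std::unique_ptr<ToyNode>(new ToyNode()));
    NodeCounter counter;
    return counter.traverse(&root);
}
```

The list of child attributes is only read, never resized, inside evaluateSynthesizedAttribute, which matches the read-only intent stated for SynthesizedAttributesList.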
37. defined functions return a std::list<std::string>, and the C-style functions take a std::string as a second parameter. The predefined functions implemented in this interface are:

    namespace NameQuery {
        enum TypeOfQueryTypeOneParameter {
            VariableNames,
            VariableTypeNames,
            FunctionDeclarationNames,
            MemberFunctionDeclarationNames,
            ClassDeclarationNames,
            ArgumentNames,
            ClassFieldNames,
            UnionFieldNames,
            StructFieldNames,
            FunctionReferenceNames,
            StructNames,
            UnionNames,
            TypedefDeclarationNames,
            TypeNames
        };
        enum TypeOfQueryTypeTwoParameters {
            VariableNamesWithTypeName
        };
    }

CHAPTER 6 QUERY LIBRARY

6.6 Number Queries
The number query provides exactly the same interfaces as the NodeQuery, except for two differences: the user-defined functions return a std::list<int>, and the C-style functions take an int as a second parameter. The predefined functions implemented in this interface are:

    namespace NumberQuery {
        enum TypeOfQueryTypeOneParameter {
            NumberOfArgsInConstructor,
            NumberOfOperands,
            NumberOfArgsInScalarIndexingOperator
        };
        enum TypeOfQueryTypeTwoParameters {
            NumberOfArgsInParanthesisOperator
        };
    }

Chapter 7 AST Processing

7.1 Introduction
ROSE aids the library writer by providing a traversal mechanism that visits all the nodes of the AST in a predefined order and computes attributes. Based on a fixed traversal order, we provide inherited attributes for passing information down the AST (top-down proces
38. details and the steps used in examples of increasing sophistication. The ROSE Tutorial also explains a number of useful features of ROSE, including:

AST Traversals: There are a number of different kinds of traversals, including a classic object-oriented visitor pattern and a more generally useful traversal mechanism that supports a single visit function. Each traversal can operate either on just those IR nodes that have positions in the source file (non-shared, typically statements and expressions) or over all IR nodes (shared and non-shared).

AST Queries: The ROSE Tutorial demonstrates the ROSE AST query mechanism and how to build more complex user-defined queries.

PDF Output of AST: ROSE includes a number of ways to visualize the AST to support debugging and AST construction (i.e. how specific C++ examples map to the IR). A PDF representation of the AST permits the hierarchy of bookmarks to index the tree structure of the AST. This technique works on large-scale ASTs (typically a 300K-node AST from a 40K-line source code will define a 400 MB PDF file).

DOT Output of AST: For smaller ASTs (less than 100K nodes), the AST can be viewed as a DOT graph. For very small ASTs the graph can be converted to postscript files, but for larger graphs (500 IR nodes) special dot viewers are required (e.g. zgrviewer).

AST Rewrite Mechanism: The ROSE Tutorial shows examples of how to use a range of AST rewrite mechanisms for supporting program transform
39. doing so will have no effect on the currently defined sections; they will continue to use the original string table for their names.

12.5.3 Modifications to ELF String Tables and their Containing Sections
1. Move/Extend: See SgGenericFile::shift_extend. When a string table is extended, the new address space is added to the table's free list.
2. New String: A new string can be created by calling the SgAsmStoredString allocator and passing a string table (something derived from SgAsmGenericStrtab) and the initial string value. The string is not actually allocated space in the file until the new file is unparsed or until someone calls SgAsmStoredString::get_offset.

12.5 STATIC BINARY REWRITING

3. Value modification: A string can be modified by assigning a new value via SgAsmStoredString::set_string. Storage is not allocated for the new value until the AST is unparsed or someone calls SgAsmStoredString::get_offset. The previous value is freed.
4. Shared strings: Three forms of sharing are supported:
(1) Two objects (section names, symbol names, etc.) share the same string, and changing one string causes the other to change as well. This kind of sharing is not typically encountered in ELF, although the underlying string table classes support it.
(2) Two objects have independent strings that happen to have the same value and point to the same offset in the string table. In this case, changing one string doesn't change the other. This kind o
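The first two forms of sharing listed above can be illustrated with a toy sketch. This is not the ROSE string table implementation: ToyStoredString, ToyStringTable, and both helper functions are hypothetical names, and the offset bookkeeping is a simplification (it assumes, as in ELF string tables, that offset 0 holds the empty string and that each string is NUL-terminated).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Toy string object; NOT SgAsmStoredString.
struct ToyStoredString {
    std::string value;
};

// Toy table that hands out one offset per distinct string value.
class ToyStringTable {
public:
    std::size_t offsetOf(const std::string& value) {
        std::map<std::string, std::size_t>::const_iterator it = offsets_.find(value);
        if (it != offsets_.end()) return it->second;  // equal values reuse storage
        std::size_t offset = nextOffset_;
        offsets_[value] = offset;
        nextOffset_ += value.size() + 1;  // +1 for the terminating NUL
        return offset;
    }

private:
    std::map<std::string, std::size_t> offsets_;
    std::size_t nextOffset_ = 1;  // offset 0 is reserved for the empty string
};

// Form (1): two names refer to the same string object, so a change made
// through one name is visible through the other.
bool sharedObjectChangesTogether() {
    std::shared_ptr<ToyStoredString> a =
        std::make_shared<ToyStoredString>(ToyStoredString{".text"});
    std::shared_ptr<ToyStoredString> b = a;
    a->value = ".code";
    return b->value == ".code";
}

// Form (2): two independent strings with equal values map to the same
// offset; changing one leaves the other (and its offset) untouched.
bool independentStringsDiverge() {
    ToyStringTable table;
    ToyStoredString a{"main"};
    ToyStoredString b{"main"};
    bool sameOffsetBefore = table.offsetOf(a.value) == table.offsetOf(b.value);
    a.value = "start";
    bool differentOffsetAfter = table.offsetOf(a.value) != table.offsetOf(b.value);
    return sameOffsetBefore && differentOffsetAfter && b.value == "main";
}
```

The distinction matters when rewriting a binary: renaming one section must not silently rename another that merely happened to share storage.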
40. forward declarations can exist, as in:

    class X;      // first forward declaration
    class X;      // second forward declaration
    class X {};   // defining declaration

The first forward declaration is the firstNondefiningDeclaration within ROSE. All forward declarations are marked as forward declarations (see the declaration's modifiers documentation: isForward() member function). The second forward declaration is just another declaration and should not be referenced as a firstNondefining declaration from any other declaration. Its defining declaration is set in the AST fix-up phase. The following code is legal, but particularly bothersome (it now works in ROSE):

[Footnote 1] This is in spite of the fact that they are implemented in many compilers. They are not part of the C or C++ language, so they are not implemented in ROSE. They are, however, one of the most common language extensions to C and C++ compilers, even certain standard-following front ends (such as EDG).

    void foo(struct X* ptr);   // first declaration (but not really a forward declaration)
    class X;                   // first or second forward declaration (not really sure if this is the first or second)
    class X {};                // defining declaration (only one of these is allowed in the same scope)

In this code example, the first declaration of X appears in the function parameter list of the forward declaration of the function foo. This is not a typical forward struct declaration. We keep tra
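The bothersome case above can be tried with any C++ compiler. The following is a minimal sketch; the member named value, the body of foo, and the helper demoForwardDeclarations are additions made only so the fragment does something observable, and they are not part of the manual's example. Some compilers warn that X is introduced with struct in one place and class in another, which is harmless here.

```cpp
#include <cassert>

void foo(struct X* ptr);   // X is first named here, inside a parameter list
class X;                   // an ordinary forward declaration of the same class

// The single defining declaration in this scope.
class X {
public:
    int value;
};

// With X complete, the function declared above can be defined.
void foo(struct X* ptr) { ptr->value = 7; }

// Hypothetical helper: shows that all three declarations name one class.
int demoForwardDeclarations() {
    X x;
    x.value = 0;
    foo(&x);
    return x.value;
}
```

Because the elaborated type specifier in the parameter list declares X in the enclosing namespace, the later class X declarations refer to that same entity.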
41. get_definition() may return a valid pointer to it, even for a function prototype. Thus the explicit marking of declarations as a prototype is critical to its interpretation as a function prototype.
(c) Pointers to SgBasicBlock: All pointers of type SgBasicBlock should be valid pointers.

5.9 AST NORMALIZATION: SUBTLE WAYS THAT ROSE OUTPUT DIFFERS FROM THE ORIGINAL SOURCE CODE

(d) Other NULL pointers: A conscious attempt is made within ROSE to not communicate information through a null-valued pointer. Unfortunately, this has been a switch from the original design of SAGE II, which had NULL pointers throughout the AST. In general, within the newer work, any NULL pointer is currently an error.

2. What lists can be empty? SAGE III uses STL lists internally; children on many IR nodes are contained in such STL lists. There are nodes where the STL lists can be empty. These nodes include:
(a) SgBasicBlock
(b) SgGlobal
(c) SgExpresionList
(d) SgNamespaceDeclaration

3. Which access functions are simple and which do meaningful computation? This question will be addressed later, when we can automate queries of this sort. In general, member functions beginning with get_xxx() and set_xxx() get or set a private data member named p_xxx. Most such functions are trivial access functions, but some have more complex semantics. Given that there are over 200 IR nodes in the SAGE III IR, and that each has numerous member functions, we will defer addressing t
42. group is a single loop in the original code, such as the i, j, or k loop in the matrix multiplication code in Figure 10.1, traditional loop interchange analysis for perfect loop nests would suffice; however, if the group includes non-common loops surrounding different statements, such as the k(s1) and j(s2) loops in the non-pivoting LU code in Figure 10.2, a transitive dependence analysis is performed on the dependence graph, and the transitive dependences are used to determine the safety of fusing and shifting these loops. Because dependence hoisting is realized by combining a sequence of traditional loop distribution, interchange, and index set splitting transformations on single or perfectly nested loops, the complexity of applying dependence hoisting is equivalent to that of the corresponding sequence of sub-transformations. In the worst case, applying dependence hoisting to a loop nest takes time proportional to N L D, where N is the number of statements in the nest, L is the depth of the nest, and D is the size of the dependence graph for the nest. In an average case, however, dependence hoisting requires much less time to finish. For a perfect loop nest, dependence hoisting is equivalent to a standard loop interchange on perfect loop nests followed by a single loop distribution, in which case the required complexity is O N D.

10.3 ANALYSIS AND TRANSFORMATION TECHNIQUES

10.3.3 Transformation Framework
To optimize applications for bett
43. is significantly leveraged within the AST merge mechanism to share all declarations that appear across multiple merged files. Still, a one-million-line C++ application making significant use of templates can be expected to translate into 10 to 20 million IR nodes in the AST, so memory space is worth considering. The following is a snapshot of current IR node frequency and memory consumption for a moderate 40,000-line source code file (one file including a number of header files). Note that the Sg_File_Info IR nodes are the most frequent and consume the greatest amount of memory. This reflects our bias toward preserving significant information about the mapping of language constructs back to the positions in the source file, to support a rich set of source-to-source functionality.

    AST Memory Pool Statistics: numberOfNodes = 114081 memory consumption = 5019564 node = Sg_File_Info
    AST Memory Pool Statistics: numberOfNodes = 31403 memory consumption = 628060 node = SgTypedefSeq
    AST Memory Pool Statistics: numberOfNodes = 14254 memory consumption = 285080 node = SgStorageModifier
    AST Memory Pool Statistics: numberOfNodes = 14254 memory consumption = 1140320 node = SgInitializedName
    AST Memory Pool Statistics: numberOfNodes = 8458 memory consumption = 169160 node = SgFunctionParameterTypeList
    AST Memory Pool Statistics: numberOfNodes = 7868 memory consumption = 1101520 node = SgModifierType
    AST Memory Pool Statistics: numberOfNodes = 7657 memory consumption = 398164 node = SgC
44. just suspending it prevents your changes from taking effect.

16.14 Enabling PHP Support
1. Fetch and install PHP (tested with 5.2.6) from http://www.php.net/downloads.php. PHC requires a few specific configure flags in order to be able to use PHP properly. Fill in your choice of PHP install location where appropriate in place of /usr/local/php.

    ./configure --enable-debug --enable-embed --prefix=/usr/local/php
    make && make install

2. Fetch and install PHC (tested with svn version 11487). Currently only the development release works with ROSE.

16.15 BINARY ANALYSIS

    svn checkout http://phc.googlecode.com/svn/trunk/ phc-read-only
    cd phc-read-only
    touch src/generated/*
    ./configure --prefix=/usr/local/php --with-php=/usr/local/php
    make && make install

3. Finally, due to an incongruence in the class hierarchies of PHC and ROSE, the following changes have to be made to the installed /usr/local/php/include/phc/AST_fold.h. Hopefully this can be resolved soon so that ROSE works with an unmodified upstream PHC.

src generated AST_fold h 2008 07 30 10 35 32 000000000 0700 src generated AST_fold h rose 2008 08 13 15 30 37 000000000 0700 1037 7 1037 7 00 case Nop ID return fold_nop dynamic_cast lt Nop gt in case Foreign ID return fold_foreign dynamic_cast lt Foreign gt in return 0 assert 0 ee 1271 7 1271 7 case Nop ID return fold_nop dynamic_cast lt Nop gt in case F
45. [Table residue: the node-name column of the AST Memory Pool Statistics listing; the matching counts are not part of this fragment.] SgBasicBlock, SgNullStatement, SgFunctionType, SgThisExp, SgArrowExp, SgFunctionDefinition, SgTypedefDeclaration, SgTypedefSymbol, SgTypedefType, SgEnumFieldSymbol, SgDotExp, SgReturnStmt, SgExprListExp, SgCastExp, SgFunctionCallExp, SgBoolValExp, SgExprStatement, SgClassSymbol, SgMemberFunctionRefExp, SgUsingDeclarationStatement, SgPntrArrRefExp, SgFunctionRefExp, SgTemplateInstantiationFunctionDecl, SgClassDefinition, SgMultiplyOp, SgStringVal, SgArrayType, SgUnsignedLongVal, SgTemplateInstantiationDirectiveStatement, SgTemplateInstantiationDefn, SgUnsignedIntVal, SgAssignOp, SgAddOp, SgBaseClassModifier, SgBaseClass, SgConditionalExp, SgNamespaceDefinitionStatement, SgNamespaceDeclarationStatement, SgEqualityOp, SgCommaOpExp, SgConstructorInitializer, SgPragma, SgPragmaDeclaration, SgEnumVal, SgIfStmt, SgEnumType, SgEnumDeclaration, SgEnumSymbol, SgPointerDerefExp, SgShortVal, SgSubtractOp, SgQualifiedName, SgAddressOfOp, SgCharVal, SgLessThanOp, SgGreaterOrEqualOp, SgPlusPlusOp, SgNotEqualOp, SgUnsignedShortVal, SgAndOp, SgPointerMemberType, SgLongIntVal, SgDivideOp, SgBitAndOp, SgMinusMinusOp, SgDoubleVal, SgFloatVal, SgUns
46. of strings to AST fragments.
• Database support for global analysis.
• C++ template support.
• Fast binary AST file I/O.
• An AST merge mechanism for supporting whole-program analysis (across hundreds of files).
• Complete language support for C, C99, UPC, C++, Fortran 66, Fortran 77, Fortran 90/95, and Fortran 2003.
• AST visualizations (program visualization for debugging).
• ROSE User Manual and ROSE Tutorial documentation.
• Full IR documentation via Doxygen (web pages).
• Web site with software and svn repository access.
• And lots more.

5.2 Comments Handling
Comments are placed into the SAGE III AST using a separate pass over the source file. EDG does not preserve comments at this time, and we felt it was important to preserve them within the unparsed (generated) output of the source-to-source mechanism that ROSE defines. Comment processing can also be addressed using the AST Rewrite Mechanism, though the order in which the comments appear in the code is determined by the order of invocation of the AST insert function (with a comment as the input string). Internally, the comments annotate the AST (tree decoration), so that AST queries may use the comments at will.

5.3 C Preprocessor (cpp) Directive Handling
The C Preprocessor (cpp) directives (not #pragma) are handled internally using the same mechanism as comments. Although they are fully expanded at compile time, they are reinserted back into the unparsed source code as it is
47. ours

To check out ROSE (assumes access to the repository at LLNL): Type

    svn checkout file:///usr/casc/overture/ROSE/svn/ROSE/trunk/ROSE svn-rose

This will check out a copy of the source code for ROSE from the svn repository. Any directory name can be used for svn-rose in the example command line.

Update an existing version from SVN: Run svn update from inside the ROSE directory (at the top level) to update an existing version of ROSE with the new changes in the SVN repository.

After being checked out (or updated) from SVN: Run the build script in the top-level ROSE directory to build all configure scripts and Makefile.in files (using automake). This is the difference between the development environment and the distribution. This script will call the different autoconf tools required to set up ROSE, and also check out other work common to multiple projects within CASC.

Build a compile directory (for the compile tree): Make a separate directory to be the root of the compile tree. There can be many compile trees if you want. Note: Before the next step, be sure you are using the correct compiler (g++ C++ compiler; see the ChangeLog file for the current version used for development, generally any 3.x version) and that you are using the correct version of autoconf and automake.

Running configure: Type <pathToSourceTree>/configure --help to see the different configuration options. <pathToSourceTree> is meant to be the absolute or relative path to the
48. --prefix, for instance --prefix=$HOME.

For better control, use the options below.

Fine tuning of the installation directories:
    --bindir=DIR            user executables [EPREFIX/bin]
    --sbindir=DIR           system admin executables [EPREFIX/sbin]
    --libexecdir=DIR        program executables [EPREFIX/libexec]
    --datadir=DIR           read-only architecture-independent data [PREFIX/share]
    --sysconfdir=DIR        read-only single-machine data [PREFIX/etc]
    --sharedstatedir=DIR    modifiable architecture-independent data [PREFIX/com]
    --localstatedir=DIR     modifiable single-machine data [PREFIX/var]
    --libdir=DIR            object code libraries [EPREFIX/lib]
    --includedir=DIR        C header files [PREFIX/include]
    --oldincludedir=DIR     C header files for non-gcc [/usr/include]
    --infodir=DIR           info documentation [PREFIX/info]
    --mandir=DIR            man documentation [PREFIX/man]

Program names:
    --program-prefix=PREFIX            prepend PREFIX to installed program names
    --program-suffix=SUFFIX            append SUFFIX to installed program names
    --program-transform-name=PROGRAM   run sed PROGRAM on installed program names

X features:
    --x-includes=DIR     X include files are in DIR
    --x-libraries=DIR    X library files are in DIR

System types:
    --build=BUILD    configure for building on BUILD [guessed]
    --host=HOST      cross-compile to build programs to run on HOST [BUILD]

Optional Features:
    --disable-FEATURE    do not include FEATURE (same as --enabl
49. projects for sale, then I would encourage you to contact EDG and buy a license from them. This is what a few companies have done, and they have consulted EDG on this point. Our goal is especially to encourage open source C++ work using ROSE. Clearly, we derive robustness in C++ in ROSE from the use of EDG, and we are thankful for their liberal research license. Is there a list of projects compiled with ROSE? I don't release a list of projects and specific research groups using ROSE. We have read that you plan a Windows port. By which date do you plan to port the project? We hope to have a Windows port using Cygwin; it worked a while back but was not tested often, so we have to fix some details for it to work again. It is not a big deal, but I can't promise when it will happen. ROSE computes different kinds of stuff from the actual AST and semantic type info; the docs mention control flow, data flow, slicing, and some more. Are these types of things computed accurately? That is, can you fully rely on the computed info? Are they computed for the entire C/C++ language or a subset? Just to give an example, there are implicit calls to destructors of static objects, e.g. f() { A a; } will get an ~A() call at the end of f's scope. Do you take such info into account when computing call, dataflow, and control graphs? If so, I wonder how you built this info in: do you first construct some form of IR (intermediate rep.) atop of which you compute the data
50. simpler specification of transformations. Operations using this interface have the following properties: strings are used to specify transformations; operations are immediate; operations are local to the specified node of the AST. 8.2.4 High Level Rewrite Interface. This interface presents the same string-based rewrite mechanism as the mid-level interface but adds additional capabilities. It is the most flexible rewrite interface within ROSE. Although it must be used within a traversal to operate on the AST, it can express more sophisticated transformations with less complexity because it supports positioning transformation strings within the AST relative to the current node within a traversal. The high-level rewrite mechanism uses the same three functions as the other rewrite interfaces, but with an expanded range of enum values to specify the intended scope and the location in that scope. The scope is specified using the ScopeIdentifierEnum type defined in the HighLevelCollectionTypedefs class. These enum values are: unknownScope, LocalScope, ParentScope, NestedLoopScope, NestedConditionalScope, FunctionScope, FileScope, GlobalScope, Preamble. The position in any scope is specified by the PlacementPositionEnum type, which is defined in the HighLevelCollectionTypedefs class. These enum values are: PreamblePositionInScope, TopOfScope, TopOfIncludeRegion, Bottom
51. since we want to allow many people to contribute to ROSE. However, if we have to debug your code, be prepared that we will do a number of things to it that might offend you: 1. We will add documentation where we think it is appropriate. 2. We will add assertion tests (using ROSE_ASSERT macros) wherever we think it is appropriate. 3. We will reformat your code if we have to understand it and the formatting is a problem. This may offend many people, but it will be a matter of project survival, so all apologies in advance. If you fix anything later, you're free to reformat your code as you like. We try to change as little as possible of the code that is contributed. 16.10 Maintaining the ROSE Email List (casc-rose@llnl.gov). There is an open email list for ROSE, which can be subscribed to automatically. The list name is casc-rose. These are the email commands available to users of the list. To use them, a user sends a message to Majordomo with one or more of these commands in the body of the message. Each mailing list has a special request address where commands can be sent. For example, to use the casc-rose mailing list (casc-rose@lists.llnl.gov), send commands to casc-rose-request@lists.llnl.gov. It is also possible to send commands directly to majordomo@lists.llnl.gov; however, be sure to specify which list you want to use. With all the commands below, you can leave out the list name if you are sending to casc-rose-request@lists.llnl.gov. • subscr
52. source tree where the SVN version was checked out. After options have been selected, type <pathToSourceTree>/configure <selected options> to run the configure script. Running the configure script with no options is sufficient: it uses default values, which are either already set or which the configure script will figure out on your machine. For more on ROSE configure options, see Section 2.2.4. Running make after running configure: After the configure script has finished, run make or gmake (the two may be used interchangeably). If you have a development version, you can also make distributions by running make dist. If you want to build a new distribution AND test it, run make distcheck. See details of running make in parallel. Testing your new version of ROSE: Automated tests are available within the distribution of ROSE. To run these tests, type make check. Tests on a modern Intel Linux machine currently take about 15 minutes to run. Installing ROSE: From this point you can generate ROSE the way a user would see it, as if you had started with a ROSE distribution. Type make install to install ROSE. See details of installing ROSE. 9. Testing the installed version of ROSE: To test the installed version of ROSE, type make installcheck. To test compilation, this forces one or more of the example translators to be built using only the header files from the i
53. special roseFreshTestStub-xyz.sh file (examples are in the ROSE/scripts directory); it holds the required paths for the environment to be set up. 2.4.2 What Parts of ROSE Are Robust? We consider the compiler construction issues (IR, code generation, AST traversal support, and low-level AST transformation mechanisms) to be robust. These are the mechanisms that are predominantly tested by the regression suites and application codes. Specifically, a ROSE translator is built that does no transformation (e.g. IdentityTransformation.C in the ROSE Tutorial). Input files are processed with this translator, and the following steps are tested for each source file: EDG's AST is built internally; ROSE's AST (the SAGE III AST) is built from the EDG AST; EDG's AST is deleted; ROSE's AST traversals are tested; ROSE's AST Attribute Mechanism is tested in each IR node; ROSE's AST internal tests are done (all tests must pass); ROSE's Code Generator is used to regenerate the source code; the vendor compiler compiles the ROSE-generated source code. Note that separate tests that run the executables generated from the vendor compiler's compilation of the ROSE-generated sources are not automated. This is not yet a standard test in ROSE; it is verified only infrequently. 2.4.3 What Parts of ROSE Are Not Robust? Basically, the program analysis lags in robustness. The robustness of the program analysis and opt
54. the evaluate functions change along with the derivation of the attributes from an AST_Rewrite nested class if USEREWRITE MECHANISM define EVALUATEINHERITED_ATTRIBUTEFUNCTION evaluateRewriteInheritedAttribute define EVALUATESYNTHESIZED_ATTRIBUTEFUNCTION evaluateRewriteSynthesizedAttribute else define EVALUATEINHERITED_ATTRIBUTE_FUNCTION evaluateInheritedAttribute define EVALUATESYNTHESIZED_ATTRIBUTE FUNCTION evaluateSynthesizedAttribute endif Build an inherited attribute for the tree traversal to test the rewrite mechanism class MyInheritedAttribute public Note that any constructor is allowed MyInheritedAttribute y Build a synthesized attribute for the tree traversal to test the rewrite mechanism class MySynthesizedAttribute if USEREWRITE MECHANISM public HighLevelRewrite SynthesizedAttribute endif public MySynthesizedAttribute hi tree traversal to test the rewrite mechanism if USEREWRITEMECHANISM A specific AST processing class is used built from SgTopDownBottomUpProcessing class MyTraversal public HighLevelRewrite RewriteTreeTraversal lt MyInheritedAttribute MySynthesizedAttribute gt else Any AST processing class may be used but the conversion is trivial if SgTopDownBottomUpProcessing is used class MyTraversal public SgTopDownBottomUpProcessing lt MyInheritedAttribute MySynthesizedAttribute gt endif public MyTraversal
55. to do without the extra cost of transitive dependence analysis. This package examines the original loop structures of programs and performs transitive dependence analysis only when required. 10.3.2 Dependence Hoisting Transformation. As the base technique for loop interchange, fusion, and blocking, this package implements a novel loop transformation, dependence hoisting, first introduced by Yi and Kennedy [37], that facilitates a combined fusion and interchange transformation for a group of arbitrarily nested loops. Applying the dependence and transitive dependence analysis algorithms, this transformation first selects a group of arbitrarily nested loops (such as the k s1 k loop surrounding s and the s2 loops in the non-pivoting LU code in Figure 10.2(a)) that can be legally fused and then placed at the outermost position of a code segment. It then performs the transformation through a compound sequence of traditional transformations on single loops and perfectly nested loops. A combined interchange and fusion transformation is established on an arbitrary loop structure as a result. An example of the transformation result is shown for the non-pivoting LU code in Figure 10.2(b); here the transformation is applied to the k s and j s2 loops in (a). Given a group of loops as input for a dependence hoisting transformation, the safety of fusing and shifting these loops is determined from the dependence constraints on iterations of these loops. If the
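The remark that the safety of fusing loops follows from the dependence constraints on their iterations can be illustrated with a small, self-contained sketch. This is not the package's dependence hoisting implementation and uses no ROSE API; the function names `separateLoops` and `fusedLoop` are hypothetical and exist only for this example. It shows plain loop fusion on a kernel where iteration i of the second loop depends only on iteration i of the first, which is the simplest case in which fusion is legal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel (illustration only): two separate loops over the
// same iteration space. Loop 2 reads b[i], which loop 1 wrote in iteration i.
inline std::vector<double> separateLoops(const std::vector<double>& a) {
    std::vector<double> b(a.size()), c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) b[i] = 2.0 * a[i];  // loop 1
    for (std::size_t i = 0; i < a.size(); ++i) c[i] = b[i] + 1.0;  // loop 2
    return c;
}

// Fused form. This is legal here because every dependence goes from
// iteration i of loop 1 to the same iteration i of loop 2, so running the
// two statements back to back preserves the order of each write and read.
// If loop 2 read b[i + 1] instead, iteration i of the fused loop would read
// a value that had not been written yet, and the loops could not be fused
// without first shifting one of them.
inline std::vector<double> fusedLoop(const std::vector<double>& a) {
    std::vector<double> b(a.size()), c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        b[i] = 2.0 * a[i];
        c[i] = b[i] + 1.0;
    }
    return c;
}
```

Calling both functions on the same input and comparing the returned vectors is a quick way to check that the fused form computes the same values as the original.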
56. to the SgClassDeclaration it makes more sense to process the SgClassDeclaration associated with the SgClassDefinition instead of the SgClassDefinition directly Presently the processing of the SgClassDefinition is not supported through any indirect processing of the SgClassDeclaration this could be implemented in the future 6 SgGlobal This case is not implemented It would require special handling but it might be implemented in the future 7 SgBasicBlock used in a SgForStatement Because of the declaration of the for loop C language construct index variable this case would require special handling This case could be implemented in the future 8 SgBasicBlock used in a SgFunctionDefinition Because of the declaration of the function parameter variable this case would require special handling This case could be implemented in the future 9 SgBasicBlock used in a SgSwitchStatement Example of code in generated intermediate file for a SgBasicBlock used in SgSwitchStatement int main local stack 0 int x int y switch x local stack 1 int GlobalScopePreambleStart int GlobalScopePreambleEnd 100 int int CHAPTER 8 AST REWRITE MECHANISM CurrentLocationTop0fScopeStart CurrentLocationTopUOfScopeEnd CurrentLocationBeforeStart CurrentLocationBeforeEnd CurrentLocationReplaceStart case 0 yt break default y break int int int int int CurrentLocationReplaceEnd Curren
57. types of statements the AST rewrite mechanism cannot currently process. This section enumerates these and explains why each is difficult or not currently possible. Note that some appear impossible to handle, while others only require special handling that is not yet implemented. 1. Why we have to skip SgCaseOptionStmt statements. Example of code in the generated intermediate file for a SgCaseOptionStmt: int GlobalScopePreambleStart; int GlobalScopePreambleEnd; int CurrentLocationTopOfScopeStart; int CurrentLocationTopOfScopeEnd; int CurrentLocationBeforeStart; int CurrentLocationBeforeEnd; int CurrentLocationReplaceStart; case 0: y++; break; int CurrentLocationReplaceEnd; int CurrentLocationAfterStart; int CurrentLocationAfterEnd; int CurrentLocationBottomOfScopeStart; int CurrentLocationBottomOfScopeEnd; The problem is that marker declarations that appear after the SgCaseOptionStmt are included in the scope of the SgCaseOptionStmt, while those that appear before it are not in the same scope. 2. SgDefaultOptionStmt: see reason 1 above. 3. SgCtorInitializerList: This case would require special handling to be generated in the intermediate file, and it would require special handling when isolated from the AST. This case can probably be handled in the future with extra work. 4. SgFunctionParameterList: see reason 3 above. 5. SgClassDefinition: Since the SgClassDefinition is so structurally tied
58. use a Majordomo mailing list: https lists llnl gov mj user commands html. Web interface for modifying a Majordomo mailing list: https lists llnl gov majordomo. Details: 1. The list name is casc-rose, not casc-rose@llnl.gov. 2. You must be on site at LLNL. 16.11 How To Build a Binary Distribution. The construction of a binary distribution is done as part of making ROSE available externally on the web to users who do not have an EDG license. We make only the EDG part of ROSE available as a binary library, and the rest is left as source code, just as in an all-source distribution. There are a few steps: 1. Configure and build ROSE normally using configure (use all options that you require in the binary distribution). 2. (Optional) Run make dist; this will build an all-source distribution of ROSE. 3. Rerun configure without the --with-edg_source_code=true option. 4. Run make dist; this will build a binary distribution using the binary libraries built in step one. 16.12 Avoiding Nightly Backups of Unrequired ROSE Files at LLNL. If you're at LLNL and participating in the nightly builds and regression testing of ROSE, then it is kind to the admin staff to avoid having your testing directory (often many gigabytes of files) backed up nightly. There is a file, .nsr, that you can put into any directory that you don't need to have backed up. The syntax of the text in the file is: skip Additio
59. user wants to force instantiation of all templated classes, functions, and static data members, then he or she can do so by including a test code that forces the explicit instantiation of every class, function, and static data member, or by using explicit template instantiation directives. If a class template has been modified, then we need to make sure that all the class definition's member functions and static data members are instantiated on the next pass through the prelinker. The process should involve a call to the EDG function: void set_instantiation_required_for_template_class_members(a_type_ptr class_type). 5.6.6 Different Modes of Template Instantiation. We first supported only a single mode of template instantiation; we will consider supporting additional modes later. ROSE will respond to the EDG options that control automatic template instantiation using the option -edg:tmode, where the mode is one of: 1. none (default): No template instantiation will be done. 2. used: Only templates that are used in the translation unit will be instantiated. 3. all: All possible templates will be instantiated. 4. local: Only used templates will be instantiated, and they will be forced to be local to the file. All instantiated functions will be declared as static. Note that static functions and member functions are only seen within the local file scope (translation unit, typically the source file); verify the details here.
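The explicit template instantiation directives mentioned above are standard C++ and can be sketched without any ROSE or EDG code. The class template `Counter` and function template `twice` below are hypothetical names invented for this illustration; only the `template class ...;` and `template ... ;` directive syntax comes from the language itself. The directives ask the compiler to generate every member function and static data member of the named specializations in this translation unit, whether or not the code uses them.

```cpp
#include <cassert>
#include <string>

// Hypothetical class template (illustration only, not part of ROSE or EDG).
template <typename T>
class Counter {
public:
    void add(const T& value) { last_ = value; ++count_; }
    int count() const { return count_; }
    const T& last() const { return last_; }
    static int instances;  // static data member, instantiated explicitly below
private:
    T last_{};
    int count_ = 0;
};

template <typename T>
int Counter<T>::instances = 0;

// Hypothetical function template.
template <typename T>
T twice(const T& x) { return x + x; }

// Explicit instantiation directives: force generation of all members of
// Counter<int> and Counter<std::string>, and of twice<double>, even if no
// other code in this translation unit refers to them.
template class Counter<int>;
template class Counter<std::string>;
template double twice<double>(const double&);
```

A translation unit that contains such directives for every template it cares about plays the role of the "test code" described above: compiling it once yields definitions that other files can link against.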
60. your LD_LIBRARY_PATH. For example: LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/youraccount/opt/boost_1_35_0/lib. 6. Run the configure script: Type AbsoluteOrRelativePath/configure --prefix=`pwd` --with-boost=BOOST_installTree to run the ROSE configure script. The path to the configure script may be either relative or absolute. The prefix option on the configure command line is only required if you run make install (suggested), because the default location for installation is /usr/local and most users don't have permission to write to that directory. This is common to all projects that use autoconf. ROSE follows the GNU Makefile Standards as a result of using the autoconf and automake tools for its build system. As of ROSE 0.8.9a, the default setting for the install directory (prefix) is the build tree. For more on ROSE configure options, see Section 2.2.4. 7. Run make: Type make to build all the source files. See details of running make in parallel in Section 2.2.5. 8. To test ROSE (optional): Type make check to test the ROSE library against a collection of test codes. See details of running make in parallel. 9. To install ROSE, type make install. Installation is optional but suggested. Users can simplify their use of ROSE by using it from an installed version of ROSE. This permits compilation using a single include directory and the specification of only two libraries. See details of installing ROSE in section 2.2
61. 1 for i 0 i lt n 1 i 1 for j 0 j lt n 1 j 1 tot i j 0 0 duzli 5 1 duz i 5 1 for k 1 k b 1 lt n 2 k 1 duzlillj k duz iljl k afk x duz illjl k 1 BIR totli j tot i ls ak duz a A duzlilljlin 1 duzla g n 1 tot g bin 1 duzli 5 n 2 duz i 5 n 2 eln 2 x duz i 5 n 1 for k n 3 k gt 0 k 1 duz illj k duz i lj k c k duz i lj k 1 elk duz i jn 1 b after fusion Figure 10 3 Optimizing tridvpk from Erlebacher combining loop interchange and fusion thus fusing multiple levels of loops simultaneously 112 CHAPTER 10 LOOP TRANSFORMATIONS to optimize their applications and one for compiler developers that intend to extend this package for various purposes 10 2 1 End User Interface The following function comprises the package interface for end users of the ROSE source to source infrastruc ture which applies various traversal and rewrite mechanisms to transform C applications using the SAGE intermediate representation Boolean SageLoopTransformation unsigned argc char argu SgGlobal r SgNodex n 10 2 Here both SgGlobal and SgNode are classes defined by the SAGE intermediate representation the SgGlobal pointer r represents the global root of an input program and the SgNode pointer n represent
62. 120001 install include Location of Boost include directory BOOST_CPPFLAGS pthread I home liao6 opt boost_1_35_0 include Location of library directory after make install ROSE_LIB_DIR home liao6 daily test rose 20081014_120001 install lib ROSE_LIBS ROSE_LIB_DIR librose la ROSE SOURCE DIR sourcetree exampleTranslators documentedExamples simpleT Default make rule to use all exampleTranslator if x ROSE_IN BUILD_TREE present xpresent then echo ROSE IN BUILD _T Example suffix rule for more experienced makefile users C o g c I ROSE_INCLUDE_DIR o 0 0 C Compile the exampleTranslator using the file identityTranslator C exampleTranslator lo libtool mode compile g I ROSE INCLUDE DIR BOOST CPPFLAGS example Translator exampleTranslator lo libtool mode link g 4 0 exampleTranslator exampleTranslator lo 1 Rule used by make installcheck to verify correctness of installed libraries check exampleTranslator c ROSE_SOURCE DIR testCode C Figure 3 2 Example of makefile to build the example translator Notice that we use the identityTranslator C file presented in ROSE Tutorial 3 4 RUNNING THE PROCESSOR 41 help Option Output ROSE pre release alpha version 0 9 3a This ROSE translator provides a means for operating on C C and Fortran sour
63. 2.8 Getting Help; 2.4.1 How We Test ROSE; 2.4.2 What Parts of ROSE Are Robust?; 2.4.3 What Parts of ROSE Are Not Robust?; Submitting a Bug Report; 3.1 ROSE Tutorial; 3.4.1 Translator Options Defined by ROSE; 3.4.2 Command Line for ROSE Translators; 4.3 Directory Structure; 4.4.1 Implementation of ROSETTA; 4.4.2 Implementation of Fortran Support; 5.1.1 Differences Between SAGE and SAGE II; 5.1.2 Difference Between SAGE II and SAGE III; 5.2 Comments Handling; 5.3 C Preprocessor (cpp) Directive Handling; 5.4 Pragma Handling
64. 40 node SgTypeSignedChar AST Memory Pool numberOfNodes 1 memory consumption 40 node SgTypeShort AST Memory Pool numberOfNodes 1 memory consumption 40 node SgTypeLongLong AST Memory Pool numberOfNodes 1 memory consumption 40 node SgTypeLongDouble AST Memory Pool numberOfNodes 1 memory consumption 40 node SgTypeLong AST Memory Pool numberOfNodes 1 memory consumption 40 node SgTypeFloat AST Memory Pool numberOfNodes 1 memory consumption 40 node SgTypeEllipse AST Memory Pool numberOfNodes 1 memory consumption 40 node SgTypeDouble AST Memory Pool numberOfNodes 1 memory consumption 40 node SgTypeDefault AST Memory Pool numberOfNodes 1 memory consumption 40 node SgTypeChar AST Memory Pool Statistics numberOfNodes 1 memory consumption 40 node SgTypeBool AST Memory Pool Statistics numberOfNodes 1 memory consumption 40 node SgTryStmt AST Memory Pool Statistics numberOfNodes 1 memory consumption 40 node SgGlobal AST Memory Pool Statistics numberOfNodes 1 memory consumption 36 node SgFunctionTypeTable AST Memory Pool Statistics numberOfNodes 1 memory consumption 36 node SgCatchStatementSeq AST Memory Pool Statistics numberOfNodes 1 memory consumption 232 node SgUsingDirectiveStatement. 15.6 Compilation Performance Timings. An initial snapshot of the performance for the previous 40,000-line single file is included so that it is clear that the performan
65. 6. 10. Testing the installation of ROSE (optional): To test the installation and the location where ROSE is installed against a collection of test codes (the application examples in the ROSE tutorial), type make installcheck. A sample makefile is generated. 2.2.4 ROSE Configure Options. A few example configure options are: Minimal configuration: ROSE/configure --with-boost=BOOST_installTree. This will configure ROSE to be compiled in the current directory (separate from the source tree). The installation from make install will be placed in /usr/local. Most users don't have permission to write to this directory, so we suggest always including the prefix option (e.g. --prefix=`pwd`). Minimal configuration (preferred): ROSE/configure --prefix=`pwd` --with-boost=BOOST_installTree. Configure in the current directory so that installation will also happen in the current directory (an install subdirectory will be built). Turning on compiler debugging options (preferred): ROSE/configure --with-CXX_DEBUG=-g --with-C_DEBUG=-g --with-CXX_WARNINGS=-Wall --prefix=`pwd` --with-boost=BOOST_installTree. Configure as above, but with debugging and warnings turned on (-Wall is specific to the GNU compilers). Adding Fortran support: ROSE/configure --prefix=`pwd` --with-boost=BOOST_installTree --with-java. The Open Fortran Parser will also be enabled, allowing ROSE to process Fortran code. The programs java, javac, and jar must be e
66. Figure 7.12 (AST diagram omitted): The pair of numbers at each node shows the order in which the functions evaluateInheritedAttribute (first number) and evaluateSynthesizedAttribute (second number) are called in a top-down bottom-up processing. Chapter 8 AST Rewrite Mechanism. The Abstract Syntax Tree (AST) Rewrite Mechanism permits modifications to the AST. To effect changes to the input source code, modifications to the AST are done by a ROSE translator, and a new version of the source code is produced. Although analysis is possible by only reading the AST, transformations and
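The call ordering described in the Figure 7.12 caption, where each node gets one number when it is visited on the way down and a second number on the way back up, can be reproduced on a toy tree. The sketch below is a minimal illustration under stated assumptions: `Node`, `traverse`, `makeNode`, and `numberedExample` are hypothetical names, the tree is not a ROSE AST, and the code does not use ROSE's traversal classes. It only shows how a single shared counter yields the pairs of numbers.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy tree node; a stand-in for an AST node, not ROSE's SgNode.
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
    int inheritedOrder = 0;    // when the top-down ("inherited") step ran
    int synthesizedOrder = 0;  // when the bottom-up ("synthesized") step ran
};

// Visit a node top-down, recurse into its children, then visit it bottom-up.
// One shared counter numbers both kinds of visits.
inline void traverse(Node& node, int& counter) {
    node.inheritedOrder = ++counter;
    for (auto& child : node.children) traverse(*child, counter);
    node.synthesizedOrder = ++counter;
}

inline std::unique_ptr<Node> makeNode(std::string name) {
    auto n = std::make_unique<Node>();
    n->name = std::move(name);
    return n;
}

// Build a three-node tree (a root with two leaf children) and number it.
inline std::unique_ptr<Node> numberedExample() {
    auto root = makeNode("root");
    root->children.push_back(makeNode("left"));
    root->children.push_back(makeNode("right"));
    int counter = 0;
    traverse(*root, counter);
    return root;
}
```

In this toy tree a leaf always receives two consecutive numbers, and an inner node's second number is larger than every number in its subtree, which matches the pattern of the pairs shown in the figure.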
67. E Regression Tests: Our regression tests of collected bugs reported over several years help prevent the reintroduction of old bugs during the development process. Additional test codes and application codes help provide more complete testing of ROSE. Elsa Regression Tests: Recent work has included a separate regression test suite from the Elsa project, an open source C++ parser project. This is tested infrequently at this point but will be folded into the standard ROSE regression tests in the future. We wish to thank Scott McPeak for the use of his rather large collection of tests that he uses within Elsa (about 1000 test codes that test many corners of the C, C99, and C++ languages). Application Codes: ROSE will be released after tests are complete on approximately 10 separate one-million-line application codes: 1. KULL: This is an important application at LLNL. 2. ALE3D: This is an important application at LLNL. 3. ARES: This is an important application at LLNL. 4. CHOMBO: This is an Adaptive Mesh Refinement (AMR) library at Lawrence Berkeley National Laboratory. 5. DiffPack: This is a numerical library originally developed at the University of Oslo, Norway. The developers have been substantial collaborators to the ROSE project. 6. ROSE: The compilation of a compiler project with itself is a milestone for any compiler project. ROSE can be used to compile the ROSE source code and has provided a good test of
68. EH HEHHEHE HHHH HHH HEHEHE TEAR HEHEHEHEH HEHEHEHEHE HEHEHEHE HEHEHEHEH HEHEH HH HHHH HEHHEHE 122 123 124 case TEST_CLASS in 125 strings SubStringTest qmtest create o TEST a program PROGRAM a substring ERROR SUMMARY O errors from O contexts a arguments ARGUMENTS test TEST_CLASS 126 127 qmtest create o TEST a program PROGRAM a arguments ARGUMENTS test TEST_CLASS 128 esac create qmtest test file with test class 129 130 esac main ttttHtHHHRHHRHKHHHHRHHRRHHRHRHRHRHRHHHRHRRHHHHHRRHHRHHHRHHHHHHHHHHH 131 132 133 if BACKEND 0 4 NULL then 134 touch 0FILE gt amp dev null create dummy file and pipe error to NULL 135 exit 0 always exit 0 136 fi skip backend compilation 137 138 Execute backend compilation with original compile line 139 BACKEND 140 exit 134 CHAPTER 14 TESTING WITHIN ROSE 14 1 Introduction qm sh is a wrapper for the compile line in an arbitrary project build system that creates qmtest test files that test ROSE The basic assumption is that it is possible to isolate and modify the compile line command in most project build systems For example Makefile systems using make specify compile line commands after labels delimited by a colon One example of this may be gcc c g Wall hello c From this line qm sh would create a qmtest test file that executes a ROSE translator in the place of gcc but with the exact same arguments c g Wall and mor
69. ERATOR_END The C++ source code is extracted from between the named markers (text labels) in the named file and inserted into the generated source code. Using this technique, very small amounts of specialized code can be tailored for each IR node while still providing an automated means of generating all the rest. Different locations in the generated code can be modified with external code. Here we add the source code for a function. Adding the set_type and get_type member functions: It is not clear that this is required, but all expressions must define a function that can be used to describe the type of the expression. It is unfortunate, but it is generally in compiling the generated source code that details like this are discovered (ROSETTA has room for improvement). VarArgOp.setFunctionSource("SOURCE_SET_TYPE_DEFAULT_TYPE_EXPRESSION", "Grammar/Expression.code"); VarArgOp.setFunctionSource("SOURCE_DEFAULT_GET_TYPE", "Grammar/Expression.code"); Modify the EDG/SAGE connection code to have the new IR node built in the translation from EDG to SAGE III. This step often requires a bit of expertise in working with the EDG/SAGE connection code. In general, it requires no great depth of knowledge of EDG. Two source files are usually involved: (a) src/frontend/CxxFrontend/EDG_SAGE_Connection/sage_gen_be.C, which converts the IL tree to the SAGE III AST and is derived from EDG's C++/C-generating back end cp_gen_be.c; (b) sage_il_to_str.C, which contains help
70. INE_TABLE_PROJECTS DEFINE_TABLE FILES int main int argc char argv GlobalDatabaseConnection db int initOk db initialize assert initOk 0 CREATE_TABLE db projects CREATE TABLE db files initialize project string projectName testProject this should be given at the command lin string fileName testFile C this should be retrieved from a SgFile projectsRowdata prow UNKNOWNID projectName UNKNOWNID projects retrieveCreateByColumn amp prow name projectName long projectId prow get_id get id of this file filesRowdata frow UNKNOWNID projectId fileName files retrieveCreateByColumn amp frow fileName fileName frow get_projectId long fileld frow get_id do some work std cout lt lt Project ID lt lt projectId lt lt File ID lt lt fileld lt lt std endl db shutdown return 0 Figure 9 2 Source code for the predefined tables example 9 2 DATABASE SUPPORT FOR GLOBAL ANALYSIS 107 Database Graph Example include lt iostream gt include GlobalDatabaseConnection h include TableDefinitions h DEFINE_TABLE_PROJECTS DEFINE_TABLE_FILES DEFINE_TABLE_GRAPHDATA DEFINE TABLE GRAPHNOLDE DEFINE TABLE_GRAPHEDGE CREATE TABLE2 testtable string name double number DEFINE TABLE2 testtable string name do
71. OfIncludeRegion, BeforeCurrentPosition, ReplaceCurrentPosition, AfterCurrentPosition, BottomOfScope. Function prototypes of interface functions: void insert(SgNode*, string, HighLevelCollectionTypedefs::ScopeIdentifierEnum, HighLevelCollectionTypedefs::PlacementPositionEnum); Example of how to use specific insertion of a transformation into the AST (required traversal not shown): insert(astNode, "int x;", HighLevelCollectionTypedefs::FunctionScope, HighLevelCollectionTypedefs::TopOfScope); Operations using this interface have the following properties: it adds relative positioning to the specification of transformations; it requires a traversal for operation on the AST; operations are delayed and occur during the required traversal (all operations are completed by the end of the traversal); operations occur on AST nodes along a path defined by the chain from the current input node to the operator to the root node of the AST (SgProject). 8.2.5 Advantages and Disadvantages of Rewrite Interfaces. Each interface builds upon the lower-level interfaces, and each has some advantages and disadvantages. Table 8.2 lists the major features and requirements associated with each. The high-level interface (Level 4) presents the most sophisticated features but only works as part of a traversal of the AST. The mid-level interface is the lowest-level interface that permits the specification of tra
72. QMTest class 31 PROGRAM 3 executable name 32 BACKEND 4 The execution string with backend compiler 33 ARGUMENTS I PWD argument stub general 34 OFILE The original object file 35 36 dic 37 38 for ARG in 39 do 40 COUNT 41 42 if COUNT gt 4 then 43 44 if ARG 0 9 testopt then 45 ARGUMENTS ARGUMENTS echo ARG sed e s testopt g 46 continue 47 fi parse out specific options to test only and not to backend 48 49 BACKEND BACKEND ARG build original compile line 50 51 Hoas attt HHHH HEHEHEHEHE HEHEHEHE HEHEH HEHEH HEHEHEHEHE HEHEHEHEHE HEHE HEHE HEHEHE EHR HEHEHE EHR HEHE HEHEHE HEHEHE HEHEHEHE HEHEHEHEHE HEHEHEHE 52 53 54 case ARG in 55 I includeFullPath 56 57 c cpp C cC 58 if ARG 0 1 gt J then 59 ARG pwd ARG 60 fi take care of absolute paths 61 62 rename the QMTest output test file Replace space period and plus 63 with their equivalents and change all chars to lower case 64 if MODE f J then 65 TEST echo ARG sed e s _ g sed e s _dot_ g A 66 sed e s plus g gawk print tolower 0 qmt 133 67 fi 68 case C C source files 69 70 o 71 if MODE o J then v2 FLAG 1 73 elif MODE f J then 74 FLAG 2 75 fi spike out the object flag 76 77 continue 78 c
73. ROSE User Manual: A Tool for Building Source-to-Source Translators. Draft User Manual (version 0.9.3a). Daniel Quinlan, Chunhua Liao, Thomas Panas, Robb Matzke, Markus Schordan, Rich Vuduc, and Qing Yi. Lawrence Livermore National Laboratory, Livermore, CA 94550. 925-423-2668 (office), 925-422-6278 (fax). dquinlan@llnl.gov, liao6@llnl.gov, panas2@llnl.gov, matzke1@llnl.gov, markus@complang.tuwien.ac.at, richie@cc.gatech.edu, qingyi@cs.utsa.edu. Project Web Page: www.rosecompiler.org. UCRL Number for ROSE User Manual: UCRL-SM-210137-DRAFT. UCRL Number for ROSE Tutorial: UCRL-SM-210032-DRAFT. UCRL Number for ROSE Source Code: UCRL-CODE-155962. ROSE User Manual (pdf), ROSE Tutorial (pdf), ROSE HTML Reference (html only). "This ROSE User Manual is a very unevenly edited manual and contains many passages which simply seemed to its editors like a good idea at the time." (from the Hitchhiker's Guide To The Galaxy) October 14, 2008. Preface: Welcome to the ROSE Compiler Framework Project. The purpose of this project is to provide a mechanism for construction of specialized source-to-source translators (sometimes referred to less precisely as preprocessors). ROSE provides simple programmable mechanisms to read and rewrite the abstract syntax trees generated by separate compiler front-ends. ROSE includes the Edison Design Group (EDG) front-end in binary form within public distributions and is internally based upon SAGE III th
74. SAGE II for their work. 2.1 ROSE Documentation and Where to Find It To simplify user access to the ROSE documentation, the pre-built postscript files are included in the ROSE/docs/Rose directory of each ROSE distribution. These versions are always kept up to date by the automated build system that generates ROSE distributions. • ROSE Web Page: The ROSE Web page contains the ROSE manual, tutorial, and developer API. The API provides details about IR nodes and their usage (interfaces). The documentation is generated by Doxygen. • ROSE offline Web content: ROSE/docs/Rose/ROSE-0.9.3a-HTML-Docs.ps.gz. ROSE HTML documentation that is available without internet access. • MANUAL: ROSE/docs/Rose/ROSE-0.9.3a-UserManual.ps.gz. This is the ROSE User Manual, which explains basic concepts about and capabilities within ROSE. • TUTORIAL: ROSE/docs/Rose/Tutorial/ROSE-0.9.3a-Tutorial.tar.gz. This is the ROSE Tutorial, with numerous examples of how to use ROSE. The tutorial documentation is constructed using the following steps: 1. Actual source code for each example translator in the ROSE/tutorial directory is included. 2. Each example is compiled. 3. Inputs to the examples are taken from the ROSE/tutorial directory. 4. Output generated from running each example is placed into the tutorial documentation. Thus the ROSE tutorial contains exact examples, and each example may be manipulated
75. Semantically the two forms of code are equivalent (since the redundant constructor calls will be optimized away), and so this represents a form of normalization within the AST. Redundant casts and copy constructors: The use of redundant casts is represented as nested calls to copy constructors. Code such as std::string arg5 std::string std::string std::string is normalized to be std::string arg5 std::basic_string<char, std::char_traits<char>, std::allocator<char> > std::basic_string<char, std::char_traits<char>, std::allocator<char> >. 14. Array indexing represented as pointer arithmetic: Array indexing is translated by EDG into pointer arithmetic. It is not clear if this specific sort of AST normalization is desirable. Code such as void foobar(double* d1, double* d2); void foo(double** array, int n) { array[n] = new double[100]; foobar(&(array[n][n]), &array[n][n]); } is normalized to be void foobar(double* d1, double* d2); void foo(double** array, int n) { array[n] = new double[100]; foobar(array[n] + n, array[n] + n); } 15. Case statements always have an attached SgBasicBlock object. 16. Qualifiers are often normalized to longer names, since they are computed on the fly as needed during unparsing. The original qualified names
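The array-indexing normalization in item 14 can be checked in plain C++ without ROSE. The following is a minimal sketch, not ROSE code: the helper names sameAddress and demoNormalizedIndexing are hypothetical and exist only for illustration. It shows that the indexed form &array[i][j] and the pointer-arithmetic form array[i] + j denote the same address, which is why the normalized output keeps the original semantics.

```cpp
#include <cstddef>

// Hypothetical helper (not part of ROSE): returns true when the indexed form
// &array[i][j] and the pointer-arithmetic form array[i] + j name the same address.
bool sameAddress(double** array, std::size_t i, std::size_t j) {
    double* indexed = &array[i][j];
    double* pointerForm = array[i] + j;
    return indexed == pointerForm;
}

// Hypothetical demo: builds a small two-row table and checks one element
// through both spellings.
bool demoNormalizedIndexing() {
    double row0[4] = {0.0, 1.0, 2.0, 3.0};
    double row1[4] = {4.0, 5.0, 6.0, 7.0};
    double* rows[2] = {row0, row1};
    return sameAddress(rows, 1, 2) && *(rows[1] + 2) == 6.0;
}
```

Both spellings compile to the same address computation, so the difference is purely one of how the unparsed source reads.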
76. U> U A<U>::staticTemplateDataMember_T; This is a template static data member (SgVariableDeclaration): template<class U> float A<U>::staticTemplateDataMember_float; template specialization for variable (was originally defined to be float): template<> float A<double>::staticTemplateDataMember_float; template specialization for variable (error: this is not possible, type mismatch): template<> float A<double>::staticTemplateDataMember_T; In the case of a SgVariableDeclaration, the information about whether or not it is a specialization is kept with the SgVariableDeclaration instead of the SgInitializedName objects that stand for the individual variables. Since the get_parent() member function returns a pointer to the SgVariableDeclaration from the SgInitializedName, this information is indirectly available from the SgInitializedName. Enums, typedefs, namespaces, etc. cannot appear as templated declarations. As a result, only a few declarations contain template-specific information (SgClassDeclaration, SgFunctionDeclaration, SgVariableDeclaration). 5.6.2 How Templates Affect the IR Some IR nodes are present to support the use of templates in C++. These include: • SgTemplateParameters: Derived from SgSupport. • SgTemplateArguments: Derived from SgSupport. • SgTemplateDeclaration: Derived from SgDeclarationStatement. Holds the template string (any comments are removed). Templat
77. UI INTERMEDIATE REPRESENTATION and appears in the AST (and in the unparsed generated code) as class A a class A aptr amp a class A aptrptr amp aptr aptr 1 aptr 1 aptrptr 1 aptrptr 1 Graptr gt Aptr 1 aptr gt Aptrptr 1 Normalization of const ref const 4 Const references such as X<A const&> x3 are presently normalized to be X<const A&> x3. Template arguments explicitly output: Template types are output with template arguments. Code such as std::string var std::string is normalized to be std::string var std::basic_string<char, std::char_traits<char>, std::allocator<char> >. Constructor calls are really variable declarations: C++ classes can define constructors. When they do, the constructors are represented in the AST as a member function declaration and marked specifically as a constructor (conversion operators and destructors are also member function declarations and marked explicitly). However, the call to a constructor is a bit special in C++ and does not appear in the AST as a member function call. It appears as a variable declaration; within the AST fragment representing the variable declaration, a SgConstructorInitializer is used. So where a variable of a class type X is written in the code as X variable, the form in the AST is more similar to the code represented by X variable XO
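Two of the normalizations above change only the spelling of a type, not the type itself, and this can be confirmed with standard type traits. The sketch below is plain C++11 and does not use ROSE; the template X and class A are hypothetical stand-ins for the names used in the text.

```cpp
#include <memory>
#include <string>
#include <type_traits>

template <typename T>
struct X {};  // hypothetical stand-in for the template X in the text

struct A {};  // hypothetical stand-in for class A in the text

// "A const&" and "const A&" are two spellings of one type, so the two
// instantiations of X are the same type as well.
constexpr bool constPlacementIsSameType() {
    return std::is_same<X<A const&>, X<const A&> >::value;
}

// std::string is a typedef for the fully spelled-out basic_string.
constexpr bool stringIsBasicString() {
    return std::is_same<
        std::string,
        std::basic_string<char, std::char_traits<char>, std::allocator<char> > >::value;
}

static_assert(constPlacementIsSameType(), "const placement does not change the type");
static_assert(stringIsBasicString(), "std::string names basic_string<char, ...>");
```

Because both checks hold at compile time, code unparsed in the normalized form compiles to the same program as the original spelling.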
78. X Y y would not compile if unparsed as 159 class X 1 Y y Y class Y 4 class X Note that we can't just mark a declaration as being a defining declaration, since declarations are shared internally (types and symbols can reference them, etc.). Why are comments and CPP directives following the statements being removed and reinserted elsewhere? I have been working on a translator based on the ROSE/tutorial/CharmSupport.C translator. If an include statement is in the top of the input code, then the struct added to the top of the source file will contain the include statements in an obviously bad place: struct AMPI_globals_t 1 A Comment include stdio h int a_global_int 3 I am specifying the end of construct for the SgClassDefinition to be Sg_File_Info::generateDefaultFileInfoForTransformationNode(). The class declaration is prepended into the global scope. How do I correctly insert the new definition and declaration into the top of a file (either before or after the include statements)? The answer, for anyone interested, is found in a discussion relative to the ROSE Tutorial example (Global Variable Handling, currently Chapter 30). The problem is that comments and preprocessor (cpp) directives are attached to the statements. When I wrote the tutorial example showing how to collect the global variables and put them into a data structure, I was not careful to use the low-level rewrite mechanism to do the removal of the variable
79. a four-byte magic number, usually 0x7f, 'E', 'L', 'F'. The magic number can be modified by changing the string from SgAsmFileHeader::get_magic(). It must be exactly four characters in length. 5. ELF File Purpose (lib, executable, core, etc.): The file purpose should be modified by setting two fields, using 1. SgAsmElfFileHeader::set_p_e_type() and 2. SgAsmGenericFormat::set_purpose(). Both members should be set to compatible values. The former is the value from the ELF specification and the latter is a constant: PURPOSE_UNSPECIFIED, PURPOSE_LIBRARY, PURPOSE_EXECUTABLE, PURPOSE_CORE_DUMP, PURPOSE_PROC_SPECIFIC, PURPOSE_OTHER. FIXME: set_p_e_type should probably call set_purpose, but we can't go the other direction because the mapping is N:1. 6. ELF Version: To change the ELF version, assign a new value by calling set_version on the object returned by SgAsmGenericHeader::get_exec_format(). This doesn't have any effect on the code generated by the unparser, since the parser only knows about ELF format 1. 7. ELF Target Architecture: Modify the target architecture by calling two functions: SgAsmElfHeader::set_e_machine() sets the ELF-specific value; SgAsmGenericHeader::set_isa() sets the generic value. You should call both with consistent values. 8. ELF Section or Segment Table location: The SgAsmElfFileHeader::set_e_shoff() and set_e_phoff() methods have been rem
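The magic-number rule above (four bytes, conventionally 0x7f 'E' 'L' 'F', and any replacement exactly four characters long) can be illustrated without ROSE. This is a minimal sketch in standard C++; kElfMagic, hasElfMagic, and isValidReplacementMagic are hypothetical names introduced here and are not part of the ROSE binary-analysis API.

```cpp
#include <array>
#include <cstddef>
#include <string>

// The conventional four-byte ELF magic number: 0x7f 'E' 'L' 'F'.
const std::array<unsigned char, 4> kElfMagic = {{0x7f, 'E', 'L', 'F'}};

// Hypothetical helper: true when the first four bytes of `bytes` are the ELF magic.
bool hasElfMagic(const std::string& bytes) {
    if (bytes.size() < kElfMagic.size()) return false;
    for (std::size_t i = 0; i < kElfMagic.size(); ++i) {
        if (static_cast<unsigned char>(bytes[i]) != kElfMagic[i]) return false;
    }
    return true;
}

// Hypothetical helper: a replacement magic string is acceptable only if it is
// exactly four characters long, as the text requires.
bool isValidReplacementMagic(const std::string& magic) {
    return magic.size() == 4;
}
```

A caller would validate a candidate string with isValidReplacementMagic before handing it to whatever setter the real API provides.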
80. al stays within the file. The return type T and the other parameters are discussed for each AstProcessing class in the following sections. Further, the following virtual methods can be defined by the user (the default implementations are empty): void atTraversalStart(), called by the traversal code to signal to the processing class that a traversal is about to start; void atTraversalEnd(), called by the traversal code to signal that a traversal has terminated (all nodes have been visited). As these methods are the same for all processing classes, they are not repeated in the class descriptions below.

    #include "rose.h"
    class MyVisitor : public AstSimpleProcessing {
    protected:
        void virtual visit(SgNode* astNode);
    };

Figure 7.1: Headerfile MyVisitor.h.

7.3 AstSimpleProcessing This class is called Simple because, in contrast to three of the other processing classes, it does not provide the computation of attributes. It implements a traversal of the AST and calls a visit function at each node of the AST. This can be done as a preo
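The traversal contract described above (an optional start hook, a visit call at every node, an optional end hook) can be sketched without ROSE. Everything in the block below is a hypothetical stand-in: Node is not SgNode, and SimpleTraversal is not AstSimpleProcessing. It only mirrors the shape of the interface and shows a preorder walk over a small hand-built tree.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an AST node; this is not ROSE's SgNode.
struct Node {
    std::string label;
    std::vector<std::unique_ptr<Node>> children;
    explicit Node(std::string l) : label(std::move(l)) {}
};

// Hypothetical base class mirroring the interface shape described in the text.
class SimpleTraversal {
public:
    virtual ~SimpleTraversal() = default;
    void traverse(Node* root) {
        atTraversalStart();
        walk(root);
        atTraversalEnd();
    }
protected:
    virtual void visit(Node* node) = 0;
    virtual void atTraversalStart() {}  // default implementations are empty
    virtual void atTraversalEnd() {}
private:
    void walk(Node* node) {
        if (!node) return;
        visit(node);  // preorder: a node is visited before its children
        for (auto& child : node->children) walk(child.get());
    }
};

// Records the order in which the hooks and visit calls fire.
class LabelCollector : public SimpleTraversal {
public:
    std::vector<std::string> events;
protected:
    void visit(Node* node) override { events.push_back(node->label); }
    void atTraversalStart() override { events.push_back("<start>"); }
    void atTraversalEnd() override { events.push_back("<end>"); }
};

// Builds file -> (function -> statement, variable) and returns the event order.
std::vector<std::string> demoTraversal() {
    Node root("file");
    root.children.push_back(std::unique_ptr<Node>(new Node("function")));
    root.children[0]->children.push_back(std::unique_ptr<Node>(new Node("statement")));
    root.children.push_back(std::unique_ptr<Node>(new Node("variable")));
    LabelCollector collector;
    collector.traverse(&root);
    return collector.events;
}
```

Keeping the hooks as empty virtual functions means a user class overrides only what it needs, which matches the statement that the default implementations are empty.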
81. ally Each interface has only three functions: insert(), replace(), and remove().

Table 8.1: Different levels of the ROSE Rewrite mechanism.
    Level 4, High-Level Interface (string based; relative positioning, contains state): insert(SgNode*, string, scope, location); replace(SgNode*, string, scope, location); remove(SgNode*)
    Level 3, Mid-Level Interface (string based): insert(SgNode*, string, location); replace(SgNode*, string, location); remove(SgNode*)
    Level 2, Low-Level Interface (absolute positioning, contains no state): insert(SgNode*, SgNode*); replace(SgNode*, SgNode*); remove(SgNode*)
    Level 1, SAGE III Interface: insert(SgNode*, SgNode*); replace(SgNode*, SgNode*); remove(SgNode*)

8.2.1 SAGE III Rewrite Interface This lowest possible level of interface is implemented as member functions on the SgNode objects. It is used internally to implement the higher-level interfaces (including the Low-Level Rewrite Interface). Uniformly, operations of insert(), replace(), and remove() apply only to SAGE III objects representing containers (SAGE III objects that have containers internally, such as SgGlobal, SgBasicBlock, etc.). Strings cannot be specified at this level of interface; only subtrees of the AST may be specified. New AST fragments must be built separately and may be inserted or used to replace existing AST subtrees in the AST. Operations using this interface ha
82. are lost, and as a result the generated types can be excessively long and not at all similar to the original source code. For example, the STL map const_iterator can become std::_Rb_tree<std::map<int, int, std::less<int>, std::allocator<std::pair<const int, int> > >::key_type, std::map<int, int, std::less<int>, std::allocator<std::pair<const int, int> > >::value_type, std::_Select1st<std::map<int, int, std::less<int>, std::allocator<std::pair<const int, int> > >::value_type>, std::map<int, int, std::less<int>, std::allocator<std::pair<const int, int> > >::key_compare, std::allocator<std::pair<const int, int> > >::const_iterator. This problem could be fixed by computing a style alias table to permit the shortest type name to always be used. Either that, or we should explicitly store the lists of qualified names and recompute them only where transformations have been done. 17. Unnamed typedefs of enums are normalized to enums: EDG appears to normalize unnamed typedefs to be enums, and the information about the origin as an unnamed typedef is lost. Since there appears to be no difference in the EDG AST, ROSE is unable to recover when the typedef keyword was used. This is not a real problem, and the semantics of the application is the same. Without the name of the typede
83. as with the Intel C and C++ compilers. Future work will focus on portability to other platforms important to users. Software Requirements: You will require ONLY a C++ compiler to compile ROSE; ROSE is written in C++. Present development work is done on Intel Linux platforms using the GNU g++ (3.4.x and 4.x) and the Intel compilers. ROSE users may either obtain a free research license from EDG (and hence ROSE with EDG source code) or alternatively obtain ROSE that contains a binary version of the EDG work. The latter is limited to specific platforms and versions of compilers. See EDG (www.edg.com) for details and limitations on how their software may be used. There is more information in the ROSE Manual (see chapter Getting Started, section Getting a Version of the EDG License for Research Use). Use of Required Software: The following software is required in order to build and use ROSE: • ROSE: There are three versions of ROSE supported: the Distribution Version for users (typical), the External Development Version for advanced users and collaborators, and the Internal Development Version intended only for the ROSE development team. The development versions are what are found in the ROSE software repositories and have additional software requirements (subversion, JDK, autoconf, automake, Doxygen, LaTeX, etc.). Distribution Version: Provided as a tarred and compressed file in the form ROSE-0.9.3a.tar.gz. It can be obtain
84. ase o T9 80 default 81 esac 82 83 A HHHH HEHEHEHE HHEH HHHH HEHEHEHEHE HEHEHEHEH HEHE HEHEHEHEH HEHEHEHE HEHEHE EHEHEHEH HHEH HH HHH HEHHEHE EHEH 84 85 86 if FLAG gt 0 then 87 OFILE ARG 88 fi name the object file after o declaration 89 90 if FLAG 1 then 91 if ARG 0 1 then 92 ARG pwd ARG 93 fi if argument not specified with absolute path then append PWD 94 95 rename the QMTest output test file replace space period and plus 96 with equivalent symbols and change all chars to lower case 97 TEST echo ARG sed e s _ g sed e s _dot_ g 98 sed e s plus g gawk print tolower 0 qmt 99 100 FLAG 0 reset FLAG 101 continue 102 elif FLAG 2 then 103 FLAG 0 reset FLAG 104 continue 105 fi if the o flag used create the object name and TEST name from object 106 107 HE LAL LEER EER HEH HEHEHEHE HEHEHE HEHEHEHE HEHEHEHE HEHEHEHEHE HEHEHEHEHE TAR AETHER REE RE 108 109 110 ARGUMENTS ARGUMENTS ARG 111 fi if argument is not qm sh argument 112 113 fitdonettt tt ttt HHHH HEHEHEHEH HEHH HHH HHHH HEHEHE HEHEHEHE HEHEHEHEHE HEHEHEHEHE HH HEHEH HEHEHEHEH HHHH HHHH HHHH 114 115 116 done for all command line arguments to qm sh 117 118 OBJECT TEST o name the object after the test file name 119 ARGUMENTS ARGUMENTS o 0BJECT 120 121 done cas ett ttt HHHH HHHH HH
85. ast installation [default=yes]
  --disable-libtool-lock  avoid locking (might break parallel builds)
  --enable-purify-api  Enable purify API in code.
  --enable-purify-linker  Augment the linker with purify.
  --enable-purify-windows  turn on use of PURIFY -windows option
  --enable-insure  Augment the linker with insure.
  --enable-dq-developer-tests  Development option for Dan Quinlan (disregard).
  --enable-purify  use memory management that purify can understand
  --disable-checking  don't do consistency checking in parser
  --enable-stand-alone  compile standalone edgcpfe
  --enable-cp-backend  generate c++ code as output
  --enable-sage-backend  generate sage++ tree
  --enable-rosehpct  enable build of the ROSE-HPCT module
  --enable-assembly-semantics  Enable semantics-based analysis of assembly code
Optional Packages:
  --with-PACKAGE[=ARG]  use PACKAGE [ARG=yes]
  --without-PACKAGE  do not use PACKAGE (same as --with-PACKAGE=no)
  --with-boost[=DIR]  use boost (default is yes); it is possible to specify the root directory for boost (optional)
  --with-boost-libdir=LIB_DIR  Force given directory for boost libraries. Note that this will overwrite library path detection, so use this parameter only if default library detection fails and you know exactly where your boost libraries are located.
  --with-boost-thread[=special-lib]
86. ations

    // Example ROSE Translator: used for testing ROSE infrastructure
    #include "rose.h"

    int main(int argc, char* argv[])
    {
        // Build the AST used by ROSE
        SgProject* sageProject = frontend(argc, argv);

        // Run internal consistency tests on AST
        AstTests::runAllTests(sageProject);

        // Insert your own manipulation of the AST here...

        // Generate source code from AST and call the vendor's compiler
        return backend(sageProject);
    }

Figure 3.1: Example of simple translator, building an AST, unparsing it, and compiling the generated (unparsed) code.

3.2 Example Translator This section shows an example translator that uses ROSE and how to build it. The ROSE Tutorial discusses the design of the translator in more detail; for now we need only an example translator to demonstrate the practical aspects of how to compile and link an application (translator) using ROSE. In this example, line 12 builds the AST (a pointer of type SgProject). Line 15 runs optional internal tests on the AST. These are optional because they can be expensive (several times the cost of building the AST). Look for details in the Related Pages of the Programmer's Reference for what tests are run. Line 20 generates the source cod
87. b Section/Segment resizing (high level): The preferred way to extend a section is to call SgAsmGenericFile::shift_extend, which extends sections that contain the resized section and shifts sections that are right (higher address) of the resized section. This function also takes into account alignment constraints, memory address space, and (optionally) holes in the address space. 12.5.2 Modifications to the ELF File Header 1. Entry Point RVA: The entry RVA stored in the ELF File Header is adjusted whenever the section into which it points is moved in memory. It is also possible to adjust this address explicitly by modifying the first (and only) entry in SgAsmGenericHeader::entry_rvas. NOTE: An RVA (rose_rva_t) is an offset with respect to the beginning of some section. If the section starting memory address changes, then the RVA implicitly changes (RVAs are virtual addresses relative to some format-wide base address). Multiple sections can be mapped to the same memory (e.g., ".text" and "ELF Load Segment 2" typically overlap in memory), but since an RVA is associated with only one section, modifying the other section(s) has no effect on the RVA even if the RVA happens to be inside the other sections as well. NOTE: The binding between an RVA and a section can be modified with rose_rva_t::set_section. In fact, the link can be completely broken by passing a null section pointer, in which case the RVA is not relative to any section. 2 F
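The relationship between an RVA and the section it is bound to can be modeled with a toy example. The sketch below is not ROSE code: Section and Rva are hypothetical stand-ins for SgAsmGenericSection and rose_rva_t, and the choice that rebinding preserves the denoted address is an assumption made for this illustration rather than a statement about the real set_section implementation.

```cpp
#include <cstdint>

// Hypothetical stand-in; this is not ROSE's SgAsmGenericSection.
struct Section {
    std::uint64_t mappedAddress;  // starting memory address of the section
};

// Hypothetical stand-in; this is not ROSE's rose_rva_t.
struct Rva {
    const Section* section;  // bound section, or nullptr when unbound
    std::uint64_t value;     // section offset when bound, plain address when unbound

    // Address this RVA currently denotes.
    std::uint64_t address() const {
        return section ? section->mappedAddress + value : value;
    }

    // Rebind to another section (or to none) while denoting the same address.
    // Assumption of this sketch: rebinding keeps the address fixed.
    void setSection(const Section* next) {
        std::uint64_t current = address();
        section = next;
        value = next ? current - next->mappedAddress : current;
    }
};
```

In this model, moving the bound section moves the RVA with it, moving any other section (even an overlapping one) leaves it alone, and an unbound RVA stays put no matter which section moves.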
88. be overwritten with actual, current section offsets and sizes when the section table is unparsed: • SgAsmElfSectionTableEntry::set_sh_offset • SgAsmElfSectionTableEntry::set_sh_size • SgAsmElfSectionTableEntry::set_sh_addr. NOTE: Do not modify segment offsets and sizes by modifying the segment table entries. Changes to these values will be overwritten with actual, current segment offsets and sizes when the segment table is unparsed: • SgAsmElfSegmentTableEntry::set_offset • SgAsmElfSegmentTableEntry::set_filesz • SgAsmElfSegmentTableEntry::set_vaddr • SgAsmElfSegmentTableEntry::set_memsz. 1b. Section/Segment file address shifting (high level): The SgAsmGenericFile::shift_extend method is the preferred way to make minor offset and/or size adjustments to an ELF Section or Segment. It is able to shift a section to a higher file and/or memory address and/or extend the segment: • It takes into account all sections in the file, adjusting their offsets and/or sizes accordingly. • Sections to the right of the section in question (Sq) are shifted upward to make room and prevent overlaps. Sections overlapping with Sq are extended to contain all of what they previously contained. • The shift amounts are adjusted to satisfy alignment constraints of all affected sections. • Unreferenced areas of the file can optionally be utilized as unused address space. • Adjusting file address spaces also adjusts the memory address spaces in a compatible manner.
89. being unparsed. Internally the directives annotate the AST (tree decoration) so that AST queries may use the directives at will. Note that pragmas are a part of the language specification (grammar) and not a CPP directive. Note also that extern "C" is also recognized, so that it can be placed around #include directives and other identified blocks of declarations. Internally such declarations are explicitly marked as having extern "C" linkage. 5.4 Pragma Handling The #pragma is special and is not really a C Preprocessor (cpp) directive. It is formally part of the C and C++ language grammar, and thus we are justified in putting it into the AST with the rest of the language constructs (comments and directives are open for a degree of interpretation as to where they can be attached within the AST). Details of this subject may be open to minor changes in future releases of ROSE. Pragmas are the mechanism by which C and C++ permit language extension. Of course, some people describe this a bit differently, but #pragma is not interpreted by CPP; it is interpreted by the compiler. And it has a specific semantics, since it is part of the language grammar. The EDG documentation refers to them as pragma declarations, so they should be treated that way. This also is why they only really work in the grammar if they are declarations, since they are only permitted where common declarations are permitted and nowhere else. Note t
90. binaries that require DLL support, the required symbols in the SgAsmELFSymbolTable and the SgAsmCoffSymbolTable are left in place to support the DLL mechanism, whereas all entries in the SgAsmGenericSymbolTable are removed. • Checking the symbols in the executable using nm: ROSE permits a programmable interface to the binary executable file format, but unix utility functions provide text output of such details. For example, use nm -D .libs/librose.so | c++filt | less to generate a list of all the symbols in an executable (text output). In this case, c++filt resolved the original names from the mangled names for executables built from C++ applications. The C symbols appear at the bottom of the listing. Chapter 17 FAQ This chapter accumulates frequently asked questions (FAQ) about ROSE. The questions are not created by the authors (such FAQs are not particularly useful). 1. Is ROSE a preprocessor, a translator, or a compiler? Technically, no! ROSE is formally a meta-tool, a tool for building tools. ROSE is an object-oriented framework for building source-to-source translators. A preprocessor knows nothing of the syntax or semantics of the language being preprocessed; typically it recognizes another embedded language within the input file (or attempts to recognize subsets of sourc
91. ce classes of the implementation to port it to a different compiler infrastructure other than ROSE. Second, they can provide their own profitability analysis to expand the transformation policy classes of the implementation. Porting to a different compiler infrastructure: The package provides the following infrastructure-independent interface to compiler developers: AstNodePtr LoopTransformation(LoopTransformInterface& interface, const AstNodePtr& head); (10.3) Here the class LoopTransformInterface provides the interface for accessing the intermediate representation of an arbitrary compiler, and the pointer reference AstNodePtr represents an arbitrary code fragment to be transformed. Both classes, AstNodePtr and LoopTransformInterface, need to be defined at location outsideInterface/LoopTransformInterface.h, which currently contains the ROSE implementation of these two classes. By rewriting this file, a compiler developer can port the package to a completely different infrastructure (this package already works under two compiler infrastructures: the ROSE C++ infrastructure and the DSystem Fortran infrastructure at Rice University [29]). Plugging in different profitability analysis algorithms: This package provides a static configuration class, LoopTransformOptions, defined in the location driver/LoopTransformOptions.h of the package distribution, for plugging in different loop
92. ce code as well as on x86 and ARM object code.
Usage: rose [OPTION]... FILENAME...
If a long option shows a mandatory argument, it is mandatory for the equivalent short option as well, and similarly for optional arguments.
Main operation mode:
  -rose:(o|output) FILENAME  file containing final unparsed C++ code (relative or absolute paths are supported)
Operation modifiers:
  -rose:output_warnings  compile with warnings mode on
  -rose:C_only, -rose:C  follow C89 standard, disable
  -rose:C99_only, -rose:C99  follow C99 standard, disable
  -rose:Cxx_only, -rose:Cxx  follow C++ 89 standard
  -rose:UPC_only, -rose:UPC  follow Unified Parallel C 1.2 specification
  -rose:upc_threads n  Enable UPC static threads compilation with n threads; n>=1: static threads; dynamic (default) otherwise
  -rose:Fortran, -rose:F, -rose:f  compile Fortran code, determining version of Fortran from file suffix
  -rose:Fortran2003, -rose:F2003, -rose:f2003  compile Fortran 2003 code
  -rose:Fortran95, -rose:F95, -rose:f95  compile Fortran 95 code
  -rose:Fortran90, -rose:F90, -rose:f90  compile Fortran 90 code
  -rose:Fortran77, -rose:F77, -rose:f77  compile Fortran 77 code
  -rose:Fortran66, -rose:F66, -rose:f66  compile Fortran 66 code
  -rose:FortranIV, -rose:FIV, -rose:fIV  compile Fortran IV code
  -rose:FortranII, -rose:FII, -rose:fII  compile Fortra
93. ce code of the source-to-source is a small multiple of the cost of the compilation using g++ (when g++ is used at its fastest, with no optimization).
Performance Report (resolution = 0.010000, number of IR nodes = 289439, memory used = 20144 Kilobytes):
  AST SgProject::parse(argc,argv): time (sec) = 18.382917
  AST SgProject::parse(): time (sec) = 18.381067
  AST SgFile Constructor: time (sec) = 18.380805
  AST Front End Processing (SgFile): time (sec) = 4.846442
  AST Constrution (included Sage III Translation): time (sec) = 4.840888
  EDG AST Constrution: time (sec) = 0.807095
  AST EDG/Sage III Translation: time (sec) = 3.926241
  AST post-processing: time (sec) = 13.513127
  (fixup function definitions - missing body) time (sec) = 0.379914
  (fixup template declarations) time (sec) = 0.435447
  (reset parent pointers) time (sec) = 2.468755
  (subTemporaryAstFixes) time (sec) = 1.303070
  (initialize IR nodes containing explicit scope data member) time (sec) = 0.122380
  (reset template names) time (sec) = 1.433229
  (fixup class data member initialization) time (sec) = 0.575695
  (fixup for generation of GNU compatable code) time (sec) = 0.580172
  (testing declarations (no side-effects to AST)) time (sec) = 0.638836
  (fixup storage access of forward template declarations (EDG bug)) time (sec) = 0.542976
  (fixup template specializations) time (sec) = 0.860818
  (mark template specializations for output) time (sec) = 0.595816
  (mark template instantiations for output)
94. changes in the output code from the input code can only be accomplished by rewriting portions of the AST. The AST is the single intermediate form manipulated by the preprocessor. All changes are eventually output as modifications to the input source code after being processed through the intermediate form. The material in this chapter builds on material presented in the previous two chapters (the Writing a Source-to-Source Preprocessor chapter and the AST Processing chapter 7). This chapter presents the required AST Rewrite Traversal and the simple interface functions to the AST_Rewrite class. A section is included that demonstrates code that rewrites the AST for any input code. More complex examples are possible, but each uses the AST Rewrite Mechanism in a similar way. The ROSE Tutorial documents a few more interesting examples. 8.1 Introduction The rewrite mechanism in ROSE contains four different levels of interface within its design. Table 8.1 shows the different levels of the interface design for the ROSE rewrite mechanism. Each level consists of simple tree editing operations (insert(), replace(), and remove()) that can operate on statements within the AST. 8.2 Multiple Interfaces to Rewrite Mechanism There are four different levels of interfaces in the rewrite mechanism because there are many different program transformation requirements. Each level builds on the lower level, and the highest-level interface is the most sophisticated intern
95. changing either the example translators or the inputs to the examples. • PAPERS: ROSE/ROSE_RESEARCH_PAPERS.tar.gz. These are the current ROSE-related research papers. The ROSE project maintains a mailing list (casc-rose at llnl dot gov). The email list is only used to request help and announce internal and external releases. Anyone who would like to be on the email list can send an email to dquinlan at llnl dot gov. Those with LLNL internal access may check who is on the email list at ROSE Email List (current subscribers), LLNL internal URL link: cmg r Inl gov casc computing mailing_lists casc rose html. At some point the email list will be made more generally open to automated subscription. 2.2 ROSE Installation 2.2.1 Requirements and Options ROSE is general software, and we ultimately hope to remove any specific software and hardware requirements. However, our goal is to be specific about where and how ROSE is developed and where it is regularly tested. Required Hardware/Operating System: ROSE has been developed on Linux/Intel platforms. We have not yet addressed significant portability issues for ROSE, but EDG has addressed portability issues for their C++ front-end, and it is available on nearly all platforms (see www.edg.com for details). ROSE is currently developed on Linux/Intel platforms and works with all modern versions of the GNU compilers (3.4.x and later). ROSE also works on both 32-bit and 64-bit architectures, as well
96. checkers may be defined over the AST or any other graph built within ROSE to store program analysis information. See the Compass manual for more details on supported binary analysis. The ability to perform analysis of binary executables using Compass makes no assumptions that the executable is compiled with any specific options or that it contains debug information, symbols, etc. 12.5 Static Binary Rewriting As part of general research on transformations of binaries (separate from analysis), a number of techniques have been developed to support classes of transformations. This static rewriting of the binary permits the development of performance tools that could support the analysis and rewriting of binaries for support of High Performance Computing (HPC). A principal focus is on IBM BGL and Cray XT support (DOE Office of Science supercomputers). 12.5.1 Generic Section/Segment Modifications 1a. Section/Segment file address shifting (low level): The low-level movement of an ELF Section or Segment within the file address space is performed with SgAsmGenericSection::set_offset. It changes the location of the section in the file and updates all relative virtual addresses (RVAs) that were primarily associated with the moved section. The main problems with this function are that it doesn't take into account the file locations of other sections, the file alignment constraints of the moved section, or the memory mapping. Specifically, after calling this function t
97. ck of which is the defining declaration and which is the first nondefining declaration, the information about which is a forward declaration is somewhat redundant. The unparser can't just use the result of isForward(), since declarations can be shared. This would result in unparsing the class definition multiple times. Thus we separate the two concepts of defining and nondefining. Defining declarations are never shared (except through the definingDeclaration pointer); only nondefining declarations are shared (through the firstNondefiningDeclaration pointer). SAGE III contains a SgDeclarationStatement IR node from which all declaration IR nodes are derived (e.g., SgClassDeclaration, SgFunctionDeclaration, etc.). Contained in the SgDeclarationStatement IR node are pointers (accessed through corresponding get_ and set_ member functions, i.e., access functions) to the first declaration (called firstNondefiningDeclaration) and the defining declaration (called definingDeclaration). Both of these pointers are used internally when a pointer is required to a declaration (so that the same first declaration can be shared) and within the unparser (most importantly, to output the definition where it appeared in the original code). These pointers are initialized in the EDG/Sage interface code and are, in a few cases (redundant forward declarations, where only the first one is given a proper reference to the defining declaration), fixed up in the ROSE src frontend SageIII AstFix
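To make the defining/nondefining distinction concrete, here is a minimal plain C++ sketch (no ROSE headers involved). The names `Widget` and `describe` are hypothetical and chosen only for illustration. The comments mark which declarations would play the role of the first nondefining declaration and which the defining declaration; how ROSE wires its pointers for such code is described in the text above, not demonstrated here.

```cpp
#include <string>

// First nondefining (forward) declaration of the class.
class Widget;

// A redundant second forward declaration: legal C++, names the same class.
class Widget;

// Nondefining declaration of a function that only needs the incomplete type.
std::string describe(const Widget& w);

// The single defining declaration of the class.
class Widget {
public:
    explicit Widget(int id) : id_(id) {}
    int id() const { return id_; }
private:
    int id_;
};

// The defining declaration of the function.
std::string describe(const Widget& w) {
    return "Widget#" + std::to_string(w.id());
}
```

A source-to-source tool that emitted the class body once per forward declaration would produce a redefinition error here, which is why the definition must be output exactly once, at the defining declaration.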
98. ct the generated source code, including comments and CPP translator control directives (#include, #if, #else, #endif, etc.) and the original application's indentation and variable names, etc. ROSE is unique because it makes traditional compiler infrastructure accessible to library and tool developers who are not likely to have a significant compiler background. Still, some basic knowledge of an Abstract Syntax Tree (AST) is assumed and, unfortunately, currently required. Figure 1.1 shows the different phases of processing within ROSE.
1.7 ROSE Web Site
We have a ROSE Project Web page that can be accessed at http://www.rosecompiler.org. This site is updated regularly with the latest documentation and software as it is developed.
1.8 ROSE Software Documentation
ROSE is not yet released publicly on the Web, but is available within the SciDAC Performance Evaluation Research Center (PERC) project and through limited collaborations with the developers at universities and other laboratories. Since the spring of 2006, we have made ROSE available via a password-protected web page to all who have asked for access. More information is available on the Web pages located at http://www.rosecompiler.org. Web pages are updated regularly; postscript versions of documentation are available as well.
1.9 About This Manual
This section includes a description of what this manual provides, how to use the manual, and the terminology rela
99. d project. This example shows the setup required to use the AST Rewrite Mechanism. The next section shows how to add new code to the AST. The main function is as in the example of how to use a traversal (see chapter ). Note that the traversal required for use with the AST Rewrite Mechanism is different from the traversals associated with . The exact differences are enabled and disabled in example 8.4 by setting the macro USE_REWRITE_MECHANISM to zero (0) or one (1). The differences between traversals using AstTopDownBottomUpProcessing<InheritedAttribute, SynthesizedAttribute> and traversals using the AST Rewrite Mechanism (AST_Rewrite::RewriteTreeTraversal<InheritedAttribute, SynthesizedAttribute>) are all required to use the AST Rewrite Mechanism. They are:
1. InheritedAttribute must derive from AST_Rewrite::InheritedAttribute.
2. Must define the constructor InheritedAttribute::InheritedAttribute(SgNode* astNode).
3. Must define the copy constructor InheritedAttribute::InheritedAttribute(const InheritedAttribute& X, SgNode* astNode).
4. SynthesizedAttribute must derive from AST_Rewrite::SynthesizedAttribute.
5. Must derive the new traversal from AST_Rewrite::RewriteTreeTraversal<InheritedAttribute, SynthesizedAttribute> instead of AstTopDownBottomUpProcessing<InheritedAttribute, SynthesizedAttribute>.
8.5 Examples
This section presents several examples using th
100. de to the EDG front-end. You will not need a version of EDG directly from them.
CHAPTER 2. GETTING STARTED
configure --help Option Output (Part 1)
`configure' configures ROSE 0.9.3a to adapt to many kinds of systems.
Usage: sourcetree/configure [OPTION]... [VAR=VALUE]...
To assign environment variables (e.g., CC, CFLAGS...), specify them as VAR=VALUE. See below for descriptions of some of the useful variables.
Defaults for the options are specified in brackets.
Configuration:
  -h, --help              display this help and exit
      --help=short        display options specific to this package
      --help=recursive    display the short help of all the included packages
  -V, --version           display version information and exit
  -q, --quiet, --silent   do not print `checking...' messages
      --cache-file=FILE   cache test results in FILE [disabled]
  -C, --config-cache      alias for `--cache-file=config.cache'
  -n, --no-create         do not create output files
      --srcdir=DIR        find the sources in DIR [configure dir or `..']
Installation directories:
  --prefix=PREFIX         install architecture-independent files in PREFIX [/usr/local]
  --exec-prefix=EPREFIX   install architecture-dependent files in EPREFIX [PREFIX]
By default, `make install' will install all the files in `/usr/local/bin', `/usr/local/lib' etc. You can specify an installation prefix other than `/usr/local' using
101. declarations from the global scope and the insertion of the same variable declarations into the scope of the class declaration (the struct that holds the previously global variables). Since the comments and cpp directives were attached to the variable declaration, they got moved with the declaration into the new struct that holds them (see the example in the tutorial). I should have used the rewrite mechanism for removing and reinserting the variable declarations, since it is careful to disassociate and reassociate comments and cpp directives. In fact, it is quite incredible that I didn't use that slightly higher-level interface, because I wrote all that stuff several years ago and it was so much work to get it all correct. I'm a big believer in using the highest level of interface possible (which perhaps means I should document them better in the Web pages for the IR instead of just in the ROSE User Manual). The AST Rewrite Mechanism functions to use are LowLevelRewrite::remove(SgStatement* astNode) and LowLevelRewrite::insert(SgStatement* targetStatement, SgStatementPtrList newStatementList, bool insertBeforeNode). These will automatically disassociate any cpp directives and comments from the surrounding statements and reattach them so that they don't wander around with the statements being removed, inserted, or replaced. (Footnote 1: Check if this code is in our regression tests.) I will
102. dor compiler. It is not clear if ROSE needs to be so aggressive in eliminating template instantiations by the back-end vendor compiler, but doing so allows all template instantiations to be made available for transformation using ROSE. For simplicity, we only output (within code generation) those template instantiations that are required due to transformations, and allow the back-end compiler to generate as many of the required template instantiations as possible. In order for a template to be transformed, we must save it into the SAGE III AST. If it is a class template, then we only want to unparse it into the final transformed code if it was modified. Otherwise its member functions and static members will not be defined at link time. Fundamentally, specialization of a class disqualifies the instantiation of its member functions from the original template declaration, because the newly instantiated template class becomes a distinct and separate class no longer associated with the original template. The vendor compiler can generate code for the new template class using the original template declaration or the member functions associated with the original template declaration. All the functions must be generated to go along with the new specialized form of the templated class, which we had to specialize to permit it to be transformed. This potentially massive generation of all the member functions of a class applies only to transformations on class templa
103. e FEATURE no0 enable FEATURE ARG include FEATURE ARG yes enable dot2gml_translator Configure option to have DOT to GML translator built bison version specific tool disable dependency tracking speeds up one time build enable dependency tracking do not reject slow dependency extractors with ROSE_LONG_MAKE_CHECK_RULE yes specify longer internal testsing by make check rule disable xmltest Do not try to compile and run a test LIBXML program disable binary analysis tests Disable tests of ROSE binary analysis code enable edg union struct debugging Specify if EDG Union Struct debugging support is to be used with_Mesa_help string enable doxygen developer docs Enable display of internal project detail with Doxygen enable doxygen generate fast docs Enable faster generation of Doxygen documents using tag file mechanism to connect Sage III documentation to Rose documentation documentation not as presentable enable static PKGS build static libraries default no enable ltdl install install libltdl enable shared PKGS Figure 2 1 Example output from configure help in ROSE directory Part 1 2 6 GETTING A VERSION OF THE EDG LICENSE FOR RESEARCH USE configure help Option Output Part 2 build shared libraries default yes enable fast install PKGS optimize for f
104. e either because (a) the template declaration is of the class and the member function is declared in the class, or (b) the template declaration is of a member function of a templated class and is defined (declared) outside of the class. In this case, the member function can be for a template or non-template member function (but not both).
Calls via dereferencing of function pointers: Function calls from dereferencing pointers to functions can be represented with two different forms of syntax. For example, xPtr(); (*xPtr)(); appears in the AST, and in the unparsed (generated) code, as (*xPtr)(); (*xPtr)();
C++-style casts are normalized to C-style casts: EDG appears to normalize all C++-style casts to C-style casts. We are working on the analysis to back out where C-style casts could in fact be C++-style casts of a specific classification: const_cast, static_cast, dynamic_cast, and reinterpret_cast.
8. Floating-point literal normalization: Floating-point literals are internally represented in EDG as float, double, or long double (dependent on the type), thus the exact string representing the floating-point literal is lost. We have modified EDG to save the string representation (from the token stream) of the floating-point literal; this work is recent and handles all the different ways that floating-point literals can be expressed (even including h
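The two normalizations described above do not change program meaning, which the following plain C++ sketch illustrates (no ROSE involved). The function names `answer`, `callBothForms`, and `truncateBothWays` are hypothetical and used only for this illustration.

```cpp
// Hypothetical free function used only for this illustration.
int answer() { return 42; }

// Calls the same function through a pointer using both syntaxes.
// The implicit form xPtr() and the explicit form (*xPtr)() are equivalent,
// so normalizing one to the other preserves behavior.
int callBothForms() {
    int (*xPtr)() = &answer;
    int implicitForm = xPtr();     // call without explicit dereference
    int explicitForm = (*xPtr)();  // call with explicit dereference
    return implicitForm + explicitForm;
}

// A C-style cast and the corresponding static_cast yield the same value here;
// only the spelling in the source text differs.
int truncateBothWays(double value) {
    int cStyle = (int)value;
    int cxxStyle = static_cast<int>(value);
    return (cStyle == cxxStyle) ? cStyle : -1;
}
```

Because the generated code may use a different spelling than the input, a textual diff of the original and unparsed files can show differences in these spots even when no transformation was applied.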
105. e ROSE Tutorial for examples.
CFG: As used in ROSE, this is the Control Flow Graph, not Context Free Grammar or anything else.
EDG: Edison Design Group (the commercial company that produces the C and C++ front-end that is used in ROSE).
IR: Intermediate Representation (IR). The IR is the set of classes defined within SAGE III that allow an AST to be built to define any application in C, C++, and Fortran.
Query (as in AST Query): Operations on the AST that return answers to questions posed about the content or context in the AST.
ROSE: A project that covers both research in optimization and a specific infrastructure for handling large-scale C, C++, and Fortran applications.
Rosetta: A tool (written by the ROSE team) used within ROSE to automate the generation of the SAGE III IR.
SAGE++ and SAGE II: An older object-oriented IR upon which the API of the SAGE III IR is based.
Semantic Information: What abstractions mean (short answer). (This might be better as a description of what kind of semantic information ROSE could take advantage of, not a definition.)
Telescoping Languages: A research area that defines a process to generate domain-specific languages from a general-purpose language.
Transformation: The process of automating the editing (either reconfiguration, addition, or deletion, or some combination) of input application par
106. 5.6 Template Handling in C++
5.6.2 How Templates Affect the IR
5.6.4 Unparsing Templates
5.6.5 Templates Details
5.9 AST Normalization: Subtle Ways That ROSE Output Differs from the Original Source Code
5.10 Non-Standard Features
5.12 Comments About Declarations (Defining Declarations vs. Nondefining Declarations)
5.15 How to Control Language Specific Modes: C++, C, C99, UPC
5.15.2 Use *.c filename suffix to compile C language files
6 Query Library
6.1 Introduction
6.2.1 Interface Functions
6.4 User Defined Functions
7 AST Processing
7.1 Introduction
7.3 AstSimpleProcessing
7.3.1 Example
7.4 AstPrePo
107. e different interfaces to specify simple transformations.
CHAPTER 8. AST REWRITE MECHANISM
8.5.1 String Specification of Source Code
Both the mid-level and high-level interfaces use strings to specify source code. The examples below show how to specify the strings.
Specification of Source Code: Specification of source code is straightforward. However, quoted strings must be escaped, and strings spanning more than one line must use the string continuation character ("\").
• MiddleLevelRewrite::insert(statement, "int newVariable;", locationInScope);
• MiddleLevelRewrite::insert(statement, "timer(\"functionName\");", locationInScope);
• MiddleLevelRewrite::insert(statement, "/* Starting Comment */ \n int y; int y; for (y = 0; y < 10; y++) { z = 1; } \n /* Ending Comment */ \n", locationInScope);
Specification of CPP Directives: Specification of CPP directives as strings is as one would expect, except that where quotes appear in the string they must be escaped to remain persistent in the input string.
• MiddleLevelRewrite::insert(statement, "#define TEST", locationInScope);
• MiddleLevelRewrite::insert(statement, "#include<foo.h>", locationInScope);
• MiddleLevelRewrite::insert(statement, "#include \"foo.h\"", locationInScope);
Specification of Comments: Specification of comments is similar.
• MiddleLevelRewrite::insert(statement, "/* C style comment test */", locationInScope);
• MiddleL
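The escaping rules mentioned above are ordinary C++ string-literal rules, so they can be checked without ROSE. The sketch below only builds the strings; it does not call any rewrite interface. The function names are hypothetical and the source fragments are illustrative.

```cpp
#include <string>

// Inner quotes must be escaped so that they survive inside the literal.
// The resulting text is: timer("functionName");
std::string timerCallSource() {
    return "timer(\"functionName\");";
}

// A backslash immediately followed by a newline continues the literal on the
// next line; the two physical lines are spliced before the literal is parsed.
std::string multiLineSource() {
    return "int y; \
for (y = 0; y < 10; y++) { z = 1; }";
}

// Quotes inside a CPP directive need the same escaping.
// The resulting text is: #include "foo.h"
std::string includeDirectiveSource() {
    return "#include \"foo.h\"";
}
```

Note that the continuation character does not insert a newline into the string; a line break in the generated code still has to be written explicitly as `\n`.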
108. e first object in the results vector should be identical to the inserted one. Lines 27 and 28 modify the name and number fields in memory. The modify call in line 29 then updates the database by changing the existing row in the table and making the changes persistent. Line 32 is an exemplary call to delete a row of the table; the deletion uses the id of a row, so all other fields do not have to contain the same values as the row stored in the database. The insert statement in line 35 simply inserts the row just deleted into the table again, leaving the test table in a different state. Hence, executing the example program multiple times should fill the test table with multiple rows. In line 37 the connection to the database is closed. Try adding a call to GlobalDatabase DEBUG_dump before the shutdown function call and running the program multiple times to see how the automatic id assignment works.
9.2.2 Working With the Predefined Tables
While the first database example worked on a self-defined table, this tutorial explains how to use one of the tables that are predefined for usage within ROSE. Its source code is shown in Figure 9.2. These tables are easier to use because their structure is already defined in the TableDefinition.h file. Lines 6 and 7 define the tables used for storing information about projects and files in ROSE using the macros DEFINE_TABLE_PROJECTS and DEFINE_TABLE_FILES. These macros call the corresponding macros from the previous exa
109. e found in test2006_95.C.
5.10 Non-Standard Features: C++ Extensions That We Are Forced to Handle
Philosophically, I don't think much of language extensions: we don't add any, we don't think we should add any, and we would not trust ourselves to add them correctly. That having been said, there are a few C++ extensions that are introduced by EDG (only one that I know of) and a fair number by g++. Because in many cases these features are implemented differently, we find them all worth avoiding. However, some applications use them, so we are somewhat forced to support them and handle the differences between how they are supported within both the EDG front-end and the back-end compiler (most often GNU g++). We list specific non-standard features of C++ that we are forced to handle because applications we compile mistakenly use them. One non-standard feature that requires special handling in ROSE is the in-class initialization of static const non-integer types. In-class initialization refers to code such as:
class X {
public:
  static const int integerValueConstant = 42; // Legal C++ code
  static const int integerValueConstant = 42; // Legal C++ code
  static const bool booleanValueConstant = true; // Legal C++ code
  static const char charValueConstant = '\0'; // Legal C++ code
  // Illegal C++ code (non-standard, does not compile with EDG, but does with g++)
  static const double doubleValueConstant1 = 3.14;
  // Illegal C++ code (non-standard, but comp
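For comparison, the sketch below shows standard-conforming ways to write such constants, so that application code does not depend on the extension. It is an illustration, not part of the manual's test code: the member names other than those quoted above are hypothetical, and the `constexpr` form assumes a C++11 (or later) compiler.

```cpp
class X {
public:
    // Integral and bool constants may be initialized in-class in standard C++.
    static const int integerValueConstant = 42;
    static const bool booleanValueConstant = true;
    static const char charValueConstant = '\0';

    // Portable pre-C++11 form for a non-integral constant:
    // declare it here and define it once outside the class.
    static const double doubleValueConstant;

    // C++11 and later: constexpr permits an in-class initializer for double.
    static constexpr double doubleValueConstexpr = 3.14;
};

// The single out-of-class definition (normally placed in one source file).
const double X::doubleValueConstant = 3.14;
```

Either form avoids the `static const double ... = 3.14;` in-class initializer that the text identifies as non-standard.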
110. e from the AST and compiles it using the associated vendor compiler (the backend compiler).
3.3 Compiling a Translator
We can use the following makefile to build this translator, which we will call exampleMakefile to avoid name collisions within the build system's Makefile. In this case the test code and makefile have been placed into the following directory: CompileTree/ExampleTranslators/DocumentedExamples/SimpleTranslatorExamples. The makefile exampleMakefile is also there. To compile the test application, type make -f exampleMakefile. This builds an example translator and completes the demonstration of the build process, a process much like what the user can create using any directory outside of the ROSE compile tree.
3.4 Running the Processor
This section covers how to run the translator that you built in the previous section. Translators built with ROSE can be handed several options; these are covered in subsection 3.4.1. The command line required for the example translator is presented in subsection 3.4.2. Example output from a translator is presented in a subsequent subsection.
3.4.1 Translator Options Defined by ROSE
The details of these options can be obtained by using the --help option on the command line when executing the translator. For example, using the example translator from the previous section, type exampleTranslator --help. The accompanying figure shows the output from the --help option.
3.4.2 Command Line for ROSE Translators
111. e if the user of qm.sh should specify them. qm.sh also mimics the compile-line process of the project's build system so that all dependencies are built as normal by the backend compiler.
14.2 Usage
qm.sh <f|o> <QMTest test class> <ROSE translator> <Backend compiler> compiler arguments ROSE arguments
<f|o>: The output file naming mode. Option f specifies that qm.sh use source file names in naming output *.qmt files. Option o specifies that qm.sh use object file names, as specified by the compile line -o flag to the backend compiler, for naming *.qmt output files.
<QMTest test class>: The test class of the created test file, i.e., rose.RoseTest or command.ExecTest.
<ROSE translator>: The full path specifying a ROSE translator.
<Backend compiler>: The name of the backend compiler used in the normal compilation of the project build system. Specify NULL as the <Backend compiler> if you want to skip backend compilation.
compiler arguments: The arguments specified on the command line of the project build system.
ROSE arguments: The arguments to the ROSE translator, prefixed with rose <ROSE argument> (e.g., rose edg no_warnings). Note: these may be placed anywhere after the <Backend compiler> argument.
14.3 Variables
COUNT: The for-loop counter; keeps track of the number of qm.sh arguments.
FLAG: Logical flag variable used in naming
112. e language. In contrast, translators process the input language with precision identical to a compiler. Since ROSE helps build source-to-source translators, we resist calling the translators compilers, since the output is not machine code. This point is not a required part of the definition of a compiler; many language compilers use a particular language as an assembly-language level (typically C). These are no less a compiler. But since we do source-to-source, we feel uncomfortable with calling the translators compilers (the output language is typically the same as the input language). The point is further muddled since it is common in ROSE to have a translator hide the call to the vendor's compiler, and thus the translator can be considered to generate machine code. But this gives little credit to the vendor's compiler. So we prefer to refer to our work as a tool (or framework) for building source-to-source translators.
2. What does the output from a ROSE translator look like? A great deal of effort has been made to preserve the quality of your original code when regenerated by a translator built using ROSE. ROSE preserves all formatting, comments, and preprocessor control structure. There are examples in the ROSE Tutorial that make this point clear.
3. How do I debug my transformation? There are a couple of ways to debug your transformation, but in general the process starts with knowing exactly what you want to accomplish. An example of your tran
113. e name. Template parameters.
• SgTemplateInstantiationDecl (may be renamed to SgTemplateInstantiationClassDeclaration): Derived from SgClassDeclaration. Reference to SgTemplateDeclaration. Template arguments.
• SgTemplateInstantiationFunctionDecl: Derived from SgFunctionDeclaration. Reference to SgTemplateDeclaration. Template arguments.
• SgTemplateInstantiationMemberFunctionDecl: Derived from SgMemberFunctionDeclaration. Reference to SgTemplateDeclaration. Template arguments.
• SgTemplateInstantiationDirective: This forces the explicit instantiation of the specified template when and where it appears in the source code.
Nodes not added include (a judgement call for now): SgTemplateClassDeclaration, SgTemplateFunctionDeclaration, SgTemplateMemberFunctionDeclaration, SgTemplateDataMemberDeclaration.
There are many types of template declarations; at present there is an enum type which identifies each category of template declaration. The enum type is:
enum template_type_enum { e_template_none, e_template_class, e_template_m_class, e_template_function, e_template_m_function, e_template_m_data };
A data member of this type is held in the SgTemplateDeclaration. We might have to distinguish between template member functions and member functions of template classes, so that we can exclude instantiation of template member functions separately from
114. e of user-defined analysis data. Such database support is one way to handle global analysis (the other way is to build the whole application AST). Visit www.sqlite.com for details and to download software. An example showing the use of the ROSE database mechanism is in the ROSE Tutorial. Use of SQLite requires special ROSE configuration options so that the SQLite library can be added to the link line at compile time. See the ROSE configuration options for more details (configure --help).
mpicc: mpicc is a compiler for MPI development. If ROSE is configured with MPI enabled, one can utilize features in ROSE that allow for distributed parallel AST traversals.
2.2.2 Building BOOST
The following is a quick guide on how to install BOOST. For more details, please refer to www.boost.org.
1. Download BOOST: Download BOOST at www.boost.org/users/download.
2. Untar BOOST: Type tar zxf BOOST-VersionNumber.tar.gz to untar the BOOST distribution.
3. Create a separate compile tree: Type mkdir compileTree to build a location for the object files and documentation (use any name you like for this directory, e.g., BOOST_BUILD).
4. Create a separate install tree: Type mkdir installTree to create a location for the install files to reside (e.g., BOOST_INSTALL).
5. Change directory to the new compile tree directory: Type cd compileTree. This changes the current directory to the newly created directory.
6. Run the confi
115. e previous example in combination with a AST traversal to build a simple callgraph 9 2 DATABASE SUPPORT FOR GLOBAL ANALYSIS 105 Database Connection Example database access include GlobalDatabaseConnection h include TableAccess h CREATE TABLE2 testtable string name double number DEFINE_TABLE 2 testtable string name double number int main int argc char xargv GlobalDatabaseConnection db int initOk db initialize assert initOk 0 TableAccess lt testtableRowdata gt testtable amp db testtable initialize add a row testtableRowdata testrow UNKNOWND name 1 0 testtable insert amp testrow select amp modify vector lt testtableRowdata gt results testtable select number 1 0 assert results size gt 0 results 0 set_name string results 0 set_number 2 0 testtable modify amp results 0 this uses the ID of the row newname remove entry testtable remove amp results 0 add again for next run testtable insert amp results 0 db shutdown return 0 Figure 9 1 Source code for the database connection example 106 CHAPTER 9 PROGRAM ANALYSIS Table Creation Example include lt iostream gt a atabase access include GlobalDatabaseConnection h include TableDefinitions h DEF
116. eTwoParameters, FunctionDeclarationFromDefinition, ClassDeclarationFromName, ClassDeclarationsFromTypeName, PragmaDeclarationFromName, VariableDeclarationFromName.
6.4 User Defined Functions
Both C-style functions and C++-style functionals can be used for the user-defined query functions. The C++-style functionals can be used together with powerful concepts like std::bind etc. to make the interface very flexible. An example functional is:
class DefaultNodeFunctional : public std::unary_function<SgNode*, std::list<SgNode*> > {
public:
  result_type operator()(SgNode* node) {
    result_type returnType;
    returnType.push_back(node);
    return returnType;
  }
};
For the legacy C-style interface there are two types of functions:
typedef std::list<SgNode*> (*roseFunctionPointerOneParameter)(SgNode*);
typedef std::list<SgNode*> (*roseFunctionPointerTwoParameters)(SgNode*, SgNode*);
The second function allows a user-defined second parameter, which can be provided to the interfaces directly. This parameter has no side effect outside the user-defined function. For the querySubTree, the second parameter to the interface will be the parameter to the user-defined function, but for the memory pool traversal and the query of a node list, the first parameter will be the second parameter to the user-defined function.
6.5 Name Queries
The name query provides exactly the same interfaces as the NodeQuery except for two differences: the user
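The difference between the two styles of user-defined query can be sketched without ROSE. Everything below is a stand-in: `MockNode`, `NodeList`, `FunctionPointerOneParameter`, `SelectKind`, and `querySubTreeSketch` are hypothetical names, and the traversal is a simple pre-order walk rather than the query library's actual implementation. The point is only that a plain function and a functional carrying state can be passed through the same interface.

```cpp
#include <list>
#include <vector>

// Hypothetical stand-in for an AST node; not the ROSE SgNode class.
struct MockNode {
    int kind;
    std::vector<MockNode*> children;
};

typedef std::list<MockNode*> NodeList;

// C-style interface: a pointer to a function taking one node.
typedef NodeList (*FunctionPointerOneParameter)(MockNode*);

// A C-style query function: selects nodes whose kind is even.
NodeList selectEvenKind(MockNode* node) {
    NodeList result;
    if (node->kind % 2 == 0) result.push_back(node);
    return result;
}

// A C++-style functional: carries its own state (the wanted kind),
// which a bare function pointer cannot do.
struct SelectKind {
    int wanted;
    explicit SelectKind(int k) : wanted(k) {}
    NodeList operator()(MockNode* node) const {
        NodeList result;
        if (node->kind == wanted) result.push_back(node);
        return result;
    }
};

// Minimal pre-order subtree query accepting either style of callable.
template <typename Query>
NodeList querySubTreeSketch(MockNode* root, Query query) {
    NodeList result = query(root);
    for (MockNode* child : root->children) {
        NodeList sub = querySubTreeSketch(child, query);
        result.splice(result.end(), sub);
    }
    return result;
}
```

With a root of kind 2 whose children have kinds 2 and 3, `querySubTreeSketch(&root, selectEvenKind)` collects the two even nodes, while `querySubTreeSketch(&root, SelectKind(3))` collects only the odd child.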
117. ect; he made significant contributions while employed at Lawrence Livermore National Laboratory (LLNL), including the AST traversal mechanism. We continue to work with Markus, who is now at Vienna University of Technology as an Associate Professor. We were also fortunate to leverage a significant portion of Qing Yi's thesis work (under Ken Kennedy), and we would like to thank her for that work and the work she did as a post-doc at Lawrence Livermore National Laboratory. We continue to work with her, although she is now at the University of Texas at San Antonio. There are many additional people to thank for our current status in the ROSE project:
Contributing Collaborators: Markus Schordan (Vienna University of Technology), Rich Vuduc (Georgia Tech), and Qing Yi (University of Texas at San Antonio).
Post-docs (including former post-docs): Chunhua Liao (from University of Houston), Thomas Panas (from Vaxjo University, Sweden), Markus Schordan (from University of Klagenfurt, Austria), Rich Vuduc (from University of California at Berkeley), Jeremiah Willcock (from Indiana University), and Qing Yi (from Rice University).
Students: Gergo Barany (Technical University of Vienna), Michael Byrd (University of California at Davis), Gabriel Coutinho (Imperial College London), Peter Collingbough (Imperial College London), Valentin David (University of Bergen, Norway), Jochen Haerdtlein (University of Erlanger, Germany), Vera Hauge (University of Oslo, Norway)
118. ed from outreach.scidac.gov/projects/rose. This is the most typical way that users will see and work with ROSE. But it is less up to date compared to development versions.
External Development Version: It is available from the SciDAC Outreach Center's subversion repository. We put a subset (excluding the EDG part, essentially) of the internal developer version of ROSE into the external repository to enable people to have quick access to the most recent new features in ROSE. The external repository is synchronized with the internal repository once a day in ideal conditions. Several branches also exist to accept contributions from external collaborators.
Internal Development Version: Only available directly from LLNL's internal Subversion (SVN) repository. The details of building this version are located in the Appendix of the Manual.
g++ version > 3.4.x: In order to use OpenMP or gFortran, g++ > 4.2.x is required.
BOOST version > 1.35.0: Visit www.boost.org for more details about BOOST and www.boost.org/users/download for download and installation instructions.
JAVA version > 1.5.0_11: A SUN Java virtual machine (JVM) is needed. A Java compiler (JDK) is also required for development versions.
Autoconf version > 2.59: Needed ONLY for development versions. Autoconf is an extensible package of M4 macros that produce shell scripts to automatically configure software source code packages.
Automake version > 1
119. ed to it by name by defining a unique name (string) for the attribute. The user needs to implement a class that inherits from AstAttribute. Instances of this class can be attached to an AST node by using member functions of SgNode::attribute. Example (let node be a pointer to an object of type SgNode):
class MyAstAttribute : public AstAttribute {
public:
  MyAstAttribute(int v) : value(v) {}
private:
  int value;
};
node->attribute.setAttribute("mynewattribute", new MyAstAttribute(5));
Using this expression, an attribute with name mynewattribute can be attached to the AST node pointed to by node. Similarly, the same attribute can be accessed by name using the member function getAttribute:
MyAstAttribute* myattribute = node->attribute.getAttribute("mynewattribute");
AST attributes can be used to combine the results of different processing phases. Different traversals that are performed in sequence can store and read results to and from each node of the AST. For example, the first traversal may attach its results for each node as attributes to the AST, and the second traversal can read and use these results.
7.10 Conclusions
All AST Processing classes provide similar interfaces that differ only by the attributes used. AST node attributes can be used to attach data to each AST node and to share information between different traversals. Additional examples for traversal, attributes, pdf, and dot output can be found in
• ROSE exampleTranslato
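The idea of two traversals communicating through named node attributes can be sketched in self-contained C++. The sketch does not use the ROSE classes: `AttributeSketch`, `DepthAttribute`, `NodeSketch`, `annotateDepth`, and `maxAnnotatedDepth` are hypothetical stand-ins, and the name-to-attribute map is only an assumption about how such a mechanism might be organized.

```cpp
#include <algorithm>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class playing the role of an attribute base type.
struct AttributeSketch {
    virtual ~AttributeSketch() = default;
};

// A user-defined attribute holding one integer result.
struct DepthAttribute : AttributeSketch {
    explicit DepthAttribute(int d) : depth(d) {}
    int depth;
};

// Hypothetical node with a name-indexed attribute store.
struct NodeSketch {
    std::vector<NodeSketch*> children;
    std::map<std::string, std::unique_ptr<AttributeSketch>> attributes;

    void setAttribute(const std::string& name, std::unique_ptr<AttributeSketch> a) {
        attributes[name] = std::move(a);
    }
    // Returns nullptr when no attribute with that name is attached.
    AttributeSketch* getAttribute(const std::string& name) const {
        auto it = attributes.find(name);
        return it == attributes.end() ? nullptr : it->second.get();
    }
};

// First traversal: attach each node's depth as an attribute named "depth".
void annotateDepth(NodeSketch* node, int depth) {
    node->setAttribute("depth", std::unique_ptr<AttributeSketch>(new DepthAttribute(depth)));
    for (NodeSketch* child : node->children) annotateDepth(child, depth + 1);
}

// Second traversal: read the stored attributes and compute the maximum depth.
int maxAnnotatedDepth(const NodeSketch* node) {
    int best = 0;
    if (auto* d = dynamic_cast<DepthAttribute*>(node->getAttribute("depth"))) best = d->depth;
    for (const NodeSketch* child : node->children) best = std::max(best, maxAnnotatedDepth(child));
    return best;
}
```

The second pass never recomputes depths; it relies entirely on what the first pass stored under the agreed name, which is the sharing pattern the text describes.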
120. [Figure 10.2 panels: (a) original code; (b) after loop interchange; (c) after blocking row dimension]
Figure 10.2: Optimizing non-pivoting LU. In (b), the k(s1) loop is fused with the j(s2) loop and the fused loop is then put at the outermost position, achieving a combined interchange and fusion transformation; the code in (c) achieves blocking in the row dimension of the matrix through combined interchange, fusion, and tiling transformations.
the package to optimize complex, non-perfectly nested loop structures. Although the original loops in (a) are not perfectly nested, the package recognizes that the k(s1) loop (the k loop surrounding statement s1) can be recombined with the loop j(s2) and that the recombined loop can then be placed outside of the original k(s2) loop. The transformed code in (b) simultaneously achieves two effects: the fusion of the k(s1) loop with the j(s2) loop, and the interchange of the k(s2) loop with the j(s2) loop. Section 10.3.2 explains this combined interchange and fusion transformation in more detail. The code in (b) can further be blocked, and the result is shown in (c). Figure 10.3 illustrates
121. en the default value with a new location. To change the location, you need to have used the --prefix=install_dir option to run the configure script. You can rerun the configure script without rebuilding ROSE. 2.2.7 Testing ROSE A set of test programs is available. Type make check to run your build version of ROSE using these test codes. Several years of contributed bug reports and internal test codes have been accumulated in the ROSE/tests directory. 2.2.8 Getting Help You may use the following mailing list to ask for help from the ROSE development team: casc rose dot lnl dot gov. 2.3 Building Translators Using ROSE At this point you should have installed ROSE. For examples of ROSE translators, see the ROSE-0.9.3a-Tutorial.tar.gz and the examples in the ROSE/tutorial directory. 2.4 Robustness of ROSE A significant focus of the ROSE project is on the robustness of the software supporting our project. We have based the C and C++ support upon the use of the EDG frontend, the same commercial quality frontend used by most commercial C++ compilers. ROSE is a research project at a Department of Energy (DOE) national laboratory. As such, it must handle DOE laboratory applications that scale to a million lines of code or more. ROSE is not an academic research project, nor is it a commercial product. This section will lay out what we do to test ROSE, what parts we consider to be robust, and exactly what we mean by robust. 2.4.1 How We Test ROSE ROS
122. er functions forming SAGE III AST from various EDG IL entries. It is derived from EDG's il_to_str.c. For the SgVarArgOp example, the following EDG/SAGE connection code is needed in sage_gen_be.C: a_SgExpression_ptr sage_gen_expr(an_expr_node_ptr expr, a_boolean need_parens) { ... case eok_va_arg: sageType = sage_gen_type(expr->type); sageLhs = sage_gen_expr_with_parens(operand_1, NULL); if (isSgAddressOfOp(sageLhs) != NULL) sageLhs = isSgAddressOfOp(sageLhs)->get_operand(); else sageLhs = new SgPointerDerefExp(sageLhs, NULL); ... result = new SgVarArgOp(sageLhs, sageType); goto done_with_operation; ... } 9. Modify the unparser to have whatever code you want generated in the final code generation step of the ROSE source-to-source translator. The source files of the unparser are located at src/backend/unparser. For SgVarArgOp, it is unparsed by the following function in src/backend/unparser/CxxCodeGeneration/unparseCxx_expressions.C: void Unparse_ExprStmt::unparseVarArgOp(SgExpression* expr, SgUnparse_Info& info) { SgVarArgOp* varArg = isSgVarArgOp(expr); SgExpression* operand = varArg->get_operand_expr(); SgType* type = varArg->get_type(); curprint("va_arg("); unparseExpression(operand, info); curprint(","); unp->u_type->unparseType(type, info); curprint(")"); } 16.6 Separation of EDG Source Code from ROSE Distribut
123. er locality, this package uses dependence hoisting to achieve three loop transformations: loop fusion, interchange, and blocking. It uses a construct, computation slice (or simply slice), to encode the input information necessary to perform each dependence hoisting transformation. For example, for the dependence hoisting transformation on the non-pivoting LU code from Figure 10.2 (a) to (b), the computation slice contains two loops: k(s1) and j(s2). Each computation slice must be valid, in that the corresponding dependence hoisting transformation does not reverse any dependence direction of the original program. To model the memory performance of applications, this package associates each computation slice with a floating point number, which defines the number of array references that can be reused at each iteration of the slice, that is, the number of references that can be reused when the loops in the slice are placed at the innermost position of a loop structure [30]. Here the floating point number is necessary to model the spatial reuses resulting from references residing in the same cache line, where on average less than one reference could be reused at each iteration of the slicing loops (loops in the computation slice). These floating point numbers provide the data reuse information of computation slices to the transformation framework, which then uses the information to guide loop interchange, fusion, and blocking transformations. Using the data reuse
124. ernative is to just list multiple source files on the command line; however, this is not a meaningful solution for applications containing hundreds or thousands of files. C++ template details are addressed by having each file instantiate all the templates that it requires, and then we record which of these are used by the file. All used instantiated templates are represented as specialized templates in the AST, and any transformed instantiated (specialized) templates are output as template specializations; else the backend compiler is used to instantiate the required templates, so that we can reduce the code generation required. Filtering: say you have a program like #include <iostream> f() { cout << x; } I assume you don't save all the stuff in iostream and included headers in your fact database; it'll be huge then. If not, however, you cannot simply discard all stuff from system headers, since the user code may refer to them (like you need the def of std::cout in the example above). How do you handle this? There also were some remarks in the docs about something like sharing of semantically identical declarations that occur in different parts of the code. Like if you have n declarations of int, you would only store one. Is this done within a translation unit or across translation units? In the file containing a CPP include directive, the generated file will be essentially identical, i.e. with the CPP include directive
125. es 2298 memory consumption 101112 node SgVarRefExp
AST Memory Pool Statistics numberOfNodes 2195 memory consumption 114140 node SgSymbolTable
AST Memory Pool Statistics numberOfNodes 2072 memory consumption 721056 node SgMemberFunctionDeclaration
AST Memory Pool Statistics numberOfNodes 1668 memory consumption 400320 node SgVariableDeclaration
AST Memory Pool Statistics numberOfNodes 1667 memory consumption 393412 node SgVariableDefinition
AST Memory Pool Statistics numberOfNodes 1579 memory consumption 101056 node SgMemberFunctionType
AST Memory Pool Statistics numberOfNodes 1301 memory consumption 31224 node SgTemplateSymbol
AST Memory Pool Statistics numberOfNodes 1300 memory consumption 364000 node SgTemplateDeclaration
AST Memory Pool Statistics numberOfNodes 1198 memory consumption 455240 node SgTemplateInstantiationMemberFunctionDecl
AST Memory Pool Statistics numberOfNodes 1129 memory consumption 54192 node SgIntVal
AST Memory Pool Statistics numberOfNodes 1092 memory consumption 56784 node SgAssignInitializer
AST Memory Pool Statistics numberOfNodes 1006 memory consumption 52312 node SgExpressionRoot
126. es C AST fix-up phase. They are handy in transformations since they simplify how one can find a declaration and the definition, if it is required. 5.13 Mangled Names and Qualified Names Several C++ constructions (IR nodes) have qualified names. These are used to specify the location of the construct within the space of names (we have avoided calling the space of names the namespace, since that is a specific C++ construct presented by the C++ program). Note that none of the get_mangled() functions are called within the EDG/Sage translation (I think). At least none are called directly. IR nodes that contain a get_qualified_name() member function are: • SgEnumDeclaration • SgTypedefDeclaration • SgTemplateDeclaration • SgNamespaceDeclarationStatement • SgClassDeclaration • SgTemplateInstantiationDecl • SgMemberFunctionDeclaration • SgScopeStatement • SgGlobal • SgBasicBlock • SgNamespaceDefinitionStatement • SgClassDefinition • SgTemplateInstantiationDefn • SgNamedType Mangled names are a mechanism to build unique mappings to functions, classes, and any other constructs that could be identified using a non-unique string. Mangled names should include the qualified names of any scopes in which they are contained. IR nodes that contain a get_mangled_name() member function are: • SgInitializedName • SgStatement (all derived classes) Note that mangled names include parts that represen
127. essor will be provided within ROSE. The package distribution within ROSE also includes a loop optimization tool called LoopProcessor, which automatically transforms the Fortran loops in C/C++ applications for better performance. In addition, the package also provides two levels of internal user interfaces: one for end users that intend to apply this package [Figure code listing, garbled in extraction: (a) original code]
128. esulting RowdataInterface class will contain standard functions to retrieve information about the table or its columns. An instance of this class has all private member variables to store the data of a single row of the table. Furthermore, it has get_[column X name]() functions together with the corresponding set_[column X name](value) functions to modify the values. By convention, tables used in ROSE will have one column more than specified, hence n+1 in total. The first column, which is always added, is a column of type int with the name id. This is used to easily identify all rows of a table. RowdataInterface classes used as template argument with a TableAccess class are required to have an id column. The class created by CREATE_TABLE will be called [tablename]Rowdata, where tablename is the first argument for the CREATE_TABLE macro call. The DEFINE_TABLE call is necessary to define global and static member variables of the RowdataInterface class. It has to be called once in a project (e.g., in the source file containing the main function) with exactly the same parameters as the CREATE_TABLE call. Thus lines 6 and 7, together with lines 16 and 17, define the test table as having three columns: an integer id column, a name column storing a string, and finally a third column, number, storing a double precision floating point number. The initialize call in line 17 will ensure the
129. evelRewrite insert statement C comment test locationInScope Specification of Macros The specification of macros is similar to CPP directives except that longer macros often have line continuation and formatting We show how to preserve this in the example macro definition below Transformation involving the use of a macro is more complex if the macro call is to be preserved in the final transformation left unexpanded in the generation of the AST fragment with the rewrite mechanism 8 6 EXAMPLE USING AST REWRITE 97 Macro Definition A macro definition is similar to a CPP directive The long example is taken from the Tuning Analysis Utilities TAU project which instruments code with similar macro calls e MiddleLevelRewrite insert statement include lt foo h gt locationInScope e MiddleLevelRewrite insert statement include foo h locationInScope e MiddleLevelRewrite insert statement define PRINT MACRO name name locationInScope e MiddleLevelRewrite insert statement n ifdef USE_ROSE n If using a translator built using ROSE process the simpler tauProtos h header n file instead of the more complex TAU h header file until ROSE is more robust n include tauProtos h n n This macro definition could be placed into the tauProtos h header file n define TAU_PROFILE name type group n static TauGroup_t tau_gr group n static FunctionInfo tauFI name type tau_gr
130. exidecimal representation of floating point literals The value as a float double or long double is also stored explicitly in the AST to simplify forms of analysis Constant folded values are stored in the AST as well with full unfolded constant expressions output in the generated code by default to reproduce the original source code as much as possible 9 Normalization of member access from a pointer Member function access can be represented with two different forms of syntax For example xPtr gt foo xPtr f00 appears in the AST and in the unparsed generated code as xPtr gt foo xPtr gt foo The following code is normalized differently and somewhat inconsistently xPtrPtr gt fo0 xPtrPtr fo0 appears in the AST and in the unparsed generated code as xPtrPtr foo xPtrPtr foo when operators are explicitly defined by the user as in class A public AO AC int x int y int amp operator int i A operator gt const return Aptr A amp operator const return Aptr Ax Aptr A Aptrptr 3 The following code is normalized differently and somewhat inconsistently Aa A aptr da A aptrptr amp aptr aptr gt operator 1 aptr 1 aptrptr gt operator 1 aptrptr 1 aptr gt Aptr gt operator 1 aptr gt Aptrptr gt operator 1 58 10 11 12 13 CHAPTER 5 SAGE
131. f the typedef type can't be referenced except through its tag name. However, since there are subtle ways in which the tag name is not a type name in C (it requires the struct keyword), this could be an issue for C. I have not isolated a code to demonstrate this as a problem. Thus, within ROSE, code such as: typedef enum enumType { zero = 0, one, two }; is normalized to be: enum enumType { zero = 0, one, two }; This is demonstrated in test2005_188.C. 18. Packing pragmas (#pragma pack normalizations) The use of packing pragmas is handled separately from other pragmas within ROSE. Most pragmas are strings and no special processing is done internally. Packing pragmas assume a stack-based semantics and allow: #pragma pack(n): sets packing alignment to value n (1, 2, 4, 8, 16; powers of 2). #pragma pack(push, n): push previous packing alignment value and set new value to n. #pragma pack(pop): use previously pushed value of packing alignment. #pragma pack(push) #pragma pack(n) #pragma pack(pop): alternative to #pragma pack(push, n) and #pragma pack(pop). #pragma pack(): resets to packing alignment selected by compiler (default value). ROSE will normalize this to explicit packing pragmas for each structure, translating the pack(push, n) and pack(pop) to explicit values (using pack(n)). The reason this is done is that EDG stores the packing alignment values directly w
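The stack-based semantics of the packing pragmas listed in chunk 131 can be checked with a small compile-time sketch. It assumes a compiler that supports these pragmas (GCC, Clang and MSVC do); the struct names and helper functions are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Layout with the compiler's default packing alignment.
struct DefaultLayout { char c; int i; };

// Stack form: save the current alignment, switch to 1, then restore it.
#pragma pack(push, 1)
struct PackedWithPush { char c; int i; };
#pragma pack(pop)

// Explicit form: set the alignment to 1, then reset to the compiler default.
#pragma pack(1)
struct PackedExplicit { char c; int i; };
#pragma pack()

// Declared after both regions, so the default alignment applies again.
struct AfterRestore { char c; int i; };

// With alignment 1 no padding is inserted between the members.
inline std::size_t packedSize() { return sizeof(PackedWithPush); }

// The push/pop form and the explicit form yield the same layout.
inline bool formsAgree() {
    return sizeof(PackedWithPush) == sizeof(PackedExplicit);
}

// pack(pop) and pack() both bring back the default layout.
inline bool defaultRestored() {
    return sizeof(AfterRestore) == sizeof(DefaultLayout);
}
```

The equality of the two packed layouts is the property that makes a rewrite from push/pop pairs to explicit pack(n) values behavior-preserving for a single structure.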
132. f sharing is often encountered in ELF. 3. Two objects have independent strings and one is an ending substring of another (e.g., "main" and "domain"). Changing one string does not affect the other. This kind of sharing is also common in ELF. 5. String table internal holes: If a sequence of bytes in a string table is not referenced by anything known to the parser, then those bytes are marked as internal holes and are prevented from moving with respect to the beginning of the string table. Internal holes are not placed on the string table free list (because something we didn't parse might be pointing to them). The internal holes are available with SgAsmGenericSection::congeal. 6. Reallocation of all strings: A string table can be repacked by freeing all its strings and then reallocating. We can reallocate around the internal holes or through the internal holes. strtab.free_all_strings(); /* free_all_strings(true) blows away internal holes */ strtab.reallocate(); The ELF allocator will do its best to overlap storage (e.g., "domain" overlaps with "main"). 7. Deletion of a string: A string is deleted by changing its value to the empty string. 8. Stored strings vs. non-stored strings: If a string value has storage space in a file (such as an ELF Section name), then it's an instance of SgAsmStoredString. Otherwise the string is either an std::string or SgAsmBasicString. SgAsmBasicString and SgAsmStoredString both der
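The storage overlap mentioned in chunk 132 ("domain" overlapping with "main") can be sketched with a minimal NUL-terminated string table. The StringTable class below is hypothetical and is not the SgAsm interface; it only shows why an ending substring can reuse the bytes of a longer string that is already stored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical mini string table; NOT the ROSE SgAsm API.
class StringTable {
public:
    // Returns the byte offset of a NUL-terminated copy of s. If s already
    // appears as the tail of a stored string, that storage is reused.
    std::size_t add(const std::string& s) {
        const std::string needle = s + '\0';  // match must include the terminator
        std::size_t pos = bytes_.find(needle);
        if (pos != std::string::npos) return pos;
        pos = bytes_.size();
        bytes_ += needle;
        return pos;
    }

    // Reads the NUL-terminated string that starts at the given offset.
    std::string get(std::size_t offset) const {
        return std::string(bytes_.c_str() + offset);
    }

    // Total number of bytes used, terminators included.
    std::size_t size() const { return bytes_.size(); }

private:
    std::string bytes_;  // raw table contents with embedded NUL bytes
};
```

This sketch only finds an overlap when the longer string is added first; an allocator that wants the best packing would typically place longer strings before their suffixes, which is an assumption about strategy rather than a statement about how ROSE does it.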
133. fIncludeRegion int y_FileScope_TopOfIncludeRegion int y_FunctionScope_TopOfIncludeRegion int y NestedConditionalScope TopOfIncludeRegion int y NestedLoopScope_TopOfIncludeRegion int y_ ParentScope_TopOfIncludeRegion int y_ LocalScope_TopOfIncludeRegion include lt stdio h gt int y_ LocalScope_BottomOfIncludeRegion int y_ ParentScope_BottomOfIncludeRegion int y NestedLoopScope_BottomOfIncludeRegion int y NestedConditionalScope BottomOfIncludeRegion int y_ FunctionScope_BottomOfIncludeRegion int y_FileScope_TopOfScope int y_FileScope_BottomOfIncludeRegion int y FileScope BeforeCurrentPosition int y GlobalScope TopOfScope int y GlobalScope_BottomOfIncludeRegion int y GlobalScope_BeforeCurrentPosition int main int y_ FunctionScope_TopOfScope int y_NestedConditionalScope_TopOfScope int y_ NestedLoopScope_TopOfScope int y_ ParentScope_TopOfScope for int i 0 i lt 1 i int y LocalScope_TopOfScope int y LocalScope BeforeCurrentPosition mt x int y_LocalScope_AfterCurrentPosition int y LocalScope BottomOfScope int y_ ParentScope_BottomOfScope int y_ NestedLoopScope_BottomOfScope int y_NestedConditionalScope_BottomOfScope int y FunctionScope BottomOfScope return 0 int y FileScope AfterCurrentPosition int y FileScope _BottomOfScope int y GlobalScope_AfterCurrentPosition int y GlobalScope_BottomOfScope 8 7 Limitations Known Bugs There are several
134. figuration for LaTeX. 16.4 Check In Process NOTE: Get permission from the ROSE Development Team before you make your first check-in. If you have access to the SVN repository at LLNL and are building the development version of ROSE (available only from SVN; not what we package as a ROSE distribution, e.g. not from a file name such as ROSE-0.9.3a.tar.gz), then there are a number of steps to the check-in process: 1. Make sure you are working with the latest update (run svn update in the top level directory). 2. Run make && make docs && make check && make dist && make distcheck && make install && make installcheck, depending on how aggressively you want your changes to be tested. • Not all tests must be run, but we will know who you are (via svn blame) if the nightly tests fail. • All changes must at least compile, so that you don't hold back other developers who update often. 3. svn commit -m <description of what you did> If you do not have access to the SVN repository at LLNL and you wish to contribute work to the ROSE project, please make a patch. Using the external SVN access via LBL, use svn diff to build a patch. Consider options (--diff-cmd arg). DQ (7/28/2008): This section still needs to be completed. 16.5 Adding New SAGE III IR Nodes (Developers Only) We don't expect users to add nodes to the SAGE III Intermediate Representation (IR), ho
135. flow, call graphs, and similar? If so, did you actually add all the implicit semantics of C++ manually to the AST? (Hope the question is not too unclear.) How do you handle global static objects? The information computed is as accurate as possible and always represents the full language, including full C++, Fortran 2003, etc. Some languages are newer (e.g., Fortran 2003 and PHP), so that support still has to mature. Implicit calls to constructors, destructors, short-circuit evaluation, etc. are not inserted; a specific analysis in src/midend/programTransformation/implicitCodeGeneration is used. This code introduces implicit calls into the AST as explicit calls, which are ignored by the code generation (unparser). Global static objects are not handled specially, but are structurally represented in the AST. Note that C++ static constructor evaluation orders are compiler implementation dependent. Linking: to do general (full-program) analysis, you need linking. How did you implement this? Did you actually build in all the C/C++ linking semantics by hand? We support whole program analysis by permitting the ASTs from several files to be merged; this saves space in the header file duplication and provides an efficient means of handling large scale applications. This work is currently experimental and works on a 100K C program separated over 50 files, but is less robust for C++ code. It is ongoing research work. A less scalable alt
136. further improved by loop blocking, and the result is shown in (c). Figure 10.2 uses the pseudo code of LU factorization without pivoting to illustrate the effect of applying [Footnote 1: The package does not include the recursion transformation algorithm in this publication.] [Figure 10.1 code listings, garbled in extraction: (a) original code, (b) after loop interchange, (c) after loop blocking] Figure 10.1: Optimizing matrix multiplication: first applying loop interchange to arrange the best nesting order in (b), then applying blocking to exploit data reuses carried by the k and j loops in (c). [Figure 10.2 code listing fragment, garbled in extraction]
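Since the code listings of Figure 10.1 did not survive extraction, the following sketch shows the general technique its caption names: matrix multiplication before and after loop interchange plus blocking. It is not the manual's figure and not output of the ROSE loop optimizer; the row-major layout, the function names and the block size parameter bs are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<long>;  // n*n elements, row-major (an assumption)

// Straightforward i-j-k loop nest.
inline Matrix multiplyNaive(const Matrix& a, const Matrix& b, std::size_t n) {
    Matrix c(n * n, 0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
    return c;
}

// Interchanged and blocked nest: the j and k loops are tiled with block
// size bs (bs must be greater than zero), and the tile loops are hoisted
// outward so each bs-by-bs block of b is reused across all rows i.
inline Matrix multiplyBlocked(const Matrix& a, const Matrix& b,
                              std::size_t n, std::size_t bs) {
    Matrix c(n * n, 0);
    for (std::size_t jj = 0; jj < n; jj += bs)
        for (std::size_t kk = 0; kk < n; kk += bs)
            for (std::size_t i = 0; i < n; ++i)
                for (std::size_t k = kk; k < std::min(n, kk + bs); ++k)
                    for (std::size_t j = jj; j < std::min(n, jj + bs); ++j)
                        c[i * n + j] += a[i * n + k] * b[k * n + j];
    return c;
}
```

Integer elements are used so that both versions can be compared for exact equality; with floating point data the reordered summation could differ in the last bits.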
137. g include lt iostream gt include lt typeinfo gt using namespace std struct A virtual inline AO struct B public A struct C struct D public C 62 CHAPTER 5 SAGE UI INTERMEDIATE REPRESENTATION void fooQ struct B bobj struct A ap amp bobj struct A amp ar bobj amp std cout lt lt ap lt lt typeid ap name lt lt std endl amp std cout lt lt ar lt lt typeid ar name lt lt std endl struct D dobj struct C cp amp dobj struct C amp cr dobj amp std cout lt lt cp lt lt typeid C name lt lt std endl amp std cout lt lt cr lt lt typeid C name lt lt std endl amp std cout lt lt expression lt lt typeid bool name lt lt std endl bool t bool f amp std cout lt lt expression lt lt typeid bool name lt lt std endl int less int more amp std cout lt lt expression lt lt typeid bool name lt lt std endl amp std cout lt lt expression lt lt typeid int name lt lt std endl amp std cout lt lt expression lt lt typeid int name lt lt std endl Notice that not all expressions are normalized and that the cases which are normalized vs those which are not is very subtle This normalization appears to be a result of the internal working of EDG and not the Sage III IR This test code can b
138. g Nightly Backups of Unrequired ROSE Files at LLNL o 153 DB A tt de A Oe es A mea a as AR dae a aE ara 153 16 14Enabling PHP Support 2 2 154 wk fe ek o Pe eae ed hha kaa a ooh t ae a OD oe ee RG 155 A eee eo ak Wee ae Se he es ee ce 155 157 18 Glossary 163 12 CONTENTS List of Figures 1 1 Different phases of internal processing within translators built using ROSE infrastructure 2 1 Example output from configure help in ROSE directory Part 1 2 2 Example output from configure help in ROSE directory Part 2 3 1 Example of simple translator building and AST unparsing it and compiling the generated un parsed code 4a kee ee eee pee ee e a he Re gee ae ee ee 3 2 Example of makefile to build the example translator Notice that we use the identityTranslator C file presented in ROSE Tutorial oa aaa 40 3 3 Example output from current version of translator build in ROSE src 41 3 4 Example command line for compilation of C source file roseTestProgram C 42 3 5 Example of output from execution of exampleTranslator o 42 7 1 Headerfile MyVisitor A l ee 74 7 2 Implementation file My Visitor C o oo oo 75 7 3 Example main program MyVisitorMain C 0 75 bbs eT tee ere eens 77 reed Tee tee eee rere ty eee 77 EUA sae 4445 RN 78 IAN Ten ee 82 7 8 Numbers at nodes show the order in whic
139. g using ROSE to compile examples of ROSE source code 7 PythonExample_tests These tests use the Python h header file and are part of tests of code generated by SWIG 8 ExpressionTemplateExample_tests These are a number of tests demonstrating the use of expression templates They are separated out because they take a long time to compile using ROSE This is part of work to understand why expression templates take so long to compile generally e roseTests This directory tests the internal ROSE infrastructure It contains separate subdirectories for individual parts of ROSE See ROSE tests roseTests README for details i Complete the list of ries that hold tests in LOSE tests directory Chapter 14 Testing Within ROSE 131 ey ee ee ee CHAPTER 14 TESTING WITHIN ROSE 1 Usage 2 3 if lt 4 then 4 echo Usage qm sh lt flo gt lt QMTest test class gt lt ROSE translator gt lt Backend Compiler NULL gt compiler arguments Test arguments testopt lt gt exit 1 fi a Functions 000 10 includeFullPath 11 local BACK pwd 12 13 ARG echo ARG sed e s I g 14 cd ARG 15 ARG 1 pwd 16 17 cd BACK 18 return 0 19 get the absolute path of all include directories 20 21 cdi 22 23 Globals 24 25 declare i COUNT 0 26 declare i FLAG 0 27 28 TEST BADTEST qmt error in test creation 29 MODE i The naming mode of the script 30 TEST_CLASS 2
140. graphs. These are discussed in subsequent subsections. See the chapter on Getting Started for details of SQLite installation and configuration. Previous work supported MySQL, but this was overly complex. 9.2.1 Making a Connection To the Database and Table Creation Figure 9.1 shows the listing of a program that connects to the ROSE database, creates a custom table, and performs some simple SQL queries. In the main function, at line 12, a GlobalDatabaseConnection object is created and is used to connect to the database in line 13. When the initialization succeeds, the database connection and ROSE database are ready for use. Line 16 creates a TableAccess object. This object can be used to perform SQL queries like SELECT, INSERT or MODIFY on a given table in the database. The TableAccess object is templated by a RowdataInterface object that defines the structure of the table. For this example program, a RowdataInterface object for a test table is created in lines 6 and 7. Here two macros are called that handle the definition of the RowdataInterface class and all standard member functions. The general syntax is: CREATE_TABLE_[n]( [tablename], [column 1 datatype], [column 1 name], [column 2 datatype], [column 2 name], ... [column n datatype], [column n name] ); where the [] represents values to be filled in, such as the name of the table. As column datatype, all standard C datatypes (bool, char, short, long, float, double, etc.) are valid. The r
141. group n Profiler tauFP amp tauFI taugr n else n include TAU h n endif locationInScope Macro Use This example of macro use shows how to leave the macro unexpanded in the AST fragment which is generated to be patched into the application s AST e MiddleLevelRewrite insert statement MiddleLevelRewrite postponeMacroExpansion PRINT_MACRO Hello World locationInScope e MiddleLevelRewrite insert statement MiddleLevelRewrite postponeMacroExpansion TAU_PROFILE main TAU_LUSER locationInScope 8 6 Example Using AST Rewrite This section demonstrates a simple example using the AST Rewrite Mechanism The input code 8 6 contains the variable declaration statement int x which example preprocessor testRewritePermutations a testcode in the ROSE tests roseTests astRewriteTests directory will use to place additional variable declarations in all possible relative absolute positions include lt stdio h gt int main for int i 0 i lt 1 i int x return 0 98 CHAPTER 8 AST REWRITE MECHANISM The new variable declarations contain as a substring of the variable name the relative scope and location in that scope relative to the target declaration int x The output of processing this input file is a new code 8 6 with many added declarations one for each possible relative absolute position possible relative to the declaration int x int y_ GlobalScope_TopO
142. gs of the SIGPLAN 98 Conference on Programming Language Design and Implementation Montreal June 1998 R Allen and K Kennedy Optimizing Compilers for Modern Architectures Morgan Kaufmann San Francisco October 2001 J F L Carter and S F Hummel Efficient multiprocessor parallelism via hierarchical tiling In SIAM Conference on Parallel Processing for Scientific Computing February 1995 M E Wolf and M Lam A loop transformation theory and an algorithm to maximize parallelism EEE Trans actions on Parallel and Distributed Systems 2 4 452 471 Oct 1991 M J Wolfe Optimizing Supercompilers for Supercomputers The MIT Press Cambridge 1989 Q Yi Transforming Complex Loop Nests For Locality PhD thesis Rice University 2002 Q Yi V Adve and K Kennedy Transforming loops to recursion for multi level memory hierarchies In ACM SIG PLAN Conference on Programming Language Design and Implementation Vancouver British Columbia Canada June 2000 Q Yi and K Kennedy Improving memory hierarchy performance through combined loop interchange and multi level fusion In LACST Symposium Santa Fe NM Oct 2002 Q Yi and K Kennedy Transforming complex loop nests for locality Technical Report TR02 386 Computer Science Dept Rice University Feb 2002
143. gure script: Type AbsoluteOrRelativePathToSourceTree/configure --prefix=installTree to run the BOOST configure script. The path to the configure script may be either relative or absolute. The prefix option specifies the installation directory (e.g., BOOST_INSTALL). Run make: Type make to build all the source files. Run make install: Type make install to copy all build files into the install directory. BOOST is now available in your installTree (e.g., BOOST_INSTALL) to be used by ROSE. Note that the installation of Boost will frequently output warnings (e.g., Unicode ICU support for boost regex not found); these can be ignored. 2.2.3 Building ROSE From a Distribution (ROSE-0.9.3a.tar.gz) The process for building ROSE from a ROSE Distribution Version is the same as for most standard software distributions (e.g., those using autoconf tools): 1. Untar ROSE: Type tar zxf ROSE-0.9.3a.tar.gz to untar the ROSE distribution. Build a separate compile tree: Type mkdir compileTree to build a location for the object files and documentation (use any name you like for this directory). Change directory to the new compile tree directory: Type cd compileTree. This changes the current directory to the newly created directory. Add JAVA environment variables: For example, export JAVA_HOME=/usr/apps/java/jdk1.5.0_11 export LD_LIBRARY_PATH=$JAVA_HOME/jre/lib/i386/server:$LD_LIBRARY_PATH Add the Boost library path into
144. h the visit function is called in a preorder traversal 83 7 9 Numbers at nodes show the order in which the visit function is called in a postorder traversal 84 7 10 Numbers at nodes show the order in which the function evaluatelnheritedAttribute is called in a top down processing 4 24 33 ee Sa ee Ee AA a ee a 85 7 11 Numbers at nodes show the order in which the function evaluateSynthesizedAttribute is called in a bottom up processing s soa saaara aari veep eenaa ka eee a a 86 7 12 The pair of numbers at nodes shows the order in which the function evaluatelnheritedAttribute first number and evaluateSynthesized Attribute second number is called in a top down bottom up processing eine eee a eters eee 105 adas one teehee eee 106 OSES ee Ea es Bees 107 De reia e ee bin dai 108 13 14 LIST OF FIGURES 10 1 Optimizing matrix multiplication first applying loop interchange to arrange the best nesting order in b then applying blocking to exploit data reuses carried by k and j loops in c 10 2 Optimizing non pivoting LU In b the k s1 loop is fused with the j s2 loop and the fused loop is then put at the outermost position achieving a combined interchange and fusion transformation the code in c achieves blocking in the row dimension of the matrix through combined interchange fusion and tiling transformations 10 3 Optimizing tridupk from Erlebacher combining loop interchange and fusion thus fusing multiple
145. hangeLog for current version numbers to be used with ROSE. • automake: Automake version 1.9 or higher is required. Most software projects appear to be less sensitive to the specific version of automake. Check the ROSE/ChangeLog for current version numbers to be used with ROSE. The ROSE/ChangeLog details the changes between versions of ROSE and lists the specific version numbers of all software upon which ROSE depends. Comments of this type appear in the ChangeLog as: ********* TESTED with ********* automake (GNU automake) 1.6.3; autoconf (GNU Autoconf) 2.57; GNU Make version 3.79.1; g++ (GCC) 3.3.6; gcc (GCC) 3.3.6; doxygen 1.3.8; dot version 1.12 (Sun Aug 15 02:43:07 UTC 2004); TeX (Web2C 7.3.1) 3.14159; Original LaTeX2HTML Version 2002 (1.62); sqlite (requires g++ 3.3.2) 3.2.1. [Footnotes: Optional for use of ROSE by users, but required for internal ROSE development (by ROSE project team). Required for use of ROSE and for all internal development.] The build process for a Developer Version is: Checkout a NEW version from SVN: The newest work on ROSE (as of March 2008) is using SVN instead of CVS. This switch to SVN means the directions for how developers use ROSE have changed. This affects developers of ROSE only, or anyone with access to the newer SVN repository. Please learn about SVN on another project before using it on
146. hat pragma pack declarations are handled in a special normalization (see section 18). These pragmas are a bit different from other pragmas and are handled as a stack-based embedded language. 5.5 Copying IR Nodes and Subtrees. Support is provided for policy-based copying of the AST and subtrees of the AST. Flexibility and control are provided through an independent policy mechanism that defines the copying process as shallow or deep for different types of nodes within the AST. Each SgNode object has the following public virtual member function: class SgNode { ... virtual SgNode* copy(SgCopyHelp& help) const; ... }; Here SgCopyHelp is a virtual policy class for duplicating SgNode objects and is defined as: class SgCopyHelp { public: virtual SgNode* copyAst(const SgNode* n) = 0; }; Two concrete classes, SgShallowCopy and SgTreeCopy, are provided as subclasses of SgCopyHelp to configure a shallow copy (duplicating the current SgNode object only) or a deep copy (duplicating the complete subtree rooted at the current SgNode object), respectively. The following example illustrates how to use SgShallowCopy and SgTreeCopy to duplicate SAGE nodes and subtrees: SgNode* orig; ... SgNode* n1 = orig->copy(SgShallowCopy::static_instance()); SgNode* n2 = orig->copy(SgTreeCopy::static_instance()); Here n1 points to a duplicate of the SgNode object pointed to by orig, while n2 points to a duplicate of the complete subtree rooted at orig. Therefore the
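The policy-based copy described in this excerpt can be illustrated with a small self-contained sketch. It does not use the ROSE headers: `Node`, `CopyHelp`, `ShallowCopy`, `TreeCopy`, and `destroyTree` are hypothetical stand-ins that only mirror the shape of `SgNode::copy` and `SgCopyHelp::copyAst`, so the real classes may behave differently in detail (for example in how a shallow copy treats child pointers).

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for SgNode and SgCopyHelp; these are NOT the ROSE classes.
struct Node;

struct CopyHelp {
    virtual ~CopyHelp() = default;
    // The policy object decides how much of the tree below n is duplicated.
    virtual Node* copyAst(const Node* n) = 0;
};

struct Node {
    std::string label;
    std::vector<Node*> children;

    explicit Node(std::string l) : label(std::move(l)) {}

    // Same shape as SgNode::copy: the node delegates to the policy.
    Node* copy(CopyHelp& help) const { return help.copyAst(this); }
};

// Shallow policy (assumed semantics): duplicate this node, share the children.
struct ShallowCopy : CopyHelp {
    Node* copyAst(const Node* n) override {
        assert(n != nullptr);
        Node* dup = new Node(n->label);
        dup->children = n->children;
        return dup;
    }
    static ShallowCopy& static_instance() { static ShallowCopy s; return s; }
};

// Deep policy: duplicate this node and, recursively, its whole subtree.
struct TreeCopy : CopyHelp {
    Node* copyAst(const Node* n) override {
        assert(n != nullptr);
        Node* dup = new Node(n->label);
        for (const Node* child : n->children)
            dup->children.push_back(child ? child->copy(*this) : nullptr);
        return dup;
    }
    static TreeCopy& static_instance() { static TreeCopy s; return s; }
};

// Frees a subtree that is exclusively owned, such as a TreeCopy result.
void destroyTree(Node* n) {
    if (n == nullptr) return;
    for (Node* c : n->children) destroyTree(c);
    delete n;
}
```

With these stand-ins, `orig->copy(ShallowCopy::static_instance())` yields a new node that still points at the original children, while `orig->copy(TreeCopy::static_instance())` yields an independent subtree. Keeping the depth decision in the policy object means the node class itself never needs to know which kind of copy was requested.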
147. he design of the IR is to not lose any source code information. Thus ROSE is especially well suited to source-to-source translation. However, this means that the IR for ROSE is quite large, which has advantages and disadvantages. The IR forms the base for an abstract syntax tree, so clearly some syntactic details are lost in the IR, but these are regenerated in the back end, which has language-specific support. More languages could be added to ROSE. ROSE is designed to be language neutral, but it is implemented in C++. PHP has, for example, been added to ROSE, but it represented initial work and an experiment with the general subject of run-time typed scripting language support. 4.2 Design. Fundamentally, ROSE has three parts: 1. frontend, which addresses language-specific parsers, frontend issues, and the binary disassembly for the case of the binary support in ROSE; 2. midend, which addresses analysis and transformation issues; 3. backend, which addresses code generation issues. The frontend constructs an AST which saves as much as possible about the structure of the original source code (or binary, for the case of the ROSE binary support). This section will cover the design goals, etc., of ROSE. [FIXME: Add more detail about each directory.] CHAPTER 4. THE ROSE INFRASTRUCTURE 4.3 Directory Structure. The top level of the ROSE directory tree has a simple design. All the source code is in src, all the tests are in tests, all the documentat
148. his question until we can implement a more automated mechanism on the SAGE III source code. See the Doxygen-generated documentation for more details on the IR nodes and their member functions. 5.9 AST Normalization: Subtle Ways That ROSE Output Differs from the Original Source Code. In general, every attempt is made to preserve the look and feel of the original input code. Original formatting, use of C preprocessor directives (e.g. #include <file.h>), and comments are preserved within the AST and output in the generated code. However, there can be minor differences between the input source code and the code that is generated from ROSE translators. In all cases this difference is due to normalizations internally within the EDG front end. Current normalizations include: 1. White space differences: ROSE-generated code will appear somewhat different due to slightly different uses of white space within formatting of the generated code. All attempts are made to preserve as much of the original formatting as possible (or practical). 2. Variable declarations are normalized to separate declarations: Variable declarations containing multiple names (variables to be declared) are normalized within the AST to form one declaration for each name (variable). This simplifies program analysis since it avoids one of two ways of searching for a variable declaration (as a separate declaration and as a member of a list in another declaration). As an example: int x, y, z
149. hod defaultSynthesizedAttribute must be used to initialize attributes of primitive type (such as int, bool, etc.). This method is called when a synthesized attribute needs to be created for a non-existing subtree, i.e. when a node pointer is null. A null pointer is never passed to an evaluate function. If a class is used to represent a synthesized attribute, this method does not need to be implemented because the default constructor is called. In order to define a default value for attributes of primitive type, this method must be used. Two cases exist when a default value is used for a synthesized attribute and the defaultSynthesizedAttribute method is called: 7.7 ASTTOPDOWNBOTTOMUPPROCESSING • When the traversal encounters a null pointer, it will not call an evaluate method but instead calls defaultSynthesizedAttribute. • When the traversal skips over specific IR nodes. For example, traverseInputFiles only calls the evaluate method on nodes which represent the input file(s) but skips all other nodes (of header files, for example). 7.6.1 Example: Access of Synthesized Attribute by Name. The enum definition used to access the synthesized attributes by name at a SgForStatement node is: enum E_SgForStatement { SgForStatement_init_stmt, SgForStatement_test_expr_root, SgForStatement_increment_expr_root, SgForStatement_loop_body }; The definitions of the enums for all AST nodes can be found in the generated file <COMPILETREE>/SAGE/Cxx_Gram
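The role of defaultSynthesizedAttribute for primitive attribute types can be shown with a minimal, self-contained sketch. It does not use the ROSE traversal classes: `Node` and `CountingTraversal` are hypothetical, and the only point carried over from the text is that a null child never reaches the evaluate function and instead contributes the default value.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal tree; a null child models a non-existing subtree.
struct Node {
    std::vector<Node*> children;
};

// Sketch of a bottom-up traversal whose synthesized attribute is an int.
class CountingTraversal {
public:
    int traverse(Node* node) {
        // A null pointer is never passed to the evaluate function;
        // the default attribute is used in its place.
        if (node == nullptr) return defaultSynthesizedAttribute();

        std::vector<int> childAttributes;
        for (Node* child : node->children)
            childAttributes.push_back(traverse(child));
        return evaluateSynthesizedAttribute(node, childAttributes);
    }

protected:
    // For a primitive type the default value has to be stated explicitly,
    // because there is no user-defined default constructor to rely on.
    int defaultSynthesizedAttribute() { return 0; }

    // Counts the nodes in the subtree rooted at node.
    int evaluateSynthesizedAttribute(Node* node,
                                     const std::vector<int>& childAttributes) {
        assert(node != nullptr);
        int count = 1;
        for (int c : childAttributes) count += c;
        return count;
    }
};
```

Here a node with children `{a, nullptr, b}` evaluates to a count of three, since the missing subtree contributes the default value zero rather than triggering an evaluate call.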
150. ialize this member field. • Access functions to get and set the member function will be automatically generated. • The automatically generated AST traversal will traverse this node (i.e. it will visit its children in the AST). • Have the automatically generated destructor not call delete on this field (the traversal will do that). In the case of the VarArgOp, an additional data member was added: VarArgOp.setDataPrototype("GRAMMAR_PREFIX_Type*", "expression_type", "= NULL", CONSTRUCTOR_PARAMETER, BUILD_ACCESS_FUNCTIONS, NO_TRAVERSAL || DEF2TYPE_TRAVERSAL); 5. Most IR nodes are simpler, but SgExpression IR nodes have explicit precedence. All expression nodes have a precedence in the evaluation, but the precedence must be specified. This precedence must match that of the C++ frontend, so we are not changing anything about the way that C++ evaluates expressions here. It is just that SAGE must have a defined value for the precedence. ROSETTA permits variables to be defined and edited to tailor the automatically generated source code for the IR: VarArgOp.editSubstitute("PRECEDENCE_VALUE", "16"); 6. Associate customized source code. Automatically generated source code sometimes cannot meet all requirements, so ROSETTA allows the user to define any custom code that needs to be associated with the IR node in some specified files. If customized code is needed, you have to specify the source file containing the code. For example, we specify the file c
151. ibe <list> [<address>]: Subscribe yourself (or <address> if specified) to the named <list>. The list may be configured so that you can only subscribe yourself, i.e. you can't specify an <address> other than your own. unsubscribe <list> [<address>]: Unsubscribe yourself (or <address> if specified) from the named <list>. "unsubscribe *" will remove you (or <address>) from all lists. This may not work if you have subscribed using multiple addresses. The list may be configured so that you can only unsubscribe yourself, i.e. you can't specify an <address> other than your own. which [<address>]: Find out which lists you (or <address> if specified) are on. Only lists enabled to supply this information will be returned to the requester. who <list>: Find out who is on the named <list>. Only lists enabled to supply this information will be returned to the requester. info <list>: Retrieve the general introductory information for the named <list>. Only lists enabled to supply this information will be returned to the requester. intro <list>: Retrieve the introductory message sent to new users. Non-subscribers may not be able to retrieve this. lists: Show the lists served by this Majordomo server (will not show private lists). help: Retrieve some help information on the available user commands. end: Stop processing commands (useful if your email program adds a signature). Here are the URLs for the casc-rose email list: Instructions on how to
152. ices. Because each slice represents a set of loops that can be fused into a single loop, interchanging the nesting order of two slices corresponds directly to the interchange of the two sets of slicing loops. The effects of applying loop interchange are shown for matrix multiplication in Figure 10.1(b) and for non-pivoting LU factorization in Figure 10.2(b). Because this package implements loop interchange using dependence hoisting, it achieves loop blocking by combining a sequence of dependence hoisting transformations with a loop strip-mining transformation. Given an input loop nest C, the algorithm takes the computation slices constructed for C in the reverse of their desired nesting order and then uses each slice to perform a dependence hoisting transformation. After each dependence hoisting transformation, if the new outermost loop ℓ should be blocked, the algorithm strip-mines ℓ into a strip-counting loop ℓc and a strip-enumerating loop ℓt. It then uses loop ℓt as the input loop nest for further dependence hoisting transformations, which in turn will shift a new set of loops outside the strip-enumerating loop ℓt but inside the strip-counting loop ℓc, thus blocking loop ℓ. CHAPTER 10. LOOP TRANSFORMATIONS. The effects of applying loop blocking are shown for matrix multiplication in Figure 10.1(c) and for non-pivoting LU factorization in Figure 10.2(c). Loop Fusion and Distribution (Fission): To achieve an aggressive multi-level loop fusion effect, the package merges multiple computation slices and then
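To make the end result of blocking concrete, the sketch below shows a hand-written blocked matrix multiplication next to the plain triple loop. It is not produced by the ROSE loop transformation package and does not reproduce its dependence hoisting algorithm; the function names, the row-major `Matrix` alias, and the `block` parameter are assumptions made for this illustration. The j and k loops are each strip-mined into a strip-counting loop (`jj`, `kk`) and a strip-enumerating loop (`j`, `k`), with the i loop moved between them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<double>;  // row-major storage of an n x n matrix

// Reference triple loop: C += A * B.
void matmul(const Matrix& A, const Matrix& B, Matrix& C, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
}

// Blocked variant: j and k are strip-mined, and the strip-counting loops
// are placed outermost so that a block of B is reused across all rows i.
void matmulBlocked(const Matrix& A, const Matrix& B, Matrix& C,
                   std::size_t n, std::size_t block) {
    assert(block > 0);
    for (std::size_t jj = 0; jj < n; jj += block)        // strip-counting loop for j
        for (std::size_t kk = 0; kk < n; kk += block)    // strip-counting loop for k
            for (std::size_t i = 0; i < n; ++i)
                // strip-enumerating loops walk one block at a time
                for (std::size_t j = jj; j < std::min(jj + block, n); ++j)
                    for (std::size_t k = kk; k < std::min(kk + block, n); ++k)
                        C[i * n + j] += A[i * n + k] * B[k * n + j];
}
```

For every element of C the k contributions are still accumulated in increasing k order, so both functions compute the same result; only the order in which different elements are visited changes, which is what improves data reuse for suitable block sizes.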
153. if the program is represented by a single translation unit. CHAPTER 5. SAGE III INTERMEDIATE REPRESENTATION • Transformation of template classes: This case is discussed in more detail below. It is a much harder case and is currently incomplete. • Transformation of template static data members: This case is not handled yet, but should not be much trouble. • Transformation of template specializations: ROSE generates all template instantiations internally as template specializations. As such they are no different from any other AST subtree, and all ROSE mechanisms can be used for analysis and transformation of the instantiated template. Those instantiated templates that are transformed are marked for output in the code generation phase and output as template specializations. In this approach, templates instantiated for different types may be easily transformed differently. 3. Transformation of templated classes is enabled via generated specializations. This was discussed briefly above (under "Options specific to templates", item "Transformation of template specializations"). In general, transformations on template classes, functions, and static data members are handled through the explicit generation of specializations in place of the instantiations that would be generated by the back-end vendor compiler. All templates are explicitly generated as specializations in ROSE, and in principle no instantiations are required by the back end ven
154. ignedLongLongIntVal SgModOp SgLongLongIntVal SgLongDoubleVal SgNotOp SgBitOrOp SgMinusOp SgWhileStmt SgForStatement SgOrOp SgGreaterThanOp SgDeleteExp SgAggregateInitializer SgNamespaceSymbol SgForInitStatement SgRshiftOp SgRshiftAssignOp SgPlusAssignOp SgLshiftOp SgBitXorOp SgBitComplementOp SgDivAssignOp SgAndAssignOp SgFile CHAPTER 15. APPENDIX AST Memory Pool Statistics: numberOfNodes = 1 memory consumption = 84 node = SgProject; AST Memory Pool Statistics: numberOfNodes = 1 memory consumption = 48 node = SgCatchOptionStmt; AST Memory Pool Statistics: numberOfNodes = 1 memory consumption = 44 node = SgTypeInt; AST Memory Pool Statistics: numberOfNodes = 1 memory consumption = 40 node = SgTypeWchar; AST Memory Pool Statistics: numberOfNodes = 1 memory consumption = 40 node = SgTypeVoid; AST Memory Pool Statistics: numberOfNodes = 1 memory consumption = 40 node = SgTypeUnsignedShort; AST Memory Pool Statistics: numberOfNodes = 1 memory consumption = 40 node = SgTypeUnsignedLongLong; AST Memory Pool Statistics: numberOfNodes = 1 memory consumption = 40 node = SgTypeUnsignedLong; AST Memory Pool Statistics: numberOfNodes = 1 memory consumption = 40 node = SgTypeUnsignedInt; AST Memory Pool Statistics: numberOfNodes = 1 memory consumption = 40 node = SgTypeUnsignedChar; AST Memory Pool Statistics: numberOfNodes = 1 memory consumption = 40 node = SgTypeString; AST Memory Pool Statistics: numberOfNodes = 1 memory consumption
155. ile Format Byte order: File byte order can be changed by modifying the SgAsmGenericFormat object pointed to by the file header: SgAsmFileHeader *fhdr = ...; fhdr->get_exec_format()->set_sex(ORDER_MSB); 12.5 STATIC BINARY REWRITING NOTE: Modifying the byte order affects only those sections that are actually parsed. If the ELF file contains a section whose purpose we don't recognize, then the original section data is written to the new file. FIXME: If the byte order is not specified in the ELF header (e_ident data encoding other than 1 or 2), then the parser will make an educated guess and assign a byte order. The unparsed file will differ from the original in this case at the sixth byte of the file. 3. ELF Word Size: File word size can be changed between 4 bytes and 8 bytes by modifying the SgAsmGenericFormat object pointed to by the file header: SgAsmFileHeader *fhdr = ...; fhdr->get_exec_format()->set_word_size(4); When changing word sizes, any fields that have values too large to represent in the new word size will cause the unparser to abort. NOTE: Modifying the word size affects only those sections that are actually parsed. If the ELF file contains a section whose purpose we don't recognize, then the original section data is written to the new file. FIXME: Increasing word size probably requires allocating more space for many of the sections. Vice versa for decreasing the word size. 4. ELF Header Magic Number: An ELF header has
156. iles with EDG and does not with g++: const double doubleValueConstant2 = 3.14; }; and it applies to integer-based types only (why such types are special while float and double are not, I don't know). However, double is somewhat supported as a non-standard extension by both EDG and GNU g++, though in different ways. This is a little corner of C++ which is truly obscure, but it shows up in some large applications at LLNL. Since the code that works with EDG does not work with GNU g++ and vice versa, there is no common ground. So we assume that the code will compile using EDG (we have no choice) and then generate code that will compile with GNU g++. This means that we generate C++ code that can't be compiled with EDG, but this is the mess that application developers get themselves into when they use non-standard features. The fix-up of the AST to force the generation of code suitable to GNU g++ is handled in the ROSE/src/frontend/SageIII/astFixup directory. 5.11 Notes on ROSE-specific Header Files. We borrow the header files of whatever compiler is specified as the target back-end compiler. This allows the expansion of any macros with ROSE to match the expansion that would be done without ROSE. The mechanism for borrowing the header files from the target back-end compiler is somewhat messy but fully automated. There are several steps, including translation and matching the val
157. imization in ROSE has only recently become a focus. This work is not yet as durable as the compiler construction aspects of ROSE. The development of the ROSE infrastructure requires that we can first compile and transform large-scale applications before we address complex program analysis and its robustness. 2.5 Submitting a Bug Report. The rule is simple: the better the quality of the bug report, the higher the priority it gets. All good bug reports include a very simple example that demonstrates the bug, and only that bug, so that it is clearly reproducible. We welcome your submission of good-quality bug reports. You may also send email directly to dquinlan at llnl dot gov. Any bug report you submit will be added as a test code and used to test future versions of ROSE (please add "ROSE bug report" to the subject line). At a later point we will use a more formal bug tracking mechanism. 2.6 Getting a Version of the EDG License for Research Use. ROSE uses the EDG (www.edg.com) C++ front end to parse C++ code internally. No part of the EDG source code is visible to the user of ROSE, but since ROSE does not yet routinely package a separate binary, we provide the EDG source code as part of the distribution of ROSE. So at present we only give out ROSE to people who also get a free research license for the EDG source code (available from EDG). We are particularly thankful to the EDG people for providing
158. in o /home/yuan5/bin/identityTranslator /home/yuan5/bin/identityTranslator /home/yuan5/bin/identityTranslator /home/yuan5/bin/identityTranslator [Figure 14.7: make output] all the local directories containing their object file counterparts. Thus it is necessary to collect them all into one directory, which will serve as a QMTest database. CHAPTER 14. TESTING WITHIN ROSE [Figure 14.8: output of find . -name "*.qmt": bash-2.05b$ find . -name "*.qmt" ./_home_yuan5_roseqmtest_project_p2_cpu_out.qmt ./_home_yuan5_roseqmtest_project_p2_decoder_c.qmt ./_home_yuan5_roseqmtest_project_p2_main_c.qmt ./_home_yuan5_roseqmtest_project_p2_reader_c.qmt ./_home_yuan5_roseqmtest_project_p2_registers_c.qmt] From the directory where make (or the project's build system) was launched, type the command: find . -name "*.qmt" -exec mv {} test_database \; This will recursively find all files with the extension .qmt and move them to the directory test_database, which was created by the user. Change directory to test_database and type the command: qmtest -D`pwd` create-tdb This command will allow QMTest to access the test files by creating a test database. Once this test database has been created by QMTest, it is possible to run tests from the command line or GUI with the respective commands: qmtest run -o results.qmr (runs on the command line and writes QMTest output to results.qmr); qmtest gui (runs the QMTest GUI, by which the user may read results stored in results.qmr)
159. information of computation slices, the transformation framework optimizes a code segment in the following steps. First, it applies dependence analysis and constructs all the legal computation slices for an input code segment. It then treats all the valid computation slices as if they form a sequence of loop nests and rearranges these slices to achieve better cache locality. For each set of computation slices that forms a single loop nest, the package first selects a nesting order so that the loops that are associated with more reuses are nested inside. It then fuses each pair of disjunct computation slices (slices that contain disjunct sets of statements) when their statements access a common set of data. After fusion, if some non-innermost slices carry data reuses, the package marks the corresponding slice nest to be tiled later. Finally, the framework uses the rearranged computation slices to perform a sequence of dependence hoisting transformations to achieve the desired transformation result. Note that all the transformations are applied only when legal, that is, when no semantics of the original program is violated by the transformations. The following briefly describes the optimization strategies implemented in this package. For more details of the optimization algorithms, see [34]. Loop Interchange and Blocking: To achieve loop interchange, the package carefully arranges the order of applying dependence hoisting transformations using different computation sl
160. ion. The EDG research license restricts the distribution of their source code. Working with EDG is still possible within an open source project such as ROSE because EDG permits binaries of their work to be freely distributed (protecting their source code). As ROSE matured, we designed the autoconf/automake distribution mechanism to build distributions that exclude the EDG source code and alternatively distribute a Linux-based binary version of their code. All releases of ROSE starting with 0.8.4a are done without the EDG source code by default. An optional configure command-line option is implemented to allow the construction of a distribution of ROSE which includes the EDG source code (see configure --help for the --with-edg_source_code option). The default options for configure will build a distribution that contains no EDG source code (no source files or header files). This is not a problem for ROSE because it can still exist as an almost entirely open source project using only the ROSE source and the EDG binary version of the library. Within this default configuration, ROSE can be freely distributed (on the Web, eventually). Importantly, this simplifies how we work with many different research groups and avoids the requirement for a special research license from EDG for the use of their C and C++ front end. Our goal has been to simplify the use of ROSE. Only the following command to configure with EDG source code is accepted: configure --with-edg_so
161. ion is in docs. ROSE uses autoconf and automake, so there is an autoconf-generated configure script included. The conf directory contains all the autoconf macros used in ROSE. The projects directory contains a collection of ongoing and past projects in ROSE that are either not large enough or not mature enough to stand alone as separate projects. We use this location to incubate tools or technologies built on ROSE as they are developed; some are moved into the ROSE src directory proper. The README file contains information on how to install ROSE and about where information on ROSE is located. 4.4 Implementation of ROSE. ROSE is implemented in C++. It supports source-to-source analysis and transformations on source code in a language-neutral way, or alternatively in a collection of language-specific ways. This section will be added to in the future. 4.4.1 Implementation of ROSETTA. ROSETTA is a tool built internally to generate code for ROSE so that ROSE follows simple and consistent design rules. ROSE relies heavily on code generation as a way to automate as much as possible, which permits ROSE to be maintained as easily as possible. ROSETTA is thus used so that we can avoid spending all our time doing maintenance. ROSETTA is, however, not very ROSE-specific and might be more generally useful; we have not pursued this line of work. We are happy to have ROSETTA be used only in ROSE; it is, however, separated out in the src/ROSETTA/src a
162. ion prototypes: Function prototypes for all specializations that we generate are required before the use of the template forces its instantiation. The point is to allow the specialized template function to be available for transformation. It can be placed anywhere in the file (typically at the end, in ROSE) as long as a prototype has been output to prevent a full instantiation of the specialized template function before any use would force its instantiation by the back-end compiler. At that point the template specialization generated by ROSE (and perhaps transformed by the user) is not only redundant but results in an error, since the function is defined twice (first instantiated by the vendor compiler and then seen as an explicit template specialization generated by ROSE). 3. Output template function definitions: All template specializations can now be output, even if they referenced templated classes or functions that would force the instantiations (the reason why all prototypes must precede the definitions of template classes and functions). These can actually appear before or after the rest of the code, so steps 3 and 4 may be swapped. 4. Output the rest of the code: This will force template instantiations of any non-specialized template classes or functions. It may appear before the template function definitions or be interleaved with them. 5. Output all explicit template instantiation directives at the base of each namespace where they appear. It
163. ism: 1) queries of a subtree of an AST from a SgNode, 2) queries of a node list, and 3) queries of the memory pool. If the last parameter of querySubTree has the value NodeQuery::ChildrenOnly, then only the IR nodes which are immediate children of the input IR node (SgNode*) in the AST are traversed; otherwise the whole of the AST subtree will be traversed. VariantVector objects are internally a bitvector of IR node types from the hierarchy of IR nodes. A VariantVector can be formed via masks built from variant names: VariantVector ir_nodes(V_SgType); For all AST queries taking a VariantVector: if no VariantVector is provided to the function queryMemoryPool, the whole memory pool will be traversed (all IR nodes from all files). 6.2 Node Queries. AST queries can return lists of IR nodes. These queries are useful as a simple way to extract subsets of the AST. Node queries can be applied to the whole of the memory pool or to any subtree of the AST. The result of an AST node query on the AST is a list of IR nodes; the same interface permits additional AST node queries to be done on the STL list of IR nodes. This permits compositional queries using simple function composition. 6.2.1 Interface Functions. The functions supported in the AST Node Query interface are: namespace NodeQuery { // Functions that visit every node in a subtree of the AST and return a std::list<SgNode*>s. It is the subtree of the first parame
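The difference between querying the whole subtree and querying only the immediate children can be sketched without the ROSE headers. Everything below is a hypothetical stand-in: `Variant`, `Node`, `Depth`, `collect`, and this `querySubTree` are invented for illustration and only imitate the behaviour described in the text. In particular, treating the root itself as part of a whole-subtree query is an assumption of this sketch.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical node kinds standing in for ROSE variant names such as V_SgType.
enum Variant { V_Type, V_Statement, V_Expression };

struct Node {
    Variant variant;
    std::vector<Node*> children;
};

// Hypothetical counterpart of the "children only" switch described above.
enum Depth { AllNested, ChildrenOnly };

// Appends every node of the wanted kind in the subtree rooted at node.
void collect(Node* node, Variant wanted, std::list<Node*>& out) {
    if (node == nullptr) return;
    if (node->variant == wanted) out.push_back(node);
    for (Node* child : node->children) collect(child, wanted, out);
}

// Returns matching nodes either from the whole subtree (root included)
// or from the immediate children of root only.
std::list<Node*> querySubTree(Node* root, Variant wanted, Depth depth = AllNested) {
    assert(root != nullptr);
    std::list<Node*> result;
    if (depth == ChildrenOnly) {
        for (Node* child : root->children)
            if (child != nullptr && child->variant == wanted)
                result.push_back(child);
    } else {
        collect(root, wanted, result);
    }
    return result;
}
```

Because the result is an ordinary list of node pointers, a second query can be run over each element of that list, which is the kind of composition the text refers to.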
164. ith the data structure and does not represent the pragma explicitly. Generated code using ROSE thus only uses #pragma pack(n) and #pragma pack() explicitly for each structure declaration (before and after each declaration, respectively). The specific placement of the #pragma pack is also modified so that it appears immediately before the opening brace and immediately after the closing brace of the class or structure definition. As an example, the following code as input:
    #pragma pack(4)
    struct A { unsigned short a; };
    #pragma pack(push, 8)
    struct B1 { unsigned short a; };
    struct B2 { unsigned short a; };
    #pragma pack(pop)
    struct C { unsigned short a; };
    #pragma pack(push, 1)
    struct D { unsigned short a; };
    #pragma pack(2)
    struct F { unsigned short a; };
    struct G { unsigned short a; };
    #pragma pack(pop)
    struct H { unsigned short a; };
    struct I { unsigned short a; };
    #pragma pack()
    struct J { unsigned short a; };
will be translated (normalized) to:
    struct A
    #pragma pack(4)
    { unsigned short a; }
    #pragma pack()
    ;
    struct B1
    #pragma pack(8)
    { unsigned short a; }
    #pragma pack()
    ;
    struct B2
    #pragma pack(8)
    { unsigned short a; }
    #pragma pack()
    ;
    struct C
    #pragma pack(4)
    { unsigned short a; }
    #pragma pack()
    ;
    struct D
    #pragma pack(1)
    { unsigned short a; }
    #pragma pack()
    ;
    struct F
    #pragma pack(2)
    { unsigned short a;
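Why the stack-based and the explicit forms are interchangeable can be checked with a small sketch. It assumes a compiler that supports `#pragma pack` with `push` and `pop` (GCC, Clang, and MSVC do); the struct names are invented for this example. The pragmas are placed around each declaration rather than between the struct name and its braces, so this shows the equivalence in effect, not the exact placement that ROSE emits.

```cpp
#include <cassert>
#include <cstddef>

// Stack-based form, as it might appear in input code.
#pragma pack(push, 1)
struct PackedWithStack {
    char tag;
    int value;
};
#pragma pack(pop)

// Explicit form: the packing value is set just before the declaration
// and reset to the default just after it.
#pragma pack(1)
struct PackedExplicitly {
    char tag;
    int value;
};
#pragma pack()

// Same members with default packing, for comparison.
struct DefaultPacking {
    char tag;
    int value;
};

// Both spellings must give the same layout: no padding after tag.
static_assert(sizeof(PackedWithStack) == sizeof(PackedExplicitly),
              "push/pop and explicit packing should agree");
static_assert(sizeof(PackedExplicitly) == sizeof(char) + sizeof(int),
              "pack(1) removes the padding between tag and value");

// Returns true when default packing is at least as large as the packed layout.
inline bool defaultIsNotSmaller() {
    return sizeof(DefaultPacking) >= sizeof(PackedExplicitly);
}
```

Attaching the packing value to each structure, as the normalization does, removes the need to track the push/pop stack when reading or transforming a single declaration.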
165. ither in your PATH or in JAVA_HOME/bin. 2.2 ROSE INSTALLATION • Adding SQLite support: ROSE/configure --with-sqlite3=/home/dquinlan/SQLite/sqliteCompileTree --prefix=`pwd` --with-boost=BOOST_installTree Configure as above, but permit use of an SQLite database for storage of analysis results between compilation of separate files (one type of support in ROSE for global analysis). • Adding parallel distributed-memory analysis support (using MPI): ROSE/configure --prefix=`pwd` --with-mpi --with-gcc-omp --with-boost=BOOST_installTree Configure as above, but with MPI and OpenMP support for ROSE to run AST traversals in parallel (distributed and shared memory). • Adding IDA Pro support: ROSE/configure --prefix=`pwd` --with-binarysql --with-boost=BOOST_installTree The binarysql flag allows ROSE to read a binary file previously stored as a sql file (e.g. fetched from IDA Pro). • Adding support for SWIG (Python connection): ROSE/configure --prefix=`pwd` --with-javaport=yes SWIG=swig --with-boost=BOOST_installTree --with-java This allows ROSE to be built with javaport, a support that connects ROSE to Java via SWIG. The Eclipse plug-in to ROSE is based on this work. • Additional Examples: More detailed documentation on configure options can be found by typing configure --help, or see figure for complete listing. Output of configure --help is detailed in Figures 2.2.4 (Part 1 and Part 2). 2.2.5 Running GNU
166. ive from SgAsmGenericString. Changing the value of an SgAsmBasicString has no effect on the unparsed file. [Note 1: What text should go here?] CHAPTER 12. BINARY ANALYSIS SUPPORT FOR THE ANALYSIS OF BINARY EXECUTABLES 12.5.4 Modifications: ELF Section Table Entries. Every ELF section defined by the ELF Section Table is parsed as an SgAsmElfSection, which is derived from SgAsmGenericSection. SgAsmElfSection::get_section_entry returns a pointer to the ELF Section Table Entry (SgAsmElfSectionTableEntry). Some members of these objects can be modified and some can't. 1. These functions should not be called, since their values are overwritten during the unparse phase: • SgAsmElfSectionTableEntry::set_sh_name (see SgAsmGenericSection::set_name) • SgAsmElfSectionTableEntry::set_sh_addr (see SgAsmGenericFile::shift_extend) • SgAsmElfSectionTableEntry::set_sh_offset (see SgAsmGenericFile::shift_extend) • SgAsmElfSectionTableEntry::set_sh_size (see SgAsmGenericFile::shift_extend) • SgAsmElfSectionTableEntry::set_sh_link (don't call; no alternative yet) 2. Can modify: • SgAsmElfSectionTableEntry::set_sh_type • SgAsmElfSectionTableEntry::set_sh_flags (although the Write and Execute bits are ignored) • SgAsmElfSectionTableEntry::set_sh_info • SgAsmElfSectionTableEntry::set_sh_addralign • SgAsmElfSectionTableEntry::set_sh_entsize 12.6 Usage. See the ROSE Tutorial for exa
167. CHAPTER 7. AST PROCESSING [Figure 7.9: Numbers at nodes show the order in which the visit function is called in a postorder traversal.] 7.11 VISUALIZATION [Figure 7.10: AST graph rooted at SgGlobal. Numbers at nodes show the order in which the function evaluateInheritedAttribute is called in a top-down processing.] [Next figure: AST graph rooted at SgGlobal; caption not included in this excerpt.]
168. ized attributes for computing inherited attributes, which is obvious from the method signatures. If such a data dependence needs to be represented, member variables of the traversal object can be used to simulate such a behavior to some degree. Essentially, this allows for the implementation of a pattern also called accumulation. For example, building a list of all nodes of the AST can be implemented using this technique. template<class InheritedAttributeType, class SynthesizedAttributeType> class AstTopDownBottomUpProcessing { public: SynthesizedAttributeType traverse(SgNode* node, InheritedAttributeType initialInheritedAttribute); SynthesizedAttributeType traverseWithinFile(SgNode* node, InheritedAttributeType initialInheritedAttribute); void traverseInputFiles(SgProject* projectNode, InheritedAttributeType initialInheritedAttribute); typedef ... SynthesizedAttributesList; protected: virtual InheritedAttributeType evaluateInheritedAttribute(SgNode* astNode, InheritedAttributeType inheritedValue); virtual SynthesizedAttributeType evaluateSynthesizedAttribute(SgNode* astNode, InheritedAttributeType inh, SynthesizedAttributesList synList) = 0; virtual SynthesizedAttributeType defaultSynthesizedAttribute(); }; 7.8 Combined Processing Classes. Running many read-only traversals on a single unchanged AST is an inefficient operation because every node is visited many times. ROSE therefore provides combined traversals that
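The accumulation pattern mentioned in this excerpt, where member variables of the traversal object carry information that attributes alone cannot, can be sketched as follows. The code is self-contained and does not derive from the ROSE processing classes: `Node` and `NodeCollector` are hypothetical, and the inherited attribute is simply the nesting depth.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical minimal tree node; not an SgNode.
struct Node {
    std::string label;
    std::vector<Node*> children;
};

// Top-down traversal whose inherited attribute is the nesting depth.
// The visited list and the maximum depth are accumulated in member
// variables rather than passed through attributes.
class NodeCollector {
public:
    void traverse(Node* node, int inheritedDepth) {
        if (node == nullptr) return;  // null children are skipped
        int depthForChildren = evaluateInheritedAttribute(node, inheritedDepth);
        for (Node* child : node->children) traverse(child, depthForChildren);
    }

    const std::vector<Node*>& visited() const { return visited_; }
    int maxDepth() const { return maxDepth_; }

private:
    int evaluateInheritedAttribute(Node* node, int inheritedDepth) {
        assert(node != nullptr);
        visited_.push_back(node);  // accumulation: list of all nodes seen
        if (inheritedDepth > maxDepth_) maxDepth_ = inheritedDepth;
        return inheritedDepth + 1;  // attribute handed down to the children
    }

    std::vector<Node*> visited_;
    int maxDepth_ = 0;
};
```

After `traverse(&root, 0)` returns, `visited()` holds every node in preorder and `maxDepth()` reports the deepest level reached. One traversal object should be used per run, since the accumulated state persists between calls.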
169. l but until then it is provided for reference. This chapter will at some point contain a reference to error messages (there are few at present; most abort upon error, just like a compiler).
• Developer's Appendix: This chapter contains information specific to the development of ROSE and is thus mostly of use only to ROSE developers.
• Frequently Asked Questions (FAQ): This chapter contains a series of frequently asked questions about the ROSE project.
• Glossary: Terms and definitions that simplify the documentation are included in this section. More will be added over time.
A later version of the manual will include performance data on different machines so that the use of different features in ROSE can be better understood. This work is incomplete at present (implemented, but not yet represented in the documentation).

Chapter 2 Getting Started

This chapter details how to build ROSE and how to begin to use ROSE to build a source-to-source translator. ROSE uses EDG and SAGE III internally. EDG is a commercial and proprietary C++ frontend that we are permitted to use to support our research work. SAGE III is loosely derived from SAGE II, which is derived from SAGE++. SAGE III is a rewrite of SAGE II and uses a similar object-oriented design and a similar interface (API). The developers of SAGE II suggested that we call our work on the C++ intermediate representation SAGE III. We are thankful to the developers of
170. l a UPC runtime system (such as has been done for OpenMP by Liao from the University of Houston).
5. K&R C strict mode: This is the mode used when compiling with --edg:old_c (file must have a .c filename extension). This option does not currently work with ROSE because prototyped versions of functions are used within rose_edg_required_macros_and_functions.h, and these are not allowed in EDG's old_c mode (translated from the ROSE --edg:old_c).
Most of the time the C++ mode is sufficient for compiling either C or C++ applications. Sometimes the C mode is required; then -rose:C_only is typically sufficient. The specific K&R strict C mode does not currently work in ROSE, but K&R C will compile in both the C and often the C++ modes without problem. For C99-specific codes (relatively rare), -rose:C99_only is sufficient. On rare occasions a greater level of control is required and the other modes can be used.

5.15.1 Strict modes cannot be used with g++ and gcc compilers as back-ends to ROSE

Note that GNU builtin functions are currently not properly defined in strict modes, so these modes should not be used. This is a problem for strict modes for both C and C++.

5.15.2 Use .c filename suffix to compile C language files

In general, most C programs can be compiled using -rose:C_only independent of their filename suffix. However, sometimes C program files that use a non-.c suffix cannot be handled by the -rose:C_only option
171. lassType
AST Memory Pool Statistics: numberOfNodes = 7507 memory consumption = 2071932 node = SgClassDeclaration
AST Memory Pool Statistics: numberOfNodes = 7060 memory consumption = 282400 node = SgTemplateArgument
AST Memory Pool Statistics: numberOfNodes = 6024 memory consumption = 385536 node = SgPartialFunctionType
AST Memory Pool Statistics: numberOfNodes = 5985 memory consumption = 1388520 node = SgFunctionParameterList
AST Memory Pool Statistics: numberOfNodes = 4505 memory consumption = 1477640 node = SgTemplateInstantiationDecl
AST Memory Pool Statistics: numberOfNodes = 3697 memory consumption = 162668 node = SgReferenceType
AST Memory Pool Statistics: numberOfNodes = 3270 memory consumption = 758640 node = SgCtorInitializerList
AST Memory Pool Statistics: numberOfNodes = 3178 memory consumption = 76272 node = SgMemberFunctionSymbol
AST Memory Pool Statistics: numberOfNodes = 2713 memory consumption = 119372 node = SgPointerType
AST Memory Pool Statistics: numberOfNodes = 2688 memory consumption = 161280 node = SgThrowOp
AST Memory Pool Statistics: numberOfNodes = 2503 memory consumption = 60072 node = SgFunctionSymbol
AST Memory Pool Statistics: numberOfNodes = 2434 memory consumption = 107096 node = SgFunctionTypeSymbol
AST Memory Pool Statistics: numberOfNodes = 2418 memory consumption = 831792 node = SgFunctionDeclaration
AST Memory Pool Statistics: numberOfNodes = 2304 memory consumption = 55296 node = SgVariableSymbol
AST Memory Pool Statistics: numberOfNod
172. ll other options are passed to the back-end compiler with no processing.

5.15 How to Control Language Specific Modes: C++, C, C99, UPC

ROSE supports a number of different modes internally (within ROSE, the SAGE III IR, and the EDG front-end). There are five modes supported:
1. C++ mode
   a. C++ mode (default): This mode is used when compiling all files when no command line options are specified.
   b. C++ strict_warnings mode (-edg:a): This is the mode used when compiling with -edg:a; violations are issued as warnings. Note that GNU builtin functions are currently not properly defined in strict modes, so these modes should not be used.
   c. C++ strict mode (-edg:A): This is the mode used when compiling with -edg:A; violations are issued as errors. Note that GNU builtin functions are currently not properly defined in strict modes, so these modes should not be used. These strict modes are therefore incompatible with the use of the g++ and gcc compilers as a back-end to ROSE.
2. C mode
   a. ANSI C non-strict mode: This is the mode used when compiling with -rose:C_only (C89 standard; works best if files have a .c filename extension). This implies conformance with the C89 ANSI standard. Also equivalent to the --edg:c option.
   b. ANSI C strict_warnings mode (-edg:a): This is the mode used when compiling with -edg:a in addition to the --edg:c or -rose:C_only options (file must have
173. make it possible to run several traversals of the same base type in a single traversal, reducing the overhead considerably. Processing classes need not be adapted for use with the combined processing framework, so existing traversals can be reused; new traversals can be developed and tested independently and combined at any time.
To make sure that combined traversals work correctly, they should not change the AST or any other shared data. Terminal output from combined processing classes will be interleaved. No assumptions should be made about the order in which the individual traversals will be executed on any node.
For each Ast*Processing class there is a corresponding AstCombined*Processing class that behaves similarly. The interfaces for two of these classes are presented below; the others are analogous.

typedef enum { preorder, postorder } t_traversalOrder;

class AstCombinedSimpleProcessing {
public:
    void traverse(SgNode* node, t_traversalOrder treeTraversalOrder);
    void traverseWithinFile(SgNode* node, t_traversalOrder treeTraversalOrder);
    void traverseInputFiles(SgProject* projectNode, t_traversalOrder treeTraversalOrder);
    void addTraversal(AstSimpleProcessing* traversal);
    vector<AstSimpleProcessing*>& get_traversalPtrListRef();
};

template<class InheritedAttributeType, class SynthesizedAttributeType>
class AstCombinedTopDownBottomUpProcessing {
public:
    vector<SynthesizedAttributeType> traverse(SgNode* node, vector<InheritedAttrib
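The idea behind combined traversals, one walk over the tree that feeds several independent read-only visitors, can be sketched as follows. This is an illustration only, assuming toy types: `Node`, `SimpleVisitor`, `CombinedPreorder`, `CountNodes`, and `CountKind` are hypothetical names and do not reproduce the ROSE classes listed in this item.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an AST node (hypothetical; NOT ROSE's SgNode).
struct Node {
    std::string kind;
    std::vector<Node*> children;
};

// Hypothetical visitor interface playing the role of a simple processing class.
class SimpleVisitor {
public:
    virtual ~SimpleVisitor() = default;
    virtual void visit(const Node* node) = 0;
};

// Runs every registered visitor at each node during ONE preorder walk.
class CombinedPreorder {
public:
    int nodesWalked = 0;  // how many nodes the single walk touched

    void addTraversal(SimpleVisitor* visitor) {
        assert(visitor != nullptr);
        visitors_.push_back(visitor);
    }

    void traverse(const Node* node) {
        assert(node != nullptr);
        ++nodesWalked;
        // Visitors must be read-only and must not rely on this call order.
        for (SimpleVisitor* v : visitors_) v->visit(node);
        for (const Node* child : node->children) traverse(child);
    }

private:
    std::vector<SimpleVisitor*> visitors_;
};

// Two unrelated analyses that were written without knowing about each other.
class CountNodes : public SimpleVisitor {
public:
    int count = 0;
    void visit(const Node*) override { ++count; }
};

class CountKind : public SimpleVisitor {
public:
    explicit CountKind(std::string kind) : kind_(std::move(kind)) {}
    int matches = 0;
    void visit(const Node* node) override {
        if (node->kind == kind_) ++matches;
    }
private:
    std::string kind_;
};
```

With two visitors registered, the tree is walked once instead of twice, and each visitor still sees every node. Neither visitor modifies the tree or shares state with the other, which is the condition this item places on combined traversals.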
174. marTreeTraversalAccessEnums.h. For example, to access the synthesized attribute value of the SgForStatement's test expression, the synthesized attributes list is accessed using the enum definition for the test-expr. In the example we assign the pointer to a child node to a variable myTestExprSynValue:

SgNode* myTestExprSynValue = synList[SgForStatement_test_expr_root].node;

For each node with a fixed number of child nodes, the size of the synthesized attributes value list is always the same, independent of whether the children exist or not. For example, for the SgForStatement it is always of size 4. If a child does not exist, the synthesized attribute value is the default value of the respective type used for the synthesized attribute as template parameter.

7.7 AstTopDownBottomUpProcessing

This class combines all features from the two classes that were previously presented. It allows the user to use inherited and synthesized attributes. Therefore, the user needs to provide an implementation for two virtual functions: evaluateInheritedAttribute and evaluateSynthesizedAttribute. The signature for evaluateSynthesizedAttribute has an inherited attribute as an additional parameter. This allows the results of inherited and synthesized attributes to be combined. You can use the inherited attribute that is computed at a node A by the evaluateInheritedAttribute method in the evaluateSynthesizedAttribute method at node A. But you cannot use synthes
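The fixed-size synthesized attribute list described in this item can be illustrated with a small sketch. It assumes toy types rather than ROSE classes: `ForChildIndex`, `Expr`, `ForStmt`, `OpCount`, `collectChildAttributes`, and `totalOperators` are hypothetical names. The point shown is that the list always has four slots for a for-statement and that a missing child contributes the default-constructed attribute.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical child indices for a for-statement (four fixed child slots).
enum ForChildIndex : std::size_t {
    kInit = 0, kTest = 1, kIncrement = 2, kBody = 3, kForChildCount = 4
};

// Toy child payload: how many operators the child subtree contains.
struct Expr { int operatorCount; };

// Toy for-statement; a null pointer means the child does not exist.
struct ForStmt { std::array<const Expr*, kForChildCount> children; };

// Synthesized attribute type with a well-defined default value.
struct OpCount { int value = 0; };

using SynList = std::array<OpCount, kForChildCount>;

// Builds the list passed to the parent: always kForChildCount entries.
SynList collectChildAttributes(const ForStmt& stmt) {
    SynList synList{};  // every slot starts as the default attribute
    for (std::size_t i = 0; i < kForChildCount; ++i) {
        if (stmt.children[i] != nullptr) {
            synList[i].value = stmt.children[i]->operatorCount;
        }
    }
    return synList;
}

// Parent-side evaluation: no special case is needed for missing children.
int totalOperators(const SynList& synList) {
    int total = 0;
    for (const OpCount& attribute : synList) {
        assert(attribute.value >= 0);
        total += attribute.value;
    }
    return total;
}
```

Because missing children yield the default value, code that indexes the list by enum name (here `kTest`) is safe even for a loop with no initializer or increment.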
175. member functions of template classes which are required for the definition to appear in the generated source code. At present this is done with a member function that computes this information; see the IR node documentation for more detail.

5.6.3 Template Specialization

Things that can be specialized include classes, structures, unions, variables (static data members of templated classes), functions, and member functions. Templates and template instantiations need more information stored in the IR nodes to allow the unparser to be simplified. We currently compute this information within separate post-processing passes over the AST (see the source code in ROSE/src/frontend/SageIII/astPostProcessing for details). Interestingly, a template specialization is not an instantiation and can co-exist in each file and not cause linker problems (multiply defined symbols); it may cause generation of weak symbols.

5.6.4 Unparsing Templates

The general handling of templates requires a specific sorting of the template output. This order permits the generation of all template specializations, which allows each specialization to be transformed uniquely. This is important to the support of transformations on templates based on template arguments. The order of output for template handling is as follows:
Output templates: Raw template declarations (text strings) are output at the top of the file.
Output template funct
176. mple to define the structure of these tables. The easiest way to use these tables is the CREATE_TABLE macro. The first parameter is a GlobalDatabaseConnection object; the second one is the name of the table. Hence, line 17 will initialize the projects table and create an instance of the projectsTableAccess object having the same name as the table: projects. Line 18 initializes the files table in the same way. Now two instances of the TableAccess class for the projectsRowdata and the filesRowdata objects are declared in the main scope and are ready to be used.
The example program performs an initialization to retrieve the ids for the project and the file currently processed, which is usually needed for a traversal. Lines 21 and 22 set values for the project and file name, although these values might normally be retrieved from the corresponding SgProject and SgFile nodes. As all projects work on the single ROSE database and share the same tables for function and data, each of these tables has a projectId column to specify to which project each row belongs. Thus one of the first tasks a preprocessor using the database will do is to enable these ids to select or insert rows.
The TableAccess::retrieveCreateByColumn function is used for this purpose. It tries to identify an entry using a unique name, and creates that entry if it does not yet exist, or retrieves the id of the existing entry otherwise. The f
177. mples

Chapter 13 ROSE Tests

13.1 How We Test

ROSE includes a number of test codes. These test codes test:
1. Robustness of translators built using ROSE. A test translator (testTranslator) is built and used to process a number of test codes; both the compilation of the test code and the compilation of the generated source code are tested to make sure that they both compile properly. No execution of the generated code is attempted after compilation. These tests are used to verify the proper operation of ROSE as part of the standard SVN check-in process for all developers.
2. Execution of the code generated by the translator built using ROSE. Here tests are done to verify that the translator generated correct code that produced the same result as the original code.
3. Robustness of the internal mechanisms within ROSE. Here tests are done on separately developed features within the ROSE infrastructure (e.g. the AST Rewrite Mechanism, Loop Optimizations, etc.).
Specific directories of tests include:
• CompileTests: This directory contains code fragments that test the internal compiler mechanisms. Many of the code fragments (or whole codes) present either have previously presented or continue to present problems in the compilation (they demonstrate bugs). The CompileTests directory consists of several directories. The README file in the CompileTests directory gives more specific information. The test codes developed here are intended to be a small te
178. ms to be easily restricted in size to just the IR nodes specific to the binary executable file format.

12.2.2 Instruction Disassembly

ROSE has its own disassembler, a recursive disassembler that is well suited to the details of variable-length instruction set handling and data stored in the instruction stream. All details of the instructions, the operands, the operator expression trees, etc. are stored in the binary AST as separate IR nodes.

12.3 Binary Analysis

A number of binary analysis passes are provided; most are a part of the Compass framework for software analysis. See the Compass manual for more details on supported binary analysis. The ROSE tutorial shows a number of binary analysis passes over both the binary instructions and the executable file format.

12.4 Compass as a Binary Analysis Tool

Compass is a tool framework for building software analysis tools using rules (on source code and, alternatively, directly on binary executables). Compass reports violations of the rules in the evaluation of the software. Compass is a relatively simple application built on top of ROSE. Most of the complexity and code within Compass comes from its large collection of rules; each rule has its own implementation of an arbitrary test over the source code or the binary. Rules
179. [Continuation of the AST Memory Pool Statistics listing. Only the memory consumption column survived extraction; the node counts and node names are not recoverable.]
180. n AstTopDownBottomUpProcessing traversal of the AST. Using conditional compilation, the example code shows the somewhat trivial changes required to convert a read-only AST traversal into a read-write AST rewrite operation. In this example the AST traversal is converted to be ready for rewrite operations, but no rewrite operations are shown. The purpose of this example is only to show the modifications to an existing traversal that are required to use the AST rewrite mechanism.
The specialized AST rewrite traversal is internally derived from the AST processing (TopDownBottomUp) traversal, but it additionally records the local context of the source code position in the inherited attribute and performs additional operations on the synthesized attribute on the way back up the AST.

#include "rose.h"
#include "rewrite.h"

// Extra headers for customizing the rewrite mechanism
#include "rewriteTreeTraversalImpl.h"
#include "rewriteSynthesizedAttributeTemplatesImpl.h"
#include "rewriteMidLevelInterfaceTemplatesImpl.h"
#include "ASTFragmentCollectorTraversalImpl.h"
#include "prefixGenerationImpl.h"
#include "rewriteASTFragementStringTemplatesImpl.h"
#include "nodeCollectionTemplatesImpl.h"
#include "rewriteDebuggingSupportTemplatesImpl.h"

// Use new rewrite mechanism
#define USE_REWRITE_MECHANISM 1

// Notice that only the names of
181. n II code (not implemented yet)
-rose:FortranI, -rose:FI, -rose:fI
    compile Fortran I code (not implemented yet)
-rose:strict
    strict enforcement of ANSI/ISO standards
-rose:astMerge
    merge ASTs from different files
-rose:astMergeCommandFile FILE
    filename where compiler command lines are stored for later processing (using AST merge mechanism)
-rose:compilationPerformanceFile FILE
    filename where compiler performance for internal phases (in CSV form) is placed for later

Example command line to execute exampleTranslator:
exampleTranslator roseTestProgram.C
Figure 3.4: Example command line for compilation of C++ source file (roseTestProgram.C).

Example output from execution of exampleTranslator
Figure 3.5: Example of output from execution of exampleTranslator.

Chapter 4 The ROSE Infrastructure

4.1 Introduction

This chapter was requested by several people who wanted to understand how ROSE was designed and implemented. ROSE supports a number of different languages and uses different parsers and/or frontends to address each one. For C, C99, UPC, and C++ we use the EDG frontend, while for Fortran we use the Open Fortran Parser as a parser and build the frontend required. ROSE contains a midend, where analysis support is made available, and a backend, which does the code generation from the IR. The goal of t
182. n of any dependence edge, the transformation is guaranteed to preserve the original semantics of the program. If two statements s1 and s2 are both surrounded by loops, then for each dependence edge between s1 and s2 the dependence graph also defines a condition that must hold between the iterations of these loops. The compiler then uses the dependence relations to determine the safety of transforming these loops.
In traditional unimodular and single-loop transformation systems, the dependence relation between each pair of statements s1 and s2 is defined using a vector of direction or distance entries, where each direction or distance entry defines the relation between the iterations of a common loop surrounding both s1 and s2. The compiler then uses these dependence vectors to determine the safety of transforming a set of common loops that are perfectly nested.
In order to effectively transform arbitrary non-perfectly nested loop structures, this package extends the traditional dependence model with a new dependence representation, the Extended Direction Matrix (EDM). Given two statements s1 and s2, a dependence EDM from s1 to s2 defines a direction or distance entry for each pair of loops (l1, l2) such that l1 surrounds s1 and l2 surrounds s2. This new dependence representation thus defines dependence conditions not only for common loops surrounding both s1 and s2, but also for non-common loops that surround only one of
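As a small illustration of how traditional direction vectors are used to decide safety, the sketch below checks whether interchanging two perfectly nested loops keeps every dependence valid. It shows the textbook direction-vector test only, not the EDM representation and not the package's actual implementation; `Dir`, `DirVector`, `isValidDependence`, and `interchangeIsSafe` are hypothetical names. The assumption is the usual convention that a dependence is valid when the first entry that is not '=' is '<'.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Direction of a dependence with respect to one common loop:
// Less: source iteration precedes the sink; Equal: same iteration;
// Greater: source iteration follows the sink.
enum class Dir { Less, Equal, Greater };

// One entry per common loop, outermost loop first.
using DirVector = std::vector<Dir>;

// A dependence is valid if its first non-Equal entry is Less
// (or if all entries are Equal, i.e. it is loop-independent).
bool isValidDependence(const DirVector& v) {
    for (Dir d : v) {
        if (d == Dir::Less) return true;
        if (d == Dir::Greater) return false;
    }
    return true;
}

// Interchanging loops a and b permutes the matching entries of every
// direction vector; the interchange is safe if all vectors stay valid.
bool interchangeIsSafe(const std::vector<DirVector>& dependences,
                       std::size_t a, std::size_t b) {
    for (DirVector v : dependences) {  // copy, so the input is untouched
        assert(a < v.size() && b < v.size());
        std::swap(v[a], v[b]);
        if (!isValidDependence(v)) return false;
    }
    return true;
}
```

For example, a dependence with directions (<, >) becomes (>, <) after interchange, so the interchange is rejected, while (<, =) becomes (=, <) and remains valid.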
183. nal examples are: The directives in this file are for the legato backup system. Here we specify not to backup any of the following file types: t skip ppm o show. More information can be found at www ipnom com Legato Net Worker Commands nsr 5 html. Thanks for saving a number of people a lot of work.

16.13 Setting Up Nightly Tests

Directions for using roseFreshTest to set up periodic regression tests:
1. Get an account on the machine you are going to run the tests on.
2. Get a scratch directory (normally /export/0/tmp/<your username>) on that machine.
3. Copy (using svn cp) a stub script (scripts/roseFreshTestStub) to one with your name.
4. Edit your new stub script as appropriate:
   a. Set the versions of the different tools you want to use (compiler).
   b. Change ROSE_TOP to be in your scratch directory.
   c. Set ROSE_SVNROOT to be the URL of the trunk or branch you want to test.
   d. Set MAILADDRS to the people you want to be sent messages about the progress and results of your test.
   e. MAKEFLAGS should be set for most people's needs, but the -j setting might need to be modified if you have a slower or faster computer.
   f. If you would like the copy of ROSE that you test to be checked out using svn checkout rather than the default of svn export, add a line SVNOP=checkout to the stub file.
   g. The default mode of roseFreshTest is to use the most current version
184. nd src/ROSETTA/Grammar directories. This section will be added to in the future.

4.4.2 Implementation of Fortran support

All Fortran support in ROSE uses the Open Fortran Parser (OFP), developed at Los Alamos as part of a community effort to define an open Fortran parser that tracks the Fortran language (it supports Fortran 2003 and the anticipated Fortran 2008). ROSE uses the OFP and builds from the parser the implementations of the parser actions required to construct a proper Fortran frontend. Because the Fortran frontend in ROSE uses the ROSE IR, the analysis in the midend can be used (or has been fixed up for use) with Fortran. A backend is also defined in ROSE so that source-to-source support for Fortran is provided.

Chapter 5 SAGE III Intermediate Representation

There are many details that this chapter on SAGE will present.

5.1 History of SAGE

We chose to develop and use SAGE III, originally developed as SAGE++ by Dennis Gannon and others at the University of Indiana, and then as SAGE II by Dennis at IU, Carl Kesselman at ISI, and others. Because SAGE III is a reimplementation of a similar object-oriented IR API, their work gave us a significant head start in the development of ROSE and an understanding of object-oriented IRs.

5.1.1 Differences Between SAGE++ and SAGE II

SAGE++ was the first version of SAGE, and it provided support for C, a subset of C++ (C++ evolved quite a bit early on and was a moving target), and F90. SAGE
185. ng QMTest Files

1 if [ "${ARG:0:1}" != "/" ]; then
2   ARG=`pwd`/$ARG
3 fi
4 TEST=`echo $ARG | sed -e 's/\//_/g' | sed -e 's/\./_/g' | gawk '{print tolower($0)}'`.qmt
5 OBJECT=$TEST.o
Figure 14.3: Naming procedure for QMTest files.

At this block of code it is assumed that $ARG contains the name of either the source or object file specified by the command line. This name must first contain its absolute path to prevent name collisions, which is handled by the if construct on lines 1-3. The TEST name is then created on line 4 by replacing any forward slashes or dots in $ARG with underscores. The OBJECT name is simply the TEST name value with the .o extension. The object file name argument held in $OBJECT is appended to the end of the QMTest argument list along with the -o flag. Note that QMTest does not allow capital alphabetic letters or periods in the names of individual tests.

14.4.4 Create QMTest test and Execute Backend

1 qmtest create -o $TEST -a program="$PROGRAM" -a arguments="$ARGUMENTS" test $TEST_CLASS
2 $BACKEND   # Execute the old command line to fake the makefile
3 exit
Figure 14.4: Create .qmt and Execute Backend.

Line 1 creates a .qmt QMTest file with the name $TEST that executes $PROGRAM with arguments $ARGUMENTS using the class $TEST_CLASS. The .qmt test file is created in the present working directory of the project's build system file structure under the make process. Lines 2
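The naming steps described in this item (make the path absolute, replace slashes and dots with underscores, lowercase the result) can be sketched as a small function. This is an illustration, not the script from the figure: `makeTestName` and its `workingDir` parameter are hypothetical, and appending the `.qmt` suffix after the substitutions is an assumption about how the figure's line 4 ends.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper mirroring the naming steps described in the text.
// workingDir plays the role of the output of `pwd`.
std::string makeTestName(std::string arg, const std::string& workingDir) {
    assert(!workingDir.empty());
    // Step 1: prefix relative names with the working directory so that
    // files with the same name in different directories do not collide.
    if (arg.empty() || arg[0] != '/') {
        arg = workingDir + "/" + arg;
    }
    // Step 2: replace '/' and '.' with '_' and lowercase everything else,
    // since test names may not contain periods or capital letters.
    for (char& c : arg) {
        if (c == '/' || c == '.') {
            c = '_';
        } else {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
    }
    // Step 3 (assumption): the test file carries the .qmt suffix.
    return arg + ".qmt";
}
```

For instance, `makeTestName("src/Main.C", "/home/user/build")` yields `_home_user_build_src_main_c.qmt`, which contains no capital letters and no periods other than the suffix.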
186. ng profitability analysis algorithms in a set of policy classes and then using these policy classes to control the application of loop transformations. A flexible internal interface is provided for compiler writers to plug in their own performance model for various optimization purposes (see Section 10.2.2).
The currently available performance model includes only the counting of array references being reused, including both temporal and spatial cache reuses. Because the package has not yet implemented the calculation of the working set size of each loop body, it cannot automatically decide the tile size for each blocked loop nest. Similarly, because the current data reuse analysis is insufficient for calculating the trade-off between outer loop blocking and inner loop blocking, the package asks the user to specify the desired strategy. It then applies the specified strategy uniformly to all the loop nests.
The profitability analysis algorithms within this package are not yet complete and will incorporate more sophisticated algorithms in the future. These algorithms include not only various strategies to automate the decision of blocking parameters, but also runtime tuning strategies that execute applications on a specific machine and then use the collected performance information to automatically select the best overall transformations.

Chapter 11 AST Merge
187. nsformations as strings. The low-level interface is useful when AST fragments are built directly using the SAGE III classes through their constructors (a somewhat tedious process). The low-level interface preserves the original interfaces adopted from SAGE II.

Interface | Contains State | Positioning | String      | Traversal
Level 1   | No State       | Absolute    | AST Subtree | Not Used
Level 2   | No State       | Absolute    | AST Subtree | Not Used
Level 3   | No State       | Absolute    | String      | Not Used
Level 4   | State          | Relative    | String      | Required
Table 8.2: Advantages and disadvantages of different level interfaces within the ROSE Rewrite Mechanism.

8.3 Generation of Input for Transformation Operators

Providing operators to insert, replace, and remove solves only part of the problem of simplifying transformations. The other part of the problem is generating the input to the transformation operators. Both insert and replace require input, either as an AST fragment or as a string containing source code. This section presents the pros and cons of the specification of transformations as strings.

8.3.1 Use of Strings to Specify Transformations

The mid-level and high-level rewrite interfaces introduce the use of strings to specify transformations. Using strings to specify transformations attempts to define a simple mechanism for a non-compiler audience to express moderately complex transformations
188. nstall_dir/include directory for their compilation. To test the linking of ROSE translators, it forces the previously compiled example translator to use only the libraries installed in install_dir/lib. This is sufficient to test the installation the way that users are expected to use ROSE (only from an installed version). A sample makefile is generated, see

16.2 How to recover from a file system disaster at LLNL

Disasters can happen (cron scripts can go very, very badly). If you lose files on the CASC cluster at LLNL, you can get the backup from the night before. It just takes a while. To restore from backups at LLNL, use the command restore:
1. add <directory name>: This will build the list of files to be recovered.
2. recover: This will start the process to restore the files from tape. This process can take a long time if you have a lot of files to recover.

16.3 Generating Documentation

There is a standard GNU make docs rule for building all documentation. Note to developers: To build the documentation (make docs) you will need LaTeX, Doxygen, and DOT to be installed (check the list of dependences in the ROSE ChangeLog). If you want to build the reference manual of LaTeX documentation generated by Doxygen (not suggested), you may have to tailor your version of LaTeX to permit larger internal buffer sizes. All the other LaTeX documentation, such as the User Manual (but not the Reference Manual), may be built without problems using the default con
189. nt main(int argc, char* argv[]) {
    SgProject* astNode = frontend(argc, argv);
    MyVisitor v;
    v.traverseInputFiles(astNode, preorder);
}
Figure 7.3: Example main program MyVisitorMain.C.

7.4 AstPrePostProcessing

The AstPrePostProcessing class is another traversal class that does not use attributes. In contrast to the AstSimpleProcessing class, which performs either a preorder or a postorder traversal, AstPrePostProcessing has both a preorder and a postorder component. Two different visit methods must be implemented, one of which is invoked in preorder (before the child nodes are visited), while the other is invoked in postorder (after all child nodes have been visited). This traversal is therefore well suited for applications that require actions to be triggered when entering or leaving certain subtrees of the AST.

class AstPrePostProcessing {
public:
    void traverse(SgNode* node);
    void traverseWithinFile(SgNode* node);
    void traverseInputFiles(SgProject* projectNode);
protected:
    virtual void preOrderVisit(SgNode* node) = 0;
    virtual void postOrderVisit(SgNode* node) = 0;
};

The user needs to implement the preOrderVisit and postOrderVisit methods, which are called before and after visiting child nodes, respectively.

7.5 AstTopDownProcessing

This class allows the user to use a restricted form of inherited attributes to be computed for the AST. The user needs to implement the function evaluateInheritedAttribute. This function i
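The enter/leave behaviour of a combined preorder and postorder traversal can be sketched on a toy tree. The code below is not ROSE code; it is a minimal illustration in which `Node`, `PrePostTraversal`, and `ScopeTracer` are hypothetical names modelled on the interface shown in this item.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an AST node (hypothetical; NOT ROSE's SgNode).
struct Node {
    std::string name;
    std::vector<Node*> children;
};

// Hypothetical miniature of a pre/post-order traversal base class.
class PrePostTraversal {
public:
    virtual ~PrePostTraversal() = default;

    void traverse(const Node* node) {
        assert(node != nullptr);
        preOrderVisit(node);   // before any child is visited
        for (const Node* child : node->children) traverse(child);
        postOrderVisit(node);  // after all children have been visited
    }

protected:
    virtual void preOrderVisit(const Node* node) = 0;
    virtual void postOrderVisit(const Node* node) = 0;
};

// Example client: logs entering and leaving each subtree and tracks how
// deeply subtrees are nested, without using any attributes.
class ScopeTracer : public PrePostTraversal {
public:
    std::vector<std::string> events;
    int maxNesting = 0;

protected:
    void preOrderVisit(const Node* node) override {
        events.push_back("enter " + node->name);
        if (++nesting_ > maxNesting) maxNesting = nesting_;
    }
    void postOrderVisit(const Node* node) override {
        events.push_back("leave " + node->name);
        --nesting_;
    }

private:
    int nesting_ = 0;
};
```

Each node produces exactly one "enter" and one "leave" event, and the events of a child are always bracketed by those of its parent, which is what makes this style convenient for scope-like bookkeeping.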
190. o move .text one byte later in the file:
• .text might not satisfy its file alignment constraint.
• The end of .text might overlap with the following section. The ELF unparser has undefined behavior when two sections overlap without storing identical bytes at the overlapping regions.
• .text, if memory mapped (which it surely is), might not be consistent with the mapping of other adjacent or overlapping sections. For instance, .text is contained in "ELF Load Segment 2" both in the file address space and in the mapped memory space. The offset from ELF Load Segment 2 to .text must be identical in both file and memory.
• RVAs that point to instructions in .text can be associated with the .text section or with ELF Load Segment 2, depending on how they were parsed. Normally it doesn't matter which, since the relationship between file address space and memory address space is consistent. But if you change the file addresses without changing memory addresses, then the byte to which the RVA points could be ambiguous.
Changes to ELF Section or Segment file addresses are reflected in the ELF Section Table and/or ELF Segment Table. If the particular SgAsmGenericSection is present in both tables, then modifying its file address will result in updates to both tables.
NOTE: Do not modify section offsets and sizes by modifying the section table entries. Changes to these values will
191. -o reader.o
decoder.o: decoder.c decoder.h
	$(CXX) $(CFLAGS) -c decoder.c -o decoder.o
Figure 14.5: makefile before editing

QM = /home/yuan5/RoseQMTest/scripts/qm.sh
ROSE = /home/yuan5/bin/identityTranslator
MYCC = $(QM) rose.RoseTest $(ROSE)
CXX = $(MYCC) g++
ROSEFLAGS = rose edg no_warnings
CFLAGS = $(ROSEFLAGS) -g -Wall
CPU.out: main.o registers.o reader.o decoder.o
	$(CXX) $(CFLAGS) -o CPU.out reader.o registers.o decoder.o main.o
main.o: main.c registers.h reader.h decoder.h instruction.h
	$(CXX) $(CFLAGS) -c main.c -o main.o
registers.o: registers.c registers.h main.h
	$(CXX) $(CFLAGS) -c registers.c -o registers.o
reader.o: reader.c reader.h instruction.h
	$(CXX) $(CFLAGS) -c reader.c -o reader.o
decoder.o: decoder.c decoder.h
	$(CXX) $(CFLAGS) -c decoder.c -o decoder.o
Figure 14.6: makefile after editing

bash-2.05b$ make
/home/yuan5/RoseQMTest/scripts/qm_file.sh rose.RoseTest g++ rose edg no_warnings -g -Wall -c main.c
/home/yuan5/RoseQMTest/scripts/qm_file.sh rose.RoseTest g++ rose edg no_warnings -g -Wall -c registers.c
/home/yuan5/RoseQMTest/scripts/qm_file.sh rose.RoseTest g++ rose edg no_warnings -g -Wall -c reader.c
/home/yuan5/RoseQMTest/scripts/qm_file.sh rose.RoseTest g++ rose edg no_warnings -g -Wall -c decoder.c
/home/yuan5/RoseQMTest/scripts/qm_file.sh rose.RoseTest /home/yuan5/bin/identityTranslator g++ rose edg no_warnings -g -Wall -o CPU.out reader.o registers.o decoder.o ma
192. 9.2.1 Making a Connection to the Database and Table Creation
9.2.3 Working With Database Graphs
9.2.4 A Simple Callgraph Traversal
10 Loop Transformations
10.1 Introduction
10.3.3 Transformation Framework
10.3.4 Profitability Analysis
11 AST Merge: Whole Program Analysis Support
11.1 Introduction
11.2 Usage
12 Binary Analysis: Support for the Analysis of Binary Executables
12.1 Introduction
12.2 The Binary AST
12.2.1 The Binary Executable Format
12.2.2 Instruction Disassembly
12.3 Binary Analysis
12.4 Compass as a Binary Analysis Tool
12.5 Static Binary Rewriting
12.5.1 Generic Section/Segment Modifications
12.5.3 Modifications to ELF String Tables and their Containing Sections
12.5.4 Modifications ELF Section Table Entries
13 ROSE Tests
13.1 How We Test
193. of ROSE on your branch as the one to test. If you would like to test a previous version, you can set SVNVERSIONOPTION to the revision specification to use (one of the arguments to -r in svn help checkout).
5. Check your stub script in so that it will be backed up, and so that other people can copy from it or update it to match (infrequent) changes in the underlying scripts.
6. Run crontab -e on the machine you will be testing on:
    (a) Make sure there is a line with MAILTO=<your email>.
    (b) Add new lines for each test you would like to run:
        i. If other people are using the machine you are running tests on, be sure to coordinate the time your scripts are going to run with them.
        ii. See man crontab for the format of the time and date specification.
        iii. The command to use is (all one line):
             cd <your ROSE source tree>/scripts && ./roseFreshTest ./roseFreshTestStub-<your stub name>.sh <extra configure options>
             where <extra configure options> are things like --enable-edg_union_struct_debugging, --with-C_DEBUG, --with-java, etc.
7. Your tests should then run on the times and dates specified.
8. If you would ever like to run a test immediately, copy and paste the correct line in crontab -e and set the time to the next minute (note that the minute comes first, and the hour is in 24-hour format); ensure the date specification includes today's date. Be sure to quit your editor
194. of its organization.
- Getting Started: This chapter walks the user through the configuration, compilation, installation, and testing of ROSE. Installation requirements are also explained. A small set of tests is available to verify the installation.
- Writing a Source-to-Source Translator: This chapter presents, by example, the details of writing a trivial translator using ROSE.
- Overview of ROSE: This chapter presents details of specific features in ROSE.
- AST Query Library: This chapter presents work that has been completed to support simple and complex queries on the AST.
- AST Processing: This chapter covers different ways to write AST traversals (operators on the AST). This chapter is required to understand the subsequent chapter on the AST Rewrite Mechanism.
- AST Rewrite Mechanism: This chapter covers the details of how to use the mechanism within ROSE for modifying the AST. It describes how to write general transformations on the Abstract Syntax Tree (AST) and builds on concepts from the previous chapter.
- Program Analysis: This chapter explains what program analysis is available within ROSE.
- Loop Transformations: This chapter explains the loop optimization work that has been done.
- SAGE III Intermediate Representation: This chapter details issues specific to the IR used in ROSE.
- Appendix: This contains information that has not yet made its way into the manual. Much of this information will later be integrated into the User Manua
195. ons h DEFINE_TABLE_PROJECTS DEFINE_TABLE_FILES DEFINE TABLE GRAPHDATA DEFINE TABLE GRAPHNODE DEFINE TABLE_GRAPHEDGE CREATE TABLE2 testtable string name double number DEFINE TABLE 2 testtable string name double number define TABLES DEFINED 1 include DatabaseGraph h define traversal classes NYI int main int argc char argv GlobalDatabaseConnection db int initOk db initialize assert initOk 0 CREATE TABLE db projects da files CREATE_TABLE db graphdata CREATE TABLE db graphnode CREATE_TABLE db graphedge TableAccess lt testtableRowdata gt testtable amp db testtable initialize initialize project string projectName testProject this should be given at the command lin string fileName testFile C this should be retrieved from a SgFile projectsRowdata prow UNKNOWNID projectName UNKNOWNID projects retrieveCreateByColumn amp prow name projectName long projectId prow get_id get id of this file filesRowdata frow UNKNOWNID projectId fileName files retrieveCreateByColumn amp frow fileName fileName frow get_projectId long fileld frow get_id init graph DatabaseGraph lt testtableRowdata EdgeTypeEmpty gt callgraph SO 2 Pr a 11 A af es ra IAEA om Chapter 10 Loop Transformations 10
196. ontaining customized source code for VarArgOp in src/ROSETTA/src/expression.C:

    VarArgOp.setFunctionPrototype("HEADER_VARARG_OPERATOR", "Grammar/Expression.code");
    VarArgOp.setDataPrototype("SgExpression*", "operand_expr", "= NULL",
        CONSTRUCTOR_PARAMETER, BUILD_ACCESS_FUNCTIONS, DEF_TRAVERSAL, NO_DELETE);
    VarArgOp.setDataPrototype("SgType*", "expression_type", "= NULL",
        CONSTRUCTOR_PARAMETER, BUILD_ACCESS_FUNCTIONS, NO_TRAVERSAL || DEF2TYPE_TRAVERSAL, NO_DELETE);
    ...
    VarArgOp.setFunctionSource("SOURCE_EMPTY_POST_CONSTRUCTION_INITIALIZATION", "Grammar/Expression.code");

Pairs of special markers (such as SOURCE_VARARG_OPERATOR and SOURCE_VARARG_END_OPERATOR) are used for marking the header and implementation parts of the customized code. For example, the marked header and implementation code portions for VarArgOp in src/ROSETTA/Grammar/Expression.code are:

    HEADER_VARARG_OPERATOR_START
        virtual unsigned int cfgIndexForEnd() const;
        virtual std::vector<VirtualCFG::CFGEdge> cfgOutEdges(unsigned int index);
        virtual std::vector<VirtualCFG::CFGEdge> cfgInEdges(unsigned int index);
    HEADER_VARARG_OPERATOR_END
    ...
    SOURCE_VARARG_OPERATOR_START
    SgType* $CLASSNAME::get_type() const
    {
        SgType* returnType = p_expression_type;
        ROSE_ASSERT(returnType != NULL);
        return returnType;
    }
    unsigned int $CLASSNAME::cfgIndexForEnd() const
    {
        return 1;
    }
    ...
    SOURCE_VARARG_OP
197. optimization policies. This configuration class uses a set of policy classes, automatically selected from the command-line arguments as described in Section, to control the application of three loop transformations: interchange, fusion, and blocking. The currently available policy classes are defined in the locations driver/InterchangeAnal.h, driver/FusionAnal.h, and driver/BlockingAnal.h, respectively. To plug in different optimization strategies, the developer can write new profitability policy classes and then configure LoopTransformOptions to use the new algorithms. The command-line configurations are automatically extended when the developer registers these new policy classes.

10.3 Analysis and Transformation Techniques

This package implements the following techniques to optimize applications for better cache locality. This section provides only brief introductions to the algorithms without going into any detail. Most algorithms are described in detail in Qing Yi's Ph.D. thesis [34].

10.3.1 Dependence and Transitive Dependence Analysis

Similar to most of the existing loop optimizing compilers, this package models the safety requirement of loop transformations using a dependence graph. The dependence graph includes all the statements of the input code segment as vertices, and a dependence edge is put from statement s1 to s2 in the graph if s1 must be executed before s2. If a statement reordering transformation does not reverse the directio
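As a rough illustration of the safety condition on dependence edges, the following self-contained sketch checks whether a proposed statement order reverses any edge. It assumes that the truncated sentence above concludes that a reordering which reverses no dependence edge is safe. The names `DependenceEdge` and `reordering_is_safe` are hypothetical and are not part of the ROSE loop transformation package.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A dependence edge (from, to): statement `from` must execute before `to`.
// Statements are identified by their index 0..n-1 in the original code.
using DependenceEdge = std::pair<std::size_t, std::size_t>;

// `order[k]` is the index of the statement placed at position k after the
// transformation. Returns true when no dependence edge is reversed.
// Assumes every edge refers to statement indices below order.size().
bool reordering_is_safe(const std::vector<std::size_t>& order,
                        const std::vector<DependenceEdge>& edges) {
    std::vector<std::size_t> position(order.size());
    for (std::size_t k = 0; k < order.size(); ++k) {
        assert(order[k] < order.size());  // order must be a permutation
        position[order[k]] = k;
    }
    for (const DependenceEdge& e : edges) {
        if (position[e.first] >= position[e.second]) {
            return false;  // `to` would run before (or as) `from`
        }
    }
    return true;
}
```

With edges s0 to s2 and s1 to s2, swapping s0 and s1 keeps both edges pointing forward, while moving s2 to the front reverses them.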
198. or run additional tests.

Chapter 15 Appendix

This appendix covers a number of topics relevant to the use of ROSE which have not been worked into the main body of text in the ROSE User Manual.

15.1 Error Messages

The user will mostly only see error messages from EDG; these will appear like normal C++ compiler error messages. They can be turned off using the EDG option --edg:no_warnings or -edg:w on the command line of any translator built using ROSE.

15.2 Specifying EDG options

The EDG options are specified using --edg:<edg option> for EDG options starting with "--", or -edg:<edg option> for EDG options starting with "-". The details of the EDG-specific options are available at http://www.edg.com/docs/edg_cpp.pdf, available from the EDG web page at http://www.edg.com/cpp.html.

15.3 Easy Mistakes to Make: How to Ruin Your Day as a ROSE Developer

There are a few ways in which you can make mistakes within the development of the ROSE project:
1. Never run configure in your source tree. If you do, then never run make distclean, since this will remove many things required to develop ROSE. Things removed by make distclean are:
    (a) documentation, including several of the directories in ROSE/docs/Rose

15.4 Handling of source filename extensions in ROSE

On case-sensitive systems, ROSE handles
199. oreign::ID: return fold_foreign(dynamic_cast<Foreign*>(in)); return 0; case Switch_case::ID: return fold_switch_case(dynamic_cast<Switch_case*>(in)); case Catch::ID:

4. Once both packages have been installed, ROSE must be configured with the additional --with-php=/usr/local/php option.

16.15 Binary Analysis

The documentation for the binary analysis can be found in the ROSE manual at. However, there is a collection of details that we need to document about the design, so for now these details can go here. The support for binary analysis in ROSE has required a number of design meetings to discuss details. This section is specific to the support in ROSE for binary analysis and the development of that support.

16.15.1 Design of the Binary AST

This subsection is specific to the design of the binary executable file format, and specifically the representation of the binary file format in the Binary AST as a tree (in the graph sense) instead of as a directed graph, so that it can be traversed using the mechanisms available in ROSE.
    - Symbols: There are multiple references to symbols, as shown in the Whole Graph view of the AST with the binary format. We have selected the SgAsmELFSymbolTable and the SgAsmCoffSymbolTable instead of the SgAsmGenericSymbolTable because it points to the most derived type. An alternative reasoning is that in stripped
200. output qmt files
- TEST: The name of the QMTest test file created.
- TEST_CLASS: The QMTest class specified on command line.
- PROGRAM: The ROSE translator specified on command line.
- BACKEND: The original command line of the project build system with the backend compiler.
- ARGUMENTS: The compile line arguments specified on the command line, with any script-user-specified arguments for the ROSE translator (such as --edg:no_warnings), bound for the QMTest test file.
- LAST_ARG: The closing stub to the QMTest arguments format, along with the -o <object file name> argument.
- ARG: The current compile line argument place holder used in constructing the argument format to QMTest arguments: ARGUMENTS arg1 arg2 argN

14.4 Execution Walkthrough

qm.sh is broken into code blocks which each perform some procedure. These blocks are delimited with a solid line of 80 characters.

14.4.1 Backend and ROSE arguments

    1 for ARG in
    2 do
    3 COUNT
    4 if COUNT gt 3 then
    5 if CL ARG 0 6 rose then
    6 ARGUMENTS ARGUMENTS echo ARG sed e s rose g
    7 continue
    8 fi
    9 BACKEND BACKEND ARG   build original compile line

Figure 14.1: Backend and ROSE argument construction block

This block of code builds the original compile line of the project's build system along with the arguments passed specifically to the ROSE compiler. In the fo
201. oved since calling them had no lasting effect anyway. Instead, if you want to change one of these values for unparsing, then modify the actual SgAsmGenericSection that holds the table (e.g., calling SgAsmGenericFile::shift_extend).
9. ELF Section or Segment Table size: The number of entries in the section or segment table cannot be modified by calling set_e_shnum or set_e_phnum on the SgAsmElfFileHeader. Rather, the sizes are obtained by looking at what sections and segments are currently defined and writing an entry to the file for each one.
10. ELF Section Names: ELF section names can be modified. Doing so may cause extensive changes to the executable due to reallocation of the section holding the string table. Do not call SgAsmElfSectionTableEntry::set_sh_name, since that value will be overwritten based on the actual current location of the name in the associated string table.
11. ELF Segment Names: ELF segment names are often parser-generated based on constants in the ELF Segment Table. However, if the segment corresponds to an actual ELF Section defined in the ELF Section Table, then the segment and section share the same SgAsmGenericSection object, and changing the name causes the ELF Section name to change with no effect on the segment table.
12. ELF Section Name Table: The section that holds the section names is identified in the ELF File Header (get_e_shstrndx). Although it is possible to change this value
202. pile the example applications. These command lines will be what you will want to reproduce in your Makefile.

6. Where is the SgTypedefSeq used?
Any type may be hidden behind a chain of typedefs. The typedef sequence is the list of typedefs that have been applied to any given type.

7. Why are there defining and non-defining declarations?

    class X;              // non-defining declaration
    X* foo();             // return type of function will refer to non-defining declaration
    X* xPointer = NULL;   // again, the type will refer to a pointer to a type that will be the non-defining declaration
    class X {};           // defining declaration

The traversal will visit the declarations, so in this case you will see the "class X;" class declaration and the "class X {};" class declaration. In general, all references to the class X will use the non-defining declaration, and only the location where X is defined will be a defining declaration. This is discussed in great detail in the chapter on SAGE III of the ROSE User Manual and a bit in the Doxygen Web pages. In general, while unparsing we can't be sure where the definitions associated with declarations are in the AST without making the code generation significantly more complex.

    class X;
    class X {};

could be unparsed as:

    class X {};   // should have been: class X;
    class X;      // should have been: class X {};

The previous example hardly communicates the importance of this concept, but perhaps this one does: class X class Y class
203. r loop, all the arguments passed to qm.sh are looped through; however, the first three arguments are skipped due to the if statement on line 4. All other arguments after the third are considered arguments of either ROSE or the original project's build system. ROSE arguments must be prefixed with -rose:<ROSE argument> when specified on the compile line. Each argument with this prefix is stripped of the prefix -rose: and added to the ARGUMENTS list of the QMTest test file. ROSE arguments are not carried over to the BACKEND compile line variable, but all other arguments are appended without change, with the exception of the -o <object file name> flag.

14.4.2 Relative Path Compile line Arguments

    1 case ARG in
    2 I includeFullPath
    3 c cpp C ccC ARG pwd ARG
    4 o BOOL 1 continue   spike out o outputfilename
    5 AS
    6 esac

Figure 14.2: Relative to Absolute Paths in Arguments

This block of code handles all compile line arguments containing relative file or include paths. The case-esac switch statement compares against patterns indicative of C/C++ source files or an include directive. All source files without absolute paths stemming from root are simply prefixed with their present working directory. Directories specified by the -I include directive call the function includeFullPath, which changes relative paths to absolute paths.

14.4.3 Nami
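The argument-splitting logic described for qm.sh can be sketched outside of shell as well. The following is a minimal illustration in C++, not a transcription of the script: the function name `split_arguments`, the `SplitArguments` struct, and the exact prefix spelling `-rose:` are assumptions made for this example, and the special handling of the `-o` flag is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Result of partitioning a command line (hypothetical helper type).
struct SplitArguments {
    std::vector<std::string> rose;     // ROSE arguments, prefix stripped
    std::vector<std::string> backend;  // everything else, unchanged
};

// Skips the first three arguments, then routes each remaining argument:
// arguments starting with `prefix` go to the ROSE list with the prefix
// removed, all others go to the backend compile line.
SplitArguments split_arguments(const std::vector<std::string>& args,
                               const std::string& prefix = "-rose:") {
    assert(!prefix.empty());
    SplitArguments out;
    for (std::size_t i = 3; i < args.size(); ++i) {
        const std::string& arg = args[i];
        if (arg.compare(0, prefix.size(), prefix) == 0) {
            out.rose.push_back(arg.substr(prefix.size()));
        } else {
            out.backend.push_back(arg);
        }
    }
    return out;
}
```

Keeping the two lists separate mirrors the description above: ROSE arguments never reach the backend compile line, and backend arguments pass through untouched.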
204. rder or postorder traversal.

    typedef enum { preorder, postorder } t_traversalOrder;
    class AstSimpleProcessing {
    public:
        void traverse(SgNode* node, t_traversalOrder treeTraversalOrder);
        void traverseWithinFile(SgNode* node, t_traversalOrder treeTraversalOrder);
        void traverseInputFiles(SgProject* projectNode, t_traversalOrder treeTraversalOrder);
    protected:
        void virtual visit(SgNode* astNode) = 0;
    };

To use the class AstSimpleProcessing, the user needs to implement the function visit for a user-defined class that inherits from class AstSimpleProcessing. To invoke a traversal, one of the three traverse functions needs to be called.

7.3.1 Example

In this example, we traverse the AST in preorder and print the name of each node in the order in which they are visited. The following steps are necessary:
- Interface: Create a class MyVisitor that inherits from AstSimpleProcessing.
- Implementation: Implement the function visit(SgNode* astNode) for class MyVisitor.
- Usage: Create an object of type MyVisitor and invoke the function traverse(SgNode* node, t_traversalOrder treeTraversalOrder).

Figure 7.1 presents the interface, Figure 7.2 presents the implementation, and Figure 7.3 presents the usage.

    #include "MyVisitor.h"
    void MyVisitor::visit(SgNode* node) {
        cout << node->get_class_name() << endl;
    }

Figure 7.2: Implementation file MyVisitor.C

    #include "rose.h"
    #include "MyVisitor.h"
    i
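To show the pattern without requiring a ROSE installation, the following self-contained sketch mimics the interface with stand-in classes. `Node`, `SimpleProcessing`, `NameCollector`, and `visited_names` are all hypothetical names invented for this illustration; only the overall shape (a pure virtual visit function called by a traverse function in preorder or postorder) follows the description above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for SgNode: a named node with children (not a ROSE type).
struct Node {
    std::string name;
    std::vector<Node> children;
};

enum TraversalOrder { preorder, postorder };

// Stand-in for AstSimpleProcessing: subclasses only implement visit().
class SimpleProcessing {
public:
    virtual ~SimpleProcessing() = default;
    void traverse(const Node& node, TraversalOrder order) {
        assert(order == preorder || order == postorder);
        if (order == preorder) visit(node);
        for (const Node& child : node.children) traverse(child, order);
        if (order == postorder) visit(node);
    }
protected:
    virtual void visit(const Node& node) = 0;
};

// Plays the role of MyVisitor, but records names instead of printing them.
class NameCollector : public SimpleProcessing {
public:
    std::vector<std::string> names;
protected:
    void visit(const Node& node) override { names.push_back(node.name); }
};

// Convenience wrapper: run one traversal and return the visit order.
std::vector<std::string> visited_names(const Node& root, TraversalOrder order) {
    NameCollector collector;
    collector.traverse(root, order);
    return collector.names;
}
```

The user-written part is confined to `visit`; the traversal code itself lives entirely in the base class, which is the division of labor the section describes.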
205. ritedValue method may be implemented. This method is invoked with the inherited attribute computed at this node after all child nodes have been visited. It can free any resources necessary. An empty default implementation of this method is provided, so the method can be ignored if it is not needed.

7.5.1 Example

In this example, we traverse the AST and print the node names with proper indentation, according to the nesting level of C++ basic blocks. The function evaluateInheritedAttribute is implemented, and an inherited attribute is used to compute the nesting level. The following steps are necessary:
- Interface: Create a class MyIndenting which inherits from AstTopDownProcessing, and a class MyIndentLevel. The latter will be used for attributes. Note that the constructor of the class MyIndentLevel initializes the attribute value.
- Implementation: Implement the function evaluateInheritedAttribute(SgNode* astNode, MyIndentLevel inheritedValue) for class MyIndenting.
- Usage: Create an object of type MyIndenting and invoke the function traverse(SgNode* node, MyIndentLevel initialInheritedAttribute).

Figure 7.4 presents the interface, Figure 7.5 presents the implementation, and Figure 7.6 presents the usage.

Note that we could also use unsigned int as the attribute type in this simple example. But in general, the use of objects as attributes is more flexible, and necessary if you need to compute more than one attribute value in the same traversal.
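The indentation example can be sketched with stand-in classes so that it compiles without ROSE. Everything named here (`Node`, `IndentLevel`, `Indenting`, `indented_names`, and the `is_basic_block` flag) is a hypothetical simplification; in ROSE the check would be made on the node type and the traversal would be inherited from AstTopDownProcessing.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for an AST node; `is_basic_block` replaces a node-type test.
struct Node {
    std::string name;
    bool is_basic_block;
    std::vector<Node> children;
};

// Attribute object (plays the role of MyIndentLevel); starts at level 0.
struct IndentLevel {
    unsigned level = 0;
};

// Plays the role of MyIndenting: the value returned at a parent is the
// input value at each of its children (top-down, preorder).
class Indenting {
public:
    std::vector<std::string> lines;

    void traverse(const Node& node, IndentLevel inherited) {
        IndentLevel here = evaluateInheritedAttribute(node, inherited);
        for (const Node& child : node.children) traverse(child, here);
    }

private:
    IndentLevel evaluateInheritedAttribute(const Node& node, IndentLevel inherited) {
        if (node.is_basic_block) ++inherited.level;  // one level deeper
        lines.push_back(std::string(2 * inherited.level, ' ') + node.name);
        return inherited;
    }
};

// Convenience wrapper: returns the indented node names in visit order.
std::vector<std::string> indented_names(const Node& root) {
    Indenting indenting;
    indenting.traverse(root, IndentLevel{});
    assert(!indenting.lines.empty());  // the root is always visited
    return indenting.lines;
}
```

Because the attribute is passed by value, siblings never see each other's changes; only descendants of a basic block inherit the increased level.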
206. ror branches is preferred.
Put your code into source files (*.C) and as little as possible into header files.
If you use templates, put the code into a .C file and include that .C file at the bottom of your header file.
If you use a for loop and break out of the loop (using break) at some point in the iteration, then consider a while loop instead.
Don't forget a default statement within switch statements.
8. Please don't open namespaces in source files, i.e., use the fully qualified function name in the function definition to make the scope of the function as explicitly clear as possible.
9. Think about your variable names. I too often see Node, node, and n in the same function. Make your code obvious so that I can understand it when I'm tired or stupid or both.
10. Write good code so that we don't have to debug it after you leave.
11. Indent your code blocks.

My rules for style are as follows. Adhere to them if you like, or don't if you're appalled by them:
1. Indent your code blocks (I use five spaces, but some consider this excessive).
2. Put spaces between operators for clarity.

16.9 Things That May Happen to Your Code After You Leave

No one likes to have their code touched, and we would like to avoid having to do so. We would like to have your contribution to ROSE always work and never have to be touched. We don't wish to pass critical judgment on style
207. rs

ROSE represents a tool for building source-to-source translators. Such translators can be useful for many purposes:
- automated analysis and/or modification of source code
- instrumentation
- data extraction
- building domain-specific tools

An optimizing translator can be expected to both analyze the input source code and automatically generate transformations of the source code, the result being a new source code. If successful, the automatically generated source code will demonstrate better performance. ROSE is the tool that helps users write such source-to-source translators. Expected users would be library writers and tool developers, not necessarily the application developers. As a result, we expect the ROSE user to be more knowledgeable about programming language issues than the average application developer.

ROSE translators are particularly useful as a way to bridge the gap between what we want compilers to do and what they actually do. This semantic gap is significant when optimizing user-defined abstractions (functions and/or data structures), because the base language compiler has no knowledge of their semantics. The optimization is particularly important within scientific applications. Such applications are often expensive to build because they are exceedingly complex and must too often be written at low levels of abstraction to maintain significant performance on modern computer architectures. The modern computer architectures
208. rs/documentedExamples/astProcessingExamples

7.11 Visualization

7.11.1 Example Graphs

The graph shown in Figure 7.8 is the AST of the program in Figure 7.7. Such an output can be generated for an AST with:

    AstDOTGeneration dotgen;
    dotgen.generateInputFiles(projectNode, AstDOTGeneration::PREORDER);

where projectNode is a node of type SgProject, and the order in which the AST is traversed is specified to be AstDOTGeneration::PREORDER or AstDOTGeneration::POSTORDER.

    int main() {
        int n = 10;
        while (n > 0) {
            n = n - 1;
        }
        return n;
    }

Figure 7.7: Example program used as running example

[Figure 7.8: AST graph of the example program. The graph itself is not reproduced; its node labels include SgGlobal, SgFunctionDeclaration, SgFunctionParameterList, SgFunctionDefinition, SgBasicBlock, SgVariableDeclaration, SgReturnStmt, SgInitializedName, SgExprStatement, SgExpressionRoot, SgVarRefExp, SgAssignInitializer, and SgGreaterThanOp.]
209. s
- Security analysis tools
- and many more

1.3 What Is ROSE

ROSE is a project that aims to define a new type of compiler technology that allows compilation techniques to address the optimization of user-defined abstractions. Due to the nature of the solution we provide, it is also an open compiler infrastructure that can be used for a wide number of other purposes.

User-defined abstractions are built from within an existing base language and carry specific semantic information that can't be communicated to the base language's compiler. In many situations, the semantic information could be useful within program optimization, but the base language compiler is forced to ignore it because there is no way for applications to pass such additional information to the base language compiler. Note that pragmas only permit information that the base language compiler expects to be passed; they are not a meaningful mechanism to communicate arbitrary information about user-defined abstractions to a compiler.

ROSE is a part of general research on telescoping languages (a term coined by Ken Kennedy at Rice University) and SELL languages (a term coined by Bjarne Stroustrup). It is part of general work to define domain-specific languages economically from general-purpose languages.

1.4 ROSE: A Tool for Building Source-to-Source Translato
210. [Table residue: a listing of numberOfNodes counts per IR node type, in decreasing order from 922 down to 49. The node type names belonging to the counts were lost in extraction.]
211. s Conference, Sunriver, Oregon, April 1994.
Parsons, R. and Quinlan, D. Run-time Recognition of Task Parallelism within the P++ Parallel Array Class Library. Proceedings of the Conference on Parallel Scalable Libraries, Mississippi State, 1993.
Angus, I. G. and Thompkins, W. T. Data Storage, Concurrency, and Portability: An Object Oriented Approach to Fluid Dynamics. Fourth Conference on Hypercubes, Concurrent Computers, and Applications, 1989.
Baden, S. B., Kohn, S. R. Lattice Parallelism: A Parallel Programming Model for Non-Uniform, Structured Scientific Computations. Technical report of University of California, San Diego, Vol. CS92-261, September 1992.
Balsara, D., Lemke, M., Quinlan, D. AMR++, a C++ Object Oriented Class Library for Parallel Adaptive Mesh Refinement Fluid Dynamics Applications. Proceeding of the American Society of Mechanical Engineers, Winter Annual Meeting, Anaheim, CA, Symposium on Adaptive, Multilevel and Hierarchical Computational Strategies, November 8-13, 1992.
Berryman, H., Saltz, J., Scroggs, J. Execution Time Support for Adaptive Scientific Algorithms on Distributed Memory Machines. Concurrency: Practice and Experience, Vol. 3(3), pg. 159-178, June 1991.
Chandy, K. M., Kesselman, C. CC++: A Declarative Concurrent Object Oriented Programming Notation. California Institute of Technology, Report, Pasadena, 1992.
Chase, C., Cheung, A., Reeves, A., Smith, M. Paragon: A Parallel Programming Environment for
212. s called for each node when the AST is traversed. The inherited attributes are restricted such that a single attribute of a parent node is inherited by all its child nodes, i.e., the return value computed by the function evaluateInheritedAttribute at the parent node is the input value to the function evaluateInheritedAttribute at all child nodes.

    template <class InheritedAttributeType>
    class AstTopDownProcessing {
    public:
        void traverse(SgNode* node, InheritedAttributeType initialInheritedAttribute);
        void traverseWithinFile(SgNode* node, InheritedAttributeType initialInheritedAttribute);
        void traverseInputFiles(SgProject* projectNode, InheritedAttributeType initialInheritedAttribute);
    protected:
        InheritedAttributeType virtual evaluateInheritedAttribute(SgNode* astNode, InheritedAttributeType inheritedValue) = 0;
        void virtual destroyInheritedValue(SgNode* astNode, InheritedAttributeType inheritedValue);
    };

The function evaluateInheritedAttribute is called at each node. The traversal is a preorder traversal.

In certain rare cases, the inherited attribute computed at a node may involve resources that must be freed; for instance, the attribute may be a pointer to dynamically allocated memory that is no longer needed after the traversal of the child nodes has been completed. (Dynamically allocated attributes are only recommended for very large attributes where copying would be prohibitively expensive.) In such cases, the destroyInhe
213. s the root of the input code fragment to be transformed by the package. The parameters argc and argv represent command-line arguments that instruct the package to adopt specific optimization strategies: argc contains the number of arguments, and argv contains the vector of string arguments. The package currently recognizes the following arguments:
- -bk1 <blocksize>: apply outer loop blocking for better data reuse
- -bk2 <blocksize>: apply inner loop blocking for better data reuse
- -ic1: apply loop interchange for better data reuse
- -fs0: perform maximum loop distribution with no fusion afterwards
- -fs1: apply hierarchical single-level loop fusion for better data reuse
- -fs2: apply simultaneous multi-level loop fusion for better data reuse
- -tm: report timing information for each phase of the transformation package
- -ta <int>: set the maximum number of split nodes when performing transitive dependence analysis
- -clsize <int>: set cache line size for spatial reuse analysis

The loop transformation tool (LoopProcessor) within ROSE recognizes these command-line arguments and then automatically selects the corresponding optimization strategies. When invoked with no argument, LoopProcessor prints out usage information of this package.

10.2.2 Developer Interface

Utilizing the available internal interface, compiler developers can easily extend this package in two aspects. First, they can rewrite the outside interfa
214. sformation on a specific input code is particularly useful. Depending on the type of transformation, there are different mechanisms within ROSE to support the development of a transformation. Available mechanisms include (in decreasing levels of abstraction):
    (a) String-Based Specification: A transformation may specify new code to be inserted into the AST by specifying the new code as a source code string. Functions are included to permit insert, replace, and remove.
    (b) Calling Predefined Transformations: There are a number of predefined optimizing transformations (loop optimizations) that may be called directly within a translator built using ROSE.
    (c) Explicit AST Manipulation: The lowest level is to manipulate the AST directly. Numerous functions within SAGE III are provided to support this, but of course it is rather tedious.

4. How do I use the SQLite database?
ROSE has a connection to SQLite, but you must run configure with the correct command-line options to enable it. Example scripts to configure ROSE to use SQLite are in the ROSE/scripts directory. Another detail is that SQLite development generally lags behind ROSE in the use of the newest versions of compilers, so you are likely to be forced to use an older version of your compiler, particularly with GNU g++.

5. What libraries and include paths do I need to build an application using ROSE?
Run make installcheck and observe the command lines used to com
215. shallow copy n1 from orig shares all the children of orig, while the deep copy n2 from orig duplicates all the children of orig by recursively cloning the children objects. Note that the children of node orig are determined by the tree traversal mechanism of ROSE. A field fp within orig is considered a child of orig only if fp is traversed by the tree traversal mechanism. For all other fields in orig, only shallow copies are performed. As a result, only pointers to SgNodes that are part of the tree traversal rooted at orig can be recursively cloned.

To simplify the specification of shallow and deep cloning of SgNodes, two macros are further defined:

    #define SgSHALLOW_COPY SgShallowCopy::static_instance()
    #define SgTREE_COPY SgTreeCopy::static_instance()

The above example code can therefore be rewritten as:

    SgNode* orig;
    SgNode* n1 = orig->copy(SgSHALLOW_COPY);
    SgNode* n2 = orig->copy(SgTREE_COPY);

5.6 Template Handling in C++

The purpose of this section is to lay out the details of handling C++ templates. Initial template handling in SAGE III represented templates as classes and functions (using generated, i.e., mangled, names) and with a flag indicating their derivation from a C++ template. ROSE allows the transformation of templated classes and functions by generating the required specializations. This way, all details of a templated class or func
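The difference between the two kinds of copy can be demonstrated with a small stand-alone tree. This sketch uses a hypothetical `Node` struct with shared child pointers and two free functions; it illustrates the sharing behavior only and is not how SgNode::copy, SgShallowCopy, or SgTreeCopy are implemented.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Stand-in tree node (not SgNode): a value plus pointers to children.
struct Node {
    int value = 0;
    std::vector<std::shared_ptr<Node>> children;
};

// Shallow copy: the new node has its own value but shares every child
// object with the original.
std::shared_ptr<Node> shallow_copy(const Node& orig) {
    return std::make_shared<Node>(orig);  // copies the child pointers only
}

// Deep copy: every child is cloned recursively, so nothing is shared.
std::shared_ptr<Node> deep_copy(const Node& orig) {
    auto clone = std::make_shared<Node>();
    clone->value = orig.value;
    for (const std::shared_ptr<Node>& child : orig.children) {
        assert(child != nullptr);  // this sketch assumes no null children
        clone->children.push_back(deep_copy(*child));
    }
    return clone;
}
```

After a shallow copy, modifying a child through either parent is visible through both; after a deep copy the two trees are independent.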
216. sing and synthesized attributes for passing information up the AST bottom up processing Inherited attributes can be used to propagate context information along the edges of the AST whereas synthesized attributes can be used to compute values based on the information of the subtree One function for computing inherited attributes and one function for computing synthesized attributes must be implemented when attributes are used We provide different interfaces that allow both one or no attribute to be used in the latter case it is a simple traversal with a visit method called at each node The AST processing mechanism can be used to gather information about the AST or to query the AST Only the functions that are invoked by the AST processing mechanism need to be implemented by the user of AstProcessing classes no traversal code must be implemented 7 2 Common Interface of the Processing Classes All five Ast Processing classes provide three different functions for invoking a traversal on the AST T traverse SgNode node traverse full AST including nodes that represent code from include files T traverseInputFiles SgProject projectNode traverse the subtree of the AST that represents the file s specified on the command line to a translator files that are the input to the translator T traverseWithinFile SgNode node traverse only those nodes that represent code of the same file where the traversal started The travers
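The roles of the two attribute kinds can be sketched on a toy tree without using ROSE at all. In this sketch ToyNode and traverse are hypothetical names: the depth is passed down as an inherited attribute, the node count is returned upward as a synthesized attribute, and maxDepthSeen plays the role of an accumulator.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical toy tree; it stands in for an AST and is not a ROSE class.
struct ToyNode {
    std::vector<ToyNode> children;
};

// inheritedDepth flows top-down; the returned node count flows bottom-up.
int traverse(const ToyNode& node, int inheritedDepth, int& maxDepthSeen) {
    maxDepthSeen = std::max(maxDepthSeen, inheritedDepth);
    int synthesizedCount = 1;                      // count this node
    for (const ToyNode& child : node.children)     // combine children results
        synthesizedCount += traverse(child, inheritedDepth + 1, maxDepthSeen);
    return synthesizedCount;
}
```

In the AstProcessing classes the recursion itself is supplied by the library; only the two evaluation functions would be written by the user.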
217. special when it is preprocessed. I am not aware the CPP will remove whitespace, but it is not an issue since we get the information from EDG, where it is generated before CPP translation. Code correctness: say someone analyzes some code which isn't fully correct or complete (e.g., misses some includes, misses some declarations, or plainly has syntax errors). What do you do in such a case? Skip the erroneous code somehow, or alternatively simply abort? We cannot currently handle incomplete code. I would argue that any analysis of such code would have huge question marks. The essential reason for this limitation is that we use EDG for C and C++, and it can't handle incomplete code in version 3.4. However, the newer 3.11 version of EDG is expected to handle incomplete code, and then we will support this (we have no experience with this yet). Dialects: how would you handle different language dialects (e.g., C89, C99, the different flavors of C++, Visual C++, etc.)? Do you build a super-grammar that unifies all these somehow? Or do you have alternative grammars and type checkers? We support C89, C99, C++ (98 standard), Fortran 4, Fortran 66, Fortran 77, Fortran 90, Fortran 95, Fortran 2003, PHP, and Binary Analysis for x86 and ARM using ELF and PE, NE, LE, and DOS binary formats. We will start work on C++0x when we upgrade to the newest version of EDG. We support C++ compiled using Microsoft Visual Studio, but not all the MS ex
218. st of ROSE (a much larger regression test suite will be available separately and is used separately). These tests are divided into categories: 1. C_tests: These are tests of the differences between the C subset of C++ and C. Specifically, these are typically C codes that will not compile with a C++ compiler, even under the subset of C language rules used to invoke the subset of C++ (-rose:C or -rose:C_only), UNLESS the source files have the .c suffix (as opposed to any other suffix, e.g., .C). These are all specific to the C89 standard, which is what is typically assumed when referring to the C language (C99 is covered separately). 2. C99_tests: These are tests specific to C99 (new features not in C). 3. UPC_tests: These are tests that are specific to UPC modifiers recognized by EDG and handled in the Sage II AST. This support for UPC does not constitute a UPC compiler; a UPC-specific runtime system would be required for that. 4. Cxx_tests: These are the C++ test files (there are more tests here than elsewhere). 5. C_subset_of_Cxx_tests: This is the subset of C++ represented by the C language rules (it is not all of C). There are test codes here which contain #if __cplusplus to represent some differences between the syntax of C and C++ (typically enum and struct specifiers are required for C where they are not required for C++). 6. RoseExample_tests: These are examples of ROSE project source code and testin
219. stProcessing; AstTopDownProcessing; Example; 7.6 AstBottomUpProcessing; 7.6.1 Example: Access of Synthesized Attribute by Name; 7.7 AstTopDownBottomUpProcessing; 7.9 AST Node Attributes; 7.10 Conclusions; Introduction; 8.2.1 SAGE II Rewrite Interface; 8.2.3 Mid-Level Rewrite Interface; 8.2.5 Advantages and Disadvantages of Rewrite Interfaces; 8.3.1 Use of Strings to Specify Transformations; 8.7 Limitations (Known Bugs); 9 Program Analysis
220. such a good quality C front end and for allowing it to be used for research work in C They have permitted research work specific to the C language to address the complexity of real application written in C which would not otherwise be practical or within the scope of a research project To get a version of ROSE we encourage you to contact EDG to obtain their research license Instructions for getting an EDG license e Send email to these three fellows at EDG Steve Adamczyk jsa edg com John Spicer jhs edg com Daveed Vandevoorde daveed edg com I suggest sending the email to all of them at the same time so that they can see that you have sent email to the other two since I really don t know which one is the correct person to contact At some point we might get more information about a better approach The content of the email can be something like e We would like to work with the ROSE project at Lawrence Livermore National Laboratory LLNL which is using the EDG front end for research on C optimization They have asked that we obtain a research license in order to use ROSE for our research work with them They will then contact you by email and give you the location of the license form to fill out and get signed They will either let you know where to get the EDG software or suggest that you get our version of their code directly from us We will then give you all of ROSE which includes at present the source co
221. tLocationAfterStart CurrentLocationAfterEnd CurrentLocationBottomOfScopeStart CurrentLocationBottomOfScopeEnd (reference marker variables to avoid compiler warnings). This is more difficult because the declaration markers must appear after the local stack 1, but then the statement case 0 y break default y break cannot appear after a switch. It is probably impossible to fix this case due to the design and constraints of the C language and the limitations of the switch statement. This is not a serious problem; it just means that the whole switch statement must be operated upon instead of the block within the switch statement separately. Chapter 9 Program Analysis Program analysis is an important part of required support for sophisticated transformations. This work is currently incomplete and is the subject of significant current research work. Specific support for global analysis is provided via a database mechanism provided within ROSE and as part of work in merging multiple ASTs from different files to hold the AST from a whole project (many files) in memory at one time. 9.1 General Program Analysis General program analysis is a critical piece of the work to provide optimization capabilities to ROSE to support DOE applications. This work generally lags behind the compiler construction issues and robustness required to handle large-scale DOE applications. 9.1.1 Call Graph
222. table exists and create it if necessary The next two statements at line 20 and 21 create a rowdata object that stores all fields of a single row of the test table The constructor has initial arguments for all of the fields of a row In this case name and 1 0 are used to initialize the field s name and number which were specified in lines 6 and 7 The first argument UNKNOWNID is used to set the value of the row id to the default value 0 which means that the id is not yet properly initialized 0 is never used as an id for table rows the lowest possible valid id is 1 Note that the insert function initializes the id of the row as insert will create a new row in table that has a valid id In lines 24 and 25 a SQL query is performed which selects all rows where the number column is equal to 1 0 The string passed to the select member function call contains the conditional expression the WHERE clause of an SQL statement Hence the single equals sign is an SQL equality test and not for example an assignment The selected rows of the table are returned as a vector of rowdata objects As in line 21 a row matching the select condition was inserted into the table so at least one row should be returned if the example program was executed multiple times without deleting the test table entries from previous runs may be returned as well Assuming that the example program is run for the first time the SQL query should return the inserted row and th
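The id handling described above (an id of 0 meaning "not yet initialized", valid ids starting at 1, and insert filling in the id) can be mimicked with a small in-memory table. This is a sketch only: ToyRow and ToyTable are hypothetical, the table lives in memory rather than in SQLite, and the select condition is a C++ predicate instead of the WHERE-clause string used by the ROSE interface.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the row and table types described above.
const long UNKNOWNID = 0;   // 0 is never a valid row id

struct ToyRow {
    long id;
    std::string name;
    double number;
};

class ToyTable {
public:
    // insert assigns the next valid id (starting at 1) and stores the row.
    void insert(ToyRow& row) {
        row.id = nextId_++;
        rows_.push_back(row);
    }

    // select returns every stored row for which the condition holds.
    std::vector<ToyRow> select(const std::function<bool(const ToyRow&)>& where) const {
        std::vector<ToyRow> result;
        for (const ToyRow& r : rows_)
            if (where(r)) result.push_back(r);
        return result;
    }

private:
    long nextId_ = 1;
    std::vector<ToyRow> rows_;
};
```

Because this toy table is not persistent, it cannot reproduce the effect noted above where rows left over from earlier runs are returned as well.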
223. tal over the years and remains somewhat incomplete a Parent pointers Pointers to parent nodes available through the SgNode get_parent member function in the AST are set reset after construction of the AST As a result of being set within a traversal of the AST the parents perfectly match the traversal s concept of the AST as a tree This point is important since the AST included edges that make it a directed graph and it is the traversal of the AST that gives it its form representation as a tree Thus all parent pointers are valid non null values except the root of the AST which has no parent and has a null valued pointer returned from SgNode get_parent There are two possible nodes that can be considered a root of the AST either the SgProject or the SgFile both nodes have constructors that take a translator s command line arguments b Function declarations Function declarations and function prototypes are confused in the AST and where a function is defined i e with a function body it appears in the AST as a func tion declaration SgFunctionDeclarationStatement with a pointer to a function definition SgFunctionDefinitionStatement A function prototype can have a null valued pointer returned from its get_definition member function and is marked explicitly as a function prototype so that the null valued pointer can be error checked If the function definition is available in the file not always the case then the
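The way parent pointers follow from a traversal, as described in item (a) above, can be sketched with a toy tree. ToyNode and setParents are hypothetical names; in ROSE the corresponding accessor is SgNode get_parent, and the pointers are set by the library rather than by user code.

```cpp
#include <vector>

// Hypothetical toy node with a parent pointer; not a ROSE class.
struct ToyNode {
    ToyNode* parent = nullptr;
    std::vector<ToyNode*> children;
};

// Walk the tree from the root and record, for every node, the node from
// which the traversal reached it. The root keeps a null parent.
void setParents(ToyNode* node, ToyNode* parent = nullptr) {
    node->parent = parent;
    for (ToyNode* child : node->children)
        setParents(child, node);
}
```

Since each parent is assigned during the walk, the parent pointers necessarily agree with the tree that the traversal sees, even if other edges make the underlying structure a graph.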
224. ted to the examples. An overview of the ROSE project is included. Error messages are contained in the Appendix (there are few at the moment). Further information is provided about the ROSE Web site, where more information is available and where the latest copy of the documentation is located. This Web site will also be the distribution site for ROSE once it is made public; until then we welcome researchers to contact us directly to obtain pre-release versions of ROSE. (Footnote 1: All ROSE documentation is still in development.) Figure 1.1: Different phases of internal processing within translators built using ROSE infrastructure. This manual is divided into several principal chapters. Each chapter covers material that in some cases requires an understanding of previous chapters. These are intended to simplify your use of this manual. Each chapter is described briefly below. Preface: This section briefly describes what this project is about. Acknowledgments: This section acknowledges contribution by many people over several years to the development of the ROSE project. Introduction: This chapter introduces why we have developed ROSE and some
225. tember 1992, Lyon, France; to be published in Lecture Notes in Computer Science, Springer Verlag, September 1992. Lemke, M., Quinlan, D., Witsch, K.: An Object-Oriented Approach for Parallel Self Adaptive Mesh Refinement on Block Structured Grids; Proceedings of the 9th GAMM Seminar, Kiel; Notes on Numerical Fluid Mechanics, Vieweg, Germany, 1993. McCormick, S., Quinlan, D.: Asynchronous Multilevel Adaptive Methods for Solving Partial Differential Equations on Multiprocessors: Performance results; Parallel Computing, 12, 1989, pg. 145-156. McCormick, S., Quinlan, D.: Multilevel Load Balancing; Internal Report, Computational Mathematics Group, University of Colorado, Denver, 1987. Peery, J., Budge, K., Robinson, A., Whitney, D.: Using C++ as a Scientific Programming Language; Report, Sandia National Laboratories, Albuquerque, NM, 1991. Schoenberg, R.: M++, an Array Language Extension to C++; Dyad Software Corp., Renton, WA, 1991. Stroustrup, B.: The C++ Programming Language, 2nd Edition; Addison Wesley, 1991. V. Adve, G. Jin, J. Mellor-Crummey, and Q. Yi: High performance Fortran compilation techniques for parallelizing scientific codes. In Proceedings of SC98: High Performance Computing and Networking, Nov 1998. V. Adve and J. Mellor-Crummey: Using integer sets for data parallel program analysis and optimization. In Proceedin
226. tensions. We support a number of GNU-specific C and C++ extensions, but not all. Since we use EDG for the front end, we don't have any super-grammar representation; even EDG does not have such a construction in the design of their front end. Such concepts don't work well for real languages when you need to handle all the corners, which is itself a sad commentary on parser generators and/or modern languages. For C and C++, the type checking is mostly done by EDG, and we save this information and add to it in the ROSE IR. Chapter 18 Glossary We define terms used in the ROSE manual which might otherwise be unclear. AST: Abstract Syntax Tree. A very basic understanding of an AST is the entry level into ROSE. Attribute: User-defined information (objects) associated with IR nodes. Forms of attributes include accumulator, inherited, persistent, and synthesized. Both inherited and synthesized attributes are managed automatically on the stack within a traversal. Accumulator attributes are typically something semantically equivalent to a global variable (often a static data member of a class). Persistent attributes are explicitly added to the AST and are managed directly by the user. As a result, they can persist across multiple traversals of the AST. Persistent attributes are also saved in the binary file I/O, but only if the user provides the attribute-specific pack and unpack virtual member functions. See the ROSE User Manual for more information and th
227. ter of the interface which is traversed: template <typename NodeFunctional> querySubTree(SgNode*, NodeFunctional, AstQueryNamespace::QueryDepth = AstQueryNamespace::AllNodes); querySubTree(SgNode*, TypeOfQueryTypeOneParameter, AstQueryNamespace::QueryDepth = AstQueryNamespace::AllNodes); querySubTree(SgNode*, roseFunctionPointerOneParameter, AstQueryNamespace::QueryDepth = AstQueryNamespace::AllNodes); querySubTree(SgNode*, SgNode*, roseFunctionPointerTwoParameters, AstQueryNamespace::QueryDepth = AstQueryNamespace::AllNodes); querySubTree(SgNode*, SgNode*, TypeOfQueryTypeTwoParameters, AstQueryNamespace::QueryDepth = AstQueryNamespace::AllNodes); querySubTree(SgNode*, VariantT, AstQueryNamespace::QueryDepth = AstQueryNamespace::AllNodes); querySubTree(SgNode*, VariantVector, AstQueryNamespace::QueryDepth = AstQueryNamespace::AllNodes). Functions that visit every node in a std::list<SgNode*> and return a std::list<SgNode*>: queryNodeList(NodeQuerySynthesizedAttributeType, TypeOfQueryTypeOneParameter elementReturnType); queryNodeList(std::list<SgNode*>, roseFunctionPointerOneParameter); queryNodeList(std::list<SgNode*>, SgNode*, roseFunctionPointerTwoParameters); queryNodeList(std::list<SgNode*>, SgNode*, TypeOfQueryTypeTwoParameters); queryNodeList(std::list<SgNode*>, VariantT); queryNodeList(std::list<SgNode*>, VariantVector)
228. tes Transformations on member function templates are not affected. They are instantiated in the prelink stage and seen in the SAGE III AST at that time. They can be transformed in the prelink stage, during or immediately after the instantiation. This is the earliest possible stage where transformation on instantiated templates can be done. A transformation on a templated function is handled as a transformation on each of its instantiations. Generation of code for transformed templated class: If a class template is modified, then we have to unparse all of the templated member functions. This is because an instantiated template cannot force its member functions to be instantiated unless we do it explicitly (as I understand the template prelinking mechanism). Unparsing the instantiated template with a mangled name, or as a specialization, causes it to be considered as a new class and forces the construction of all member functions. This is a slightly different concept than instantiation (closer to specialization, I think), since specialization is not automated and must be handled explicitly as a result. Details are discussed earlier in this section. The declaration of a template class as a specialization requires declaration and definition of member functions of the class, because the template mechanism would permit them to be different (even though this seems redundant), and even if we can automate the construction of
229. the effect of applying loop fusion to a sequence of loop nests in the subroutine tridupk of the application benchmark Erlebacher from ICASE. The original code in (a) contains four separate loop nests, all of which can be fused into a single one. The package performs multiple levels of loop fusion simultaneously using a combined interchange and fusion transformation (see Section 10.3.2), and the optimized code is shown in (b). 10.2 Interface for End Users and Compiler Developers This package is written in the C++ language in an object-oriented style. It utilizes traditional techniques developed to optimize loop nests in Fortran programs. When optimizing C or C++ applications, this package only recognizes and optimizes a particular for loop that corresponds to the DO loop construct in Fortran programs. Within the ROSE source-to-source compiler infrastructure, such a loop is defined to have the following formats: for (i = lb; i <= ub; i += positiveStep) or for (i = ub; i >= lb; i += negativeStep) (10.1) Here i is an arbitrary integer variable, lb and ub are arbitrary integer expressions, and positiveStep and negativeStep are positive and negative integer expressions, respectively. To expand this definition, the user can rewrite the LoopTransformInterface class within the package distribution (see Section ) or use a preprocessor within ROSE to translate all the non-Fortran loops into the aforementioned formats. Such a loop normalization preproc
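The two recognized loop shapes can be written out as compilable C++. The sketch assumes inclusive bounds, as in a Fortran DO loop, and the function names are hypothetical. sumUpWhile covers the same iteration space as sumUp but is written as a while loop, which is the kind of loop a normalization step would have to rewrite into the for format.

```cpp
// Positive step form: i runs from lb up to ub (inclusive).
long sumUp(long lb, long ub, long positiveStep) {
    long total = 0;
    for (long i = lb; i <= ub; i += positiveStep)
        total += i;
    return total;
}

// Negative step form: i runs from ub down to lb (inclusive).
long sumDown(long lb, long ub, long negativeStep) {
    long total = 0;
    for (long i = ub; i >= lb; i += negativeStep)
        total += i;
    return total;
}

// Same iterations as sumUp, but not in either of the two for formats.
long sumUpWhile(long lb, long ub, long positiveStep) {
    long total = 0;
    long i = lb;
    while (i <= ub) {
        total += i;
        i += positiveStep;
    }
    return total;
}
```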
230. the internal compiler robustness. 7. Overture: This is an internal DOE library that supports Overset Grid applications. It is well in excess of one million lines of code. It includes the A++/P++ library and other libraries upon which it depends. 8. CHROMA: This is a Molecular Dynamics application developed at University of Illinois at Urbana-Champaign (UIUC). This is not really a one-million-line code, I think, but Overture more than makes up the difference. The first six are mostly done in the sense that there are about 10 bugs that have been isolated which appear to be the only remaining problems. I am working on these bugs, but some are non-trivial (read: hard). Plum Hall C and C++ Compiler Test Suite: This is a commercial C and C++ compiler test suite that was purchased for us by the DOE Advanced Simulation and Computing (ASC) program. We appreciate their substantial support of ROSE. They also fund part of the ROSE project, but these test codes are REALLY hard. Nightly cron jobs: Nightly regression tests are run on ROSE; these are easy to set up using the command crontab -e, which will bring up an editor. Then put in the following lines: Time spec: 1st column: minute, 2nd column: hour, 3rd column: day of month, 4th column: month, 5th column: day of week; then followed by the command to be run at the time specified by the time spec: 55 12 cd home dquinlan ROSE svn rose scripts && roseFreshTest roseFreshTestStub xyz sh Then build a
231. the member function by automating the declaration of specializations for all member functions and static member data This mechanism needs to be controlled so that we can control the amount of code generated Options are Always generate used class templates AND the member functions upon which they depend This might seem to require that we generate all possible code though in general it is only slightly less than all the member functions minus the template class definition So maybe this is a suitable option but not in the current plan Generate only class template instantiations that have been transformed Then generate all the member functions upon which they depend This is the current design within ROSE Generate only the function template instantiations that have been transformed currently all function template instantiations are generated since we can t know in advance which ones the user might wish to transform Note that if a template is never used in a given translation unit then we will not instantiate it and we can t even allow the user to see it for a possible transformation This is not much different than existing vendor compilers that would not instantiate the template unless it was required by at least one translation unit It can be argued that the ability to transform templated functions and classes that are never used by an application is inherently meaningless As is the case for any vendor compiler if the
232. the same ROSETTA source file, specify the node's SAGE class hierarchy. This is done through the specification of what looks a bit like a BNF production rule to define the abstract grammar: NEW_NONTERMINAL_MACRO (Expression, UnaryOp | BinaryOp | ExprListExp | VarRefExp | ClassNameRefExp | FunctionRefExp | MemberFunctionRefExp | ValueExp | FunctionCallExp | SizeOfOp | TypeIdOp | ConditionalExp | NewExp | DeleteExp | ThisExp | RefExp | Initializer | VarArgStartOp | VarArgOp | VarArgEndOp | VarArgCopyOp | VarArgStartOneOperandOp, "Expression", "ExpressionTag"); In this case, we added the VarArgOp IR node as an expression node in the abstract grammar for C++. 4. Add the new node's members (fields): both data and function members are allowed. ROSETTA permits the addition of data fields to the class definitions for the new IR node. Many generic access functions will be automatically generated if desired. VarArgOp.setDataPrototype ("${GRAMMAR_PREFIX_}Expression*", "operand_expr", "= NULL", CONSTRUCTOR_PARAMETER, BUILD_ACCESS_FUNCTIONS, DEF_TRAVERSAL, NO_DELETE); The new data fields are added to the new IR node. Using the first example above, the new data member is of type SgExpression* with name operand_expr and initialized using the source code string "= NULL". Additional properties that this IR node will have include: Its construction will take a parameter of this type and use it to init
233. the two statements To compute the EDM representation of dependences this package uses an adapted Gaussian elimination algorithm to solve a set of integer linear equations of loop induction variables For each array access in the original input program the algorithm first constructs a set of linear equations based on the index expressions of the array access If no loop induction variable has a symbolic coefficient in the array access expressions such as the ones in the Matrix Multiplication code in Figure 10 1 and the non pivoting LU in Figure the algorithm is at least as powerful as the combined ZIV SIV and Delta dependence tests described by Allen and Kennedy 30 83 However when loop induction variables do have symbolic coefficients the algorithm assumes a conservative solution and is less precise than the symbolic solution algorithms described in 33 This package also extends the traditional dependence model by implementing the transitive dependence analysis algorithm published by Yi Adve and Kennedy 35 Note that although the algorithm is quite efficient in summarizing the complete transitive dependence information between statements this package applies transitive dependence analysis only when transforming complex loop structures that cannot be translated into sequences of perfectly nested loops Because the safety of transforming perfect loop nests can be determined based on individual dependence edges alone it is often more economic
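For orientation, the ZIV and strong SIV tests mentioned above can be sketched for a single subscript pair. This is the textbook form of those tests, not the adapted Gaussian elimination algorithm used by the package, and the function names and parameters are hypothetical. For subscripts a*i + c1 and a*i' + c2 with lower <= i, i' <= upper, equating them gives i' - i = (c1 - c2) / a, so a dependence is possible only if that distance is an integer that fits inside the loop bounds.

```cpp
#include <cstdlib>

// ZIV test: two loop-invariant subscripts conflict only if they are equal.
bool zivDependence(long c1, long c2) {
    return c1 == c2;
}

// Strong SIV test for a*i + c1 versus a*i' + c2.
// Precondition: a != 0 (a == 0 is the ZIV case above).
// Returns true if a dependence is possible and stores its distance.
bool strongSivDependence(long a, long c1, long c2,
                         long lower, long upper, long& distance) {
    long diff = c1 - c2;
    if (diff % a != 0)
        return false;                       // distance would not be an integer
    distance = diff / a;
    return std::labs(distance) <= upper - lower;   // must fit in the loop range
}
```

For example, A[i+1] and A[i] inside a loop over 0..9 yield a distance of 1, while A[2*i+1] and A[2*i] can never refer to the same element.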
234. themselves also vary widely and make the optimization of software difficult. 1.5 Motivation for ROSE The original motivation for the development of ROSE comes from work within the Overture Project to develop abstractions for numerical computation that are efficient and easy to use. Basically, C++ language mechanisms made the abstractions easy to use, if not tedious to build, but efficiency was more problematic, since the optimization of low-level abstractions can be, and frequently is, not handled well by the compiler. Specifically, the rich semantic information the library writer embeds into his abstractions can't be communicated to the compiler, so many optimizations are missed. ROSE has addressed this fundamental problem by simplifying how an optimized translator could be built and tailored to a library's abstractions to introduce optimizations that use the high-level semantics of user-defined abstractions. 1.6 ROSE as a Compiler Framework ROSE contains compiler infrastructure. This is because a translator that reads source code in any language is essentially a compiler or translator. The most precise understanding of a source code in any language is the process of compiling it. Source-to-source compilation can, however, skip the common back-end code generation, since source code is generated instead of object code in the form of an executable. ROSE translators pay particular attention to reconstru
235. thin ROSE is available if one has additional freely available software installed. libxml2-devel: Several optional features of ROSE need to handle XML files, such as roseHPCT and BinaryContextLookup. Doxygen: Most ROSE documentation is generated using LaTeX and Doxygen; thus Doxygen is required for ROSE developers that want to regenerate the ROSE documentation. This is not required for ROSE users, since all documentation is included in the ROSE distribution. Visit www.doxygen.org for details and to download software. There are no ROSE-specific configure options to use Doxygen; it must only be available within your path. LaTeX: LaTeX is used for a significant portion of the ROSE documentation. LaTeX is included on most Unix systems. There are no ROSE-specific configure options to use LaTeX; it must only be available within your path. DOT (GraphViz): ROSE uses DOT for generating graphs of ASTs, Control Flow, etc. DOT is also used internally by Doxygen. Visit www.graphviz.org for details and to download software. An example showing the use of DOT to build graphs is in the ROSE Tutorial. There are no ROSE-specific configure options to use dot; it must only be available within your path. SQLite: ROSE users can store persistent data across separate compilation of files by storing information in an SQLite database. This is used by several features in ROSE (call graph generation, for example) and may be used directly by the user for storag
236. Figure 7.8: Numbers at nodes show the order in which the visit function is called in a preorder traversal. [Figure: AST graphs rooted at SgGlobal, with nodes SgFunctionDeclaration, SgFunctionParameterList, SgFunctionDefinition, SgBasicBlock, SgVariableDeclaration, SgWhileStmt, and SgReturnStmt; each node is labeled with its visit number and child count (CSize).]
238. tion or static data member become visible to the user in the AST and permit maximum information assumed to be required for any transformation. No transformation occurs on the template declaration unless it's done explicitly by the user; this is difficult since the text string representing the template is not formed into an AST that we can traverse. Note that this is a result of a design decision on the part of EDG to provide this as a default behavior and our decision to use it. More recent work to get the template as an AST is underway, using some of the options in EDG to support this. This later work is not robust enough to be the default in ROSE without a bit more work. 5.6.1 C++ Constructs That Can Be Made Into Templates The concept of templates does not apply to all C++ constructs and affects only a few. The only things that can be templates are classes (including structs and likely unions), functions (including member functions), and variables (static data members). The first two are common, but the case of templated variables perhaps requires an example: template <typename T> class A { public: // non-template data member int nonTemplateDataMember; // template data member T templateDataMember; // template static data members static T staticTemplateDataMember_T; static float staticTemplateDataMember_float; }; // This is a template static data member (SgVariableDeclaration) template <class
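A self-contained version of the class above, with out-of-class definitions for its two static members, might look as follows. The class and member names are taken from the example; the initial values T() and 0.0f are assumptions made so that the sketch compiles on its own.

```cpp
template <typename T>
class A {
public:
    int nonTemplateDataMember;                    // non-template data member
    T templateDataMember;                         // template data member
    static T staticTemplateDataMember_T;          // static member of type T
    static float staticTemplateDataMember_float;  // static member of fixed type
};

// Definitions of the template static data members. Each instantiation of A
// (A<int>, A<double>, ...) receives its own copy of both variables.
template <typename T>
T A<T>::staticTemplateDataMember_T = T();

template <typename T>
float A<T>::staticTemplateDataMember_float = 0.0f;
```

Assigning to A<int>::staticTemplateDataMember_T leaves A<double>::staticTemplateDataMember_T unchanged, which is why such a variable behaves as a template in its own right.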
239. try to get to fixing up the ROSE Tutorial example to use this interface. Rich and I have been spending a lot of time on the Tutorial lately, after finishing the ROSE User Manual two weeks ago. We are getting all the documentation ready for release on the web. This will likely happen in a few weeks, though all the paperwork and approvals are already in place. So, as it is, this is a wonderful example of just what a bad idea it is to manipulate the AST at such a low level. It is the reason we have the AST Rewrite Mechanism provide the highest level of interface required to make manipulation generally simpler. We have read that the ROSE compiler is provided under the BSD license. Is every part of the ROSE compiler under the BSD license, and is it free for commercial use? ROSE is free for commercial use; our research license with EDG has no restrictions except that we can only release the binary and not the source code. Obviously the EDG part is not released under BSD, only the source code part. If you want to build products using ROSE for C/C++, then you should consider contacting EDG for a license to their work; then you could build commercial products and sell them, but you don't have to worry about ROSE. I have no idea what ground you're on if you build commercial products for sale based on ROSE and just use the EDG binary that we provide; I expect it would be a complicated install for your customers. In general, if you are using EDG and building commercial
240. ts the qualified name. The algorithm used for name mangling is best described in the actual code, where the documentation should be clear. The code for this is in the SgType IR node and its derived IR nodes. The codes used for the operators are present in the function SgType::mangledNameSupport(SgName, SgUnparse_Info).
5.14 Passing Options to EDG and ROSE
By default, all command-line options except EDG-specific or ROSE-specific options are passed to the back-end compiler. As a result, the command line for the compiler can be used with any translator built using ROSE. This is particularly effective in allowing large, complex Makefiles to be used by changing only the name of the compiler (CC or CXX). Command-line options are considered EDG-specific when prefixed with -edg:xxx, --edg:xxx, -edg_parameter:xxx n, or --edg_parameter:xxx n, which then translate to -xxx, --xxx, -xxx n, or --xxx n, respectively, for only the command line passed to the EDG front end (not passed to the back-end compiler). These are required to support the different types of command-line arguments used in EDG. For a complete list of the EDG options, see the EDG documentation (available only from EDG and covered under their license to use EDG). Similarly, ROSE-specific command-line options are prefixed using -rose:xxx and are interpreted only by ROSE (not passed on to EDG or the back-end compiler). To see a complete list, use any translator built using ROSE with the option --help. A
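The routing rule described above can be sketched in a few lines. This is only an illustration, not ROSE's own command-line handling: the `RoutedOptions` struct and the `routeOptions` function are hypothetical names, and the sketch assumes the prefixes are spelled `-edg:`, `--edg:`, `-edg_parameter:`, `--edg_parameter:`, and `-rose:`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical grouping of options by destination (not a ROSE type).
struct RoutedOptions {
    std::vector<std::string> edg;      // sent to the EDG front end, prefix stripped
    std::vector<std::string> rose;     // interpreted by ROSE only
    std::vector<std::string> backend;  // everything else: the back-end compiler
};

static bool startsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Assumed prefix spellings; see the lead-in.
RoutedOptions routeOptions(const std::vector<std::string>& args) {
    const std::string edgPrefix = "edg:";
    const std::string edgParamPrefix = "edg_parameter:";
    RoutedOptions out;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& arg = args[i];
        if (startsWith(arg, "-rose:")) {
            out.rose.push_back(arg);
            continue;
        }
        // Keep the original number of leading dashes when translating.
        const std::string dashes = startsWith(arg, "--") ? "--" : "-";
        const std::string body =
            startsWith(arg, "-") ? arg.substr(dashes.size()) : std::string();
        if (startsWith(body, edgPrefix)) {
            out.edg.push_back(dashes + body.substr(edgPrefix.size()));
        } else if (startsWith(body, edgParamPrefix)) {
            out.edg.push_back(dashes + body.substr(edgParamPrefix.size()));
            // The parameter form consumes the following argument as its value.
            if (i + 1 < args.size()) out.edg.push_back(args[++i]);
        } else {
            out.backend.push_back(arg);
        }
    }
    return out;
}
```

Calling `routeOptions` with a made-up command line such as `-O2 -edg:w -rose:verbose file.C` would place `-w` in the EDG group, `-rose:verbose` in the ROSE group, and the rest in the back-end group; the option names here are placeholders, not documented EDG or ROSE options.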
241. ts to build a new application. In the context of ROSE, all transformations are source-to-source.
Translator: An executable program, in our context built using ROSE, that performs source-to-source translation on an existing input application source to generate a second (generated) source code file. The second (generated) source code is then typically provided as input to a vendor-provided compiler, which generates object code or an executable program.
Traversal: The process of operating on the AST in some order (usually pre-order, post-order, or out of order [randomly], depending on the traversal that is used). The ROSE user builds a traversal from base classes that do the traversal and execute a function, or a number of functions, provided by the user.
Bibliography
[1] Bassetti, F., Davis, K., Quinlan, D., "A Comparison of Performance-enhancing Strategies for Parallel Numerical Object-Oriented Frameworks," to be published in Proceedings of the first International Scientific Computing in Object-Oriented Parallel Environments (ISCOPE) Conference, Marina del Rey, California, Dec. 1997.
[2] Quinlan, D., Berndt, M., "MLB: Multilevel Load Balancing for Structured Grid Applications," published in Proceedings of the SIAM Parallel Conference, Minneapolis, MN, March 1997.
Brown, D., Chesshire, G., Henshaw, W., and Quinlan, D., "OVERTURE: An Object-Oriented Software System for Solving Partial Differential Equations in Serial and Parallel Environments"
242. uble number define TABLES DEFINED 1 JE include DatabaseGraph h int main int argc char xargv GlobalDatabaseConnection db int initOk db initialize assert initOk 0 CREATE TABLE db projects db files CREATE TABLE db graphdata CREATE TABLE db graphnode CREATE TABLE db graphedge TableAccess<testtableRowdata> testtable &db testtable initialize initialize project string projectName testProject this should be given at the command line string fileName testFile C this should be retrieved from a SgFile node projectsRowdata prow UNKNOWNID projectName UNKNOWNID projects retrieveCreateByColumn &prow name projectName long projectId prow get_id get id of this file filesRowdata frow UNKNOWNID projectId fileName files retrieveCreateByColumn &frow fileName fileName frow get_projectId long fileId frow get_id init graph DatabaseGraph<testtableRowdata, EdgeTypeEmpty> graph new DatabaseGraph<testtableRowdata, EdgeTypeEmpty> projectId GTYPE TESTGRAPH graph->loadFromDatabase the graph is now ready for use add some example nodes and edges
Simple Callgraph Example include <iostream> include GlobalDatabaseConnection h include TableDefiniti
243. ues of the target compiler's predefined macros to build a set of header files that can be used by ROSE (by the EDG front end) from those used by the target back end. The details are handled automatically and need not be a concern for users of ROSE. We use the preinclude mechanism in EDG to force a specific generated header file to be read ahead of any ROSE system header files (translated from the back-end system header files by the ROSE configure mechanism). This header file contains all the back-end-specific macro definitions. The file name is rose_edg_required_macros_and_functions.h, and it is placed in the install tree: <prefix>/include/<back-end compiler name>_HEADERS.
5.12 Comments About Declarations (Defining Declarations vs. Non-defining Declarations)
Declarations come in two kinds: those that can have a separate definition (e.g. class and function declarations) and those that cannot (e.g. enum and pragma declarations). For example, enums have to have their definition in their declaration; there is no concept of forward declarations of enums in C or C++. A class declaration in C++ can have a forward declaration (even repeated forward declarations) before the declaration that contains the class definition (the "{}" part). Thus the following code is valid C++:
    class X; // forward declaration (declaration with NULL pointer to definition)
    class X {}; // defining declaration (declaration with pointer to definition)
Note that multiple
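A slightly larger sketch of the same distinction follows. The function and member names are hypothetical and chosen only for illustration; the sketch shows that repeated non-defining declarations are accepted, that an incomplete class type is enough to declare a function taking a pointer to it, and that an ordinary enum carries its definition in its declaration, as the text describes.

```cpp
#include <cassert>

class X;   // forward (non-defining) declaration
class X;   // repeated forward declarations are allowed

// An incomplete type is sufficient to declare a function
// that takes a pointer to it.
int readValue(const X* x);

class X {  // defining declaration: this one carries the "{}" part
public:
    int value = 7;
};

// X is complete here, so its members can be accessed.
int readValue(const X* x) { return x->value; }

// An ordinary enum is defined at the point where it is declared.
enum Color { Red, Green, Blue };

// Hypothetical helper that exercises both declarations.
int demoDeclarations() {
    X x;
    return readValue(&x) + static_cast<int>(Blue);
}
```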
244. unction takes a pointer to a rowdata object, the name of the column to use, and the unique name of the row as arguments (see line 25). So in this case, the name column and the string testProject are used. As with the normal insert function from the first example, the retrieveCreateByColumn function sets the id field of the rowdata to the correct value. A new variable storing this project id is created in line 27. For the file id, the procedure is almost the same, with the exception that the project id is also passed to the function call in line 32. For most ids other than the project id, the project id is used to retrieve the row for the desired project. If a project id is passed to the retrieveCreateByColumn function, it assumes the table has a projectId column, which has to match the given value. Instead of working with these ids, the example program just prints these values to stdout and quits. The ids will remain the same over multiple runs of the program. Try changing the file or project ids to force new entries to be created.
9.2.3 Working With Database Graphs
The following tutorial program will use the ROSE tables to build a graph for a user-defined table. Each execution of the program will enlarge the test graph by adding three nodes and edges to them from a random node in the graph.
9.2.4 A Simple Callgraph Traversal
The last database example tutorial will show how to use the database graph features explained in th
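The retrieve-or-create behavior described earlier in this excerpt for retrieveCreateByColumn can be pictured with an in-memory stand-in. Everything in this sketch is hypothetical (the `ProjectsTable` and `FilesTable` classes, the row structs, and the id numbering); it does not use the ROSE database classes and only mirrors the stated semantics: look a row up by a column value, create it if it is missing, and write the id back into the row object.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical row types; -1 plays the role of an unknown id.
struct ProjectRow { long id = -1; std::string name; };
struct FileRow { long id = -1; long projectId = -1; std::string fileName; };

// In-memory stand-in for a projects table keyed by the name column.
class ProjectsTable {
    std::map<std::string, long> idByName_;
    long nextId_ = 1;
public:
    // Finds the row by name or creates it, then stores its id in the row.
    void retrieveOrCreateByName(ProjectRow* row) {
        auto it = idByName_.find(row->name);
        if (it == idByName_.end())
            it = idByName_.emplace(row->name, nextId_++).first;
        row->id = it->second;
    }
};

// Stand-in for a files table: the project id must match as well.
class FilesTable {
    std::map<std::pair<long, std::string>, long> idByKey_;
    long nextId_ = 1;
public:
    void retrieveOrCreateByName(FileRow* row) {
        const auto key = std::make_pair(row->projectId, row->fileName);
        auto it = idByKey_.find(key);
        if (it == idByKey_.end())
            it = idByKey_.emplace(key, nextId_++).first;
        row->id = it->second;
    }
};
```

Looking up the same name twice yields the same id, which corresponds to the observation that the ids stay the same over multiple runs; a new name, or the same file name under a different project id, produces a new entry.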
245. urce_code=true. This particularly restrictive syntax is used to prevent it from ever being used by accident. Note that the following will not work; they are equivalent to not having specified the option at all: configure --with-edg_source_code; configure --with-edg_source_code=false; configure --with-edg_source_code=TRUE; configure --with-edg_source_code=xyz; configure; configure --with-edg_source_code=True. To see how any configuration is set up, type make testEdgSourceRule in the ROSE/src/frontend/CxxFrontend/EDG_3.3/src directory. To build a distribution without EDG source code: 1. Configure to use the EDG source code and build normally. 2. Then rerun configure to not use the EDG source code, and 3. Run make dist.
16.7 How to Deprecate ROSE Features
There comes a time when even the best ideas don't last into a new version of the source code. This section covers how to deprecate specific functionality so that it can be removed in later releases (typically after a couple of releases, or before our first external release). When using GNU compilers, these mechanisms will trigger the use of the GNU attribute mechanism, so that use of such functions in applications can be easily flagged (as warnings output when using the GNU option -Wall). Both functions and data members can be deprecated, but the process is different for each case:
• Deprecated functions and member functions: Use the macro
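As a rough illustration of the GNU attribute mechanism mentioned above, the sketch below wraps `__attribute__((deprecated))` in a macro. The macro name `EXAMPLE_DEPRECATED` and the function and member names are hypothetical; the excerpt does not show the macro ROSE actually uses, so this should not be read as ROSE's definition.

```cpp
#include <cassert>

// Hypothetical macro for illustration only. On GNU-compatible compilers it
// expands to the deprecated attribute; elsewhere it expands to nothing.
#if defined(__GNUC__)
#  define EXAMPLE_DEPRECATED __attribute__((deprecated))
#else
#  define EXAMPLE_DEPRECATED
#endif

// Deprecated function: the attribute goes on the declaration.
int oldInterface(int x) EXAMPLE_DEPRECATED;
int oldInterface(int x) { return x + 1; }

// Replacement that callers are expected to migrate to.
int newInterface(int x) { return x + 1; }

// Deprecated data member: the attribute follows the member's declarator.
struct Settings {
    int current;
    int legacy EXAMPLE_DEPRECATED;
};
```

With a GNU compiler, code that calls `oldInterface` or reads `Settings::legacy` still compiles, but each use is reported as a deprecation warning, which is what lets remaining uses be found before the feature is removed.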
246. us ROSE is presently specific to the generation of C and C++ source-to-source based compilers (translators, more precisely). Other language front ends may be appropriate to add to ROSE in the future; current work with Rice University is focused on the addition of Open64's front end to ROSE as part of support for FORTRAN 90. ROSE makes it easy to build complex source-to-source translator (preprocessor) tools, and thus supports research work in many areas: Performance Optimization, General Program Transformations, Instrumentation, Program Analysis, Interface Generation, Automated Check-pointing, Software Security Analysis, Software Verification, Automated Unit Test Generation, and much more.
Acknowledgments
The Intermediate Representation (IR) used in ROSE is called SAGE III. SAGE III is something that we have built based on SAGE II, which was never completed or widely distributed. SAGE II was based on SAGE; the improvements over SAGE were significant. SAGE II was the first version of SAGE to use the Edison Design Group (EDG) front end. We want to thank the original developers of SAGE and SAGE II for their work, from which we learned a lot through use of their object-oriented IR interface. We chose the name SAGE III to give sufficient credit to the original developers of SAGE and SAGE II, who also suggested that we call what we are doing SAGE III. ROSE, of course, builds on SAGE III and adds numerous additional mechanisms, including
247. uses the merged slices to transform the original code. Given two disjunct computation slices (two slices that contain disjunct sets of statements), because each computation slice fuses a set of loops that can be shifted to the same loop level, fusing these two slices automatically achieves the fusion of the loops in both slices. For example, in Figure, after transformation analysis, the package constructs a computation slice for each of the loops in the original code in (a). It then performs fusion analysis and realizes that all the j slices (and thus all the j loops) can be legally fused into a single loop. After merging these slices, it uses a single j slice to perform a dependence hoisting transformation and thus automatically achieves the fusion of all the j loops in (a). Similarly, all the i loops are also fused into a single loop, and two of the k loops are fused. Because the original loop structure may need to be distributed to achieve better performance, before applying loop fusion analysis this package first performs maximum loop fission to distribute all the loop nests in the original code. The distributed loop nests are then recombined during the loop fusion phase. This strategy ensures that both loop fission and fusion optimizations are applied, and that the final result of the optimization does not depend on the original loop structure of the application.
Combined Loop Interchange and Fusion
This package optimizes applications to improve the memor
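The effect of loop fusion described above can be seen on a hand-written pair of loops. This sketch does not use the ROSE loop optimization package or its computation slices; the function names and the arithmetic are made up for illustration. It shows the distributed form (two loops over the same index range) next to the fused form, which is legal here because iteration j of the second statement depends only on the value produced in iteration j of the first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Distributed (fissioned) form: two separate j loops.
std::vector<double> separateLoops(const std::vector<double>& a) {
    std::vector<double> b(a.size()), c(a.size());
    for (std::size_t j = 0; j < a.size(); ++j) b[j] = 2.0 * a[j];
    for (std::size_t j = 0; j < a.size(); ++j) c[j] = b[j] + 1.0;
    return c;
}

// Fused form: one j loop containing both statements. b[j] is written and
// then immediately read in the same iteration, which improves data reuse.
std::vector<double> fusedLoop(const std::vector<double>& a) {
    std::vector<double> b(a.size()), c(a.size());
    for (std::size_t j = 0; j < a.size(); ++j) {
        b[j] = 2.0 * a[j];
        c[j] = b[j] + 1.0;
    }
    return c;
}
```

Both functions compute the same result for any input; the difference is only in how often the arrays are walked, which is the memory behavior the fusion analysis is trying to improve.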
248. uteType> initialInheritedAttributes vector<SynthesizedAttributeType> traverseWithinFile SgNode node vector<InheritedAttributeType> initialInheritedAttributes void traverseInputFiles SgProject projectNode vector<InheritedAttributeType> initialInheritedAttributes typedef SynthesizedAttributesList void addTraversal AstTopDownBottomUpProcessing<InheritedAttributeType, SynthesizedAttributeType> traversal vector<AstTopDownBottomUpProcessing<InheritedAttributeType, SynthesizedAttributeType> >& get_traversalPtrListRef
Note that these classes do not contain virtual functions for the user to override. They are meant to be used through explicit instances, not as base classes. Instead of calling one of the traverse methods on the individual processing classes, they are combined within an instance of the AstCombined Processing class and started collectively using one of its traverse methods. Inherited and synthesized attributes are passed in and back through STL vectors. Two methods for managing the list of traversals are provided: the addTraversal method simply adds the given traversal to its list, while get_traversalPtrListRef returns a reference to its internal list that allows any other operations, such as insertion using iterators, deletion of elements, etc.
7.9 AST Node Attributes
To each node in the AST user defined attributes can have an attribute attach
249. ve the following properties:
• Operations are performed on collections only.
• Operations are immediately executed.
• Operations are local on the specified node of the AST.
• Operations do not take attached comments or preprocessor directives into account. This can lead to unexpected results (e.g., removing or moving #include directives by accident).
8.2.2 Low Level Rewrite Interface
This interface is similar to the SAGE III Rewrite Interface, except that operations are performed on any statement and not on the containers that store the statement lists. The domain of the operations (on the statements instead of on the parent nodes of the statements) is the most significant difference between the two interfaces. An additional feature includes support for repositioning attached comments/directives from removed nodes to their surrounding nodes, to preserve them within replace and remove operations. Additional support is provided for marking inserted statements as transformations within the Sg_File_Info objects. Operations using this interface have the following properties:
• Attached comments/directives are relocated.
• Inserted AST fragments are marked within the Sg_File_Info objects.
• Operations are immediate.
• Operations are local on the specified node of the AST.
8.2.3 Mid Level Rewrite Interface
This interface builds on the low-level interface and adds the string interface, which permits
250. wever, we need to document the process to support developers who might be extending ROSE. It is hoped that if you proceed to add IR nodes, you understand just what this means: you are not extending any supported language (e.g. C++), you are only extending the internal representation. Check with us so that we can help you and understand what you are doing. The SAGE III IR is now completely generated using the ROSETTA IR generator tool, which we developed to support our work within ROSE. The process of adding new IR nodes using ROSETTA is fairly simple: one adds IR node definitions using a BNF syntax and provides additional headers and implementations for customized member data and functions when necessary. There are lots of examples within the construction of the IR itself, so you are encouraged to look at the examples. (Need to cover the new Fortran support.) The general steps are: 1. Add a new node's name into src/ROSETTA/astNodeList. 2. Define the node in ROSETTA's source files under src/ROSETTA/src. For example, an expression node has the following line in src/ROSETTA/src/expression.C: NEW_TERMINAL_MACRO (VarArgOp, "VarArgOp", "VA_OP"); This is a macro (currently) which builds an object named VarArgOp (a variable in ROSETTA), to be named SgVarArgOp in SAGE III and to be referenced using an enum that will be called V_SgVarArgOp. The secondary generated enum name VA_OP is historical and will be removed in a future release of ROSE. 3. In
251. y performance of applications through a combined loop interchange and multi-level fusion strategy [36]. Since loop fusion is implemented in terms of merging computation slices, given a code segment C to optimize, the package first constructs all the valid computation slices. It then applies loop interchange analysis to these slices to arrange the best nesting order for each loop nest in C. When applying fusion analysis to merge the disjunct computation slices, it performs data reuse analysis and performs the actual fusion only when loop fusion does not interfere with loop interchange, or when fusion is more favorable even if it interferes with loop interchange. Because multiple computation slices are constructed for each loop nest, and all of these slices participate in the fusion analysis simultaneously, multiple loops may be fused for each loop nest in a single pass of fusion analysis. As a result, this package achieves a combined loop interchange and multi-level fusion optimization for a collection of loop nests. For example, in Figure, even though the j and 7 loops are nested at different levels in the original code in (a), the package successfully achieves the fusion of these loops, because all the loops are collected as computation slices in a single pass and together they participate in the fusion analysis.
10.3.4 Profitability Analysis
This package separates the profitability analysis of loop transformations from the actual transformations by encodi
