Home

TAPENADE 2.1 user's guide - Sophia Antipolis

1. 9 2 What if validation fails 9 2 1 Debugging tools for the tangent mode 9 2 2 Debugging tools for the reverse mode 9 3 Improving differentiated programs 9 3 1 Indeed inactive variables 9 3 2 linear code fragments 9 3 3 Auto adjoint code fragments 9 3 4 Iterative resolutions 9 3 5 Loops with Independent Iterations 10 KNOWN PROBLEMS AND DEVELOPMENTS TO COME 10 1 Includes comments and declarations 10 2 Directives tac Set he eS et a eas el a ES Ge ed TOS input languagesys A M M ls MS 10 4 Pointers and dynamic allocation 10 5 Parallel programming 10 6 The Recompute All strategy lt lt lt lt ee ee ee INRIA TAPENADE 2 1 5 List of Figures OO NIO G BON F TAPENADE S D card 7 The Recompute All strategy 10 The store All strategy E SE SANS ROK es 10 Checkpointing in the contexts of RA and SA 11 Checkpointing on calls in TAPENADE reverse AD 12 Names of differentiated symbols 14 Differentiated Data Types 206 020 dai au 15 Differentiation of a simple assignment
2. 17 Differentiation of an array assignment 17 Instructions simplifications due to Activity Analysis 18 Effect of Activity Analysis on a complete procedure 19 Differentiation of the flow of control 21 Differentiation of I O procedure calls showing warning messages 22 Differentiation of procedure calls 24 Differentiation model in presence of Overloading 26 Differentiation of Modular Programs 27 Multi Directional tangent differentiation 29 Removing unnecessary storage through To Be Restored analysis 31 Gathering instructions and improving locality through instructions reordering 32 Detection and correction of aliasing in the reverse mode 33 Factorizing duplicate computations through minimal splitting 33 Removing useless dead adjoint code 34 HTML interface for TAPENADE input 54 HTML interface for TAPENADE output 55 Overall architecture of TAPENADE 57 Time and memory consumption on six validation codes 58 Internal Data Flow analyzes needed by AD in TAPENADE 59 Subroutine framework for validation of derivatives 61 Subroutine framework for debugging the tangent code 64 Split
3. ss ml Pln pene Roos Sep W httpvtapenade inria fr 6080 tapenadeform jsp 4 22 Search ant UU Zh lome foot On line Documentation New subscribe Z INRIA mao rhe TAPENADE tapenade users On line Automatic Differentiation Engine mailing list Given e a Fortran source program e the name of the top routine to be differentiated e the dependent output variables whose derivatives are required e the independent input variables with respect to which it must differentiate this tool returns the forward tangent or reverse adjoint differentiated program If you want to be kept informed about new developments and releases of TAPENADE subscribe to the tapenade users mailing list Select the Fortran dialect given by the suffix of the files Fortran 77 Fortran 95 Upload source and include files repeatedly Type the file path in or browse Browse and upload it as a source as an include Remove selected files utiist TS comecinc Retry with new files sub2 f sub1 f gt Name of the top routine psiroe gt Dependent output variables separator white space default all variables ce gt Independent input variables separator white space default all variables ua ctrl gt Differentiate in Tangent Mode Tangent Vectorial Mode Reverse Mode 3 El 34 Es Done JI Fear Figure 23 HTML interface for TAPENADE input INRIA TAPENAD
4. 2 ab i 2 a i a n i a n i tmp ab it2 0 0 CALL POPREAL4 a i ab i 1 ab i 1 ab i ab i 3 ab i Figure 20 Detection and correction of aliasing in the reverse mode mode uses this mechanism but the tangent mode will use it in further versions Expressions are not split during the forward sweep to avoid intermediate variables that might need to be added to the tape Splitting occurs only in the backward sweep and occurs only at carefully selected places in the differentiated expressions Figure 21 illustrates this on a not so long expression to keep things readable On this example splitting spares one exponentiation and one division original program reverse mode TAPENADE reverse backward sweep split backward sweep a SIN b ab ab SIN b rib tempb rib a x ty a x y r1b a a temp x y bb bb a COS b rib ab ab SIN b rib xb xb temp tempb a y x y 1 r1b a bb bb axC0S b rib yb yb xb xb y x y 1 tempb x y LOG x rib a yb yb temp L0G x tempb rib 0 0 rib 0 0 Figure 21 Factorizing duplicate computations through minimal splitting 5 5 Adjoint Liveness analysis Reverse differentiation of the program P that computes function F yields program P that computes the gradient of F The original results of P which are also computed by the forward sweep of P are not a result of P Only the gradient is needed by the end user Moreover in RT n 0300
5. outputlanguage lt lang gt o lt file gt output lt file gt 0 lt dir gt outputdirectory lt dir gt extAD lt file gt noADlib RT n 0300 Specifies the language used by all the files in the lt file_ list gt lt lang gt may be fortran fortran90 fortran95 or c This option overrides the default mechanism that deduces the language of each file from its suffix Specifies the language in which the output differentiated files must be produced lt lang gt may be fortran fortran90 fortran95 or c This option overrides the default mech anism that produces each differentiated procedure in the same language as its original procedure Specifies that the complete differentiated program must be written into a single file named lt file gt lt mode gt lt ext gt Ex tension lt ext gt reflects the language This option overrides the default mechanism that writes each differentiated top level object program subroutine function or module into a separate file named after the name of this object Specifies that all produced files must be put into directory lt dir gt If this option is not used all files are created in the current directory Adds lt file gt to the list of files that contain general speci fications of black box procedures Initially this list contains one predefined file named F77GeneralLib which is found in the lib directory of the TAPENADE installation Empties the list of files sear
6. that builds a Call Graph of the program with a Flow Graph for each procedure This kernel is responsible for all general purpose analyzes such as Read Written analysis or pointer Analysis Through an API on top of this kernel which is maybe not as clear cut as it should is built the AD engine strictly speaking It implements the AD specific analyses and builds the differentiated call graph and flow graphs back into the kernel We claim that other tools could be built this way on top of the same kernel The kernel is then able to send the differentiated IL programs to the pretty printer of the appropriate language Notice also in the middle bottom of figure 25 the black box signatures which stands for the user written library files described in section 6 2 User Interface Java XHTML Differentiation Engine Java API Imperative Language Analyzer Java trees IL trees IL Fortran77 parser C Fortran95 parser C Figure 25 Overall architecture of TAPENADE The chain of analyses that lead to differentiated programs is shown on figure 27 The topmost half is done inside the AD independent kernel and the bottom half is done inside the AD engine Figure 27 illustrates the partial order coming from the dependencies between the analyses Some of these analyses are more time consuming than others Pointers Dependency and Differentiable Dependency All these analyses actually propagate through the program a
7. This can be generalized to higher level derivatives Taylor series etc INRIA TAPENADE 2 1 9 3 1 Tangent and Reverse AD modes In practice the above Jacobian F X is often far too expensive to compute and store Notice for instance that equation 2 repeatedly multiplies matrices whose size is of the order of m x n Fortunately resolution of many problems requires only some projections of F X For example one may need only sensitivities also known as directional derivatives which are F X x X for a given direction X in the input space Using equation 2 sensitivity is F X x X fi Xpa X A a Xp 2 X lt X Ko x 3 which is efficiently computed from right to left because this keeps multiplying a matrix by a vector This is easy to implement because the derivatives of the first instructions are required first and thus they can be interleaved with the original program instructions This is the principle of the tangent mode of AD which is the most straightforward of course available in TAPENADE The program differentiated in tangent mode is often called the tangent program of P written P However in optimization data assimilation 21 inverse problems or adjoint prob lems 13 6 19 the appropriate derivative is the gradient of F Notice that the gradient is only defined for a scalar valued function in the general case where Y is a vector we need to define a scalar linear combination Y x Y as the new result
8. 0 0 0 id id The deps entry gives the differentiable dependency matrix of each output value with respect to input values for each argument For example here the result in parameter 2 depends on the input parameters 1 and 2 the result in the first variable in common c2 doesn t depend on any input and the result in the tail array of c2 depends on the inputs RT n 0300 42 Hasco t and Pascual in parameter 2 and in the whole of c2 An id entry instead of a complete line states that the argument is left unmodified which is a stronger information than a line with just one 1 because it implies that the partial derivative is certainly 1 0 This is the case here for parameters 1 4 and 5 On the other hand parameter 3 is overwritten but of course depends on nobody since it has a non differentiable type For the same reason nobody depends on parameters 3 4 and 5 Notice also that nobody depends on the input in c1 it is an output only File MyADLib can also specify the partial derivatives of black box functions such as FBBOX This uses the derivative keyword Here again the syntax is ugly and will be cleaned up soon Please refer to the TAPENADE web site for information on the current syntax and on the cleaner syntax when it is available Let us underline an interesting utilization of the black box mechanism to declare hid den side effects between communicating procedures Suppose that two procedures SEND and RECEIVE ar
9. 63 tape 10 20 23 31 67 70 tapenade 35 command line options 35 39 DIFFSIZES inc 30 50 52 52 ftp downloading 35 on line documentation 35 performances 58 58 67 procedure headers 19 query web interface 53 56 result web interface 55 synopsis 35 warning messages 43 51 to be restored see TBR analysis trajectory see tape transposition 9 16 17 type checking 25 40 41 45 59 types 14 16 20 25 41 45 primitive types 15 39 63 structured types 14 16 25 27 upstream 11 validation 60 62 vector mode see multi directional mode vectorial notation see array notation Hasco t and Pascual INRIA A Unit de recherche INRIA Sophia Antipolis 2004 route des Lucioles BP 93 06902 Sophia Antipolis Cedex France Unit de recherche INRIA Futurs Parc Club Orsay Universit ZAC des Vignes 4 rue Jacques Monod 91893 ORSAY Cedex France Unit de recherche INRIA Lorraine LORIA Technop le de Nancy Brabois Campus scientifique 615 rue du Jardin Botanique BP 101 54602 Villers l s Nancy Cedex France Unit de recherche INRIA Rennes IRISA Campus universitaire de Beaulieu 35042 Rennes Cedex France Unit de recherche INRIA Rh ne Alpes 655 avenue de l Europe 38334 Montbonnot Saint Ismier France Unit de recherche INRIA Rocquencourt Domaine de Voluceau Rocquencourt BP 105 78153 Le Chesnay Cedex France diteur INRIA Domaine de Voluceau Rocquencourt BP 105 78
10. Selecting a procedure name in a call graph frame triggers display of this procedures source in the frame below Selecting an instruction in RT n 0300 56 Hasco t and Pascual either source frame makes the other source frame scroll to show the corresponding area Danger signs in the source frames are links to warning messages which are displayed in the bottom frame and reciprocal links go from the message text to the location in the source frame In addition the output window has links to download the resulting differentiated program to your system to download the source for the PUSH and POP utility subroutines for the reverse mode The Back button of your browser will take you to a new TAPENADE input web page which may have lost all updated files To avoid that use the Retry with same files button from the output interface page to go back to the input interface page without loosing the data you filled in INRIA TAPENADE 2 1 97 8 ARCHITECTURE AND PERFORMANCES Figure 25 summarizes the architecture of TAPENADE At the source language level the input parsers and the output pretty printers are clearly separated from the kernel which only knows an abstract language called TL for Imperative Language Ideally IL contains all the semantic constructs that exist in the most common imperative languages we are targeting at i e the FORTRANs and the Cs The kernel of TAPENADE is an Imperative Language Analyzer
11. TAPENADE can be installed on the local computer and run from the command line or from a Makefile just like a compiler A typical call to TAPENADE is gt tapenade reverse root func vars x z filel f file2 f TAPENADE may be downloaded from the FTP address ftp ftp sop inria fr tropics tapenade All TAPENADE documentation with a tutorial and an ever growing reference manual is available at http www sop inria fr tropics tapenade html Section 6 1 describes the command line options section 6 2 describes the files where the user can give information on black box procedures section 6 3 explains the messages that TAPENADE sends to the user during differentiation and section 6 4 describes how the user can provide the variable size information requested by TAPENADE 6 1 Available TAPENADE options TAPENADE offers two main ways to drive differentiation command line parameters and configuration files Directives put into the source files are a third way which is not available yet and which will be available in the next versions Schematically the command line has the following simple syntax tapenade lt options list gt lt file_ list gt e The lt file_list gt is an unordered list of source files that contain the program to be differentiated File names must be separated by white space The INCLUDE files must not be given in the lt file list gt TAPENADE will look for them automatically when meeting an INCLUDE instruction in some
12. active if it is at the same time varied and useful In the special case of the tangent mode it is easy to check that when variable v is not varied at some place in the program then its derivative v at this place is certainly null Conversely when variable v is not useful then whatever the value of v this value does not matter for the final result Sym metric reasoning applies for the reverse mode of AD observing that differentiated variables go upstream we see that a useless variable has a null derivative in other words the partial derivative of the output with respect to this variable is null Conversely Conversely when variable v is not varied then whatever the value of v this value does not matter for the final result TAPENADE will simplify the differentiated program accordingly and section 4 5 explains on examples how far this simplification can go Activity analysis is global cf 18 running on the complete call graph below the topmost differentiated procedure This is why there is no separate compilation in AD the whole call graph must be analyzed globally for a good activity analysis Let s explain in more detail activity analysis propagates the information whether each variable depends on an independent downstream on the program It also propagates the information whether some dependent depends on each variable upstream on the program Each time this propagation encounters a call to some procedure R we need the precise
13. procedure call the forward sweep calls the original SUB and the backward sweep calls the differentiated SUB B that gathers its own forward and backward sweeps cf figure 5 Notice that even if SUB is a function that returns an active value TAPENADE always makes SUB_B a subroutine because the result of SUB is not an output of SUB_B cf section 5 5 and the derivative of the result of SUB subb is an additional input argument of SUB_B Figure 14 illustrates this when SUB is a subroutine We recall that TAPENADE prefers procedure generalization as opposed to specialization Thus even if a procedure is called many times with arguments sometimes active sometimes not only one differentiated procedure is built i e for the most general activity of arguments Thus specific calls are sometimes given dummy derivatives either to feed them with a null derivative input or to receive a derivative result which will not be used Assume SUB is called elsewhere with an active 3 argument whereas the 4 argument is never active This explains the 0 0 argument in tangent and the arg3b in reverse which is not used later In the reverse mode checkpointing requires taking a snapshot TAPENADE runs a classical Read Written analysis plus specific Adjoint Liveness and Adjoint Written analyses to find a minimal snapshot made of variables that are both used by the differentiated procedure and overwritten before the differentiated procedure is
14. the tangent mode of AD takes as input a single vector X which is the direction in the input space along which the directional derivatives must be computed Conversely the reverse mode of AD takes as input a single weight vector Y that defines the composite optimization criterion for which the gradient must be computed Actually nothing forbids the tangent mode to run simultaneously on several values of X nor respectively the reverse mode to run on several values of Y This is called multi directional tangent or reverse mode This is functionally equivalent to running the tangent code respectively the adjoint code many times with the same value of X but with different values of X respectively Y The difference lies in the execution time because multi directional code computes the original function only once One classical use of multi directional mode is to compute the complete Jacobian matrix either column by column through multi directional tangent mode or row by row through multi directional reverse mode If the number of rows i e output variables is notably smaller than the number of columns i e input variables then the multi directional reverse mode is recommended Otherwise the multi directional tangent mode should be preferred The multi directional extension can be applied to both tangent and reverse modes of AD However for the time being TAPENADE only provides the multi directional tangent mode Next versions of TAPENADE wi
15. L for j p down to 0 F00 B prints the dot product X X equation 8 states it should be the same followed by the name of L With correct derivatives one gets this sort of output locations go from Lo to L3 numbers symbolized here by nnnnn and mmmmm are useless for the test and must be ignored gt gt DotProduct Location L 0 gt gt DotProduct 7827 3 Location L 1 gt gt DotProduct 2 543E 02 Location L 2 gt gt DotProduct 9 233E 07 Location L 3 lt lt DotProduct mmmmm Location L 3 lt lt DotProduct 9 233E 07 Location L 2 lt lt DotProduct 2 543E 02 nnnnn INRIA TAPENADE 2 1 67 Location L 1 lt lt DotProduct 7827 3 Location L 0 9 3 Improving differentiated programs This section is about modifying the differentiated program to improve its performances Please bear in mind that this is a dangerous activity if done only by hand on the differentiated code all modifications will be lost if the program is differentiated again Therefore we insist that all hand made modifications on the differentiated code should remain simple and of limited extent and next developments of TAPENADE will automate these improvements during AD itself probably with the help of directives inserted by the end user into the source program Despite the analysis effort done during differentiation there will always remain unneces sary instructions in the differentiated code There is even a theoretical justificati
16. MOD_T_B TYPE T TYPE T TYPE T real x y REAL x y REAL x y integer i INTEGER i INTEGER END TYPE T END TYPE T END TYPE T INTERFACE OPERATOR TYPE T_D TYPE T_B MODULE PROCEDURE TM REAL x y REAL x y END INTERFACE END TYPE T_D END TYPE T_B INTERFACE ASSIGNMENT MODULE PROCEDURE TS END INTERFACE CONTAINS CONTAINS CONTAINS SUBROUTINE TS a b SUBROUTINE TS_D a ad b bd SUBROUTINE TS_B a ab b bb FUNCTION TM a b FUNCTION TM D a ad b bd tm SUBROUTINE TM B a ab b bb tmb END MODULE MOD T END MODULE MOD T D END MODULE MOD T B FUNCTION TF v u FUNCTION TF D v vd u tf SUBROUTINE TF B v vb u tfb FUNCTION RF v u FUNCTION RF D v vd u rf SUBROUTINE RF B v vb u rfb SUBROUTINE HEAD a b c r SUBROUTINE HEAD D a ad b amp SUBROUTINE HEAD B a ab b amp use MOD T amp bd c cd r rd amp bb c cb r rb TYPE T a b c USE MOD_T_D USE MOD_T_B REAL r TYPE T a b c argl TYPE T a b c argl INTERFACE F TYPE T_D ad bd amp TYPE T B ab bb FUNCTION TF v u cd argid cb argib use MOD T r rd resulti REAL r rb resulti TYPE T resultid resultib real u TF bare mes END FUNCTION TF argi b x c FUNCTION RF v u r argi real u RF v resulti F a 2 0 END FUNCTION RF CALL PUSHREAL4 r INTERFACE r resulti r argid TM_D b bd c cd arg1 ict b c CALL TS_D r rd arg1 argid CALL POPREAL4 r F a 2 0 r resultid TF_D a ad amp resultib r rb amp 2
17. Number 19 in Frontiers in Appl Math SIAM Philadelphia PA 2000 16 A Griewank and G Corliss editors Automatic Differentiation of Algorithms Theory Implementation and Application SIAM Philadelphia PA 1991 17 A Griewank and Ch Faure Reduced Gradients and Hessians from Fixed Point Itera tion for State Equations Numerical Algorithms 30 2 113 139 2002 18 L Hasco t U Naumann and V Pascual TBR analysis in reverse mode automatic differentiation Preprint ANL MCS P936 0202 Argonne National Laboratory 2002 also Rapport de recherche number 4856 INRIA 19 L Hasco t M Vazquez and A Dervieux Automatic Differentiation for Optimum De sign Applied to Sonic Boom Reduction In V Kumar M L Gavrilova C J K Tan and P L Ecuyer editors Computational Science and Its Applications ICCSA 2003 Proceedings of the International Conference on Computational Science and its Applica tions Montreal Canada May 18 21 2003 Part IL volume 2668 of Lecture Notes in Computer Science pages 85 94 Berlin 2003 Springer 20 INRIA Tropics team On line documentation of the Tapenade AD tool Technical report url http www inria fr tropics 21 F X Le Dimet and O Talagrand Variational algorithms for analysis and assimilation of meteorological observations theoretical aspects Tellus 38A 97 110 1986 22 M Metcalf and J Reid Fortran 90 95 explained Oxford University Press 1996 23 U Na
18. Sequences of instructions i e basic blocks obviously produce differentiated basic blocks The control flow instructions that glue basic blocks together are essentially reproduced identically Actually what is reproduced identically is the structure of the Flow Graph which serves as the internal representation of procedures As a consequence the last regeneration step in TAPENADE that goes from a flow graph to a syntactic tree makes some arbitrary choices while building the final syntactic tree most often for the better for example removing GOTO s or cleaning up badly nested conditionals In some rare situations these choices actually make the code appearance worse and we are progressively working to correct this Therefore one should keep in mind that the original control flow structure is sometimes slightly altered by TAPENADE s differentiation model In reverse mode TAPENADE applies the Store All strategy cf section 3 resulting in a forward sweep followed by a backward sweep Like the tangent mode the forward sweep of the reverse mode reproduces the control constructs of the original code In addition the forward sweep stores into a stack the tape the intermediate variables potentially required by the derivatives plus some flags or control information that will allow the backward sweep to implement the reverse of the original control flow between the I The stack is used classically through several PUSH and POP subrout
19. The result is a shorter code called the non incremental code which is closer to what one would write when programming an adjoint RT n 0300 32 Hasco t and Pascual code by hand and which improves data locality Figure 19 illustrates the effect of this refinement on the same example as in the previous section original program reverse mode TAPENADE reverse backward sweep non incremental with TBR backward sweep x EXP a POPREAL4 a CALL POPREAL4 a x a 2 zb 3 ab zb zb 3 ab 3 z 0 0 xb xb yb ab 2 a yb ab 2 a yb EXP a xb xb yb yb 0 0 0 0 ab EXP a xb Figure 19 Gathering instructions and improving locality through instructions reordering 5 3 Detection of Aliasing Program transformation tools and AD tools in particular assume that two different vari ables represent different memory locations The program can specify explicitly that two different variables indeed go to the same place using pointers or the EQUIVALENCE declara tion In this case the tool must cope with that But it is not recommended and forbidden by the standard that the program hides this information e g declaring a procedure with two formal arguments and giving them the same variable as an actual argument This is called aliasing TAPENADE detects this situation and issues a warning message DF02 cf section 6 3 to the user This message should not be overlooked because it may point to a future problem in the differentia
20. called On the example of figure 14 the combination of the above analyses could prove that this is only the case for x In some rare situations a variable must be stored as part of the snapshot and also as part of the tape i e because it is needed by derivatives of previous instructions cf section 5 1 Instead of PUSH ing the value twice TAPENADE uses a special LOOK function that reads the value from the stack without removing it so that it can be POP ed again later RT n 0300 24 Hasco t and Pascual CALL PUSHREAL4 x x 3 X x 3 CALL PUSHREAL4 x CALL SUB a x 1 5 z CALL SUB a x 1 5 z arg x y CALL PUSHREAL4 x CALL PUSHREAL4 arg x FUN arg xd 3 x 2 xd POPREAL4 arg X x 3 FUN B arg argb xb CALL SUB_D a ad x xd POPREALA x 1 5 0 0 z y argb argd xd y x yd yb x argb arg x y POPREAL4 x xd FUN_D arg argd x SUB_B a ab x xb 1 5 arg3b z POPREAL4 x 3 x 2 xbD Figure 14 Differentiation of procedure calls INRIA TAPENADE 2 1 25 4 9 Overloading Overloading allows the user to call different procedures with the same name At compile time like in FORTRAN95 or at run time like in object oriented languages the system decides which is the actual procedure called generally by comparing the data types of the procedure s arguments Furthermore FORTRAN95 allows the user to overload predefined operators such as by user functions and to overlo
21. debugging 5 1 To Be Restored analysis We saw that intermediate values need to be taped before overwritten but this is only when they will be used by the differentiated instructions specific program static analysis called To Be Restored TBR 9 23 does this in TAPENADE In the example of figure 18 TAPENADE could prove that neither x nor y were needed by the differentiated instructions and therefore did not PUSH them on nor POP them from the stack original program reverse mode reverse mode POP fave backward sweep backward sweep with TBR POPREAL4 a POPREAL4 a zb 3 ab zb 3 ab 0 0 0 0 POPREAL4 y ab 2 a yb ab 2 a yb xb yb xb yb 0 0 0 0 ab EXP a xb POPREAL4 x ab EXP a xb Figure 18 Removing unnecessary storage through To Be Restored analysis 5 2 Gathering incrementation instructions Many reverse differentiated instructions increment a differentiated variable Others just reset them often to zero These instructions may fall far apart in the differentiated program which uses the reversed original instruction order and therefore it is likely that the compiler will not optimize them TAPENADE implements a data dependency analysis that allows it to safely move and gather initializations and incrementations of the same differentiated variable whenever possible Also when a differentiated variable is mplicit null an incrementation of this variable becomes a simple assignment
22. dependence pattern of each of R s outputs on each of R s inputs This differentiable dependency matriz must be precomputed for each procedure in the call graph bottom up therefore calling for a global analysis on the call graph known as the Differentiable Dependency analysis When for some reason one black box procedure in the call graph cannot be analyzed for its differentiable dependency matrix then some default conservative dependency matrix must be used somehow making each output depend on each input This has dramatic con sequences on activity analysis usually all variables downstream the call become varied all variables upstream become useful a lot of variables thus become active and the differ entiated code is less efficient To avoid this degradation of the quality of activity analysis the differentiable dependency matrix of each black box procedure must be provided to the AD tool in some manner For example TAPENADE provides an interface for the end user to specify these differentiable dependency matrices described in section 6 2 RT n 0300 14 Hasco t and Pascual 4 THE DIFFERENTIATION MODEL OF TAPENADE The previous section showed the theoretical basis of Automatic Differentiation It gave a rough idea of what a differentiated program looks like In this section we plan to de scribe precisely the actual differentiation model of TAPENADE The goal is to gain a deeper understanding and familiari
23. do Check that this does not impact differentiation Take care to perform the usual validation tests on the computed derivatives These messages indicate conflicts between declaration and usage of some variable Mes sage TC25 occurs in the context of FORTRAN95 array expressions Messages TC26 to TC29 indicate a problem in EQUIVALENCEs and COMMONs Recall that variables in a COMMON are stored contiguously so that their relative order cannot be changed and extra variables cannot be inserted occasionally Also a variable cannot be in two different COMMONs and one cannot set an EQUIVALENCE between two variables that already have their own different memory space allocated Why should you care In general if you let the system choose the type of a variable you run the risk that it becomes REAL while you don t want or the other way round And this in turn impacts differentiation Messages TC24 TC26 TC27 and TC29 are even more serious They can indicate that the code relies on FORTRAN weak typing to perform hidden EQUIVALENCEs between two variables or to resize reshape or split an array into pieces Since TAPENADE cannot possibly understand what is intended it might not propagate differentiability and derivatives from the old array shape to the new array shape What can you do Declare variables explicitly Try to avoid the implicit declaration facility for example by setting explicitly an IMPLICIT NONE declaration Declare arr
24. example the include files are not used there their contents are inlined into the new declarations Also the order and style of the declarations is changed according to TAPENADE s taste which is arbitrary Even more disturbing the comments which were attached to these declarations are now floating at the end of the declaration section which is inconvenient Although not vital for AD this is a serious problem and we plan to fix it by keeping a list of the original declarations lines and using these original declarations to produce the new differentiated source program 10 2 Directives We mentioned several times the need for directives to let the end user give additional in formation to TAPENADE Directives are interesting because they can provide information relative to a given location in the source Furthermore they remain in the source program when it is modified providing cumulative AD knowledge into the source code We are thinking about several uses for directives such as locating special pieces of code like iterative resolutions cf section 9 3 4 IFloops cf section 9 3 5 linear or auto adjoint functions cf section 9 3 3 The most important use of directives will be to drive checkpointing The present TAPE NADE systematically checkpoints each procedure call This choice should be modifiable by the end user to take advantage of the relative different costs of each procedure Directives will be available in the next versions
25. how this code fragment should be differentiated in tangent mode as well as in reverse mode This special case can be flagged by a directive Until this exists the obvious solution is again to use the black box mechanism and to provide a hand implementation of the differentiated fragment This is quite easy in the case of a linear operation done by a procedure call call LINEAROP X Y One just needs to define the differentiated subroutine LINEAROP_D X Xd Y Yd as call LINEAROP Xd Yd call LINEAROP X Y INRIA TAPENADE 2 1 69 9 3 3 Auto adjoint code fragments Consider a linear code fragment like above Y M x X and suppose in addition that M M To make the example more interesting suppose also that M is not explicit it is the inverse of another matrix N Of course we don t want to compute NT explicitly and we prefer to factorize N instead as L x U N In the program this may be gathered in a single call call AUTOADJ X Y N L U Here again reverse AD of AUTOADJ probably does not generate a very efficient code because TAPENADE cannot guess that the adjoint AUTOADJ_B is equivalent to the original AUTOADJ itself This special case can be flagged by a directive Until this exists the solution is again to make AUTOADJ a black box subroutine AUTOADJ B X Xb Y Yb N L U can be defined simply by hand as Ab 0 0 call AUTOADJ Yb Ab N L U Xb Xb Ab Yb 0 With a little more effort one can even put in common th
26. if the original program com piles and runs correctly with your compiler these messages warn you that differentiation may produce a faulty program Some messages are requests for help when TAPENADE needs some help from the user e g because it needs and does not know the size of an array it issues a message that asks you to give this size after differentiation is done Otherwise the resulting program will not run In the following sections we present all the messages that TAPENADE may issue when differentiating a file These messages are gathered into a number of general categories For each category after a general comment you will find a Why should you care paragraph that emphasizes the nasty things that may happen if this message is disregarded and a What can you do paragraph that gives hints about how the problem may be solved In the text of each message symbols in italics are place holders for symbols from the program being analyzed For example V V1 V2 represent variable names T T1 T2 represent type names C C1 C2 represent names of COMMONs F F1 F2 represent procedure names K1 K2 represent procedure kinds EXTERNAL INTRINSIC or USER Exp represents an expression File a file an interface name L a label identifier N a number and Sis any symbol RT n 0300 44 Hasco t and Pascual 6 3 1 Declarations problems DD06 Cannot combine successive declarations of function F return type T1 and T2 overwritte
27. independents v1 v3 v4 v6 v7 One can run activity analysis by hand on this small example The input independent variables v3 v6 v7 are useless and the output dependent variables v3 v6 v7 are not varied This justifies the two comments where only vi v4 and vb remain Differentiated variables v3d v4d v6d v3b v5b and v6b are indeed implicit null at the end of the differentiated subroutine so they are not even reset to zero because the generated calling subroutine will build shorter code that takes care of that However in the special case where subroutine S1 is the topmost differentiated routine i e it will be used in user written code the differentiated routines are slightly different and so are the comments To avoid misuses of the S1_D and S1_B in the hand written code the mplicit null variables that are visible from the calling context are explicitly reset to zero This is reflected by the output variables list in the tangent code which will feature in addition v3 v4 and v6 and by the input variables list in the reverse code which will feature in addition v3 v5 and v6 This can be surprising because the end user did not declare v3 as independent nor v4 as dependent Similar extra initializations are inserted before and after calls to differentiated black box procedures to allow for a more natural hand coding of the derivative of the black INRIA TAPENADE 2 1 19 original program TAPENADE tangent TAPENAD
28. it is straightforward to write the Jacobian of the corresponding function fr R R a i a i b j gt b j x x which is the matrix SIN a i x b j 0 1 0 0 0 1 Recalling that transposing a matrix is just applying asymmetry with respect to the diagonal we can write the operations that J and must implement a i SIN a i x b j a i E implements b j 0 1 0 x bG x 0 0 1 x g a i SIN a i 0 0 a i I implements b j x 1 0 x BC X b j 0 1 X INRIA TAPENADE 2 1 17 and therefore TAPENADE produces the derivative instructions shown on figure 8 TAPENADE tangent TAPENADE reverse ad i xd b j amp xb xb b j ab i amp x bd j amp bb j bb j x ab i amp ad i SIN a i ab i SIN a i ab i Figure 8 Differentiation of a simple assignment 4 4 Array or Vectorial notation FORTRAN95 provides array notation allowing the user to specify systematic operations on sets of array elements in one single instruction Instead of writing a loop one can write an assignment whose operands are array sections instead of array elements Each operation operates uniformly on each element of these array sections Array instructions are useful because they can be compiled very efficiently for parallel and vectorial target architectures TAPENADE can differentiate these array instructions producing new array instructions when ever possible Figure 9 illustrates this emphasizing tha
29. maximum java heap size for this command If this option is not used the default is mx256m but a larger heap may be necessary for very large programs Specifies the size in bytes of integer numbers on the current target architecture lt n gt may be 2 4 or 8 If this option is not used the default is 4 Specifies the size in bytes of real numbers on the current target architecture lt n gt may be 4 or 8 If this option is not used the default is 4 Specifies the size in bytes of double precision real numbers on the current target architecture lt n gt may be 4 8 or 16 If this option is not used the default is twice the size of real numbers 6 1 5 Display and Debugging options These options control display of differentiation output including some outputs that are used for debugging purpose msglevel lt n gt Sets the maximum detail level of the messages that show progress of differentiation If this option is not used the default is 5 Larger lt n gt prints more html Displays the differentiation result into a web page similar to the output of the TAPENADE web server described in sec tion 7 If your web browser doesn t display it automatically the page to load is in tapenadedir tapenade html or in lt dir gt tapenadedir tapenade html if the option 0 lt dir gt is used view Opens windows that display graphically the internal representation of the original program and of the differentiated program whi
30. of TAPENADE 10 3 Input languages For this version on TAPENADE accepts FORTRAN77 and FORTRAN95 as input Work to accept C as an input language will start soon INRIA TAPENADE 2 1 73 On the other hand extension to object oriented languages is not yet a priority This is still in the research area and will not be available in a foreseeable future Several users asked for a version of TAPENADE running on WINDOWS This should be available very soon 10 4 Pointers and dynamic allocation Full AD on FORTRAN95 supposes pointer analysis and an extension of the AD models on programs that use dynamic allocation This is not done yet Whereas the tangent mode does not pose major problems for programs with pointers and allocation there are problems in the reverse mode For example how should we handle a memory deallocation in the reverse mode During the reverse sweep the memory must be reallocated somehow and the pointers must point back into this reallocated memory Finding the more efficient way to handle this is still an open problem 10 5 Parallel programming We made very few tests on parallel programs Apparently TAPENADE does not differentiate all sorts of parallel programs This version 2 1 handles the vectorial notation of FORTRAN95 for the most frequent cases TAPENADE can also use previous work 8 7 with ODYSSEE 10 on differentiating programs with MPI communications But there are plenty of cases where parallel programs ar
31. program TAPENADE multi directional tangent function FF x y real x 100 y 100 v ff ff 1 0 Do i 1 100 v x i y i 2 0 ff i ff x i y i v ff v x i call F2 v y i enddo end x v SUBROUTINE FF DV x xd y yd ff ffd nbdirs INCLUDE DIFFSIZES inc Hint nbdirsmax should be the maximum number of differentiation directions INTEGER nbdirs i nd REAL ff ffd nbdirsmax x 100 xd nbdirsmax 100 amp amp y 100 yd nbdirsmax 100 v vd nbdirsmax DO nd 1 nbdirs ffd nd 0 0 ENDDO ff 1 0 DO i 1 100 v x i y i 2 0 DO nd 1 nbdirs ffd nd ffd nd xd nd i y i x i yd nd i ENDDO ff ff x i y i x i v ff DO nd 1 nbdirs vd nd xd nd i tyd nd i 2 0 xd nd i vd nd ff v ffd nd vd nd vd nd x i v xd nd i ENDDO v vxx i CALL F2_DV v vd y i yd 1 i nbdirs ENDDO END Figure 17 Multi Directional tangent differentiation RT n 0300 30 Hasco t and Pascual independent iterations On the example TAPENADE can create a cluster of three dif ferentiated instructions Data dependences on variable x i prevent merging of the fourth differentiated loop Notice that the differentiated loops are essentially parallel and telling this to the compiler can bring an interesting speedup As usual differentiated assignments are separated from their original assignment whereas procedure calls become one single call to the differentiated proc
32. reduced swapping Moreover dead adjoint code can be removed from the end of body i Again this is typically a job for directives Before this is done though here is a reason able partly automated implementation in the original program create a subroutine that contains only the IFloop Inside this loop create a subroutine that contains body i This modified source is sketched in the left column of figure 33 Then call TAPENADE to perform reverse AD The result is sketched in the middle column of figure 33 Observe that the loop immediately precedes its adjoint apart from a trivial PUSH POP There could be some initializations of derivatives between the two loops just move these initializations before the first loop Observe also that the code for BODY18_B already benefits from dead adjoint code elimination as the last instruction of BODY18 has been removed Then comes the manual step which is to remove the indicated lines in subroutine LOOP18_B to obtain the right column of figure 33 This is a small enough modification even if it must be repeated after each differentiation The resulting code embeds the optimization of the adjoint IFloop INRIA TAPENADE 2 1 71 original program reverse mode reverse mode after manual improvement call LOOP18 N call L00P18 N call LOOP18 N call L00P18 B N call L00P18 B N SUBROUTINE LOOP18 N SUBROUTINE L00P18 B N SUBROUTINE LOOP18_B N INTEGER i N INT
33. source file INCLUDE files can recursively contain INCLUDE files By default the suffix of each source file indicates its language f for or fortran for FORTRAN77 f90 or 95 for FORTRAN95 and c for C e The lt options_list gt is an unordered list of options possibly empty separated by white space Options belong to several categories described in the next sections RT n 0300 36 Hasco t and Pascual 6 1 1 Fundamental options These options give the basic data on which F to differentiate which dependent Y with respect to which independent X and with which AD mode root lt name gt Specifies the name of the root procedure that defines the function head lt name gt F to differentiate TAPENADE only allows you to differentiate com plete procedures If you want F to be a part of a procedure you must first split this part as a new procedure If this option is not used the default root procedure is chosen in the maximal elements of the call graph i e any procedure which is not called by another procedure outvars lt vars gt Specifies the dependents i e the list of the output variables of the root procedure for which the derivative is requested Variables which are not output local variables or constants are just ignored lt vars gt is a list of variable names between double quotes The double quotes may be omitted when there is only one variable in the list If this option is not used the default is t
34. structure of the original program as much as possible In particular the derivatives of variables of a structured type are themselves structured However their type is slightly different because it must have room to accommodate only the derivatives of components which may have a derivative So TAPENADE needs to introduce and declare a differentiated structured type For each component x of the original structured type T there will be a component x of the same name and type in the differentiated structured type T_D or T_B if somewhere in the program there is a variable of type T with a non trivial derivative for its component x original program TAPENADE tangent TYPE VECTOR_D REAL x z END TYPE VECTOR_D TYPE VECTOR TYPE VECTOR CHARACTER 64 CHARACTER 64 REAL x y z REAL x y z END TYPE VECTOR END TYPE VECTOR TYPE VECTOR u v w TYPE VECTOR u v w TYPE VECTOR D ud FUNCTION TEST a b a REAL TEST FUNCTION TEST D a ad b bd test TYPE VECTOR a b REAL test TEST D TYPE VECTOR a b TYPE VECTOR D ad bd TEST a x b x u z test d a x bd x ad x b x ud z TEST a x b x u z Figure 7 Differentiated Data Types Notice that this way it may happen that some component of the differentiated struc tured type is not really used for some differentiated variables Avoiding this would require RT n 0300 16 Hasco t and Pascual definition of many differentiated types for some T one for each combination of diffe
35. with the latest improvements or as a downloadable tool to be called from your command line or Makefiles Of course TAPENADE is alive and therefore evolves constantly RT n 0300 8 Hasco t and Pascual 3 AUTOMATIC DIFFERENTIATION OF COMPUTER PROGRAMS Automatic or Algorithmic Differentiation AD differentiates programs An AD tool takes as input a source computer program P that given a vector argument X JR computes some vector function Y F X IR The AD tool generates a new source program that given the argument X computes some derivatives of F In a sense an AD tool behaves largely like a compiler which first understands the meaning of the original program represents it in an internal form performs analyses to get a deeper understanding and generates an object code The main differences from a compiler are e the meaning of the object code is different from the source AD defines this new meaning as a mathematical transformation differentiation on the original function e the produced object code is generally not in machine form but rather in source form again This difference is in principle minor but most users prefer to see the results of differentiation in their favorite source language and this allows the compiler later on to run the usual optimization steps on the differentiated code Why is such a transformation possible AD actually relies on a number of reasonable assumptions on the progr
36. 0 result1 rb resulti rb rd resultid r resultix rd CALL TF_B a ab 2 0 resultib r resulti r CALL TS_B r rb argi argib sa CALL TM B b bb c cb argib END SUBROUTINE HEAD D He END SUBROUTINE HEAD_B Figure 15 Differentiation model in presence of Overloading INRIA TAPENADE 2 1 original program module examplei type vector real x y z end type vector type vector u v w contains function dot_prod a b type vector a b real dot prod dot prod a x b x amp ahy bhy ahz biz end function dot_prod end module 27 TAPENADE tangent MODULE EXAMPLE1_D TYPE VECTOR REAL x y z END TYPE VECTOR TYPE VECTOR u v w CONTAINS FUNCTION DOT_PROD_D a ad amp amp b bd dot_prod TYPE VECTOR a b TYPE VECTOR ad bd REAL dot_prod REAL dot_prod_d dot_prod_d ad x b x amp afx bd x adhy bhy ahy bdhy ad z b z amp a4z bd z dot prod afx b x amp ahy bhy a4z b z END FUNCTION DOT_PROD_D FUNCTION DOT PROD a b IMPLICIT NONE TYPE VECTOR a b REAL dot prod dot prod afx b x amp amp ahy bhy ahz b z END FUNCTION DOT_PROD END MODULE EXAMPLE1_D Figure 16 Differentiation of Modular Programs RT n 0300 28 Hasco t and Pascual differentiated top level subroutine may exist in a separate file from the original subroutine avoiding code duplication for further maintenance 4 11 Multi directional differentiation Until here
37. 153 Le Chesnay Cedex France http www inria fr ISSN 0249 0803
38. 34 Hasco t and Pascual original program reverse mode TAPENADE reverse dead adjoint code removed IF a GT 0 0 THEN a LOG a ELSE a LOG c CALL SUB a ENDIF END IF a GT 0 0 THEN CALL PUSHREAL4 a a LOG a CALL POPREAL4 a ab ab a ELSE a LOG c CALL PUSHREAL4 a CALL SUB a CALL POPREAL4 a CALL SUB_B a ab cb cb ab c ab 0 0 END IF IF a GT 0 0 THEN ab ab a ELSE a LOG c CALL SUB_B a ab cb cb ab c ab 0 0 END IF Figure 22 Removing useless dead adjoint code most implementations the original results will be overwritten and lost during the backward sweep of P Therefore some of the last instructions of the forward sweep of P are actually dead code TAPENADE implements a slicing analysis called Adjoint Liveness analysis to remove this dead adjoint code Incidentally as a side effect of this slicing analysis TAPENADE is also able to detect which arguments of a procedure are effectively read and effectively overwritten not by the procedure P itself but by its adjoint P This is the Adjoint Written analysis already mentioned in section 4 8 which is used to reduce the snapshot taken before calling P The example on figure 22 shows the effect of detection of Adjoint Liveness analysis on a small program which terminates on a test with some dead adjoint code at the end of ea ch branch INRIA TAPENADE 2 1 35 6 USING TAPENADE FROM THE COMMAND LINE
39. E 2 1 55 7 2 The TAPENADE output result web page A moment after the user clicked on one AD mode button which triggered differentiation step 6 the TAPENADE server sends the differentiation output in a new web page shown on figure 24 This graphical user interface helps examine TAPENADE output exhibiting j la ile Edit View Go ERA Tools Window Help gt 3 44 I Ss Bo ee eee 3 4 http tapenade inia fr 8080apenade resut htm 3 Search El A Home uf Bookmarks Retry with the same files Download differentiated file Download PUSH POP s Original call graph Differentiated call graph psiroe psiroe b conddirflux gradnod b transpiration gradfb b normcog transp b calcnormpeau d fluroe_b vcurvm vcurvm b fluroe transpiration b Pannes Do Wm A zi dz ivar is 1 0d0 pentel is dzt B PUSHIN RS BY 0 5d0 SIGN 1 0d0 dzt ivar SIGN E min3 PUSHINTEGER4 0 pence ied Je Mag POPINTEGER4 branch NOT branch LT 1 POPINTEGER4 ad to ivar ad_to 1 1 POPINTEGERA ad to C FIN DE LA BOUCLE SUR LES ELEMENTS is ad_to l 1 a dz 10 10 POPREALS dz ivar is Ey Maxx 21 23 POPREALS dy ivar is POPREAL8 dx ivar is TRANSP dx 5 nx resx aisb dx ivar is dxb ivar is dy ivar is C Completing the non limited nodal gradients is dz ivar is dzb ivar is nordre EQ 2 OR nordre EQ 3 Ti dzb ivar is ais dzb ivar is c i d
40. E reverse forward tangent mode reverse adjoint mode output variables vi v5 input variables vi v4 input variables vi v4 output variables vi v5 subroutine S1 vi amp SUBROUTINE S1_D vi vid v2 SUBROUTINE S1 B vi vib v2 v2 v3 v4 v5 v6 v7 v3 v4 vdd v5 v5d v6 v7 v3 v4 v4b v5 v5b v6 v7 real vi v2 v3 amp REAL vi vid v2 v3 v4 REAL vi vib v2 v3 v4 amp amp v4 v5 v6 v7 amp v4d v5 v5d v6 v7 amp v4b v5 vbb v6 v7 real tmp1 tmp2 REAL tmpi tmp2 tmp1 v tmpi vi tmp2 v2 tmp2 v2 vid v2 v1d vi vitv2 vi vitv2 2 v4 v5b v2 tmpi v3 v6 v2 tmpi v3 v6 v2 v1b v3 tmp2 tmp2 v3 tmp2 tmp2 v4d v4 v4 v4d v5 v4 v4 v5 v4 v4 v4 1 0 v4 1 0 v6 2 0 v6 2 0 Figure 11 Effect of Activity Analysis on a complete procedure RT n 0300 20 Hasco t and Pascual box Similarly this is indicated by a message ADO9 cf section 6 3 which is issued at differentiation time that specifies which independent and dependent are required for the black box from its calling contexts 4 6 Control Flow structure Now that we have a clearer idea of how a single instruction is differentiated and why let s focus on the flow of control In tangent mode equation 3 allows derivative instructions I to run along with the original I In fact is just before I because 1 may overwrite a part of Xp 1 that is used by f X 1 in
41. EGER i N INTEGER i N DO i 1 N DO i 1 N DO i 1 N call BODY18 i call PUSH ENDDO call BODY18 i END ENDDO CALL PUSHINTEGER4 i 1 CALL POPINTEGER4 ad to DO i ad to 1 1 call POP call BODY18 B i call BODY18 B i ENDDO ENDDO END END SUBROUTINE BODY18 i SUBROUTINE BODY18 B i SUBROUTINE BODY18 B i INTEGER i INTEGER i INTEGER i res i res i v END vb vb resb i vb resb i END Figure 33 Partly automatic reverse differentiation for IFloops RT n 0300 72 Hasco t and Pascual 10 KNOWN PROBLEMS AND DEVELOPMENTS TO COME We conclude this user s guide of TAPENADE by a quick description of known problems and how we plan to address them in the next releases We are not talking here about the bugs in TAPENADE which of course exist among the 70 000 lines of JAVA source Rather we focus on missing functionalities We suggest you check the TAPENADE web site to know which new functionalities appeared after this guide was written 10 1 Includes comments and declarations When a FORTRAN source contains INCLUDEs TAPENADE reads and uses the include files During the analysis the internal representation contains only the symbol tables built from the declarations and include files The internal representation doesn t contain the textual declarations nor the INCLUDE commands any more This is why the produced differentiated programs have a declaration part which is completely rewritten For
42. INTEGER jj TYPE 0BJ1 b 5 TYPE 0BJ1 b 5 TYPE 0BJ1 b 5 TYPE 0BJ1 D bd 5 TYPE 0BJ1_B bb 5 COMMON param jj w COMMON param jj w COMMON param jj w COMMON param d wd COMMON param b wb Figure 6 Names of differentiated symbols used in the program in which case it appends 0 then 1 etc after the derivative name until conflicts disappear The way that differentiated symbols are built can be changed via command line options 4 2 Differentiated data types When a variable has a non trivial derivative TAPENADE builds and declares a new differ entiated variable that holds this derivative The type of this derivative comes from the INRIA TAPENADE 2 1 15 type of the original variable In simple cases when the variable is REAL REAL 8 DOUBLE PRECISION COMPLEX or any other primitive type the differentiated variable has exactly the same type When the original variable is an array the differentiated variable is also an array with the same dimensions When the variable is a LOGICAL INTEGER CHARACTER or any other primitive type for which no notion of derivative is defined things are even simpler since there is no differentiated variable at all Things are slightly more complex for user defined structured types The FORTRAN95 standard made the unfortunate choice to call them derived types Instead we shall call them structured types to avoid confusion with differentiation itself TAPENADE has a principle of keeping the
43. ISSN 0249 0803 VAINRIA INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE TAPENADE 2 1 user s guide Laurent Hasco t Val rie Pascual N 0300 Septembre 2004 Th me NUM apport technique TA INRIA SOPHIA ANTIPOLIS TAPENADE 2 1 user s guide Laurent Hasco t Val rie Pascual 7 Th me NUM Syst mes num riques Projet Tropics Rapport technique n 0300 Septembre 2004 78 pages Abstract This is the user s manual for the version 2 1 of the Automatic Differentiation tool TAPENADE Given a source computer program that computes a differentiable mathematical function F TAPENADE builds a new source program that computes some of the derivatives of F specifically directional derivatives tangent mode and gradients reverse mode This report summarizes the mathematical justifications of Automatic Differentiation then describes in full detail the differentiation model that TAPENADE implements Our goal is to give the users of TAPENADE a precise understanding of the actions and choices made while differentiating programs so as to improve their confidence in the produced source programs This report documents all the available options and parameterizations that the users can give to TAPENADE and conversely all the diagnosis and requirements that TAPENADE may issue towards the users After a brief description of TAPENADE s architecture and performances this report describes mo
44. If they are missing the obvious conservative assumptions are made The type part gives the types of each argument in the same order The syntax of a type is rather heavy It should be cleaned up in next versions For example real 8 is written modifiedTypeName modifiers intCst 8 ident real Arrays are built with the combinator arrayType that binds the array elements type with a list of dimensions dimColon which are pairs of the lower and upper bound So far these bounds are not very useful so you might just as well leave them as none Take a look at the default F77GeneralLib to find more examples The R part gives the read signature of each argument A 1 means the argument may be used at least partly inside the procedure not necessarily in differentiable expressions The W part gives the written signature of each argument 1 means the argument may be overwritten at least partly inside the procedure Each entry in MyADLib is about one procedure and gives info related to AD This can be the differentiable dependency matrix or also the partial derivatives for an intrinsic func tion These properties are declared using the conventions defined in the corresponding GeneralLib file about the rank assigned to each argument For example consider the following entry about subroutine BBOX subroutine bbox deps id 1 1 0 0 0 0 0 O 1 0 0 1 0 0 0 O 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0 O 0 0 0 0 0
45. TAPENADE s web page as it may evolve with time It is necessary to modify the test bed of figure 28 as explained in the Validation section yielding what is shown on figure 29 Computing the Divided differences requires as usual a first run of procedure FOO but we call FOO_D instead which anyway also computes the output of FOO but in addition stores the intermediate val ues selected by the user into a file called epsvalues The second run of FOOD effectively computes the tangent derivatives and in addition to that progressively recovers each value stored in epsvalues uses it together with the present intermediate value to compute the derivative with Divided Differences and immediately compares it with the tangent deriva tive and complain when the difference is more than epszero Distinction between these two behaviors of FO0 D is made through the global phase integer At any place in the source of FOO_D you may select a value so that its derivative is checked with the above mechanism simply by inserting procedure calls like the following call TRACEACTIVEARRAY1REAL T t td 1 10 which will compare at run time the Divided Differences and the tangent derivatives of ele ments 1 to 10 of array T and print a message if they differ These procedures are provided in the ADFirstAidKit directory of the standard installation 9 2 2 Debugging tools for the reverse mode As usual things are not so easy for the reverse mode but we also devised a r
46. The star in Y denotes transposition which makes it a row vector The weighting vector Y is an input to the differentiated program The gradient of Y x Y is therefore F X x Y where F is the transposed Jacobian Using eguation 2 the gradient writes FX x Y fi Xo x X1 x lt x fyr Xp 2 x fp Xp xY 4 because the transpose of A x B is B x A Again this is most efficiently evaluated from right to left because matrixx vector products are so much cheaper than matrix x matrix products This is the principle of the reverse mode of AD which is available in TAPENADE The program differentiated in reverse mode is often called the adjoint program of P written P This turns out to make a very efficient program at least theoretically 15 Section 3 4 The computation time reguired for the gradient is only a small multiple of the run time of P When the number m of outputs of F is small which is usually the case for optimization problems the reverse mode of AD can be used to compute the complete Jacobian row by row at a very reasonable cost In particular this cost is independent from the number of input parameters n which can be very large However we observe in equation 4 that the Xy are required in the inverse of their computation order If the original program overwrites a part of Xy the differentiated program must restore X before it is used by f X This is the main drawback of the reverse mode of AD There are t
47. This is slightly more than what the black box mechanism can provide Therefore until directives exist for that some hand coding will be necessary RT n 0300 70 Hasco t and Pascual 9 3 5 Loops with Independent Iterations Consider a loop whose iterations are independent i e they could be done in any order In this section we call iteration each instance of the loop body Precisely suppose all the instructions in this loop are either parallel or global sum reductions Call these loops IFloops This case is more frequent than it seems all gather scatter loops on meshes are IFloops It can be shown that an J loop is parallel and that its adjoint loop is parallel too so that the two loops can be fused into a single loop This is shown on figure 32 If we can Standard Improved doi 1 N body i body i end Figure 32 Optimization of adjoint II Loops manage to move the JFloop just before its adjoint loop we reach the state on the left of figure 32 Since the two loops are parallel and operate on the same iteration space they can be fused as shown on the right of figure 32 The advantage is that all the memory used to store the tape i e the intermediate values during the forward sweep body i is immediately used during the backward sweep body i that follows and can thus be reused for the next iteration The memory consumption is reduced by a factor N The resulting code is even faster due to a better locality and a
48. ables let alone the cost of checkpointing which is nearly always necessary cf section 3 2 Nevertheless the reverse mode remains a very interesting choice to obtain the derivatives of a small number of outputs with respect to a large number of inputs Name ALYA UNS2D THYC LIDAR STICS MARGARET Bonn GRD GRD Therm Optics Aaronom Nucor ie oss 230 207 ne 10 or fes rar 55 BB 380 0000 Pe 20 30 ajaj 4 ue aer ars 1020 2260 3530 00021 PAP sa oa a 20 ws 160 Memory 04 250 3834 165 280 oa Figure 26 Time and memory consumption on six validation codes Figure 26 shows the times in seconds and the memory consumption for intermediate variables and snapshots in megabytes Measurements were made on a PC running Linux with GNU or PGF compilers and or on a SUN workstation with the SUN 77 compiler Changing the compiler doesn t change the ratios significantly INRIA TAPENADE 2 1 59 Tree protocol reader Useful i Program previous internal representation results LEGEND Type Checking Symbol Tables and Types Pointer analysis Pointer destinations Read Written analysis Read Written sets Detection of Aliasing Error messages Detection of Uninitialized variables Error messages Dependency analysis Dependencies Differentiable Dependency analysis Diff dependencies Activity a
49. accordingly In the example of figure 10 x immediately becomes original program TAPENADE tangent TAPENADE reverse 1 0 1 0 1 0 x y x yd x y y 2 Z x y y 2 IF t GT 100 t y 2 IF t GT 100 IF t GT 100 ogee b x zb Figure 10 Instructions simplifications due to Activity Analysis not varied and t is useless Therefore TAPENADE knows that xd and tb are null they can be simplified and even never computed Resetting them explicitly to zero would be just a waste of time We shall say that these derivatives are implicit null Symmetrically td and xb are non null but do not matter and therefore need not be evaluated However if control flow merges later downstream and the other incoming flow has an explicit non null derivative for this variable TAPENADE is forced to reset explicitly the corresponding implicit null variable just before control flow merges This has is a somewhat puzzling consequence some of the user given independent and dependent variables may turn out to be inactive after activity analysis If so TAPENADE removes them automatically which may be surprising The next example on figure 11 illustrates this TAPENADE summarizes its decisions about activity of variables by setting comments before each differentiated procedure Figure 11 shows these comments on a small subroutine S1 for which tangent and reverse differentiation were required for dependents vi v3 vb v6 v7 with respect to
50. actually compiles Differentiation is undefined What can you do Rewrite the definitions of these types TC47 Definition of type T comes a little too late The definition of type T came after it was used This is just a warning because TAPENADE will do as if the definition was done in due time But you should better fix this Label Lis not defined Label variable V must be written by an ASSIGN TO construct Variable V used in assigned GOTO is not a label variable These are illegal uses of labels or label variables Why should you care If labels are misunderstood the final flow of control that TAPENADE understands and regenerates might not be the one you expect What can you do Define the labels you use Use the correct constructs to manipulate label variables 6 3 3 Control flow problems These messages indicate problems in the flow of control of the program Message CF01 comes from jumps that go from outside to inside a loop without going through the loop con trol header The meaning of the program is not well defined and is probably implementation dependent Why should you care Of course a bad control of a program yields a bad control of its differentiated programs INRIA TAPENADE 2 1 49 What can you do Avoid irreducible loops Avoid jumps into loops and into branches of a conditional structure Try to use structured programming instead If an assigned GOTO goes to only one place at least replace it by a plai
51. ad the assignment by a user subroutine During its type checking phase TAPENADE decides which procedure or predefined oper ator calls are overloaded and replaces them by calls to the appropriate procedures Then differentiation can proceed like for any other procedure call Figure 15 shows an example of that where and are overloaded as TM and TS respectively when their arguments are of the user defined structured type T Similarly function F is overloaded either as TF or RF according to the type of the first argument Overloading is often combined with modules so this example anticipates on section 4 10 for this aspect One can verify on the tangent and reverse differentiated results that the overloaded procedures are detected replaced and finally differentiated following the standard differentiation model of TAPENADE cf sec tion 4 8 Notice also that the overloaded form is preserved in the differentiated procedure whenever possible However the calls to TF_D and TF_B could also be replaced by calls to more general overloaded interface functions F_D and F_B following the pattern of interface function F This will probably be done in future versions of TAPENADE 4 10 Overall Modular Structure FORTRAN95 files have an overall structure of nested modules and internal external proce dures with interfaces The structure of FORTRAN77 is much flatter in comparison TAPE NADE produces a differentiated code that reproduces this modular struct
52. ally leads to calling F1 again In other words this is a cycle in the call graph Basically this is not an error although FORTRAN77 used to forbid recursive programs Why should you care It is just a matter of time Recursive programs are harder to analyze We are progressively updating TAPENADE analyses so that they cope with recursivity What can you do In the meantime check that the resulting derivatives are correct using the classical tests Could not find type of argument N of procedure F please check Could not find type of result of function F please check During differentiation TAPENADE had to split an expression near a procedure call either precomputing some argument of the procedure or the function s result into a temporary variable Sometimes TAPENADE cannot guess the type of this temporary variable This is probably a temporary bug of TAPENADE RT n 0300 48 Hasco t and Pascual Why should you care The temporary variable may be of a wrong type and this may loose activity What can you do Insert the temporary variable by hand into the source program and declare it of the correct type No field named Sin T Illegal recursive type definition These are illegal uses of types TC45 says the program looks for a field name which does not exist and TC46 that some type recursively contains a field of the same type thus yielding an infinite size Why should you care It is surprising if the original program
53. am P To begin with AD assumes that P represents all its possible run time sequences of instructions and it will in fact differentiate these sequences Therefore the control of P is put aside temporarily and AD will simply reintroduce this control into the differentiated program In other words P is differentiated only piecewise Experience shows that this is reasonable in most cases and going further is still an open research problem Then any sequence of instructions is identified with a composition of vector functions each of which is assumed differentiable Thus for a given control P is h h 1p 1 F fp fp 19 Offi where each fp is the elementary function implemented by instruction 14 In general the functions implemented by the arithmetic operations of the programming language are indeed differentiable However in some rare cases this is not true For example the square root is defined on a null argument and its derivative is not Again experience shows that this assumption is verified in most cases Finally AD simply applies the chain rule to obtain derivatives of F If we write for short Xy the values of all variables after each instruction Ig i e Xy X and X fk Xk 1 the chain rule gives the Jacobian F of F F X Xp X fpa Xp 2 X X f Xo 2 which can be mechanically translated back into a sequence of instructions 1 and these sequences inserted back into a copy of the control of P yielding program P
54. ample for SQRT on zero or exponentiation x y when x is zero NaNs can also appear for a slightly different reason for some given inputs some functions have an output which is large but not yet in overflow whereas the derivative is definitely in overflow Think of 1 x whose derivative is 1 x 2 Again this requires a special treatment although a brute force solu tion could be to switch to double precision However there are situations where TAPENADE can do a slightly better job automatically For SORT in the tangent mode instruction a SQRT b generates differentiated instruction ad bd 2 0 SQRT b which returns NaN if b is zero But if bd is also zero we feel it is safe to return zero in ad too Thus in this case TAPENADE avoids the above instruction and sets ad to zero instead Something similar will be done in the reverse mode in the future versions Another general remark is that TAPENADE emits a number of warning messages during differentiation In section 6 3 we describe some messages that may actually indicate why the derivatives are invalid Think of all programming techniques where REAL values are stored temporarily into INTEGERs or into files therefore loosing their derivatives Also TAPENADE performs a number of optimizations on the differentiated program which can be delicate and certainly not bug free A good strategy can be to retry differentiation this time after INRIA TAPENADE 2 1 63 disactivating optimizat
55. an array and the size of this array is dynamic TAPENADE must know this size to actually do the save Why should you care If you don t give this size the differentiated program will not compile What can you do Define the value of the constants PARAMETERs that TAPENADE requests in the special include file named DIFFSIZES inc You may also explicitly give this size in the original file so that the message does not show up INRIA TAPENADE 2 1 l Please provide a differential of function F for arguments Signature You must provide the differentiated program with a new function differentiated from F with respect to the input and output parameters of F specified in the Signature part Probably F is a black box function for which TAPENADE has no source and therefore it couldn t differentiate it Why should you care If you don t give this new function the differentiated program will probably not compile What can you do Define this new function You have the choice of the method Maybe you know explicitly the derivative mathematical function which can be implemented efficiently Maybe you have the source and you may use AD again with TAPENADE or with another tool Maybe you just want to run a local divided differences method knowing that this might degrade the precision of the derivatives In any case make sure that you provide a differentiated procedure in the correct mode tangent reverse and which adheres to th
56. aph 12 13 36 39 47 53 55 57 67 call tree see call graph chain rule 8 60 checkpointing 11 12 23 compilation 6 8 13 52 cotangent mode see reverse mode data assimilation 9 data dependency 28 30 31 data locality 32 38 70 debugging 62 67 dependent output 13 18 36 53 RT n 0300 77 derived types see structured types direct mode see tangent mode directional derivative 9 divided differences test 60 dot product test 60 63 67 downstream 11 duplicate sub expressions 32 33 flow graph 20 39 48 57 flow reversal 20 forward sweep 10 12 20 23 69 70 generalization 16 23 gradient 9 33 69 gradient based optimization 9 independent input 13 18 36 53 initializations 17 20 22 31 38 49 67 70 Input Output 22 23 50 inverse problems 9 Jacobian 8 modules 25 28 multi directional mode 28 30 38 52 non decidability 67 O yss e 7 overloading 25 parallelism 17 30 42 70 procedures 12 20 23 25 28 47 PUSH POP and LOOK 20 23 31 39 52 63 Recompute All 9 11 recursivity 47 reverse mode 9 12 16 20 23 31 34 36 43 50 52 58 60 63 67 root procedure 36 53 60 78 sensitivities see directional derivative side effects 22 32 42 49 50 62 snapshot 11 23 34 38 58 67 specialization 16 23 static analyses 31 67 Store All 10 11 12 20 symbol names 14 TAF 10 tangent mode 9 11 16 36 58 60
57. as no active input nor output After activity analysis it may turn out that the root differentiation procedure has no dependent which actually depends on an independent cf section 4 5 Then after simplifi cation no active input nor output remain and there is nothing left to differentiate Why should you care The differentiated program is just empty What can you do Check the dependents and independents that you specified so that they effectively depend on one another RT n 0300 52 Hasco t and Pascual 6 4 Specifying required size information Messages AD10 AD11 and AD12 require that the end user defines the value of some constants in a special include file named DIFFSIZES inc This is necessary to compile the differentiated program which uses these constants There are two sorts of such constants e the constant that defines the maximum number of simultaneous differentiation di rections in multi directional mode cf section 4 11 This constant is named usually nbdirsmax unless the name is already used in the program Of course the user is free to choose the value e the constants that give the size of some dimensions of some new arrays introduced by differentiation These constants are usually named SIZEnOFuIN where n is the rank of the dimension whose size is missing v is the name of the differentiated array for which the size is missing and fis the name of the procedure in which the problem occurs TAPENADE us
58. ays with their actual size not 1 Never resize reshape or split arrays implicitly through parameter passing and avoid to do so across COMMONs INRIA TAPENADE 2 1 47 Type mismatch in argument N of procedure F was T 1 is here T2 TC31 Type mismatch in result of function F was T1 is here T2 These messages indicate conflicts between declaration and usage of some procedure For most of these messages programs which raise them should not compile Messages TC30 to TC32 indicate a mismatch between the formal and actual parameters of a procedure or between two declarations Recall that the standards require that formal and actual parameters match in number and type Why should you care In general type mismatches mean that some variables may not get the type you intended and this in turn impacts differentiation Messages TC30 to TC32 may indicate that the code relies on FORTRAN weak typing to perform hidden EQUIVALENCEs between two variables or to resize reshape or split an array into pieces Since TAPENADE cannot possibly understand what is intended it might not propagate dif ferentiability and derivatives from the old array shape to the new array shape What can you do Insert the correct declarations and make sure each procedure usage matches its definition Recursivity F1 gt F2 gt gt Fn Recursivity means that a procedure F1 calls a procedure F2 which itself calls a procedure F3 and so on and this eventu
59. by comparing them with divided differences i e by applying the well known formula for a differentiable function f of a scalar variable x JR e 0 E 5 We recall that for a given direction X in the input space the output of the tangent mode should be Y F X x X Introducing function g of scalar variable h g h F X hxX expanding g at input h 0 with equation 5 on the left hand side and with the chain rule on the right hand side tell us that F X X F X ey a lim ATX FO L 0 P X x X 6 E so that we can approximate by running F twice on X and on X te xX The results of the reverse mode are validated in turn by using the validated tangent mode This is called the dot product test It relies on the observation that for any given X the result Y of the tangent mode can be taken as the input Y of the reverse mode yielding a result X We then develop the dot product of X and X X X F X x Y X V x F X x X V xY Y Y 7 so that we can compare X returned by the adjoint code with the Y returned by the tangent mode This results in the following framework to validate the code produced by TAPENADE shown on figure 28 Suppose the root differentiated procedure that implements the function INRIA TAPENADE 2 1 61 SUBROUTINE VALIDATION X Y n m INTEGER n m REAL X n Y m REAL Xsave n Xdsave n Xd n Xb n REAL Yeps m Ydeps m Yd m Yb m REAL eps normEps normTgt normAdj Cho
60. ch are each a Call Graph of Flow Graphs dump lt file gt Writes internal representation of the program differentiated program and analyses results into a large dump file named lt file gt RT n 0300 40 Hasco t and Pascual 6 2 Giving information on black box procedures In addition to command line options the user can actually should give information on each black box procedure used by the program We call black box any procedure whose source is not given to TAPENADE either because it comes from a compiled library or because its source language is not understood by TAPE NADE or even because the user prefers to differentiate the procedure by hand and therefore does not want TAPENADE to differentiate it Black box procedures cannot be analyzed normally by TAPENADE for example for type checking read written or activity analysis 3 3 If nothing is known on the black box procedure TAPENADE makes conservative assumptions For example that each output may depend on each input This has dramatic consequences nearly all variables downstream become varied nearly all variables upstream become useful many variables become active and the differentiated code is still correct but less efficient TAPENADE allows the user to give precise information on black box procedures in special configuration files one for the AD independent information e g MyGeneralLib and one for the AD related information e g MyADLib Ea
61. ch entry in MyGeneralLib is about one procedure and gives info about number type and intent of arguments For example you may type for a BBOX subroutine subroutine bbox external shape param 1 param 2 common c1 0 common c2 0 16 common c2 16 176 param 3 param 4 param 5 type ident real ident real arrayType ident real dimColons dimColon none none modifiedTypeName modifiers ident double ident real arrayType modifiedTypeName modifiers ident double ident real dimColons dimColon none none ident boolean arrayType ident character dimColons dimColon none none ident character R 1 1 0 1 1 1 1 1 W 0 1 1 1 1 1 0 0 Similarly for a function called FBBOX you may type INRIA TAPENADE 2 1 41 function fbbox intrinsic shape param 1 param 2 result type ident real ident real modifiedTypeName modifiers ident double ident real R 1 1 0 W 0 0 1 The external line states that the procedure is external For an intrinsic just replace this line by intrinsic The shape part is necessary because it defines the arbitrary order in which arguments will be referred to in the sequel Tts syntax is either param N that designates the N th formal parameter or result for the returned value if BBOX is a function or common xx 0 8 for an argument which is found from offset 0 to offset 8 in common xx All following lines are optional
62. ched for general specifications of black bor procedures This even removes the initial F77GeneralLib file from the list Adds lt file gt to the list of files that contain AD related spec ifications of black box procedures Initially this list contains one predefined file named F77ADLib which is found in the lib directory of the TAPENADE installation Empties the list of files searched for AD related specifica tions of black box procedures This even removes the initial F77ADLib file from the list 38 Hasco t and Pascual 6 1 3 Differentiation options These options affect the differentiation process itself diffvarname lt str gt Specifies how the name of a differentiated variable should be built from the name of the original variable If this option is not used the default appends d in tangent mode and b in reverse mode difffuncname lt str gt Specifies how the name of a differentiated procedure mod ule COMMON name or type name should be built from the name of the original object If this option is not used the de fault appends _D in tangent mode DV in multi directional tangent mode and _B in reverse mode nooptim lt optim_name gt Specifies that optimization lt optim_name gt must be disacti vated for this differentiation process This might be useful for instance for teaching or debugging The optimization lt optim_name gt can be one of the following e spareinit suppresses unnecessary computation or r
63. e conventions of TAPENADE about the name of the procedure and the order of its parameters cf section 4 8 AD10 User help requested constant V must hold the maximum number of differentiation direc tions User help requested constant V1 must hold the size of dimension N of V should be Hint AD12 User help requested unknown dimension N of array V During differentiation TAPENADE must declare new arrays whose size can be a dynamic number unknown during differentiation Such declarations are impossible in FORTRAN77 Therefore TAPENADE declares these arrays with a size which is a constant and asks you to define this constant before you compile the differentiated program Similarly AD10 requires you to give the size of the new dimension that holds the array of derivatives in multi directional differentiation cf section 4 11 Why should you care The differentiated program will simply not compile otherwise What can you do In this case TAPENADE automatically inserts an INCLUDE command in the differentiated files and the include file is named DIFFSIZES inc Create this file and fill it with all the required definitions of constants TAPENADE gives you hints about the values you should put Be careful to define these sizes as constants PARAMETERs and not as variables cf section 4 11 Again FORTRAN77 does not allow you to declare a local array with a size which is not known at compile time AD13 Differentiation root procedure F h
64. e factorization effort which was probably already done during the forward sweep of the reverse code in the first call to AUTOADJ X Y N L U 9 3 4 Iterative resolutions Consider an iterative resolution where only the final state matters All the intermediate states starting from the initial first guess do not really matter for differentiation Actually there are many clever strategies to compute the derivatives and especially the gradient of an iterative resolution One way to view it is to consider that only the last converged state matters If resolution iteratively goes from initial guess state sy to converged state S the forward sweep of the reverse mode does not need to keep track of all intermediate values during steps sg to sg 1 It can keep the last state s only and the backward sweep will compute the adjoints iteratively staying on state sz This is not the only strategy available and there are still research problems in this question 17 19 For example the resolution that takes place during each iteration step is probably well adapted for solving the state s but nothing proves its adjoint is the best for solving the adjoint state T Other strategies use a different solver for the adjoint state In any case this decision is up to the end user An AD tool must help the user by providing the adjoints of basic bricks of the iterative program and by letting open the choice of the strategy to glue these adjoint bricks together
65. e initialization of derivatives when they are implicit null or useless cf section 4 5 e diffarguments refuses to insert a differentiated argument after a procedure argument which is never active neither be fore nor after any call cf section 4 8 e splitdiff finds a minimal split of expressions so as to reduce common subexpressions in the derivatives cf sec tion 5 4 e mergediff merges differentiated instructions to reduce overhead and improve data locality cf section 4 11 for multi directional mode and section 5 2 for the reverse mode e tbr uses To Be Restored analysis to reduce memory con sumption in the reverse mode cf section 5 1 e adjointliveness removes instructions that are dead in the adjoint code cf section 5 5 e deadcontrol removes control which is dead in the adjoint code e snapshots reduces the snapshot taken before each proce dure call in the reverse mode using adjoint Read Written analysis cf sections 4 8 and 5 5 INRIA TAPENADE 2 1 39 6 1 4 System options These options specify system information Sizes of primitive types may also influence differ entiation because they influence the way EQUIVALENCEs are solved as well as the memory size passed to PUSH and POP subroutines in the reverse mode java_home lt dir gt Specifies the directory lt dir gt where the JAVA interpreter is If this option is not used the default is usr local jdk1 4 1 javaheapsize lt size gt Specifies the
66. e not treated well For example TAPENADE does nothing to keep the differentiation of a parallel loop par allel Consider a loop with the well known CRAY directive IVDEP meaning that a loop is vectorizable The differentiated loop in the reverse mode should be vectorizable in most cases However TAPENADE will probably have inserted several calls to PUSH and POP which make the loop non parallel This problem can be solved by changing the way the tape is stored inside parallel loops 10 6 The Recompute All strategy The PUSH and POP strategy to store the tape and the snapshots is relatively versatile and robust However in many cases it is not the most efficient approach At some locations in the program other approaches can be proposed probably triggered by directives From more robust to more efficient we might e use PUSH and POP in an external memory allocated dynamically e use PUSH and POP to statically allocated memory e replace the PUSH and POP by assignments to local storage arrays e use selected variables in Single Assignment mode i e use new different variables to avoid overwriting If a variable is never overwritten there is no need to save it RT n 0300 74 Hasco t and Pascual More generally the alternative Store All approach sometimes gives better performances In our research activity we are investigating optimal trade offs between RA and SA strate gies where some variables are stored and others are r
67. e used e g in the context of a MPrlike parallel program These two black box procedures communicate through a side effect which must be declared somehow to TAPE NADE This can be achieved by introducing a dummy global communication channel here a dummy COMMON and using it in the black box signatures of SEND and RECEIVE This way sending an active variable will make the received variable active too To this end one may insert into the MyGeneralLib file subroutine SEND shape param 1 common CHANNEL 0 type ident real ident real R 1 1 W 0 1 subroutine RECEIVE shape param 1 common CHANNEL 0 type ident real ident real R 0 1 W 1 1 and into the MyADLib file subroutine SEND deps id 1 1 subroutine RECEIVE deps 0 1 0 1 The end user is responsible for providing the appropriate differentiated procedures for SEND and RECEIVE INRIA TAPENADE 2 1 43 6 3 Messages issued by TAPENADE TAPENADE issues a number of messages resulting from the preliminary analyses done before actual differentiation Although the temptation is strong these messages should not be ignored right away Especially when AD is concerned these messages can be the indication that the program runs into one limitation of the AD technology Generally speaking com pilers often permit to go against the standard with no visible harm but this often introduces errors into the program differentiated in reverse mode So even
68. e vocabulary and x original form of X E forward sweep for X B D ES DT Gus d take snapshot Figure 5 Checkpointing on calls in TAPENADE reverse AD graphical notations Execution of a procedure A in its original form is shown as A The forward sweep i e execution of A augmented with storage of variables on the stack just before they are overwritten is shown as A The backward sweep i e actual computation of the gradient of A which pops values from the stack when they are needed tor restore the X s is shown as A With no checkpointing the adjoint program is just A T A TAPENADE applies checkpointing to every call to a user procedure call such as Bin A Procedure B will be run without storage during A When the backward sweep A reaches B it runs B ie B again but this time v with storage and then immediately it runs the backward sweep B and finally the rest of A Duplicate execution of B requires that some variables used by B a snapshot be stored Figure 5 shows the resulting differentiated call graph for an example initial program call graph If the program s call graph is actually a well balanced call tree the memory size as well as the computation time required for the reverse differentiated program grow only like the depth of the original call tree i e like the logarithm of the size of P which is satisfactory 3 3 Activity In many real situations the end users of AD need only the derivatives of some se
69. ecomputed according to properties of the data flow or data dependence graphs This may result in new functionalities of TAPE NADE in the long run References 1 A Aho R Sethi and J Ullman Compilers Principles Techniques and Tools Addison Wesley 1986 2 M Berz C Bischof G Corliss and A Griewank editors Computational Differentia tion Techniques Applications and Tools SIAM Philadelphia PA 1996 3 M Buecker G Corliss P Hovland U Naumann and B Norris editors AD2004 Post Conference Special Collection Lecture Notes in Computational Science and En gineering Springer 2004 to appear 4 A Carle and M Fagan ADIFOR 3 0 overview Technical Report CAAM TR 00 02 Rice University 2000 5 G Corliss Ch Faure A Griewank L Hasco t and U Naumann editors Auto matic Differentiation of Algorithms From Simulation to Optimization Computer and Information Science Springer New York NY 2001 6 F Courty A Dervieux B Koobus and L Hasco t Reverse automatic differentiation for optimum design from adjoint state assembly to gradient computation Optimization Methods and Software 18 5 615 627 2003 7 P Dutto C Faure and Fidanova S Automatic differentiation and parallelism In Proceedings of Enumath 99 Finland 1999 8 Ch Faure and P Dutto Extension of Odyss e to the MPI library Reverse mode Technical Report 3774 INRIA 1999 9 Ch Faure and U Naumann Minimi
70. edure There is a distinction between the two new integer variables nbdirsmax and nbdirs Variable nbdirsmax is actually a constant which defines the size of the extra dimension in derivatives Therefore it is the maximum number of simultaneous differentiation directions that can be handled by the differentiated program It must be defined as a constant with its static value e g 25 in the include file named DIFFSIZES inc for example as INTEGER nbdirsmax PARAMETER nbdirsmax 25 There is a comment in the differentiated program and also a warning message during differentiation AD10 cf section 6 3 to remind the user of this On the other hand nbdirs is the actual number of differentiation directions for one particular run It is a run time parameter passed to each differentiated subroutine used to avoid computing derivatives along undefined directions Of course nbdirs must be smaller or equal to nbdirsmax INRIA TAPENADE 2 1 31 5 SPECIFIC REFINEMENTS FOR THE ADJOINT The above section already made it clear that the reverse mode is far more complex than the tangent mode This is the price for efficient gradient computation This section presents specific aspects of the model of reverse AD This is necessary to fully understand some particular structures in adjoint codes that would appear very strange otherwise Most of these refinements can be disactivated via command line options for example for training purposes and also for
71. execution of the dot product test 65 Procedure calls inserted for split dot product test 66 Optimization of adjoint II Loops 70 Partly automatic reverse differentiation for Floops 71 RT n 0300 6 Hasco t and Pascual 1 INTRODUCTION As computational power increases the domains of computational simulation optimization and inverse problems are developing rapidly They widely use derivatives When a function is already discretized and solved Automatic Differentiation AD can return its derivatives without going back to the discretization step AD transforms a program that computes or simulates a mathematical vector function into a new program that computes derivatives of this function Further information about AD can be found in the monograph 15 and in the collection of articles from the past AD conferences 16 2 5 3 These articles cover most topics related to AD from theoretical questions in computer science and numerical science to AD tools development and industrial applications AD is a program transformation and is therefore performed by software tools similar to compilers or parallelizers This is the user s guide to TAPENADE an AD tool with a strong focus on the reverse mode that computes gradients Our guideline in this development is to reuse and transpose technology from the compilation field 1 to AD in order to produce efficien
72. he list of all output variables of the root procedure n SUORE Specifies the independents i e the list of the input variables of the root procedure with respect to which the derivative is requested Variables which are not input local variables or constants or func tion return are just ignored lt vars gt is a list of variable names be tween double quotes The double quotes may be omitted when there is only one variable in the list If this option is not used the default is the list of all input variables of the root procedure d Calls for differentiation in the tangent mode The short name d is tangent a reminder for dot or the deprecated name direct This option is the default if neither d b nor p option is used b Calls for differentiation in the reverse mode The short name b is multi Modifies the current differentiation mode to turn it into multi directional cf section 4 11 In the current version of TAPENADE this option applies only to the tangent mode p Does no differentiation at all and therefore ignores the outvars Preprocess and vars options This mode produces a program which is equiva lent to the input program but analyzed and restructured by TAPE NADE This option is also useful for debugging purposes INRIA TAPENADE 2 1 37 6 1 2 File information options These options control the style and location of several files that TAPENADE will look for inputlanguage lt lang gt
73. i LOG a i a i 1 IF a i LT 0 0 THEN IF a i LT 0 0 a i 2 a i CALL PUSHREAL4 a i END IF a i 2 a i ENDDO CALL PUSHINTEGER4 3 END ELSE CALL PUSHINTEGER4 2 END IF ELSE CALL PUSHINTEGER4 1 END IF ENDDO CALL PUSHINTEGER4 i 7 SUBROUTINE 1_D a ad n x CALL POPINTEGER4 ad to Paes DO i ad_to 2 7 DO i 2 n 7 CALL POPINTEGER4 branch IF a i GT 1 0 THEN IF branch GE 2 THEN ad i ad i a i ad i 1 IF branch GE 3 THEN a i LOG a i a i 1 CALL POPREAL4 a i IF a i LT 0 0 THEN ab i 2 ab i ad i 2 ad i END IF a i 2 a i CALL POPREAL4 a i END IF ab i 1 ab i 1 ab i END IF ab i ab i a i ENDDO END IF END ENDDO Figure 12 Differentiation of the flow of control RT n 0300 22 Hasco t and Pascual 4 7 Input Output I O procedures Input Output I O procedures are essentially not differentiated because they do not par ticipate in the numerical computation However TAPENADE takes care of some side effects that affect derivatives When a variable is overwritten by an I O procedure e g a READ it becomes not varied downstream and useless upstream Either its derivative can become implicit null or TAPENADE automatically forces reinitialization of its derivative to zero However the end user should check that this is the behavior wanted and therefore TAPE NADE sends a warning message ADO5 at differentiation time Please refer to section 6 3 for a complete desc
74. ide effect communication of the REAL value Why should you care If the original REALs are active i e depend on the indepen dent input variables and the final REALs are useful i e influence the dependent output variables then this temporary conversion into INTEGERs has lost activity and derivatives Differentiability is lost derivatives are wrong What can you do REALs must remain REALs If absolutely necessary there is a secret option in TAPENADE to actually differentiate these strange REAL INTEGERs but we won t write it in this manual Active variable V written by I O to file File Useful variable V read by I O from file File This is comparable to the above AD01 and AD02 These two messages are usually associated They indicate the following side effect situation Some programs write interme RT n 0300 50 Hasco t and Pascual diate REAL values into a file and later on read back these REAL values from this file This process looses differentiability cf section 4 7 Why should you care If the written value is active i e depends on the indepen dent input variables and the corresponding equal value read is useful i e influences the dependent output variables then this temporary storage into a file has lost activity and derivatives Differentiability is lost derivatives are wrong What can you do Do not use files as temporary storage Use arrays instead You may also use TAPENADE black box mechanism to disco
75. ines according to the type of the stacked value Its internal representation of programs as Flow Graphs allows TAPENADE to use struc tured programming in the backward sweep like in the forward sweep using very little memory space to store the control and with no restriction on the original control constructs all sorts of GOTO s arithmetic IF s alternate procedures or I O returns The principle is the right time to store the control is when the original control flow merges and what must be stored then is where the control actually came from For iterative constructs TAPENADE tries to store an iteration counter which is a single integer so that the backward sweep can also feature an iterative loop For most control constructs the backward sweep tries to use the same constructs Thus the reverse of an IF is an IF the reverse of a SELECT CASE is a SELECT CASE the reverse of a DO loop is most often a DO loop etc In particular situations of loops with multiple entries or exits TAPENADE must store more control information on the forward sweep and the backward sweep may be unable to keep a clean DO loop struc ture Figure 12 illustrates how TAPENADE builds the control structure of the differentiated procedures INRIA TAPENADE 2 1 21 original program TAPENADE reverse forward sweep SUBROUTINE Si a n x DO i 2 n 7 elos IF a i GT 1 0 THEN DO i 2 n 7 CALL PUSHREAL4 a i IF a i GT 1 0 THEN a i LOG a i a i 1 a
76. ions with the nooptim option cfsection 6 1 If the problem appears only with some optimization it is a precious indication you can give us A last general remark before we examine specifically debugging of the tangent mode and of the reverse mode is that TAPENADE makes hypotheses on the memory size occupied by primitive data types For example a DOUBLE PRECISION number is supposed to take 8 bytes This is crucial for solving equivalences an different splittings of a given COMMON and in the reverse mode this conditions the way a variable is PUSHed and POPped If TAPENADE size hy potheses are different from the actual sizes on the target architecture then the differentiated program is likely to crash segmentation fault or give wrong results In the ADFirstAidKit directory of the standard installation the testMemSize program can help you find out these actual sizes and the i r and dr command line options must be used to modify the default choices of TAPENADE accordingly 9 2 1 Debugging tools for the tangent mode For the tangent mode we devised a way to locate the origin place of the mismatch The general idea is to run two computations of derivatives using Divided Differences on one hand using TAPENADE s tangent code on the other hand checking intermediate derivatives on the fly at places selected by the user The first difference probably indicates where the bug lies This is documented in the Validation section of the FAQ page in
77. lected outputs of P with respect to some selected inputs of P Whatever the differentiation mode tangent reverse these restrictions allow the AD tool to produce a much more efficient differentiated program Essentially fixing some inputs and neglecting some outputs allows AD to just forget about several intermediate differentiated variables This has two main consequences INRIA TAPENADE 2 1 13 e several differentiated variables just disappear from the differentiated code because they will contain either null or useless derivatives Memory usage of the differentiated code becomes smaller e several differentiated instructions are simplified or erased because one of their deriva tive arguments has a known trivial value Execution time of the differentiated code becomes shorter Like most other AD tools TAPENADE has a specific analysis known as activity analysis that detects these situations therefore allowing for a better differentiated code TAPENADE allows the end user to specify that only some output variables the depen dent must be differentiated with respect to only some input variables the independent We say that variable y depends on x when the derivative of y with respect to x is not trivially null We say that a variable is varied if it depends on at least one independent Conversely we say that a variable is useful if at least one dependent depends on it Finally we say that a variable is
78. ll probably feature the multi directional reverse mode as well The multi directional differentiation model is based on spreading each memory cell that holds a single derivative instead of holding just a scalar it now holds an array whose dimension is the maximum number of differentiation directions X or Vu Figure 17 shows multi directional tangent differentiation of function FF yielding subroutine FF_DV Several things must be noted e Differentiated variables have an extra dimension with respect to their original variable In the case of arrays this new dimension is put first so that when one element of the array is passed to a procedure e g y i in the call to F2_DV the array of all its derivatives may be passed along e g yd 1 i e Since FORTRAN does not allow a function to return an array differentiated functions with an active result must become subroutines This is what happens here to function FF which becomes a subroutine FF DV with output parameters ff and ffd e A simple instruction such as an assignment produces now a differentiated instruction which is a loop that iterates for each differentiation direction To minimize loop over head TAPENADE performs a data dependence analysis to reorder the differentiated instructions The goal is to create sequences of differentiated loops as long as pos sible and then to fuse these loops because they have the same iteration space and INRIA TAPENADE 2 1 29 original
79. locations One can then choose new locations to further reduce the place where the problem lies RT n 0300 66 Hasco t and Pascual In practice one needs to start from the initial test bed shown on figure 28 and in sert the appropriate subroutine calls inside FOO_D and FOO_B at all locations L j 0 p not forgetting Lo at the very beginning nor L at the very end Figure 31 shows what must be inserted at each location L supposing that the active variables X at this loca tion are exactly u v and array t whose dimension is e g t 3 s therefore equivalent to a mono dimensional array of size 3 s Keep in mind that the X are stored into a stack and therefore must be popped from the stack in the reverse order Subroutines original program TAPENADE tangent TAPENADE reverse backward sweep ud 3 0 ud ub v vb 3 0 u 2 0 u 3 0 u 2 0 u vb call DPFWDREAL8 ud DPREVREAL8ARRAY tb 3 s call DPFWDREAL8 vd DPREVREAL8 vb call DPFWDREALSARRAY td 3 s DPREVREAL8 ub call DPFWDDISPLAY Mid point DPREVDISPLAY Mid point vd u vd ud v v u v 3 0 ub Figure 31 Procedure calls inserted for split dot product test DPFWDREAL8 DPFWDREAL8ARRAY DPFWDDISPLAY and the corresponding reverse DPREVREAL8 DPREVREAL8ARRAY DPREVDISPLAY are all provided in the ADFirstAidKit directory For each location L 4 for j 0 up to p 1 FOO_D prints the dot product Xi Xj41 followed by the name of L 1 For each location
80. ly the storage of EN ho Li AAA ee m L 1 2 p Figure 3 The Store All strategy intermediate values just before they are overwritten by an instruction J and their corresponding retrieval during the backward sweep Brute force SA strategy has a INRIA TAPENADE 2 1 11 linear memory cost with respect to p The ADIFOR 4 and TAPENADE tools use this strategy In all cases these back and forth strategies are a problem regarding vocabulary What would mean before or after a given instruction In the sequel before and after refer to the global execution order of the complete differentiated program On the other hand we shall use downstream and upstream to refer to the order of the corresponding original instructions In other words in the backward sweep the instruction after is also upstream to the original program s origin and the instruction before is also downstream to the original program s end All specialists of the reverse mode e g salmons will testify how much harder it is to go upstream 3 2 Checkpointing in the reverse mode On real large programs neither the RA nor the SA strategy can work Both need a special storage recomputation trade off in order to be really efficient This trade off is called check pointing The idea is to select one or many sub parts S of the program P possibly nested For each S one can spare some repeated recomputation in the RA case some memory in the SA case at the cost
81. n previous Undehned constant vale Bap SSCS DD14 Undefined constant value Exp DD15 Module M is not defined DD16 Double declaration of module M ignored new DD17 Symbol Sis not defined in module M DD18 Incorrect public private declaration 20 Cannot combine successive declarations of interface I ignored new EZH E ED Ea EA ES anta EA E EA In general these messages indicate that there are two successive definitions or partial declarations of some symbol as a variable constant type label or procedure or module name These successive declarations occur in the same scope and do not combine into a coherent declaration for the symbol TAPENADE therefore chooses either to ignore the new piece of declaration or to overwrite the previous piece of declaration as indicated by the message For DD08 the two choices are equivalent Why should you care TAPENADE chooses to ignore the indicated declaration or part of your original program Therefore the types considered by TAPENADE may not be the ones you expect or the program actually differentiated may differ from the program you think In particular when types are concerned remember that differentiation occurs only on REAL or COMPLEX typed variables Therefore some variable may have no derivative whereas you think it should have one Also for DD10 or DD13 the flow of control may be different from what you think or from what your local compiler used to build Als
82. n GOTO 6 3 4 Data flow problems DFO1 Illegal non reference value for output argument N of procedure F DF02 Potential aliasing in calling function F between arguments Args Variable V is used before initialized Variable V may be used before initialized These problems are detected through the built in inter procedural data flow analysis of TAPENADE Why should you care Programs with DF01 usually provoke segmentation faults Aliasing DF02 is a major source of errors for Differentiation in the reverse mode as illus trated in section 5 3 Uninitialized variables lead to uninitialized derivatives What can you do Output formal arguments of a procedure must be passed reference i e writable actual arguments It is better to avoid aliasing When aliasing is really a problem one can easily avoid it by introducing temporary variables so that memory locations of the parameters do not overlap Variables should be initialized before they are used 6 3 5 Automatic Differentiation problems Actual argument N of F is active while formal argument is non differentiable Actual output N of Fis useful while formal result is non differentiable These two messages are usually associated They indicate the following situation Some programs implicitly convert REALs to INTEGERs during a function call then these INTEGERs follow their way through the code and finally are converted back into REALs during another function call This is a sort of s
83. n reality if the root procedure has many different inputs and outputs of different shapes and types it is absolutely not required that they are gathered in one array such as X and Y We do this here only to make the explanation simpler Variables X and Y may even overlap if some variables are at the same time in and out Subroutines FOO_D and F00 B are produced by differentiation with TAPENADE with the following simplifying assumption which can be lifted easily we set the independents to all X and the dependents to all Y If the computed derivatives are correct then the three numbers printed by the procedure on figure 28 must be roughly the same Usually the tangent norm and the adjoint norm match very well up to the last few digits The norm obtained with Divided Differences matches only to half the machine precision sometimes after some adaption of the value of eps In general this is enough to be convinced that the derivatives are correct 9 2 What if validation fails If validation fails we must look in more detail at the differentiated programs First one must make the tangent norm match the Divided Differences norm and then the adjoint norm must match the tangent norm Even worse validation may fail simply because of arithmetic exceptions returning re sults like NaN or Inf Actually differentiation can introduce NaN s in the resulting program because programs are sometimes non differentiable and AD is then in trouble for ex
84. nalysis Varied and Useful Active variables Adjoint Liveness analysis dead adjoint code TBR analysis Required variables Adjoint Written analysis Optimal snapshots Tangent AD Tangent program Reverse AD Adjoint program Figure 27 Internal Data Flow analyzes needed by AD in TAPENADE RT n 0300 60 Hasco t and Pascual 9 VALIDATION AND TUNING Let s face it TAPENADE will probably not produce a perfect differentiated program at first try After Automatic Differentiation a validation step is necessary to check the derivatives computed Section 9 1 proposes a framework to perform this validation As long as validation doesn t work the end user must identify the problem and if possible modify the input program to solve the problem Section 9 2 describes the most frequent problems found It may happen that you find an unknown problem in the source produced by TAPENADE in this case please keep us informed by sending a mail to tapenade lists sop inria fr When the computed derivatives are finally correct it may be good to check whether some well known improvements are applicable to the differentiated code This is described in section 9 3 9 1 A classical framework for validation Recalling the notations of section 3 we differentiate a program P that computes a function F with input X JR and output Y F X IR Classically one validates the results of the tangent mode
85. nnect differentiation on the calling procedures and differentiate them by hand with an ad hoc mechanism to propagate deriva tives This reminds you that a variable that was active has been overwritten by an I O operation cf section 4 7 Why should you care The derivative is of course reset to zero Maybe this is not what you expected What can you do Check that it is OK Otherwise try to put all I O operations outside the program part that you are actually differentiating ADO6 Differentiate of function F needs to save undeclared side effect variables V This call needs to save undeclared side effect variable V Sometimes the checkpointing mechanism of the reverse mode requests to save some vari ables which are declared in some called procedure but not in the current calling procedure Think for example of a SAVE variable or a COMMON which is not declared at the calling procedure level Why should you care The instruction that saves this variable cannot be put here since this variable is not declared here in the calling procedure The differentiated program is therefore wrong What can you do Insert the COMMON declaration at the level of the calling procedure Do something similar for SAVE variables AD08 Don t know the size of dimension N of array V which must be saved In some cases for example in reverse mode AD the differentiated function must save some variables to restore them later If one such variable is
86. o for DD11 some information that you give about an external procedure may be ignored and replaced by another one What can you do Remove the declaration or definition in excess or modify conflicting declarations so that they combine well into the type that you expect For DD11 clean up the library file by merging the data about a given procedure into one single entry Use the nolib and noADlib options if you need to disactivate loading of F77GeneralLib and F77ADLib respectively INRIA TAPENADE 2 1 45 6 3 2 Type problems TC01 Expression is expected to be numeric and is here T EXE TC06 Loop index is expected to be integer and is here T Loop bound is expected to be integer and is here T TC13 Triplet element is expected to be numeric and is here T TC14 Substring index is expected to be numeric and is here T TC15 Argument of logical operator is expected to be boolean and is here T The indicated expression is used in a context which requires some type Using the available declarations the type checker found that this expression actually has a different type indicated as T in the message Why should you care This may be the sign of an error in the program Even if this original program actually compiles and runs well remember that differentiation occurs only on REAL or COMPLEX typed variables If some sub expression becomes of another type there will be no derivative associated even if you think there sh
87. of remembering a snapshot i e enough variables to be able to restart execution from a given point Specifically in the RA case the snapshot is taken at the end of S in order to reduce the length of recomputation sequences Symmetrically in the SA case the snapshot is taken at the beginning of S so that the initial forward sweep does no storage at all but an extra run of S will be done later this time with storage Figure 4 illustrates checkpointing for both RA and SA strategies for one single checkpointed sub part Ckp as well as for nested checkpointed parts Ckp Recompute All Store All E O gt 0 gt i ee o o o o gt OO Ckp a 3 e 82 e gt e 8 lt e Ld bo OO Ckp E y o e poe gt amp o mo RE qu fr Figure 4 Checkpointing in the contexts of RA and SA RT n 0300 12 Hasco t and Pascual Interestingly fig 4 shows that when the number of checkpoints increases the overall structure of the adjoint program becomes similar and so do the costs in extra recomputation and snapshot storage The difference lies in the smaller program parts shown in grey on fig 4 in which the pure RA or SA strategy is applied Usually these parts are the size of a procedure call a loop body or even a basic block of instructions TAPENADE uses the SA strategy and applies checkpointing to procedure subroutine or function calls This is illustrated on figure 5 on which we define som
88. on for that static analyses compile time analyses of programs are non decidable In other words there will always be programs on which the answer to a data flow question is unknown The tool that uses the analysis must therefore make a conservative choice i e generate a code that copes with the worst case scenario and this has a cost For instance in all modes of AD there may remain unnecessary initializations of deriva tive variables because analyses couldn t tell whether this derivative will be used before it is next overwritten Also specifically in the reverse mode some storage may have been inserted to write intermediate values on the tape or to take a snapshot although the user knows that it is not necessary Obviously removing this unnecessary storage will improve the program in execution time as well as memory usage However this must be done with care until a directive is available In the reverse mode another source of inefficiency is a poor choice of checkpoints Fig ure 5 summarizes where TAPENADE puts checkpoints This choice is in general efficient when the call graph is well balanced In other cases it may be wise not to checkpoint some very small procedures or to checkpoint selected parts of time consuming procedures especially for loops cf 14 Again this should be done during AD driven by user given directives Until this mechanism exists in TAPENADE this can be achieved by changing the call graph of the original
89. ose epsilon for divided differences eps 1 e 10 Save the initial input X Xsave X Create the input differentiation direction Xd Xdsave 1 0 Prepare the Xtepsilon Xd input for F X Xsave eps Xdsave Run Yeps F Xtepsilon Xd call FOO X Y Yeps Y Restore the initial input X X Xsave Restore the differentiation direction Xd Xd Xdsave Run the tangent code Y Yd F_D X Xd call FOO_D X Xd Y Yd Compute the square norm of Yd from divided differences Ydeps Yeps Y eps normEps SUM Ydeps Ydeps print Divided Differences normEps Compute the sguare norm of Yd from tangent AD normTgt SUM Yd Yd print AD Tangent mode normTgt Restore the initial input X X Xsave Initialize Yb Yd Yb Yd Run the adjoint code Xb F_B X Yb call FOO_B X Xb Y Yb Compute the dot product Xb Xd normAdj SUM Xb Xd print AD Reverse mode normT gt END Figure 28 Subroutine framework for validation of derivatives RT n 0300 62 Hasco t and Pascual F is the subroutine FO0 X Y where X is an array of n real numbers which are the inputs X of F and Y is an array of m real numbers which are the outputs Y of F This validation method requires us to call FOO repeatedly and its results must be reproducible Therefore X must really contain all inputs to F There must be no hidden side effect inputs on which we have no control I
90. ould be one Remember also that these problems make the program non portable What can you do Check the types You may also use the appropriate conversion functions Be aware that when you convert a REAL into an INTEGER the differentials are reset to zero TC16 Type mismatch in assignment T1 receives T2 TC17 Type mismatch in case T1 compared to T2 TC08 Loop times expression is expected to be integer and is here T TC18 Type mismatch in comparison T1 compared to T2 The two terms in the assignment or comparison or the switch and case expressions must have compatible types Otherwise the meaning of the instruction is not well defined and not portable Why should you care This can be the indication for a standard type error and like above this can lose derivatives What can you do Check the types Use conversion functions when appropriate RT n 0300 46 Hasco t and Pascual TC19 Equality test on reals is not reliable Equality and non equality are not well defined between REALs because they are defined only up to a given machine precision Equality test can yield different results on different machines Why should you care As far as AD is concerned equality tests on REALs are some times used as the stopping criterion for some iterative process Then this relates to a well known problem do the derivatives converge when the function does and if yes do they converge at the same speed What can you
91. piece of data flow information At each location in the program i e before or after each instruction this data flow information associates an elementary piece of information to each accessible variable For most analyses Read Written Activity Adjoint Liveness TBR and RT n 0300 58 Hasco t and Pascual Adjoint Written this elementary piece of information is a handful of booleans On the other hand for Pointers Dependency and Differentiable Dependency this elementary piece of information is itself a vector of booleans one for each accessible variable The reason is that dependency analysis for instance computes for each variable the list of all variables it depends on The propagated pieces of information are larger their manipulation is slower The differentiation time is therefore dominated by these three analyses whose complexity is grossly the number of instructions in the program times the square of the number of declared variables To terminate this section let us illustrate the performances of the differentiated code itself on several examples coming from large real applications In theory the tangent differentiated code should take a small multiple 3 to 4 of the time of the original code and the reverse differentiated code should also take 4 to 5 times the time of the original code In practice the reverse code may take much longer because these estimations do not take into account the cost of storing intermediate vari
92. program introducing new procedures or inlining existing procedures However the most profitable improvements come from the user s knowledge of the orig inal program s algorithm and meaning rather than from a generic short sighted static analysis of the program s source A variety of directives may be very helpful to give this knowledge to TAPENADE Some of these improvements may be automated partly in the meantime using a combination of black box mechanism and limited hand modification We shall discuss below a number of such improvements namely about indeed inactive variables Linear code fragments auto adjoint code fragments iterative resolutions and loops with independent iterations RT n 0300 68 Hasco t and Pascual 9 3 1 Indeed inactive variables Consider for example the computation of the time step during the iterative resolution of a steady state problem Strictly speaking this time step is computed at each iteration and depends on the current iterated state Conversely this time step clearly influences the next iterated state and therefore the final steady state Therefore an AD tool will probably declare this time step as an active variable However the end user knows that the time step should not influence the final result more than through minor approximation effects It may be appropriate to force the AD tool to consider this time step inactive As a consequence the whole piece of code that computes this time step doe
93. re fully the validation and improvement techniques for differentiated codes Key words automatic differentiation algorithmic differentiation adjoint gradient opti mization inverse problems static analysis data flow analysis compilation Projet Tropics INRIA Sophia Antipolis Unit de recherche INRIA Sophia Antipolis 2004 route des Lucioles BP 93 06902 Sophia Antipolis Cedex France T l phone 33 4 92 38 77 77 T l copie 33 4 92 38 77 65 Manuel de l utilisateur de TAPENADE 2 1 R sum Ce rapport est le manuel d utilisation de la version 2 1 du logiciel de Diff renti ation Automatique TAPENADE Etant donn un programme source qui calcule une fonction math matique diff rentiable F TAPENADE construit un nouveau programme source qui cal cule certaines d riv es de FF et plus particuli rement des d riv es directionnelles mode direct et des gradients mode inverse Ce rapport rappelle succintement les justifica tions math matiques de la Diff rentiation Automatique puis d crit en d tail le mod le de diff rentiation impl ment dans TAPENADE dans le but d am liorer la compr hension et la confiance des utilisateurs dans le code diff renti Ce rapport documente les options de la commande TAPENADE les fichiers de configuration disponibles ainsi que les diagnostics et r sultats produits par Voutil Ce rapport d crit rapidement l architecture de TAPENADE et ses performances p
94. rementation instructions Bi D tection of Alidsing eus dues A 4040404 di on Ra 0 5 4 Splitting complex expressions 5 5 Adjoint Liveness analysis USING TAPENADE FROM THE COMMAND LINE 6 1 Available TAPENADE options 6 1 1 Fundamental options 6 1 2 File information options 6 1 3 Differentiation options 6 1 4 System Options e ri une Papas Gers eV aa ee ea eo 6 1 5 Display and Debugging options 6 2 Giving information on black box procedures 6 3 Messages issued by TAPENADE 6 3 1 Declarations problems 6 3 2 Type problems 4 44 A Ra BOB E mme AE 6 3 3 Control flow problems RT n 0300 4 Hasco t and Pascual 6 3 4 Data flow problems 6 3 5 Automatic Differentiation problems 6 4 Specifying required size information 6 5 Compiling the differentiated code 7 USING TAPENADE FROM THE WEB INTERFACE 7 1 The TAPENADE input form web page 7 2 The TAPENADE output result web page 8 ARCHITECTURE AND PERFORMANCES 9 VALIDATION AND TUNING 9 1 A classical framework for validation
95. renti ated and non differentiated components This would not gain much in memory space and would cost a lot in readability and complexity We apply here a general principle which is to prefer generalization to specialization especially when specialization leads to combi natorial complexity This principle will be applied again for subroutines and functions cf section 4 8 As a last refinement when all components of the original structured type appear in the differentiated structured type then the differentiated type is simply replaced by the original type Figure 7 illustrates this on a small tangent mode example where structured type VECTOR is differentiated into VECTOR_D and type components y and name do not appear in VECTOR_D because there are never any derivatives to store there 4 3 Simple assignments Now consider an assignment I In tangent mode cf equation 3 derivative instruction I implements Xy f Xx 1 Xr 1 with initial Xy X In reverse mode cf equation 4 derivative instruction s T implements Y 1 ff Xr 1 Y q with initial Y Y Here also TAPENADE uses the principle of keeping the original program s structure just like the original program overwrites variables the differentiated program overwrites the dif ferentiated variables writing values X over previous values X in tangent mode or writing values Y 1 over previous values Y in the reverse mode For example if I is a i x b j COS a i
96. ription of TAPENADE s messages There is a subtler problem related to 1 0 in some programs I O files are used as tem porary storage If the stored variables indeed should have a derivative TAPENADE is unable to detect this Should it detect this TAPENADE would have to create a new file as well as the I O instructions that write and read the derivatives to and from this file TAPENADE does not do that The least TAPENADE can do though is to warn the user when this might happen Thus each time a varied variable is written into a file which is not the standard output a message ADO3 is sent Also each time a useful variable is read from a file which is not the standard input another message ADO4 is sent The end user is responsible for checking that everything is OK Figure 13 illustrates this in the tangent mode on some representative I O calls The first DO ii1 34 66 2 bd ii1 0 0 ENDDO read b 34 66 2 READ b 34 66 2 write 22 a i i 1 100 3 WRITE 22 a i i 1 100 3 DO i 0 33 td i 0 0 ENDDO read 22 t i i 0 33 READ 22 t i i 0 33 bd 1 33 t 1 33 ad 34 66 b 1 33 a 34 66 t 1 33 b 1 33 a 34 66 t 1 33 File ADO5 Active variable b 34 66 2 overwritten by I 0 File AD03 Active variable ali i 1 100 3 written by I 0 to file 22 File ADO4 Useful variable t i i 0 33 read by 1 0 from file 22 Figure 13 Differentiation of I O procedure calls showing warning message
97. s message ADO5 refers to the first READ if the user wants to study derivatives with respect INRIA TAPENADE 2 1 23 to the values read into b then the program should be modified to take this READ out of the differentiated section Otherwise one need not worry The next pair of messages AD03 and AD04 point out the possible communication of an active value through file 22 If the user knows this is the case this is not treated by TAPENADE and therefore the result is probably incorrect For example we see that variable ad was not written along with a so t has not received a derivative and the differentiation of the t a product is incorrect 4 8 Procedure calls TAPENADE treats procedure calls differently from simple instructions because a procedure call indeed represents a bunch of instructions possibly with control Therefore the dif ferentiated instructions cannot be put before the original call but rather inside yielding a differentiated procedure with additional arguments for the derivatives of original arguments which are active either just before or just after the call In tangent mode a call to SUB just becomes a call to the differentiated SUB_D Notice that if SUB is a function that returns an active value TAPENADE makes SUB_D a function too that returns the derivative of the value and it also returns the original value but this time as an extra output argument named SUB In reverse mode TAPENADE checkpoints the
98. s not need to be differentiated saving actually a lot of derivative computations Until a directive exists one way to cheat TAPENADE could be to introduce an extra assignment time_step DISACTIVATE time_step where the black box function DISACTIVATE essentially copies its input to its output but whose signature in the ADLib file specifies that the output depends on nothing and is therefore inactive 9 3 2 linear code fragments A code fragment may perform a linear operation Consider the case where the active output Y of this fragment is a linear function of its active input X i e Y Mx X where M itself is not active This information can be hidden very deeply inside the fragment and actually the fragment may perform a lot of non linear intermediate operations This is the case for example for GMRES 24 solvers An AD tool has no hope of detecting this situation Therefore the differentiated code will probably differentiate the whole computation sequence even if the user knows most of it is useless Not only will the differentiated code perform useless operations but moreover the principle of AD will use the iterative control that solves for Y to solve for Y or X too This may prove dangerous because the solver takes care of instabilities in the intermediate values that lead to Y and the differentiated solver will not do the same for the intermediate differentiated values Therefore the user certainly knows better than the AD tool
99. t differentiated code that can compete with hand coded derivatives This is the user s guide to version 2 lof TAPENADE The main difference from previous version 2 0 is that program sources in FORTRAN95 22 are accepted in addition to FOR TRAN77 At the moment when we write this report however some development is not yet terminated and the pointer and memory allocation part of FORTRAN95 is still missing See section 10 about the others developments to come soon For readers who know AD and other AD tools section 2 gives a quick reference to the classical characteristics of TAPENADE After a brief description in section 3 of the theoretical basis of AD section 4 describes the AD model implemented by TAPENADE showing how it relates to the theoretical description This part is based on concrete examples to gain understanding of programs produced by TAPENADE Section 5 focuses on refinements to the AD model for adjoint codes There are two ways to use TAPENADE either as a shell command from the command line or from a Makefile or as a server on the web with a HTML graphical interface Section 6 presents the fundamental command line mode with the available options and configuration files Section 7 presents the alternative web interface Section 8 gives a quick overlook of TAPENADE s internal architecture and discusses the performances Section 9 addresses the problem of validation and Section 10 concludes with known problems and further developmen
100. t on each piece of FOO that goes from a location L to L 41 Each time execution of FOO_D reaches a location L all derivatives X are stored and the dot product X je X is computed and printed Conversely each time execution of the backward sweep of the adjoint FOO_B reaches the corresponding location L the stored value of X is retrieved its dot product with the present X which is going upstream with the adjoint code is computed and printed and last but not least X is overwritten with the contents of X j Figure 30 illustrates the mechanism Call F the function implemented XX XX Es E mW w gt X gt gt A x V A F T a F X X Xat Xa Figure 30 Split execution of the dot product test by the fragment of FOO between L and L 1 During F00_D the code fragment F from L to L 1 computes from its input tangent direction X its output tangent derivative Fi X x X During F00 B the code fragment F from L y1 to L computes from its input X which has just been completely overwritten to contain X j 1 the gradient X Fi X x Xj41 For F just like for the complete F it is then easy to check that Xy X Aja Xn 8 The right hand side of 8 is what is printed by FOO_D at location L 41 and the left hand side of 8 is what is printed by FOO_B at location L If the two numbers don t match then the problem with the adjoint code lies between these two
101. t the final executable INRIA TAPENADE 2 1 7 USING TAPENADE FROM THE WEB INTERFACE As an alternative to a local installation one can also use the TAPENADE web server It requires no installation and of course always runs the latest version of TAPENADE It can be triggered in a few clicks from most web browsers 7 1 The TAPENADE input form web page To begin an AD session the URL to get started is http tapenade inria fr 8080 tapenade index jsp which takes you to the TAPENADE input page or form shown on figure 23 Notice that this interface gives the user access only to the fundamental options of the command line TAPE NADE Specifically the web interface lets the user specify the source and include files their language the root procedure the dependents the independents and finally the differenti ation mode On the other hand library files about black box procedures output language suffixes of differentiated variables and optimization levels cannot be specified and therefore retain their default values Filling the TAPENADE request form consists of six steps visible on figure 23 1 A radio button selects the input language then 2 the next step allows the user to upload all necessary source files this time also uploading the include files Notice that for security reasons each file must be uploaded separately which may unfortunately make the process tedious Source files and include files must be uploaded in t
102. t two intrinsic array functions play a original program TAPENADE tangent TAPENADE reverse a 0 n 3 x SUM b ad 0 n 3 xd SUM b amp xb xbtSUM b SUM ab 0 n 3 amp x SUM bd bb bb tx SUM ab 0 n 3 ab 0 n 3 0 0 Figure 9 Differentiation of an array assignment very special role with differentiation the SUM and SPREAD intrinsics The SPREAD intrinsic is often implicit like here on the scalar x which is implicitly spread as an array section of the appropriate dimension The differentiated instructions come from the fact that the tangent derivatives of a SUM is a SUM and the one of a SPREAD is a SPREAD Conversely the adjoint of a SUM is a SPREAD and vice versa This is easily checked by writing down the local Jacobian matrix and its transpose like in section 4 3 The other array intrinsic functions such as PROD are treated like black box routines whose differentiation is given to TAPENADE in special library files 4 5 Activity and reinitializations of derivatives As introduced in section 3 3 only the derivatives of the active variables need be computed and manipulated If variable v is not varied then vd is certainly null and the value of vb does not matter Conversely if v is not useful then the value of vd does not matter and vb is RT n 0300 18 Hasco t and Pascual certainly null TAPENADE automatically detects active and inactive variables and simplifies the differentiated program
103. ted code especially in the reverse mode Figure 20 shows another form of aliasing local to an instruction where an assigned variable may or may not be the same as a read variable In this situation it is impossible to write a single reverse differentiated instruction because the differentiated code strongly depends on the fact that the assigned variable is also read or not TAPENADE detects this situation and automatically inserts a temporary variable e g tmp therefore removing local aliasing through instruction splitting For example on figure 20 there is a local aliasing in the third instruction because equality between i and n i could not be decided 5 4 Splitting complex expressions The derivative of complex expressions often turns out to be even more complex and longer Even if the original expression contains no duplication naive differentiation introduces du plicate sub expressions TAPENADE provides an automatic splitting of expressions that virtu ally eliminates all duplication coming from differentiation For the moment only the reverse INRIA TAPENADE 2 1 33 original program TAPENADE reverse TAPENADE reverse MRE _ terward sweep backward sweep CALL PUSHREAL4 a i CALL POPREAL4 a n i 3 a i a i 1 a i 3 a i a i 1 tmpb ab n i CALL PUSHREAL4 a i 2 ab i ab i a n i tmpb 2xa i a it2 2 a i ab n i a i tmpb tmp a i a n i CALL POPREAL4 a i 2 CALL PUSHREAL4 a n i ab i ab i
104. ts to come in TAPENADE INRIA TAPENADE 2 1 7 2 QUICK REFERENCE GUIDE Automatic Differentiation Tool A Name TAPENADE version Dall 4 il Date of birth January 2002 Ancestors Odyss e 1 7 Address www inria fr tropics tapenade html Specialties AD Reverse Tangent Vector Tangent Restructuration Reverse mode Strategy Store All Checkpointing on calls Applicable on FORTRAN95 FORTRAN77 and older Implementation Languages 90 JAVA 10 C Availability Java classes for Linux and Solaris or Web server Internal features Type Checking Read Written Analysis Forward and Backward Activity Adjoint Liveness analysis TBR Reduced Snapshots Figure 1 TAPENADE s ID card For readers who know AD and other AD tools here is a quick description of TAPENADE about the usual features of AD tools Figure 1 just puts it in a funnier way TAPENADE is a software tool for automatic differentiation of FORTRAN95 and FORTRAN77 programs with the source regeneration approach TAPENADE is not based on overloading It produces tangent Jacobian times vector codes and adjoint Transposed Jacobian times vector codes In its so called vector or multi directional mode it produces tangent code for several directions at a time Several tactics and analyses are used to produce better code Some are classical others are specific Internal implementation is mostly JAVA There are two ways you can use TAPENADE as a web server always
105. ty with programs produced by TAPENADE In its current version TAPENADE can differentiate FORTRAN77 programs and FORTRAN95 programs Thus we shall describe TAPENADE s differentiation model for these languages only Differentiation of C is planned in the near future and will call for a few specific additions to the model which are not described here In the following sections we shall address the constructs of FORTRAN95 The restricted model for FORTRAN77 can be simply derived from the general model for FORTRAN95 4 1 Symbol names First consider symbol names If a variable v is of differentiable type cf section 4 2 and currently has a non trivial derivative cf activity 3 3 this derivative is stored in a new variable that TAPENADE names after v as follows vd v dot in tangent mode and vb v bar in reverse mode Derivative names for user defined types procedures and COMMONS are built appending _D or _d in tangent mode and _B or _b in reverse mode Please bear in mind that FORTRAN is case independent so the choice of the case is just a matter of style TAPENADE only puts type and procedure names in upper case Figure 6 summarizes that Notice that TAPENADE automatically checks for possible conflicts with names already original program TAPENADE tangent TAPENADE reverse SUBROUTINE Ti a SUBROUTINE T1 D a ad SUBROUTINE Ti B a ab REAL a 10 w REAL a 10 ad 10 w wd REAL a 10 ab 10 w wb INTEGER jj INTEGER jj
106. ually prints a hint about the actual value that should be given to these constants If you don t know exactly the values of these constants you should over approzimate them To define these constants in the DIFFSIZES inc we recommend that you use the syntax of the following FORTRAN77 example or its FORTRAN95 equivalent INTEGER nbdirsmax PARAMETER nbdirsmax 25 INTEGER SIZE20FarraybINmyfunc PARAMETER SIZE20FarraybINmyfunc 150 6 5 Compiling the differentiated code The differentiated code produced by TAPENADE consists of ordinary FORTRAN source In most cases a FORTRAN77 compiler is appropriate to compile it However when the orig inal files use some specific FORTRAN95 constructs or when the user required it using the outputlanguage command line option cf section 6 1 2 the output code uses the FOR TRAN95 syntax and must be compiled with a FORTRAN95 compiler When the differentiated code required the user to provide some array dimension sizes it contains the line INCLUDE DIFFSIZES inc In this case you must create this file and put it in the directory where compilation is made To compile files differentiated in the reverse mode you need to compile and link the definition of the PUSH and POP subroutines They are provided in the standard TAPENADE distribution or can be downloaded from the TAPENADE web server They consist of a FORTRAN77 file adBuffer f and a C file adStack c Both files must be compiled and linked to ge
107. udimentary divide and conquer approach to run the dot product test on parts of the procedure FOO Bugs can thus be located precisely in a relatively small part of the program and may be tracked down more easily Suppose we choose locations L 7 0 p in the middle of FOO_D RT n 0300 Hasco t and Pascual SUBROUTINE VALIDATION X Y n m real 8 ddeps epszero integer phase ddfile common comphase phase ddfile ddeps epszero Run Yeps F X epsilon Xd Xd Xdsave ddeps eps epszero 0 00000001 phase 1 ddfile 37 OPEN 37 FILE epsvalues call FOO_D X Xd Y Yd Yd 0 0 CLOSE 37 Yeps Y Restore the initial input X Run the tangent code Y Yd F_D X Xd phase 2 OPEN 37 FILE epsvalues CALL FOO_D X Xd Y Yd CLOSE 37 Compute the square norm of Yd from divided differences Figure 29 Subroutine framework for debugging the tangent code INRIA TAPENADE 2 1 65 with obvious corresponding points inside the backward sweep of FOO_B The constraint is that at each L all active variables are visible in other words no active variable can be in a common which is not declared and visible for L This is the most stringent restriction on the approach Call these variables X and we will assume they are the elements of an array Xj for simplicity in the description only Set Lo to the very beginning of FOO and L to its very end We are going to run the dot product tes
108. uis discute plus en d tail des techniques de validation et d am lioration des programmes diff renti s produits par TAPENADE Mots cl s diff rentiation automatique adjoint gradient optimisation probl mes in verses analyses statiques analyses data flow compilation TAPENADE 2 1 Contents 1 2 INTRODUCTION QUICK REFERENCE GUIDE AUTOMATIC DIFFERENTIATION OF COMPUTER PROGRAMS 3 1 Tangent and Reverse AD modes 3 2 Checkpointing in the reverse mode Bid ACtivity oe oe dre ls l a bb A Baw aa aa aa eee THE DIFFERENTIATION MODEL OF TAPENADE 41 Symbol names mossos oyo ese ok eR a r ee soles Se ete we es 5 0 GS 4 2 Differentiated data types 4 3 Simple assignments 4 4 Array or Vectorial notation 4 5 Activity and reinitializations of derivatives 4 6 Control Flow structure 4 7 Input Output I O procedures 48 Procedure calls rt ee eh qu tep re ete nent tx de Sr A 4 9 Overloading z 208 404 Pa parues B by blb an e e cette ce Roe 4 10 Overall Modular Structure 4 11 Multi directional differentiation SPECIFIC REFINEMENTS FOR THE ADJOINT 5 1 To Be Restored analysis 5 2 Gathering inc
109. umann Reducing the memory requirement in reverse mode automatic differen tiation by solving TBR flow equations In P M A Sloot C J K Tan J J Don garra and A G Hoekstra editors Computational Science ICCS 2002 Proceedings of the International Conference on Computational Science Amsterdam The Netherlands April 21 24 2002 Part IT volume 2330 of Lecture Notes in Computer Science pages 1039 1048 Berlin 2002 Springer RT n 0300 76 Hasco t and Pascual 24 Y Saad and M H Schultz GMRES A generalized minimal residual algorithm for solving nonsymmetric linear systems J Sci Comput 7 856 869 1986 INRIA TAPENADE 2 1 Index activity active variables 12 13 17 20 23 38 40 48 50 65 68 backward activity see useful variables forward activity see varied variables useful variables 12 13 17 20 22 40 varied variables 12 13 17 20 22 40 AD 8 13 Adifor 11 adjoint mode see reverse mode Algorithmic Differentiation see AD aliasing 32 49 59 analyses Activity analysis 13 13 18 40 59 Adjoint Liveness analysis 23 33 34 59 Adjoint Written analysis 23 34 38 59 Dependency analysis 13 41 59 Read Written analysis 23 40 57 59 TBR analysis 31 38 59 array notation 17 46 assignments 16 17 Automatic Differentiation see AD backward sweep 10 11 12 20 23 65 69 70 basic blocks 20 black box procedures 13 17 18 37 40 43 50 57 67 69 call gr
110. ure An important question is whether the differentiated modules should contain 1 the differentiated procedures only or 2 also the original procedures The answer is 2 because both the original procedures and the differentiated procedures must sometimes operate on global module variables which may be private Therefore both procedures must be inside the same differentiated module to gain access to this global variable The example on figure 16 illustrates this differentiation of modular structured programs in tangent mode on an example module that contains one type definition variables and one function Figure 15 also illustrates differentiation of modules This relates to the more general question of whether the differentiated object can exist independently of the original object or must they be attached inside the same enclosing level For example a differentiated instruction must be in the same procedure as the original instruction because they share a common control Similarly a differentiated procedure must be in the same module as the original because both may access a private object of this module On the contrary a differentiated field x of a structured type T need not be added into T but can rather go into a stand alone differentiated structured type T_D or T_B therefore saving memory space for the objects of type T which are not active Similarly a RT n 0300 26 Hasco t and Pascual MODULE MOD_T MODULE MOD_T_D MODULE
111. wo classical strategies to cope with that e Recompute All RA the X are recomputed when needed restarting P on input Xo until instruction IK Figure 2 graphically illustrates this RA strategy The I RT n 0300 10 Hasco t and Pascual nE o gt E OO gt T time o o p 1 o t o Figure 2 The Recompute All strategy instructions and backwards arrows represent the execution of the differentiated in structions that effectively compute Yx_1 fi Xk 1 x Yr The big black and white dots represent respectively the storage and retrieval of enough values to be able to restart computation from the initial point Brute force RA strategy has essentially a quadratic time cost with respect to the total number of run time instructions p The TAF 11 tool uses this strategy together with a slicing strategy called ERA 12 to limit recomputations to those actually needed to compute the required Xy e Store All SA the X are restored from a stack when needed This stack is filled during a preliminary run of P that additionally stores variables on the stack just before they are overwritten These stored variables form what is sometimes called the trajectory deprecated or the tape This preliminary run is called the forward sweep P The differentiated instructions strictly speaking form the backward sweep P Figure 3 graphically illustrates this SA strategy The small black and white dots represent respective
112. wo separate boxes 3 The third step is to fill in the name of the root procedure 4 The fourth step is to fill in the names of all dependent output variables separated by white space 5 The fifth step is to fill in the names of all independent input variables separated by white space 6 The sixth and last step selects the differentiation mode and triggers differentiation Steps 3 4 and 5 can be skipped in which case the usual default answers are taken respec tively the topmost procedure in the call graph the list of all outputs of this procedure with a differentiable type and the list of all inputs of this procedure with a differentiable type During the uploading step 2 keep in mind that all files travel on the Internet to our server in Sophia Antipolis This connection is not encrypted nor protected We will keep your files confidential but we cannot guarantee they will not be spied on the web If security of your files matters download TAPENADE on your local system On the input page figure 23 we encourage the user to register to the tapenade users mailing list to be kept informed about improvements and new developments This is not compulsory to use the server but it helps us to know our users better There is also a link to the on line documentation of the tool RT n 0300 54 Hasco t and Pascual _TAPENADE On line A Bookmarks Tools Window Eile Edit View Go 2 DE z mpmapenadeinar6080epenadefomjp
113. yb ivar is ais dyb ivar is ivar 1 5 dxb ivar is ais dxb ivar is c POPREALS ais g is 1 ns volsb is volsb is aisb vols is 2 ais 1 0d0 vols is POPINTEGER4 is dx ivar is dx ivar is ais dy ivar is dy ivar is ais POPINTEGER4 index1 dz ivar is dz ivar is ais POPREAL4ARRAY dx 5 index1 3 POPREAL4 resx a TRANSP_B dx 5 indexl dxb 5 indexl resx re mab N N AL SSS gt A z 8 fluroe TC16 Type mismatch in assignment REAL 8 receives DOUBLE PRECISION 9 fluroe TC30 Type mismatch in argument 1 of function dsgrt expects DOUBLE PRECISION receives REAL 8 10 fluroe TC16 Type mismatch in assignment REAL 8 receives DOUBLE PRECISION 11 fluroe TC30 Type mismatch in argument 1 of function dsgrt expects DOUBLE PRECISION receives REAL 8 12 gradnod TC34 External function maxx is not declared T5 gradnod TC30 Type mismatch in argument 1 of function transp expects REAL 3 receives REAL 8 REAL O Figure 24 HTML interface for TAPENADE output correspondence between original and differentiated code as well as warning messages The same interface is used when the user adds the html option to the command line tapenade The four top frames show the original program on the left and the differentiated program on the right The two top frames show the call graphs and the middle frames show the FORTRAN source of the selected subroutine
114. zing the tape size In G Corliss Ch Faure A Griewank L Hasco t and U Naumann editors Automatic Differentiation of Al gorithms From Simulation to Optimization Computer and Information Science chap ter 34 pages 293 298 Springer New York NY 2001 10 Ch Faure and Y Papegay Odyss e User s Guide Version 1 7 Technical Report RT 0224 INRIA Sophia Antipolis France 1998 11 R Giering Tangent Linear and Adjoint Model Compiler Users Manual Center for Global Change Sciences Department of Earth Atmospheric and Planetary Science MIT 1997 Unpublished url http www autodiff com tamc INRIA TAPENADE 2 1 75 12 R Giering and T Kaminski Generating recomputations in reverse mode AD In G Corliss Ch Faure A Griewank L Hasco t and U Naumann editors Automatic Differentiation of Algorithms From Simulation to Optimization chapter 33 pages 283 291 Springer Verlag Heidelberg 2002 13 M B Giles Adjoint methods for aeronautical design In Proceedings of the ECCOMAS CFD Conference 2001 14 A Griewank Achieving logarithmic growth of temporal and spatial complexity in re verse automatic differentiation Optimization Methods and Software 1 1 35 54 1992 Also appeared as Preprint MCS P228 0491 Mathematics and Computer Science Divi sion Argonne National Laboratory Argonne III 1991 15 A Griewank Evaluating Derivatives Principles and Techniques of Algorithmic Dif ferentiation

TAPENADE 2.1 user's guide - Sophia Antipolis

Contents

Download Pdf Manuals

Related Search

Related Contents