RAMAN manual

1. calculate either Euclidean distance statistic.

m. Pressing M switches between two options. The default is to calculate an overall test for no differences among treatments and stop. The second is to proceed to compare all possible pairs of treatments. This option can be very useful when there are several treatments, as otherwise post hoc or preplanned pairwise comparisons would have to be done by editing the data set. When this option is selected, both the standard P value (calculated by comparing the observed value of the test statistic(s) to their randomization distribution(s)) and P values adjusted for multiple comparisons using Sidak's adjustment (which is slightly more powerful than Bonferroni's adjustment) are presented. If output is being sent to the screen only, the program will pause after the initial analysis and each pairwise comparison to allow reading the results. If output is being sent to a file, the initial results and pairwise comparisons may flash past quickly, and examining the output file will be necessary.

o. Pressing O will cause the program either to pause each time the screen is full of information or to run as fast as possible with no pauses. This option will only appear if output is being sent to a file (using option c). When output is not being sent to a file, this option does not appear, since the program must pause or you might miss the output of analyses.

p. Pressing P will cause the program to display each line of
2. 341.04,34.52,27.14
High,311.54,25.61,29.27
Low,362.61,37.06,25.78
Medium,357.91,40.83,27.33
High,277.18,20.65,31.37
Low,386.48,31.75,25.75
Medium,325.35,31.7,31.57

Note that as long as each treatment has a unique code, the observations don't need to be in any particular order. In this data file they are organized by the blocks in which the experiment was done, but the blocks are ignored in analyses. The data would produce exactly the same results if all the Low treatment observations were first, followed by the Medium, followed by the High, or in any other order.

A data file organized as above should have the extension .csv. If it does not, RAMAN may try to interpret it using a different format and may not run properly. You must close the file in your spreadsheet program before trying to load it into RAMAN. If the program terminates with an I/O Error 5 when you try to execute an analysis, you probably have not closed the file in the spreadsheet, and RAMAN cannot get access to it to read it. Close it in your spreadsheet program and try again.

RAMAN version 1.73 manual

Running The Program

Open a DOS window and change directories until you are in the directory containing the data file. You will then see a command prompt like

C:\data>

Enter the name of the executable file on the command line and press the Enter key:

C:\data>raman173

The program will run and you will see a screen like this:
3. BLOSSOM (Cade and Richards 2001), an excellent software package distributed free by the US Geological Survey and available from http://www.mesc.usgs.gov/products/software/blossom.shtml, and the book Permutation methods: A distance function approach, written by P.W. Mielke and K.J. Berry and published in 2001 by Springer-Verlag, NY.

Regardless of what statistic or statistics are calculated, RAMAN performs hypothesis tests by (1) calculating the value of the statistic(s) from the data in their original form, (2) repeatedly randomly reassigning all of the observations in the data set to treatments, with the constraint that each treatment is assigned the same number of observations it had in the original arrangement of the data, and recalculating the statistic(s) of interest, (3) comparing the observed values to the values from the randomly reassigned observations, and (4) keeping track of how many random reassignments produce values as extreme as or more extreme than the observed values. If the observed values result from genuine effects of the treatments on the variables measured, one would expect that statistics would be less extreme in most cases when observations are randomized among treatments. The principles behind this are discussed in detail by Manly (1991, 1997) and by the other references listed at the end of this document. The concept of computing 99% binomial confidence intervals for P values calculated from rando
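The four steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not RAMAN's own code: the function names, the use of Python's `random` module, and the choice to treat small values as "extreme" (appropriate for the distance statistics, where tight clustering around centroids gives a small sum) are all assumptions made for the example.

```python
import math
import random


def sum_euclidean_distances(rows, labels):
    """Sum of distances from each observation to its treatment centroid."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    total = 0.0
    for members in groups.values():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum(math.dist(row, centroid) for row in members)
    return total


def randomization_test(rows, labels, statistic, n_randomizations=20000, seed=None):
    """Return (observed statistic, P value), assuming small values are extreme.

    Shuffling the label list reassigns observations to treatments while
    keeping every treatment's sample size unchanged.
    """
    rng = random.Random(seed)
    observed = statistic(rows, labels)
    shuffled = list(labels)
    as_extreme = 1  # the original arrangement counts as one of the set
    for _ in range(n_randomizations - 1):
        rng.shuffle(shuffled)
        if statistic(rows, shuffled) <= observed:
            as_extreme += 1
    return observed, as_extreme / n_randomizations
```

Counting the original arrangement as one member of the randomization set means the smallest P value this sketch can report is 1 divided by the number of randomizations.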
4. Entering an integer in that range will mean that the random number generator is initialized with that number, and will mean that the same sequence of random numbers will be generated each time the program is run with that seed value.

f. Pressing F toggles between two possible settings. The default is that the data are not transformed using the ln(n+1) transformation before being analyzed. The alternative is that they are. The ln(n+1) transformation is often applied to data that are counts because, among other things, it often uncouples the mean and the variance of count data.

g. Pressing G toggles between two possible settings. The default is that the data are not transformed to 1 for present (any data value other than 0) or 0 for absent prior to further analysis. The alternative is that they are. This transformation can be useful for ecological data, where it may be worthwhile to compare the outcomes of examining differences among treatments when the data are counts of individuals present for each species in each observation with the outcomes when the data are simply 1 for that species being present and 0 for it being absent. Setting the presence/absence transformation on will turn off standardization (option h), and it cannot be turned on until the presence/absence transformation is turned off.

i. Pressing I allows you to choose how data are standardized within variables (columns) before being analyzed. Most standard mul
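The two transformations described under options f and g are simple element-wise operations. The sketch below shows them in Python. The function names are hypothetical, and it assumes each variable is held as a list of numbers; it is not taken from RAMAN itself.

```python
import math


def ln_n_plus_1(values):
    """ln(n+1) transformation, often applied to count data."""
    return [math.log(v + 1) for v in values]


def presence_absence(values):
    """1 for present (any value other than 0), 0 for absent."""
    return [0 if v == 0 else 1 for v in values]
```

Applied to the counts 0, 3 and 12, the presence/absence transformation gives 0, 1 and 1, so all information about abundance is discarded and only occurrence is compared among treatments.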
5. not preplanned and you wish to preserve the experimentwise type I error rate at α = 0.05, the Sidak-adjusted P values should be used. See Hochberg and Tamhane (1989) for a very useful discussion of the philosophy and practice of multiple comparisons. I would consider that the initial overall analysis preserves the experimentwise error rate at α = 0.05, and would examine the upper binomial 99% confidence limits for each pairwise comparison, keeping a comparisonwise error rate of 0.05 and maximizing the power to detect differences between the treatments. In this example, all pairs of treatments differ significantly even if the Sidak-adjusted P values are used.

Box 1. Output written to .txt file by RAMAN, testing for significant treatment effects using Euclidean distances for all treatments and each pair of treatments in the sample data file testdata.csv.

RAMANOVA Ver 1.73 multivariate randomisation test with 20000 randomisations
Random number seed from clock 11377
Carried out on 24 Feb 2002 at 14:22:48
Analysis for data from file testdata.csv
Larval mass at metamorphosis percent survival to metamorphosis and length of larval period
Variables Avemass Pctsurv AveLP
Treatments High Low Medium
3 variables 24 observations 3 treatments
Variables centered and standardised using Euclidean distance
s = 2  m = 0.0  n = 8.5
                                        99% CL for P
Statistic            Value       P <      lower     upper
Sum Euclidean dist   26.812
6. 1991. Randomization and Monte Carlo methods in biology. London: Chapman and Hall.
Manly BFJ. 1997. Randomization, bootstrap and Monte Carlo methods in biology. London: Chapman and Hall.
Mielke PW & Berry KJ. 2001. Permutation methods: A distance function approach. Springer-Verlag, NY. 352pp.
7. 835    0.00005   0.00000   0.00018

RAMANOVA Ver 1.73 multivariate randomisation test with 20000 randomisations
Comparison of treatment High and treatment Low
3 variables 16 observations 2 treatments
Variables centered and standardised using Euclidean distance
s = 1  m = 0.5  n = 5.0
                                        99% CL for P
Statistic            Value       P <      lower     upper     Sidak P
Sum Euclidean dist   14.788282   0.00015  0.00000   0.00037   0.00045

RAMANOVA Ver 1.73 multivariate randomisation test with 20000 randomisations
Comparison of treatment High and treatment Medium
3 variables 16 observations 2 treatments
Variables centered and standardised using Euclidean distance
s = 1  m = 0.5  n = 5.0
                                        99% CL for P
Statistic            Value       P <      lower     upper     Sidak P
Sum Euclidean dist   19.895571   0.00030  0.00000   0.00062   0.00090

RAMANOVA Ver 1.73 multivariate randomisation test with 20000 randomisations
Comparison of treatment Low and treatment Medium
3 variables 16 observations 2 treatments
Variables centered and standardised using Euclidean distance
s = 1  m = 0.5  n = 5.0
                                        99% CL for P
Statistic            Value       P <      lower     upper     Sidak P
Sum Euclidean dist   26.186194   0.00090  0.00035   0.00145   0.00270

References

Cade BS & Richards JD. 2001. User Manual for BLOSSOM Statistical Software. Midcontinent Ecological Science Center, U.S. Geological Survey, Fort Collins, Colorado, USA. Available from http://www.mesc.usgs.gov/products/software/blossom.shtml
Manly BFJ
8. RAMAN manual

Documentation for RAMAN version 1.73
Copyright (C) 1999 by Ross A. Alford, School of Tropical Biology, James Cook University, Townsville, Queensland 4811, Australia
email: ross.alford@jcu.edu.au

Legal information

All commercial rights reserved. This documentation and the program RAMAN.EXE may not be sold or included in any compilation that is sold for profit without the written permission of the author. They may be included in archives of software accessible for download, or in compilations that are sold for prices sufficient only to cover the costs of production. Although the software has been exhaustively tested, the author cannot be held responsible for any errors that may exist, nor for any consequences ensuing from such errors. Use of the software, or of results derived from it, is entirely at the risk of the user.

Introduction

RAMAN (RAndomization MANova) is a program written to make tests of one-way hypotheses on multivariate data using randomization as simple to perform as possible. RAMAN will accept data sets of up to 2000 observations (cases or lines), with up to 1024 different variables in each observation and up to 100 different treatments. It uses rerandomization of the data to produce Monte Carlo tests of the hypothesis that all treatments in a one-way design have similar response vectors. It will also automatically perform all pairwise comparisons o
9. RAMAN
Multivariate one-way randomisation tests
Maximum 2000 cases, 1024 variables, 100 treatments
RAMANOVA Version 1.73
Copyright 1993, 1995, 1999 by Ross A. Alford
School of Tropical Biology, James Cook University
Townsville, Qld 4811, Australia
Ross.Alford@jcu.edu.au
All commercial rights reserved, all liability disclaimed
Based in part on algorithms from Manly, B.F.J. 1991. Randomization and Monte Carlo Methods in Biology
Press any key to continue, q to quit program

After you press a key you will see:

Data file selection
Enter name of file
Or press ENTER alone to select from all files matched by *.csv
Or enter another name with wildcards to select from list
Enter aq to quit this routine or ah for help
Filename: _

At this point you can type in the name of the data file and press the Enter key, or just press Enter and you will see a list of all the files that have the .csv extension in the current directory. Once you get this list, you can use the arrow keys on the keyboard to move until the desired filename is highlighted, then press Enter to select that file for analysis. You will now see the main RAMAN parameter selection screen, where you set up the analysis that is to be performed. It looks like this:

Parameters set to defaults
Input file contains header as first line
Output file: screen only
Number o
10. comma delimited file. An example of how the data should look in a spreadsheet is:

[Screenshot: testdata.csv open in Microsoft Excel, showing the treatment codes and the columns Avemass, Pctsurv and AveLP]

When this is saved in a comma delimited data file, it should look like this if loaded into a program such as Notepad:

Larval mass at metamorphosis percent survival to metamorphosis and length of larval period
Density,Avemass,Pctsurv,AveLP
High,284.67,23.78,36.21
Low,384.41,27.79,22.85
Medium,343.3,35.73,31.69
High,279.45,22.46,35.22
Low,453.14,30.7,27.49
Medium,340.66,32.75,35.33
High,292.84,22.58,42.41
Low,381.67,31.04,27.41
Medium,368.35,29.53,35.2
High,293.7,21.17,33.35
Low,383.97,31.14,25.56
Medium,366.91,29.69,23.46
High,272.41,25.93,36.24
Low,419.82,37.5,26.22
Medium,351.69,30,29.1
High,296.38,30.14,29.61
Low,385.52,35.04,26.59
Medium
11. d not contain blanks or punctuation, with the exception of the underline character, which can be substituted for blanks. The third through last rows of the worksheet should contain the data. The first column of each observation is the treatment code. This can be any alphanumeric value of up to 8 characters in length. It can contain underlines and a few other punctuation characters, including parentheses ( and ), but cannot contain embedded blanks and cannot be more than 8 characters long. Obviously, all observations in each treatment must share the same code, which must differ from the code for all other treatments. Observations do not need to be sorted by treatment. The second through the last columns of each observation should contain data values, which must be numeric. These may include decimal points, but should include nothing else except digits. Observations with missing values should be eliminated from the data set before attempting analysis. They cannot be included, and will cause the program to fail with an error message when it attempts to load the data.

Once the worksheet is organised as above, it should be saved using a filename of 8 characters or less, with the 3-character extension .CSV. It should be saved as a comma delimited list, not in the native format of the spreadsheet program being used. If you are using Excel, one of the options in the Save As dialog box will be to save the data as a
12. degrees of freedom to allow inversion of the H H matrix. If there are not, an error message is displayed that suggests turning this option off and rerunning the program.

k. Pressing K turns the calculation of the sum of log F statistic on or off. This statistic is simply the sum of ln(F) of the univariate ANOVAs calculated for each variable (column) of the data set. A large sum of log F suggests that the means of treatments differ significantly (see Manly 1991, 1997). The default for this statistic is off.

l. Pressing L presents three options. The default is to calculate and test the sum of Euclidean distances statistic. The sum of Euclidean distances is simply the sum, across all treatments, of the total distance in multivariate space between each data point (observation) and the centroid (treatment average) for that treatment. If the data points for treatments are clustered at different locations in multivariate space, this will be relatively small. If they are dispersed throughout multivariate space, this will be large. The sum of the squared Euclidean distances is similar, but instead of the absolute value of the distance of each point from its group centroid, this statistic sums the square of the distance. It can be more strongly affected by outlying points, and may be a poorer test statistic than the sum of the Euclidean distances (see Mielke and Berry 2001 and Cade and Richards 2001). A third option is to not
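To make the difference between the two distance statistics concrete, the following Python sketch computes either one from a list of observations and their treatment codes. It is an illustration written for this description, with a hypothetical function name and argument layout, and is not RAMAN's implementation.

```python
import math


def distance_statistic(rows, labels, squared=False):
    """Sum over all observations of the (optionally squared) Euclidean
    distance between the observation and its treatment centroid."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    total = 0.0
    for members in groups.values():
        # The centroid is the per-variable mean of the treatment's observations.
        centroid = [sum(col) / len(members) for col in zip(*members)]
        for row in members:
            d = math.dist(row, centroid)
            total += d * d if squared else d
    return total
```

Because squaring grows faster than the distance itself, a single observation far from its centroid contributes disproportionately to the squared version, which is the sensitivity to outlying points mentioned above.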
13. ept that the time and date have changed and a different random number seed is being used. The variables are now being centered and standardized to unit variance, and the values and P values for the three selected test statistics appear in the table. In this case, all three give results similar to one another and to the analysis based on Euclidean distance. It is clear that the treatments really do affect the responses.

Using And Interpreting Output Written To Disk Files

Although testing any single hypothesis will never cause more output to appear than will fit on the screen, you are likely to want to make permanent records of the outcomes of analysis, and possibly to print them out. To do this, use option c of RAMAN. For example, to analyze the data in testdata.csv, determine which of the three possible pairwise combinations of treatments differ significantly according to the sum of Euclidean distances statistic, and write the output into a file, run RAMAN and select testdata.csv for analysis. Then press c at the options screen, followed by 3, so the output will be written to a file called testdata.txt, which can easily be loaded into a word processor for examination and printing. Also press m, so that pairwise comparisons of treatments will be performed. Finally, press x to execute the analyses. If you are using a fast PC, the tests and output will flash by very quickly, too quickly to follow on the screen. Run Word or your favorite word processing prog
14. f randomizations: 20000
Random numbers seeded from clock
Data not ln(x+1) transformed
Data not presence/absence transformed
Data will be standardised using Euclidean distance: (xi - xbar) / mean of |xi - xbar|
Wilks' Lambda will not be calculated
The Sum of Log F statistic will not be calculated
Sum of Euclidean Distances statistic will be calculated and tested
Pairwise comparisons of treatments will not be performed
Permutation will never be used (not yet available)
The input data will not be displayed
Restore default parameters
Read parameters from a file
Save parameters to a file
Press letter to change parameter, x to start analysis, ESC to quit

Setting The Options

Pressing a key corresponding to any of the letters at the left of the screen will allow you to change the item shown to the right of the letter. Details of what will happen are:

a. Pressing A will take you back to the input data file selection menu, allowing you to choose a different data file for analysis.

b. Pressing B allows you to change the expected format of the input file. This option is included for backwards compatibility with other programs such as Brian Manly's software. It should normally not be changed, or if it is accidentally selected, option 2 (input file contains header as first line) should be selected.

c. Pressing C allows you to select where output from the program will be directed. Output is always
15. f treatments and give tests of the null hypothesis for each pair, both adjusted and unadjusted for multiple comparisons. The program makes it possible to test hypotheses that could not be tested using standard parametric techniques because insufficient degrees of freedom would be available, or because the data may not fit the assumption of multivariate normality of errors. The statistics that can be calculated and tested include the standard MANOVA test as given by Wilks' Λ, the hypothesis (see Manly 1991 or 1997) that the sum of log(F) taken over the univariate ANOVAs for each response variable is relatively large, and the hypotheses that the sum of the squared or unsquared Euclidean distances between observations and their group centroids is relatively small (Manly 1991 or 1997, Cade and Richards 2001, Mielke and Berry 2001). The sum of Euclidean distances statistic is particularly easy to deal with, as it has a simple and easily understood geometric interpretation (Cade and Richards 2001).

The basic algorithms used in the program were derived from FORTRAN source code included in the first edition of B.F.J. Manly's excellent book Randomization and Monte Carlo Methods in Biology, published by Chapman and Hall, London. For further reading, I would recommend that book or the second edition, now titled Randomization, Bootstrap and Monte Carlo Methods in Biology, also published by Chapman and Hall, London. I also recommend reading the manual for
16. f variables, observations, and treatments. Next, the program reports what technique was used, if any, to center and standardize the data by columns (variables) before analysis. The next line reports the degrees of freedom these data would have for a standard MANOVA. Finally, the table presents the calculated test statistic or statistics and the results of comparing it with the randomization distribution. In this case we did not change any options, so the program defaulted to standardizing the data using Euclidean distance and calculating only the sum of Euclidean distances statistic. The P value in this example shows that the calculated total Euclidean distance between observations and their group centroids was the smallest of the total of 20,000 distances calculated: the one based on the original data and 19,999 based on random reassignments of the observations among treatments. The 99% CL (confidence limits) for the P value are based on the binomial distribution, and mean that if we repeated this analysis a large number of times, we would expect that 99% of the time the P value would fall between these limits. If the upper 99% confidence limit is less than our selected threshold for rejection of the null hypothesis (usually α = 0.05) that the treatments do not differ, we can reject that null hypothesis with considerable confidence. It is very unlikely that a different run of the program would lead us to a different c
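One simple way to obtain binomial confidence limits of this kind is the normal approximation to the binomial distribution, sketched below in Python. The manual does not state which formula RAMAN uses, so treating it as a normal approximation is an assumption, and the function name is hypothetical. For P = 1/20000 = 0.00005 with 20,000 randomizations, this approximation gives limits that round to 0.00000 and 0.00018, which agree with the limits printed in the manual's sample output.

```python
import math
from statistics import NormalDist


def binomial_confidence_limits(p, n_randomizations, level=0.99):
    """Approximate confidence limits for a randomization P value.

    Uses the normal approximation p +/- z * sqrt(p * (1 - p) / n),
    truncated to the interval [0, 1].
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 2.576 for 99%
    half_width = z * math.sqrt(p * (1 - p) / n_randomizations)
    return max(0.0, p - half_width), min(1.0, p + half_width)
```

The half-width shrinks with the square root of the number of randomizations, so quadrupling the number of randomizations roughly halves the width of the interval.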
17. input data as it is read. This option can be useful to check that data are being read in correctly.

q. Pressing Q will reset all of the program's parameters to their default values.

r. Pressing R will cause the program to read parameter values from a file. This can be very useful if you are experimenting with the program or with how transformations affect the outcomes of analyses. You will be presented with three options: (1) don't read parameters after all, (2) choose the name of an input file, which acts similarly to the data file input screen, allowing you to enter a filename or choose from a list of possibilities, or (3) read from a parameter file with the same base filename as the data file being analyzed, but the extension .prm.

s. Pressing S will present you with three options: (1) write the current program parameters to a file with the same base filename as the data file being analyzed, but the extension .prm, (2) choose another file name, which acts like the usual file name choice screen, or (3) don't save the current parameters after all.

Finally, pressing X will execute the program, running the analysis as currently specified, or pressing the Esc key will stop the program. As the program runs, a bar will show the percentage of each set of randomizations that has been completed. After this reaches 100%, a table of output will appear.

Interpreting The Output

If you run the program using its default para
18. meters on the sample data set, you will get an output screen like that below, although it is possible that the P value may be slightly different, because it is likely that the random number seed will be somewhat different.

RAMANOVA Ver 1.73 multivariate randomisation test with 20000 randomisations
Random number seed from clock 9995
Carried out on 24 Feb 2002 at 00:21:37
Analysis for data from file testdata.csv
Larval mass at metamorphosis percent survival to metamorphosis and length of larval period
Variables Avemass Pctsurv AveLP
Treatments High Low Medium
3 variables 24 observations 3 treatments
Variables centred and standardised using Euclidean distance
99% CL for P
lower
C:\temp>_

The first line identifies the program and version, and reports how many random reorderings of observations among treatments were used to generate the randomization distribution(s) against which the test statistic(s) were compared. The second line reports what number was used to seed the random number generator, and the date and time on which the analysis was performed. The third line identifies the file from which the data were read, and the fourth (and possibly later) lines repeat the heading that was present on the first line of the data file. The next section specifies the names of the measured variables and the treatments, and summarizes the numbers o
19. mization tests was derived from the existence of this feature in the program StatXact version 4, published by Cytel Software (http://www.cytel.com).

Installing The Program

Download the file RAMAN.EXE and copy it into a directory that is on the DOS PATH. The C:\Windows and C:\Windows\command directories are good places to put it. If it is in one of these directories, it can be run from any other directory on your computer. Alternatively, you can put it into a directory not on the path, but you will then need to be in that directory to run the program. You may want to put this manual and the sample data file testdata.csv somewhere, perhaps in a new subdirectory reserved for working with RAMAN and its output files.

Setting Up The Data

It is simplest to set up your data using a spreadsheet such as Microsoft Excel. Cell A1 (top left) should contain text that describes the data: what experiment it was, what date it was collected on, or anything else useful. This can be up to 255 characters long. The second row of the spreadsheet should contain a name for the treatments in the first column. The word "treat" will do if you are not feeling more creative, but this cell must have something in it. In the second column through the last column, it should contain the names of the response variables that make up each response vector. These names must be 8 characters or less long, and should be made up of letters and numbers only. They shoul
20. onclusion. Since in this case the upper 99% confidence limit of 0.00018 is much less than 0.05, we would reject H0 and conclude that the treatments affected the locations of the points in the multivariate space defined by the variables measured.

If we ran the program again, but wanted to do more conventional multivariate tests, we could once again select the testdata file and, while the parameters screen is displayed, press i followed by 1 so that the data are standardized to unit variance, then press j so that Wilks' Λ will be calculated and tested, k so that the sum of log F statistic will be calculated and tested, and l followed by 2 so that the sum of the squared Euclidean distances will be calculated and tested. If we then press x, the output will look like:

Random number seed from clock 8888
Carried out on 24 Feb 2002 at 00:23:20
Analysis for data from file testdata.csv
Larval mass at metamorphosis percent survival to metamorphosis and length of larval period
Variables Avemass Pctsurv AveLP
Treatments High Low Medium
3 variables 24 observations 3 treatments
Variables centred and standardised to unit variance
s = 2  m = 0.0  n = 8.5
                                           99% CL for P
                                           lower     upper
Wilks lambda          0.090430   0.00005   0.00000   0.00018
Sum of log F          9.287080   0.00005   0.00000   0.00018
Sum (Euclid dist)^2  24.588053   0.00005   0.00000   0.00018

The top lines are the same exc
21. ram and load the file testdata.txt. You may have to specify that it be put into a non-proportionally spaced font such as Courier New, so that the spacing of the tabular output is not disrupted by proportional spacing of letters. After loading into your word processor, the output should look like that shown in Box 1 on the following page. The first section of the output reports the results of the overall test of significance, in a format similar to that shown on the screen, but including which version of the program produced the results. The remainder of the output presents the results of each pairwise comparison, including the value(s) of the test statistic(s) for that pair of treatments with the rest of the data ignored, the P value calculated by re-randomizing that subset of the data the indicated number of times and comparing the observed value of the statistic with the randomization distribution, the lower and upper 99% binomial confidence limits for the P value, and finally the P value adjusted using Sidak's adjustment for multiple comparisons. If only certain pairwise comparisons are considered as a result of preplanning which pairs of treatments to compare, or if you wish to preserve the comparisonwise error rate at α = 0.05, either the P value from the randomization distribution or the upper 99% confidence limit of that value is the one to use in deciding whether a pair of treatments differ significantly. If the pairwise comparisons are
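Sidak's adjustment has a simple closed form, shown in the Python sketch below alongside Bonferroni's adjustment for comparison. The function names are hypothetical, and taking the number of comparisons to be the number of treatment pairs (3 pairs for 3 treatments) is an assumption about how RAMAN applies the adjustment. It is consistent with the sample output, where an unadjusted P of 0.00015 is reported with a Sidak P of 0.00045.

```python
def sidak_adjust(p, n_comparisons):
    """Sidak-adjusted P value: 1 - (1 - p) ** k for k comparisons."""
    return 1.0 - (1.0 - p) ** n_comparisons


def bonferroni_adjust(p, n_comparisons):
    """Bonferroni-adjusted P value: k * p, capped at 1."""
    return min(1.0, p * n_comparisons)


def number_of_pairs(n_treatments):
    """Number of distinct pairs among n_treatments treatments."""
    return n_treatments * (n_treatments - 1) // 2
```

The Sidak-adjusted value is never larger than the Bonferroni-adjusted one, which is why the Sidak adjustment is described as slightly more powerful. For small P values the two are nearly identical.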
22. sent to the screen. Pressing the c key allows you to choose to send it to a file as well. The file will be in standard ASCII text format, and if loaded into a program like Word, it should be displayed and printed using a fixed-pitch font such as Courier New. The default is to send output to the screen only. You will probably want to send it to a file as well, unless you are running preliminary analyses.

d. Pressing D allows you to select how many random reorderings of the data will be carried out to create randomization distributions of the test statistic or statistics. The default is 20,000 reorderings, which is usually enough to lead to a reasonably narrow 99% confidence interval for test statistics. When a P value is very close to a critical value, you may want to increase this number. For slow computers and large data sets, you may want to decrease it, although less than 5000 is probably not a good idea.

e. Pressing E allows you to change how the random number generator used to reorder observations among treatments while generating randomization distributions is initialized. Normally the random number generator is initialized from the system clock, which means that it starts at an arbitrary point. The number that is used is reported in output for possible re-use, should this be necessary. When you press E, you will be prompted for a number, which should be either 0, meaning that the generator will be initialized from the clock, or an integer between 1 and 32767.
23. tivariate statistics use data that are standardized to mean zero, variance 1 before being analyzed. This removes effects of differing measurement scales among variables. The transformation used is

$Y = (X - \bar{X}) / S$

where $Y$ is a transformed observation, $X$ is an original observation, $\bar{X}$ is the mean of that variable taken over all treatments, and $S$ is the standard deviation of the variable taken over all treatments. The default for RAMAN is to use a different standardization, more suitable for use with the sum of Euclidean distances statistic, in which each value of each variable is standardized so that the mean of all observations of that variable is zero and the mean distance of all observations is one unit:

$Y = (X - \bar{X}) / \operatorname{mean}(|X - \bar{X}|)$

This preserves linear relationships among data better than the usual standardization. It is also possible to choose not to standardize data at all, which may lead to unpredictable consequences, since variables measured on different scales (for example, cm and mm) will be weighted differently in determining distances among points in multivariate space, but may be useful if all variables are intrinsically on a common scale and differences caused by different ranges need to be preserved.

j. Pressing J turns the calculation and testing of the Wilks' Λ (lambda) statistic on or off. The default is off. If Λ is to be calculated, the data should be standardized to unit variance, and there must be sufficient univariate
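The two standardizations described above can be compared side by side with a short Python sketch. It is an illustration only, with hypothetical function names. The unit-variance version uses the sample standard deviation (divisor n - 1), and the manual does not say whether RAMAN divides by n or n - 1, so that choice is an assumption.

```python
import statistics


def standardize_unit_variance(column):
    """Y = (X - mean) / S, giving mean zero and variance 1."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)  # sample SD; the divisor is an assumption
    return [(x - mean) / sd for x in column]


def standardize_mean_distance(column):
    """Y = (X - mean) / mean(|X - mean|), giving mean zero and a mean
    absolute distance from zero of one unit."""
    mean = statistics.fmean(column)
    mean_distance = statistics.fmean([abs(x - mean) for x in column])
    return [(x - mean) / mean_distance for x in column]
```

Each function is applied to one variable (column) at a time, using all observations regardless of treatment, so the standardization itself does not depend on how observations are assigned to treatments.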
