Tutorial Analyzing Affymetrix® Gene Expression data in GeneSpring
Contents
1. Figure 21. The Spreadsheet view shows normalized intensity values of the selected entity list for the selected interpretation. Also shown are selected annotations associated with each probe set.
   GeneSpring GX 9 Data Analysis Tutorial for Affymetrix data, Agilent Technologies
   Exercise 3: View data in the Scatter Plot View
   The scatter plot view can be useful in comparing global expression of entities between two samples or two experimental conditions. Doing so allows you to compare, at a high level, the effects of the experimental conditions on gene expression. The scatter plot will also allow you to qualitatively identify entities whose expression is significantly different between two samples or conditions. An entity list can be made from any selected entities within the plot.
   1. View data for the Congestive Heart Failure experiment in the scatter plot view.
   Select the All Entities list from the Analysis folder in the navigator. Select the CHF Etiology interpretation from the Interpretations folder in the navigator. Click View > Scatter Plot. By default, the scatter plot displays the normalized intensity values of each entity. The horizontal axis represents the first condition in the selected exper
2. Figure 19. Use the Order Parameter Values window to manipulate the order in which the conditions will be displayed for a particular experiment parameter.
   Figure 20. The profile plot of the active interpretation is shown with the new order of the conditions.
   Exercise 2: View expression data in Spreadsheet View
   The Spreadsheet View allows you to view the normalized intensity values for the entities in the entity list selected in the Navigator. Selected annotations for these entities are also displayed within the spreadsheet. The normalized intensity values reported in the Spreadsheet View are determined by the interpretation selected in the Navigator.
   1. Open the All Entities lis
3. Figure 43. Results from GO analysis. Only GO categories that satisfy the p-value cutoff will be displayed in the Spreadsheet. The Pie Chart displays how genes in the selected Entity List are categorized at a particular node within the GO Tree. Labels within the Pie Chart provide information such as the GO ID, the GO term, the number of genes found in the selected list that are also found in the category, and the p-value and corrected p-value calculated to indicate the significance of this enrichment.
   Exercise 2: Gene Set Enrichment Analysis (GSEA)
   GSEA is another analytical method that allows scientists to make biological interpretations of their gene expression data. In the above exercise, we only looked at genes that were found to be differentially expressed and asked whether there is a significant enrichment of these genes in a particular GO classification. GSEA interrogates genome-wide expression profiles from samples belonging
4. Figure 31. Results window from the 2-way ANOVA.
   [Screenshot: Save Entity List window. This window displays the details of the entity lists created as a result of statistical analysis. Legible fields include Experiment: Congestive Heart Failure; Selected Test: 2way ANOVA; P-value computation: Asymptotic; Multiple Testing Correction: Benjamini-Hochberg; Creation date and last modified date: Mon Jan 14 15:12:20 GMT+05:30 2008; Owner: gxuser; Technology: Affymetrix GeneChip HG-U133_Plus_2; Number of entities: 13632.]
5. Figure 32. Saves the entity list passing the cutoff, along with its details and annotations.
   Exercise 2: Find candidates for differential expression using the One-way ANOVA
   Results from the 2-way ANOVA showed that the parameter CHF Etiology best explains the differences in gene expression between the samples in the experiment, with little to no contribution from the other parameter, Gender. In addition, there is little to no interaction between the two parameters. Thus, for this analysis, we will disregard the Gender parameter and use the One-way ANOVA to identify genes that are differentially expressed between the three CHF Etiology conditions. A probe set with a significant p-value from the ANOVA has a statistically significant change in intensity value between at least two of the conditions tested. When comparing three or more conditions, the ANOVA alone does not reveal between which pairs or groups of conditions the probe set is differentially expressed. In such cases, a post hoc test can be applied to identify the pairs or groups of conditions between which significant changes occur. For this analysis, you will apply the One-way ANOVA and a post hoc test to the Congestive Heart Failure experiment. Only the pro
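The idea behind the One-way ANOVA described in this excerpt can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not GeneSpring GX's implementation: the function names are hypothetical, the intensity values are invented placeholders for a single probe set, and the pairwise mean differences only hint at what a post hoc test examines (a real post hoc test also assigns significance to each pair).

```python
from itertools import combinations
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def pairwise_mean_differences(named_groups):
    """Difference in means for every pair of conditions (illustration only)."""
    return {
        (a, b): mean(named_groups[b]) - mean(named_groups[a])
        for a, b in combinations(named_groups, 2)
    }


# Hypothetical log2 intensities for one probe set in three conditions.
probe = {
    "Non-failing": [1.0, 2.0, 3.0],
    "Idiopathic": [2.0, 3.0, 4.0],
    "Ischemic": [5.0, 6.0, 7.0],
}
f_stat, df_b, df_w = one_way_anova_f(list(probe.values()))
differences = pairwise_mean_differences(probe)
```

A large F value says only that at least two condition means differ; the pairwise comparison is what points to the conditions involved, which is the role the tutorial assigns to the post hoc test.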
6. Click on Change cutoff. In the p-value cutoff box, type 0.01 and hit Enter. Click Close. Note that the results have been updated to reflect the new corrected p-value cutoff.
   • Save each significant GO category as an Entity List.
     o In the GO Analysis (Step 2 of 2): Output views window, click Finish. The probe sets found in each category will be saved as an Entity List. Each Entity List will be named after the GO term associated with that category. All lists from the GO Analysis will be saved into a folder named GO analysis with p-value cutoff X, where X is the cutoff value used for the analysis. The saved lists will be subdivided into three folders corresponding to the three highest levels of the GO classification schema: Cellular Component, Molecular Function, and Biological Process.
     o Close the GO Analysis with p-value cutoff 0.01 folder by clicking on the minus sign next to the folder.
   [Screenshot: GO Analysis (Step 2 of 2), Output views window, listing the GO accession, GO term, p-value, corrected p-value, and count for the GO terms that satisfy the corrected p-value cutoff.]
7. Failure All Samples in the Name box. Click Finish.
   6. Inspect the Condition Tree (Figure 29) in the browser.
   The Condition Tree is saved and appears as an object in the Navigator. Once the Condition Tree is saved, it should be automatically displayed in the browser. If you close this view and would like to display it again, double-click on the Congestive Heart Failure All Samples Condition Tree object in the Navigator. Make sure that the All Entities list is selected in the Navigator, as this was the input list for the analysis. Remember that GeneSpring GX will only display entities in the Entity List selected in the Navigator.
   Condition trees display sample similarities as a dendrogram, a tree-like structure made up of branches. Shorter branches nest within longer ones until eventually one stem joins all branches. This nested structure forces all samples to be related at a certain level, with longer branches representing the more distantly related samples. The tree structure represents the relationship between the samples used in this analysis. Samples are grouped according to the similarity of their expression profiles across the probe sets in the Entity List used for the analysis. Note that samples group well according to the parameter CHF Etiology. Interestingly, samples of the Idiopathic condition are more similar in their expression profiles to Non-failing samples than to Ischemic samples.
   To manipulate the size of the Conditi
8. [Screenshot: GSEA (Step 5 of 5), Results from the GSEA window.] The results table shows those gene sets that pass the q-value cutoff. When pressing Finish, all gene sets that pass will be saved as entity lists. You can also save a subset of the results by selecting the gene sets and pressing Save Custom Lists, and then pressing Cancel to avoid also saving the complete set. In the example shown, the window displays 8 gene sets with a q-value less than 0.3, out of 272 gene sets containing 15 or more matching genes.
   Figure 46. The Results from the GSEA window displays the gene sets with significant q-values. All gene sets displayed in this window will be automatically saved as Entity Lists once you click the Finish button.
   Exercise 3: Perform pathway analysis on the genes of interest
   GeneSpring GX allows you to import and view BioPAX pathways. BioPAX is an open platform for the distribution of network and pathway information. More information regarding the BioPAX format can be found at ht
9. [Screenshot: table of probe set IDs with their p-values and corrected p-values for the Non-failing, Ischemic, and Idiopathic conditions.]
   Figure 33. Results window from the One-way ANOVA and post hoc test.
   7. Save the probe sets of interest from the One-way ANOVA and post hoc test.
   • A probe set with a significant p-value from the One-way ANOVA indicates that the intensity values associated with the probe set are statistically different between at least two of the CHF etiologies. However, you have no information about which two etiologies differ, or how many pairs differ. For example, the intensity values could be statistically different between non-failing and idiopathic; non-failing and ischemic; idiopathic and ischemic; or among non-failing, ischemic, and idiopathic. Results from the post hoc test will allow you to determine between which etiologies the intensit
10. p < 0.05.
   • Click Finish.
   • A folder named 2way ANOVA cutoff 0.05 will be saved to the Navigator. Within this folder will be the one Entity List saved from the 2-way ANOVA.
   [Screenshot: Statistical Analysis (Step 7 of 8), Results window.] To apply a new p-value cutoff, click on the Change cutoff button. To save entities that passed the applied cutoff, click Next. To save a subset of these entities as a custom entity list, select entities from the view and click the Save custom list button. Displaying 6803 entities out of 44566 satisfying p-value cutoff 0.05.
   Test Description: Selected Test: 2way ANOVA; P-value computation: Asymptotic; Multiple Testing Correction: Benjamini-Hochberg.
   Result Summary (column headers reconstructed from the garbled screenshot text):
      Row                                   P all    P < 0.05   P < 0.02
      Corrected p-value (CHF Etiology)      44566    6803       1635
      Corrected p-value (CHF Etiolog...)    44566    0          0
      Corrected p-value (Gender)            44566    0          0
      Expected by chance                             2228       891
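The Benjamini-Hochberg correction named in the results window, and the "Expected by chance" row, can be illustrated with a short Python sketch. This is a generic textbook version of the procedure, offered as an assumption about what the software computes rather than a description of its internals; the function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg corrected p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so corrected values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def expected_by_chance(n_tests, cutoff):
    """Number of tests expected to pass an uncorrected cutoff by chance."""
    return n_tests * cutoff


corrected = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
chance_005 = expected_by_chance(44566, 0.05)
chance_002 = expected_by_chance(44566, 0.02)
```

With 44566 probe sets, the expected counts come to roughly 2228 at a cutoff of 0.05 and 891 at a cutoff of 0.02, which appears consistent with the two numbers shown in the "Expected by chance" row.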
11. CHF Etiology but not Gender. This indicates that the parameter CHF Etiology explains more of the variance in gene expression data across the samples than the parameter Gender.
   • Looking at the various tables and plots, we decide that the samples in this experiment are of acceptable quality for further analysis.
   • Click the Close button to close the Quality Control on Samples results window.
   [Screenshot: Quality Control window with PCA Scores (PCA Component 1 against PCA Component 2, colored by Gender and shaped by CHF Etiology), Correlation Coefficients, Correlation Plot, Internal Controls, and Hybridization Controls panels.]
   Figure 28. The Quality Control window shows values for various metrics that are used to gauge sample quality.
   Exercise 2: Use the Hi
12. Figure 24. The Search Entities tool allows you to search for a specific probe set based on a number of annotation criteria.
   • Click Next >>.
   • The Search Entities (Step 2 of 3): Output views window should now display the search results. Click Next >> to save these entities as an Entity List.
   3. Create an Entity List containing the probe sets identified in the search.
   • GeneSpring GX will create an Entity List for the search results, as in Figure 25. Type Entities search result for GATA4 in the Name box and click Finish.
   • Select the Entities search result for GATA4 list from the Navigator. These probe sets should now be displayed in the Profile Plot. Note that the profiles of these probe sets are quite different.
   [Screenshot: Search Entities (Step 3 of 3) window showing the details of the new Entity List, including Technology: Affymetrix GeneChip HG-U133_Plus_2 and Number of entities: 5.]
   Figure 25. Save the search resu
13. Entity List and Interpretation window, click Next >>.
   3. In the Filter by Expression (Step 2 of 4): Input Parameters window, set the filtering criteria.
   • Range of interest:
     o Upper percentile cutoff: 100
     o Lower percentile cutoff: 20
     o For this analysis, we assumed that if a gene is expressed in the sample, the signal intensity value for the probe set representing the gene would be greater than the 20th percentile of all signal intensity values of the sample.
   • Retain entities in which:
     o At least 100% of the values in any 1 out of 6 conditions are within range.
     o If probe sets were filtered such that they must have values within the range in all 6 conditions, potentially interesting genes that are not expressed in one or several experimental conditions would be excluded. Thus, potentially interesting biological changes between experimental conditions could be missed. To decrease the chances of missing these changes, we decreased the stringency of the filter such that a probe set will pass the filter even if the gene is expressed in all of the samples of only one experimental condition.
   • In the Filter by Expression (Step 2 of 4): Input Parameters window, click Next >>.
   4. In the Filter by Expression (Step 3 of 4): Output Views of Filter by Expression window, preview the filtering results. See Figure 30.
   • This window displays
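The filtering rule described in this excerpt (keep a probe set if 100% of its values fall within the 20th to 100th percentile range in at least one condition) can be sketched as follows. This is a minimal sketch under assumptions: the function names, the data layout, and the linear-interpolation percentile are all hypothetical choices, and GeneSpring GX may compute percentiles differently. The sample and probe names are placeholders.

```python
def percentile(values, pct):
    """Percentile by linear interpolation (an assumed method)."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    pos = (len(ordered) - 1) * pct / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def filter_by_expression(data, conditions, lower_pct=20, upper_pct=100):
    """data: {sample: {probe: value}}; conditions: {condition: [samples]}.

    Keep a probe if, in at least one condition, every sample's value lies
    within that sample's own [lower_pct, upper_pct] percentile range.
    """
    bounds = {}
    for sample, values in data.items():
        intensities = list(values.values())
        bounds[sample] = (
            percentile(intensities, lower_pct),
            percentile(intensities, upper_pct),
        )
    kept = []
    for probe in next(iter(data.values())):
        for samples in conditions.values():
            if all(
                bounds[s][0] <= data[s][probe] <= bounds[s][1]
                for s in samples
            ):
                kept.append(probe)
                break  # one passing condition is enough
    return kept


# Placeholder data: two conditions with two samples each, five probes.
data = {
    "s1": {"p1": 2, "p2": 3, "p3": 4, "p4": 5, "p5": 1},
    "s2": {"p1": 1, "p2": 3, "p3": 4, "p4": 5, "p5": 2},
    "s3": {"p1": 3, "p2": 2, "p3": 4, "p4": 5, "p5": 1},
    "s4": {"p1": 3, "p2": 2, "p3": 4, "p4": 5, "p5": 1},
}
conditions = {"A": ["s1", "s2"], "B": ["s3", "s4"]}
kept = filter_by_expression(data, conditions)
```

In the placeholder data, probe p1 fails condition A but passes condition B, so it is retained, while p5 falls below the cutoff in at least one sample of every condition and is dropped. This is the relaxed stringency the tutorial argues for.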
14. [Screenshot: New Experiment (Step 3 of 4), Summarization Algorithm window.] Select a summarization algorithm from the dropdown list and the baseline transformation to create a new experiment with normalized expression values. Summarization Algorithm: RMA. Baseline Transformation options: Do not perform baseline transformation; Baseline to median of all samples; Baseline to median of control samples (Choose samples).
   Figure 7. The New Experiment: Summarization Algorithm window allows you to select the normalization and baseline transformation methods to apply to the experiment.
   4. Download the Technology needed to import data into GeneSpring GX.
   • If the technology for the dataset has not already been installed in GeneSpring GX, you will be prompted to install it. Upon clicking Yes, the technology will be downloaded from the Agilent server. See Figure 8.
   [Screenshot: Technology not found dialog. Technology Affymetrix GeneChip HG-U133_Plus_2 was not found. Do you want to download it now?]
   Figure 8. This window allows you to download the technology for the dataset.
   5. View the newly created experiment.
   • Once an experiment has been created from the imported data files, a BoxWhisker Plot view of the data automatically opens. Each BoxWhisker shows the distribution of
15. changes in expression across the samples within this experiment can be due to differences in CHF Etiology, differences in Gender, or the interaction between CHF Etiology and Gender. To determine the contribution of each parameter to the changes in gene expression across the samples, you will apply the two-way ANOVA to the Congestive Heart Failure experiment.
   1. Activate the Statistical Analysis tool.
   • In the Workflow panel, open the Analysis section and click on the Statistical Analysis link.
   2. In the Significance Analysis (Step 1 of 8): Input Parameters window, select the Entity List and Interpretation to be used for statistical analysis.
   • Click the Choose button to select the Entity List for the analysis.
     o From the Analysis folder, select the QC probe sets list and click OK.
   • Click the Choose button to select the Interpretation for the analysis.
     o From the Interpretations folder, select the CHF Etiology Gender interpretation and click OK.
   • Click Next >>.
   3. In the Significance Analysis (Step 2 of 8): Select Test window, select the statistical test to be performed.
   • Select 2-way ANOVA from the Select test drop-down menu.
   • Click Next >>.
   4. In the Significance Analysis (Step 5 of 8): p-value Computation window, select the p-value computation method.
   • P-value Computation: Asymptotic
   • Multiple Testing Correction: Benjamini-Hochberg FD
16. conditions, if applicable to the study. Therefore, interpretations allow alternative analysis approaches.
   Starting GeneSpring GX
   Upon launching GeneSpring GX for the first time, a Demo Project will automatically open. This project, created using the Agilent One-color technology, contains an experiment called HeLa cells treated with compound X and data objects derived from analysis of this data. If you would like to be guided through the analysis of this dataset, please refer to the Quick Start Guide, which can be accessed from GeneSpring GX > Help in toolbar > Document Index > Quick Start Guide. For the purpose of this tutorial, we will use the data in the Demo Project to become familiar with the GeneSpring GX interface.
   1. Start up GeneSpring GX.
   • Double-click the GeneSpring GX icon on the desktop.
   2. Open the Demo Project.
   • If this is your first time launching GeneSpring GX, the Demo Project and the HeLa cells treated with compound X experiment will automatically open. If you have previously launched GeneSpring GX, go to Project > Open Project, select Demo Project, and click Open.
   • A GeneSpring GX window should appear with the name of the project (Demo Project) shown in the upper left-hand corner of the window, below the Project Navigator bar. See Figure 1. For the Demo Project, the HeLa cells treated with compound X experiment will be automatically opened.
17. corner of the view to close the window. Close all of these views before proceeding with the rest of the tutorial.
   In the Profile Plot view, each continuous line corresponds to a single probe set's normalized intensity value (y-axis) for each condition (x-axis) within the Congestive Heart Failure experiment. GeneSpring GX uses log base 2 of the intensity values for calculations and for display. In GeneSpring GX, data is generally normalized and baseline transformed to center values around a baseline of 0. Therefore, a normalized value of 0 represents the baseline level of probe set intensity; values greater than 0 represent upregulated probe sets, and values less than 0 represent downregulated probe sets.
   1. View data for the Congestive Heart Failure experiment in Profile Plot view.
   • In the navigator pane, click the All Entities list in the Analysis folder and the CHF Etiology Gender interpretation within the Interpretations folder.
   • From the menu, click View > Profile Plot. See Figure 17.
   • Close the Profile Plot.
   [Screenshot: GeneSpring GX 9 main window showing the Congestive Heart Failure experiment.]
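To make the log base 2 scale and the baseline of 0 concrete, here is a small Python sketch of one of the baseline options offered when an experiment is created, "Baseline to median of all samples". It assumes raw (unlogged) intensities as input and a per-probe median across samples; the function name and the intensity values are hypothetical, and the software's actual processing order may differ.

```python
import math
from statistics import median


def log2_baseline_to_median(raw):
    """raw: {probe: [intensity per sample]}.

    Returns {probe: [log2 intensity minus the probe's median log2 value]},
    so that 0 marks the baseline for each probe.
    """
    result = {}
    for probe, intensities in raw.items():
        logged = [math.log2(v) for v in intensities]
        baseline = median(logged)
        result[probe] = [v - baseline for v in logged]
    return result


# Placeholder intensities for one probe set across three samples.
normalized = log2_baseline_to_median({"probe_a": [100.0, 200.0, 400.0]})
```

Because the scale is log base 2, a normalized value of 1 corresponds to a twofold increase over the baseline and a value of -1 to a twofold decrease.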
18. individual signal intensity value for each entity in each sample will be used for display.
     o Click Next >>.
   • Save the new interpretation as CHF Etiology (Step 3 of 3).
     o GeneSpring GX will give each object created a default name. However, this can be changed to a name of your choice. In the Name box, type CHF Etiology. See Figure 15. Click Finish.
   [Screenshot: Create Interpretation (Step 1 of 3), Select parameters window.] An Interpretation specifies how samples will be grouped into experimental conditions for display and used for analysis. Select the parameter(s) to group samples by. All samples with the same parameter values will be grouped into an experimental condition. Experiment parameters listed: CHF Etiology, Gender.
   Figure 13. The Create Interpretation: Select parameters window allows you to select the experiment parameter(s) to group samples by.
   [Screenshot: Create Interpretation (Step 2 of 3), Select conditions window.] Select the conditions defined by the selected parameter(s) to include in the interpretation. Samples within a condition are considered as replicates, and for each entity the average intensity value across replicates will be used for visualization and analysis. Unselect conditions to exclude. Conditions listed: Idiopathic, Ischemic, Non-Failing. Option: Average over replicates in conditions.
   Figure 14. The Create In
19. since the interpretation defines how samples are grouped as replicates into experimental conditions. For each entity, the average intensity values across the replicates are used for display and analysis.
   If you are familiar with previous versions of GeneSpring GX, you will notice that a key difference in GeneSpring GX 9.0 is that you can have multiple views of your data open at one time. For example, you can have a scatter plot view of the expression data displayed at the same time as a profile plot of the same data. The advantage of this is that you can simultaneously view your data in multiple ways. The views are also linked, in that selecting entities in one view will select the same entities in all the other open views. However, unless you are diligent about closing views that are no longer needed, you can end up with many windows open.
   Exercise 1: View expression data in a Profile Plot
   As we were creating the different interpretations in the previous section, a profile plot for each interpretation was automatically generated and displayed. Thus, at this point of the tutorial, you already have views open in the browser. You may not see multiple views if a view has been maximized to fit the browser; in this case, the views are stacked entirely upon each other. Click on the red X in the upper right-hand
20. the pathway, grab and move the pathway on the canvas, center the pathway in view, and select multiple ways to organize the network pathway. All of these actions can be accessed through the icons within the pathway view window. Take some time to try these various actions.
   • Also note that the legend for the pathway can be found in the panel below the Workflow panel.
   [Screenshot: pathway view for the Congestive Heart Failure experiment, with a legend listing Complex, Protein, and Small Molecule node types.]
   Figure 49. The pathway view in GeneSpring GX. Nodes outlined in blue are those that are represented by entities in the currently selected Entity List.
   7. Import the other pathways into the Congestive Heart Failure experiment.
   • Any pathways that have been imported into the GeneSpring GX database can be searched for and subsequently added to the active experiment.
   • Search for the pathways to add.
     o From the menu, go to Search > Pathways.
     o In the Search Wizard (Step 1 of 3): Search Parameters window, leave the Search keyword box empty and click Next >>. This will command GeneSpri
21. the results of Hierarchical clustering analysis.
   • In the Clustering (Step 4 of 4): Object Details window, type Hierarchical Combined Tree of significant 1.5 fold change probe sets in the Name box.
   • Click Finish.
   6. Inspect the 2-D dendrogram. See Figure 41.
   The combined entity and condition tree is automatically displayed in the browser. If you closed this view and want to display it again, double-click on the Hierarchical Combined Tree of significant 1.5 fold change probe sets tree in the Navigator. Make sure that the Fold change greater than 1.5 in Non-failing vs Ischemic or Non-failing vs Idiopathic Entity List is selected in the Navigator. This was the input list for the generation of the tree. Selecting this Entity List while viewing the tree will instruct GeneSpring GX to show all the probe sets used for the analysis.
   The horizontal tree structure (Condition tree) represents the relationship between the samples used in this analysis. Samples are grouped according to the similarity of their expression profiles across the probe sets in the Entity List used for the analysis. The vertical tree structure (Entity tree) represents the relationship between the probe sets used in this analysis. Probe sets are grouped according to the similarity of their expression profiles across the samples selected for analysis.
   To manipulate the size of the c
22. to two different classes (e.g., normal and tumor) and determines whether genes in an a priori defined gene set correlate with class distinction. A gene set is defined as a group of genes that share a common biological function, chromosomal location, or regulation. First, genes are ranked based on the correlation between their expression intensities and class distinction. As a result, genes that differ most in their expression between the two classes will appear at the top and bottom of the list. The assumption is that genes related to the phenotypic distinction of the classes will tend to be found at the top and bottom of the list. An enrichment score (ES) is then calculated to reflect the degree of overrepresentation of genes in a particular gene set at the top and bottom of the entire ranked list. A p-value is then derived for the ES to estimate its significance level. The p-value is then adjusted for multiple hypothesis testing.
   1. Download the gene sets from the Broad Institute.
   • Download all four gene sets (C1, C2, C3, and C4) to a local directory from the following website: http://www.broad.mit.edu/gsea
   2. Import the gene sets into GeneSpring GX.
   • In the Workflow panel, open the Utilities section and click on the Import BROAD Lists link.
   • Select the file you would like to import and click Open.
   3. Activate the GSEA tool.
   • In the Workflow panel, ope
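The enrichment score described in this excerpt can be illustrated with a simplified running-sum statistic. The sketch below is an unweighted, Kolmogorov-Smirnov-style variant chosen for brevity; it is an assumption for illustration and not necessarily the weighting used by GeneSpring GX or the Broad Institute software. The function name and gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score over a ranked gene list.

    Walk down the ranked list, stepping up for each gene in the set and
    down for each gene outside it; the score is the running sum's largest
    deviation from zero (positive or negative).
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the list without covering it")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Genes ordered from most correlated with one class to most correlated
# with the other (placeholder labels).
ranked = ["a", "b", "c", "d", "e", "f"]
es_top = enrichment_score(ranked, {"a", "b"})
es_bottom = enrichment_score(ranked, {"e", "f"})
```

A gene set concentrated at the top of the ranked list gives a score near +1 and one concentrated at the bottom gives a score near -1, while a set scattered evenly through the list stays close to 0.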
23. [Screenshot residue: Spreadsheet view listing probe sets (for example 222658_s_at, 203655_at, and 201088_at) with their normalized intensity values and annotation columns for UniGene ID, gene symbol, gene title, and Entrez Gene ID.]
24. Figure 11. The Experiment Grouping window displays the experiment parameter(s) values associated with each sample within the experiment.
   [Screenshot: Add/Edit Experiment Parameter window.] Grouping of Samples: Samples with the same parameter values are treated as replicate samples. To assign replicate samples their parameter values, select the samples, click on the Assign Values button, and enter the value for the group. In the example shown, the parameter name is CHF Etiology and the value Non-Failing is being entered for the selected samples.
   Figure 12. Define a new experimental parameter and assign parameter values using the Add/Edit Experiment Parameter window.
      Sample Name     CHF Etiology   Gender
      PA-N_249.txt    Non-failing    Female
      PA-N_300.txt    Non-failing    Male
      PA-N_322.txt    Non-failing    Male
      PA-N_326.txt    Non-failing    Female
      PAD_10.txt      Idiopathic     Male
      PAD_4.txt       Idiopathic     Female
      PAD_7.txt       Idiopathic     Female
      PAD_9.txt       Idiopathic     Male
      PAS_3.txt       Ischemic       Female
      PAS_6.txt       Ischemic       Female
      PAS_7.txt       Ischemic       Male
      PAS_8.txt       Ischemic       Male
   Table 1. Experimen
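The rule that samples with the same parameter values are treated as replicates can be sketched in Python using the parameter assignments from Table 1. The function name and data layout are hypothetical, and the sample name spellings are reconstructed from the garbled table text, so treat them as approximate.

```python
from collections import defaultdict

# (CHF Etiology, Gender) for each sample, as listed in Table 1.
SAMPLES = {
    "PA-N_249": ("Non-failing", "Female"),
    "PA-N_300": ("Non-failing", "Male"),
    "PA-N_322": ("Non-failing", "Male"),
    "PA-N_326": ("Non-failing", "Female"),
    "PAD_10": ("Idiopathic", "Male"),
    "PAD_4": ("Idiopathic", "Female"),
    "PAD_7": ("Idiopathic", "Female"),
    "PAD_9": ("Idiopathic", "Male"),
    "PAS_3": ("Ischemic", "Female"),
    "PAS_6": ("Ischemic", "Female"),
    "PAS_7": ("Ischemic", "Male"),
    "PAS_8": ("Ischemic", "Male"),
}


def group_into_conditions(samples, parameter_indexes):
    """Group samples that share the same values for the chosen parameters."""
    conditions = defaultdict(list)
    for name, values in samples.items():
        key = tuple(values[i] for i in parameter_indexes)
        conditions[key].append(name)
    return dict(conditions)


# Group by CHF Etiology only, then by CHF Etiology and Gender together.
by_etiology = group_into_conditions(SAMPLES, [0])
by_etiology_and_gender = group_into_conditions(SAMPLES, [0, 1])
```

Grouping by CHF Etiology alone yields three conditions of four replicates each, while grouping by both parameters yields six conditions of two replicates each.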
25. Figure 27. The Entity Inspector window shows various information for a specific probe set, such as annotations, intensity values for each sample or condition, and the expression profile.

Section 4: Perform Quality Control on Samples

Although much of the quality control process should occur even before samples are hybridized to a microarray, several tools in GeneSpring GX can be used for quality control assessment after the gene expression data have been imported. Using these tools, outlying samples can be detected, allowing you to decide whether or not to include these samples in subsequent analyses.

Exercise 1: Perform quality control on samples

The Quality Control on Samples tool allows you to assess sample quality using various criteria, including the Internal Control 3'/5' ratio, hybridization control plots, the sample correlation matrix, and Principal Components Analysis on samples. If a poor quality sample is detected and you would like to remove it from your experiment, select the sample from any of the displayed plots and click on the Add/Remove button. If a sample is removed, re-summarization of the remaining samples will be performed.

The Internal Control 3'/5' ratio gives an indication of the integrity of the starting RNA and the efficiency of the first-strand cDNA synthesis. You shoul
26. [Screenshot: GeneSpring GX 9 main window for the project "HeLa cells treated with compound X", showing the Project Navigator, the Workflow panel, and a Box Whisker Plot of normalized intensity values for the All Samples interpretation.]

Figure 1. The GeneSpring GX window displays the name of the project (Demo Project) that the window represents below the Project Navigator bar.

3. Activate the Profile Plot view for the experiment.
   • From the menu, select View > Profile Plot.
   All views can be accessed from two places. The first is from the menu, by going to View and selecting the desired view. The other is by clicking on the individual view icons below the menu. For the purpose of this tutorial, you will be instructed to select views from the menu.

4. Explore the GeneSpring GX interface.
   • As you go through the tutorial, you will need to use different parts of the GeneSpring GX application wind
27. Click Next >>.

5. View results of the 2-way ANOVA.

The Significance Analysis (Step 7 of 8): Results window (Figure 31) reports the results from the 2-way ANOVA in several displays. For an explanation of each result display, consult the GeneSpring GX User Manual. GeneSpring GX will save three Entity Lists from the analysis: one containing probe sets that have a significant p-value for the parameter CHF Etiology, one containing probe sets that have a significant p-value for the parameter Gender, and one containing probe sets that have a significant interaction p-value. Empty Entity Lists (with zero entities) will not be saved.

In one of the results displays, these three lists are automatically projected into the Venn Diagram, allowing you to compare their content. From the Venn Diagram you can identify interesting probe sets. For example, you may be interested in probe sets that are differentially expressed across CHF Etiology conditions but not across Gender. Probe sets from any region of the Venn Diagram can be saved as an Entity List. To do so, select the region of interest in the Venn Diagram and click on the Save custom list button.

Note that all of the significant probe sets were found to be differentially expressed only between CHF Etiology conditions. These results indicate that the differences in CHF Etiology conditions of these samples account for the variance in gene expression across the samples. Diff
28. [Screenshot: Fold Change (Step 3 of 4): Fold Change Results window. Window text: "Probesets that satisfy a fold change cutoff of 2.0 in at least one condition pair are displayed by default. To change the fold change cutoff, click the Change cutoff button, enter the required cutoff, and rerun. To save a custom entity list, select entities from the view and click the Save custom list button." Status line: displaying 1230 out of 10905 entities with fold change cutoff 2.0 in 1 out of 2 condition pairs. The Profile Plot By Group shows the conditions Non-failing, Ischemic, and Idiopathic on the X axis.]

Figure 37. Results window for fold change analysis.

Exercise 4: Find other genes with similar expression profiles to a target gene

The Find Similar Entities tool allows you to identify probe sets with expression profiles similar to that of a selected target probe set. It is thought that genes with similar expression profiles may share similar biological functions. At the beginning of this tutorial, we looked at the expression level of GATA4, a gene that encodes a transcription factor that modulates the expression of other genes implicated in congestive heart failure. We will use the Find Similar Entities tool to identify genes that have expression profiles similar to GATA4, as they may also play an important role in
29. This action should move the Gender column to the left of the CHF Etiology column. See Figure 18.
   o Click on the Re-order parameter values icon.
   o Within the Order Parameter Values window, select a condition defined by the parameter Gender and use the up and down arrows on the right-hand side of the window to move the conditions into the right order. See Figure 19. The order from top to bottom in this window will be the order from left to right on a profile plot. Thus, Female should be listed first, then Male. Click OK.
   o Now we will order the conditions within the CHF Etiology parameter. Click in any of the parameter value cells for CHF Etiology and repeat the steps for ordering the conditions. The order should be Non-failing, Ischemic, then Idiopathic. Click OK.
   o In the Experiment Grouping window, click Close.

3. View data for the Congestive Heart Failure experiment in Profile Plot view.
   • In the navigator pane, click the All Entities list in the Analysis folder and the CHF Etiology Gender interpretation within the Interpretations folder.
   • From the menu, click View > Profile Plot. See Figure 20.
   • Verify that the conditions are in the following order, going from left to right on the X axis: Female Non-failing, Female Ischemic, Female Idiopathic, Male Non-failing, Male Is
30. Tutorial: Analyzing Affymetrix Gene Expression data in GeneSpring GX 9
Agilent Technologies

Introduction to Tutorial

This tutorial provides a hands-on exploration of the variety of GeneSpring GX functionalities by guiding you through the analysis of an Affymetrix gene expression microarray dataset. In doing so, this tutorial aims to demonstrate how to use the tools available in GeneSpring GX to answer biological questions relevant to the experimental design.

Understanding GeneSpring GX Terminology

Some terms used in the general biological research community have a more specialized use in GeneSpring GX. A brief definition of each is provided below to clarify the tutorial instructions. More terminology can be found in the GeneSpring GX User Manual.

A project is the primary workspace, which contains a collection of experiments. The ability to combine experiments into a project in GeneSpring GX allows for easy interrogation of cross-experimental results. For example, you may want to visualize how genes that were found to be differentially expressed in one experiment are behaving in another experiment within the project. A project could have multiple experiments that are run on different technologies, and possibly different organisms as well.

A technology in GeneSpring GX contains information on the array design as well as biological information about all the entities on a specific array type. Technology refers to this package of inform
31. assification icon to view the clustering results. See Figure 42.
   • The goal of clustering analysis is to group probe sets with similar expression profiles into a cluster. Though intracluster variability can always be decreased by increasing the number of clusters to be generated, doing so may lead to creating clusters that share similar expression profiles. In that case, we are starting to separate probe sets with similar expression profiles into different clusters. This is not desirable, as probe sets representing genes with similar biological functions may now be separated into different clusters.

7. Create an Entity List for each cluster generated.
   • GeneSpring GX can generate an Entity List for each cluster in a Classification. Doing so will allow you to interrogate more closely the genes that share similar expression profiles in your experiment.
      o Right-click on the K-Means with 5 clusters Entity Classification object in the Navigator.
      o Select Expand as Entity List.

8. Close the Profile Plot view window for the K-Means with 5 clusters Entity Classification.

[Screenshot: K-Means with 5 clusters Entity Classification, showing one profile plot for each of Cluster 1 through Cluster 5.]
32. ation available for each array type; for example, Affymetrix HG-U133 Plus 2 is one technology, Agilent 12097 Human 1A is another, and so on. An experiment comprises samples which all belong to the same technology. A technology initially must be installed for each new array type to be analyzed.

An entity is a discrete feature measured by microarray analysis, such as a probe or probe set.

A sample contains data from a microarray run for a single biological source.

An experiment is a collection of samples used for a particular research study that are to be analyzed as a set. In GeneSpring GX, an experiment consists of multiple interpretations, which group these samples by user-defined parameters.

A parameter is a variable in experiments, such as treatment type, tissue type, time, or dose.

Parameter values are values assigned to experiment parameters. For example, Day 14 could represent a parameter value of the experiment parameter Time.

A condition consists of one or more samples that represent a common biological state. For example, if you have serum from 3 different patients with cancer, these serum samples describe the tumor condition. The normal condition is accordingly represented by a different set of serum samples from healthy patients.

Multiple interpretations can be made from the same experiment data. Interpretations group samples into different
33. ation selected in the Navigator.
      o Plot tab: This window displays the profile plot for the selected probe set. Conditions displayed on the X axis are defined by the interpretation selected in the Navigator.
   • Click OK to close the Entity Inspector.

5. Close the Profile Plot.

[Screenshot: GeneSpring GX 9 main window for the Congestive Heart Failure project, showing a Profile Plot of normalized intensity values across the Non-failing, Ischemic, and Idiopathic conditions, colored by CHF Etiology.]

Figure 26. Expression data for the probe sets representing the GATA4 gene.

[Screenshot: Entity Inspector window for probe set 1570276_a_at, with Annotation, Data, and Plot tabs. Technology: Affymetrix GeneChip HG-U133_Plus_2. The Annotation tab lists:
   Probe Set ID: 1570276_a_at
   Unigene (Avadis): Hs.243987
   Gene Symbol: GATA4
   Gene Title: GATA binding protein 4
   Entrez Gene: 2626
   GO (Avadis): GO:0006350, GO:0006350, GO:0006355]
34. atistical analysis and grouping of dissimilar expression profiles in clustering analyses. While different methods exist to remove probe sets with unreliable measurements, for this dataset you will use the Filter Probesets by Expression tool to remove these probe sets and produce a list of quality probe sets that will be used for subsequent analyses.

Exercise 1: Filter for probe sets with reliable intensity measurements

The aim of this filtering is to remove low-intensity signals of genes that are not expressed. For this dataset, it was determined that intensity values below the 20th percentile in each sample likely represent signal intensity values corresponding to genes that are not expressed.

1. Activate the Filter Probesets by Expression tool.
   • In the Workflow panel, open the Quality Control section and click on the Filter Probesets by Expression link.

2. In the Filter by Expression (Step 1 of 4): Entity list and Interpretation window, select the Entity List and Interpretation to use for the analysis.
   • Click the Choose button to select the Entity List for the analysis.
      o From the Analysis folder, select the All Entities list and click OK.
   • Click the Choose button to select the Interpretation for the analysis.
      o From the Interpretations folder, select the CHF Etiology Gender interpretation and click OK.
   • In the Filter by Expression Step 1 of 4
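The 20th-percentile filtering idea described in Exercise 1 can be sketched outside GeneSpring GX as follows. This is a minimal illustration in plain Python, not GeneSpring GX's own implementation: the function names, the nearest-rank percentile definition, and the rule "keep a probe set if it exceeds the cutoff in at least one sample" are all assumptions made for the example.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a list of numbers (assumed definition)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def filter_by_expression(samples, pct=20.0):
    """Return probe set IDs whose intensity exceeds the per-sample
    percentile cutoff in at least one sample (hypothetical retention rule).

    `samples` maps sample name -> {probe set ID -> intensity}.
    """
    cutoffs = {
        name: percentile(list(intensities.values()), pct)
        for name, intensities in samples.items()
    }
    probe_ids = list(next(iter(samples.values())).keys())
    return [
        probe
        for probe in probe_ids
        if any(samples[name][probe] > cutoffs[name] for name in samples)
    ]
```

Each sample gets its own cutoff because overall intensity can differ from array to array; a single global threshold would treat brighter arrays more leniently.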
35. [Screenshot: Find Similar Pathways results window with a Similar Pathways panel and a Non-similar Pathways panel, each listing pathways with node counts and p-values, and a Change cutoff button.]

Figure 47. This window displays the results of the Find Similar Pathways analysis. Pathways satisfying the cutoff are listed in the left panel, while pathways in which GeneSpring GX cannot match a single entity in the experiment to the pathway are listed in the right panel.

4. Change the p-value cutoff for the Find Similar Pathways analysis.
   • In the Find Similar Pathways (Step 2 of 2): Results window, click the Change cutoff button.
   • In the Change P-Value Cutoff box, enter 0.5, hit the Enter key, and click OK.
   • The Find Similar Pathways (Step 2 of 2): Results window should now be updated with the new p-value cutoff. See Figure 48.
   • The IL-7 pathway should now show in the Similar Pathways panel.

[Screenshot: Find Similar Pathways (Step 2 of 2): Results window. Window text: "Pathways showing significant overlap with entities in the entity list selected for the analysis are displayed in the left-hand spreadsheet. To modify the level of significance, click on the Change Cutoff button and enter a new p-value cutoff. To import significant pathways into the experiment, select the pathways and click the Custom Save button. Pathways in which a match cannot be made for any entities on the array are listed in the righ
36. be sets that have a significant p-value calculated by the One-way statistical test would be used for the post hoc test.

1. Activate the Statistical Analysis tool.
   • In the Workflow panel, open the Analysis section and click on the Statistical Analysis link.

2. In the Significance Analysis (Step 1 of 8): Input Parameters window, select the Entity List and Interpretation to be used for statistical analysis.
   • Click the Choose button to select the Entity List for the analysis.
      o From the Analysis folder, select the QC probe sets list and click OK.
   • Click the Choose button to select the Interpretation for the analysis.
      o From the Interpretations folder, select the CHF Etiology interpretation and click OK.
   • Click Next >>.

3. In the Significance Analysis (Step 2 of 8): Select Test window, select the statistical test to be performed.
   • Select ANOVA from the Select test drop-down menu.
   • Click Next >>.

4. In the Significance Analysis (Step 3 of 8): Select Post hoc Test window, select the post hoc test to be performed.
   • Select SNK from the Post Hoc drop-down menu.
   • Click Next >>.

5. In the Significance Analysis (Step 6 of 8): p-value Computation window, select the p-value computation method.
   • P-value Computation: Asymptotic
   • Multiple Testing Correction: Benjamini-Hochberg FDR
   • Click Next >>.

6. View the result
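For readers who want to see what the Benjamini-Hochberg FDR correction selected in step 5 does to a list of p-values, the following is a minimal sketch of the standard step-up procedure in plain Python. The function name is hypothetical, and this is an illustration of the general method rather than GeneSpring GX's internal code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each raw p-value is scaled by m / rank (m = number of tests, rank =
    1-based position in ascending order). A running minimum taken from the
    largest p-value downward keeps the adjusted values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A probe set would then be called significant when its adjusted p-value falls at or below the chosen cutoff (for example 0.05), which controls the expected fraction of false positives among the probe sets reported.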
37. c
      o In the Save New Probe Set List window, type "Differentially expressed between Non-failing and both diseased etiologies" in the Name box. Click OK.
   • Make an Entity List containing probe sets with intensity values that are statistically different between Non-failing and Idiopathic.
      o Click on the blue box corresponding to Non-failing and Idiopathic (5,148 probe sets). Click on the Save custom list button. This will create an Entity List containing probe sets that are differentially expressed between Non-failing and Idiopathic.
      o In the Save New Probe Set List window, type "Differentially expressed between Non-failing and Idiopathic" in the Name box. Click OK.
   • Make an Entity List containing probe sets with intensity values that are statistically different between Non-failing and Ischemic.
      o Click on the blue box corresponding to Non-failing and Ischemic (9,748 probe sets). Click on the Save custom list button. This will create an Entity List containing probe sets that are differentially expressed between Non-failing and Ischemic.
      o In the Save New Probe Set List window, type "Differentially expressed between Non-failing and Ischemic" in the Name box. Click OK.

8. Save the probe sets that passed the One-way ANOVA statistical test as an Entity List.
   • Probe sets with significant p-values calculated from the One-way ANOVA have intensity values that are statistically different between at least two of the three tested conditions. To sa
38. cause these same probe sets were saved in Step 7 of Exercise 2 above. Take a moment to consider how these are the same probe sets.

6. Close the Venn Diagram window.

[Screenshot: Choose Entity Lists window with three selections: Entity List 1 (Differentially express...), Entity List 2 (Differentially express...), and Entity List 3 (All Entities).]

Figure 34. The Choose Entity Lists window allows you to select the Entity Lists to display in the Venn Diagram.

[Screenshot: Venn Diagram comparing two "Differentially expressed..." Entity Lists (one with 5148 entities) and the All Entities list (54675 entities).]

Figure 35. The Venn Diagram allows you to compare the content of up to three Entity Lists. Here we are comparing two Entity Lists of interest and the All Entities list.

Exercise 3: Filter probe sets based on fold change

Although statistical analysis allows you to identify probe sets whose change in expression between at least two experimental conditions is statistically significant, the magnitude of the change is still undefined. In this exercise, you will perform fold change analysis on the probe sets that were found to be differentially expressed between the CHF Etiology conditions, to identify those that have at least a 1.5-fold change in expression between at least two of the CHF Etiology conditions.

1. Activate the Fold Change tool.
39. chemic, and Male Idiopathic.
   • Close the Profile Plot.

[Screenshot: Experiment Grouping window. Window text: "Experiment parameters define the grouping or replicate structure of your experiment. Enter experiment parameters by clicking on the Add Parameter button. You can also edit and re-order parameters and parameter values here." The table lists the twelve .cel samples with the Gender column to the left of the CHF Etiology column.]

Figure 18. The Experiment Grouping window allows you to specify the order of experimental parameters and conditions to be displayed in various views and plots.

[Screenshot: Order Parameter Values window. Window text: "The order in which the conditions appear in the window below will be the order in which they are displayed in views. For example, in a Profile Plot, the first listed condition in the window below will be the first condition displayed on the X axis. To re-order the conditions, select the condition and move it up or down by clicking on the appropriate icon." Parameter Values: Female
40. d expect the 3'/5' ratio for these probe sets to be close to 1. A 3'/5' ratio greater than 3 indicates that either the starting RNA was degraded or there was a problem with the cDNA synthesis reaction. Ratio values greater than 3 will be colored red to draw your attention.

Pre-mixed hybridization control transcripts in known staggered concentrations are added to the hybridization mix. These controls allow you to monitor the hybridization and washing process. The signal intensity of these controls should increase as expected with the known staggered concentrations. Deviation from the expected intensity profile of these controls indicates a potential problem with the hybridization or washing process.

Principal Component Analysis (PCA) allows you to compare the expression profiles of samples. Samples representing the same experimental condition should be more similar to each other than to samples representing a different experimental condition. Thus, they should group closer together in a PCA plot. Deviation from this assumption could be due to poor quality samples in the dataset or true biological variation within the populations under study.

1. Activate the Quality Control on Samples tool to assess sample quality.
   • In the Workflow panel, open the Quality Control section and click on the Quality Control on Samples link.

2. Interrogate results from the Quality Control on Samples analysis (Figure 28).
   • Click on the Correlation Coefficient tab. Browse th
41. ding the parameter Gender.
      o Once all samples have been assigned a CHF Etiology and Gender parameter value, click OK in the Experiment Grouping window.

[Screenshot: Experiment Grouping window before any parameters are defined. Window text: "Experiment parameters define the grouping or replicate structure of your experiment. Enter experiment parameters by clicking on the Add Parameter button. You can also edit and re-order parameters and parameter values here." The table lists the twelve .cel samples, with Add Parameter, Edit Parameter, and Delete Parameter buttons below.]

Figure 10. The Experiment Grouping window allows you to define the parameters associated with the experiment and the parameter values associated with each sample.

[Screenshot: Experiment Grouping window with the CHF Etiology and Gender columns populated for each sample.]
42. e Correlation Coefficients table. This table reports the correlation coefficient calculated between all possible pairs of samples within the experiment.
   • Click on the Correlation Plot tab. Browse the Correlation Plot. This plot reports the same information as the Correlation Coefficients table, except that correlation values are represented by a color scheme.
   • Click on the Internal Controls 3'/5' ratios tab. Browse the Internal Controls 3'/5' ratios table. Samples with values above 3 will be colored red in the table.
   • Click on the Hybridization Controls tab. Browse the Hybridization Controls plot. Each profile represents the signal intensities of the hybridization control probes in each sample. Here you see that the profiles across all samples are similar and that, within each sample, the profiles reflect the staggered concentration of these probes. This indicates good hybridization and washing of the arrays.
   • Browse the PCA Scores plot. Click in the PCA plot to bring up the legend for PCA. Here you see that samples with the same parameter values are colored and shaped similarly. For example, Non-failing samples are represented by a circle, Idiopathic samples by a square, and Ischemic samples by a triangle. Female samples are colored red and male samples are colored blue. In this dataset, samples are grouping well according to the parameter
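To make the Correlation Coefficients table described above more concrete, here is a small sketch that computes a correlation coefficient for every pair of samples. It assumes the Pearson coefficient; the tutorial text does not state which coefficient GeneSpring GX reports, and the function names and input layout are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(samples):
    """Correlation for every ordered pair of samples.

    `samples` maps sample name -> list of intensities, with probe sets in
    the same order for every sample.
    """
    names = list(samples)
    return {(a, b): pearson(samples[a], samples[b]) for a in names for b in names}
```

Replicate samples of the same condition would be expected to show coefficients close to 1 with each other, so a sample whose row is noticeably lower than the rest is a candidate outlier.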
43.
   • In the Workflow panel, open the Analysis section and click on the Fold Change link.

2. In the Fold Change (Step 1 of 4): Input Parameters window, select the Entity List and Interpretation to be used for fold change analysis.
   • Click the Choose button to select the Entity List for the analysis.
      o From the Analysis folder, select the "Differentially expressed between at least two CHF etiologies" Entity List and click OK.
   • Click the Choose button to select the Interpretation for the analysis.
      o From the Interpretations folder, select the CHF Etiology interpretation and click OK.
   • Click Next >>.

3. In the Fold Change (Step 2 of 4): Pairing Options window, select the conditions to be used for fold change analysis. See Figure 36.
      o In the Select pairing option drop-down menu, select All conditions against control.
      o In the Select control condition drop-down menu, select Non-failing.
      o Click Next >>.

4. Perform fold change analysis using a 1.5-fold change as the cutoff.
   • Note that GeneSpring GX will automatically perform fold change analysis with the default cutoff of 2-fold. See Figure 37.
   • Change the fold change cutoff to 1.5.
      o Click the Change cutoff button and type in 1.5 for the Fold Change cutoff. You must hit the Enter key on the keyboard for the change to be applied.
      o In the Minimum number of pairs box, leave the selection as 1.
      o This
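The "All conditions against control" fold change test with a 1.5-fold cutoff and a minimum of 1 pair can be sketched as below. This is an illustrative example only: it assumes the normalized intensity values are on a log2 scale and that fold change is computed from condition means, neither of which is stated in this part of the tutorial, and the function names are hypothetical.

```python
from statistics import mean


def fold_change(condition_log2, control_log2):
    """Absolute fold change between two groups of log2 intensities."""
    return 2 ** abs(mean(condition_log2) - mean(control_log2))


def passes_fold_change(probe, control="Non-failing", cutoff=1.5, min_pairs=1):
    """True if the probe set meets the cutoff in at least `min_pairs`
    condition-versus-control pairs.

    `probe` maps condition name -> list of log2 intensities for one probe set.
    """
    pairs_passing = sum(
        1
        for name, values in probe.items()
        if name != control and fold_change(values, probe[control]) >= cutoff
    )
    return pairs_passing >= min_pairs
```

With three CHF Etiology conditions and Non-failing as the control there are two pairs (Ischemic versus Non-failing and Idiopathic versus Non-failing), so a minimum of 1 keeps probe sets that change in either disease etiology.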
44. e analysis tools within GeneSpring GX will require you to select an Entity List and an experiment Interpretation as inputs for the analysis. One way to select these inputs is the method described above. Alternatively, before activating the link for the tool, you can select the Entity List and Interpretation from the Navigator itself. Once the tool is activated, the Entity List and Interpretation that were selected in the Navigator will be automatically chosen as inputs for the analysis. This method is often faster. For the purpose of this tutorial, instructions are written such that you will select Entity List and Interpretation inputs using the method described in steps 1 and 2 of this exercise.
   • Clustering Algorithm
      o From the drop-down menu, choose Hierarchical.
   • Click Next >>.

3. In the Clustering (Step 2 of 4): Input Parameters window, select the input parameters for Hierarchical clustering.
   • Cluster on
      o From the drop-down menu, select Conditions.
   • Distance metric
      o From the drop-down menu, select Pearson Centered.
   • Linkage rule
      o From the drop-down menu, select Centroid.
   • Click Next >>.

4. In the Clustering (Step 3 of 4): Output views window, click Next >>.

5. Save the results of the Hierarchical clustering analysis.
   • In the Clustering (Step 4 of 4): Object Details window, type Congestive Heart
45. e icon in the Experiment Grouping window (Figure 10) and select the Experiment Parameter.txt file contained within the Congestive Heart Failure Dataset for Affymetrix Tutorial folder that you downloaded. This folder also contains the data files for the experiment. Click Open.
      o The Experiment Grouping window should now be populated with parameters and parameter values for each sample. See Figure 11.
   • Manually enter the parameters and parameter values for each sample.
      o First, we need to remove the information that has been loaded from the file. Click within a cell under the CHF Etiology parameter column and click the Delete Parameter button. Click within a cell under the Gender parameter column and click the Delete Parameter button.
      o Click on the Samples column header to sort the samples according to Samples values. Click on the Add Parameter button in the Experiment Parameters window. In the Add/Edit Experiment Parameter window, type CHF Etiology in the Parameter Name box. See Figure 12.
      o For the CHF Etiology parameter, there are three unique values: Non-failing, Ischemic, and Idiopathic. Use the information in Table 1 to enter the appropriate parameter value for each sample. Select all samples sharing the same parameter value (e.g., select all four Non-failing samples) and click Assign Value. Enter the parameter value.
      o Once all samples have been assigned a CHF Etiology parameter value, click OK.
      o Repeat the same process by ad
46. e probe sets selected for the clustering analysis.

Exercise 2: Use the K-means clustering algorithm to group probe sets with similar expression profiles together

K-means clustering also allows you to group probe sets based on the similarity of their expression profiles. Unlike in Hierarchical clustering, probe sets will be grouped into discrete clusters based on the similarity of their expression profiles.

1. Activate the Clustering tool.
   • In the Workflow panel, open the Analysis section and click on the Clustering link.

2. In the Clustering (Step 1 of 4): Input Parameters window, select the following:
   • Click the Choose button to select the Entity List for the analysis.
      o From the Analysis folder, select the "Fold change greater than 1.5 in Non-failing vs Ischemic or Non-failing vs Idiopathic" Entity List and click OK.
   • Click the Choose button to select the Interpretation for the analysis.
      o From the Interpretations folder, select the CHF Etiology Gender interpretation and click OK.
   • From the Clustering Algorithm drop-down menu, select K-Means.
   • Click Next >>.

3. In the Clustering (Step 2 of 4): Input Parameters window, select the following:
   • From the Cluster on drop-down menu, select Entities.
   • From the Distance metric drop-down menu, select Pearson Centered.
   • In the Number of clusters box, type 5.
   • In the Number of Itera
47. Figure 42. The result from a clustering analysis is saved as a Classification object, which can be displayed in Profile Plot view in the browser.

Section 6: Biological Queries

After identifying genes of interest in GeneSpring GX, it is often desirable to put these statistically significant findings into a biological context. The first step in doing this involves determining the biological functions of these genes of interest. Three main analyses that can be performed in GeneSpring GX to achieve this goal are Gene Ontology (GO) analysis, Gene Set Enrichment Analysis (GSEA), and Pathways analysis. In this section, you will learn how to use these three tools in GeneSpring GX to further analyze your statistically significant findings in a biological context.

Exercise 1: Perform GO analysis to determine the biological functions of your genes of interest

At this point in the analysis, you have identified your probe sets of interest (i.e., probe sets that were found to be differentially expressed and/or probe sets that show a certain magnitude of change in expression between experimental conditions) and have saved them as an Entity List. The GO Analysis tool allows you to quickly group genes of interest based on the GO terms associated with each gene. This then allows you to answer the questions: what biological process, molecular function, and cell
48. erarchical Clustering algorithm to create a condition tree

    Within GeneSpring GX, various clustering algorithms are available to identify probe sets with similar expression profiles or samples with similar expression profiles. Hierarchical clustering (Condition Tree) groups samples or conditions together based on the similarity of their expression profiles over the probe sets selected for analysis. Thus, building a condition tree can be used to perform quality control on samples. As with PCA, the assumption is that samples representing the same experimental condition should be more similar to each other than to samples representing a different experimental condition. Thus, they should group closer together in the condition tree, just as they would in a PCA plot. Deviation from this assumption could be due to poor quality samples in the dataset or to true biological variation within the populations under study.

    1. Activate the Clustering analysis tool.
       • In the Workflow panel, open the Analysis section and click on the Clustering link.
    2. In the Clustering Step 1 of 4 Input Parameters window, select the Entity List, Interpretation, and Clustering algorithm for the analysis.
       • Click the Choose button to select the Entity List for the analysis.
         o From the Analysis folder, select the All Entities Entity List and click OK.
       • Click the Choose button to select the Interpretation for the analysis.
         o From the Interpretations folder, select the All Samples interpretation and click OK.

    NOTE: Most of th
49. erences in Gender of these samples do not account for the variances in gene expression across the samples. Furthermore, there is no interaction between the CHF Etiology and Gender parameters. In other words, how a gene's expression changes across CHF Etiology conditions does not depend on whether the patient is female or male. Conversely, how a gene's expression changes between females and males does not depend on the CHF Etiology condition.

    6. Save the probe sets that passed the statistical test as Entity Lists.
       • In the Significance Analysis Step 7 of 8 Results window, click Next >>. This will save the three Entity Lists generated by the 2-way ANOVA. If an Entity List does not contain at least one entity, the list will not be saved.
       • In the Significance Analysis Step 8 of 8 Save Entity List window, the default names that will be given to the three Entity Lists from the 2-way ANOVA are listed on the left-hand side of the window (see Figure 32). However, for this analysis only one of the three Entity Lists contains one or more entities. Thus, only one Entity List is saved.
       • If multiple Entity Lists are to be saved, each Entity List name can be changed by selecting the list from the left-hand panel and typing the new name in the Name box.
       • For our analysis, we will save the Entity List with its default name: 2way ANOVA corrected p value CHF Etiology
50. Figure 3. Experiment selection dialog.

       • In the New Experiment Experiment description window, enter the information below (see Figure 4):
         a. Experiment name: Congestive Heart Failure
         b. Experiment type: Affymetrix Expression
         c. Workflow type: Advanced Analysis

       Data analysis in GeneSpring GX can be performed using the Guided Workflow mode or the Advanced Analysis mode. The Guided Workflow mode guides you through a workflow that is routinely performed on microarray gene expression profiling experiments. This includes creating an experiment, performing quality control on both samples and entities, finding differentially expressed entities, and performing Gene Ontology classification analysis. This mode will be helpful to users who are new to GeneSpring GX or who are not familiar with microarray gene expression data analysis. The Advanced Analysis mode is classic GeneSpring GX. This mode gives you the flexibility of performing analysis using any combination of the filtering and analytical tools available in GeneSpring GX. It will be useful to users who are familiar with GeneSpring GX and/or with microarray gene expression data analysis.

       • Click the OK button to continue.
51. hat are differentially expressed between experimental conditions

    Identifying genes that are differentially expressed between a set of conditions is often the first step in an attempt to understand the biological process under examination. In this study, we are attempting to understand the molecular mechanism underlying congestive heart failure caused by ischemic and idiopathic cardiomyopathy. It is thought that the Ischemic and Idiopathic conditions may have resulted from the dysregulation of a set of key genes. Thus, identifying genes that are differentially expressed between these CHF etiologies may lead to a better understanding of the disease process.

    Exercise 1: Find candidates for differential expression using the 2-way ANOVA

    Significant changes in gene expression between two or more experimental conditions can be identified using parametric or non-parametric statistical tests. One-way tests are applied to test for differential expression across conditions defined by one parameter (e.g., treatment type). Two-way tests are applied to test for differential expression across groups defined by two parameters (e.g., treatment type and tissue type). When comparing three or more conditions with one-way tests, a parametric or non-parametric post hoc test can subsequently be used to identify the pairs of conditions between which significant changes occur. In this study, each sample is associated with two different experiment parameters: CHF Etiology and Gender. Thus
52. hese three groups is represented by 2 female and 2 male patient samples.

    The Congestive Heart Failure dataset can be downloaded from the GeneSpring GX web page, http://genespring.com. From there, click on the GeneSpring GX link and follow the link to the GeneSpring GX Extras page. Click on the GeneSpring GX 9 Dataset for Affymetrix Tutorial link. This will lead you to download a zip file containing the Congestive Heart Failure gene expression microarray dataset to be used with this tutorial. Upon unzipping the file, you should see a folder labeled Congestive Heart Failure Dataset for Affymetrix Tutorial. Within the folder, you will see another folder labeled Dataset, containing 12 data files corresponding to the 12 samples in the dataset. You will also see an additional file named Experiment Parameters. This file contains information regarding the parameters and parameter values associated with each sample. You will need this information when we set up the experiment for analysis.

    1. To begin data analysis in GeneSpring GX, create a new project and experiment.
       • From the toolbar, click Project > New Project.
       • In the Create New Project window, type CHF Tutorial and click OK.
       • In the Experiment Selection Dialog window, click on the Create new experiment radio button (see Figure 3).
       • Click OK.
53. iment we are interested in interrogating GATA4, a transcription factor that is known to regulate the expression of genes associated with cardiac hypertrophy. Thus, before you even begin your analysis, you would like to quickly check the expression of this gene.

    1. View the expression data in a Profile Plot.
       • Select the All Entities list from the Analysis folder in the navigator.
       • Select the CHF Etiology interpretation from the Interpretations folder in the navigator.
       • From the menu, click View > Profile Plot.
    2. Search for the probe sets that represent GATA4 in the data.
       • From the menu, click Search > Entities.
       • In the Search for box, type GATA4 and leave all other settings at their defaults. These search criteria instruct GeneSpring GX to search through all the probe sets in the All Entities list and return the probe sets that have GATA4 in the selected annotation columns on the right-hand side. If you would like to expand the search to other annotation columns, select the desired columns from the left-hand side and click the appropriate arrow (see Figure 24).

    [Figure 24: the Search Entities Step 1 of 3 Input Parameters window, with the Search for box set to GATA4 and the "Columns available for search" and "Columns to search" lists.]
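The search described above amounts to a substring match restricted to the chosen annotation columns. The Python sketch below mimics that behavior on a tiny made-up annotation table. The function name, the case-insensitive matching, the Description column, and the two "hypothetical" probe set IDs are assumptions for illustration, not GeneSpring GX internals.

```python
def search_entities(entities, term, columns):
    """Return entities whose value in any of `columns` contains `term`.

    Matching is case-insensitive here, which is an assumption of this sketch.
    """
    needle = term.lower()
    return [
        entity
        for entity in entities
        if any(needle in str(entity.get(col, "")).lower() for col in columns)
    ]


# Made-up annotation rows; only 243692_at / GATA4 comes from the tutorial text.
entities = [
    {"Probe Set ID": "243692_at", "Gene Symbol": "GATA4",
     "Description": "GATA binding protein 4"},
    {"Probe Set ID": "hypothetical_1_at", "Gene Symbol": "ABC1",
     "Description": "target regulated by GATA4"},
    {"Probe Set ID": "hypothetical_2_at", "Gene Symbol": "XYZ2",
     "Description": "unrelated gene"},
]

# Search only the Gene Symbol column, then widen the search to Description.
hits = search_entities(entities, "GATA4", ["Gene Symbol"])
wider = search_entities(entities, "GATA4", ["Gene Symbol", "Description"])
print([e["Probe Set ID"] for e in hits])   # -> ['243692_at']
print(len(wider))                          # -> 2
```

Adding columns to the search widens the result set, which is why moving columns between the two lists in the dialog changes which probe sets are returned.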
54. iment interpretation and the vertical axis shows the second condition of the same interpretation. To change the conditions displayed, use the drop-down menus for the X Axis and Y Axis.

    2. Change the scatter plot to display expression data for the Idiopathic condition on the X Axis.
       • From the X Axis drop-down menu, select the condition Idiopathic.
    3. Create an Entity List of probe sets whose expression values are down-regulated in the Idiopathic condition relative to expression in the Ischemic condition. Using a scatter plot, we are only qualitatively identifying probe sets that appear to have lower expression values in the Idiopathic condition relative to the Ischemic condition.
       • Select a few probe sets that appear to be down-regulated in the Idiopathic condition relative to the Ischemic condition by drawing a box around those probe sets. Selected probe sets should now be colored green in the plot (see Figure 22).
       • To create an Entity List for these probe sets, click on the Create Entity List icon in the toolbar.
       • In the Entity List Inspector window (Figure 23), type "Lower expression in Idiopathic than Ischemic in Scatter Plot" and click OK. The "Lower expression in Idiopathic than Ischemic in Scatter Plot" Entity List is now saved under the All Entities list in the Navigator.
       • Close the Scatter Plot view.
    4. View the expression profiles of entities in the Lower expression in Idiopathic than Ischemic in Scatter Plot Ent
55. ing and Idiopathic
         o Entity List 2: Differentially expressed between Non-failing and Ischemic
         o Entity List 3: All Entities
       • Click OK.
    3. Save an Entity List of probe sets that are differentially expressed between Non-failing and Idiopathic but not between Non-failing and Ischemic.
       • Select the region in the Venn Diagram that corresponds to region A in Figure 35 and click on the Create Entity List icon within the Venn Diagram window.
       • In the Entity List Inspector window, type "Differentially expressed between Non-failing and Idiopathic but not between Non-failing and Ischemic" in the Name box.
       • Click OK.
    4. Save an Entity List of probe sets that are differentially expressed between Non-failing and Ischemic but not between Non-failing and Idiopathic.
       • Select the region in the Venn Diagram that corresponds to region B in Figure 35 and click on the Create Entity List icon within the Venn Diagram window.
       • In the Entity List Inspector window, type "Differentially expressed between Non-failing and Ischemic but not between Non-failing and Idiopathic" in the Name box.
       • Click OK.
    5. Save an Entity List of probe sets that are differentially expressed between Non-failing and Idiopathic AND Non-failing and Ischemic.
       • These probe sets would correspond to region C. However, we do not need to save an Entity List containing these probe sets be
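The three Venn Diagram regions used in these steps correspond to ordinary set operations on the two differential-expression lists. The short Python sketch below shows the mapping. The probe identifiers and the function name are invented for illustration.

```python
def venn_regions(idiopathic_list, ischemic_list):
    """Split two entity lists into the regions A, B and C described above."""
    return {
        # A: differentially expressed vs Idiopathic but not vs Ischemic
        "A": idiopathic_list - ischemic_list,
        # B: differentially expressed vs Ischemic but not vs Idiopathic
        "B": ischemic_list - idiopathic_list,
        # C: differentially expressed in both comparisons
        "C": idiopathic_list & ischemic_list,
    }


# Hypothetical probe set IDs standing in for the two saved Entity Lists.
non_failing_vs_idiopathic = {"probe_1", "probe_2", "probe_3"}
non_failing_vs_ischemic = {"probe_2", "probe_3", "probe_4"}

regions = venn_regions(non_failing_vs_idiopathic, non_failing_vs_ischemic)
print(sorted(regions["A"]))  # -> ['probe_1']
print(sorted(regions["B"]))  # -> ['probe_4']
print(sorted(regions["C"]))  # -> ['probe_2', 'probe_3']
```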
56. intensity values of the probe sets within the sample (see Figure 9).

    Figure 9. A new GeneSpring GX window for the newly created CHF Tutorial project, containing the Congestive Heart Failure experiment.

    Section 2: Setting Up the Experiment

    Several steps must be taken to set the experiment up for analysis in GeneSpring GX. These steps include defining experimental parameters for the experiment, assigning parameter values to each sample, and creating experiment interpretations to group these samples by a parameter or combination of parameters. Replicate measurements of the same gene for the same biological condition can add great value to the data mining process. Statistical calculations based on replicate measurement error help determine the reliability of the analysi
57. ity List in a Profile Plot.
       • Select the Lower expression in Idiopathic than Ischemic in Scatter Plot entity list from the Analysis folder in the navigator.
       • Select the CHF Etiology interpretation from the Interpretations folder in the navigator.
       • From the menu, click View > Profile Plot.
       • If you selected the correct probe sets in the Scatter Plot, the expression profiles of these entities should show a downward slope from the Ischemic condition to the Idiopathic condition.
       • Close the Profile Plot.

    Figure 22. The Scatter Plot view allows you to plot each probe set according to its intensity values in two samples or two conditions.
58. Figure 38. The Choose Query Entity window allows you to search for the gene of interest based on any annotations in the technology.

    Figure 39. The Input Parameters window allows users to input the parameters for the Find Similar Entities tool.

    3. Change the correlation coefficient cutoff for the analysis.
       • The Find Similar Entities Step 2 of 3 Output View of Find Similar Entities window displays the analysis results. 96 probe sets were found to have an expression profile with a correlation coefficient greater than 0.95 to the expression profile of GATA4. Note that a 0.95 corre
59. lation coefficient cutoff was automatically applied for the analysis (see Figure 40).
       • Increase the correlation cutoff for the analysis to 0.99:
         o Click Change Cutoff.
         o In the Minimum box, type 0.99.
         o In the Maximum box, type 1. You must press the Enter key on the keyboard for the change to be applied.
         o Click Close.
       • The number of probe sets that pass the current analysis parameters is displayed on top of the profile plot.

    [Figure 40 screenshot: the Find Similar Entities Step 2 of 3 Output View of Find Similar Entities window. The expression profile of the target entity is shown in bold, together with the profiles of entities whose correlation to the target is above the similarity cutoff. The window reports 96 entities out of 10905 satisfying the cutoff range 0.95 to 1.0.]

    Figure 40. This window displays the expression profiles of probe sets that pass the current filter criteria.

    4. Save the probe sets as an Entity List.
       • In the Find Similar Entities Step 2 of 3 Output View of Find Similar Entities window, click Next >>.
       • In the Find Similar Entities Step 3 of 3 Save Entity List wi
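To make the effect of the Minimum and Maximum cutoff boxes concrete, here is a small Python sketch that keeps only the profiles whose Pearson correlation to a target profile falls inside a chosen range. The profile values and function names are hypothetical. Excluding the target itself from the results is a choice made for this sketch and may differ from what GeneSpring GX reports.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length profiles, clamped to [-1, 1]."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    if norm == 0:
        return 0.0  # a flat profile has no defined correlation
    return max(-1.0, min(1.0, cov / norm))


def find_similar(target_id, profiles, minimum=0.95, maximum=1.0):
    """Return IDs whose correlation to the target lies in [minimum, maximum]."""
    target = profiles[target_id]
    return sorted(
        pid
        for pid, profile in profiles.items()
        if pid != target_id and minimum <= pearson(target, profile) <= maximum
    )


# Hypothetical profiles: "a" tracks the target closely, "b" is inverted,
# and "c" is only loosely correlated (r = 0.8).
profiles = {
    "target": [1.0, 2.0, 3.0, 4.0],
    "a": [2.0, 4.0, 6.0, 8.1],
    "b": [4.0, 3.0, 2.0, 1.0],
    "c": [1.0, 3.0, 2.0, 4.0],
}

print(find_similar("target", profiles, minimum=0.75))  # -> ['a', 'c']
print(find_similar("target", profiles, minimum=0.99))  # -> ['a']
```

Raising the minimum can only shrink the result list, which matches the tutorial's move from a 0.95 cutoff to 0.99.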
60. lay. GeneSpring GX will display expression data for the entities in the selected entity list according to the sample grouping defined by the selected interpretation.

    Figure 16. Expression data for the Congestive Heart Failure experiment is displayed in the Profile Plot View. The experimental conditions on the x-axis are defined by the Interpretation selected in the Navigator.

    4. Create an interpretation in which samples are grouped by the parameter Gender.
       • When an experiment has more than one experimental parameter associated with it, samples can be grouped in multiple ways. Thus, multiple interpretations are often created for a single experiment.
       • Repeat this exercise by first activating the Create Interpre
61. link.
       • In the Create Interpretation Step 1 of 3 Select parameters window, check the parameters CHF Etiology and Gender and click Next >>.
       • In the Create Interpretation Step 2 of 3 Select conditions window, make sure that all the conditions defined by the parameters CHF Etiology and Gender are checked. There are six unique combinations of parameter values. Therefore, samples will be grouped into six unique experimental conditions: Idiopathic Female, Idiopathic Male, Ischemic Female, Ischemic Male, Non-failing Female, and Non-failing Male.
         o Make sure that the box Average over replicates in conditions is checked.
         o Click Next >>.
       • Save the new interpretation as CHF Etiology Gender.
         o In the Name box, type CHF Etiology Gender.
         o Click Finish.

    Section 3: Viewing expression data in GeneSpring GX

    Now that we have set up the experiment, you will explore the different ways to view your data in this next part of the tutorial. The general rule for viewing expression data in GeneSpring GX is that what you see in the browser is determined by the objects you have selected in the navigator. Two objects that nearly always have to be selected to view data in the browser are an entity list and an experiment interpretation. GeneSpring GX will only show the expression data for the entities in the selected list. The normalized intensity values displayed for each entity are determined by the selected interpretation.
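Conceptually, an interpretation groups samples by the chosen parameters and, with Average over replicates checked, reports one averaged value per condition. The Python sketch below illustrates this for a single probe set. The intensity values, dictionary keys, and function name are made up for this example and do not reflect how GeneSpring GX stores interpretations.

```python
from collections import defaultdict
from itertools import product
from statistics import mean

ETIOLOGIES = ["Non-failing", "Ischemic", "Idiopathic"]
GENDERS = ["Female", "Male"]

# 12 hypothetical samples: 2 replicates per etiology/gender combination,
# with made-up intensities of 1.0 and 3.0 for one probe set.
samples = [
    {"CHF Etiology": etiology, "Gender": gender, "intensity": float(value)}
    for etiology, gender, value in product(ETIOLOGIES, GENDERS, (1, 3))
]


def build_interpretation(samples, parameters):
    """Group samples by the given parameters and average the replicates."""
    groups = defaultdict(list)
    for sample in samples:
        condition = tuple(sample[p] for p in parameters)
        groups[condition].append(sample["intensity"])
    return {condition: mean(values) for condition, values in groups.items()}


by_both = build_interpretation(samples, ["CHF Etiology", "Gender"])
by_etiology = build_interpretation(samples, ["CHF Etiology"])

print(len(by_both))                    # -> 6 conditions
print(len(by_etiology))                # -> 3 conditions
print(by_both[("Ischemic", "Male")])   # -> 2.0
```

The same 12 samples yield six conditions or three depending on the parameters chosen, which is why the displayed values change when a different interpretation is selected.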
62. ll Entities list is the first input list for any analysis in GeneSpring GX. Suppose you take the All Entities list as the input for a Filter on Flags analysis to filter for quality probes, and create an Entity List of the results. This Entity List will be stored in the Navigator as a child of the All Entities list node. In other words, each data object is saved as a child of the node of the input Entity List used to generate that object.
       • In the HeLa cells treated with compound X experiment, click on the plus sign next to the All Entities list to open that node.
       • Open the Filtered on Flags P M Entity List.
       • Open the T test p < 0.05 Entity List.
       • Open the Fold change > 2.0 Entity List.
       • Open the GO Analysis folder.
       Your navigator for the experiment should now look like the one displayed in Figure 2b. From this data hierarchy structure, you can quickly tell that the analysis started with the All Entities list, which was used as input for the Filter on Flags analysis to obtain a list of quality probes. This filtered list was then used for T test statistical analysis to obtain a list of differentially expressed probes, which were then subjected to Fold Change analysis to obtain probes with a greater than 2-fold change between the conditions. The resulting Entity List from Fold Change analysis was then used as the input list for GO Analysis and Clustering analysis.

    6. Close the Demo Project.
       • To close the project, click on the X button next to the Demo Projec
63. lts are then reported in the Find Similar Pathways Step 2 of 2 Results window. For a more detailed explanation of the values reported, please refer to the GeneSpring GX User Manual.
       • In the Similar Pathways panel, pathways that have a significant overlap with the Entity List are listed. With this analysis, you will see that no pathways are found to have a significant overlap (p-value cutoff of 0.05) with the Entity List (see Figure 47). Keep in mind that GeneSpring GX comes pre-loaded with only a small set of pathways. To expand this analysis, first import more pathways into GeneSpring GX.
       • The Non-similar Pathways panel lists all the pathways for which GeneSpring GX cannot match a single entity in the entire experiment.

    [Figure 47 screenshot: the Find Similar Pathways Step 2 of 2 Results window. Pathways showing significant overlap with the selected entity list appear in the left-hand spreadsheet, and pathways with no matching entities on the array appear in the right-hand spreadsheet. The Change Cutoff button sets a new p-value cutoff and the Custom Save button imports selected pathways into the experiment. The window reports 0 objects satisfying the corrected p-value cutoff of 0.05.]
64. lts as an Entity List.

    4. Interrogate the data for the probe set representing the GATA4 gene.
       • You should see two expression profiles that are shaped like a V, where expression is significantly decreased in the Ischemic condition. Select the profile with the lowest expression value in the Ischemic condition (see Figure 26).
       • Double-click on the profile to open the Entity Inspector window for the probe set (see Figure 27).
       • The Entity Inspector window gives detailed information regarding the probe set. The information is organized into three separate tabs:
         o Annotation tab: This tab contains information about the gene that the probe set represents. Note that only a pre-selected set of annotations is displayed. To display other annotations that are contained within the technology, click on the Configure Columns button and select the annotation columns of interest. Also note that the annotation values are actual links. For example, if you click on the value 2626 for the Entrez Gene annotation, the Entrez Gene page for that specific entry will appear.
         o Data tab: This tab displays the Normalized and Raw intensity values for each sample in the experiment. Also displayed are the experimental conditions that each sample belongs to. GeneSpring GX will only show the conditions for the parameter that is being used to define samples in the chosen interpret
65. mediating the disease mechanism.

    1. Activate the Find Similar Entities tool.
       • In the Workflow panel, open the Analysis section and click on the Find Similar Entities link.
    2. In the Find Similar Entities Step 1 of 3 Input Parameters window, select the parameters for the analysis.
       • Click the Choose button to select the Entity List for the analysis.
         o From the Analysis folder, select the Differentially expressed between at least two CHF etiologies Entity List and click OK.
       • Click the Choose button to select the Interpretation for the analysis.
         o From the Interpretations folder, select the CHF Etiology Gender interpretation and click OK.
       • Click on the Select button to select the target probe set for the analysis.
         o In the Find box, type GATA4. This instructs GeneSpring GX to find probe sets that contain GATA4 in any of the annotation columns in the table.
         o The probe set representing GATA4 should now be highlighted in green. Sort the values in the Gene Symbol column by clicking on the column header. Note that there are multiple probe sets representing the GATA4 gene.
         o Select the probe set with the Probe Set ID 243692_at and click OK (see Figure 38).
       • In the Similarity Metric drop-down menu, select Pearson (Figure 39).
       • Click Next >>.
66. n the Results Interpretation section and click on the GSEA link.

    4. Perform GSEA.
       • In the GSEA Step 1 of 5 Input Parameters window, select the Entity List and Interpretation to be used for the analysis.
         o Click the Choose button to select the Entity List for the analysis. From the Analysis folder, select the All Entities list and click OK.
         o Click the Choose button to select the Interpretation for the analysis. From the Interpretations folder, select the CHF Etiology interpretation and click OK.
         o Click Next >>.
       • In the GSEA Step 2 of 5 Pairing Options window, select the pairs of conditions to compare (see Figure 44).
         o Select all three listed condition pairs.
         o Click Next >>.
       • In the GSEA Step 3 of 5 Choose Gene Sets window, select the following parameters for the Gene Set Enrichment Algorithm and click Next >> (see Figure 45):
         o Min no. of Genes to be found in a Gene Set: 15
         o Maximum no. of permutations: 1000
         o Gene Set Search: Simple Search
         o BROAD Gene Sets: C4 Neighborhood Sets

    [Figure 44 screenshot: the GSEA Step 2 of 5 Pairing Options window with the pairing option "Pairs of conditions" and three condition pairs: Idiopathic vs Ischemic, Idiopathic vs Non-failing, and Ischemic vs Non-failing.]

    Figure 44. The Pairing Options window allows yo
67. [Figure 6 screenshot: the New Experiment Step 1 of 4 Load Data window, listing the 12 selected .cel files, with Choose Files, Choose Samples, and Remove buttons.]

    Figure 6. The New Experiment Load Data window displays the files to be loaded.

    3. Define the summarization and baseline transformation methods for the experiment.
       • Changes in gene expression across samples within an experiment may be attributed to true biological variation or to systematic variation. To answer the biological questions that the experiment was designed to address, we only want to measure true biological variation across the experimental conditions. Applying data normalization limits the systematic variation in the data, so that true biological variation is revealed and more readily detected.
       • In the New Experiment Step 3 of 4 Summarization Algorithm window, select the following options (see Figure 7):
         o Summarization Algorithm: RMA
         o Baseline Transformation: Baseline to median of all samples
         o Click Finish.
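As a minimal sketch of the baseline transformation step, the Python snippet below assumes that "Baseline to median of all samples" means subtracting, for each probe set, its median log-scale value across all samples from every sample's value. That reading, the function name, and the numbers are assumptions made for illustration. Consult the GeneSpring GX User Manual for the exact definition.

```python
from statistics import median


def baseline_to_median(log_intensities):
    """Center each probe set on its median across all samples.

    `log_intensities` maps a probe set ID to a list of log-scale
    summarized values, one per sample.
    """
    transformed = {}
    for probe_id, values in log_intensities.items():
        baseline = median(values)  # per-probe median over all samples
        transformed[probe_id] = [value - baseline for value in values]
    return transformed


# Hypothetical log2 values for two probe sets across three samples.
data = {
    "probe_high": [8.0, 9.0, 10.0],
    "probe_low": [2.5, 2.0, 3.0],
}
print(baseline_to_median(data))
# -> {'probe_high': [-1.0, 0.0, 1.0], 'probe_low': [0.0, -0.5, 0.5]}
```

After this centering, a value of 0 means a sample sits at the probe set's median, so highly and weakly expressed probe sets can be compared on a common scale.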
68. ndow, type "Entities similar to GATA4 with cutoff 0.99" in the Name box.
       • Click Finish.

    Section 7: Clustering Gene Expression Data

    Within GeneSpring GX, various clustering algorithms are available to group together genes with similar expression profiles. These algorithms include K-means and Hierarchical clustering, among others. Genes that share similar biological functions are thought to exhibit similar expression profiles across a set of experimental conditions. Thus, clustering analysis can be used to group together genes of interest that share similar biological functions.

    Exercise 1: Use the Hierarchical clustering algorithm to build an entity tree and a condition tree

    The Hierarchical clustering algorithm can be used to generate an entity tree, in which probe sets are grouped based on the similarity of their expression profiles across the experimental conditions selected for analysis. This relationship between probe sets is displayed in a dendrogram. The Hierarchical clustering algorithm can also be used to group samples or conditions based on their expression across a set of probe sets. In this way, the expression profiles of samples or conditions can be compared. Here, you would expect replicate samples within the same experimental condition to be more similar in their expression profiles than samples belonging to a different condition.

    1. Acti
69. ng GX to return all pathways saved in the database (see Figure 50).
         o In the Search Wizard Step 3 of 3 Search Results window, select all of the pathways in the table except IL-7 and click on the Add selected pathways to active experiment icon. These pathways should now be added to the Imported Pathways folder within the Congestive Heart Failure experiment.
         o These pathways can be viewed in the same way as the IL-7 pathway.

    [Figure 50 screenshot: the Search Wizard Step 3 of 3 Search Results window. The table shows a maximum of 500 search results. To change the maximum, go to Tools > Options > Miscellaneous > Search Results. The window reports 21 results matching the search criteria.]

    Figure 50. The search results window displays pathways that satisfy the input search criteria. Select the pathways you want to add to the current experiment and click on the Add selected pathways to active experiment icon.
70. ombined tree.
         o Click on the icon to contract the tree vertically.
         o Click on the icon to expand the tree vertically.
         o Click on the icon to expand the tree horizontally.
         o Click on the icon to contract the tree horizontally.
       • Inspect the grouping of the samples.
         o Note that the condition tree is organized into 3 distinct clusters, with each cluster representing a CHF Etiology. Interestingly, across the genes found to be differentially expressed with a fold change of 1.5 or greater between at least two conditions, the expression profiles of the Idiopathic samples are more similar to those of the Non-failing samples than to those of the Ischemic samples. This may indicate that ischemic and idiopathic cardiomyopathy have distinct disease mechanisms.
    7. Close the tree view.

    [Figure 41 screenshot: Hierarchical Combined Tree of the significant 1.5-fold change probe sets, with the samples along one axis.]

    Figure 41. Hierarchical Clustering can be used to generate an entity tree and a condition tree. In one dimension (entity tree), probe sets are grouped according to the similarity of their expression profiles across a set of samples or conditions. In the other dimension (condition tree), samples or conditions are grouped according to the similarity of their expression profiles across th
71. on Tree.
         o Click on the icon to expand the tree vertically.
         o Click on the icon to contract the tree vertically.
         o Click on the icon to expand the tree horizontally.
         o Click on the icon to contract the tree horizontally.
       • Close the Condition Tree view.

    [Figure 29 screenshot: Congestive Heart Failure Hierarchical Condition Tree on All Samples, with the 12 samples along one axis and gene symbols along the other.]

    Figure 29. Condition Tree generated using the Hierarchical Clustering algorithm, in which samples are grouped according to the degree of similarity of their expression profiles over the selected probe sets. Samples with more similar expression profiles are grouped closer to each other in the tree.

    Section 5: Perform Quality Control on Probe Sets

    Before you proceed with analysis, it is good practice to remove probe sets with unreliable expression measurements. These include probe sets representing genes that are not expressed in any of the samples. Including these probe sets in analyses may yield erroneous results, such as false positives in st
72. ow. This window is organized into 3 main parts (see Figure 2a).
• Locate the Navigator on the left side. The Navigator displays the project that you have opened and all the experiments associated with the project. Once one or more experiments within the project are opened, the Navigator is divided into multiple panels: the top panel is the project navigator, and each experiment has its own navigator panel. The Navigator panel for each experiment contains folders of data objects that have been imported into or created within GeneSpring GX. Items in multiple navigator folders are usually selected to create a useful data display.
• Locate the browser in the center portion of the window. The browser is an empty space within the interface that gets populated by a View or analysis result window.
• Locate the Workflow panel on the right. The Workflow panel contains various tools that you will use to set up an experiment and analyze your data. The tools are grouped into categories that reflect the order of an analysis workflow. For example, the first category is Experiment Setup, followed by Quality Control and then Analysis.
5. Explore the Navigator of GeneSpring GX. A new feature in GeneSpring GX 9.0 is the data hierarchy structure of the Navigator. This allows users to quickly determine the workflow that was performed to obtain a data object. For example the A
73. r Analysis data objects are organized in a hierarchical structure.
Section 1: Loading Data and Creating an Experiment
Exercise 1: Import Data and create an experiment
Now that you have been introduced to the GeneSpring GX terminology and interface, we will begin the analysis of the one-color Affymetrix dataset in GeneSpring GX.
Experimental Design of the Tutorial Dataset
Patients with cardiomyopathy have weakened heart pumps, which can result in the heart not being able to pump enough blood to the body's other organs, a condition known as congestive heart failure (CHF). Patients with ischemic cardiomyopathy have weakened heart pumps due to insufficient blood and oxygen being delivered to the area. Patients with idiopathic cardiomyopathy have weakened heart pumps due to an unknown cause. To better understand the molecular mechanism underlying congestive heart failure caused by ischemic and idiopathic cardiomyopathy, transcriptional profiling was performed on human myocardial samples from patients with these etiologies and from non-failing hearts.
In the experiment that you will be analyzing, myocardial mRNA was collected, amplified, labeled, and applied to Affymetrix HG-U133 Plus 2 arrays. The experiment consists of 4 biological replicates for each of the following groups: non-failing, ischemic cardiomyopathy, and idiopathic cardiomyopathy. Each of t
74. riment description
[Dialog text: Enter a name for the new experiment, select the appropriate experiment type, and choose the desired workflow. Guided workflows will take you through experiment creation and analysis, while advanced analysis will allow access to the full set of analysis tools. Experiment name: Congestive Heart Failure. Experiment type: Affymetrix Expression. Workflow type: Advanced Analysis.]
Figure 4. Enter experiment description and select workflow type in this window.
2. Import data files into GeneSpring GX
• In the New Experiment (Step 1 of 4): Load Data window, click on the Choose File(s) button to search for the data files. See Figure 5.
• In the Open window, locate the data files of the dataset.
• Select all 12 files and click Open.
• In the New Experiment (Step 1 of 4): Load Data window, click Next >>. See Figure 6.
• In the New Experiment (Step 2 of 4): Select ARR Files window, click Next >>.
[Dialog text, New Experiment (Step 1 of 4): Load Data: You can choose data files, previously used samples, or both to use in this experiment. Once a data file has been imported and used as a sample, it will be available for use in any future experiment.]
Figure 5. Use the New Experiment window to choose files or samples to import.
75. s from the One-way ANOVA. See Figure 33.
• The Significance Analysis (Step 7 of 8): Results window reports results from the One-way ANOVA in several displays. For an explanation of each result display, consult the GeneSpring GX User Guide Manual.
[Dialog text, Statistical Analysis (Step 7 of 8): Results: To apply a new p-value cutoff, click on the Change cutoff button. To save entities that passed the applied cutoff, click Next. To save a subset of these entities as a custom entity list, select entities from the view and click the Save custom list button. Displaying 10905 entities out of 44566 satisfying p-value cutoff 0.05.]
Differential Expression Analysis Report
Test Description: Selected Test: One-way ANOVA. P-value computation: Asymptotic. Multiple Testing Correction: Benjamini-Hochberg.
Result Summary:
            P all   P < 0.05   P < 0.02   P < 0.01   P < 0.0…
Corrected   44566   10905      5281       2659       1191
Expected            2228       891        445        222
Post hoc Analysis Report
Test Description: SNK Post Hoc test. Entities found to be differentially expressed are represented in the blue boxes, while entities found not to be differentially expressed are represented in the orange boxes. To save entities of interest as an entity list, select one or multiple boxes and click on the Union or Intersection button.
Result Summary: Group Name: Non-failing, Ischemic, Idiopathic
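The "Expected" row and the Benjamini-Hochberg correction named in the dialog above can be illustrated with a short sketch. This is not GeneSpring GX code: the function names are hypothetical, and the reading of "Expected" as the number of tests multiplied by the p-value cutoff is an assumption that happens to reproduce the values shown (for example, 44566 × 0.05 ≈ 2228).

```python
def expected_by_chance(n_tests, cutoff):
    """Tests expected to pass an uncorrected p-value cutoff by chance alone
    (assumed interpretation of the 'Expected' row)."""
    return n_tests * cutoff


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


N_TESTED = 44566  # probe sets entering the ANOVA, taken from the dialog text
for cutoff in (0.05, 0.02, 0.01):
    print(cutoff, int(expected_by_chance(N_TESTED, cutoff)))
```

Comparing the corrected count at a cutoff (10905 at 0.05) with the count expected by chance (2228) gives a rough sense of how much signal remains after correction.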
76. s results. Thus, having replicate samples for each experiment condition is a crucial part of good experiment design. In the Congestive Heart Failure (CHF) experiment, each of the 12 samples represents one of three CHF Etiology conditions. Each unique CHF Etiology condition is represented by 4 replicate samples (2 females and 2 males). To group individual samples into replicates within an experimental condition, you must first define the parameter(s) associated with the experiment and assign each sample the proper parameter values.
Exercise 1: Define experiment parameters and assign parameter values to each sample
1. Activate the Experiment Grouping window
• In the Workflow panel, open the Experiment Setup section and click on the Experiment Grouping link.
2. Create parameters and assign parameter values
• Parameters associated with your experiment can be added to this window in one of two ways. Parameters and parameter values for each sample can be loaded automatically from a file containing such information: click on the Load experiment parameters from file icon, select the file, and click Open. Parameters and parameter values can also be added to this window manually. We will explore both methods in this tutorial.
• Load parameter and parameter values from file
o Click on the Load parameters from fil
77. selection
[Screenshot: Entity List Inspector. Technology: Affymetrix GeneChip HG-U133_Plus_2; number of entities: 6; columns: Probe Set, Unigene, Gene Symbol, Gene Title, Entrez Gene, GO.]
Figure 23. The Entity List Inspector shows the probe sets contained in the Entity List along with selected annotations associated with each probe set.
Exercise 4: Use the Entity Inspector to view data for a single entity
When performing data analysis, there is often a need to interrogate the data for a single entity of interest. For example, if gene x is known to play an important role in the biological process the experiment is examining, you may want to immediately see the expression profile of gene x in your experiment. In GeneSpring GX you can search for a gene of interest and interrogate the data for that gene. For this Congestive Heart Failure exper
78. setting will instruct GeneSpring GX to return all probe sets with at least a 1.5-fold difference in intensity value between Non-failing and Ischemic OR Non-failing and Idiopathic.
o Click Close.
• The Fold Change (Step 3 of 4): Fold Change Results window should now reflect the results of the analysis using the new cutoff.
o The number of probe sets that pass the current filter criteria is displayed on top of the profile plot.
o Clicking on the Fold change tab below the Profile Plot will allow you to see the calculated fold change for each probe set that passed the filter.
• Click Next >>.
5. Save filtered probe sets as an Entity List
• In the Fold Change (Step 4 of 4): Object Details window, type "Fold change greater than 1.5 in Non-failing vs Ischemic or Non-failing vs Idiopathic" in the Name box.
• Click Finish.
[Dialog text, Fold Change (Step 2 of 4): Pairing Options: You can choose all conditions against a single control condition or explicitly specify one or more pairs of conditions. Select pairing option: All conditions against control. Select control condition: Non-failing.]
Figure 36. The Fold Change Pairing Options window allows you to select the pair(s) of conditions for fold change analysis. Fold change analysis can also be performed between all conditions against control.
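The "all conditions against control" filter described above can be sketched as follows. This is a minimal illustration, not the GeneSpring GX implementation: it assumes the condition averages are on a log2 scale, so that a 1.5-fold difference corresponds to an absolute log2 difference of at least log2(1.5). The function name, probe names, and values are hypothetical.

```python
import math


def passes_fold_change(log2_values, control, others, cutoff=1.5):
    """True if any non-control condition differs from the control
    by at least `cutoff` fold in either direction (log2-scale input assumed)."""
    threshold = math.log2(cutoff)
    return any(
        abs(log2_values[cond] - log2_values[control]) >= threshold
        for cond in others
    )


# Hypothetical condition averages for three probe sets.
probe_sets = {
    "probe_A": {"Non-failing": 0.0, "Ischemic": 0.7, "Idiopathic": 0.1},
    "probe_B": {"Non-failing": 0.2, "Ischemic": 0.3, "Idiopathic": 0.0},
    "probe_C": {"Non-failing": 1.0, "Ischemic": 0.9, "Idiopathic": 0.2},
}

selected = [
    name
    for name, values in probe_sets.items()
    if passes_fold_change(values, "Non-failing", ["Ischemic", "Idiopathic"])
]
print(selected)
```

Using `any` mirrors the OR in the filter: a probe set is kept if either pairing against Non-failing reaches the cutoff.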
79. t Navigator bar. Alternatively, you can go to the toolbar and choose Project > Close Project.
Figure 2a. GeneSpring GX main window is divided into 3 main sections: (1) Navigator, (2) browser, and (3) Workflow panel.
[Screenshot: Navigator for the Demo Project and its experiment "HeLa cells treated with compound X", showing the Samples, Interpretations, Analysis, and Imported Pathways folders.]
Figure 2b. View of the GeneSpring GX Navigato
80. t hand spreadsheet
[Screenshot: Find Similar Pathways results, with Similar Pathways and Non-similar Pathways panels and a Change cutoff button.]
Figure 48. The results window automatically updates the results as a new p-value cutoff is entered.
5. Save the significant pathway results
• In the Find Similar Pathways (Step 2 of 2): Results window, click Finish. This will save all of the pathways in the Similar Pathways panel to the "Similar Pathways satisfying p-value cutoff" folder in the Navigator of the active experiment.
6. View the IL-7 pathway in GeneSpring GX
• First, make sure that the "Fold change greater than 1.5 in Non-failing vs Ischemic or Non-failing vs Idiopathic" Entity List is selected in the Navigator, as this is the input list for Find Similar Pathways analysis. As with any other View in GeneSpring GX, only the entities found in the Entity List that are also found in the pathway will be displayed.
• Double-click on the IL-7 pathway icon. The IL-7 pathway should now be displayed in the browser. The nodes in the pathway that have a blue outline are the gene products that are also found in the currently selected Entity List. See Figure 49.
• You can zoom into any part of
81. t in a spreadsheet view
• From the Analysis navigator folder, select the All Entities list.
• From the Interpretations navigator folder, select the CHF Etiology interpretation.
• From the menu, click View > Spreadsheet. The spreadsheet window opens and reports the normalized intensity values for all entities in the selected list. See Figure 21.
• Close the Spreadsheet.
[Screenshot: Spreadsheet view for the Congestive Heart Failure experiment, with columns Probe Set, Non-failing, Ischemic, Idiopathic, Unigene, Gene Symbol, Gene Title, and Entrez.]
82. t parameter values associated with each sample.
Exercise 2: Create experimental interpretations to group replicate samples into conditions
Creating the appropriate parameters and assigning the proper parameter values to each sample allows you to identify and group samples in multiple ways. This next step will demonstrate how you can create different interpretations to group samples in different ways for subsequent analysis.
1. Activate the Create Interpretation window
• In the Workflow panel, open the Experiment Setup section and click on the Create Interpretation link.
2. Create an interpretation in which samples are grouped by the parameter CHF Etiology
• In the Create Interpretation (Step 1 of 3): Select parameters window, check the parameter CHF Etiology and click Next >>. See Figure 13.
• In the Create Interpretation (Step 2 of 3): Select conditions window, make sure that all the conditions defined by the parameter CHF Etiology are checked. There are three unique parameter values for parameter CHF Etiology. Therefore, samples will be grouped into three unique experimental conditions: Idiopathic, Ischemic, and Non-failing. See Figure 14.
o Make sure that the box "Average over replicates in conditions" is checked. When this box is checked, the average signal intensity value for each entity across the replicate samples in the condition will be used for display and for analysis. If this box is unchecked the
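What "Average over replicates in conditions" does for a single entity can be sketched as below. This is an illustration only, not GeneSpring GX code; the function name, sample names, and intensity values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_over_replicates(sample_values, sample_condition):
    """Average one entity's values across the replicate samples of each condition.

    sample_values:    {sample name: intensity value for this entity}
    sample_condition: {sample name: condition, e.g. a CHF Etiology value}
    """
    grouped = defaultdict(list)
    for sample, value in sample_values.items():
        grouped[sample_condition[sample]].append(value)
    return {condition: mean(values) for condition, values in grouped.items()}


# Hypothetical values for one probe set in four samples.
values = {"s1": 1.0, "s2": 3.0, "s3": 0.5, "s4": 1.5}
conditions = {"s1": "Ischemic", "s2": "Ischemic",
              "s3": "Non-failing", "s4": "Non-failing"}
print(average_over_replicates(values, conditions))
```

Grouping by a different parameter, or by a combination of parameters, only changes the `sample_condition` mapping; the averaging step is the same.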
83. tation window
o In the Workflow panel, open the Experiment Setup section and click on the Create Interpretation link.
• In the Create Interpretation (Step 1 of 3): Select parameters window, check the parameter Gender and click Next >>.
• In the Create Interpretation (Step 2 of 3): Select conditions window, make sure that all the conditions defined by the parameter Gender are checked. There are two unique parameter values for parameter Gender. Therefore, samples will be grouped into two unique experimental conditions: Male and Female.
o Make sure that the box "Average over replicates in conditions" is checked.
o Click Next >>.
• Save the new interpretation as Gender.
o In the Name box, type Gender.
o Click Finish.
5. Create a new experiment interpretation that will group samples by both parameters, CHF Etiology and Gender
• Samples can also be grouped by multiple parameters. For example, you can group the samples in this experiment by both CHF Etiology and Gender. Only samples with the same CHF Etiology value and Gender value will now be grouped in the same condition and be considered replicate samples.
• Repeat this exercise by first activating the Create Interpretation window.
o In the Workflow panel, open the Experiment Setup section and click on the Create Interpretation
84. te of Bioinformatics, Bangalore, India. To import new BioPAX pathways into GeneSpring GX, go to the Utilities section of the Workflow and click on the Import BioPax pathways link. Pathways will be imported and saved in the GeneSpring GX database. For more information regarding importing BioPAX pathways, please refer to the GeneSpring GX Quick Start Guide or the GeneSpring GX User Manual. For this tutorial we will look at the pathways that have already been pre-loaded into GeneSpring GX.
1. Activate the Find Similar Pathways tool
• In the Workflow panel, open the Results Interpretation section and click on the Find Similar Pathways link.
2. Perform Find Similar Pathways analysis
• In the Find Similar Pathways (Step 1 of 2): Input Parameters window, select the Entity List to be used for the analysis.
o Click the Choose button to select the Entity List for the analysis. From the Analysis folder, select the "Fold change greater than 1.5 in Non-failing vs Ischemic or Non-failing vs Idiopathic" list and click OK.
o Click Next >>.
3. Review results for Find Similar Pathways analysis
• The Find Similar Pathways tool will match the entities in the input Entity List to all of the pathways that have been saved in the GeneSpring GX database. For each pathway, Fisher's Exact test is used to compute a p-value that indicates whether the overlap observed between the entities found in the Entity List and the pathway is due to chance. The resu
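The overlap test described above can be sketched with the hypergeometric tail that underlies a one-sided Fisher's Exact test. This is an illustration under stated assumptions, not the GeneSpring GX implementation: whether the tool uses a one-sided or two-sided test, and what it takes as the gene universe, is not stated in this tutorial. The function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_universe, n_pathway, n_list, n_overlap):
    """One-sided Fisher's Exact test for over-representation.

    Probability of seeing at least `n_overlap` pathway members when
    `n_list` entities are drawn at random, without replacement, from a
    universe of `n_universe` entities that contains `n_pathway` pathway members.
    """
    total = comb(n_universe, n_list)
    upper = min(n_pathway, n_list)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 16-node pathway, 500-entity list, 8 entities in common.
print(enrichment_p_value(n_universe=20000, n_pathway=16, n_list=500, n_overlap=8))
```

A small p-value means an overlap of that size would rarely arise from a randomly drawn list of the same length.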
85. terpretation Select conditions window allows you to determine what conditions you would like to include for the interpretation to be created. In addition, it allows you to choose whether or not to average the intensity values for each entity across the replicates in each condition.
[Dialog text, Create Interpretation (Step 3 of 3): Save Interpretation: This page displays the details of the interpretation created. Name: CHF Etiology. Owner: gxuser. Average over replicates in conditions: Yes. Parameters: CHF Etiology.]
Figure 15. The Save Interpretation window saves the details of the interpretation created.
3. View expression data as defined by the CHF Etiology interpretation. See Figure 16.
• A profile plot displaying your expression data should automatically appear in the browser of GeneSpring GX.
• Look into the Analysis folder. The All Entities list is in bold, indicating that the list is selected for display in the profile plot. GeneSpring GX will only show the expression data for the entities in the selected list.
• Look into the Interpretations folder. The CHF Etiology interpretation is in bold, indicating that the interpretation is selected for disp
86. the number of probe sets that passed the filter criteria. Here you see that 44566 probe sets out of 54675 probe sets in the All Entities list had a signal intensity value above the 20th percentile in 100% of the samples of at least 1 experimental condition.
• Click Next >>.
5. In the Filter by Expression (Step 4 of 4): Save Entity List window, save the probe sets that passed the filtering criteria as an Entity List.
• In the Name box, type "QC probe sets" and click Finish.
[Dialog text, Filter by Expression (Step 3 of 4): Output Views of Filter by Expression: Profile plot and spreadsheet view of entities that passed the filter. Displaying 44566 of 54675 entities where at least 100 percent of samples in any 1 out of 6 conditions have values between 20.0 and 100.0 percentile. Conditions shown: Non-failing Female, Non-failing Male, Ischemic Female, Ischemic Male, Idiopathic Female, Idiopathic Male.]
Figure 30. Filter on Expression window shows the number of probe sets that would pass the current filter criteria.
Section 6: Identifying Probe sets of Interest
After eliminating samples and probe sets of poor quality, the next step in analyzing this dataset will be to identify probe sets t
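The filter rule quoted in the dialog ("at least 100 percent of samples in any 1 out of 6 conditions have values between 20.0 and 100.0 percentile") can be sketched as follows. This is an assumption-laden illustration, not GeneSpring GX code: it assumes the percentile is computed within each sample over all probe sets, uses a simple nearest-rank percentile, and treats the lower bound as inclusive. All names and data are hypothetical.

```python
import math


def percentile_cutoff(values, pct):
    """Nearest-rank percentile of a list of values (assumed percentile method)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def filter_by_expression(data, conditions, lower_pct=20.0):
    """Keep probes whose value is at or above the lower percentile in every
    sample of at least one condition.

    data:       {sample: {probe: value}}
    conditions: {condition: [sample, ...]}
    """
    cutoffs = {
        sample: percentile_cutoff(list(values.values()), lower_pct)
        for sample, values in data.items()
    }
    probes = list(next(iter(data.values())).keys())
    kept = []
    for probe in probes:
        for samples in conditions.values():
            if all(data[s][probe] >= cutoffs[s] for s in samples):
                kept.append(probe)
                break  # one passing condition is enough
    return kept


# Hypothetical data: 5 probes, 4 samples, 2 conditions.
data = {
    "s1": {"p1": 1, "p2": 2, "p3": 3, "p4": 4, "p5": 5},
    "s2": {"p1": 5, "p2": 4, "p3": 3, "p4": 2, "p5": 1},
    "s3": {"p1": 1, "p2": 2, "p3": 3, "p4": 4, "p5": 5},
    "s4": {"p1": 1, "p2": 2, "p3": 3, "p4": 4, "p5": 5},
}
conditions = {"A": ["s1", "s2"], "B": ["s3", "s4"]}
print(filter_by_expression(data, conditions, lower_pct=40.0))
```

In the toy data, p1 falls below the cutoff in at least one sample of every condition and is dropped, while p5 fails in condition A but passes in condition B and is kept.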
87. tions box, leave the default number of iterations at 50.
• Click Next >>.
4. In the Clustering (Step 3 of 4): Output views window, click Next >>.
• This window allows you to preview the results of the clustering analysis. We will look more closely at the results once we have saved them as a data object in the Navigator. Note that there is a Cluster Set tab within the window. This view allows you to qualitatively assess the quality of the clustering. Click on the Cluster Set tab.
5. Save the results of K-means clustering analysis
• In the Clustering (Step 4 of 4): Object Details window, type "K-Means with 5 clusters Entity Classification" in the Name box.
• In GeneSpring GX, a clustering result is saved as a data object called a Classification.
• Click Finish.
6. Inspect the clustering analysis results
• Once the Classification has been saved, the clustering results will automatically be displayed in the currently selected view. For this analysis, if a Profile Plot is the currently selected view, GeneSpring GX will display 5 Profile Plots within one window. Each Profile Plot corresponds to a cluster and displays the expression profiles of the probe sets belonging to that cluster. A Profile Plot is the most useful display for viewing clustering results. If the Profile Plot is not already selected, select it before double-clicking on the Cl
88. tp://biopax.org. This website also contains links to a number of pathway databases that provide pathways in the BioPAX format, such as KEGG, BioCyc, and NCI Cancer Cell Map. A list of other sources of BioPAX-compatible pathways is provided at the Pathguide site, http://pathguide.org.
Note: You are not permitted to download or import KEGG pathway data for use with the Software unless you have obtained the appropriate license to do so directly from Pathway Solutions Inc (pws kegg org). See also http://pathway.jp/index.html or http://www.biopax.org for details. Other pathway/network data providers may require similar license agreements, and User should obtain all appropriate licenses before downloading any such data.
The Pathways tool in GeneSpring GX allows you to integrate information regarding the dynamics and dependencies of the genes of interest within a pathway. The Find Similar Pathways tool also allows you to quickly answer the questions: What pathways are my genes of interest found in? In which biological pathways is there a significant enrichment of my genes of interest? GeneSpring GX comes pre-loaded with a small set of 21 pathways in the BioPAX format, courtesy of the Computational Biology Center at Memorial Sloan-Kettering Cancer Center, Gary Bader's lab at the University of Toronto, the Pandey Lab at Johns Hopkins University, and the Institu
89. u to select the pairs of conditions to be used for GSEA.
[Dialog text, GSEA (Step 3 of 5): Choose Gene Sets: Please set the parameters for the Gene Set Enrichment Algorithm. Algorithm parameters: Min no. of genes to be found in a Gene List: 15; Maximum no. of permutations: 1000. Search options: Simple Search, Advanced Search. BROAD Gene Sets: C1 Cytogenetic Sets, C2 Functional Sets, C3 Regulatory Motif Sets, C4 Neighborhood Sets.]
Figure 45. The Choose Gene Sets window allows you to define the parameters for GSEA.
5. Review results for GSEA analysis
• Gene sets with a significant q-value for any of the three pairs of conditions selected for analysis are listed in the GSEA (Step 4 of 5): Results from GSEA window. See Figure 46. For information on the reported values, please refer to the GeneSpring GX User Manual.
• Click Finish to save all significant Gene sets.
• Activate the Entity List Inspector for the chr6q13 list by double-clicking on the Entity List icon in the Navigator.
• In the Notes section of the Entity List Inspector, scroll down to see the q-values reported for each of the three pairs of conditions selected for analysis. You will see that it is the Idiopathic vs Non-failing comparison for which there is significant enrichment of the genes in the chr6q13 gene set
90. ue and corrected p-values for the category, and the number of counts (probe sets in the Entity List that are found in the category). Note that these labels can be moved around by dragging them to their desired position. Double-click on any region of the Pie Chart. This will instruct GeneSpring GX to display the GO categories directly under the selected parent category (the category that you double-clicked on). Use the right and left arrow icons to move up or down the GO classification schema. To save the probe sets in a specific category, select the region of interest in the Pie Chart and click on the Save custom list button.
o The Spreadsheet displays GO categories in which there was a significant enrichment of the probe sets used for the analysis. Note that GeneSpring GX automatically applied a corrected p-value cutoff of 0.1. Thus, the Spreadsheet will only show categories with a corrected p-value of less than 0.1. The corrected p-value cutoff can be changed by clicking on the Change cutoff button and entering a new cutoff value. The values in any of the columns in the Spreadsheet can be sorted by clicking on the column header. To save the probe sets in a specific category, select the category and click on the Save custom list button.
• In the GO Analysis (Step 2 of 2): Output views window, apply a new corrected p-value cutoff
91. ular component are my genes involved in? Is there a significant enrichment of my genes of interest in any particular ontology? Did my experimental conditions have a significant effect on the expression of genes involved in a particular biological function? For this tutorial we will limit our analysis to only one of the Entity Lists of interest.
1. Activate the GO Analysis tool
• In the Workflow panel, open the Results Interpretation section and click on the GO Analysis link.
2. Use the GO Analysis tool to identify the GO categories in which there is a significant enrichment of the genes found to be differentially expressed between Non-failing and Idiopathic conditions
• In the GO Analysis (Step 1 of 2): Input Parameters window, select the Entity List to be used for the analysis.
o Click the Choose button to select the Entity List for the analysis. From the Analysis folder, select the "Differentially expressed between Non-failing and Idiopathic" Entity List.
o Click Next >>.
• In the GO Analysis (Step 2 of 2): Output Views window, view the results from the GO Analysis. See Figure 43.
o The GO Analysis (Step 2 of 2): Output Views window reports results from the GO Analysis in several displays. For an explanation of each result display, consult the GeneSpring GX User Guide Manual.
o In the Pie Chart display, click on the Call out icon to see the labels for the different regions of the Pie Chart. Each label contains the GO ID, GO term, p-val
92. vate the Clustering tool
• In the Workflow panel, open the Analysis section and click on the Clustering link.
2. In the Clustering (Step 1 of 4): Input Parameters window, select the following:
• Click the Choose button to select the Entity List for the analysis.
o From the Analysis folder, select the "Fold change greater than 1.5 in Non-failing vs Ischemic or Non-failing vs Idiopathic" Entity List and click OK.
• Click the Choose button to select the Interpretation for the analysis.
o From the Interpretations folder, select the All Samples interpretation and click OK.
• From the Clustering Algorithm drop-down menu, select Hierarchical.
• Click Next >>.
3. In the Clustering (Step 2 of 4): Input Parameters window, select input parameters for Hierarchical Clustering analysis.
• From the Cluster on drop-down menu, select Both entities and conditions. This will instruct GeneSpring GX to simultaneously perform Hierarchical Clustering on both entities and conditions, where the result will be a 2-dimensional dendrogram.
• From the Distance metric drop-down menu, select Pearson Centered.
• From the Linkage rule drop-down menu, select Centroid.
• Click Next >>.
4. In the Clustering (Step 3 of 4): Output views window, click Next >>.
• This window allows you to preview the results of the clustering analysis. We will look more closely at the results once we have saved them as a data object in the Navigator.
5. Save
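The distance metric selected above can be illustrated with a small sketch. It assumes that "Pearson Centered" means one minus the mean-centered Pearson correlation, which is a common convention but is not confirmed by this tutorial; the function name and profiles are hypothetical, and the sketch covers only the distance, not the Centroid linkage step.

```python
import math


def pearson_centered_distance(x, y):
    """1 - Pearson correlation of two equal-length expression profiles.

    0 means identical shape, 1 means uncorrelated, 2 means opposite shape.
    Profiles with zero variance are not handled (division by zero).
    """
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (norm_x * norm_y)


# Hypothetical profiles across three conditions.
up = [1.0, 2.0, 3.0]
up_scaled = [2.0, 4.0, 6.0]
down = [3.0, 2.0, 1.0]
print(pearson_centered_distance(up, up_scaled))
print(pearson_centered_distance(up, down))
```

Because the metric depends only on the shape of a profile, two probe sets with very different absolute intensities can still be placed next to each other in the tree.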
93. ve these probe sets as an Entity List, click Next >> in the Significance Analysis (Step 7 of 8): Results window.
• In the Save Entity List window, type "Differentially expressed between at least two CHF etiologies" and click Finish.
Exercise 3: Identify probe sets of interest using the Venn Diagram
The Venn Diagram is a visualization tool in GeneSpring GX that can be used to compare the content of up to three different Entity Lists. From this comparison you can create Entity Lists containing entities in the overlap or non-overlap regions of the Venn Diagram. As a result, entities of interest can be extracted from the comparison and saved as an Entity List. At this point of the analysis we are interested in answering the following questions: What genes are differentially expressed between Non-failing and Idiopathic but not between Non-failing and Ischemic? What genes are differentially expressed between Non-failing and Ischemic but not between Non-failing and Idiopathic? What genes are differentially expressed between both Non-failing and Ischemic AND Non-failing and Idiopathic?
1. Activate the Venn Diagram
• From the View menu, select Venn Diagram.
2. Project the three Entity Lists of interest onto the Venn Diagram. See Figure 34.
• In the Choose Entity List window, select the following:
o Entity List 1: Differentially expressed between Non fail
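The three questions posed in this exercise map directly onto set operations on the Entity Lists. The sketch below uses plain Python sets with hypothetical probe set IDs and a hypothetical function name; it illustrates the logic of the Venn regions, not the GeneSpring GX tool itself.

```python
def venn_regions(list_a, list_b):
    """Split two entity lists into the three regions of a two-circle Venn diagram."""
    a, b = set(list_a), set(list_b)
    return {
        "only_a": a - b,  # in the first list but not the second
        "only_b": b - a,  # in the second list but not the first
        "both": a & b,    # the overlap (intersection)
    }


# Hypothetical probe set IDs.
nonfailing_vs_idiopathic = ["p1", "p2", "p3"]
nonfailing_vs_ischemic = ["p2", "p3", "p4", "p5"]

regions = venn_regions(nonfailing_vs_idiopathic, nonfailing_vs_ischemic)
print(sorted(regions["only_a"]))
print(sorted(regions["only_b"]))
print(sorted(regions["both"]))
```

The Union and Intersection buttons mentioned in the post hoc report correspond to `a | b` and `a & b` respectively.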
94. y values significantly changed.
• The Post hoc Analysis Report panel displays the post hoc results in a matrix of the three tested conditions. The blue boxes indicate the number of probe sets with intensity values that the post hoc test determined to be statistically different between the conditions. The orange boxes indicate the number of probe sets with intensity values that the post hoc test determined to not be statistically different between the conditions. To make an Entity List of probe sets of interest, click on the box you wish to select and click on the Save Custom Lists button. Entities from multiple boxes can be saved to a single Entity List. To do this, click on the boxes you wish to select and click on the Union or Intersection button.
• Make an Entity List containing probe sets with intensity values that are statistically different between both Non-failing and Idiopathic and Non-failing and Ischemic.
o Click on the blue box corresponding to Non-failing and Idiopathic (5,148 probe sets). Hold down the Shift key and click on the blue box corresponding to Non-failing and Ischemic (9,748 probe sets). Both boxes should now be selected. Click on the Intersection button. This will create an Entity List containing probe sets that are differentially expressed between Non-failing and Idiopathic AND Non-failing and Ischemi
95. Figure 17. Each line in the Profile Plot View represents a probe set's normalized intensity values (Y-axis) in each condition (X-axis) defined by the selected interpretation. The plot shows the averaged value for each probe set in a condition.
2. Set the order of experimental conditions for the display
The order in which experimental conditions are displayed in the views can be specified. For example, in this experiment you have parameters CHF Etiology and Gender, and you want your profile plots to group conditions by Gender first, with Female first followed by Male. Furthermore, within each Gender condition, you would like the CHF Etiology conditions to be in the following order: Non-failing first, followed by Ischemic and Idiopathic. To achieve this, use the Experiment Grouping window to set the conditions in the desired order.
• Activate the Experiment Grouping window.
o In the Workflow panel, open the Experiment Setup section and click on the Experiment Grouping link.
• Set the order of experimental conditions.
o Select the Gender column by clicking into any of the parameter value cells.
o Click on the left arrow icon button on the upper left of the Experiment Grouping window