now - Textco BioSoftware

1. List of all Vectors Included With Gene Inspector

No.  Vector                  Supplier
1    ColE1                   Sigma
2    EMBL3 LeftArm           Clontech
3    EMBL3 RightArm          Clontech
4    EMBL3 S6 T7 LeftArm     Clontech
5    EMBL3 S6 T7 RightArm    Clontech
6    f1 phage                Phage
7    fd phage                Phage
8    fd 478 phage            Phage
9    fd tet phage            Phage
10   lambda phage            Phage
11   M13 phage               Phage
12   M13 PhageScript         Stratagene
13   M13BM20                 Boehringer
14   M13BM21                 Boehringer
15   M13mp10                 Amersham
16   M13mp18                 NovaGen
17   M13mp19                 Pharmacia
18   M13mp8                  Boehringer
19   M13mp9                  Boehringer
20   M13tg130                Amersham
21   M13tg131                Amersham
22   p2Bac                   In Vitrogen
23   pA0815                  In Vitrogen
24   pAC360                  In Vitrogen
25   pAcUW31                 Clontech
26   pACYC177                New England Biolabs
27   pACYC184                New England Biolabs
28   pADbeta                 Clontech
29   pAdVAntage              Promega
30   pAL 781                 In Vitrogen
31   pALTCON                 Promega
32   pALTER 1                Promega
33   pALTER Ex1              Promega
34   pALTER Ex2              Promega
35   pAMP1                   BRL
36   pAMP10                  BRL
37   pAMP18                  BRL
38   pAMP19                  BRL
39   pAMP2                   BRL
40   pAT153                  Amersham
41   pAX4a                   U.S. Biochemicals (USB)
42   pAX4a                   U.S. Biochemicals (USB)
43   pAX4b                   U.S. Biochemicals (USB)
44   pAX4b                   U.S. Biochemicals (USB)
45   pAX4c                   U.S. Biochemicals (USB)
46   pAX4c                   U.S. Biochemicals (USB)
47   pAX5                    U.S. Biochemicals (USB)
48   pAX5                    U.S. Biochemicals (USB)
49   pBacPAK1                Clontech
50   pBacPAK8                Clontech
51   pBacPAK9                Clontech
52   pBC KS                  Stratagen
2. TABLE OF CONTENTS

  Taking Notes Using Background Text
  Creating and Using Style Sheets
  Adding More Analyses to a Setup
  Appendices: Hiding Large Amounts of Data
  Customizing and Saving Analysis Setup Suites
  Using Predefined Analysis Suites
  Restriction Enzyme Digests
  Displaying Formatted Sequence Information
  Testcode: An Interactive Analysis
  Dot Matrix Analysis: Another Interactive Analysis
  Using Bookmarks in the GI Notebook
  Creating Your Own Analysis Tables
  BLAST Searching

The GI Sequence Editor
  Introduction to the Sequence Editor
  The Overview Pane
  The Editing Pane
  Manipulating A Sequence
  Formatting A Sequence Within the Sequence Editor
  Drag and Drop Sequence Editing
  Entering and Checking Sequences
  Mapping the Keyboard
  Defining Speech Preferences (Mac only)
  Confirming Sequences
  Multiple Sequence Alignments
  Enhancing Aligned Sequence Displays
  Sequence Adornments
3. Show/Hide Positions
This option toggles the position numbers shown at the start of each line.

Manipulate
The Manipulate submenu deals with simple manipulations that can be performed directly on sequences within the sequence editor, rather than going through an analysis setup. Invert and Translate are available only for nucleic acid sequences, while Reverse Translate is available only for peptide sequences.

Invert
To invert a sequence is to flip it over so that the opposite DNA strand is shown in the 5' to 3' direction. The option is available only when a segment of DNA is selected. Note that the inversion takes place in the context of the sequence you are editing: it changes the original sequence by inverting the selected segment in place.

Translate
Like Invert, Translate is available only when a segment of nucleic acid is selected. You choose the translation table to use, and the selected segment is then translated. The newly generated peptide sequence is placed in a new peptide sequence window.

Reverse Translate
The Reverse Translate option is available only for peptide sequences. You can choose the table you wish to use for the reverse translation using the dialog (Figure 6.30: Reverse Translation Dialog).
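The Invert operation described above can be illustrated with a minimal Python sketch. It is not Gene Inspector code: the function name `invert_segment` and the 1-based, inclusive coordinates are assumptions made for this example. It reverse-complements only the selected segment and leaves the rest of the sequence untouched, which is what "inverting in place" amounts to.

```python
# Hypothetical helper for illustration only; not part of Gene Inspector.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def invert_segment(seq, start, end):
    """Reverse-complement seq[start..end] (1-based, inclusive) in place.

    Returns a new string in which only the selected segment is replaced
    by the opposite strand read in the 5' to 3' direction.
    """
    if not 1 <= start <= end <= len(seq):
        raise ValueError("segment must lie within the sequence")
    segment = seq[start - 1:end]
    # Complement every base, then reverse so the result reads 5' to 3'.
    inverted = segment.translate(COMPLEMENT)[::-1]
    return seq[:start - 1] + inverted + seq[end:]


print(invert_segment("TTCTCATG", 3, 6))
```

Inverting the same segment twice returns the original sequence, which is a quick sanity check when experimenting with the sketch.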
4. Sequence Adornments
  Using Custom Score Adornments
  Creating a Features Object View of a Sequence
  Importing Sequences
  Generating Sequences

Analyses
  Starting an Analysis
  The Analysis Setup Window
  The Analysis Monitor
  Input Sequence Panel
  The Output Location Panel
  Adding Analyses to an Analysis Setup Window
  Adding Analysis Setups to the Menu
  Modifying Output Objects
  Style Sheets
  Frames
  Median Sieving (DataSieving)
  Editing Translation and Codon Preference Tables
  Nucleic Acid Analysis
    Align 2 Sequences (Global)
    Align Multiple Sequences
    Base Composition
    Base Distribution
    Codon Preference
    Dot Matrix
    Find Inverted Repeats
    Find Repeats
    Find Sequence
    GC Coding Prediction
    Open Reading Frames
    Restriction Enzyme Digest
    TestCode
    BLAST
  Protein Analysis
    Accessible Surface Area
    Align 2 Sequences (Global)
5. Figure 6.3: Save As Dialog

Save a Copy
Save a Copy is similar to the Save As option discussed in the previous section, with one important difference. In contrast to Save As, which creates a new window (document) and makes it the active window, Save a Copy creates a new document but does not close the current window, so you can continue to work on the original document. In effect, the Save a Copy option creates a snapshot of the current state of the open window.

Revert to Saved
Revert to Saved restores the current window to the state it was in the last time the document was saved. If you have made any changes, you will be asked if you want to lose all the changes made since the last Save. The Revert to Saved option is a convenient way to restore a file to its original state after you might have accidentally made unwanted changes. It is a kind of super undo.
6. ...{ED}
This pattern is interpreted as: [Ala or Cys]-any-Val-any-any-any-any-{any but Glu or Asp}.

<A-x-[ST](2)-x(0,1)-V
This pattern, which must be at the N-terminus of the sequence ('<'), is translated as: Ala-any-[Ser or Thr]-[Ser or Thr]-(any or none)-Val.

Protein Cleavage Sites

Table 1: Protein Cleavage Sites

name                         recog. seq.   comments
acid                         D P           pH 2.5, mild acid hydrolysis
armillaria mellea protease   K
chymotrypsin                 Fwy
clostripain                  R
cyanogen bromide             M
endopeptidase LysC           K
hydroxylamine                N G           2M hydroxylamine, pH 9.0
NBS 1                        wY            N-bromosuccinimide, short incubation
NBS 2                        THWY          N-bromosuccinimide, long incubation
NTCB                         C             2-nitro-5-thiobenzoic acid
pancreatic elastase          AGSV
pepsin                       TFWY
proendopeptidase             D
thermolysin                  ILV
trypsin                      KR
V8 1                         E             staphylococcal protease V8, ammonium acetate, pH 4
V8 2                         DE            staphylococcal protease V8, phosphate buffer, pH 7.8

IUPAC Standard Nucleic Acid Codes

Table 2: Nucleic Acid Codes

Code    Bases          Mnemonic
A       A              Adenine
C       C              Cytosine
G       G              Guanine
T (U)   T (U)          Thymine (Uracil)
R       A or G         puRine
Y       C or T         pYrimidine
S       G or C         Strong bonding
W       A or T         Weak bonding
K       G or T         Keto
M       A or C         aMino
B       C or G or T    not A
D       A or G or T
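The two Prosite pattern interpretations at the top of this excerpt can be made concrete with a small sketch that turns a pattern into a Python regular expression. The function name `prosite_to_regex` is hypothetical. The first pattern is cut off in this excerpt, so the full form used below, `[AC]-x-V-x(4)-{ED}`, is an assumption chosen because it matches the interpretation given. The sketch covers only the syntax elements that appear in these two patterns.

```python
import re


def prosite_to_regex(pattern):
    """Translate a Prosite-style pattern into a Python regex (sketch only).

    Handles: x (any residue), [..] (one of), {..} (any but), repeat
    counts such as (4) or (0,1), and the '<' and '>' terminus anchors.
    """
    pieces = []
    for element in pattern.rstrip(".").split("-"):
        at_start = element.startswith("<")
        at_end = element.endswith(">")
        element = element.strip("<>")
        # Separate the residue specification from an optional repeat count.
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":
            piece = "."                       # any amino acid
        elif core.startswith("{"):
            piece = "[^" + core[1:-1] + "]"   # any residue except those listed
        else:
            piece = core                      # single residue or [..] set
        if low is not None:
            piece += "{" + low + ("," + high if high else "") + "}"
        pieces.append(("^" if at_start else "") + piece + ("$" if at_end else ""))
    return "".join(pieces)


print(prosite_to_regex("[AC]-x-V-x(4)-{ED}"))
print(prosite_to_regex("<A-x-[ST](2)-x(0,1)-V"))
```

The resulting strings can be passed to `re.search` to scan a peptide sequence written in one-letter codes.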
7. Figure 4.30: Find Sequence Output

sequence at three locations. The initial output object is shown in the top part of this figure, which is a graphical display of the location of the two parts of the search sequence along the DNA. The start of each part of the query sequence is shown as a tick mark along the horizontal line. By choosing Object > View As Table, you can see the data in tabular form, as shown in the bottom part of Figure 4.30. This reveals that the search sequence is found on the bottom strand of the DNA, because the First nucleotide column has a higher number than the Last nucleotide column. The first and last nucleotides in this case refer to the first and last nucleotides that match the search sequence.

In this search, no mismatches were allowed. As seen in the setup panel (Figure 4.29, page 4-34), however, it is possible to allow mismatches in any one or more of the search segments. If a single mismatch is allowed in the tataaa sequence and the analysis is recalculated, the results shown in Figure 4.31 (page 4-36) are obtained. Notice that the positions with mismatches are shown in lower case letters in the table output, while the exact matches are in upper case letters, the same convention used in the Find Repeats panel. If you ente

Figure 4.31: Find Sequence Output With Mismatches
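The conventions described above (a bottom-strand hit reported with First nucleotide greater than Last nucleotide, and mismatched positions shown in lower case) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program's algorithm: the function names are hypothetical, only a single search segment is handled, and a brute-force scan is used.

```python
# Illustrative sketch only; names and approach are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_with_mismatches(dna, query, max_mismatches=0):
    """Return (first_nt, last_nt, matched_text) for hits on either strand.

    Positions are 1-based on the top strand. A bottom-strand hit is
    reported with first_nt > last_nt. Mismatched positions are lower case.
    """
    dna, query = dna.upper(), query.upper()
    hits = []
    m = len(query)
    targets = (("top", query), ("bottom", reverse_complement(query)))
    for strand, target in targets:
        for i in range(len(dna) - m + 1):
            window = dna[i:i + m]
            flags = [a != b for a, b in zip(window, target)]
            if sum(flags) > max_mismatches:
                continue
            if strand == "bottom":
                # Show the hit as read 5' to 3' on the bottom strand.
                window, flags = reverse_complement(window), flags[::-1]
                first, last = i + m, i + 1
            else:
                first, last = i + 1, i + m
            text = "".join(c.lower() if f else c for c, f in zip(window, flags))
            hits.append((first, last, text))
    return hits


print(find_with_mismatches("GGTATCAACC", "TATAAA", max_mismatches=1))
```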
8. Figure 3.4: Format Sequence Dialog

...specify whether characters should be grouped or not, and to define the group size if grouping is active. The Spaces box allows you to set the spacing between groups. Spacing can be defined as a fixed number of pixels or as a percent of the standard character width. Note that the sequence editor allows you to use proportional fonts like Times, Helvetica, Palatino, and Bookman to display your sequences. The characters will still appear uniformly spaced in the window even if they do not have uniform width. However, in order to display proportional fonts in this way, the program has to calculate the position of each character individually before drawing it on the screen, and this slows down the display of sequences. This is not a problem on new computers, but it may be slow on older computers. If you find that updating the display is too slow, change the font to a monospaced font like Monaco or Courier, which eliminates the extra calculations.

You can also change the font size and styling of the sequences by using the items under the Format menu. Some of the commands which alter the

[Figure: Gene Inspector alert dialog, "Pose an alert to explain ambiguous commands or warn if an operation cannot be undone."]
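As a rough illustration of the grouping settings described above (characters per group and characters per line), the following Python sketch lays out a sequence as plain text. It is not how the sequence editor draws sequences: the function name, the default values of 10 and 50, and the leading position number on each line are assumptions made for the example.

```python
def format_sequence(seq, per_group=10, per_line=50):
    """Lay out a sequence in space-separated groups with a position prefix.

    per_group and per_line mirror the "Characters per group" and
    "Characters per line" settings; the defaults are assumptions.
    """
    lines = []
    for start in range(0, len(seq), per_line):
        chunk = seq[start:start + per_line]
        groups = [chunk[i:i + per_group] for i in range(0, len(chunk), per_group)]
        # Prefix each line with the 1-based position of its first character.
        lines.append(f"{start + 1:>6} " + " ".join(groups))
    return "\n".join(lines)


print(format_sequence("TTCTCATGTTTGACAGCTTATCATCGATAAGCTTTAATGCGGTAGTTTATCACAGTTAAA"))
```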
9. Figure 6.11: The Lines Submenu

If the selected object is a rectangle or other more complex graphic object, then the arrowhead options are grayed out. You can still pick line widths for these objects. If the object is a rounded rectangle, you can set the Corner Curvature to define the extent of roundness in the rounded corners.

Font
The Font menu lists all of the available fonts for use in Gene Inspector.

Style
The Style menu lists all the styles that can be applied to text. One that is not very common in other programs is Conditional Text. Conditional text can be shown or it can be hidden. You can use conditional text to keep comments for yourself that you do not want to be part of the regular notebook text. Conditional text is discussed under Conditional Text (page 5-2). Another useful option is Box Around, which is especially useful in displaying sequence information.

Size
Size allows you to set the size of any selected text.

Color
Color lists the colors that are currently defined in the Gene Inspector. In addition to the standard colors, additional possibilities are available through the submenus listed below.

Pick a Color
Pick a Color presents you with a standard Color Picker dialog supplied by the operating system. You can choose any color your computer can produce.

Add Color To Menu
Add Color To Menu can add a new color to the Color men
10. Figure 4.2: An Analysis Setup Window. The panel shown offers two calculation methods (Sliding Window Average, with a Window Size field, and Median Sieving, with a Mesh Size field) and a Table popup set to Argos et al. Its About the Analysis text reads:

This analysis is based on the statistical distribution of specific amino acids in membrane vs. non-membrane segments for a sample set of proteins (Argos et al., Eur. J. Biochem. 128, 55, 1982). This Trans-membrane Helix analysis is identical to the Membrane Buried Regions analysis. Window size is the number of adjacent amino acids whose property is calculated in each iteration. After calculating the value for the first window of amino acids, the window is moved one residue along the sequence and a value is calculated again for the new window of amino acids. Median Sieving emphasizes data having a specific distribution (J. A. Bangham, Anal. Biochem. 174, 142, 1988).

one analysis output object generated for each analysis of each sequence, although some analyses can use more than one sequence. Analysis panels represent the third kind of panel. There is one analysis panel for each analysis listed in the Analysis Chooser. The different analyses and their analysis panels are discussed later in the chapter. Each analysis chosen for a given Analysis Setup Window will have an icon or text name in the list on the left of the Analysis Setup Window. Along the top of the window i
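The sliding window calculation described in the About the Analysis text (compute a value for the first window, move one residue along, compute again) can be sketched as follows. The function names and the tiny property table are hypothetical and exist only to make the example runnable; they are not values from any published table, and Median Sieving is not implemented here.

```python
def sliding_window_average(values, window):
    """Average each run of `window` adjacent values, stepping one at a time."""
    if not 1 <= window <= len(values):
        raise ValueError("window must be between 1 and the number of values")
    total = sum(values[:window])
    averages = [total / window]
    for i in range(window, len(values)):
        # Slide the window one position: add the new value, drop the oldest.
        total += values[i] - values[i - window]
        averages.append(total / window)
    return averages


def property_profile(peptide, table, window):
    """Look up a per-residue property, then smooth it with a sliding window."""
    return sliding_window_average([table[aa] for aa in peptide], window)


# Made-up property values for three residues, for illustration only.
TOY_TABLE = {"A": 1.0, "K": -3.0, "L": 4.0}
print(property_profile("ALKLA", TOY_TABLE, window=3))
```

A sequence of length n analyzed with window size w yields n - w + 1 values, so larger windows give a smoother but shorter profile.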
11. Figure 4.34: GC Analysis Output (GC coding prediction plotted against nucleotide position)

Open Reading Frames
Open reading frames (ORFs) represent stretches along a DNA sequence in which there are no stop codons. The setup panel is shown in Figure 4.35 on page 4-40. The Method box lets you choose to determine ORFs as stretches of DNA between start and stop codons, or just between stop codons. For eukaryotic organisms with introns, it is best to choose just stop codons, but for prokaryotes it might be useful to choose both start and stop codons. The Display box lets you show either ORFs alone or ORFs and rare codons. Rare codons are those codons in a synonymous codon set that occur below the defined threshold level. Rare codons are not found often in true coding regions and can sometimes be used to confirm a region as being an actual coding region. The results of an Open Reading Frame analysis are shown in Figure 4.36 on page 4-40. All six reading frames are shown, and arrows indicate the direction of the coding region. In
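The two Method choices described above can be sketched in Python for a single reading frame. This is a simplified illustration, not the program's implementation: the function name is hypothetical, the standard stop codons TAA, TAG, and TGA and the start codon ATG are assumed, and only the top strand is scanned. Running the same function on the reverse complement would cover the other three frames.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed


def stop_free_stretches(dna, frame=0, min_codons=1, require_start=False):
    """Return (first_nt, last_nt) for stop-free stretches in one frame.

    frame is 0, 1, or 2. With require_start=True a stretch begins at the
    first ATG after a stop ("start and stop codons"); otherwise it begins
    right after the stop ("only stop codons"). Positions are 1-based.
    """
    dna = dna.upper()
    stretches, current = [], []
    starts = list(range(frame, len(dna) - 2, 3))
    for pos in starts + [None]:  # None is a sentinel that closes the last run
        is_stop = pos is None or dna[pos:pos + 3] in STOP_CODONS
        if not is_stop:
            if current or not require_start or dna[pos:pos + 3] == "ATG":
                current.append(pos)
            continue
        if current and len(current) >= min_codons:
            stretches.append((current[0] + 1, current[-1] + 3))
        current = []
    return stretches


print(stop_free_stretches("ATGAAATAGCCCGGGTAA"))
print(stop_free_stretches("ATGAAATAGCCCGGGTAA", require_start=True))
```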
12. The Analyses chapter of the manual (Chapter 4) describes what each analysis does and how the analysis results can be used for biological insight. It also describes how you can run analyses.

The chapter on the GI Notebook (Chapter 5) discusses how you can use the GI Notebook to record and discuss experimental results, and how the GI Notebook can be used as the repository for analysis results. In addition, the GI Notebook can be used to design and print posters.

The Sequence Editor chapter (Chapter 3) discusses in detail how you can create, align, and modify sequences. Each sequence editor document can hold one or more sequences. These sequences are used as the starting points for analyses. Multiple sequence alignments also reside in sequence editor documents. Extensive capabilities have been built into the sequence editor to enable the display of multiple aligned sequences with tremendous flexibility.

The Menu Items chapter (Chapter 6) lists and discusses every menu option available in the program, starting with the File menu on the left and going through the specific menus that appear on the right. It can be used as a handy reference.

Finally, the Tips chapter (Chapter 7) covers some useful ways of dealing with different aspects of the program, offers some suggestions on how you might work more efficiently with your data, and answers some frequently asked questions (FAQs). In many
13. Tutorials: Creating and Using Style Sheets

Notice that the styles of all the parts in the selected object change, because they were all defined in the style sheet. Defined style sheets are also available when you start an analysis, as part of the Styles popup menu.

This concludes this tutorial. You may quit or continue on to the next tutorial. If you choose to continue, close all open windows now.

TUTORIAL 11: ADDING MORE ANALYSES TO A SETUP

1. Choose Analysis > New Analysis and create a new protein analysis. Select Helical Wheel and choose OK. Each new analysis you add will have parameters associated with it. With the Helical Wheel icon selected in the left list, choose the Kyte & Doolittle table from the Table popup menu on the right.

2. Click on the Input Sequences icon on the left, open the peptide rhodopsins file, and select the sequence Lamprey rhodopsin. Click on the Lamprey rhodopsin sequence as shown in Figure 2.31. Press the Segment button on the bottom right of this panel and specify that you want to analyze residues 75 to 100, as shown in the figure. The segment selector always refers to the sequence that is highlighted in the list in the top part of the panel. Run the Analysis Setup
14. style to highlight a specific segment, or use the Upper Case or Lower Case styles to indicate other features.

17. Let's take a look at a features object containing a multiple sequence alignment. Open the peptide sequence file called rhodopsins. Choose Sequence > Display > Hide Overview. Align all the sequences by choosing Sequence > Alignment > Align All Sequences, and perform the alignment with the BLOSUM30 table. Use the Sequence > Consensus menu to show the consensus row and to show the scoring row.

18. Now click on one of the names in the left column of the sequence window, and then choose Edit > Select All to select all the sequences. Make sure you have an open notebook window visible, and then drag the sequences from the sequence editor window into the notebook with the mouse. A features object will appear in the notebook with the same alignment as you saw in the sequence editor window.

19. Double click on the new features object and a Sequences menu will appear. You may use the items in this menu to alter the display of the aligned sequences in this window. Note, however, that you cannot perform any new alignments within this object. It is for displaying the results only. To perform a new alignment, you must start from the sequence editor window or else perform a multiple sequence alignment from the Analysis menu.

This concludes this tutorial. You may
15. Show/Hide Right Positions
The Show/Hide Right Positions menu item will either show or hide the sequence position indicators at the right side of the sequence lines. This includes both nucleic acid and translated amino acid positions.

Show/Hide Line Dividers
The Show/Hide Line Dividers menu item will either show or hide the line dividers, which can be used to keep different sequence lines clearly separated.

Feature Margins
The Feature Margins menu item allows you to specify spacing between the sequence and the position markers, and between the sequence and the edge of the features object. The dialog is shown in Figure 6.26. The left margin and the right margin values define the minimum space between the sequence itself and the edge of the features object. The gap margin specifies the spacing between the position indicators and the sequence.

Figure 6.26: Feature Margins. The Sequence Margins dialog has fields for the Left margin, Right margin, and Gap margin, all in pixels.

Line Spacing
The Line Spacing menu item allows extra space to be added between each line of sequence displayed. Using the dialog box in Figure 6.27, a value can be entered for the number of

Figure 6.27: Features Line Spacing. The dialog has one field, extra space between lines in pixels.
16. below for more details.

Notebook Layout
The Notebook Layout dialog allows you to set the way in which text is arranged on the notebook sheet. Note that the notebook sheet size does not necessarily correspond to the size of a printer page: notebook sheets can contain any number or fraction of printer pages. The notebook sheet is used to define how large the printed output will be and how many printer pages it will contain. Using the File > Page Setup menu option to select a printer will automatically define the printer page size for you. The layout was discussed previously in GI Notebook Layout (page 5-4). The dialog box is also shown in Figure 6.24. In this case, the dialog is set to produce poster panels of 16 x 20 containing three columns of text.

Figure 6.24: Notebook Layout (Poster). The Page Layout dialog offers three notebook layout styles (Standard text layout, Side by side layout, and Poster sheet layout), a poster layout size measured in inches or in printer pages, column settings, margins in inches (Left, Right, Top, Bottom, Binding), and a Save as default page layout option.

Features Menu

Mark Sites
When a sequence is copied from a sequenc
17. Vectors by Supplier

Clontech (continued)
6   pADbeta
7   pBacPAK1
8   pBacPAK8
9   pBacPAK9
10  pbetagal Basic
11  pbetagal Control
12  pbetagal Enhancer
13  pbetagal Promoter
14  pBI101
15  pBI101 2
16  pBI101 3
17  pBin19
18  pCMVbeta
19  pDIRECT
20  pDR2
21  pEUK C1
22  pEX1
23  pGAD10
24  pGAD424
25  pGBT9
26  pGFP
27  pGFP 1
28  pGFP C1
29  pGFP C2
30  pGFP C3
31  pGFP N1
32  pGFP N2
33  pGFP N3
34  pGUSN358 S
35  pKK388 1
36  pMAM
37  pMAMneo
38  pMAMneo Blue
39  pMAMneo Cat
40  pMAMneo LUC
41  pNASSbeta
42  pNOM102
43  pPUR
44  pRAJ275
45  pSEAP Basic
46  pSEAP Control
47  pSEAP Enhancer
48  pSEAP Promoter
49  pSV2neo
50  pSVbeta
51  pT3T7 luc
52  pTKbeta
53  pUC118
54  pUC119
55  pYACneo
56  pYEUra3
57  rpDR2
58  rpSE937

IBI
1   pIBI24
2   pIBI25
3   pIBI30
4   pIBI31
5   pSTneo

In Vitrogen
1   p2Bac
2   pA0815
3   pAC360
4   pAL 781
5   pBlueBac4
6   pBlueBac4CAT
7   pBlueBacHis2CAT
8   pCDM8
9   pcDNA3
10  pcDNA3CAT
11  pcDNAI
12  pcDNAIAmp
13  pcDNAIAmpCAT
14  pcDNAII
15  pCEP4
16  pCEP4CAT
17  pCMV EBNA
18  pCR3
19  pCR3 Uni
20  pCRII
21  pEBVHIS LacZ
22  pEBVHisCAT
23  pHIL D2
24  pHIL S1
25  pLambdaPop6
26  pLEX
27  pMelBacB
28  pMEP4
29  pPIC9
30  pPIC9K
31  pRcCMV
32  pRcRSV
33  pREP10
34  pREP4
35  pREP4CAT
36  pREP7
37  pREP7CAT
38  pREP8
39  pREP8CAT
40  pREP9
41  pSE280
42  pSE380
43  pSE420
44  pSL301
45  pTrcHis
18. List of all Vectors Included With Gene Inspector

Suppliers listed for entries 84 to 117 (the vector names for these rows are missing from this excerpt), in order: Pharmacia, Boehringer, Sigma, Boehringer, Stratagene, Stratagene, Boehringer, Pharmacia, Promega, Promega, Promega, In Vitrogen, In Vitrogen, In Vitrogen, In Vitrogen, In Vitrogen, In Vitrogen, In Vitrogen, Pharmacia, In Vitrogen, In Vitrogen, U.S. Biochemicals, Pharmacia, Promega, Promega, NovaGen, NovaGen, NovaGen, NovaGen, NovaGen, NovaGen, NovaGen, NovaGen, NovaGen.

No.  Vector          Supplier
118  pCM7            Pharmacia
119  pCMV EBNA       In Vitrogen
120  pCMVbeta        Clontech
121  pCR3            In Vitrogen
122  pCR3 Uni        In Vitrogen
123  pCRII           In Vitrogen
124  pDIRECT         Clontech
125  pDR2            Clontech
126  pDR540          Pharmacia
127  pEBVHIS LacZ    In Vitrogen
128  pEBVHisCAT      In Vitrogen
129  pET 5b          Promega
130  pET 5c          Promega
131  pET 9a          Promega
132  pET 9b          Promega
133  pET 9c          Promega
134  pET11           NovaGen
135  pET11a          NovaGen
136  pET11b          NovaGen
137  pET11c          NovaGen
138  pET11d          NovaGen
139  pET12a          NovaGen
140  pET12b          NovaGen
141  pET12c          NovaGen
142  pET14b          NovaGen
143  pET15b          NovaGen
144  pET16b          NovaGen
145  pET17b          NovaGen
146  pET17xb         NovaGen
147  pET19b          NovaGen
148  pET20b          NovaGen
149  pET21           NovaGen
150  pET21a          NovaGen
151  pET21b          NovaGen
152  pET21c          NovaGen
153  pET21d          NovaGen
154  pET22b          NovaGen
155  pET23           NovaGen
156  pET23a          NovaGe
19. Choose File > Open, then select and open the nucleic acid sequence called pBR322 (it is in the DNA ƒ folder inside the GI Seqs ƒ folder). You will see the window shown in Figure 2.4.

Figure 2.4: Sequence Editor with a Single Sequence. The labeled parts of the window are the name column, the segment indicator, the ruler, the Overview, the sequence name, and the position. The window displays the pBR322 sequence in groups of 10 bases, 50 bases per line, with the position of the first base at the start of each line.
20. Click on the Aat enzyme name to select it. Hold down the shift key and click on the name Avr to select all the enzymes that are unique cutters. Now, while the digests are still highlighted, use the Format menu to change the color to green and the style to bold.

4. Choose Object > Edit Display Parameters again and now enter 100 into the "no more than" text field. This specifies that you want to see all enzymes that cut at least once but not more than 100 times. Press the OK button. Note that the color and style of the unique cutters are preserved, but all other digests are in their original color of red.

5. Choose Object > View as Table to see the digests shown in tabular form. Note that when displayed as a table, you can change the font attributes. The filtering of which enzymes to display (by allowable number of cuts) applies to both the table and graphic views. To switch back to the graphical view, choose Object > View as Graphic.

This concludes this tutorial. You may quit or continue on to the next tutorial. If you choose to continue, close all open windows now.

TUTORIAL 16: DISPLAYING FORMATTED SEQUENCE INFORMATION

Although the sequence editor is ideal for manipulating sequences and displaying alignments, there is often a need to display formatted sequences, including translations and restriction sites. This can b
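The display filter used in this tutorial (show enzymes that cut at least once but no more than some maximum) can be sketched in Python. This is an illustration, not the program's code: the function names are hypothetical, the molecule is treated as linear, and the number of cuts is taken to be the number of recognition sites found on one strand, which is adequate for the palindromic sites of the three example enzymes.

```python
# Recognition sequences for three common enzymes, used as example data.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def count_sites(dna, site):
    """Count occurrences of a recognition site, allowing overlaps."""
    dna, site = dna.upper(), site.upper()
    count, pos = 0, dna.find(site)
    while pos != -1:
        count += 1
        pos = dna.find(site, pos + 1)
    return count


def filter_by_cuts(dna, enzymes, at_least=1, no_more_than=1):
    """Keep enzymes whose site count lies between the two limits."""
    counts = {name: count_sites(dna, site) for name, site in enzymes.items()}
    return {name: n for name, n in counts.items() if at_least <= n <= no_more_than}


dna = "GAATTCAAGGATCCTTGAATTC"
print(filter_by_cuts(dna, ENZYMES))                    # unique cutters only
print(filter_by_cuts(dna, ENZYMES, no_more_than=100))  # every enzyme that cuts
```

With the defaults, only unique cutters are kept; raising `no_more_than` to 100 mirrors step 4 of the tutorial.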
21. Drag & Drop Option
  Show Clipboard
  Show/Hide Page Breaks

Windows Menu
  Stack Windows
  Current Window Names

Format Menu
  Fill
  Lines
  [illegible]
  Paragraph
  Style Sheets

Analysis Menu
  New Analysis
  Previous [illegible]
  Show/Hide Analysis Monitor
  Tables
  Add Another Analysis
  Remove Analysis
  Update Setup
  Add Setup To Menu
  Remove Setup From Menu
  Custom Analysis Setups

Notebook Menu
  Open For Editing
  Make Alias
  Find Originals
  Bookmarks
  [illegible]
  Arrangement
  Display
  Links
  Page Break
  Notebook Layout

Features Menu
  Mark Sites
  [illegible]
  Define Intron
  Undefine Intron
  Display
  Math
22. Figure 5.5 (The Tools Palette, page 5-7). The tools palette can be accessed directly from the Notebook menu. Any tool extensions you may have added will be available in the Tool Extensions menu (see Tool Extensions, page 5-17). The tool extension shows up in the bottom part of the palette when the name of the extension is selected in the popup menu.

Once a graphic object is drawn, its color, line thickness, and fill pattern can be set using the Format menu. As in all standard drawing programs, first select an object and then choose the operation to be done on the object using the Format menu; only appropriate menu items will be enabled.

Graphic objects that you draw can be framed using the Format > Frames menu (see Framing GI Notebook Objects, page 5-8). By framing a simple graphic object, you can produce such effects as a red rectangular frame around a blue rectangle filled with a pattern of lines. You can give the rectangle which is being framed a 0 width border (use Format > Lines > Pick Line Width), so that the frame will appear to be the border of the rectangle, because it is the only visible line around the object.

Multiple objects can be grouped and manipulated as one by using the Notebook > Arrangement > Group menu item. Once objects are grouped, they can be handled as a single object, but individual components of the grouped object cannot be edited individually. To edit the components of a group
23. Figure 2.41: Dot Matrix Setup Panel. The panel contains a Table popup, a Window size field, a Threshold value box with Add Threshold, Change Threshold, and Remove Threshold buttons, a Thresholds list, and a Dot size popup. Its help text reads: The Table popup menu defines the table to be used in the analysis. Window size is the number of adjacent residues to be used in the comparison. Thresholds are defined by typing values in the Threshold box and adding them to the Thresholds list using the Add Threshold button. Use the Color submenu in the Format menu to define a color for each selected threshold. Dot size refers to the size of the dot used to indicate the

sequence segments that will be compared; set this to 20. Select the PAM40 table in the Table popup. This table is a scoring table that indicates how similar two amino acids are to each other (see Dot Matrix, page 4-54, for more details). In the Threshold box, enter -40 and then press the Add Threshold button. Repeat this procedure to add thresholds of -20, 0, 20, and 40 by typing the number in the Threshold box and then pressing the Add Threshold button. These values appear in the Thresholds list at the right.

4. Set the Dot size to 2 x 2 using the small popup menu in the lower right. This sets the size of the dots to be drawn in the p
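The roles of Window size and Threshold in a dot matrix comparison can be illustrated with a short Python sketch. It is a generic illustration, not the Gene Inspector algorithm: the function name is hypothetical, and a simple identity score (+1 for a match, -1 for a mismatch) stands in for a real scoring table such as PAM40, whose values are not reproduced here.

```python
def dot_matrix(seq_a, seq_b, window, threshold, score=None):
    """Return (i, j, total) for every window pair scoring at or above threshold.

    i and j are the 1-based start positions of the compared windows in
    seq_a and seq_b. `score` maps a pair of residues to a number; the
    default identity score is an assumption made for this sketch.
    """
    if score is None:
        score = lambda a, b: 1 if a == b else -1
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            # Sum the pairwise scores across the two aligned windows.
            total = sum(score(seq_a[i + k], seq_b[j + k]) for k in range(window))
            if total >= threshold:
                dots.append((i + 1, j + 1, total))
    return dots


print(dot_matrix("ACGT", "ACGT", window=2, threshold=2))
```

Comparing a sequence with itself produces the main diagonal; lowering the threshold admits weaker similarities, and these are the hits that several thresholds with different colors would distinguish.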
24. analysis on the peptide to determine α-helical regions. Although the output for a helical wheel analysis on a whole sequence might look impressive, it will not convey much in the way of useful information. The diameters of the spots indicate the degree of hydrophobicity or hydrophilicity. Using the Object menu, you can choose to show or hide the legend, the guide circles, or the numbering of the amino acid positions. This particular example uses the Kyte and Doolittle values to indicate hydrophobic and hydrophilic amino acids. When you make up your own tables, the text you enter into the minimum and maximum fields will be displayed as the labels for this plot and others.

Hydropathy
Hydropathy analyses in general examine peptide sequences for regions of hydrophobic and hydrophilic residues using a sliding window approach (page 4-68). The Gene Inspector offers ten different types of hydropathy analyses, based on tables of values from various authors. As shown in Figure 4.59 (page 4-61), you can use a popup menu to define which table you would like to use. As each table is chosen, you will see a text description of the table in

[Figure residue: helical wheel output for Lamprey rhodopsin, and a hydropathy setup panel with Window Size, Median Sieving, Mesh Size, and Table (Kyte & Doolittle) controls.]
25. ... 6-53; Add Column(s) At Right, 6-54; Add Row(s) At Bottom, 6-54; Adjust Size To Contents, 6-54. Tips For Using The Gene Inspector: Using Extra Disk Space for Analyses, 7-1; Analyses That Take a Long Time, 7-3; Temporarily Pausing Long Running Analyses, 7-3; About GI Notebook Size, 7-4; Sharing Setups With Colleagues, 7-4; Printing and Viewing Large Objects, 7-5. Appendix: Tables, A-1; [illegible], A-1; Bull & Breese, A-1; Eisenberg et al., A-1; Emini et al., A-1; Engelman & Steitz, A-1; Engelman et al., A-2; Fauchere & Pliska, A-2; [illegible], A-2; GES, A-2; Hopp and Woods, A-2; [illegible], A-3; Kyte and Doolittle, A-3; Manavalan & Ponnuswamy, A-3; Parker et al., A-3; Sweet and Eisenberg, A-3; Thornton et al., A-4; [illegible], A-4; Welling et al., A-4; Wolfenden et al., A-4; Prosite Language Definitions, A-5; Protein Cleavage Sites, A-6; IUPAC Standard Nuc
26. ... not C
H    A or C or T    not G
V    A or C or G    not T
N    any base       aNy

IUPAC Standard Amino Acid Codes

Table 3: Amino Acid Codes
Amino Acid       1-Letter Code   3-Letter Code
alanine          A               ala
cysteine         C               cys
aspartic acid    D               asp
glutamic acid    E               glu
phenylalanine    F               phe
glycine          G               gly
histidine        H               his
isoleucine       I               ile
lysine           K               lys
leucine          L               leu
methionine       M               met
asparagine       N               asn
proline          P               pro
glutamine        Q               gln
arginine         R               arg
serine           S               ser
threonine        T               thr
valine           V               val
tryptophan       W               trp
tyrosine         Y               tyr

Vectors by Supplier
Amersham: M13mp10, M13tg130, M13tg131, pAT153, pUEX2
Boehringer: M13BM20, M13BM21, M13mp8, M13mp9, pBR322, pBR328, pBTac2, pEX2, pEX3, pHT3T7bm, pHT3T7bm, pSPT18, pSPT19, pSPTbm20, pSPTbm21, pUCbm20, pUCbm21, pXa1
BRL: pAMP1, pAMP10, pAMP18, pAMP19, pAMP2, pHC79, pHSV 106, pSP18, pSP19, pSP6 T3, pSP6 T7 19, pSPORT1, pSPORT2, pSV SPORT1, pT712, pT713, pT7T3 18, pT7T3 19, pT7T3alpha 19
Clontech: EMBL3 LeftArm, EMBL3 RightArm, EMBL3 S6 T7 LeftArm, EMBL3 S6 T7 RightArm, pAcUW31
27. translate, 3-17, 6-40; undefine intron, 3-17, 6-40. features object: see GI Notebook Features Object. figures [illegible], A-9. File Menu, 6-2: choose GI data folder, 6-8; close, 6-3; export, 6-6; import, 6-5; new, 6-2; open, 6-2; page setup, 6-7; print, 6-7; quit, 6-8; revert to saved, 6-4; save, 6-3; save a copy, 6-4; save as, 6-3. Fill (Format Menu), 6-16. Find/Replace (Edit Menu), 6-11. find inverted DNA repeats, 4-31. find nucleic acid sequence, 4-34. Find Original (Notebook Menu), 6-30. find repeats: nucleic acid, 4-34; protein, 4-55. find sequence, Prosite style, 4-56. Font (Format Menu), 6-17. Format Menu: color, 6-18; fill, 6-16; font, 6-17; frame, 6-18; lines, 6-16; numeric, 6-19; paragraph, 6
28. ... 5-16. features object: margins, 3-18; grouping, 3-18; marking sites, 3-17; multiple sequence features object, 3-19; show/hide site markers, 3-18; translate, 3-17; undefine intron, 3-17. features view: [illegible], 2-51; grouping, 2-52; marking sites, 2-52, 6-39; multiple sequences, 6-44; peptide numbering style, 2-52; translating DNA, 2-51; [illegible], 2-51, 2-54. frames, 5-8, 6-18. GI appendices: see appendices. GI notebook: aligning analysis objects, see aligning analysis objects; [illegible], 6-37; moving objects to appendices, 5-16; navigation, 1-7; notebook layout, 2-48; open for [illegible], 4-44, 5-14; overview, 1-6, 5-1; preferred size for objects, 5-8; reduce [illegible], 2-48; selection, 2-2; size, 7-4; style sheets, 5-3; target, 2-2; text flow, 2-36, 5-10; text [illegible], 5-10; tool extensions, 1-8, 5-17, 5-18, 6-28; tools, 6-27; [illegible], 2-62. GI notebook objects: aligning, 6-32, 6-49; analysis outputs, 5-15; frames, 4-11, 5-8; Get Info, 2-36; getting info, 5-11; grouping, 5-8, 6-32; [illegible], 4-9; preferred [illegible]
29. ... 5-8, 6-32; recalculating, 2-41, 4-9, 5-15; reformatting, 4-9; selection versus target, 2-2; show dependencies, 6-36; sidebar text, 5-11; tables, 5-12. Go To Position (Sequence Menu), 6-46. GOR protein structure prediction, 4-58. GRAIL analysis, 4-48. Grouping (Features Menu), 6-42. grouping objects, 5-8. H: helical wheel, 2-40, 4-59. hiding large amounts of data, 2-43. [illegible], 6-22. Hopp and Woods table, A-2. hotlinks: automatic, 6-34; [illegible], 6-34; purpose, 1-5; tutorial, 2-19, 2-23. hydration potential, 4-62. hydropathy analyses, 2-48, 4-60. hypertext links, 5-1. Import (File Menu), 6-5. importing sequences, 3-19. input sequence panel, 1-6. Insert Row/Column (Table Menu), 6-53. Insert Xs/Insert Ns (Sequence Menu), 6-45. inserting rows/columns in tables, 6-53. installing Gene Inspector, 1-2. interactive analyses, 2-55. [illegible], 4-31. IUPAC standard amino acid codes
30. 79. pLysS; 80. pOCUS1; 81. pOCUS2; 82. pOCUSlox; 83. pSCREEN1b; 84. pSHlox1; 85. pT7BlueR; 86. pTOPE1b.
Phage: 1. f1; 2. fd; 3. fd 478; 4. fd tet; 5. lambda; 6. M13.
Pharmacia: 1. M13mp19; 2. pBPV; 3. pCANTAB5; 4. pcDV1; 5. pCH110; 6. pCM7; 7. pDR540; 8. pExCell; 9. pEZZ18; 10. pGEX 1lambdaT; 11. pGEX 2TK; 12. pGEX 3X; 13. pGEX 4T1; 14. pGEX 4T2; 15. pGEX 4T3; 16. pGEX 5X1; 17. pGEX 5X2; 18. pGEX 5X3; 19. pKK223 3; 20. pKK232 8; 21. pKK233 2; 22. pMC1871; 23. pMDSG; 24. pMSG CAT; 25. pNEO; 26. pPL lambda; 27. pRIT2T; 28. pSL1180; 29. pSL1190; 30. pSVK3; 31. pSVL; 32. pT7T13 18D; 33. pT7T3 18U; 34. pT7T3 19U; 35. pT7T3alpha A18; 36. pTZ18R; 37. pTZ19R; 38. pUC18; 39. pUC19; 40. pUC4K; 41. pXa2; 42. pXa3.
Promega: 1. pADVAntage; 2. pALTCON; 3. pALTER 1; 4. pALTER Ex1; 5. pALTER Ex2; 6. pCAT C; 7. pCAT E; 8. pCAT P; 9. pCI; 10. pCI neo; 11. pET 5b; 12. pET 5c; 13. pET 9a; 14. pET 9b; 15. pET 9c; 16. pGEM 11Zf; 17. pGEM 11Zf; 18. pGEM 13Zf; 19. pGEM 15Zf; 20. pGEM 2; 21. pGEM 3; 22. pGEM 3Z; 23. pGEM 3Zf; 24. pGEM 3Zf; 25. pGEM 4; 26. pGEM 4Z; 27. pGEM 5Zf; 28. pGEM 5Zf; 29. pGEM 7Zf; 30. pGEM 7Zf; 31. pGEM 9Zf; 32. pGEM luc; 33. pGEM1; 34. pGEMEX 1; 35. pGEMEX 2; 36. pGL2 B; 37. pGL2 C; 38. pGL2 E; 39. pGL2 P; 40. pGL3 B; 41. pGL3 C; 42. pGL3 E; 43. pGL3 P; 44. PhiX 174; 45. PinPoint Xa 1; 46. PinPoint Xa 2; 47. PinPo
31. Consensus pattern: [ST]-x(2)-[DE]; S or T is the phosphorylation site. [Figure 4.65: Prosite Get Info About Selection.] match is indicated as the first amino acid of the query sequence, and the matching sequence is shown at each position. Notice that the query sequence is shown for each site using the Prosite language (see Prosite Language Definitions, page A-5, and Figure 4.52, page 4-56). [Figure 4.66: Prosite Tabular Output. Legible entries: AMIDATION, pattern x-G-[RK]-[RK], match DGRK; ASN_GLYCOSYLATION, pattern N-{P}-[ST]-{P}, matches NKSL, NRTT, NDSQ, NLSI, NVSA, NITI.] If you want to refer to PROSITE in a publication, you can do so by citing Bairoch A. and Bucher P., PROSITE: recent developments, Nucleic Acids Res. 22:3583 (1994). Protein Cleavage: Proteins can be cleaved by a variety of chemical and enzymatic treatments. This analysis displays either cut sites or recognition sites for treatments chosen from a list. The setup panel shown in Figure 4.67 is similar to the one ... [Figure residue: Analysis Setup panel with Cleavage List (PeptideCleavage) and the lists Available cutters and Sites to mark; legible entries include acid, chymotrypsin, clostripain, cyanogen bromide, and endopept ...]
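Marking cut sites for a chosen treatment, as the Protein Cleavage excerpt describes, can be sketched as follows. This is an illustration under stated assumptions, not the program's cleavage list: the names `CLEAVAGE_RULES`, `cut_sites`, and `fragments` are hypothetical, each rule is reduced to "cut on the C-terminal side of these residues", and the trypsin rule deliberately ignores the usual proline exception.

```python
# Assumption: each treatment is simplified to "cut after any of these residues".
CLEAVAGE_RULES = {
    "cyanogen bromide": "M",        # cleaves on the C-terminal side of Met
    "trypsin (simplified)": "KR",   # ignores the Lys/Arg-Pro exception
}


def cut_sites(peptide, residues):
    """Return 1-based positions after which the peptide is cut."""
    # The final residue is skipped: cutting after it produces no new fragment.
    return [i + 1 for i, aa in enumerate(peptide[:-1]) if aa in residues]


def fragments(peptide, residues):
    """Split the peptide at every cut site and return the pieces in order."""
    pieces, start = [], 0
    for pos in cut_sites(peptide, residues):
        pieces.append(peptide[start:pos])
        start = pos
    pieces.append(peptide[start:])
    return pieces


if __name__ == "__main__":
    rule = CLEAVAGE_RULES["cyanogen bromide"]
    print(cut_sites("AMGKMT", rule), fragments("AMGKMT", rule))
```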
32. Helical Wheel analysis from the Analysis Setup by clicking on the Helical Wheel icon in the list on the left and then choosing Analysis > Remove Analysis. In the Windows version of Gene Inspector, the Remove Analysis selection is accessible through the right mouse button menu. 6. Choose Analysis > Add Another Analysis. In the Windows version of Gene Inspector, the Add Another Analysis selection is accessible through the right mouse button menu. Add the Amino Acid Composition analysis and then add the analysis called pH/pI. Notice that the top of the pane now shows two analyses, as shown in Figure 2.32. [Figure 2.32 residue: Analysis Setup window (Analyses 2, Inputs 1, Outputs 2) showing the input sequence panel with the Chosen sequences list, the Add, Remove, and Remove All buttons, the Range controls (Entire sequence or Segment, From and To), and the Path of the selected sequence.] Figure caption: The Add and Remove buttons are used to define the Chosen sequences. Clicking on a sequence in the list displays its location (Path). Entire sequence will run all analyses using the entire length of the chosen sequence. The Segment button allows the analyses to operate on only a part of the entire sequence. A bullet (•) in the Chosen sequences list indicates that the segment is tru
33. Items shown in Figure 6.30. Gene Inspector will use the table to generate a DNA sequence from the protein sequence. The resulting sequence will have a codon bias that is appropriate for the organism specified in the translation table. • Alignment: The items in the Alignment submenu allow you to perform multiple sequence alignments directly from within a sequence editor document. This is an alternative to performing multiple sequence alignments as an analysis. The pros and cons of the two approaches are discussed in Multiple Sequence Alignments, page 3-10. Align All Sequences: The Align All Sequences option will align all the sequences in the currently active sequence window using the Clustal V algorithm. Refer to Multiple Sequence Alignments, page 3-10, for a detailed discussion. Unalign All Sequences: Unalign All Sequences will remove all the gaps and spaces from all of the sequences in the document. To remove spaces and gaps from a subset of the sequences in the document, use the Remove Gaps and Spaces command. Remove Gaps and Spaces: The Remove Gaps and Spaces command removes all gaps and spaces in a selected sequence. It differs from the Unalign All Sequences command, which removes gaps and spaces from every sequence in the document. • Consensus: The Consensus submenu determines which adornments will be displayed to score scoring row conse
34. [Figure 6.2: Opening a Document. The dialog lists folders and files, with a File name field, a Files of type popup set to Gene Inspector files, and a Cancel button.] scrollable area are folders and GI Notebook documents. The Peptide Sequences and Nucleic Acid Sequences check boxes work the same way. • Close: Close will close the active (frontmost) window. If changes have been made to the window, you will be given an opportunity to save the changes before the file is closed. • Save: Save will save any changes that have been made to the currently active document. The Save menu item will not be enabled unless you have made some changes to the open document. If a document has never been saved before, this option will behave identically to the Save As option discussed below. Saving a document will also reset the Revert to Saved condition (Revert to Saved, page 6-4). If you make changes to a document, you can always return to the most recently saved version. • Save As: Save As will allow you to save the current active document under a different name (see Figure 6.3, page 6-4). This operation will leave the original document, which has the old name, untouched and will create a new document corresponding to the current state of the window. After doing a Save As, the window will correspond to the newly created file on disk, and any changes will be made to this new file.
35. Selected In Output Object, two menu choices become available: Object > Extract DNA for Selected ORF and Object > Translate DNA for Selected ORF. As these names suggest, Gene Inspector can create a new DNA sequence window containing the DNA corresponding to the selected ORF, or it can create a new peptide sequence window containing the translated sequence from this ORF. This capability makes it easy to follow the logical path of examining the peptide after seeing the ORF, without going through multiple intermediate steps to generate the peptide sequence. The ORF indicators in the Codon Preference (page 4-23), GC Coding Prediction (page 4-37), and TestCode (page 4-44) analyses are also interactive in this same way and are discussed in Tutorial 17: Testcode, An Interactive Analysis (page 2-55). If there are several ORFs you would like to translate or extract DNA from, you can select additional ORFs by holding down the shift key and clicking on the ORFs you want to add (shift-clicking). With several ORFs selected, the extract and translate menu items will create documents containing multiple sequences. Restriction Enzyme Digest: This analysis will create restriction maps using enzymes chosen from a list. [Figure residue: Analysis Setup panel (Analyses 1, Inputs 1, Outputs 1) with Enzyme List set to all_enzymes_4 and the lists Available enzymes and Sites ...]
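The site search behind a restriction map can be sketched as a plain substring scan. This is a minimal illustration, not the program's algorithm: the function name `find_sites` and the two-enzyme dictionary are hypothetical stand-ins for an enzyme list, and only the top strand is searched (which is sufficient for palindromic sites such as these).

```python
# Recognition sequences for two common enzymes; a real enzyme list is far longer.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def find_sites(dna, enzymes):
    """Return {enzyme: [1-based start positions of each recognition site]}."""
    dna = dna.upper()
    hits = {}
    for name, site in enzymes.items():
        positions = []
        start = dna.find(site)
        while start != -1:
            positions.append(start + 1)
            # Advance by one so overlapping sites are also reported.
            start = dna.find(site, start + 1)
        hits[name] = positions
    return hits


if __name__ == "__main__":
    print(find_sites("AAGAATTCGGGGATCCTT", ENZYMES))
```

Positions refer to the first base of the recognition site, not to the cut position within it.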
36. [Selection vs Targeting] the left part of Figure 5.4 on page 5-6. You can resize the object using these handles in the same way you would within a drawing program: click and drag a handle to change the size of the object. Once an object is selected, it can be moved on a page simply by clicking inside the object with the mouse and dragging it. Many GI Notebook objects can also be edited. Double-clicking on an object makes it the target and causes a gray border to appear around the object, as shown in the right of Figure 5.4. (Footnote b: This is also discussed in Selection vs Target, page 2-1.) Once an object is targeted, you can edit component parts of the object. This might mean changing the font, color, or pattern of an axis label, but might also mean editing the text of a table, changing the color of a plot, changing the line thickness of a squiggles plot, or changing the parameters of an analysis object and recalculating the analysis. The difference between making an object a selection or a target is important. The terms are specific, and each allows you to perform a specific set of operations on the object. Drawing Tools: Graphics can be drawn directly in the GI Notebook using the drawing tools provided under Notebook > Tools. A palette of the Tools is shown in Figure 5.5. [Figure 5.5 residue: Tools palette, including the sidebar tool, selection tool, line/arrow tool, rectangle tool, ellipse tool, and round rectangle tool ...]
37. Sequence, page 4-34, for more details. This analysis can be run as a summary analysis. See page 4-37 for more details. Find Sequence (Prosite style): This powerful search routine uses the Prosite language to define a query sequence. Very sophisticated query strings can be constructed to design a very precise search. [Figure 4.52: Find Sequence (Prosite style) Setup. The panel's "About Prosite Style Find" text gives the syntax: [ ] inclusion: any single character within [ ] is acceptable; for example, G-[KA]-G would match G-K-G or G-A-G. { } exclusion: none of the characters within { } are acceptable; for example, G-{KA}-G would match any triple except G-K-G or G-A-G. X(3) or x(2,4): numbers in parentheses represent repetitions; for example, X(3) means X-X-X, and x(2,4) means X-X, X-X-X, or X-X-X-X. < or > means the sequence must be at the amino (<) or carboxy (>) end; for example, <A-S means A-S with A being the amino-terminal residue. X matches any amino acid. A hyphen (-) separates elements in a pattern. The Find box in the figure contains [RKDE](2,5).] Figure caption: Type the search sequence in the Search for box. Enter the search sequence using the language defined in this panel. Make sure to include a hyphen between each position in the search sequence. Prosite language specifications are also discussed in the manual. This analysis is not the same as searching the Prosite database for matche
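The Prosite-style syntax summarized in this excerpt maps closely onto regular expressions, so a search can be sketched by translating the pattern first. This is an illustrative translation under the syntax listed above, not the program's search routine; the names `prosite_to_regex` and `find_matches` are hypothetical.

```python
import re


def prosite_to_regex(pattern):
    """Translate a Prosite-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.split("-"):
        anchor_start = element.startswith("<")
        anchor_end = element.endswith(">")
        element = element.strip("<>")
        # Split off an optional repetition such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core in ("x", "X"):
            core = "."                       # any amino acid
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # exclusion set
        # Inclusion sets like [ST] and single letters are already valid regex.
        if low and high:
            core += "{%s,%s}" % (low, high)
        elif low:
            core += "{%s}" % low
        parts.append(("^" if anchor_start else "") + core + ("$" if anchor_end else ""))
    return "".join(parts)


def find_matches(pattern, sequence):
    """Return (1-based start, matched text) for each non-overlapping match."""
    regex = prosite_to_regex(pattern)
    return [(m.start() + 1, m.group()) for m in re.finditer(regex, sequence)]


if __name__ == "__main__":
    print(prosite_to_regex("N-{P}-[ST]-{P}"))
    print(find_matches("N-{P}-[ST]-{P}", "AANKSLGG"))
```

Note that `re.finditer` reports non-overlapping matches only; a scan that must report overlapping sites would need a lookahead or a manual loop.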
38. [Figure 2.20: Input Sequence Dialog.] be using in this analysis. For this analysis you will examine all of the acetylcholine receptors. You should see something resembling Figure 2.20. Click the Add button to bring up the sequence chooser dialog. 4. After pressing the Add button, you will see the dialog box shown in Figure 2.21 (page 2-29). This allows you to choose which sequences will be analyzed. Find the Peptide Sequences folder in your GI sequences folder and click once on the file acetylcholine recpts. This will place the names of all the sequences in this file into the list in the lower left of the dialog box. Now press the Add acetylcholine recpts >> button to add this file, which contains the 9 sequences, to the Chosen files and sequences list in the lower right of this window. After adding these sequences, press the Done button to return to the analysis setup panel. 5. Press Run to run the analyses. 6. You will see the summary result object appear in the notebook, as shown in Figure 2.22 (page 2-29). This object lists the 9 sequences examined and the number of matches found within each of them, and indicates on a linear map where the matches exist. This object is a summary of the search results for the query se
39. Sites menu item will mark restriction enzyme sites for the sequence in the features object if it is a DNA sequence, and will mark chemical or enzymatic cleavage sites if the sequence is a protein sequence. These operations are very similar to the analyses that mark sites: Restriction Enzyme Digest (page 4-42) for DNA sequences and Protein Cleavage (page 4-67) for protein sequences. In the features object, you will see the site names listed above the sequence. The first letter of the site name is directly over the first character of the recognition site. • Translate: The Translate menu item will ask you for a translation table to use and will then translate the selected nucleic acid sequence. This menu item is only available for nucleic acid sequences. • Define Intron: The Define Intron menu item will invert the colors of the selected nucleic acid sequence to indicate the presence of an intron and will cause the selected segment to be ignored when the nucleic acid is translated. By defining introns, it is possible to translate a nucleic acid sequence across the intron and keep the reading frame intact. This option is only available for nucleic acid sequences. • Undefine Intron(s): The Undefine Intron(s) menu item will remove the indication of any introns from the selected segment of nucleic acid. This does not remove any sequence from the display; it just changes the display so tha
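Translating across a defined intron, as described under Define Intron, amounts to dropping the intron segments before reading codons. The sketch below is an illustration, not the program's code: it assumes the standard genetic code rather than a user-chosen translation table, and the name `translate_skipping_introns` and the (start, end) interval convention are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_skipping_introns(dna, introns=()):
    """Translate dna in frame 1, ignoring 0-based half-open intron intervals."""
    exons, pos = [], 0
    for start, end in sorted(introns):
        exons.append(dna[pos:start])
        pos = end
    exons.append(dna[pos:])
    coding = "".join(exons).upper()
    # Read complete codons only; a trailing partial codon is ignored.
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, len(coding) - 2, 3))


if __name__ == "__main__":
    dna = "ATGGC" + "gtaag" + "TAAATAA"   # the lowercase segment is the intron
    print(translate_skipping_introns(dna, [(5, 10)]))
    print(translate_skipping_introns(dna))
```

Comparing the two calls shows how an undefined intron shifts the reading frame and changes the peptide.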
40. Special Paste: This menu option allows you to specify how information in the clipboard will be placed into the GI Notebook. Information can be placed into the notebook in three different formats: picture, text, and sequence. If you copy a sequence from the sequence editor and paste it into the notebook, it will create a Features Object, allowing you to annotate the sequence (see Creating a Features Object View of a Sequence, page 3-16). But what if you want to paste the actual sequence into the notebook background text? You can do this with Edit > Special Paste > Paste Text. The Paste Text menu item forces any information on the clipboard to be pasted into the GI Notebook as plain text. Some applications, like Textco BioSoftware's Gene Construction Kit, place both graphical and text information on the clipboard. In these cases, choosing Edit > Special Paste > Paste Text will paste in the sequence information, while choosing Edit > Special Paste > Paste Picture will paste in the actual graphic. • Clear: Clear functions the same as pressing the delete key. It will delete the current selection. • Select All: Select All selects all objects of the same kind as the current object. If the insertion point is in the background text of a GI Notebook, the entire background text will be selected. If the insertion point is in a sidebar text, then all text in that sidebar will be selected. If a notebook object is selected, then al
41. Style Sheets you have defined. Because of this, you should name your Style Sheets carefully so that you can recognize them easily later on. Finally, note that Style Sheets can be used to specify properties for background text. For example, you might define a Style Sheet called headline that is blue 18 point bold Helvetica, one called Figure that is red 12 point bold Times, and another one called main text that is black 12 point Times. These Style Sheets can also be applied to analysis output objects, where they will affect all the text in that object: applying main text to a sliding window analysis will change the title, axis labels, and axis numbering to 12 point Times, but will not alter the plot itself. Frames: Each object in the GI Notebook can have a frame. This is a rectangle (or rectangles) framing an object, which is used to distinguish the object from the surrounding background text. Frames consist of one, two, or three concentric rectangles separated from each other, and from the object they are framing, by a user-defined distance measured in pixels. The thickness, color, and pattern of each line can be set. As shown in Figure 5.6 (page 5-9), a drop shadow box can also be defined. As is the case for Style Sheets, Frames can be added to and removed from the Format menu by choosing items in the Format > Frame submenu (see Framing GI Notebook Objects, page 5-8). A style sheet f
42. Tutorial 1: Tour of a Gene Inspector Notebook, and Tutorial 2: Editing Sequences, this tutorial provides an overview of the three main parts of the Gene Inspector. 1. Analysis Setups are a key concept in the Gene Inspector. They are how you initiate an analysis, and they provide a way for you to create, and later return to, a specific analysis or set of analyses. Choose Analysis > New Analysis. This will bring up the Analysis Chooser shown in Figure 2.7. [Figure 2.7: The Analysis Chooser. Radio buttons select Nucleic Acid Analyses or Protein Analyses; the list on the left includes Signal sequence, Sliding window, Surface probability, Surrounding hydrophobicity, and Transmembrane helices; the "Information about selected analysis" area reads: Transmembrane helix analyses are designed to identify hydrophobic alpha helical or beta regions of proteins that are likely candidates to be membrane spanning domains. Several different tables of values can be used in the calculation.] At this point you could choose to do either a protein or nucleic acid analysis. For this tutorial, press the radio button at the top of the dialog to specify that you will be doing a Protein Analysis. The text area on the right of this window always provides information about the analysis that is selected in the list on the left side of the window, in this case Transmembrane Helices. Also note that the list can be displayed as a text list, as in
43. analysis and all low priority analyses in the queue to be run while you run a high priority analysis. After the high priority analysis is complete, the queue will resume with the next analysis in line. (Footnote c: Other programs often completely take over the CPU while they are doing their tasks. This other, modal approach gives the user no flexibility to perform other operations while the application is running.) This is another example of a way in which the Gene Inspector can conduct analyses and let you continue to work without waiting for the program to complete its current activity. About GI Notebook Size: The size of a GI Notebook is determined by the contents of the notebook. All background text and notebook objects contribute to the notebook size. Again, because of the extreme interactivity of the Gene Inspector, the program stores a great deal of information about how to rerun analyses, as well as information about the analysis parameters and hotlinked sequences and analyses. For most analyses, the extra information does not add much to the size. You can see how much disk space is required for a particular object by selecting the object and then choosing Notebook > Get Info (see Figure 5.9, page 5-11). For the dot matrix analysis, a large amount of disk space is used for storing all the similarity values at each position in the matrix. This information is needed to allow you to reset t
44. analysis. The table selected on the right will be used to fill in values in the newly created table; it serves as a starting point for editing. Pressing New will bring up a table like the one shown in Figure 4.10 (page 4-15). This standard translation table contains a mapping of codons to amino acids in the first two columns. When the cursor is moved over the Start or Stop column, it turns into a check mark, indicating that you can specify any codon to be either a start or stop codon by clicking in that table cell. Stop codons are indicated by a red dot in the stop column and the absence of an amino acid in the second column. Start codons are indicated by a green dot in the start column. Codon preference tables contain additional information, as shown in Figure 4.11 on page 4-16. The first two columns and the last two columns contain information just as for the translation tables. The weight and total columns contain information about the frequency of use for each codon and the total usage for all codons for a given amino acid. In the case shown, isoleucine codon usage is highlighted. Clicking in the weight column for ATA selects that cell but also selects all cells in the total column that correspond to the same amino acid. Thus ATA, ATC, and ATT are all selected because they all code for isoleucine. The weight value is the number of times that particular ... (Footnote f: Built-in tables are in the Standard Tables folder within the GI Data folder. Tables
45. [Figure 4.23: Discard Data Dialog. Legible dialog text: "... analysis if you want to reformat the output in any way. Size of data on disk: 12804323"; buttons: Cancel, OK.] the output object, all the data needs to be stored with the output object. If you decide that you don't need the stored data any longer and you are happy with your plot, you can choose Object > Discard Data and just save the picture of the analysis. You will see a dialog like the one shown in Figure 4.23. As described in the dialog, once you dispose of the data you will not be able to alter thresholds and the image will not be changeable, but you will still be able to recalculate the analysis. The dialog also shows you how much disk space you will save by discarding the data. [Figure 4.24: Selecting a Subrange in a Dot Matrix Plot.] The dot matrix window also lets you launch related analyses by using the mouse to select a subrange in the plot as a starting point for another analysis. As seen in Figure 4.24 on page 4-30, when the dot matrix plot is targeted, you can use the mouse to select a region on the plot by dragging over that region: place the cursor at the top left corner of the region, press the mouse button down, drag to the bottom right corner, and then release the mouse button. Once this region is selected, you can use the Object menu to either perform an alignment of the sequences in that selected region or
46. as a function of position along the length of the peptide. The setup panel is shown in Figure 4.68 on page 4-68. In this case, a table containing a value of 1 for each charged amino acid and 0 for all other amino acids is being used. Output is shown in Figure 4.69 on page 4-69. The analysis examines a number of adjacent amino acids and calculates a value for this window in the sequence. This value will be plotted. The window is then moved along the sequence by one character, and a new value is calculated and plotted. This is repeated until the end of the sequence is reached. In this particular case, the charged amino acids appear to be clustered in Lamprey rhodopsin. [Figure 4.69: Sliding Window Output. Plot titled Sliding Window, Charged Amino Acids, Lamprey rhodopsin; x-axis: amino acid position.] Side Chain Protrusion: This sliding window analysis is based on Thornton et al., EMBO J. 5(2):409 (1986). The values used are based on the protrusion of the alpha carbons from a protein's globular surface, using values from x-ray diffraction studies. This analysis is identical to running an Antigenicity analysis using the Thornton table. Surrounding Hydrophobicity: This sliding window analysis is based on data from Manavalan & Ponnuswamy, Nature 275:673 (1978). This data indicates the likelihood that any given amino acid will be surrounded by hydrophobic amino acids. Values in
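The sliding window procedure described in this excerpt (score a run of adjacent residues, move one position, repeat) can be sketched as follows. This is a minimal illustration under stated assumptions: the name `sliding_window` is hypothetical, the window value is taken to be the mean of the table values (the excerpt does not say whether a sum or a mean is plotted), and the charged set D, E, K, R is an assumption about what the 1/0 table contains.

```python
# Assumption: D, E, K, and R count as charged; every other residue scores 0.
CHARGED = {aa: 1 for aa in "DEKR"}


def sliding_window(sequence, table, window):
    """Return one averaged table value per window position along the sequence."""
    values = [table.get(aa, 0) for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # Each printed value corresponds to one plotted point.
    print(sliding_window("AKDEA", CHARGED, window=3))
```

Swapping in a different table (for example, hydropathy values) changes the analysis without changing the loop, which matches how the excerpt treats these analyses as one procedure driven by different tables.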
47. asked to enter a number for the length of the insert, after which the Gene Inspector will place the extra characters into your sequence at the site of the insertion point. The other method of generating sequences is to select the menu item Sequence > Generate Random. Again, you will be asked to provide a length for the insert, after which the Gene Inspector will generate a random sequence at the location of your insertion point. Chapter 4: Analyses. The Gene Inspector provides a large number of sequence analysis options. Running analyses requires you to select one or more sequences to be analyzed and one or more analyses to be performed on those sequences. This process is carried out using Analysis Setup Windows. The result of each analysis is placed into the GI Notebook as an analysis output object. The format of an output object can be altered after the analysis is run, and the analysis output object can be used as a starting point for recalculating the analysis or launching additional analyses. This chapter explains the mechanics of how to run and edit analyses and provides some information about the algorithms used (what the analysis is actually doing). The descriptions accompanying these discussions should also help you interpret the analysis output. Starting an Analysis: New analyses are started by choosing
48. be defined for matching, mismatching, inserting a gap, and extending a gap. This routine is best used for aligning two sequences that are known to be closely related, like two globins. Penalties for end gaps are optional, and a Z score can be calculated to determine how significant the alignment is. See the nucleic acid section Align 2 Sequences: Global (page 4-16) for more information about Z scores, end gaps, and other parameters. One difference between the protein and nucleic acid global alignments is the existence of several well-defined scoring tables for protein alignments that are based upon evolutionary models. These tables are called PAM and BLOSUM tables. The PAM tables were derived originally from the data of Dayhoff et al. in Atlas of Protein Sequence and Structure, vol. 5, supplement 3, ed. M. O. Dayhoff, NBRF, Washington, D.C., 1978, p. 345. A number of sets of closely related proteins were examined to determine the frequency of change of one amino acid into another during evolution. Based on the number of changes observed in these closely related proteins, a procedure was developed to estimate how many changes could be observed over larger evolutionary distances, where successive changes might occur at the same site in a protein sequence. A PAM1 table is a table of probabilities that one amino acid will be converted to another amino acid, given a single change per 100 amino acids i
49. called Conditional Text. It is available under Format > Style > Conditional Text. Conditional text, which can be shown or hidden, is embedded in the normal notebook background text. It can be used for keeping notes that you might not necessarily want to print out but would like to keep available for viewing on screen. You can also use it for detailing extra information about posters or slides you are making. (Figure 5-2: Notebook Display Preferences Dialog.) For example, you might have designed a diagram with some text to use in making a slide, but would also like to keep some extra comments about the figure for storage in the notebook or for printing as lecture notes. You can create the extra notes and then define them as conditional text. When you want to print your slide, hide the conditional text. When you want to view or print your notes, show the conditional text. Conditional text might also be useful for discussions that you want to record but which might not be germane to the section of the notebook in which the comments reside. You might also think of conditional text as Post-It notes that can be hidden.

Style Sheets

Once you have formatted a bit of text in a way that you might like to use again at a later time font siz
50. can also be saved as tool extensions (e.g., a microfuge tube, a small image of your face, etc.). If you often create tables of the same type, it pays to store a template of the table as a tool extension. Thus, if you run an 18-lane gel repeatedly, you might store a table with eighteen rows and one column as a tool extension. Each time you run a new gel, just use the tool extension to place an empty table into the GI Notebook and fill in the current list of samples. You are not limited to graphic objects and tables for use as tool extensions, however, because any analysis output object can also be used as a tool extension. Analysis output object tool extensions can be placed back into the GI Notebook and used to run analyses. This is an alternative way to store analysis setups, in addition to being able to add them to the Analysis menu as analysis setups (see Chapter 4).

Chapter 6 Menu Items

This chapter details all of the menu choices available within the Gene Inspector. Some of the menus are only available when specific conditions are met (e.g., an object is targeted, so the Object menu appears).

File Menu

The File menu deals with creating, opening, printing, and saving documents. The Windows and Mac versions a
51. choose a table to use for scoring. For nucleic acids, an identity table makes the most sense: it will give a score of 1 for each match and 0 for each mismatch. This is one of the standard tables. Once the table is chosen, a range of possible scores is indicated at the top and bottom of the Color Range indicator. In Figure 4-21 (page 4-27), the range is from 0 (none of the 20 nucleotides match) to 20 (all twenty match). If a different scoring table or window size were chosen, the range indicator would display different values. To specify how the plot is drawn, you need to define thresholds for plotting. Thresholds are scoring values above which a specific color is used. This is a two-step process: first define the threshold values, and then define the colors for each threshold you have defined. Threshold values are entered by typing a value (this can be no greater than the maximum score) in the threshold box and pressing the Add Threshold button. Add all the threshold values you would like to use as cutoffs for displaying the matching data (this is discussed in Tutorial 18: Dot Matrix Analysis, Another Interactive Analysis, page 2-58). Now click on one of the values in the Thresholds list and use the Format > Color menu to define a color for it. As colors are defined for each threshold, they are indicated in the Color Range Indicator (thermometer) on the right of the panel. The last item to be defined is the dot size. This popup menu spec
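The window scoring and threshold coloring described in this passage can be sketched in a few lines. This is an illustration only, not the program's code: the function names are hypothetical, the identity table is hard-coded (1 per match, 0 per mismatch), and a score is assumed to take a threshold's color when it is at or above that threshold, which the manual does not state precisely.

```python
def window_scores(seq_a, seq_b, window=20):
    """Yield (i, j, score) for every pair of window start positions.

    The score is the number of identical characters in the two windows,
    so it ranges from 0 to `window`, as in the Color Range indicator.
    """
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            score = sum(1 for k in range(window) if seq_a[i + k] == seq_b[j + k])
            yield i, j, score


def color_for(score, thresholds):
    """Return the color of the highest threshold the score reaches, else None.

    `thresholds` maps a threshold value to a color name (assumed structure).
    """
    chosen = None
    for value in sorted(thresholds):
        if score >= value:
            chosen = thresholds[value]
    return chosen
```

A point would then be drawn at (i, j) only when `color_for` returns a color, so raising the lowest threshold removes weak matches from the plot.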
52. clear as possible, all menu selections are indicated as hierarchical choices using a menu font, such as Edit > Select All. This particular case means to locate the Edit menu and then choose Select All under the Edit menu.

About the Tutorials

There are three major parts of Gene Inspector: the GI Notebook, the sequence editor, and the analysis setups. These three parts are dealt with individually in the first three tutorials. If you do not have time to do any other tutorials, you should at least complete the first three, which illustrate these components of Gene Inspector. Tutorial 4: Hotlinking Analysis Results is also important in demonstrating how the analysis results in your notebook are alive and connected to the original sequences used for the analysis. Other tutorials help explain different capabilities of Gene Inspector and demonstrate ways in which the program might be of special use to you.

Selection vs Target

The difference between choosing an object as a selection or a target within the Gene Inspector is important. The terms are specific, and each will allow you to perform a different set of specific functions on an object. These terms are used throughout the manual and are important for you to know. The two choices are shown in Figure 2-1 (Selection vs Target, showing a selected object and a targeted object). Clicking once on a GI Notebook object makes it the selection
53. curve is below the red line, it does not mean that it cannot code for a protein. This analysis does not have any built-in way of handling ambiguous characters (e.g., Y, R, N), so you are given an opportunity to specify what to do when an ambiguous character is encountered in the sequence being analyzed; this can be done in the top part of the setup panel (Figure 4-42, page 4-45). In Figure 4-44, a few ambiguous characters were introduced into the Hsp70 sequence, and the analysis was rerun by substituting A for each ambiguous character. The positions of the ambiguous characters are indicated as tick marks in the plot. Notice that the curve dips down at about 2400 and that the ORF in reading frame 3 is broken up as the result of an ambiguous character at about 2500. The ORF indicator at the bottom of the output object behaves in the same way as the ORF indicator for the ORF analysis (page 4-39), so you can extract DNA and generate corresponding peptide sequences directly from the ORF arrows.

BLAST Search

The BLAST analysis is based on Altschul et al., J. Mol. Biol. 215(3):403, 1990. You can compare your query sequence to the universe of other sequences and ask if there are any other sequences related to yours. The BLAST analysis is designed for speed, and the results are returned with a well-defined statistical interpretation. The BLAST server is located at <http://www.ncbi.nlm.nih.gov/BLAST/>. The query po
54. data is to choose Notebook > Open for Editing (also see page 6-29). This will open up the output object in a separate window that can be scrolled and manipulated as if it were a separate document window. There is often a need to filter the amount of information in the digest output to display only a subset of all the enzymes. This is done using Object > Edit Display Parameters. The dialog is shown in Figure 4-41 (Restriction Digest Edit Display Parameters, page 4-44). Using the check boxes at the top of the window, you can choose to show any combination of enzymes that cut to leave 3' overhangs, 5' overhangs, or blunt ends. The bottom part of the box allows you to define how frequently an enzyme must cut to be displayed. In this instance, the parameters are set to show enzymes that produce at least 1 but not more than 5 cuts.

TestCode

This analysis is based on that of Fickett, Nuc. Acids Res. 10(17):5303, 1982. The algorithm, called TestCode, takes advantage of the fact that codons for the same amino acid (synonymous codons) are used with different frequencies in coding regions of DNA. This leads to an asymmetry in the distribution of nucleotides in every third position along DNA containing a coding sequence compared to the distribution
55. display preferences 2-36; dot matrix 4-54 (thresholds 4-28, window size 4-27); Drag & Drop Options (Edit Menu) 6-13; drag and drop sequence editing 3-6; drawing tools 5-7.
E: edit display parameters 4-44; Edit Menu 6-10 (copy 6-9, cut 6-9, drag & drop options 6-13, find [...] 6-11, [illegible] 6-9, select [...] 6-10, show clipboard 6-14, [illegible] 6-11, show/hide page breaks 6-14, special paste 6-10, undo 6-9); editing sequences tutorial 2-9, 2-13; Eisenberg et al. table A-1; Emini et al. table A-1; Engelman & Steitz table A-1; Engelman et al. table A-2; Export (File Menu) 6-6; extending a selection 2-20; [illegible] 3-6; extracting DNA from a selected ORF 4-41.
F: Fauchere & Pliska table A-2; [illegible] 3-17 (adjust size to contents 6-43, [illegible] 3-17, 6-40, display 3-18, 6-40, [illegible] 6-42, mark sites 3-17, 6-39)
56. display within the Gene Inspector when you choose Page Setup. The characteristics are used to set page borders, text margins, and other Gene Inspector features based on printer characteristics. If you are having problems printing, make sure you use the Page Setup option to help the program understand the characteristics of the printer that is being used.

Print
This is the standard dialog put up by the operating system, which allows you to print the current document.

Print Notebook and Appendices
In the Gene Inspector, it is possible to have parts of a GI Notebook contained in an appendix. When you print the notebook using the standard Print option, the appendix objects are not printed. To print appendices, you need to choose Print Notebook and Appendices. This provides you with the opportunity to print both the appendices and the notebook itself.

Choose GI Data Folder (Windows only)
The GI Data folder contains all the lists, tables, style sheets, and other information that the program needs during its operation. By default, GI looks for the GI Data folder that resides in the same folder as the application. However, you might want to access your own GI Data folder while running the program from a different computer. This menu option allows you to choose the GI Data folder to be used. If you quit the program and start it again, it will use the GI Data folder that was in effect the last time the program was
57. dragging with the mouse, just as you would in a word processing program. However, note that, unlike a word processor, as you extend your selection by dragging the mouse vertically, only the sequence you clicked in initially is selected; none of the other interleaved sequences are selected. A number of different operations can be performed on the selected residues (nucleotides or amino acids). If a segment is selected and you start an analysis, the selected range of characters in the selected sequence is used as the default sequence for the input panel in the analysis setup. This enables you to be working on a sequence, select a range of characters, and then conveniently analyze that range of characters. You can also create a detailed view of the sequence for displaying restriction sites and translations in a Feature Object (see page 3-16).

Manipulating A Sequence

The Sequence > Manipulate submenu contains operations which can be performed on selected sequence segments. For nucleic acid sequences, the choices are Invert and Translate. Invert will take the current DNA strand and flip it over to show the complementary DNA strand in the 5' to 3' direction; for example, the sequence ACCCGT when inverted will become ACGGGT. The strand will be inverted in place and therefore replace the current selection. This allows you to perform manipulations like inverting an insert in a vector. To invert the sequence and also maintain the origin
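The Invert operation described above corresponds to what is usually called a reverse complement. A minimal sketch follows, assuming only the four unambiguous DNA bases (the manual does not say how ambiguity codes are handled); the names `COMPLEMENT` and `invert` are illustrative and not part of Gene Inspector.

```python
# Translation table for the four standard bases, preserving letter case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def invert(sequence):
    """Return the complementary strand written 5' to 3'."""
    # Complement each base, then reverse so the result reads 5' to 3'.
    return sequence.translate(COMPLEMENT)[::-1]


print(invert("ACCCGT"))  # ACGGGT, the example given in the text
```

Applying `invert` twice returns the original sequence, which is why inverting an insert in place can always be undone by inverting it again.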
58. editor is also the window for the results of multiple sequence alignments. Multiple sequence alignments can be created as an analysis using the normal Analysis Setup windows, or they can be created directly within a sequence editor document itself. This is the subject of the tutorial Multiple Sequence Alignments, page 2-24. To align all the sequences in the current sequence editor window, choose Sequence > Alignment > Align All Sequences. This will bring up the dialog box shown in Figure 3-10 (page 3-10) for protein alignment. (Figure 3-10: Multiple Sequence Align Setup. The dialog shows the scoring table, Standard BLOSUM62; Step One, Pairwise Grouping: k-tuple word size 1, maximum gap length 5, gap penalty [illegible], number of top diagonals to use 5; Step Two, Multiple Sequence Alignment: gap creation 10, gap extension 10.) In this case, there are a number of parameters you can enter. The multiple alignment algorithm used by the Gene Inspector is called Clustal V (Higgins, D. G., A. J. Bleasby, and R. Fuchs, Comp. Appl. Biol. Sci. 8(2):189, 1992). The code for doing the alignment was a generous gift from Dr. Des Higgins at EMBL. To do a complete multiple alignment, there is a need to know which sequences are most similar to each other. This is done progressively by calculating a crude guide tree. The guide tree is then used as a guide to align
59. have a different length.

Center
Centers the text horizontally for each line.

Right Justify
Aligns the right end of each line of text. The left ends can be uneven, as each line of text will have a different length.

Full Justify
Adjusts text display so that both the left and right ends of text lines are aligned. This is accomplished by adding pixels between letters and between words.

Single Spacing
Sets the vertical spacing between lines of text to be equal to the height defined for the tallest font displayed in that line(s).

1 1/2 Spacing
Sets the vertical spacing between lines of text to be equal to 1.5 times the height defined for the tallest font displayed in that line(s).

Double Spacing
Sets the vertical spacing between lines of text to be equal to 2 times the height defined for the tallest font displayed in that line(s).

Other Line Spacing
Other Line Spacing allows you to set the vertical line spacing.

Style Sheets
Style sheets are discussed in detail in Tutorial 10: Creating and Using Style Sheets, page 2-38. This menu can be customized to contain any style sheet you create.

Add Style Sheet To Menu
Add Style Sheet To Menu will add the style information from the currently selected object to the Style Sheets menu. The name you provide will be used to identify the style as a menu choice.

Remove Style Sheet From Menu
If you have cus
60. in the notebook. See Tutorial 19: Using Bookmarks in the GI Notebook, page 2-62.

Attach Bookmark
When an object is selected, Attach Bookmark will attach a bookmark to the selected object. You will be asked to name the bookmark, which will then be appended to this menu as a custom bookmark.

Remove Bookmarks
If you have added any bookmarks to the menu, this option will allow you to remove them.

Custom Bookmarks
After the Attach and Remove options will be a list of all the bookmarks you have created in the currently active notebook. Selecting one of these items will bring you to the bookmark location in the notebook.

Text Flow
Since each object in the GI Notebook can be placed anywhere on the page, it is important to be able to define how text should flow around the object. This is the function of the Text Flow menu. This has been discussed in Text Flow Around Objects, page 5-10.

Flow Through
Flow Through text does not recognize that an object is present and overwrites the whole width of the text column, completely running through the object.

Both Sides
Both Sides text jumps across the object and is placed on both the right side and the left side of the object.

Left Side
Left Side text will only be placed to the left side of the object.

Right Side
Right Side text will only be placed to the right side of the object.

Widest
61. in the sequence editing pane. The segment indicator does not have to be rectangular and accurately indicates different ranges for different sequences when appropriate (see Figure 3-1, page 3-1). Scrolling the sequence editing pane will result in the overview pane being updated automatically to match the displayed range. The overview pane can also be used to navigate within a sequence document. Clicking on an arrow in the overview pane will do two things. First, it will move the segment indicator to include the point that was just clicked and will scroll the editing pane to the same location. Second, it will select the sequence that was clicked in both the overview and editing panes. This can be useful if you have a large number of sequences. The overview pane therefore provides a graphical overview of sequences in the document in addition to being a navigation tool for moving around the document.

The Editing Pane

There are three areas in the editing pane: the name column, the position column, and the sequence. In addition, there is a ruler to indicate the position of characters in the sequence (other features are available for displaying multiple aligned sequences; see Multiple Sequence Alignments, page 3-10). Clicking on the name of a sequence will select the entire sequence and will allow you to perform whole-sequence manipulations like copying and pasting an entire sequence to another sequence window. You can also get i
62. interface between the user and the entire set of analyses. The Sequence Editor provides a means to enter and edit sequences and to make them available for analyses. Analyses are defined using Analysis Setups, and the results of the analyses are placed into a GI Notebook. The following sections in this chapter examine these components briefly. The Sequence Editor (Chapter 3), Analyses (Chapter 4), and the GI Notebook (Chapter 5) are discussed in more detail in their own chapters.

Sequence Editor

The sequence editor provides a window that can be used to hold one or more sequences. Either nucleic acid or protein sequences are allowed, but the two types of sequences cannot reside in the same sequence document. Each sequence window corresponds to a single file and may contain a single sequence or a collection of sequences. Sequences that reside in sequence editor documents are the starting point for virtually all analyses, as well as the generation of formatted sequence displays in the GI Notebook (see Creating a Features Object View of a Sequence, page 3-16). The Sequence Editor can contain multiple sequences which you might want to store grouped together. For example, you might have a file of globin sequences containing globins from a number of different organisms or different globins from the same organism. You might also choose to group all of your vector sequences together in a single Se
63. like that shown in Figure 2-19. (Figure 2-19: Find Sequence Dialog. The help text in the dialog reads: Enter sequence in Segment box, define maximum number of mismatches and gap before next segment, and press Add Segment button. The Edit Find Menu button allows queries to be saved, edited, or removed. Insert Segment and Remove Segment buttons change the query. Mismatches only occur in lower case characters. Upper case characters must match exactly.) Notice that there is a Show Summary Results checkbox (circled in the figure). You should click in this box to turn it on. Type aaaa into the Sequence field as the sequence to search for, and then click on the <empty segment 1> text in the list box above to enter that sequence. For the number of allowable mismatches, enter a 1. There is no need to enter anything in the other fields. For more information about the Find Sequence analysis, see Find Sequence, page 4-34.

3. Now click on the Input Sequences icon to choose the sequences you will
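The matching rule stated in the Find Sequence dialog (mismatches are only allowed at lower case query characters, upper case characters must match exactly) can be sketched as a simple scan. This is a hedged illustration of a single segment only; the function name is hypothetical, gaps between segments are not modeled, and the comparison is assumed to ignore the case of the target sequence.

```python
def find_segment(target, query, max_mismatches):
    """Return 0-based start positions where `query` matches `target`.

    Lower case query characters may mismatch, up to `max_mismatches` in
    total; upper case query characters must match exactly.
    """
    hits = []
    target_upper = target.upper()
    for start in range(len(target) - len(query) + 1):
        mismatches = 0
        for offset, q in enumerate(query):
            if target_upper[start + offset] != q.upper():
                if q.isupper():
                    # An exact-match position failed: reject this start.
                    mismatches = max_mismatches + 1
                    break
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits
```

With the tutorial's query `aaaa` and one allowed mismatch, `find_segment("ccaaatcc", "aaaa", 1)` reports positions 1 and 2; writing the query as `AAAa` restricts the mismatch to the last position and leaves only position 2.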
64. locations throughout the manual you will be asked to select items in menus. To make your choices as clear as possible, all menu selections are indicated as hierarchical choices such as Edit > Select All. This particular case means to locate the Edit menu and then choose Select All under the Edit menu. Throughout the manual, figures alternate between Windows and Macintosh images.

Installing the Gene Inspector

The initial Gene Inspector installation requires about 14 megabytes of disk space. This includes all the files and databases needed to run GI and carry out the tutorials. All the files are stored in a folder on your Gene Inspector CD and need to be installed on your hard disk from this CD. We have tried to make the installation of Gene Inspector as simple as possible.

On the Macintosh
1. Insert the Gene Inspector CD and locate the Gene Inspector folder.
2. Drag this folder to your hard disk. Note that it is important to drag the entire folder from the CD to ensure that the application will run properly. Dragging just the application from the CD to your hard disk will not work. If you have a previous version of Gene Inspector and have files that you would like to keep with the application, you can place them into the new Gene Inspector folder once you have dragged that folder from the CD.
3. With the CD still in the computer, start up the Gene Inspector application you just installe
65. new notebook or added to any currently open notebook, all of which are listed in the popup menu.
6. Before you run the analyses, note that at the top of the Analysis Setup you have selected two sequences (inputs) and one analysis; therefore you will have two output objects. The information at the top of the window always lets you know how many output objects you will be generating. The High Priority option is discussed elsewhere (see Analyses That Take a Long Time, page 7-3, and the text around Figure 4-3, page 4-4); leave it unchecked for now.
7. To start the analysis running and close the Analysis Setup Window, choose the Run button (the Close button would just close the panel without starting the analysis).
8. Once the analyses have been launched, they will appear in the Analysis Monitor. You can see the Analysis Monitor by choosing Analysis > Show Analysis Monitor. The Analysis Monitor shows the status of each analysis being run and will also show the order in which analyses are to be run. The Analysis Monitor will indicate the progress of each analysis as it is run. Most analyses will run so quickly that, unless you already have the Analysis Monitor open when the analysis starts, you will not be able to see the analysis listed in the Analysis Monitor because it will be done before the window opens.
9. Each completed analysis becomes an object in the GI Notebook. Click once o
66. notebook, they can be used as a very convenient way to navigate through a great deal of information. You might even choose to create a bookmark called current and place it where you are currently entering notes. This bookmark can then be moved at the end of each day and still be accessible through the bookmark menu, where it will transport you to the last location you were working.
8. In addition to the standard styles you find in most word processors, like Bold, Italic, etc., the Gene Inspector has additional styles, including one called Conditional Text (see the Format > Style menu). This text can be shown or hidden. Choose Notebook > Display > Show Conditional Text. Previously hidden text is now displayed. In this particular notebook, we set the conditional text to have green, bold, and strikethru attributes. Conditional text can be hidden or shown whenever you want to do so by choosing the appropriate menu items. Leave the conditional text showing for the remainder of this tutorial.
9. The Gene Inspector also allows you to define your own styles and add them to the Format menu. Style Sheets can be used to define the appearance of text as well as the appearance of analysis objects in the GI Notebook. Note that the word Objectives at the start of the background text has its own unique style, which suggests that it is a section title. You can duplicate this style for any other text in the notebook because a Section Title st
67. nucleic acid alignments. Sequence > Consensus > Show/Hide Scoring Row will show or hide an additional row at the top of the current sequences. Instead of showing actual sequence data, the scoring row will show a histogram of how good the match is between the consensus and the contributing sequences at each location along the consensus sequence. Finally, Sequence > Consensus > Show/Hide Shading will highlight characters in the aligned sequences that match the consensus character. This is shown in Figure 3-11. The intensity of the shading is directly proportional to the fraction of characters at that position which match the consensus character. If all the aligned sequences have the same character at a given position, the highlight

[Figure: sequence editor window "Rhodopsins matched" showing a SCORE row and a CONSENSUS row above the aligned sequences bacteriorhodopsin, Halobacterium archa[...], Lamprey rhodopsin, Octopus rhodopsin, and Xenopus rhodopsin.]
68. object was created, the symbol will change from the plain circle to a red and yellow exclamation point, as shown in Figure 6-21. You can choose to update the object as described in Perform Auto Recalc Now, page 6-35. You can define an object as one that is automatically updated (hot linked) by selecting the object and then choosing this menu item.

Manual
Even if no automatic linking is turned on, the original sequence is still connected to the output object. You can manually recalculate the analysis by choosing Object > Recalculate. No symbols are visible in the output object as they are in the Automatic linked object. You can define an object as one that is manually updated by selecting the object and then choosing this menu item.

Perform Auto Recalc Now
Perform Auto Recalc Now will update all of the hot linked notebook output objects that need to be updated. You will get a list of all the analyses in the notebook that need to be updated (Figure 6-22: Autorecalc Dialog). Pressing Recalculate Now will start the updating. Each analysis will be placed into the Analysis Monitor and will be run in turn.

Recalc Selected Items
Recalc Selected Items provides a shortcut to selecting each output object
69. or sensitivity, but will dramatically change the number of gaps introduced.

number of top diagonals to use
The number of best diagonals in the imaginary dot matrix plot that are considered. Decrease (but not less than zero) to increase speed of analysis; increase to improve sensitivity.

Step two: multiple sequence alignment

gap creation
Reduce this to encourage gaps of all sizes; increase it to discourage them. Terminal gaps are penalized the same as all others. Beware of making the gap creation penalty too small (05); if the penalty is too small, the program may prefer to align each sequence opposite one long gap.

gap extension
Reduce this to encourage longer gaps; increase it to shorten them. As for gap creation, terminal gaps are penalized the same as all others, and the same warning applies about making this value too small.

Enhancing Aligned Sequence Displays

Once a multiple sequence alignment has been created, the display can be enhanced in a number of different ways. These items are available under the Sequence > Consensus menu. Sequence > Consensus > Show/Hide Consensus Row will show or hide an additional sequence row at the top of the current sequences. This row will show the character that is present more than any other character at that position in the aligned sequences. If no character is more frequent than any other, an x is placed in protein alignments or an n is placed in
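The consensus rule described above (the most frequent character at each position, with x or n when no single character wins) can be written compactly. This is a sketch under stated assumptions rather than the program's code: the function name is hypothetical, all sequences are assumed to be aligned to the same length, and gap characters are ignored when counting, which the manual does not specify.

```python
from collections import Counter


def consensus(aligned, tie_char="x"):
    """Build a consensus row from equal-length aligned sequences.

    Use tie_char="x" for proteins and tie_char="n" for nucleic acids.
    Gap characters ('-') are skipped when counting (an assumption).
    """
    row = []
    for column in zip(*aligned):
        counts = Counter(c for c in column if c != "-")
        top_two = counts.most_common(2)
        tied = len(top_two) == 2 and top_two[0][1] == top_two[1][1]
        if not top_two or tied:
            row.append(tie_char)  # no character is more frequent than any other
        else:
            row.append(top_two[0][0])
    return "".join(row)
```

For example, `consensus(["MNGT", "MNGA", "MVGC"])` gives `MNGx`: the first three columns have a clear winner, while the last column has three different characters and so receives the tie character.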
70. returned from the BLAST server, they will open in your web browser. This concludes the tutorials.

Chapter 3 The GI Sequence Editor

Introduction to the Sequence Editor

A sequence editor window is shown in Figure 3-1. (Figure 3-1: The Sequence Editor, with callouts for the segment indicator, ruler, editing pane, sequences, sequence name, and position of first residue.) The editor is the part of the Gene Inspector where sequences can be displayed and edited. The sequence editor has been designed to make editing one or multiple sequences as simple as possible. The overview pane (top of window) shows a graphical view of all the sequences in the context of the whole document window, while the editing pane (bottom of window) allows you to do sequence manipulation and editing. Tools are available for confirming sequence entries and reassigning keys to facilitate easy sequence editing.

The Overview Pane

The overview area displays all the sequences in the document and indicates their relative lengths. The scale of the overview pane is based on the length of the longest sequence in the document, which will span the entire width of the pane. All other sequences are drawn as a proportion of that length. The segment indicator is an area in the overview pane surrounded by a dotted line. This area indicates the segments of each sequence that are currently visible
71. right side of sheet for even numbered sheets. You can set the size of the paper binding using Notebook Layout, page 6-37. This dialog is shown in Figure 5-3, page 5-4.

Set Display Preferences
Set Display Preferences can be used to specify which adornments are displayed in the notebook and what colors they will have. Page breaks, text margins, printable area, and paper binding can be adjusted.

Appendices
Appendices are separate windows containing information that might normally be found in a GI Notebook. In fact, all appendices start their lives as notebook objects and get moved to an appendix using choices in this menu. Appendices are discussed in Tutorial 12: Appendices, Hiding Large Amounts of Data, page 2-43, and in Appendix Objects, page 5-16.

Move Object To Appendices
Move Object To Appendices will take the selected object and move it to an appendix window of its own. You will be given an opportunity to create an alias in the notebook that can point to the appendix. The named appendix will also be added to the Appendices menu.

Discard Appendices
Discard Appendices will give you a way to dispose of appendices you no longer need. You will be presented with a list of appendices in the current notebook from which to choose.

Return Appendix To Notebook
Return Appendix To Notebook will bring an appendix object back into the GI Notebook and rem
72. so can only accept the standard TEXT and picture information. Copy: Copy transfers a copy of the current selection to the clipboard. Once on the clipboard, it can be pasted elsewhere, with the same caveats as mentioned in the previous section (Cut). Paste: Places what is on the clipboard into a Gene Inspector document at the location of the insertion point. If no insertion point is available and a GI Notebook is the frontmost (active) window, the clipboard information is placed in the center of the visible area. The GI Notebook can accept text or picture objects from other applications. Text from the clipboard will be placed into the notebook at the location of the insertion point in the background text. If no insertion point exists (e.g., an object is selected) when you paste in the text, it will be placed into its own sidebar located in the center of the visible area. Picture objects will be placed in the notebook as their own objects. Pasting sequence information from other programs into the Gene Inspector's sequence editor is handled in a special way. The Gene Inspector removes any control characters (line feeds, carriage returns, etc.) and any non-nucleotide characters (for nucleic acid files) or non-amino-acid characters (for peptide files) before pasting any information from the clipboard into a sequence document. You can check the clipboard contents by choosing Edit > Show Clipboard (page 6-14).
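The clean-up applied to pasted sequence text can be illustrated with a short sketch. The manual does not list which characters the Gene Inspector treats as valid, so the two alphabets below (IUPAC nucleotide codes, and the 20 amino acids plus B, Z, and X) are assumptions, as is the function name clean_pasted_sequence.

```python
# Assumed alphabets; the program's actual accepted characters are not documented here.
NUCLEOTIDE_CODES = set("ACGTURYSWKMBDHVN")
AMINO_ACID_CODES = set("ACDEFGHIKLMNPQRSTVWYBZX")


def clean_pasted_sequence(text, kind="nucleotide"):
    """Drop control characters, digits, spaces, and any other invalid characters."""
    if kind == "nucleotide":
        allowed = NUCLEOTIDE_CODES
    elif kind == "peptide":
        allowed = AMINO_ACID_CODES
    else:
        raise ValueError("kind must be 'nucleotide' or 'peptide'")
    # Keep the original letter case; only membership is tested in upper case.
    return "".join(ch for ch in text if ch.upper() in allowed)


if __name__ == "__main__":
    print(clean_pasted_sequence("1 ATG C\r\nGT-A 10\n"))
    print(clean_pasted_sequence("MK*T 12\tLV", kind="peptide"))
```

Filtering by an allowed set, rather than listing characters to remove, handles line numbers, spacing, and line endings from other programs in one pass.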
73. standard formats: GCG, EMBL, FASTA, and GenBank. Export: Choosing Export provides you with two choices of what to export. You can export the entire document or just the selected part(s) if you are in a sequence document. Export Selected Items: If you have a notebook open, this option will be disabled. In a sequence document, if you have one or more sequences selected, then this option is available. The export options possible here are described in the next section. [Figure 6-6: Export Sequence Dialog] Export Entire Document: Exporting a GI Notebook will export all the background text into a text file. This file can be opened by any word processor. You cannot export graphic objects or analyses; you should use the clipboard to do that. The export dialog is shown in Figure 6-6. When sequences are exported, each sequence will be exported in its own file, which is named with the exported sequence name. All exported sequence files will be placed into a single folder. The popup menu is used to define the format for the sequence output files. Choose one of these formats to define how the sequence will be formatted when it is exported. Page Setup: This is the standard dialog put up by the operating system. The characteristics of the printer are determined and used to format the
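The one-file-per-sequence export behaviour can be mimicked outside the program as follows. This is a minimal sketch, assuming FASTA output with 60 residues per line; the function name export_sequences, the .fasta extension, and the line width are illustrative choices, not Gene Inspector settings.

```python
import tempfile
from pathlib import Path


def export_sequences(sequences, folder, line_width=60):
    """Write each sequence to its own FASTA file, named after the sequence.

    sequences -- dict mapping sequence name to residue string
    folder    -- single destination folder for all exported files
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    written = []
    for name, residues in sequences.items():
        # Assumes the name contains no characters that are illegal in file names.
        path = folder / f"{name}.fasta"
        lines = [f">{name}"]
        lines += [residues[i:i + line_width] for i in range(0, len(residues), line_width)]
        path.write_text("\n".join(lines) + "\n")
        written.append(path)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for p in export_sequences({"Dros hsp22": "ATGC" * 20}, tmp):
            print(p.name)
            print(p.read_text())
```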
74. text ruler, which appears at the top of each notebook window. The ruler also contains tab and justification icons. Show/Hide Invisibles: Invisibles are characters that indicate tabs, returns, spaces, page breaks, etc. Normally these characters are not visible, but you can make them visible by choosing Show Invisibles. Show/Hide Conditional Text: Conditional Text is text that can be hidden. This is discussed in Conditional Text (page 5-2). Show/Hide Page Breaks: Page breaks are lines drawn in the GI Notebook indicating the location of the edges of printer pages. Because Gene Inspector can create sheets that are larger than one printer page, it is important to be able to identify the location of printer page breaks within a notebook sheet. Show/Hide Text Margins: Text margins indicate the borders on the notebook sheet that will contain the background text. Show/Hide Print Area: The print area is the area on the notebook sheet that will be printed on the currently selected printer. This area is determined through the information obtained through the Page Setup menu option. Show/Hide Paper Binding: If you will be printing the GI Notebook for binding in a hardcover book or in a looseleaf, you might want to introduce a paper binding (sometimes called a gutter). Paper binding is an extra area added to the side of the sheet nearest the binding edge: left side of sheet for odd-numbered sheets
75. the entire Analysis Setup, with all the parameters you have defined, by choosing that item from the Analysis menu. This provides a convenient way to save entire suites of analyses with the parameters you want to use for these particular analyses. Other users in your lab group can then access this standard analysis set and just put in their own DNA or protein sequences for analysis. Analysis Setups are stored on your hard disk and can be shared with other users of the Gene Inspector. Analysis Setups can be removed from the Analysis menu by using Analysis > Remove Setup From c They can be found in a folder called Analysis Setups within the GI Data folder. Modifying Output Objects: Once an analysis is run, it creates an analysis output object in the GI Notebook. The analysis output object can be modified and used to recalculate an analysis. This is discussed in detail in Analysis Output Objects (page 5-15). Object > Reformat and Object > Recalculate are menu options that are shared by all analysis output objects. These menu options are available when the object is targeted by double-clicking on it (see Selection vs. Target, page 2-1). Reformat allows editing of the axis ranges, tick marks, divisions, labels, and object title. Recalculate actually allows you to recalculate the analysis while keeping the GI Notebook as the active document. You may change parameters for
76. the two sequences. Thus, aligning two sequences of 200 nucleotides each will take four times as long as aligning two sequences of 100 each. Although there are no limitations in the Gene Inspector code to perform very long alignments, you might need additional disk space and additional patience. By using disk space to contain temporary data, Gene Inspector can perform alignments on very long sequences that other programs cannot align. The trade-off is that you need additional disk space to hold the temporary data (see Using Extra Disk Space for Analyses, page 7-1). However, even though the z-score calculations may take a long time to complete, like all analyses you perform in Gene Inspector, alignments will run in background, so you can continue to work even while the alignment is being computed. See Analyses That Take a Long Time (page 7-3) for some helpful hints. The output from a global alignment is shown in Figure 4-13 (page 4-19). Align 2 sequences (global): Dros hsp22 & Dros hsp23. First sequence: Dros hsp22. Second sequence: Dros hsp23. Scoring table: Nucleotide identity. Gap insertion penalty: 2.50. Gap extension penalty: 0.30. Unaligned ends treated as gaps. Traceback: Upper path. Score: 275.60. Mean: 212.26. Standard deviation: 10.65. z-score: 5.95. 0 of 100 trial alignments scored greater than this one. 1 ATG TACCGATGTTI TSG GCATGGCCGACG 37 1 ATG TICCGITGTTGT GsGC TIGCcGA G 37 38 AGATGGCACGGATGCCA CGCCTC
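The numbers in the sample output can be checked by hand. The sketch below assumes the usual definition of a z-score (observed score minus the mean of the trial scores, divided by their standard deviation) and assumes that running time grows with the product of the two sequence lengths, which is what the "four times as long" statement implies. The function names are illustrative.

```python
def z_score(score, trial_mean, trial_sd):
    """Standard z-score of an alignment score against the trial alignments."""
    if trial_sd == 0:
        raise ValueError("standard deviation of the trial scores must be non-zero")
    return (score - trial_mean) / trial_sd


def relative_alignment_cost(len_a, len_b, ref_a, ref_b):
    """Ratio of work for aligning (len_a, len_b) versus (ref_a, ref_b),
    assuming cost proportional to the product of the two lengths."""
    return (len_a * len_b) / (ref_a * ref_b)


if __name__ == "__main__":
    # Values taken from the sample global alignment output.
    print(round(z_score(275.60, 212.26, 10.65), 2))
    # Two 200-nucleotide sequences versus two 100-nucleotide sequences.
    print(relative_alignment_cost(200, 200, 100, 100))
```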
77. to define and launch another analysis. With this approach, you can explore your results in an intuitive and flexible way. A number of analyses allow you to use the displayed data as the starting point for other analyses. [Figure 2-44: Dot Matrix Alignment Setup. Panel settings shown: Match scoring (Use scoring table, Table: PAM40, or Use identity table with match and mismatch scores); Gap penalty: 2.50 for creation, 0.30 for extension; Treat unaligned ends as gaps; Traceback: Upper path or Lower path; Run z-score using 100 trials. Panel help text: Match Scoring determines how scoring will be calculated (identity table matches identical residues only). Gap Scoring defines a penalty for inserting a gap (creation penalty) and for extending the gap one residue (extension penalty). Choosing Upper path or Lower path (Traceback path) will usually give alternative alignments having the same score. Z-scores indicate significance of alignments but take extra time to calculate.] This concludes this tutorial. You may quit or continue on to the next tutorial. If you choose to continue, close all open windows now. Using Bookmarks in the GI Notebook: TUTORIAL 19 USI
78. to mark. [Figure 4-38: Restriction Enzyme Digest Setup Panel. Panel help text: The Enzyme List popup menu presents all available lists; selecting a list places all the enzymes from that list into the Available Enzymes list. Select enzymes you wish to use from the left list and move them to the right list of Sites to Mark. Sites can be marked at either the actual cut site (Mark cut sites) or the beginning of the recognition site (Mark recognition sites).] The setup panel is shown in Figure 4-38 on page 4-42. Using the Enzyme List popup menu, you can specify the list you wish to work with. A comprehensive set of lists of enzymes is provided with the Gene Inspector, containing all commercially available enzymes in several lists. The enzyme list in this figure is for commercially available enzymes which recognize 4-nucleotide sequences (commonly called four-cutters). For enzymes to be used in the search, they must be moved from the left list (Available enzymes) to the right list (Sites to mark) in this setup panel. This can be done either by double-clicking on the enzyme name on the left to move it or by selecting one or more names from the li
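The difference between marking the recognition site and marking the cut site can be shown with a small sketch. The enzyme table below is a hand-made illustration with three well-known enzymes (AluI AG^CT, HaeIII GG^CC, EcoRI G^AATTC), not one of the lists shipped with the Gene Inspector; the function name mark_sites, the 1-based position convention, and the top-strand-only search are assumptions.

```python
# Recognition sequence and number of bases before the top-strand cut.
ENZYMES = {
    "AluI": ("AGCT", 2),     # AG^CT
    "HaeIII": ("GGCC", 2),   # GG^CC
    "EcoRI": ("GAATTC", 1),  # G^AATTC
}


def mark_sites(sequence, enzyme_names, mark="cut"):
    """Return 1-based marked positions for each enzyme.

    mark="recognition" -- position of the first base of the recognition site
    mark="cut"         -- position of the last base before the cut
    """
    if mark not in ("cut", "recognition"):
        raise ValueError("mark must be 'cut' or 'recognition'")
    seq = sequence.upper()
    marks = {}
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start + 1 if mark == "recognition" else start + offset)
            # Step by one base so overlapping sites are also found.
            start = seq.find(site, start + 1)
        marks[name] = positions
    return marks


if __name__ == "__main__":
    dna = "TTAGCTGGCCAAGCT"
    print(mark_sites(dna, ["AluI", "HaeIII"], mark="recognition"))
    print(mark_sites(dna, ["AluI", "HaeIII"], mark="cut"))
```

The sketch handles only unambiguous recognition sequences; enzymes whose sites contain ambiguity codes would need pattern matching instead of a plain substring search.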
79. together with Tutorial 2: Editing Sequences (page 2-9) and Tutorial 3: Using Analysis Setups (page 2-14), will serve as an introduction to the program. 1. Double-click on the Gene Inspector to start the program. You will see a new, empty notebook window called Untitled. This empty notebook can be used to hold any new analyses you do. We will not use it right now, but will use a previously created notebook. 2. Choose File > Open, which will allow you to open a Gene Inspector file. The checkboxes and other details of the dialog box will be discussed in other tutorials. For now, choose the notebook file in the Gene Inspector folder which is named GI Notebook Tour and press the Open button to open the document. It might take a bit of time for the notebook to open as it sets up all the bookmarks and other navigation tools used in this particular notebook. You will see Figure 2-2. 3. This special notebook is designed to demonstrate the kinds of uses you might have for the GI Notebook in your own research. The GI Notebook is basically a word processor with many special features designed to facilitate its use in research. Background text (starting with the word OBJECTIVES in this case) can be entered and edited just as in a standard word processing program. Using the Format menu in combination with the items in the GI Notebook's ruler allows you to do many of the standard manipulations you expect in a word processor. The No
80. way to identify regions of similarity. It is the best way to start your comparisons between sequences. The input panel is shown in Figure 4-21. It is a panel with many options, so each component will be discussed individually. The basic algorithm is a sliding window comparison between two sequences. If a window of 10 is chosen for the analysis, nucleotides 1-10 of sequence 1 [Footnote: If the object is not targeted, you can hold down the option key and click the mouse on the ORF to select it. The option key can be used in combination with the mouse to select parts of an object.] [Figure 4-21 panel settings shown: Window size 20; Table: Nucleotide identity; a Thresholds list with color ranges; Dot size. Panel help text: The Table popup menu defines the table to be used in the analysis. Window size is the number of adjacent residues to be used in the comparison. Thresholds are defined by typing values in the Threshold box and adding them to the Thresholds list using the Add Threshold button. Use the Color submenu in the Format menu to define a color for each selected threshold. Dot size refers to the size of the dot used to indicate the match in the plot.] Figure 4-21
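A sliding window comparison of this kind can be sketched as follows. The sketch assumes an identity table (one point per identical residue), records a dot when the window score is at least the threshold, and reports the 1-based start of each window; how the Gene Inspector positions its dots and treats ties is not stated in this excerpt, so those details are assumptions, as is the function name dot_matrix.

```python
def dot_matrix(seq1, seq2, window=10, threshold=8):
    """Compare every window of seq1 with every window of seq2.

    Returns (position in seq1, position in seq2, score) for each pair of
    windows whose identity score reaches the threshold. Positions are the
    1-based starts of the windows.
    """
    dots = []
    for i in range(len(seq1) - window + 1):
        for j in range(len(seq2) - window + 1):
            # Identity scoring: count positions where the residues match.
            score = sum(a == b for a, b in zip(seq1[i:i + window], seq2[j:j + window]))
            if score >= threshold:
                dots.append((i + 1, j + 1, score))
    return dots


if __name__ == "__main__":
    for dot in dot_matrix("ACGTACGT", "ACGTACGT", window=4, threshold=4):
        print(dot)
```

Comparing a sequence with itself gives the main diagonal plus off-diagonal dots wherever a repeat occurs, which is why the plot is a convenient first look at similarity.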
81. you create will be in the User Tables folder in the GI Data folder. Any table you create can be moved into the Standard Tables folder, where it will become uneditable. To edit a standard table, make a copy of it and move it into the User Tables folder. [Figure 4-10: Editing a Translation Table (Neurospora crassa)] codon was used in the sample set of genes used to compile the table. So ATA was used 975 times, ATC was used 3013 times, and ATT was used 1985 times to specify isoleucine. As shown in the total column, there were 5973 occurrences of isoleucine codons in the data set. For isoleucine codons in Drosophila melanogaster, ATC is used 3013/5973 = 50.4% of the time. As you type in new numbers for weights, the Gene Inspector automatically adjusts the totals to reflect the new numbers. You can update the tables supplied with Gene Inspector as more comprehensive data becomes available for each organism, or you can create totally new tables for organisms that are not yet well characterized but will be in the future. [Figure 4-11: Editing a CodonPreference Table (Drosophila melanogaster)] Finally, you may wish to create tables for highly expressed gene products vs. infrequently expressed gene products. These often can have different codon preferences
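The relationship between codon weights, the total column, and the percentage usage can be reproduced with the isoleucine counts quoted in the text. The function name codon_fractions is illustrative; only the three counts come from the manual.

```python
def codon_fractions(weights):
    """Return the total count and the fraction contributed by each codon."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one codon weight must be non-zero")
    return total, {codon: count / total for codon, count in weights.items()}


if __name__ == "__main__":
    # Isoleucine codon counts quoted for the Drosophila melanogaster table.
    isoleucine = {"ATA": 975, "ATC": 3013, "ATT": 1985}
    total, fractions = codon_fractions(isoleucine)
    print("total:", total)
    for codon, fraction in fractions.items():
        print(codon, f"{fraction * 100:.1f}%")
```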
82. your search. Matrix is currently not used in nucleic acid BLAST searches. The Number of hits to keep determines how many matches are returned to you. Once you start the analysis, a new object is placed into your notebook. When results are returned, your web browser will automatically open to display those results. The object in your notebook can be used to launch another BLAST search with the same parameters in the future. Protein Analyses: The analyses discussed in this section all deal with proteins. One aspect that many of the analyses share is the ability to view the output using median sieving instead of the more common sliding window mean. This powerful alternate calculation method is discussed in Median Sieving (Data Sieving) (page 4-11). Accessible Surface Area: This analysis is based on values in Janin (Nature 277:491, 1979), which determined the surface accessibility of amino acids. The ratios of buried to accessible values in the paper (Table 1, column 4) were converted to the fraction accessible for this analysis. This analysis is identical to the Antigenicity analysis using the Janin table. The setup panel is shown in Figure 4-46 (page 4-49). This is a typical sliding window analysis (Sliding Window, page 4-68) that presents a property of the peptide as a function of position along the peptide sequence. The output is shown in Figure 4-47. This plot indicates that segments around 170 and 220 are not very accessible, whil
83. [Figure 2-24: Output Objects Before Alignment] the top analysis in Figure 2-24. 3. Select all of the analysis objects by clicking on one of them once and then choosing Edit > Select All. This will select all GI Notebook objects (two in this case). Choose Notebook > Arrangement > Align Objects. You will see the alignment dialog shown in Figure 2-25 (page 2-32). [Figure 2-25: Notebook Object Alignment. Dialog options: Make widths the same (as widest object or as narrowest object); Make heights the same (as tallest object or as shortest object).] The ability to adjust sizes and align objects as shown in the figure is very useful for making all analyses have the same width, for example, so that the X axes align and graph results can be compared. Set the items in the window to match what is shown in Figure 2-25. This will cause the objects to line up along their left edges and to be as wide as the widest object and as short as the shortest object. Press the OK button. 4. After alignment, the analysis objects will look like Figure 2-26 (page 2-33). Note that the objects are all aligned on the left as defined in the object alignment dialog. Since we did not specify any vertical alignment, the tops of each object remain the same as they were before the align
84. [Figure 4-43: TestCode Output] The setup panel is shown in Figure 4-42 (page 4-45). The default window size is 200 nucleotides, which is the value recommended by Fickett. Using shorter windows will give more localized results and may reflect local biases in the base composition, not something you are likely to want to do in order to find ORFs. The ORF settings and Display options are discussed in CodonPreference (page 4-23). The output is shown in Figure 4-43. Parts of the curve that are above the upper (green) line have a 95% chance of actually coding for a protein. Parts of the curve below the lower (red) line have only a 5% likelihood of actually being a coding region. In between the two lines, it is safest to assume that the region is not a protein coding region based on the TestCode output alone. To evaluate how likely a region is to be a coding region, you need to look at ORFs and rare codon plots in the lower part of the output object. [Figure 4-44: TestCode Output With Ambiguous Characters (TestCode plot for Dros hsp70; x-axis: Nucleotide)] The plotted TestCode values are statistical attributes, so you must be careful in your interpretation. Just because a value is above the green line does not mean that the particular region is definitely a coding region, and conversely, if the
85. 19 GE E EE 6 17 EE 6 17 Style SOC gege ER e nN E eee alec ade E ee DS 6 20 Format Sequence Sequence MENU ccccncncnnanananananananananannnano non no nono conoce non 6 51 Format Sequences Sequences Menu cnncccccccnnnnanonnnonononononannnno non conocio no noo 6 44 Fraga table iia e adds A 2 raMes nadaa tada Erico diia caia 4 11 5 8 6 18 Frames Format Menu 2 cccccccccccscsssesesceeecececcsseseeceeeeeuecceseedenseesecsecneneess 6 18 G Kat coding prediction eiie i acess Hie ede ed eves a coe 4 37 Gene Inspector installing aia tar aia ada Sate Ma eee 1 2 three main Pants eegene attr core dl ege e Dee gece 1 4 dee Lu Lu e WEE 1 3 Generate Random Sequence Menu cccccccccnccnnnnnnnononinnninnninicininininoninononons 6 46 GES LEE A 2 Get Info Notebook Menu ccccccconnnonccncncnnononananannancnc conocio nan an cnn nana canoa 6 27 Index 6 Gl Data tele E 7 1 Gl notebook aliases iii add 1 7 2 44 aligning objects Oovervlew 5 10 analysis ue 5 15 elen 5 16 attach bookmark iii a a ade aa ig et ee ae ea 5 5 background text tutorial o coococcnccnonanocooonncnnnonanoononononnnnnnancncnononnnns 2 36 2 37 background text justification ENEE EEN 6 19 Do0KMAarkS omita ET 1 7 2 62 5 5 conditional Xt tia ini 5 2 display preferences muii tii decida 2 36 5 1 drawing iO ae ee AGS Ae ee a et ae 5 7 features object dere eesti end Gal ae eee E Gate 3 16 GETING MUON ia RA RR 3 17 description ista a
86. 2 eee eee eee 5 8 Text Flow Around Objects 5 10 Aligning Objectives fiers Cato bed dae EE ads 5 10 Getting Information About Objects ooo ooooomoooo 5 11 Text Objects Sidebar Test 5 12 Table Objects User Tables 5 12 Open for Editing s s aere REN dE ee SE NEE ee 5 14 Analysis Output Objects 5 15 Features ObjectS 0 cee eee eens 5 16 Appendix Objects tacita an aE E E EEE EE e EE E E A 5 16 TOOVEXtENSIONS weree ipa cele a a a a 5 17 TABLE OF CONTENTS Uses for Tool Extensions 5 18 Menu Items Elle Men orar Ee eebe Ses dts ee See Sha Se 6 2 NEW Mls ee Mlk tee ds AE se tags 6 2 A aarne a a a S A a E NR a 6 2 COBE ee ed A Ee RASA E EE i 6 3 e UE 6 3 SV AG aa oot Reged ET dE d e ed tees wet 6 3 Savea COPY EN dE NEEN ined iwi bas tae dea 6 4 Revert to Saved 6 4 Import see ete tte Head EEN REN ee dE 6 5 EXPO tege o Ee ed 6 6 Page SetUD iii a ee EE eed 6 7 A roa Dee ageet alge ition SE 6 7 Print Notebook and Appendices 6 8 Choose Gl Data Folder Windows only eee eee 6 8 Set Alias Resolution Rules Mac only 00ee eee eee 6 8 Quit Mac Exit Windows ansaan anneren 6 8 EdIEMENU EES NEE AE e e 6 9 UNO esa ee os ee es ee ae is 6 9 CUE EE 6 9 KOENEN ai EE MOI 6 9 PAStei feat ted att daw amp ele noe 6 9 Special Paste via pa da eae AE RES 6 10 ET 6 10 SelectAll erte Me o Ae 6 10 SNOW SelECuOM scarico Sege cad ddaw ie Alt 6 11 Find amp Replace cece eee E 6 11
87. 3 The sequence editor window is designed for manipulating sequences, while the Features object is designed for displaying sequences. Importing Sequences: Choosing File > Import > Import DNA Sequences will bring up a dialog like the one shown in Figure 3-16. [Figure 3-16: Import DNA Sequence. Dialog fields: Look in, File name, Files of type, Interpret as, Append sequences to an open document or Put imported sequences into a new document, Apparent file format, Sequences in file.] In this case, the figure shows importing a GCG sequence. Using the Interpret As popup menu, you can define what kinds of documents will be displayed in the file list. Clicking Text Files in the Files of type popup will list all files of type TEXT. TEXT files can be created by many applications, including all word processors and other applications that deal with sequences, so choosing this means that the Gene Inspector will have to take a guess at the format of a specific file. This is done using either our own code or using ReadSeq code written by Don Gilbert at Indiana University and available through FTP at ftp.bio.indiana.edu (thanks, Don). Don has made thi
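Guessing the format of a plain text sequence file usually relies on a few characteristic markers. The sketch below is not the Gene Inspector's code or ReadSeq's; it is a simplified illustration that checks well-known signatures of four formats (FASTA header lines begin with ">", GenBank entries begin with LOCUS, EMBL entries begin with an ID line, and GCG files have a header line containing "Check:" and ending in ".."). The function name guess_format is hypothetical.

```python
def guess_format(text):
    """Make a rough guess at the sequence file format from its contents."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "unknown"
    first = lines[0]
    if first.startswith(">"):
        # Note: PIR/NBRF entries also begin with ">" and would be reported as FASTA.
        return "FASTA"
    if first.startswith("LOCUS"):
        return "GenBank"
    if first.startswith("ID "):
        return "EMBL"
    # GCG files may begin with free text; the divider line identifies them.
    if any("Check:" in line and line.rstrip().endswith("..") for line in lines):
        return "GCG"
    return "unknown"


if __name__ == "__main__":
    print(guess_format(">seq1\nATGC\n"))
    print(guess_format("my notes\n  seq1  Length: 4  Check: 1234  ..\n1 ATGC\n"))
```

A guess like this can be wrong, which is why an explicit Interpret As choice is useful when the format is already known.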
88. 40 Remove Analysis: Remove Analysis complements the Add Another Analysis option discussed in the previous section. Remove Analysis is only available when an Analysis Setup Window is active and a specific analysis is selected in the window. In the Windows version of Gene Inspector, the Remove Analysis selection is accessible through the right mouse button menu. Selecting Remove Analysis will remove the selected analysis from the active Analysis Setup Window. If no analysis is selected, this menu item will be disabled. Update Setup: If you have opened an Analysis Setup from the Analysis menu (see Add Setup To Menu below) and have made some changes to the Setup, you can update the saved Analysis Setup Window by using this menu option. In the Windows version of Gene Inspector, this selection is accessible through the right mouse button menu. The current set of parameters and sequences will replace the ones that were stored with the Setup when it was selected from the Analysis menu. Add Setup To Menu: This menu option is enabled whenever you have an Analysis Setup window as the active window. In the Windows version of Gene Inspector, this selection is accessible through the right mouse button menu. Choosing Add Setup To Menu will ask you for a name for the current setup and then will add it to the Analysis menu as a custom setup (Custom Analysis Setups, below). The entire setup win
89. 41 pSEAP Enhancer List of all Vectors Included With Gene Inspector nun un un Pharmacia Clontech New England Biochemical Biochemical Biochemical Biochemical Biol S LS LS S abs Pharmacia Clontech NovaGen NovaGen NovaGen In Vitrogen In Vitrogen harmacia lontech lontech Vitroge Vitroge Vitroge Vitroge Vitroge Vitroge Vitroge Vitroge Vitroge Vitroge harmacia U PP vd vd d d vi vd d vi PP vd vd d d vi vd d vi Cratagene Cratagene Cratagene Cratagene Cratagene Cratagene Cratagene tratagene ovaGen n Vitrogen n Vitrogen n Vitrogen Clontech Clontech Clontech HHH SD mo oo mo Do DD D HHH HHH HHH HHH OO Page A 29 List of all Vectors Included With Gene Inspector Clontech NovaGen Promega Pharmacia Pharmacia In Vitrogen BRL BRL BRL BRL Promega Promega Promega Promega Promega Promega Promega Promega Promega BRL BRL Boehringer Boehringer Boehringer Boehringer IBI Promega BRL Clontech Clontech Pharmacia Pharmacia Stratagene Clontech Stratagene Appendix 342 pSEAP Promoter 343 pSHlox1 344 pSI 345 pSL1180 346 pSL1190 347 pSL301 348 pSP18 349 pSP19 350 pSP6 T3 35L pSB6 T1 1 9 352 pSP64 353 pSP64 polyA 354 pSP65 355 pSP70 356 pSP71 357 pSP72 358 pSP73 359 pSPluc 360 pSPluc NF 361 pSPORT1 362 pSPORT2 363 pSPT18 364 pSP
90. 5 Custom Analysis Setups Analysis Menu ccccccccececeeeeeessseeeceeeeeesessseaeees 6 26 CUSTOM score adornments ccceccceeceeeceeeeeeceaececeeseeecuecaeeageeaueenseeeeesees 2 25 3 14 customizing Gl menus adding analysis SCtUPS E 4 8 6 26 adding bookmarks ek deEE an ee eee ee de 5 5 6 30 Index 3 D adding CO OS Cocina da 2 34 adding AMES idas 5 9 6 18 adding numeric formats oooccccccnnoncccnonannnennnnnnncnnnnnnncnnnnnnncnnnnnnnrrnnnnnnnrnnnnnnnnnnns 6 19 adding Style Sheets AAA 4 10 6 20 adding tool extensions cocccocccconncononncnnnnononnncnnncnnnnnnnnnnnnnnnrennnnnnnnernnnanns 5 17 6 29 tutorial escitas es at ca coscues sed ooved e aaee eE ea apaa i aaisa 2 34 2 35 Gut Edit Menu ui iii 6 9 D data sieving see median sieving Define Intron Features Menu ccccccceccesesceseeeeececeeseeseueuuuaeeaseeeeeeusessuunaueaes 6 40 Delete Row s Column s Table Menu cc ccccccccsscsseseeeeseeeeeeeuseseuauaeeeeenes 6 53 dependencias EE 6 36 ISK SPACE serge A SE eee 7 1 Display Features Menu cccccscsccceceesssesseaeeeseseesesssesaeeeeceeeessseeaeeeeeeeeesesees 6 40 Display Notebook Menu ccconcnccconccccnnnononcnoonononcnnnanononnnnonnnnnnonnnnnnnnnnnnnnnnannnos 6 32 Display Sequence Menu ccccmnococoncncnnnnnnonnnoonononnnnnnononnnncnnnnnnnnnnonnonanennnnnnnnnas 6 47 Display Sequences Menu cocccnnncnccnccnnnonononononononnnnnnonnnnnncnnnnnnnnnnannarnnnnnnnnnnannns 6 44
91. 69 PinPoint Xa 2 Promega 270 PinPoint Xa 3 Promega 271 PinPoint C Promega 272 PKK223 3 Pharmacia 273 pKK232 8 Pharmacia 274 pKK233 2 Pharmacia 275 pKK388 1 Clontech 276 pLambdaPop6 In Vitrogen 277 pLEX In Vitrogen 278 pLITMUS28 New England Biolabs 279 pLITMUS29 New England Biolabs 280 pLITMUS38 New England Biolabs 281 pLITMUS39 New England Biolabs 282 pLysE NovaGen 283 pLyssS NovaGen 284 pMAL c New England Biolabs 285 pMAL c2 New England Biolabs 286 pMAL cRI New England Biolabs 287 pMAL p New England Biolabs 288 pMAL p2 New England Biolabs 289 pMAM Clontech 290 pMAMneo Clontech 291 pMAMneo Blue Clontech 292 pMAMneo Cat Clontech 293 pMAMneo LUC Clontech 294 pMB9 Sigma 295 pMC1871 Pharmacia 296 pMDSG Pharmacia 297 pMelBacB In Vitrogen 298 pMEP4 In Vitrogen Page A 28 Appendix 299 pMEX5 300 pMEX6 301 pMEX7 302 pMEX8 303 pMSG CAT 304 pNASSbeta 305 pNEB193 306 pNEO 307 pNOM102 308 pocusl 309 pOCUS2 310 pocuslox 311 pPIC9 312 pPIC9K 313 pPL lambda 314 pPUR 315 pRAJ275 316 pRcCMV 317 pRcRSV 318 pREP10 319 pREP4 320 pREP4CAT 321 pREP7 322 pREP7CAT 323 pREP8 324 pREP8CAT 325 pREP9 326 pRIT2T 327 pRS403 328 pRS404 329 pRS405 330 pRS406 331 pRS413 332 pRS414 333 pRS415 334 pRS416 335 pSCREEN1b 336 pSE280 337 pSE380 338 pSE420 339 pSEAP Basic 340 pSEAP Control 3
92. 9 Dot Matrix: A dot matrix analysis is used to compare two sequences for regions of similarity. The result is a two-dimensional plot indicating graphically the regions of similarity between the two sequences. [Figure 4-51: Chou-Fasman Structure Prediction (CF structure prediction for Dros hsp23; Alpha and Beta traces)] Scoring tables can be used to specify similarity criteria. The setup and output are discussed in detail in the nucleic acid section on Dot Matrix (page 4-26). Protein scoring tables can play a significant role in defining your output. These tables are discussed in Align 2 Sequences (Global) (page 4-49). Find Repeats: This analysis will search peptide sequences for repeats of any defined length. The repeats can have some mismatches, and the maximum distance between the two parts of the repeat can be specified. See the nucleic acid analysis Find Repeats (page 4-34) for more details. This analysis can be run as a summary analysis. See page 4-32 for more details. Find Sequence: This allows you to define a query sequence and find it in a target peptide. The query sequence can be in multiple parts, and each part can be allowed to have up to a user-defined number of mismatches. The minimum and maximum distance between any two parts of the query sequence can also be specified. Results are presented graphically or as a table. See Find
93. A 46 pTrcHisB 47 pTrcHisC 48 pTrcHisCAT 49 pTrx 50 pTrxFus 51 pVL1392 52 pVL1393 53 pYES2 54 pZeoSV 55 pZeoSVLacZ 56 pZErO New England Biolabs NEB pACYC177 pACYC184 DANZ pLITMUS28 pLITMUS29 pLITMUS38 pLITMUS39 pMAL c ONDUBWD gt Vectors by Supplier Page A 13 Appendix 9 pMAL c2 10 pMAL cRI 11 pMAL p 12 pMAL p2 13 pNEB193 14 Yep24 15 Yip5 NovaGen ONDUBWHD gt M13mp18 pBlueSTAR1 pCITE 3a pCITE 3b pCITE 3c pCITE 4a pCITE 4b pCITE 4c 9 pCITE1 10 pCITE2b 11 pCITE2c 12 pET11 13 pET1la 14 pET11b 15 pET11c 16 pET11d 17 pET12a 18 pET12b 19 pET12c 20 pET14b 21 pET15b 22 pET16b 23 pET17b 24 pET17xb 25 pET19b 26 pET20b 27 pET21 28 pET21a Page A 14 Vectors by Supplier Appendix 29 pET21b 30 pET21c 31 pET21d 32 pET22b 33 pET23 34 pET23a 35 pET23b 36 pET23c 37 pET23d 38 pET24 39 pET24a 40 pET24b 41 pET24C 42 pET24d 43 pET25b 44 pET26b 45 pET27b 46 pET28a 47 pET28b 48 pET28c 49 pET29a 50 pET29b 51 pET29c 52 pET3 53 pET30a 54 pET30b 55 pET30c 56 pET31b 57 pET32a 58 pET32b 59 pET32c 60 pET3a 61 pET3b 62 pET3c 63 pET3d 64 pET3xa 65 pET3xb Vectors by Supplier Page A 15 Appendix 66 pET3xc 67 pET5 68 pET5a 69 pET5b 70 pET5c 71 pET7 72 pET9 73 pET9a 74 pET9b 75 pET9c 76 pET9d 77 pEXlox 78 pLysE
94. Align Multiple Sequences 4 52 Amino Acid Composition 4 53 tal EE 4 54 CF Structure Prediction 4 54 Dot Matrix EE 4 54 Find Repeats comica A a pa 4 55 FING SEQUENCE mai A a i 4 56 Find Sequence Prosite style 4 56 GOR Structure Prediction 0 cee eee 4 58 Helical Wheel e NENNEN ENNEN ee EEN ees A E 4 59 FIV rOpathy sodio aisles SE EIERE EE Ee 4 60 Hydration Potential 4 62 Membrane Buried RegionS 4 63 Optimal Matching Hydrophobicity 4 63 page 3 TABLE OF CONTENTS PIP eee tk ate hace rae EE 4 63 Physical Characteristics 4 63 Prosite Motif Search 4 64 Protein Cleavage 4 67 Protein Interior e NEE ee ees NEE dE 4 67 Side Chain Flexibility 0 0 0 cc cece eens 4 67 Signal SEQUENCE EE 4 68 Sliding Window 4 68 Side Chain ProtruSion 000 c cece ee ee eee ees 4 69 Surrounding Hydrophobicity 0 0 cece eee 4 69 Temperature Factor nesrodan ccc ee eens 4 70 Transmembrane HeliceS 0 0c cee 4 70 BLAST Sar er Se tall ge kode aaa aha Solel Ms 4 70 The GI Notebook Overview of the Gl Notebook 5 1 Conditional Text 5 2 SUE SES ia eos AWA AR A FTR 5 3 GI Notebook Layout o occccocccc eens 5 4 Ee tel 5 5 GI Notebook ObjectS 0 0 ccc tenets 5 6 Selecting vs Targeting 5 6 Drawing LOIS isis eae erer green EE todd eek eee 5 7 Preferred Size for Objects cusur eir cenae eee eee 5 8 Framing Gl Notebook Objects 20 0
95. Analysis selection is accessible through the right mouse button menu. Use the Enzyme list popup to choose the Commercial list. Select in the left list all the enzymes starting with the letter A by selecting AatI, using the scroll bar to scroll down to the last A enzyme (Axy as of this writing), and then holding down the shift key and selecting this last A enzyme (this is called shift-clicking and is a standard way of extending a selection). Move all the selected enzymes to the right list of sites to be marked by pressing the Move >> button. 4. Select Input Sequences on the left of the analysis setup and add the DNA sequence bovine LDH, which is in the lactate dehydrogenases DNA file. 5. This Analysis Setup you have just defined might be something you will use in the future, so let's add it to the Analysis menu. Choose Analysis > Add Setup to Menu and name the setup DNA Analysis Suite. In the Windows version of Gene Inspector, the Add Setup to Menu selection is accessible through the right mouse button menu. Any time you want to run this set of analyses on a new DNA sequence, you can do so by selecting the DNA Analysis Suite analysis setup from the Analysis menu and then changing the sequences to be analyzed in the input panel. [Footnote: Remember, patience is a virtue.] 6. Run this entire set of analyses by pressing the Run button. 7
96. Analysis > New Analysis. You will see the Analysis Chooser shown in Figure 4.1 (page 4-2). By using the Nucleic Acid Analyses and Protein Analyses radio buttons at the top of the window, you can see a list of either the nucleic acid analyses or the protein analyses that are available in the Gene Inspector. A few analyses, like sequence alignment and dot matrix analysis, appear in both lists; most are unique to each list. As different analyses are selected in the list on the (Footnote a: The exception is multiple sequence alignments, which direct their outputs to a sequence editor document; see Multiple Sequence Alignments, page 3-10, and Align Multiple Sequences, page 4-20.) [Figure 4.1: Analysis Chooser. The dialog has Nucleic Acid Analyses and Protein Analyses radio buttons, a scrolling list of analyses, a Show Icons check box, and an "Information about selected analysis" box. For the selected analysis the box reads: Helical wheel analysis projects a view of a peptide segment looking down the axis of an alpha helix. The distribution of side chains from the helix is readily apparent in this view. From Schiffer & Edmundson, 1967, Biophys. J. 7:121.]
97. [Figure 3.14: A Features Object. Labeled parts: left positions, line dividers, boxed style, right.] Edit > Copy to copy the sequence from the sequence window to the clipboard. Pasting the sequence into a GI Notebook will automatically create a Features object containing the sequence. If there is more than one sequence in the clipboard, then a multiple sequence Features object will be created (this is discussed later). Features objects can be moved around in the GI Notebook with the mouse like other GI Notebook objects. Targeting the Features object by double-clicking will add a Features menu to the menubar if there is only a single sequence in the object. If there is more than one sequence in the Features object, a Sequences menu will appear. The different parts of a DNA Features object are shown in Figure 3.14. A Features object cannot be edited, but its display can be altered. Once placed into a GI Notebook, the sequence cannot be changed: it is no longer a true sequence but an object containing a string of characters representing a sequence. The Features object is not
98. [Figure 4.16: Base Composition Output] [Figure 4.17: Base Distribution Panel. The diagram at the top of the panel defines a = window size and b = offset: if a = 10 and b = 2, the first window covers nucleotides 1-10, the second window covers 3-12, and so on. Panel help text: Window size is the number of adjacent nucleotides whose base composition is calculated in each iteration. Offset is the number of nucleotides the window moves between each iteration. Use the Keep track of check boxes to specify the nucleotides to monitor. The Style popup menu allows a predefined style to be applied to the results. Results can be plotted either as a sum or a percentage using the Plot results as buttons.] specify any single base or any combination of bases. In this figure the G and C boxes are checked, so the analysis will display the distribution of G+C base content along the length of the sequence. A window of 20 is chosen with an offset of 1. This is illustrated in the picture at the top of the panel. Nucleotides 1-20 (a window of 20) will be examined for G+C content, and the number of Gs plus Cs is plotted for this window. Next, the window is moved by 1 (an offset of 1) and nucleotides 2-21 are evaluated. This process is repeated using
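The windowed count described for the Base Distribution panel can be sketched in Python. This is only an illustration of the window/offset arithmetic, not Gene Inspector's own code; the function name `base_distribution` and its parameters are hypothetical.

```python
def base_distribution(seq, bases="GC", window=20, offset=1, as_percent=False):
    """Count the tracked bases in each window along a nucleotide sequence.

    Returns a list of (start_position, value) pairs, where start_position
    is the 1-based position of the first nucleotide in the window.
    """
    seq = seq.upper()
    tracked = set(bases.upper())
    results = []
    # Slide the window from the start of the sequence, moving `offset`
    # nucleotides each time, until a full window no longer fits.
    for start in range(0, len(seq) - window + 1, offset):
        count = sum(1 for ch in seq[start:start + window] if ch in tracked)
        value = 100.0 * count / window if as_percent else count
        results.append((start + 1, value))
    return results


if __name__ == "__main__":
    # Window of 4, offset of 2: windows cover positions 1-4, 3-6, 5-8.
    for position, gc in base_distribution("GGCCAATT", window=4, offset=2):
        print(position, gc)
```

With `window=20` and `offset=1` the first two windows cover nucleotides 1-20 and 2-21, matching the example in the text; `as_percent=True` corresponds to plotting a percentage instead of a sum.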
99. [Figure 6.4: Importing Into a Notebook] Import. Import provides a means to bring information in files created in other programs into Gene Inspector. There are three kinds of importing that you can do. GI cannot import files directly from other applications' formats, such as DNA Strider. You must first export the sequences from the other program as a TEXT or ASCII file. This can then be imported into GI. Import Text Into Notebook. When you have text information that you would like to import into a GI Notebook, or would just like to examine, this is the option that can do it. As shown in Figure 6.4, you can import the text into a new notebook or you can choose to append it to the end of an open notebook. [Figure 6.5: Import Sequence Dialog] Import Peptide Sequence. The dialog that appears when you choose this option is shown in Figure 6.5 (page 6-5). In this case there was no open sequence window in the Gene Inspector, so the Append sequences to option was not available. Usin
100. (Footnote: Depending on your current configuration you may see an extra cautions dialog. See the text describing Figure 3.5, page 3-6, for more information.) (Footnote d: Being a palette, the Sequence Monitor always remains in front of other windows. It will reflect information about the active sequence window, the one containing the selection or insertion point. If a different sequence window is brought to the front, the information in the Sequence Monitor will change.) [Figure 2.5: Sequence Monitor. Default key map: A = 1, C = 2, G = 3, T/U = 4, N = 5. Controls include Map Keys, Speak nucleotides while typing, Read Sequence, and Confirm Reentry.] With this method, as you type in the sequence a second time, the program will compare it with the characters you entered the first time. Any differences will be brought to your attention with a beep. The Map Keys button allows you to redefine the keyboard for entering sequences more conveniently. The default values for the keyboard map are shown in Figure 2.5. The top of the figure shows the options for Windows XP, while the bottom of the figure shows the options for Macintosh OS X. For more details see Chapter 3. 7. Select 5-10 lines of the pBR322 DNA sequence and then choose Sequence > Manipulate Sequence > Translate. Specify a translation table the E coli t
101. Dialog part of the frame, you can define the line thickness, pattern, and color using the Format menu, as indicated in the figure. To edit the properties of the inner frame, click once on the line next to the inner frame text to select the line and then make changes using the Format menu. The middle and outer frame lines can be set the same way. The Drop Shadow thickness, color, and pattern can be set using the Format menu once you have selected the line next to the Drop Shadow text. You can also define the space between any two of the framing rectangles (a gap). Frame definitions can be added to the menu by selecting a framed object and choosing Format > Frame > Add Frame To Menu. A frame can be applied by selecting an object and choosing the desired frame from the Format > Frame submenu. Simple frames, such as a 1-pixel-wide black border, can serve to separate an analysis output object or a user table from the surrounding text. When an object is resized, the frame stays with it and resizes appropriately. You can also use frames to create rectangles that have a different fill color and pattern from the border color and pattern. (Footnote d: Even though there is no visible frame around the object to start with, you can think of this as an object with a frame having zero-width lines. The menu item allows you to edit this invisible frame.) Text Flow Around Objects. Background text
102. Figure 4.1 (page 4-2) or as an icon list (shown here) by using the draw icons check box. 2. Select Transmembrane Helices from the list on the left and press the OK button. This will create a new Analysis Setup. Analysis Setups contain a number of different panels, each represented by an icon on the left of the window. The Analysis Setup panel for Transmembrane Helices is shown in Figure 2.8 (page 2-15). [Figure 2.8: The Analysis Setup Panel. Settings shown: Window Size 19, Median Sieving with a Mesh Size field, Table (Argos et al.), Style (Default), Show Icons. About the Analysis: This analysis is based on the statistical distribution of specific amino acids in membrane vs. non-membrane segments for a sample set of proteins (Argos et al., Eur J Biochem 128 55 1982). This Transmembrane Helix analysis is identical to the Membrane Buried Regions analysis. Panel help text: Window size is the number of adjacent amino acids whose property is calculated in each iteration. After calculating the value for the first window of amino acids, the window is moved one residue along the sequence and a value is calculated again for the new window of amino acids. Median Sieving emphasizes data having a specific distribution (J. A. Bangham, Anal. Biochem. 174:142, 1988).] 3. Notice the popup menus on the right o
103. Figure 6.17. There are seven kinds of tools in this menu. [Figure 6.17: The Tools Submenu] The arrow in the top left is used to select objects in the GI Notebook. The Text tool will create a text object (see Text Objects Sidebar Text, page 5-12). The Table tool will create a table in the notebook (see Table Objects User Tables, page 5-12). The Line tool will create lines that optionally can contain arrowheads on one or both ends. The tools on the bottom row will create rectangles, rounded rectangles, or ellipses. Reduction. Set Reduction. As shown in Figure 6.18, Set Reduction brings up a dialog that allows you to define the extent of the reduction. [Figure 6.18: Set Reduction Dialog, with a Percent of full size field] Enlarge. Enlarge is enabled only if the GI Notebook is actually reduced. You cannot enlarge to greater than original size. This option complements the Reduce option and will enlarge the notebook to have twice the area (multiply each dimension by √2). Enlarge To Full Size. Enlarge To Full Size removes any reduction and restores the notebook to full size. Reduce. This option complements the Enlarge option and will reduce the area of the GI Notebook to half of what it was (multiply each dimension by 1/√2). Reduce to Fit In Window. Reduce to Fit I
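The arithmetic behind Reduce and Enlarge can be checked with a few lines of Python: halving or doubling the area means scaling each linear dimension by the square root of 2. The sketch below models the notebook's reduction as a single scale factor (1.0 = full size); the function names are hypothetical and only illustrate the behavior described above.

```python
import math

FULL_SIZE = 1.0  # scale factor of an unreduced notebook


def reduce_scale(scale):
    """Halve the displayed area: each dimension is divided by sqrt(2)."""
    return scale / math.sqrt(2)


def enlarge_scale(scale):
    """Double the displayed area, but never exceed the original size."""
    return min(FULL_SIZE, scale * math.sqrt(2))


if __name__ == "__main__":
    scale = FULL_SIZE
    for _ in range(2):
        scale = reduce_scale(scale)
    # Two reductions quarter the area, so each dimension is now about half.
    print(round(scale, 3), round(scale ** 2, 3))
```

Capping `enlarge_scale` at full size mirrors the statement that you cannot enlarge to greater than the original size.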
104. GENE INSPECTOR 2.0 Tutorials & User Manual. Textco BioSoftware, Inc., 27 Gilson Road, West Lebanon, New Hampshire 03784 U.S.A. April 2012, First Edition. Gene Inspector 2.0 Manual is Copyright Textco BioSoftware, Inc. 2003-2012. All rights reserved. Voice/FAX: 603-643-1471; email: info@textco.com; URL: http://www.textco.com. TABLE OF CONTENTS. Getting Started with Gene Inspector: About This Manual; Installing the Gene Inspector; Updating Gene Inspector; System Requirements Mac; System Requirements Windows; Overall Design Philosophy; Three Main Parts of the Application; Sequence Editor; Analysis Setups; GI Notebook. Tutorials: About the Tutorials; Selection vs Target; Sequences; Tour of a Gene Inspector Notebook; Editing Sequences; Using Analysis Setups; Hotlinking Analysis Results; Multiple Sequence Alignments; Running Summary Analyses; Aligning Analysis Objects; Customizing Gene Inspector Menus
105. USING BOOKMARKS IN THE GI NOTEBOOK. Bookmarks can be attached to selected objects and used to remember specific locations in a GI Notebook. Bookmarks are automatically added to the Notebook menu. Selecting a bookmark from the menu takes you to the location of that bookmark in the notebook. 1. Open the GI Notebook called Sample Notebook, which should be in your Gene Inspector folder. Select the title object in the notebook; it is the box that says A Sample Gene Inspector Notebook at the very beginning of the notebook. Select Notebook > Bookmarks > Attach Bookmark. Name the bookmark Start of Notebook. 2. Select Notebook > Bookmarks > End of Notebook. This bookmark has already been added to the notebook and is accessible from the Bookmarks menu. Selecting the item will take you to the end of the notebook and bring into view a rectangle, which is the actual object to which the End of Notebook bookmark is attached. 3. Try selecting the Start of Notebook and End of Notebook menu items in the Bookmarks submenu. You can use bookmarks in this way to remember the location of specific analyses that might be key to your experiments. 4. Specific locations in the notebook can be defined as bookmarks by placing a small graphical object into the notebook at the desired location and attaching the bookmark to that object, as was done in Tutorial 1, Tour of a Gene Inspector Notebook. This concludes this tuto
106. Nucleic Acid Analysis: Align 2 Sequences Global. This routine will provide the best alignment between two sequences using the entire lengths of the sequences (a global alignment). The analysis will generate the alignment with the highest possible score. The score is calculated by adding points for each matched nucleotide and subtracting points for gaps and mismatches. You define these values in the panel shown in Figure 4.12. There are a number of parts to this setup panel. The top part of the panel specifies the way in which scoring will be conducted. This can be through a previously defined scoring table or through an identity table. You can use the Table Editor within the Gene Inspector to create a scoring table of your own, although GI comes with most of the standard scoring tables (see Creating Your Own Analysis Tables, page 2-63, and Tables, page 6-23). In a scoring table you define the value to be added to [Figure 4.12: Analysis Setup panel for global alignment. Options shown: Match scoring (Use scoring table, or Use identity table with a Match score and a Mismatch score), Gap penalty (for creation and for extension), Treat unaligned ends as gaps, Traceback use (Upper path or Lower path), Run z score using a number of trials, Show Icons, and Style.]
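To make the scoring idea concrete, here is a minimal Python sketch of a global alignment score computed by dynamic programming (the classic Needleman-Wunsch recurrence). It is not Gene Inspector's implementation: it assumes an identity-style match/mismatch score and a single linear gap penalty rather than separate creation and extension penalties, and it always penalizes end gaps. The function name and default values are illustrative assumptions.

```python
def global_alignment_score(seq1, seq2, match=1.0, mismatch=-1.0, gap=-2.5):
    """Return the best global alignment score between two sequences.

    Assumption: a linear gap penalty (each gap position costs `gap`),
    which is simpler than a creation/extension scheme.
    """
    # prev[j] holds the best score for aligning seq1[:i-1] with seq2[:j].
    prev = [j * gap for j in range(len(seq2) + 1)]
    for i in range(1, len(seq1) + 1):
        cur = [i * gap]  # seq1[:i] aligned against nothing: all gaps
        for j in range(1, len(seq2) + 1):
            pair = match if seq1[i - 1] == seq2[j - 1] else mismatch
            cur.append(max(
                prev[j - 1] + pair,  # align the two characters
                prev[j] + gap,       # gap in seq2
                cur[j - 1] + gap,    # gap in seq1
            ))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACGT"))
    print(global_alignment_score("ACGT", "AGT"))
```

Only the score is returned here; recovering the alignment itself would require keeping the full matrix and tracing back through it, which is where a choice such as upper versus lower path would come into play.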
107. Nucleic Acid Dot Matrix Panel will be compared to 1-10 of sequence 2. If the two segments meet the defined scoring criterion, a dot will be placed in the plot at coordinates representing the two segments being compared. Next, 1-10 of sequence 1 are compared to 2-11 of sequence 2, then 3-12 of sequence 2, then 4-13 of sequence 2, etc., until the entire length of sequence 2 is compared to the segment of sequence 1. Next, nucleotides 2-11 of sequence 1 are used to compare to sequence 2 and the process is repeated. In this way a plot can be generated which indicates graphically those regions of the two sequences which are similar; they will show up as diagonals on the plot. The values you enter in the input panel will determine when a dot will be drawn and what that dot will look like (color and size). The first thing you should do is define a window size by typing a number in the window size box; this specifies the size of the segment that will be used in the sequence comparison. The smaller the window, the more sensitive the analysis will be to local changes in sequence. The increased sensitivity, however, will also increase the noise level. For nucleic acids a window of at least 10 is recommended, but 20 is probably better. Each pair of sequences will have its own best window, as it will depend on how similar the sequences are and how the similarity is distributed along the length of the sequences. You also need to
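The window-by-window comparison described above can be sketched in Python. The scoring criterion here is an assumption made for illustration (a minimum number of identical positions within the window); the function name `dot_matrix` and the `min_matches` parameter are hypothetical and are not taken from Gene Inspector.

```python
def dot_matrix(seq1, seq2, window=10, min_matches=8):
    """Return the (position_in_seq1, position_in_seq2) pairs that get a dot.

    Positions are 1-based and refer to the first residue of each window.
    A dot is recorded when at least `min_matches` of the `window` aligned
    positions are identical (an assumed scoring criterion).
    """
    dots = []
    # Take each window of seq1 in turn and compare it against every
    # window of seq2, as in the description of the analysis.
    for i in range(len(seq1) - window + 1):
        segment1 = seq1[i:i + window]
        for j in range(len(seq2) - window + 1):
            segment2 = seq2[j:j + window]
            matches = sum(a == b for a, b in zip(segment1, segment2))
            if matches >= min_matches:
                dots.append((i + 1, j + 1))
    return dots


if __name__ == "__main__":
    # A sequence compared with itself gives a main diagonal, plus
    # off-diagonal dots where the sequence repeats.
    for dot in dot_matrix("ACGTACGT", "ACGTACGT", window=4, min_matches=4):
        print(dot)
```

Lowering `min_matches` or shrinking `window` produces more dots, which illustrates the trade-off between sensitivity and noise mentioned in the text.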
108. [Figure 4.48: Global Sequence Alignment] alignment indicators are shown and indicate regions of highest similarity. Align Multiple Sequences. The parameters for this analysis are similar to those in the section on multiple sequence alignments in nucleic acid sequences (page 4-20) and are discussed in detail in Multiple Sequence Alignments, page 3-10. The only difference between running a multiple sequence alignment as a sequence analysis (Analysis > New Analysis) and initiating the multiple sequence alignment within a sequence editor is the way in which you are allowed to choose sequences for the analysis. Doing a multiple sequence alignment as an analysis will let you choose to align any number of sequences from any number of sequence files. Running the alignment from within the sequence editor only works on the entire set of sequences in the sequence editor document you [Figure: Analysis Setup panel for Amino Acid Composition. Options shown: Output type (Graph or Table), Display results as (Number of occurrences or Percent of all occurrences), Show Icons, and Style.] Use the Output type box to have the amino acid composition displayed either as a gr
109. Save the notebook using a name you will remember, because you will need it again in Tutorial 15, Restriction Enzyme Digests, page 2-49. This concludes this tutorial. You may quit or continue on to the next tutorial. If you choose to continue, close all open windows now. TUTORIAL 14: USING PREDEFINED ANALYSIS SUITES. 1. We have provided you with several predefined suites of analyses in the Gene Inspector. These suites provide an easy way to set up a number of common analyses. The predefined suites can be modified or discarded and are meant to serve as an example of how analysis setups can be used. 2. Choose Analysis > Hydropathy Analyses. This analysis suite contains 10 different hydropathy analyses, one using each of the available tables in the popup menu. To use this suite, select the input sequence panel and choose the peptide sequence you want to analyze. 3. After choosing the sequence to be analyzed, press the Run button. 4. A new notebook will be created and your analyses will be started. While the analyses are running, choose Analysis > Show Analysis Monitor. This shows you all the analyses that are scheduled to be run and the order in which they will be run. As each one is completed, it is removed from the list and the next one in line starts up. 5. To see that all of the analyses really did run, choose Notebook > Reduction > Reduce to Fit In Window. This will shri
110. [Figure 3.15: Define Features Margins. The Sequence Margins dialog asks for the minimum margin widths, in pixels, between the sequence and the edge of the view (Left margin, Right margin) and for the space between positions and sequence, in pixels (Gap margin).] can set the minimum space between the edge of the sequence characters and the border of the Features object itself. You can also set the space between the sequence and the position indicators. Features > Grouping can be used to set the organization of the characters in the sequence listing. With this submenu you can define the size of the group in which sequence segments are organized and can insert or remove line breaks. In addition to the possibilities discussed above, you can also fine-tune the formatting by using options under the Format menu. Of particular use is the Format > Style > Box Around item, which will place a simple box around any selected segment of sequence. This is useful for bringing attention to a particular sequence within the Features object. Box Around works just like any other item in the Style submenu. If you create a multiple sequence Features object in the GI Notebook, you will see a Sequences menu. You will not be able to Mark Sites, as is possible when only a single sequence is present, but you will have the ability to apply custom adornments to the multiple sequence alignment. Custom Adornments are discussed on page 3 1
111. Sheets > Add Style Sheet. After you provide a name, a Style Sheet will be added to the Style Sheets submenu, where it can be applied to any GI Notebook output object. Any Style Sheet can be applied to any output object, but only the common attributes will be modified. [Figure 4.7: Creating a Style Sheet for Part of an Object. The Style Name dialog offers Create style sheet from (entire object or selected part(s)), Name for style sheet, and Save in resource.] If a sliding window Style Sheet is applied to a base distribution plot, almost all the attributes are comparable: both have x and y axes, titles, and a plot. However, applying a sliding window Style Sheet to a GOR protein squiggles plot will only affect the title. It is also possible to add Style Sheets corresponding to specific parts of an analysis object by making the analysis a target (double-click on it), selecting the part whose style you wish to copy, and then choosing Format > Style Sheets > Add Style Sheet. You may add a Style Sheet corresponding to the entire object or just to the selected part(s), as shown in Figure 4.7. Once a Style Sheet has been added to the menu, it can be used from within an Analysis Setup Panel. Each Analysis Setup Panel has a popup menu to allow you to choose a specific style for the output object. In addition to a default style set by Textco before shipping the Gene Inspector, the popup menu will contain all of the
112. Widest Side. Widest Side: text will be placed only to the side of the object that has the greatest distance between the object and the border of the text column. If the object is moved, the text will flow only to the widest side. Neither Side. Neither Side: text is not placed on either side of the object and jumps from above the object to below the object. Set Text Standoff. This item can be used to set the number of pixels that will be maintained between the object and the surrounding background text. The dialog is shown in Figure 6.19. Note that the vertical and horizontal standoffs can be set independently. [Figure 6.19: Setting the Text Standoff Distance. The Text Standoff dialog asks for the space between the text and the graphic, in pixels, as a Horizontal standoff and a Vertical standoff.] Arrangement. This menu deals with arranging objects in the GI Notebook. Send To Back. When an object is selected and Send To Back is chosen, the selected object is sent behind all other objects on the sheet. Bring To Front. When an object is selected and Bring To Front is chosen, the selected object is placed in front of all other objects on the sheet. Group. When more than one object is selected and Group is chosen, all of the selected objects will be joined together as a group. The group of objects can be manipulated as a single object instead of many individual objec
113. [Figure 4.13: Global Alignment Output for Nucleic Acids] There are three parts to this analysis output object: the title, the summary, and the body of the alignment itself. You can show or hide the summary and the body by choosing the appropriate menu item under the Object menu when the output object is targeted. You can also choose to show or hide alignment indicators in the same way. Alignment indicators are characters that indicate something about the relationship between the two aligned characters. For example, one character might indicate a score of 1, another a score of 0.5, and a third a score of 0. You can choose which characters you would like to use as alignment indicators, as well as define the color code for indicating scores, by choosing Object > Edit Thresholds. You will see the window shown in Figure 4.14. [Figure 4.14: Editing Alignment Thresholds. The Set Alignment Thresholds dialog lists a High threshold (1.00), a Mid threshold (0.50), a Low threshold, and Mismatches and gaps, each with its own Char box.] To change a character, just click in the Char box and type a new character into the box on the right. To change the color, select the Char and use Format > Color. To change the thresholds that are used for each character or color, ty
114. T19 365 pSPTbm20 366 pSPTbm21 367 pSTneo 368 pSV B GAL 369 pSV SPORTL 370 pSV2neo 371 pSVbeta 372 pSVK3 373 pSVL 374 pT3T7 lac 375 pT3T7 luc 376 pT3T7BM ST PTI 0 378 pT7 1 379 pT7 2 380 pT712 381 pT713 382 pT7BlueR 383 pT7T3 18 384 pI7T3 18D Page A 30 U S U S U S BRL BRL Biochemical Biochemical Biochemical NovaGen BRL Pharmacia USB USB USB nan Nn Appendix 385 pT7T3 18U 386 pT7T3 19 387 pT7T3 19U 388 pT7T3alpha 19 389 pT7T3alpha A18 390 pTKbeta 391 pTOPE1b 392 pTrcHisA 393 pTrcHisB 394 pTrcHisC 395 pTrcHisCAT 396 pTrx 397 pTrxFus 398 pTRXN 399 pTRXN 400 pTZ18R 401 pIZ18U 402 pTZ19R 403 pTZ19U 404 pUB110 405 puc118 406 puc119 407 pUC18 408 pucl9 409 pUC4K 410 pucs 411 pUC9 412 pUCbm20 413 pUCbm21 414 pUEX2 415 pVL1392 416 pVL1393 417 pWE15 418 pXal 419 pXa2 420 pXa3 421 pXPRS 422 pXPRS 423 pYAC4 424 pYAC55 425 pYACneo 426 pYES2 427 pYEUra3 List of all Vectors Included With Gene Inspector U harmacia RL Pharmacia RL harmacia lontech ovaGen Vitrogen Vitrogen Vitrogen Vitrogen Vitrogen Vitrogen S Biochemicals USB S Biochemicals USB harmacia DO IO ud PP vd vi vd YD C N I I E I I T U U P S igma Pharmacia Sigma Sigma Clontech Clontech Pharmacia Pharmacia Pharmacia S
115. The position is given from the 5' end of the top strand (the one containing Seq 1). The fourth column, Seq 2, is the sequence of nucleotides in the top strand corresponding to the inverted repeat. Any mismatches are shown in lower case, like the c in row 1. To make it easier to follow the sequences after inversion, the inverse of the sequence in column 4 is shown in column 5, labeled Seq 2. The mismatch information in this table may be of some use in your analyses, while the graphical view (Figure 4.26, page 4-32) provides a more comprehensive picture of the patterns of the inverted repeats. The Show summary results checkbox in Figure 4.25 (page 4-31) will create a single output containing the inverted repeat results of all the [Figure 4.28: Inverted repeat summary analysis, Drosophila HSPs] analysis is shown in Figure 4.28. For this analysis, the summary results are presented in table format. To see the result of an individual sequence analysis in the format shown in Figure 4.26 (page 4-32), first target the summary output object, select the sequence(s) you want to examine, and then choose Object > Search Selected Sequences. You will see an analysis setup panel with just the one sequence entered. Running this analysis will show the single sequence inverted repeat analysis. This is a convenient way t
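As a rough illustration of what an inverted repeat is, the Python sketch below looks for a segment on the top strand whose reverse complement occurs further along the same strand. It is deliberately simplified: it requires exact matches of a fixed length, whereas the table described above also reports mismatches. The function names and parameters are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, length=4, min_gap=0):
    """Find exact inverted repeats of a fixed length.

    Returns (start1, seq1, start2, seq2) tuples with 1-based positions
    counted from the 5' end of the top strand. seq2 is the downstream
    segment whose reverse complement equals seq1.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - length + 1):
        first = seq[i:i + length]
        target = reverse_complement(first)
        # Only look downstream of the first segment (plus any spacer).
        for j in range(i + length + min_gap, len(seq) - length + 1):
            if seq[j:j + length] == target:
                hits.append((i + 1, first, j + 1, seq[j:j + length]))
    return hits


if __name__ == "__main__":
    for hit in find_inverted_repeats("ACCGTTTTTCGGT", length=4):
        print(hit)
```

Printing `reverse_complement` of the second segment alongside the first makes the pairing easy to check by eye, which is the same purpose the inverted column in the table serves.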
116. a window of 20 and an offset of 1 until the end of the DNA sequence is reached. This analysis is useful for showing local regions of DNA which might have noteworthy base composition. The result is shown in Figure 4.18 (page 4-23). [Figure 4.18: Base Distribution Output for chick musc AchRec] Notice how easy it is to pick out the region rich in A+T (low G+C) around position 250. This analysis is very useful because it points out DNA features not visible through other analyses. Codon Preference. This analysis, which is based on the paper by Gribskov, M. et al. (Nucl. Acids Res. 12(1):539, 1984), is used to find coding regions in DNA. The [Figure: Analysis Setup panel for Codon Preference. Options shown: Window size (codons), Replace ambiguous characters with, Method (Start and Stop Codons, or Only Stop Codons), Min length ORF to consider (amino acids), Preference Table, Selection Cutoff, Show Icons, and Style.] Window size is the length of the segment of DNA analyzed in this sliding window analysis; 25 is recommended by Gribskov. The Method box defines ORFs as between stop codons or between start and stop codons. The Display box lets you show ORFs or ORFs and rare codons. Choose a codon table
117. aaees 6 26 Show hide analysis monitor ceceeceeeeeee cece teen E S E S EES 6 21 Update Setup EE 6 25 analysis MONOT ardiarena a ee eee ee eee eee ee eee ee aaa ner 4 3 6 21 analysis setups adding analyses soea EES a ens 4 8 adding analyses tutorial cccccccesssssssececeeeesessseeaeeececessssssaeeeeeeees 2 40 2 42 analysis Panels usina al id dada des 4 3 Customizing suites tutorial occccocononocococnnonnonnnocncnnnnnnnonnnrnncnnnonnnonannnns 2 46 AISCUSSION A dl a eee one 1 6 input sequence panel eiii aa 4 2 output location Panel ENEE EEN 4 2 overview tutorial ooococnnnnnnnnonncnnncnnnnncnnnoncnann nn ana nono nono nonnn nana nn nan c canon 2 14 2 18 predefined suites tutorial ooononncncnnnnnnncnnnnnnnnnnnnonnnnnnnnna canaria nana nn nara 2 48 analysis tables elle EE 6 23 creating tutorial a r e ar a a e eaea a Er Aae OEE UAA SERET ELIE enaos iii 2 63 2 66 antigenicity analyse EE 4 54 app endices A PEE EEE EE ii ia 1 8 5 16 6 33 Appendices Notebook Menu ooocococococccccccccononococonononononononononononononenonononononons 6 33 appendices tutorial unid cala dee dee 2 43 2 45 Argos CET ET A 1 Index 2 Arrangement Notebook Menu c cccccccccssssseececeeeesssssssaeeeeeeeeesesneaeeeeesesees 6 31 automatic links see hotlinks B background text low dees noes ee eee need idee de 6 30 DASE COMPOSITION cece cecceeccaeccaeceeeceeeceeeceeecaeecaeecauecasessees
118. table might be appropriate for pBR322, and click OK. A new sequence window will be created containing a translation of the segment of DNA selected in the pBR322 window. Note that this is not the same as looking for an open reading frame, but simply represents a translation of the selected nucleic acid sequence. Note that the DNA is translated in groups of three, starting with the first nucleotide in the selected segment; any stop codon is indicated by a marker character. (Footnote e: To identify the locations of open reading frames, see Open Reading Frames, page 4-39.) Click on the name of the new peptide sequence in the new window and choose Sequence > Sequence Info. This shows information about the generated sequence and provides a text field for storing comments. Because the program generated the sequence for you, it also placed some appropriate comments in the sequence info box for you. Close the dialog box. 8. Close the Sequence Monitor and then close the pBR322 sequence file and the generated untitled peptide window. Choose File > Open and open the peptide file called rhodopsins; it is in the Peptide Sequences folder inside [Figure: the rhodopsins.pep sequence window, listing Bacteriorhodopsin, Halobacterium archaerhodopsin, Lamprey rhodopsin, and an Octopus sequence]
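The translation step described above (reading the selection in groups of three from its first nucleotide, without searching for an open reading frame) can be sketched in Python. This is an illustration only: it assumes the standard genetic code, whose codon assignments are also the ones commonly used for E. coli, and it assumes `*` as the stop marker and `X` for unrecognized codons. None of these names or symbols come from Gene Inspector.

```python
BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order (standard genetic code).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a nucleotide selection codon by codon from its first base.

    Stop codons become '*' (an assumed marker), unknown codons become 'X',
    and a trailing partial codon is ignored.
    """
    dna = dna.upper().replace("U", "T")
    peptide = []
    for start in range(0, len(dna) - 2, 3):
        peptide.append(CODON_TABLE.get(dna[start:start + 3], "X"))
    return "".join(peptide)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```

Because translation does not stop at the first stop codon, a selection that is out of frame will typically show several stop markers, which is one quick sign that it is not an open reading frame.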
119. acters into a sequence. However, the Generate Random command will insert a randomly generated sequence of valid nucleotide or amino acid characters instead of just inserting Xs or Ns. Go To Position. The Go To Position menu item presents a dialog box (Figure 6.29) into which you can type a location. [Figure 6.29: Go To Position Dialog] This menu item is only active when the cursor is actually within a sequence. After pressing OK, the character at the position you indicated will be visible in the window and will be selected. Cancel will close the dialog box without moving the current cursor location. Speak Typing. The Speak Typing menu item will either be checked or not. If it is checked, each character that you type into the sequence window will be spoken by the computer as it is entered. The speed of speech and other parameters can be set using Speech Prefs, as described in the next section. Speech Prefs. The Speech Prefs menu item allows you to set the way in which speech is handled by the program. This was discussed in Defining Speech Preferences (Mac only), page 3-7. You can set the speed and the time to pause between groups, and even record your own sounds. Map Keys. Map Keys was discussed in Mapping the Keyboard, page 3-7. For nucleic acid sequences it provides a way to assign specific keys i
120. actual object, but is rather a pointer to an object that is not visible. GI Notebook aliases can point to objects in an Appendix or to any other object in the notebook. Aliases can be used as navigational tools. For more information about aliases, see Make Alias, page 6-29.

Note: The Appendix windows are almost identical to the Open For Editing windows (page 5-14). Both can be edited, but closing the appendix will make it invisible, while closing the editing window will return the object to the notebook. Appendix windows are also listed under the Appendices menu.

in the GI Notebook will transfer a copy of the tool extension into the GI Notebook at its original preferred size, with the top left corner of the object being placed at the location of the mouse click. If, instead of clicking the mouse button, you hold it down and drag out a rectangle, the tool extension will be scaled to fit into the rectangle.

Uses for Tool Extensions
Because tool extensions can contain any GI Notebook object, they have many uses. For example, you could draw a complex graphic consisting of a number of shapes (rectangles, circles, etc.), group them together, and then create a tool extension from them. You could then use the complex object as an icon for use as a bookmark, as an indicator of a new experiment, as an indicator of important data, or for any other point you want to highlight. Graphic objects from other programs
122. al sequence, you first need to copy the sequence you want to invert and paste it into a new sequence window (or a new sequence within the same window), then choose Invert. A new sequence editor document can be created by choosing File > New, and a new sequence within the same document can be created by choosing Sequence > New Sequence.

Choosing Sequence > Manipulate > Translate will translate the currently selected nucleic acid sequence characters. The Gene Inspector will ask you to choose a translation table, and then it will create a new peptide sequence window containing the translation of the selected nucleic acid segment.

For peptide sequences, you can choose Sequence > Manipulate > Reverse Translate. With this option you are asked to choose a codon preference table, and the Gene Inspector will use the frequencies found in the table to create a DNA sequence which could code for the given peptide. The codon frequencies in the generated DNA will match the codon frequencies for the organism you specified.

Formatting A Sequence Within the Sequence Editor
A number of options are available for defining the format for displaying sequences in the sequence editor. Choosing Sequence > Format Sequence will bring up the dialog shown in Figure 3-4. The Groups box allows you to spec

[Figure 3-4: Format Sequence dialog, with Groups and Spaces options]
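One plausible way to reverse translate with a codon preference table is to pick each codon at random, weighted by its frequency among the codons for that amino acid, so that the generated DNA approaches the table's frequencies over a long peptide. The sketch below shows that idea only; the table values, the three amino acids covered, and the name `reverse_translate` are hypothetical, and the Gene Inspector's actual procedure may differ.

```python
import random

# Hypothetical codon preference table: amino acid -> {codon: relative frequency}.
# The numbers are illustrative and do not come from any real organism.
CODON_PREFS = {
    "M": {"ATG": 1.0},
    "G": {"GGT": 0.20, "GGC": 0.45, "GGA": 0.30, "GGG": 0.05},
    "K": {"AAA": 0.30, "AAG": 0.70},
}


def reverse_translate(peptide, prefs=CODON_PREFS, rng=None):
    """Build one DNA sequence that could code for `peptide`.

    Each codon is sampled with probability proportional to its table frequency.
    """
    rng = rng or random.Random()
    codons = []
    for aa in peptide.upper():
        table = prefs[aa]
        codons.append(rng.choices(list(table), weights=list(table.values()))[0])
    return "".join(codons)
```

Because the codons are sampled, two runs on the same peptide can give different DNA sequences that both encode it.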
123. ally text objects. Text that is in a text object will not flow with the background text when the background text is edited.

12. Scroll down to Figure 3, or use the Notebook > Bookmark menu to go to the Figure 3 bookmark. Notice how specific parts of a figure can be indicated and referred to from within the background text.

13. Scroll down to notebook sheet 4, or use the Subcloning the Coding Sequence bookmark, to see how restriction digests were used to identify an appropriate region of the DNA for subcloning. (The sheet number is indicated in the bottom left corner of the notebook window.)

14. Looking further down on sheet 4 of the notebook will show you Figure 2-3. These are aliases. Just like the Finder's aliases, they point to another location. In this case, the aliases point to appendix objects, which can contain large amounts of data that you might not want directly in your notebook but do not want to discard either. Double click on the icon in the left part of the alias which says Features of pBG123 2. It will open up a new window containing the appendix to which the alias points. This appendix contains a features view of the sequence being cloned. A features view is a kind of notebook object the Gene Inspector uses to display formatted sequence information. Scroll

[Figure 2-3: Notebook Aliases, showing the aliases "Features of pBG123 2" and "the cloning strategy"]
124. [Figure 3-3: Peptide Sequence Get Info]

same information as that shown for the nucleic acid sequences, except that you can not make a sequence circular because there are no circular proteins (yet).

Sequences can be re-ordered within the sequence document by holding the option key down, clicking on a name, and dragging the sequence name up or down the column of sequence names. As the sequence is option-dragged, you will see an indicator of where the sequence will be placed when the mouse button is released.

New sequences can be added to a sequence document by selecting Sequence > New Sequence. This will present you with a dialog box allowing you to enter a name for the new sequence. This sequence will be created below the last sequence in the sequence editor document. You can then paste in a new sequence from elsewhere or begin to type in the new sequence.

Selecting a range of characters within the sequence itself is also possible by
125. an be used to design and produce illustrations for publication or presentation.

In addition to being a receptacle for analysis output, the GI Notebook is a capable word processor with special functions that enhance its use as a laboratory notebook. The background text of the GI Notebook can be used to discuss experimental strategies and to take notes on the experimental results. User tables allow you to set up repeatedly used information (e.g., lanes on a gel, buffer recipes, assay setups, etc.) in a convenient and readily accessible way.

Conditional text can be used in the GI Notebook. This is a special kind of style that allows the text to either be shown or hidden. Conditional text can be used to take notes that you might not want to show when you design a poster, or it might contain information that is parenthetical to the main discussion.

The drawing and text tools in the GI Notebook can be used for assembling posters or slides for presentations. The size of a GI Notebook sheet is not restricted to printer page size and can be set to correspond to your particular needs. If you need to have 16 x 20 inch panels for a poster, set the sheet size to 16 x 20 and define the number of columns of text you want to have.

Navigation is made simple in the GI Notebook through the use of bookmarks and aliases. Bookmarks identify specific locations in the GI Notebook and can be attached to any GI Notebook object. Each bookmark appears by name in th
126. and causes the appearance of eight handles (little black squares at the corners and midway along each side) around the edges of the object (Figure 2-1, left). You can resize the object using these handles in the same way you would within a standard drawing program. Double clicking on a GI Notebook analysis object makes it the target and causes the appearance of a gray border around the object (Figure 2-1, right). Once an object is targeted, you can modify components within the object and can use the features available in the Object menu, which appears when an object is targeted.

Sequences
You will be using a number of DNA and protein sequences in these tutorials. They are all saved in a folder called GI Seqs, which was placed in the Gene Inspector folder during installation. Within the GI Seqs folder there is a peptide folder and a DNA folder. Each file may contain multiple sequences (for example, see Figure 2-6, page 2-12).

TUTORIAL 1: TOUR OF A GENE INSPECTOR NOTEBOOK
One of the central components of the Gene Inspector is the GI Notebook. The notebook is a document created by the Gene Inspector that serves as a place for you to take notes about experiments, to design posters or slides for presentations, and as a receptacle for output from sequence analyses performed by Gene Inspector. This tutorial will take you through some of the features of a GI notebook and
127. any group size and will then organize the selected sequence into groups of that size.

• Insert Line Break
The Insert Line Break menu item will insert a line break at the insertion point in the sequence. This means that the character immediately after the insertion point will become the start of the next line. If a segment of sequence is selected when Insert Line Break is chosen, a line break is inserted just before the first character in the selected sequence and another line break is inserted just after the last character in the selected sequence.

• Remove Line Break(s)
The Remove Line Break(s) menu item will remove all line breaks from the selected segment of sequence.

• Adjust Size To Contents
The Adjust Size To Contents menu item will adjust the size of the features object so that it will exactly contain the entire contents. This is a useful command if the editing operations you perform on the features object cause it to shrink or grow.

Sequences Menu
This menu appears when you target a multiple sequence Features Object. For single sequence Features Objects, you will see a Features menu, which is discussed in Features Menu, page 6-39.

• Sequence Info
Selecting a sequence in the multiple sequence Features Object and then choosing this menu item will display information about the selected sequence. This is the same information that would be displayed in the sequence editor window (see Fig
128. aph or as a table. The data can be presented as the sum of units (how many of each amino acid) or as the percent of total by using the Data to plot box.

[Figure 4-49: Amino Acid Composition Setup]

can not add any additional sequences, nor can you use a subset of the sequences in the sequence editor document.

Amino Acid Composition
The amino acid composition analysis determines the number or percentage of each amino acid in the peptide(s) being analyzed. The setup panel is shown in Figure 4-49. Results can be displayed either as a histogram or a table, and can be shown as a raw number or as a percentage of the number of amino acids in the peptide (Figure 4-50). The table output lists the results as both an amount and a percentage (not shown).

[Figure 4-50: Amino Acid Composition Output. Histogram for Octopus rhodopsin; y-axis: Occurrences; x-axis: amino acids, Ala through Val]

Antigenicity
There are six Antigenicity Analyses available in the Gene Inspector. In general, these analyses are based on the likelihood of a given domain of a peptide residing on the surface of the peptide. The analysis is a standard sliding window analysis (Sliding Window, page 4-68) in most cases. The following tables are available for use in the Antigenicity analysis: Emini et al. (page A-1), Hopp and Woods (page A-2), Janin (page A-3), Parker et a
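The amino acid composition calculation described above amounts to counting residues and expressing each count as a percentage of the peptide length. A minimal sketch, assuming one-letter amino acid codes as input (the function name `amino_acid_composition` is made up for this example):

```python
from collections import Counter


def amino_acid_composition(peptide):
    """Return {residue: (count, percent_of_total)} for a one-letter peptide string."""
    residues = [c for c in peptide.upper() if c.isalpha()]
    counts = Counter(residues)
    total = len(residues)
    return {aa: (n, 100.0 * n / total) for aa, n in sorted(counts.items())}
```

The two values per residue correspond to the two display choices in the setup panel: the raw number and the percentage of the total.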
129. application. This means that you will never have to interrupt your work while an analysis is running. For time-consuming analyses, like database searching or sequence comparisons, this can be a real time saver.

5. You will get a plot like the one shown in Figure 2-40. Any points above the upper (green) threshold line, at about 0.95, correspond to a likelihood of >95% that the region actually codes for a protein. In this case, the region from about 1600 to 3600 corresponds to the raised area on the plot and to the open reading frame labelled as A. The tick marks indicate the presence of a rare codon. For this predicted protein there are very few rare codons, which also suggests that it is a real gene.

[Figure 2-40: TestCode Output. TestCode plot for Dros hsp70; x-axis: position in the sequence]

6. The output suggests that the reading frame A codes for a protein, and it might be of interest to create a protein sequence corresponding to that region of the DNA. This can be done easily in the Gene Inspector. Double click on the output object to make it the target.

7. Now select the ORF of interest by clicking once on the arrow A itself. Once the ORF is selected, choose Object > Translate DNA for Selected ORF. The Gene Inspector will read that segment of DNA, translate it using the table you specified in the analysis, and place the generated p
130. appropriate for the DNAs being analyzed. The rare codon Cutoff value is based on the frequency of

[Figure 4-19: CodonPreference Panel]

codon preference plot that is produced is useful for identifying genes and exons and for detecting DNA sequencing errors resulting from insertions or deletions. The setup panel is shown in Figure 4-19, page 4-23. The analysis requires the use of codon frequency tables to specify codon usage for the organism being studied. A codon frequency table is a table containing a list of each codon and the frequency at which that codon is used for specifying a particular amino acid. For example, there are 4 possible codons for glycine, but they are not likely to each be used 25% of the time in any one organism. Each of the four codons will be used at a different frequency in different organisms.

Available codon frequency tables are chosen in the bottom of the analysis setup panel using the popup menu. This popup contains the 48 Standard tables supplied with the Gene Inspector, along with any User tables you might have defined on your own (see Tutorial 20: Creating Your Own Analysis Tables, page 2-63, and Editing Translation and Codon Preference Tables, page 4-13). Any User table of the correct type that resides in the User Table folder in the GI Data folder will be available in the popup menu. If your organism is not listed, you can try using a codon preference table from a related organism
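To make the idea of a codon frequency table concrete, the sketch below builds one from a known coding sequence: each codon's count is divided by the total count of all codons that specify the same amino acid. It assumes the standard genetic code and an in-frame input; the function name `codon_frequency_table` is hypothetical and this is not the format of the Gene Inspector's own table files.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_frequency_table(coding_dna):
    """Return {codon: fraction of its amino acid's codons} for an in-frame sequence."""
    dna = coding_dna.upper()
    counts = Counter(dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    # Total number of codons observed for each amino acid.
    totals = defaultdict(int)
    for codon, n in counts.items():
        if codon in GENETIC_CODE:  # skip codons with ambiguous characters
            totals[GENETIC_CODE[codon]] += n
    return {
        codon: n / totals[GENETIC_CODE[codon]]
        for codon, n in counts.items()
        if codon in GENETIC_CODE
    }
```

With the input `GGCGGCGGCGGA`, glycine is specified three times by GGC and once by GGA, so the two codons get frequencies 0.75 and 0.25 rather than the 25% each that uniform usage of four codons would give.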
131. at an ORF has to be in order for that ORF to be drawn in the analysis output by using the setup panel in Figure 4-19, page 4-23. You can also specify whether all ORFs must start with a start codon (probably true for prokaryotes) or can start with any codon (which can occur as the result of introns in eukaryotic genes). ORFs are indicated as horizontal arrows, as shown in Figure 4-20, page 4-25.

You can select an ORF arrow with the mouse if the CodonPreference output object is targeted. Once an ORF is selected, it is possible to extract either the corresponding DNA or peptide sequence into a new sequence editor window. This is discussed in more detail in Open Reading Frames, page 4-39, and was the subject of Tutorial 17: Testcode, An Interactive Analysis, page 2-55.

Using the CodonPreference analysis and showing rare codons and ORFs provides three independent methods of identifying a coding region, all in the same output object. Notice how the three pieces of information agree in identifying the true coding region for Drosophila hsp70 in Figure 4-20 on page 4-25.

Dot Matrix
This was the central topic of Tutorial 18: Dot Matrix Analysis, Another Interactive Analysis, page 2-58. A dot matrix analysis is used to compare two sequences for regions of similarity. The result is a two-dimensional plot indicating graphically the regions of similarity between the two sequences. This method gives a very intuitive
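The general dot matrix idea can be sketched in a few lines: slide a window along both sequences and record a dot wherever enough positions in the two windows match. The window size, the match threshold, and the identity-only scoring used here are assumptions chosen for illustration; the Gene Inspector's own scoring and thresholds are described elsewhere in the manual and may differ.

```python
def dot_matrix(seq_a, seq_b, window=3, min_matches=3):
    """Return (i, j) positions where windows of seq_a and seq_b are similar.

    A dot is recorded when at least `min_matches` of the `window`
    aligned characters are identical.
    """
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                a == b
                for a, b in zip(seq_a[i:i + window], seq_b[j:j + window])
            )
            if matches >= min_matches:
                dots.append((i, j))
    return dots
```

Comparing a sequence with itself gives dots along the main diagonal; regions of similarity between two different sequences show up as diagonal runs of dots offset from it.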
132. be slow on older computers. You will need to try this on your own system to see if it is acceptable for your use.

• Custom Score Adornments
This option is used for adjusting the display of multiple sequences. It allows you to display aligned or non-aligned characters using highlighting of the characters or the backgrounds behind the characters. This is discussed in detail in Using Custom Score Adornments, page 3-14, and in Tutorial 5: Multiple Sequence Alignments, page 2-24.

• Format Sequence
Format Sequence is discussed in Formatting A Sequence Within the Sequence Editor, page 3-5.

• Use Extra Caution
Use Extra Caution is discussed in the text around Figure 3-5, page 3-6.

Sidebar Menu

• Adjust Size To Contents
The Adjust Size To Contents menu item will adjust the vertical height of the sidebar text object so that it will exactly contain the entire contents. This is a useful command if the editing operations you perform on the sidebar text cause it to shrink or grow.

Table Menu

[Figure: the Table menu, with the items Hide Column Headers, Hide Row Headers, Insert Row, Delete Row(s), Add Column(s) At Right, Add Row(s) At Bottom, and Adjust Size To Contents (Ctrl J)]

The Table menu contains commands needed to modify and change the display of user created tables. The dif
133. [Figure 2-39: The TestCode Setup Panel. Panel help text: Window size is the length of the segment of DNA analyzed in this sliding window analysis; 200 is recommended by Fickett. ORFs can be found between stop codons or between start and stop codons (Method box). The Display box lets you show ORFs, or ORFs and rare codons. Choose a codon table appropriate for the DNAs being analyzed. The rare codon Cutoff value is based on the frequency of occurrence in synonymous codons.]

examine this panel in depth now. The Minimum Length Open Reading Frame to Consider should be 200 (labelled A in the figure), the Method should be Only Stop Codons, and press the ORFs and rare codons button and choose Drosophila melanogaster as the standard table using the popup menu (labelled B in the figure).

3. Click on the Input Sequence icon on the left and choose the sequence hsp70 from the Drosophila Hsps DNA sequence file.

4. Run the analysis by pressing the Run button. On slower computers this analysis might take some time to run. While it is running, you can enter text into the notebook by clicking in the background and then typing. The Gene Inspector will continue to process analyses while you work in the notebook, or even if you switch to a different
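The two ORF methods named in the setup panel (between stop codons, or between a start and a stop codon) and the minimum length setting can be illustrated with the sketch below. It is not the Gene Inspector's implementation: it assumes the forward strand only, ATG as the only start codon, TAA/TAG/TGA as stops, and a minimum length measured in nucleotides; the name `find_orfs` is hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_length=200, require_start=False):
    """Return (frame, begin, end) tuples for open stretches in the three forward frames.

    `begin` and `end` are 0-based, end-exclusive nucleotide positions; the
    terminating stop codon is not included. With require_start=False a stretch
    opens right after a stop codon (or at the frame start); with
    require_start=True it opens at the first ATG after a stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        codon_starts = range(frame, len(dna) - 2, 3)
        begin = None if require_start else frame
        for i in codon_starts:
            codon = dna[i:i + 3]
            if codon in STOPS:
                if begin is not None and i - begin >= min_length:
                    orfs.append((frame, begin, i))
                begin = None if require_start else i + 3
            elif begin is None and codon == "ATG":
                begin = i
        # A stretch may still be open at the end of the sequence.
        end = frame + 3 * len(codon_starts)
        if begin is not None and end - begin >= min_length:
            orfs.append((frame, begin, end))
    return orfs
```

Calling `find_orfs(seq, min_length=200, require_start=False)` corresponds roughly to the tutorial's settings of a minimum length of 200 with the Only Stop Codons method.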
134. but there is no guarantee that the codon frequency tables for your organism and for the one you have chosen are similar. If your organism is not listed, try the Testcode analysis (TestCode, page 4-44), which does not depend on codon preference tables, or create a new codon preference table using your own coding data.

Because the analysis relies on the ability to recognize specific codons and look up values in a table corresponding to those codons, any ambiguous characters found in the sequence can cause problems in generating a meaningful output. Ambiguous character handling is discussed in the TestCode section on page 4-44. That discussion also pertains to the CodonPreference analysis being discussed here. You can specify how the analysis should handle ambiguous characters using the setup panel in Figure 4-19, page 4-23.

A default standard window size of 25 codons is recommended by the authors. It represents the segment size of the DNA that will be examined for its codon usage. The codon usage in this sliding window will be compared to the frequencies in the codon frequency table for that organism. The closer the actual usage is to t

Note: Many species use the same genetic code but might have significantly different codon preference tables. Translation tables contain information about which amino acid is coded for by each codon, while codon preference tables contain codon usage information.
135. by translation, it can be copied and pasted as plain text into a sequence editor for manipulation. Note that the features translation should be shown as one letter amino acids before you copy it to the sequence window.

be defined in the DNA sequence. You can also have multiple translations of the same DNA, for example in different reading frames.

The Features > Display submenu allows you to specify exactly what the Features object will look like. You can show or hide Site Markers (restriction sites or protein cleavage sites), Left Positions (numbering), Right Positions (numbering), and Line Dividers (which separate adjacent lines of sequence from each other). In addition, for the DNA Features object with translations, you can Show or Hide Translations and choose to show the translation as either One Letter AA Code or Three Letter AA Code. Line Spacing determines how much space is placed between lines of sequence.

To change the font characteristics for the position indicators, select the position numbers and use the Format menu. For translated sequence position indicators in DNA sequence (numbering of amino acid positions), the amino acid number formatting is set to match the formatting of the closest amino acid character. You cannot change the amino acid numbering font characteristics through the Format menu.

Features > Display > Features Margins will bring up Figure 3-15. With this dialog you
137. ce Adornments, page 3-13.

11. Choose Sequence > Consensus > Automatic Updating. This instructs Gene Inspector to automatically update the score (top left corner of the window) each time you edit one or more of the sequences. Note that this does not automatically realign the sequences; it just updates the score. Automatic updating places some demands on the computer, so it might slow down your typing if you are using an older, slower computer.

12. Try editing the sequences to see if you can increase the score. The Clustal algorithm is quite good, and it will be difficult to better the alignment generated by the algorithm.

This concludes this tutorial. You may quit or continue on to the next tutorial. If you choose to continue, close all open windows now.

TUTORIAL 6: RUNNING SUMMARY ANALYSES
Gene Inspector allows you to run Summary Analyses on multiple sequences at once, and it will present the results in a single output object. From within the resulting output object it is possible to explore details of any of the individual sequence analyses it contains. Summary Analyses are described in this tutorial. Summary Analyses are available only for appropriate analyses.

1. Choose Analyses > New Analysis and then choose to do a Protein Analysis; finally, select the FindSequence analysis and press OK.

2. You will see a dialog
138. ce on the disk which contains the Gene Inspector, you should not encounter any limitations. However, if you are doing analyses with large sequences (e.g., a dot matrix comparing two sequences, each of 5,000 characters, requiring more than 25 megabytes of data storage), you might run out of disk space. If you have a different volume (disk) that contains adequate space, you can use that space for the scratch data. The extra volume might be a removable media drive (e.g., SyQuest, Zip, CD-RW, etc.) or a different permanently connected hard drive.

Note: There is no built-in limitation to the size of sequences which can be analyzed by the Gene Inspector; it is limited only by your disk space.

You can tell the Gene Inspector to use a new scratch volume by creating an alias to a Scratch Data folder which you create on a new scratch volume. Do the following steps:

1. Locate and open the folder named GI Data. It is in the same folder as your Gene Inspector application.
2. Locate the folder called Scratch Data inside the GI Data folder and drag it to the volume you want to use as your scratch volume.
3. Drag your original Scratch Data folder into the Trash. After doing this, you should have a GI Data folder without a Scratch Data folder inside it.
4. Select (click once) the Scratch Data folder located on the new Scratch Volume.
5. Choose Make Alias from the File menu to create an alias t
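The storage figure quoted above for a dot matrix follows from simple arithmetic: comparing two sequences of 5,000 characters gives 5,000 x 5,000 = 25,000,000 cells. The sketch below assumes at least one byte of scratch storage per cell, which is an assumption made here to reproduce the manual's "more than 25 megabytes" estimate; the actual storage format is not documented in this section.

```python
def dot_matrix_scratch_bytes(len_a, len_b, bytes_per_cell=1):
    """Estimate scratch space for a full dot matrix (one cell per pair of positions)."""
    return len_a * len_b * bytes_per_cell


# Two sequences of 5,000 characters each: 25,000,000 cells.
needed = dot_matrix_scratch_bytes(5000, 5000)
print(f"{needed / 1_000_000:.0f} million bytes")
```

Because the requirement grows with the product of the two lengths, doubling both sequences quadruples the scratch space needed, which is why a separate scratch volume can become necessary for large comparisons.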
139. ch Promega Promega Promega Promega

Appendix: List of all Vectors Included With Gene Inspector

213 pGEM 2
214 pGEM 3
215 pGEM 3Z
216 pGEM 3Zf
217 pGEM 3Zf
218 pGEM 4
219 pGEM 4Z
220 pGEM 5Zf
221 pGEM 5Zf
222 pGEM 7Zf
223 pGEM 7Zf
224 pGEM 9Zf
225 pGEM luc
226 pGEM1
227 pGEMEX 1
228 pGEMEX 2
229 pGEX llambdaT  Pharmacia
230 pGEX 2TK  Pharmacia
231 pGEX 3X  Pharmacia
232 pGEX 4T1  Pharmacia
233 pGEX 4T2  Pharmacia
234 pGEX 4T3  Pharmacia
235 pGEX 5X1  Pharmacia
236 pGEX 5X2  Pharmacia
237 pGEX 5X3  Pharmacia
238 pGFP  Clontech
239 pGFP 1  Clontech
240 pGFP C1  Clontech
241 pGFP C2  Clontech
242 pGFP C3  Clontech
243 pGFP N1  Clontech
244 pGFP N2  Clontech
245 pGFP N3  Clontech
246 pGL2 B  Promega
247 pGL2 C  Promega
248 pGL2 E  Promega
249 pGL2 P  Promega
250 pGL3 B  Promega
251 pGL3 C  Promega
252 pGL3 E  Promega
253 pGL3 P  Promega
254 pGUSN358 S  Clontech
255 PhageScript SK  Stratagene
256 pHC79  BRL
257 pHIL D2  Invitrogen
258 pHIL S1  Invitrogen
259 PhiX 174  Promega
260 pHSV 106  BRL
261 pHT3T7bm  Boehringer
262 pHT3T7bm  Boehringer
263 pIAN7  New England Biolabs
264 pIBI24  IBI
265 pIBI25  IBI
266 pIBI30  IBI
267 pIBI31  IBI
268 PinPoint Xa 1  Promega
2
140. change the query. Mismatches only occur in lower case characters. Upper case characters must match exactly.

[Figure 4-29: Find Sequence Panel (DNA)]

The query sequence can consist of a number of segments. The sequence you want to search for is entered by typing it in the segment box and pressing the Add Segment button, which will add that segment to the sequence list. In this figure, two sequences have been entered in the sequence list: caat and tataaa. The caat sequence is the currently selected sequence, so the items in the bottom of the panel pertain to the caat sequence. In this case it has been specified that an exact match must occur (0 mismatches) and there must be from 10 to 40 nucleotides before the next segment in the query sequence is found. Similarly, 0 mismatches are allowed in the tataaa sequence.

If you enter a sequence that you would like to search with again in the future, you can save it by pressing the Edit Find Menu button in the top right of the panel. The current entry might be saved under the name promoter. The ability to save search sequences might be useful if you have binding motifs or other sequences you are interested in for your research.

The output is shown in Figure 4-30. Matches were found with this query

[Figure: Find sequence output for Dros hsp; a map of match positions along the sequence and a table with the columns First nt, caat, Separation, tataaa, Last nt]
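The two-segment query described above (caat, then 10 to 40 nucleotides, then tataaa, with 0 mismatches in each segment) can be approximated with a regular expression. This sketch covers exact matches only and does not attempt the mismatch handling of the real analysis; the function name `find_two_segments` and the decision to report the shortest valid gap for each start position are choices made for this example.

```python
import re


def find_two_segments(dna, first, second, min_gap, max_gap):
    """Find `first`, then min_gap..max_gap nucleotides, then `second` (exact matches).

    Returns (start, end) positions, 0-based and end-exclusive. A lookahead is
    used so that overlapping candidates are all reported.
    """
    pattern = re.compile(
        "(?=(%s[ACGTN]{%d,%d}?%s))"
        % (re.escape(first), min_gap, max_gap, re.escape(second)),
        re.IGNORECASE,
    )
    return [(m.start(1), m.end(1)) for m in pattern.finditer(dna)]
```

For the promoter-style query in the figure, the call would be `find_two_segments(sequence, "caat", "tataaa", 10, 40)`.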
143. connected to the original sequence in any way once it is part of the GI Notebook. The Features object is used to create a display for your sequence, highlighting the specific features you want to emphasize in the sequence. The Features menu can be used to perform a number of operations on the sequence in the Features object. The Mark Sites menu option will mark either restriction sites on DNA or chemical/enzymatic cleavage sites on proteins. The choices are the same as if you had chosen to do a restriction enzyme digest (page 4-42) or protein cleavage (page 4-67) analysis. The Translate, Define Intron, and Undefine Intron(s) choices are available only for DNA Features objects. Translate will create an amino acid sequence below the DNA sequence using any translation table you specify. If a segment of the DNA has been defined as an intron, that segment is skipped in the translation of the DNA. You can define a segment of DNA as an intron by selecting the segment and then choosing Features > Define Intron. Any number of introns can Footnote b: You can also select a whole sequence (click on the sequence name) and then drag it directly from the sequence window to the notebook. This will automatically create a Features object in the GI Notebook. If you select only a part of a sequence and drag it to the GI Notebook, it will be treated as text and placed as part of the background text. Footnote c: Once an amino acid sequence is generated in a Features object
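The intron-skipping behavior of Translate described above can be sketched in a few lines. This is only an illustration, not Gene Inspector's own code: the function name `translate_skipping_introns`, the 1-based inclusive intron coordinates, and the use of the standard genetic code are all assumptions made for the example.

```python
from itertools import product

# Standard genetic code, with amino acids listed in T, C, A, G codon order.
# (Assumption: the manual lets you pick any translation table; the standard
# table is used here only for illustration.)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_skipping_introns(dna, introns=()):
    """Translate dna, leaving out every segment marked as an intron.

    introns is a hypothetical list of (start, end) pairs, 1-based and
    inclusive, mirroring how a selected segment would be described.
    """
    dna = dna.upper()
    skipped = set()
    for start, end in introns:
        skipped.update(range(start - 1, end))
    # Join the remaining (exon) bases, then read them codon by codon.
    coding = "".join(base for i, base in enumerate(dna) if i not in skipped)
    return "".join(
        CODON_TABLE.get(coding[i:i + 3], "X")  # "X" marks an unrecognized codon
        for i in range(0, len(coding) - 2, 3)
    )


# A 6-base intron (positions 4 through 9) is removed before translation.
print(translate_skipping_introns("ATGGTAAGTGCCTAA", introns=[(4, 9)]))
```

Because the intron bases are removed before codons are read, the reading frame continues across the exon junction, which is the effect the text attributes to Define Intron followed by Translate.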
144. d Enter the personalization information (name and organization). The first time you run the new Gene Inspector, you will be asked to insert the original CD unless it is already in the CD drive. This is the only time you will need to do this unless you reformat your hard disk. 4. You are finished installing the software. Please read the notes below and enjoy using Gene Inspector. On Windows: 1. Insert the Gene Inspector CD and run the installer application. Follow the steps that are presented on screen. Updating Gene Inspector: Updates will be made available for Gene Inspector 2.0 from our web site as they are needed. Check <http://www.textco.com/downloads/updates.html> to see if there is a newer version. To run the newer version of the application, download it from the web site and put it into the Gene Inspector folder on your hard disk. After checking that it works, discard the older version of Gene Inspector 2.0. There is no need to type in 20-30 characters to activate the new version. We have included a demo version of Gene Construction Kit on the Gene Inspector CD. This is Textco's DNA manipulation, presentation, and cloning program that complements Gene Inspector. You can install the demo version by dragging the entire Gene Construction Kit Demo folder to your hard disk (Mac) or by running the installer found in the Gene Construction Kit demo folder (Windows). Please call us if you have any questions or probl
145. d Once you have provided a name, the custom frame will then be appended to the Frames menu. The bottom radio button will present the same dialog as Edit Frame. Remove Frame From Menu: If you no longer have a use for a custom frame, you can remove it from the Frames menu with Remove Frame From Menu. • Numeric Format: Numeric formats can be set for numbers that are displayed in user tables. This provides you with a way to have all the numbers in a table formatted consistently. This menu works similarly to the Color and Frames menus. Set Format: Set Format brings up the numeric formatting dialog, as shown in Figure 2.28 (page 2-35). You can specify the number of decimals, scientific notation, or even add symbols such as degrees to the numbers. Add Format To Menu: Add Format To Menu allows you to add a custom numeric format to the Numeric Format menu. Remove Format From Menu: Remove Format From Menu allows you to remove a custom numeric format from the Numeric Format menu. If you do not have any custom numeric formats, this option will be disabled. • Paragraph: The Paragraph formatting menu item applies to any text that is not background text, such as text found in sidebars. Background text is formatted using the icons in the GI Notebook ruler (see Figure 5.1, page 5-2). Left Justify: Aligns the left end of each line of text. The right ends can be uneven as each line of text will
146. [Figure 6.16: Editing Table Info. The dialog shows the table's creation date (Wed, Nov 9, 1994, 2:02 PM), modification date (Thu, Nov 30, 2006, 11:32 PM), and comments ("This table is best used to compare proteins that are very similar to each other").] up (Figure 6.16, page 6-24). You can change the title of the table, which will be used whenever the table is made available for use in an analysis. The minimum and maximum adjectives are used as labels in some analyses, like Helical Wheel (page 4-59). The Cancel button will close the table window without saving any changes, while the OK button will create a new table and store it in the folder called User Tables in the GI Data folder. Open For Editing: Open For Editing will allow you to edit a table that you have created using Create New from the Analysis Table submenu. You cannot edit the standard tables supplied with the program. Remove: Remove will provide an opportunity to remove any analysis tables you may have created. The dialog is similar in appearance to the create table dialog (Figure 6.14, page 6-23). • Add Another Analysis: Add Another Analysis provides an opportunity to add additional analyses to an Analysis Setup window. This option is only enabled when an Analysis Setup window is open. In the Windows version of Gene Inspector, this selection is accessible through the right mouse button menu. It is discussed in Tutorial 11, Adding More Analyses to a Setup, page 2
147. d begin typing. You should see text appear in the background in the first available free space. The background text flows around objects and can be used to describe and track experiments within the GI Notebook. It can also be used to discuss analysis results. Type in some descriptive text in this notebook. You can generate several pages of text quickly by typing in a paragraph and then copying and pasting this paragraph back into the notebook. 2. Choose Notebook > Display > Set Display Preferences. You will see Figure 2.30. [Figure 2.30: Set Display Preferences. The Display Preferences dialog offers checkboxes to show page breaks, text margins, the printable area, and the paper binding using color, plus an option to save the settings as the default display prefs for new notebooks.] This allows you to specify the way different boundaries are indicated on the GI Notebook page. Set colors for the text border and for the printable area border by pressing the Set button next to the item to be set. These borders will let you see how items are placed on the notebook sheet relative to the borders. 3. Select an output object and choose Notebook > Get Info. This provides you with information about the particular object. You can enter comments here that you might like to refer back to in the future. Close the Get Info window. 4. While one of the objects is still selected, look at the Notebook > Text Flow submenu. By choosing o
148. d scrolled into view. Find works in conjunction with Find Next (page 6-13) to allow you to rapidly go through a document and find each instance of a search string. For sequence documents, the dialog looks like Figure 6.8 (page 6-12). There are some extra options in this dialog. The top section of the dialog allows you to define criteria for the searching. You can define a query sequence and allow mismatches up to a specified maximum number of characters. [Figure 6.8: Sequence Editor Find Dialog. Options shown: Maximum number of mismatches allowed; Match ambiguous chars exactly (Y matches only Y); Interpret ambiguous chars (Y matches C, T, and Y); Find the next ambiguous character; Target (search active sequence, or search all sequences in the document); Search bottom strand as well as top strand; Distinguish U in DNA from T in RNA; Wrap around; Search backwards.] You can enter ambiguous characters like X for proteins or N for nucleic acids, so you also need to tell the search routine what it should do with these characters. If you want X/N to match only the character X/N in the target sequence (the sequence that is being searched), then choose Match ambiguous chars exactly. If you want X/N to match anything, then choose Interpret ambiguous chars. The last choice, Find the next ambiguous char, will fi
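The two ambiguity modes and the mismatch limit described above can be pictured with a small search routine. This is a sketch under stated assumptions, not the program's actual algorithm: the names `find_matches` and `chars_match` are hypothetical, only nucleotide ambiguity codes are handled (the standard IUPAC set), and matches are reported as 1-based start positions.

```python
# Standard IUPAC nucleotide ambiguity codes and the bases each one covers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def chars_match(query_char, target_char, interpret_ambiguous):
    """Compare one query character with one target character."""
    if query_char == target_char:
        return True  # identical characters always match (Y matches Y)
    if not interpret_ambiguous:
        return False  # "match exactly" mode: Y matches only Y
    # "interpret" mode: Y also matches C and T, N matches any base, etc.
    return target_char in IUPAC.get(query_char, query_char)


def find_matches(query, target, max_mismatches=0, interpret_ambiguous=True):
    """Return 1-based start positions where query matches target."""
    query, target = query.upper(), target.upper()
    hits = []
    for start in range(len(target) - len(query) + 1):
        window = target[start:start + len(query)]
        mismatches = sum(
            not chars_match(q, t, interpret_ambiguous)
            for q, t in zip(query, window)
        )
        if mismatches <= max_mismatches:
            hits.append(start + 1)
    return hits


print(find_matches("AYG", "ACGATGAYG", interpret_ambiguous=True))
print(find_matches("AYG", "ACGATGAYG", interpret_ambiguous=False))
```

With interpretation on, the query `AYG` finds `ACG`, `ATG`, and the literal `AYG`; with exact matching it finds only the literal `AYG`. A protein search would need its own rule for `X`, which is left out of this sketch.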
149. d text in the clipboard, it can be pasted into a GI table. To do this, just target the table and select the table cell you wish to make the top left corner of the new values to be pasted in. Choose Paste to place the clipboard values into the table starting at that location. In addition to simply dragging the divider line between two columns to adjust the column width, two special commands can be used to adjust the width of all table columns simultaneously. Holding down the option key while dragging the divider line between columns will change the widths of all table columns by the same amount. Holding down the shift key while dragging the divider line will adjust all columns to the same width as the column just to the left of the selected divider line. Open for Editing: Any of the analysis output objects in the GI Notebook can be opened in a separate window for editing. This provides you with scroll bars and other capabilities that are not available for the object within the notebook itself. [Figure 5.11: Open For Editing. A Transmembrane helices output object for Octopus rhodopsin (x-axis: Amino acid), shown both within the notebook's background text and in its own window.] This is done by selecting the output object and then ch
150. describes a method of predicting protein secondary structure using statistical measures. Standard output is shown in Figure 4.55. [Figure 4.55: GOR Output as Graph. GOR structure prediction for Pig musc Ach Recpt, plotting Alpha, Turn, Beta, and Coil against sequence position.] The accuracy of this structure prediction algorithm is on the order of 60-65%, as is true for the Chou-Fasman algorithm (page 4-54), so you should interpret the predicted structures with some caution. Targeting this object and choosing View As Squiggles from the Object menu will show you Figure 4.56 (page 4-59). This is a representation of the structure prediction in a different way. You can change back to the graph view by choosing View As Graph from the Object menu. At the bottom of Figure 4.56 (page 4-59) is a legend. You can use this legend to change the appearance of the squiggles plot. First, target the output object by double-clicking on it. Now click on one of the legend items, for example the beta label. [Figure 4.56: GOR Output as Squiggles.] Once the label is selected, you can make changes to the line thickness, color, and pattern using the items under the Format menu. You can also change the font and size of the legend text. Any changes you make will be reflected in the drawing of the squiggles plot. The squiggles and the graph plots indicate the same information about the sequence. Helical Wheel: The helical wheel analysis projects a vie
151. display. There are five separate kinds of adornments in addition to the shading discussed on page 3-13. • Grade background color of characters that: places a background behind each of the characters that either match or do not match the consensus sequence. The intensity of the background color will be higher when a higher fraction of the residues in that column match (or do not match) the consensus sequence. This is basically the same as shading the alignments (page 3-13), but it has the option to shade non-matching characters as well, depending on which radio button is chosen. • Fill behind characters that: allows you to define a color to use as background for all characters that match and/or do not match the consensus character. Note that you can choose to create a background color for matches as well as non-matches by using the checkboxes in this section of the window. • Invert the characters that: inverts the colors of the characters matching the consensus character or not matching, depending on which radio [Figure: the Match Adornments dialog, with controls for grading the background color of, filling behind, or inverting characters that match or do not match the consensus sequence.]
152. domains of specified length. This lends itself particularly well to picking out regions in various protein analyses that might be of a length needed to span a membrane, for example. As shown in Figure 4.8 on page 4-12, peaks are more clearly resolved and ambiguities are often clarified when compared to the standard sliding window mean. Figure 4.8 is a standard hydropathy analysis of a chick muscarinic acetylcholine receptor, with the standard sliding window on the top and median sieving on the bottom. Note how much more defined the peaks are in the median-sieved version of the analysis and how well the different peaks are resolved. The mesh size is used to define how the median sieving is carried out. As a starting point, use a value that is about half the size of the sliding window. You can try other mesh sizes, slightly larger or smaller than this value, to see if they help better define what you are looking for. When you do find an appropriate mesh size that presents data of interest, you can calculate the size of the window it corresponds to by doubling the mesh value and adding one. For example, if a mesh size of 9 brings out some interesting feature of your peptide, then you are seeing a feature corresponding to regions that are 19 amino acids long (2 × 9 + 1). A mesh of four corresponds to regions 9 amino acids in length (2 × 4 + 1), etc. Editing Translation and Codon Preference Tables Bot
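The mesh-to-window arithmetic above is simple enough to write down directly. The two helper names below are hypothetical and exist only to restate the rule of thumb from the text: window = 2 × mesh + 1, and a starting mesh of about half the window.

```python
def window_for_mesh(mesh):
    """Region length (in residues) that a given mesh size corresponds to."""
    return 2 * mesh + 1


def starting_mesh(window):
    """A starting mesh of about half the sliding window, as the text suggests.

    Integer division is an assumption; the text only says "about half",
    so a value one larger or smaller is equally reasonable to try.
    """
    return window // 2


for mesh in (9, 4):
    print(mesh, "->", window_for_mesh(mesh))
```

For the two worked cases in the text, a mesh of 9 gives 19 residues and a mesh of 4 gives 9 residues.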
153. dow will be added, including all analysis parameters, the suite of analyses, and the sequences used (if any). • Remove Setup From Menu: Remove Setup From Menu complements the previous one. It allows you to remove a Custom Analysis Setup from the Analysis menu. In the Windows version of Gene Inspector, this selection is accessible through the right mouse button menu. You will be presented with a list of the current Custom Analysis Setups and can choose which to delete. • Custom Analysis Setups: After the Analysis menu items listed above will be a list of all the Custom Analysis Setups you have added to the application. Custom Analysis Setups are actually stored on your hard disk as files in the Analysis Setups folder inside the GI Data folder. Because they are just simple files, you can share your custom analysis setups with other users just by giving them your setup files. Notebook Menu: The Notebook menu deals with the editing, arrangement, and behavior of objects in the GI Notebook. [Figure: the Notebook menu, whose items include Tools, Reduction, Tool Extensions, Get Info, Display, Insert Page Break, and Notebook Layout.] Get Info: The Get Info menu item presents information about the currently selected object(s). The information shown differs for different objects. Using this option, you can enter and edit comments to save with the object. Tools: The Tools submenu is shown in
154. down the sequence to see how different features can be labelled and highlighted in different ways. Close the appendix by clicking in the close box at the top left corner of the appendix window. 15. Appendices, like bookmarks, are also available through a menu. Choose Notebook > Appendices > Assaying CAT Activity. This appendix is really just a text object (yes, another use for a text object) that has been moved to the appendices for this notebook. The advantage of placing a protocol like this in the appendices, instead of just leaving it in the notebook, is that it can now be accessed from anywhere in the notebook by choosing it from the menu. You might also choose to store buffer recipes in the same way. Putting commonly used information in appendices makes the information available from anyplace in the notebook. Look at some of the other appendices to get an idea of how you might be able to use appendices. Close this appendix window when you are done looking at it. 16. Choose Notebook > Bookmarks > Define Promoter Behavior. This section of the notebook contains a table that was created using the notebook's table tool. Tables present a useful way to organize experimental information. In this case, the table is displaying the data from a particular experiment. 17. You might create a table for repeated use (e.g., for identifying lanes in gels) and add it as a tool extension, as explained elsewhere in the manual. To see
155. e contains the name of the sequence being displayed in the line to the right of the name itself. For single-sequence files, like the pBR322 sequence shown in Figure 2.4, the sequence name is only shown on the first line of the sequence. The position column contains the position of the first nucleotide or amino acid in each line. The overview pane at the top of the window shows a graphical representation of the sequence. Within the overview pane, the dotted rectangle, called the segment indicator, indicates the segment of the sequence currently visible in the sequence editor window. The Sequence > Display menu can be used to show or hide these different parts of the sequence window. Feel free to try different items in this menu at this time. Footnote b: Figure 2.6 (page 2-12) shows a sequence editor document with multiple sequences; this will be discussed later. 2. Scroll down the sequence by using the scroll bar on the right of the window. Notice how the segment indicator in the overview pane moves along as you scroll and indicates exactly where you are in the sequence. 3. You can also navigate to different positions within the sequence by clicking with the mouse in the overview pane. The sequence editor will automatically scroll to the location in the sequence that was clicked in the overview and will also select the clicked sequence. You can even drag the segment indicator to navigate within th
156. e 53 pBC KS Stratagene 54 pBC SK Stratagene 55 pBC SK Stratagene 56 pbetagal Basic Clontech 57 pbetagal Control Clontech 58 pbetagal Enhancer Clontech 59 pbetagal Promoter Clontech 60 pBI101 Clontech 61 pBI101 2 Clontech 62 pBI101 3 Clontech 63 pBin19 Clontech 64 pBK614 Sigma 65 pBlueBac4 In Vitrogen 66 pBlueBac4CAT In Vitrogen 67 pBlueBacHis2CAT In Vitrogen 68 pBlueScribe KS Stratagene 69 pBlueScribe KS Stratagene 70 pBlueScribe M13 Stratagene 71 pBlueScribe M13 Stratagene 72 pBlueScribe M13 Stratagene 73 pBlueScribe SK Stratagene 74 pBlueScribe SK Stratagene 75 pBluescript II KS Stratagene 76 pBluescript II KS Stratagene 77 pBluescript II SK Stratagene 78 pBluescript II SK Stratagene 79 pBluescript KS Stratagene 80 pBluescript KS Stratagene 81 pBluescript SK Stratagene 82 pBluescript SK Stratagene 83 pBlueSTAR1 NovaGen 84 pBPV 85 pBR322 86 pBR325 87 pBR328 88 pBS 89 pBS 90 pBTac2 91 pCANTAB5 92 pCAT C 93 pCAT E 94 pCAT P 95 pCDM8 96 pcDNA3 97 pcDNA3CAT 98 pcDNAI 99 pcDNAIAmp 100 pcDNAIAmpCAT 101 pcDNAII 102 pcDV1 103 pCEP4 104 pCEP4CAT 105 pCF20 106 pCH110 107 pCI 108 pCI neo 109 pCITE 3a 110 pCITE 3b 111 pCITE 3c 112 pCITE 4a 113 pCITE 4b 114 pCITE 4c 115 pCITE1 116 pCITE2b 117 pCITE2c
157. e color style, you can create a style sheet corresponding to the styled text. The style sheet contains information about text characteristics and can be applied to any other selected text. To create a style sheet, select the text containing the style you wish to use (it can be as small as a single character) and then choose the Add Style Sheet option from the Format Style Sheet menu. You will be asked to provide a name, which will then be added to the Format Style Sheet menu. To use the style sheet, just select the text you want to style and choose the style from the Format Style Sheet menu. Style sheets can also be used to entirely define the style of output objects, including axis format. This is covered in Tutorial 10, Creating and Using Style Sheets (page 2-38). GI Notebook Layout: The GI Notebook consists of a number of sheets. They are called sheets and not pages to distinguish them from printer pages. Sheets can be of any size you care to define and are not limited to multiples of printer page sizes. This allows you to define a size like 16 x 20 inches, which might be convenient for posters, or define a notebook size that exactly fits your computer screen. You can specify the layout of the GI Notebook using the Notebook Layout menu. The Layout dialog box is shown in Figure 5.3. There are several [Figure 5.3: the Page Layout dialog, showing the notebook layout style options.]
158. e Notebook menu. An alias functions in a way similar to the way Finder aliases work. It can serve as a hypertext link to any object in the GI Notebook and can point to Appendix objects (see below) that are not directly visible in the notebook window, or even to objects in a different notebook. Footnote a: The alias can be used to go instantly to another location in the notebook. Hypertext links like this allow you to navigate through a document in a non-linear way; you can find information that is of interest to you at any time by following the link (alias). Any output object in the GI Notebook can be used as a starting point to recalculate an analysis. All the original analysis parameters are stored with each analysis output object. Since the output object can be copied and pasted, several copies of the analysis can be generated, recalculated with slightly different parameters, and then displayed adjacent to each other. If you choose to hotlink an analysis, it can be updated for you automatically each time the analyzed sequence changes. In addition to the drawing tools provided, user-definable tools called tool extensions provide a rapid means to store often-used objects. Tool extensions may include graphical objects, analyses, text, and tables. This provides great flexibility in organizing and displaying information. It functions as a very powerful scrapbook. Appendices are separate
159. e Style popup menu allows a predefined style to be applied to the results. Output type specifies whether the results should be presented as a graph or as a table. Results can be displayed as Number of occurrences or Percent of all occurrences using the Display results as buttons. [Figure 4.15: Base Composition Panel.] Figure 4.15. The upper popup menu allows you to specify whether the Gene Inspector should calculate information about mononucleotide, dinucleotide, or trinucleotide composition. The results of the analysis can be displayed either as a table or as a graph. The output graph can plot either the number of occurrences of the mono-, di-, or trinucleotides, or it can plot the percentage of all occurrences of each specific mono-, di-, or trinucleotide. The table output lists both kinds of data. Output from a dinucleotide base composition analysis is shown in Figure 4.16. The relative heights of the different bars on the histogram will remain the same, but the values along the Y axis will change when the output is shown as either a percentage or number of occurrences. Base Distribution: The Base Distribution analysis determines the distribution of any particular base or combination of bases as a function of position along the DNA. The setup panel is shown in Figure 4.17 (page 4-22). Using this panel you can [Figure 4.16: Dinucleotide Composition for Human lysozyme. Y-axis: Number of Occurrences; x-axis: dinucleotides (AA, AC, AG, AT, CA, CC, CG, and so on).]
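The mono-, di-, and trinucleotide tallies described above, reported both as counts and as percentages, can be sketched as follows. This is an illustration rather than the program's implementation: the function name `composition` is hypothetical, and counting overlapping k-mers at every position is an assumption, since the text does not say how di- and trinucleotides are stepped.

```python
from collections import Counter
from itertools import product


def composition(sequence, k=1):
    """Count every k-mer (k=1 mono, 2 di, 3 tri) in a DNA sequence.

    Returns {kmer: (count, percent_of_all_occurrences)} for all 4**k
    possible k-mers, so absent ones are reported with a count of zero.
    Assumption: k-mers are counted at every position (overlapping).
    """
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    table = {}
    for kmer in map("".join, product("ACGT", repeat=k)):
        n = counts.get(kmer, 0)
        table[kmer] = (n, 100.0 * n / total if total else 0.0)
    return table


for kmer, (count, percent) in composition("AACGTTAACC", k=2).items():
    if count:
        print(f"{kmer}\t{count}\t{percent:.1f}%")
```

Because every percentage is the count divided by the same total, switching between the two displays rescales the Y axis without changing the relative bar heights, which is the behavior the text describes for the histogram.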
160. e analyses by pressing the Run button. 8. Arrange and resize the three analysis objects to all fit on the screen for easy viewing, and then select them all by selecting one and choosing Edit > Select All. 9. With the three analysis objects selected, choose Notebook > Links > Automatic. You will see the appearance of a small green circular adorner icon in the top right corner of each output object, indicating that it is now hotlinked (see Figure 2.14, page 2-21). They should all be green, indicating that no sequences have been changed since the analysis was initially run. [Figure 2.14: Hotlinked, No Updating Needed. A base composition graph for Lamprey rhodopsin (Number of Occurrences of A, C, G, and T) carrying the hotlink adorner.] 10. Bring the rhodopsins2 sequence window to the front again (you can use the Windows menu to do this). Click in the Xenopus sequence and type in a few characters. This changes the sequence and will notify the corresponding output object in the GI Notebook that it needs to be updated. 11. Bring the notebook window to the front and notice the change in the hotlink adorner for the analysis that depends on the Xenopus rhodopsin sequence. The adorner is now red and yellow and contains an exclamation point, as shown in Figure 2.15. Notice that only the analysis object which is dependent on the altered sequence needs to be updated. The other tw
161. e document and pasted into the GI Notebook, it appears as a Features object. Such an object is shown in Figure 6.25. The appearance of the different parts of the Features object is defined by the items in the Features menu. A discussion of the Features object can also be found in Tutorial 16, Displaying Formatted Sequence Information (page 2-51). The different parts of the Features object can be shown or hidden individually by using the items in this menu, as described below. [Figure 6.25: GI Notebook Features Object. A DNA sequence with its translation beneath it; callouts label the translation, marked sites, intron, left positions, line dividers, and boxed style.] If you simultaneously paste multiple sequences into the notebook, you will create a multiple sequence Features object. This is discussed in Sequences Menu (page 6-44). • Mark Sites: The Mark
162. e done by creating a Features object in the notebook. 1. Open the DNA sequence file pBR322. 2. Select nucleotides 1 through 150 and copy them to the clipboard by choosing Edit > Copy. Note that if you select these nucleotides and drag them to the notebook, the sequence itself will be placed in the notebook as text. 3. Bring the Untitled GI Notebook window to the front by clicking on it or choosing Window > Untitled. 4. Paste the DNA sequence into the notebook (Edit > Paste). This creates a new notebook object called a Features object. 5. Double-click on the new Features object to target it. You will see a new Features menu added at the right. 6. Select nucleotides 18 through 96 and then choose Features > Define Intron. The selected segment of DNA is displayed in inverse colors to indicate that it is an intron. 7. Select the entire DNA sequence, from 1 through 150, and then translate it by choosing Features > Translate. When asked for a translation table, choose the E. coli translation table. 8. Choose Features > Adjust Size To Contents to force the size of the object to expand to include all of the DNA sequence as well as the amino acid Footnote k: If you do not have an untitled notebook window, create a new notebook window by using the File > New > Notebook menu. Footnote l: Note that another menu item, Edit > Special Paste, allows you to paste the sequence into the notebook's background text. Footnote m: Obviou
163. e eeeaaeeseaeeeeaaeeeees 4 63 Q Quit FileMen linia id dd EEN 6 8 R rare Een caia iaa 4 25 Index 13 S reducing notebook view EE 6 28 Reduction Notebook Menu occccccccnconononcnnncnononononananann nn nono ocn nnn nana n cnc nc nana ninas 6 28 Remove Analysis Analysis Menu c scccccscesesssesseeeceeeeeesesssseeeeeeeeeseees 6 25 Remove Setup From Menu Analysis Menu scccceceeeeseessseeeeeeeeseees 6 26 removing CUSTOM MENUS cipal dais ii 2 35 Replace Find Next ic Ee dE Ee ease 6 13 restriction enzyme digest GISCUSSIONN id a lis 4 42 edit display parameters ooocconcncononncnnncnnonnncnnnncnnnnnnnnnrnnnnnnnnnnnnnnnrnnnnnennrnnnnans 4 44 Mark CUL SOS 2 sa ea dadeoneeceteus a eusaccouduinawe sabes 4 42 Mark recognitiON SITES asi r asra cece ee ee cette eee ee O eee rre 4 42 return appendix to notebook cece cece eee e eee eee ee eee ee ee eee sees eae esa ee eae eee 2 44 reverse translates a ENEE ERAN ENEE ANEN 6 48 Revert to Saved File Menu ooconnccccccnnnnncnccnnnnncnnona nana nan cnn nan n ono nana rca 6 4 POW headers ii ds dao 5 13 6 53 S Sample ne EE 2 2 Save File Menu cccccccccccccccsceeeeeececesceeeeuseaeeueaseceueaueeaaauausauedeeseesseauaesaeeeeeees 6 3 Save a Copy File Menu ta a adn eS 6 4 Save As File Menu sretnici aa a a a a 6 3 SCOMNG re EE 2 25 3 12 scratch ica A es EE E 7 1 segment MACON clic ica 3 2 Select All Edit Menu onnnc
164. e sequence part of the window. Try clicking in the overview pane of the window to navigate. Try dragging the mouse in the overview pane. 4. Select 20-30 nucleotides by clicking and dragging the mouse over the nucleotides, just as you would in a standard text processor. Choose Format > Color > Red and notice the change. Now choose Format > Style > Bold. Notice that the sequences stay aligned even after making them bold; if at all possible, the Gene Inspector will keep your sequences aligned automatically. 5. Select the entire sequence by choosing pBR322 in the name column on the left and change the font to Times using Format > Font > Times. Notice how all the characters still line up appropriately even though you are using a proportional font (Times) instead of a monospaced font (Courier). You can deselect the sequence by clicking anywhere in the sequence itself, just like in a word processor. 6. Select Sequence > Show Sequence Monitor, which will bring up a palette like the one shown in Figure 2.5 (page 2-11). The Sequence Monitor shows relevant information about the sequence you are working with in the sequence window, allows you to set some speech properties, and provides two ways of confirming a sequence. The first way is simply to have the program speak the sequence to you; it will start reading at the beginning of the current selection in the sequence editor. The second way is to confirm by re-entry c
165. e the segment at about 250 is accessible. Surface accessibility is thought to be related to the antigenicity of the segment of the peptide. [Figure 4.46: Accessible Surface Area Setup. The panel shows Window Size 7 and Mesh Size 4. Its About the Analysis text reads: Based on values in Janin, Nature 277, 491 (1979), which determined the surface accessibility of amino acids. The ratio of buried/accessible values in the paper (Table 1, column 4) were converted to the fraction accessible. This Antigenicity analysis is identical to the Accessible Surface Area analysis. Its help text reads: Window size is the number of adjacent amino acids whose property is calculated in each iteration. After calculating the value for the first window of amino acids, the window is moved one residue along the sequence and a value is calculated again for the new window of amino acids. Median Sieving emphasizes data having a specific distribution (J. A. Bangham, Anal. Biochem. 174, 142, 1988).] [Figure 4.47: Accessible Surface Area Output. Accessible surface area for Chick musc Ach Recpt, plotted against amino acid position.] Align 2 Sequences (Global): This routine will provide the best alignment between two sequences using their entire lengths (a global alignment). Alignment parameters can
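The sliding-window calculation described in the setup panel's help text (compute a value for the first window, move one residue along, compute again) can be sketched as a running mean. The function name `sliding_window_mean` is hypothetical, and the input is assumed to be a list of per-residue property values that the caller has already looked up; no values from the Janin table are reproduced here.

```python
def sliding_window_mean(values, window=7):
    """Mean of each run of `window` adjacent values, stepping one residue.

    `values` holds one number per residue (for example, a per-residue
    accessibility score supplied by the caller). The result has
    len(values) - window + 1 entries, one per window position.
    """
    if window < 1 or window > len(values):
        return []
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Slide by one residue: add the new value, drop the oldest one.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means


print(sliding_window_mean([1, 2, 3, 4, 5], window=3))
```

Each output point summarizes `window` residues, so a larger window gives a smoother curve with fewer points. The median sieving option mentioned in the panel is a different filter and is not implemented in this sketch.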
166. earch: This analysis will search through the Prosite database of sequence motifs for any sites that might be present in the protein(s) you have chosen to be analyzed. The setup panel is shown in Figure 4-63. Protein recognition sites with known functions are included in this comprehensive database, which is broken into a number of categories. Clicking with the mouse on an item in the list will place a check mark next to that category. All checked items will be searched. (Page 4-64, Analyses) Figure 4-63: Prosite Setup Panel (Prosite Database Release 14.0 of November 1997; categories listed: Post-translational modifications, Domains, DNA or RNA associated proteins, Oxidoreductases, Transferases, Hydrolases, Lyases, Isomerases, Ligases, Others, Electron transport proteins; buttons: Select All, Deselect All). Panel help text: Click on categories in the list to check them. All checked categories will be used in the search for similar sequences in any sequence being analyzed. Figure 4-64: Prosite Graphical Output (Prosite motif search, Dros hsp70; motifs shown include AMIDATION, ASN_GLYCOSYLATION, CK2_PHOSPHO_SITE, HSP70_2, HSP70_3, MYRISTYL, and PKC_PHOSPHO_SITE). The result
167. ector can create Tables in the GI Notebook. To do this, choose the table tool (Figure 5-5, page 5-7) and then either click in the GI Notebook at the location where you wish to insert the table or drag a rectangle to contain the table. If you just click in the GI Notebook rather than drag, you will be prompted to enter the number of rows and columns to use in the new table. If you drag to create a table, you will see an indication of the number of rows and columns being created as you drag the mouse to enlarge the table. In either case, the table size can be altered after it is created by using the Table menu that appears when a table is the target. A GI Notebook table is shown in Figure 5-10. Values have been entered into each of the cells in the table and correspond to the amount of radioactivity in a transcription reaction at different NaCl concentrations. Using the Format menu, it is possible to set the justification, style, size, color, and numeric format for any item in the table. (Page 5-12, The GI Notebook) Figure 5-10: A User Table (columns NaCl mM and cpm; callouts mark where to click to select the row headers and the column headers). Column headers are the cells at the top of each column; row headers are the cells at the left of each row. By clicking at the locations indicated in Figure 5-10, in the top left corner of the table, you can select either all the column headers or all the row
168. ed object, you must first ungroup the object by choosing the Notebook > Arrangement > Ungroup menu item. Preferred Size for Objects: When an object is first created or placed into the GI Notebook, it will appear at a specific size, which is called the preferred size. The preferred size dimensions are stored with the object so that if the object is resized, it can always be returned to its preferred size. To return an object to its preferred size, use the Notebook > Arrangement > Restore Preferred Size menu item. The preferred size for an object can be set to the size the object currently has by selecting the object and then using the Notebook > Arrangement > Save Preferred Size menu item. Framing GI Notebook Objects: Any GI Notebook object can have a graphic frame placed around it. To create a frame, first select the object and then select the Format > Frames > Edit Frame menu item. You will see the dialog shown in Figure 5-6. Frames can contain up to three concentric framing rectangles. For each framing rectangle that is c Footnote: Note that these simple objects cannot be targeted (Selecting vs. Targeting, page 5-6) because there are no internal components to be edited. (Page 5-8, The GI Notebook) Figure 5-6: Graphic Frames (callouts: outer gap, inner gap, outer frame, middle frame, inner frame, inner margin, drop shadow, choose number of frames; set line thickness, color, and pattern by selecting these lines and using the Format menu).
169. ee Figure 2-28, page 2-35. This dialog allows you to define the format to be used. Figure 2-28: Numeric Format Dialog (options shown include Fixed decimal notation, Scientific notation, Exactly as typed, Unadorned, Degrees, and Percent; Minimum decimal places: 2; Sample: 123.46). In this case we want to use two decimal places. Choose the Fixed decimal notation button and type a 2 into the text field labelled Minimum decimal places. Try other items in this dialog and see how they change the sample text (circled in the figure). 7. Once you have configured the formatting the way you want, press OK to add it to the menu. 8. To remove a custom numeric format from the menu, choose Format > Numeric Format > Remove Format From Menu. You will see the dialog shown in Figure 2-29. Choose the format(s) you wish to remove and press OK. Other custom menu items are removed in a similar way. Figure 2-29: Removing a Custom Menu Item. This concludes this tutorial. You may quit or continue on to the next tutorial. If you choose to continue, close all open windows now. (Page 2-35, Tutorials: Taking Notes Using Background Text) TUTORIAL 9: TAKING NOTES USING BACKGROUND TEXT. 1. Open the notebook you saved in Tutorial 7 (Aligning Analysis Objects, page 2-31). Click in the background (white area outside the output objects) an
170. ... 3-19. N. naming colors 2-34; new analysis 2-14; New Analysis (Analysis Menu) 6-21; New Sequence (Sequence Menu) 6-45; New (File Menu) 6-2; notebook, see GI notebook; notebook layout 2-48, 6-37; Notebook Layout (Notebook Menu) 6-37; Notebook Menu 6-27: appendices 6-33, arrangement 6-31, bookmarks 6-30, display 6-32, [illegible] 6-30, [illegible] 6-27, [illegible] 6-34, make alias 6-29, notebook [illegible] 6-37, open for editing 6-29, (Index 10) page break 6-37, [illegible] 6-28, [illegible] 6-30, [illegible] 6-28, [illegible] 6-27; nucleic acid analyses: align 2 sequences (global) 4-16, align multiple sequences 4-20, base composition 2-20, 4-21, base distribution 4-21, [illegible] 4-48, codon [illegible] 4-23, dot matrix: color [illegible] 4-28, define thresholds 4-28, define window size 4-27, [illegible] 4-29, discussion
171. ... 2-20, 4-21; [illegible] 4-21; BLAST searching 2-65, 4-47; BLOCKS search 4-48; [illegible] 5-5, 6-30; Bookmarks (Notebook Menu) 6-30; [illegible] 2-53, 3-19, 6-17; Bring to Front 6-31; Bull & Breese table A-1. C. Choose GI Data Folder (File Menu) 6-8; Chou-Fasman structure prediction 4-54; Clear (Edit Menu) 6-10; Close (File Menu) 6-3; Clustal V algorithm 3-11; codon frequency tables 4-24; [illegible] 4-23: editing codon preference tables 4-13; Color (Format Menu) 6-18; column headers 5-13, 6-53; conditional text 1-7, 5-2; Consensus (Sequence Menu) 6-49; Consensus Sequences Menu 6-44; consensus [illegible] 2-25, 3-12; Copy (Edit Menu) 6-9; Current Window Names (Windows Menu) 6 1
172. el is shown in Figure 4-25, page 4-31. The window size is the size of the segment for which you are looking for an inverted repeat. The maximum number of mismatches is the maximum number of nucleotides that will be allowed to be mismatched between the first segment and the second (inverted) segment. Use the text boxes on the bottom to define the minimum and maximum number of nucleotides between the two parts of the inverted repeat. The output is shown in Figure 4-26, page 4-32. This output object displays each inverted repeat as a short horizontal line, the ends of which (drawn as vertical tick marks) represent the start point for each segment of the inverted repeat. (Page 4-31, Analyses) Figure 4-26: Inverted Repeats Graphic Output (Find inverted repeats, Dros hsp22). The graphical output object can be targeted and the output changed to a tabular form by choosing Object > View As Table. The table form is shown in Figure 4-27. The table output contains additional information not found in the graphical view. Each row in the table represents an inverted repeat. Seq 1 Start indicates the position of the first nucleotide in the first part of the inverted repeat. The actual sequence is shown in the second column, labeled Seq 1. The column labeled Seq 2 Start shows the first nucleotide of the start of the inverted repeat
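The inverted repeat parameters described above (window size, maximum number of mismatches, and minimum and maximum spacing between the two parts) can be illustrated with a short Python sketch. This is a minimal brute-force version written for this explanation, not the program's actual algorithm; the function names, the 1-based positions in the result, and the tuple layout are assumptions chosen to resemble the Seq 1 Start / Seq 1 / Seq 2 Start columns of the table view.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(segment):
    """Return the reverse complement of an uppercase DNA segment."""
    return segment.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(sequence, window, max_mismatches, min_gap, max_gap):
    """Brute-force search for inverted repeats.

    Returns tuples of (seq1_start, seq1, seq2_start, seq2, mismatches),
    with 1-based start positions (an assumption, not taken from the manual).
    """
    sequence = sequence.upper()
    hits = []
    for i in range(len(sequence) - 2 * window - min_gap + 1):
        first = sequence[i:i + window]
        for gap in range(min_gap, max_gap + 1):
            j = i + window + gap
            second = sequence[j:j + window]
            if len(second) < window:
                break  # the second segment would run past the end
            # The second segment is compared in its inverted (reverse complement) form.
            mismatches = sum(
                a != b for a, b in zip(first, reverse_complement(second))
            )
            if mismatches <= max_mismatches:
                hits.append((i + 1, first, j + 1, second, mismatches))
    return hits


if __name__ == "__main__":
    # GGGAC ... GTCCC is a perfect inverted repeat separated by 3 nucleotides.
    for hit in find_inverted_repeats("GGGACTTTGTCCC", 5, 0, 3, 3):
        print(hit)
```

Raising the mismatch limit or widening the allowed spacing increases the number of hits quickly, so small windows with generous limits tend to produce crowded output.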
173. Figure 4-4: Input Sequence Panel (Range options: Entire sequence or Segment; a linear sequence from 1 to 2486 is shown, together with the Path of the selected sequence and a Show Icons checkbox). Panel help text: Add and Remove buttons are used to define the Chosen sequences. Clicking on a sequence in the list displays its location (Path). Entire sequence will run all analyses using the entire length of the chosen sequence. The Segment button allows the analyses to operate on only a part of the entire sequence. A symbol in the Chosen sequences list indicates that the segment is truncated by a stop codon. To add a sequence to the Chosen files and sequences list, press the Add button, which will bring up the Sequence Chooser (Figure 4-5, page 4-6). Sequences can be added one at a time or as an entire file containing multiple sequences. In this case, the user has previously chosen the chick musc AchRec sequence and added it to the list in the bottom right of the window. Then the user clicked on the Drosophila Hsps file in the top left. This file, when clicked once in the top left, will put the list of all the sequences in the file into the bottom left corner list. Clicking on the Add Drosophila Hsps >> button will add the entire file to the list in the bottom right, as was done here. To add a single sequence, just click on the sequence yo
174. ems (Page 1-3, Getting Started with Gene Inspector) We hope you enjoy your new software. System Requirements (Mac): System 10.5 or later; 8 megabytes of RAM available for the application; 14 megabytes of disk space, depending on what is installed. System Requirements (Windows): Windows 7, XP SP3, or Vista; 8 megabytes of RAM available for the application; 14 megabytes of disk space, depending on what is installed. Overall Design Philosophy: The Gene Inspector provides an electronic notebook that functions as a counterpart to the paper version you are used to using in the laboratory. At the same time, the GI Notebook provides additional capabilities only available in an electronic medium, such as rapid searching through your notes for key words and easy navigation within your notes. An integral part of the application is the built-in capability to carry out comprehensive nucleic acid and protein sequence analyses. Defining a sequence analysis is straightforward and intuitive, and provides a reproducible way for users to share analysis parameters. We firmly believe that you should spend most of your time doing analyses rather than trying to figure out how to run the program. This has been our guiding philosophy. Three Main Parts of the Application: There are three main parts to the Gene Inspector application: the Sequence Editor, the Analysis Setups, and the GI Notebook. These parts work together to provide a well defined
175. ence used in the analysis is different from the version of the sequence saved in the file. This means that an update is needed. Hotlinks can be very useful. For example, you might create a notebook containing many analyses, all hotlinked to a specific sequence. When you want to perform this set of analyses on a new sequence, just paste in the new sequence in place of the original sequence and then perform the auto recalc. For more, see Links, page 6-34. This concludes this tutorial. If you choose to continue to the next tutorial, close all open windows now. (Page 2-22, Tutorials: Hotlinking Analysis Results; Page 2-23, Tutorials: Multiple Sequence Alignments) TUTORIAL 5: MULTIPLE SEQUENCE ALIGNMENTS. In addition to storing and displaying sequences, the sequence editor is also a convenient place from which to launch multiple sequence alignments and to fine-tune the alignments once they are generated. Multiple sequence alignments can also be created as a new analysis (see Multiple Sequence Alignments, page 3-10). In this tutorial we focus on performing multiple sequence (Dialog shown: Align multiple sequences. Table: BLOSUM62. Step One, Pairwise Grouping: k-tuple word size, maximum gap length, gap penalty, number of top diagonals to use. Step Two, Multiple Sequence Alignment: gap creation, gap extension.) Figure 2-17: Align Multiple Se
176. ences menu item to launch the detailed single sequence analysis on the chosen sequence. You will see the analysis setup panel open with all appropriate parameters already filled in for you. Press Run to conduct the analysis. The results will be seen as a new object in the notebook, as shown in Figure 2-23. This concludes this tutorial. You may quit or continue on to the next tutorial. If you choose to continue, close all open windows now. (Page 2-29, Tutorials: Running Summary Analyses) Figure 2-23: Find Sequence analysis (Find sequence, Chick musc Ach Recpt). (Page 2-30, Tutorials: Aligning Analysis Objects) TUTORIAL 7: ALIGNING ANALYSIS OBJECTS. 1. This tutorial will describe how to align and automatically resize objects. To do this, you must first generate some output objects to be aligned. For this tutorial, choose to do the Accessible Surface Area analysis on lamprey, octopus, and xenopus rhodopsins (they are in the peptide sequences folder). Accept all the default parameters and choose Run, as described earlier in this chapter. 2. Click on the top output object and make it wide and short, as shown in (Plots shown: Accessible surface area for Lamprey rhodopsin, Octopus rhodopsin, and Xenopus rhodopsin; x-axis: Amino acid.)
177. eptide sequence into a new sequence window. 8. Click on the name ORF1 in the new sequence window and then choose Sequence > Sequence Info. Note that the program automatically generated appropriate text to help identify the source of the protein. This is a good example of how the Gene Inspector presents you with intuitive ways of following up on your natural thought process. Once you have examined a DNA sequence for possible coding regions and have identified one, you are likely to want to create a corresponding peptide sequence for further analysis. This concludes this tutorial. You may quit or continue on to the next tutorial. If you choose to continue, close all open windows now. (Page 2-57, Tutorials: Dot Matrix Analysis, Another Interactive Analysis) TUTORIAL 18: DOT MATRIX ANALYSIS, ANOTHER INTERACTIVE ANALYSIS. 1. The dot matrix analysis deserves some special attention. Choose Analysis > New Analysis, choose to do a protein analysis, select the Dot Matrix analysis, and click the OK button. 2. Click on the Input Sequence icon along the left of the setup panel. As your two sequences, select the proteins Drosophila hsp22 and Drosophila hsp23 (they are both in the Dros Hsps peptide file). 3. Select Dot Matrix on the left of the Analysis Setup Window and examine the panel that appears (Figure 2-41). Window size is the length of the
178. es (Table 1, col. 2 of the paper) were standardized to glutamine as suggested by the authors. This analysis is identical to a Hydropathy analysis using the Manavalan & Ponnuswamy table. Temperature Factor: This sliding window analysis is based on atomic mobilities (also called temperature factors) for amino acids in proteins whose structure is known through x-ray diffraction studies. It is a good indication of antigenicity. See Tainer et al., Nature 312:127 (1984). Transmembrane Helices: Transmembrane helix analyses are designed to identify hydrophobic alpha-helical or beta regions of proteins that are likely candidates to be membrane-spanning domains. This sliding window analysis uses the following tables of values: Argos et al. (page A-1), Eisenberg et al. (page A-1), and Engelman et al. (page A-2). The best window size to use for a membrane-spanning domain is 19-20 amino acids. BLAST Search: For information on using this analysis, see the DNA BLAST discussion on page 4-47. The only difference is that you will need to choose a comparison table (a matrix to use for scoring the matches in the database). (Page 4-70) Chapter 5: The GI Notebook. Overview of the GI Notebook: The Gene Inspector Notebook can be used the same way an ordinary paper lab notebook can be used: for notes about experiments that are being done, to record ideas you might have for future research, to paste in the results o
179. es of the colors (orange, bright green, purple, gold, etc.). Another possibility is to name the colors using a descriptive name such as "Lisa's Text", "Important", or "Weak Data". In this way, when you see a specific color used in the GI Notebook, it will tell you something about that particular text. After typing in a name, press the OK button and specify the color using the standard color picker. 3. Press OK to add this color to the Format > Color menu. Once you have pressed OK, you can check that it has been added successfully by selecting Format > Color. 4. Note that in Figure 2-27 only the Choose color from palette button was enabled. This is because nothing was selected in the GI Notebook. If you see a color in the notebook that you want to use again, but this color is not one of the colors in the menu, select the item which has the color you want to add and then choose Format > Color > Add Color To Menu. 5. As you will see in subsequent tutorials, a number of other menu items can be customized (Style Sheets, Frames, Analyses, etc.). In this tutorial we will do one more to give you a familiarity with how the customizable menus work. Choose Format > Numeric Format > Add Format To Menu. You will see a dialog like the one in Figure 2-27. It should look familiar. Type in "two decimal places" as the name and press OK. (Page 2-34, Tutorials: Customizing Gene Inspector Menus) 6. You will now s
180. extra pixels to be placed between each line of sequence. One Letter AA Code: The One Letter AA Code menu item will alter the display of translated sequences to show a one-letter amino acid code. This item works only on translated sequences. Three Letter AA Code: The Three Letter AA Code menu item will alter the display of translated sequences to show a three-letter amino acid code. This item works only on translated sequences. Grouping: The Grouping menu item specifies how the letters in the nucleic acid or protein sequence will be organized. A group is the number of characters in the sequence that are drawn on the screen before a space is inserted. Grouping makes the sequence easier to read. Groups of ten work well for nucleic acid sequences and for protein sequences. If the nucleic acid sequence is translated, then groups of three will allow the nucleic acid sequence to line up with the translated sequence. Groups of Three: The Groups of Three menu item will organize the selected sequence into groups of three characters, starting with the first character selected. (Page 6-42, Menu Items) Groups of Ten: The Groups of Ten menu item will organize the selected sequence into groups of ten characters, starting with the first character selected. No Grouping: The No Grouping menu item will remove all grouping from the selected sequence. Other Group Size: The Other Group Size menu item will allow you to specify
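As a rough illustration of the Grouping behavior described above (a space inserted after every group of characters, counting from the first character selected), here is a small Python sketch. The function name `group_sequence` is hypothetical and the code only mimics the display effect; it says nothing about how the program stores or draws sequences.

```python
def group_sequence(sequence, group_size):
    """Insert a space after every `group_size` characters.

    A group_size of 0 stands in for the "No Grouping" choice
    (an assumption made for this sketch).
    """
    if group_size < 0:
        raise ValueError("group_size must not be negative")
    if group_size == 0:
        return sequence
    groups = [
        sequence[start:start + group_size]
        for start in range(0, len(sequence), group_size)
    ]
    return " ".join(groups)


if __name__ == "__main__":
    dna = "ATGGCCAAGTTGACC"
    print(group_sequence(dna, 3))   # codon-sized groups line up with a translation
    print(group_sequence(dna, 10))  # groups of ten for easier reading
```

Groups of three are the natural choice when a translation is displayed under the nucleic acid sequence, because each group then corresponds to one codon.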
181. f current experiments, and to discuss those results. But the GI Notebook does a great deal more than your paper notebook. In addition to serving as a container for GI analysis output, the GI Notebook can be used to archive information and analysis results, and to design and print posters and illustrations. Aids to navigation, such as bookmarks and aliases, allow you to maintain notes pertaining to several projects in the same notebook and to cross-reference them through hypertext links. Appendices allow large amounts of data to be stored and retrieved without interfering with the flow of discussion in the GI Notebook. Text is entered into the GI Notebook by typing, just as in a word processor. The text can be formatted by choosing the Font, Style, Size, Color, and Justification submenus under the Format menu. Rulers and tabs behave in the standard way (see Figure 5-1, page 5-2). Just drag a tab out of the tab icon in the ruler to the location along the ruler at which you want the tab to reside. You can specify left, right, center, and decimal tabs. The Gene Inspector displays the whole notebook sheet so you can see what the actual output will look like. You can choose to show or hide text margins, printable area, and page breaks using the Notebook > Display > Set Display Preferences dialog shown in Figure 5-2, page 5-3. Using this dialog, you can decide which adornments to show and also define the color to use for each border. None of these ador
182. f the keys using the mouse will speak that particular character. If you select the Edit radio button and then press a character, you will see Figure 3-8, page 3-8. Using this dialog and a microphone hooked up to your Mac, you can record your own sound to be played back when that particular character is pressed. (Page 3-8, The GI Sequence Editor) Confirming Sequences: As mentioned above, sequences can be confirmed either by speech (Mac only) or by retyping. To do this requires opening the Sequence Monitor by choosing Sequence > Show Sequence Monitor. This will show Figure 3-9 (the left figure is what would be seen on Windows and the right is what would be seen on a Mac). This windoid provides information about the file that is being used (acetylcholine receptors), the sequence selected in the file (chick musc AchRec), and the range of characters selected in that particular sequence (chick musc AchRec nucleotides 5 through 64). Figure 3-9: Sequence Monitor (both versions show the File, Seq, and Sel fields and the key assignments A 1, C 2, G 3, T/U 4, N 5; controls include Map keys, Confirm Reentry, Speak nucleotides while typing, and Read Sequence). The information in the sequence monitor may be of value to you often, so you might choose to have the sequence monitor open whenever you are editing a sequence. Keyboard map
183. f the Analysis Setup Panel. The Table popup menu offers choices for the tables of values to be used in the analysis calculations. Choose the Argos et al. table. The Style popup menu allows you to set a style for the output. We will talk about style sheets later (Tutorial 10: Creating and Using Style Sheets, page 2-38); for now, leave it at the default value. 4. Click on Input Sequences on the left of the Analysis Setup Window and press the Add button (Figure 2-9, page 2-16). This will bring up the sequence chooser window shown in Figure 2-10, page 2-16. Find the Peptide Sequences folder, which is inside the GI Seqs folder, and click on the rhodopsins file. This file contains a number of rhodopsin sequences, which will appear in the bottom left list as soon as you click on the file in the top left list. Note that the list on the top left is a list of files, similar to what you see in other applications when you choose to open a file. However, because the Gene Inspector allows you to store multiple sequences in a single file, you need to specify not only the name of the file but also the name of the sequence(s) within the file with which you wish to work. This is the reason for having the more detailed dialog box shown here. If you press the Add rhodopsins >> button or double click the file name in the top left list, the entire file will be added to the Chosen files and sequences list in the bottom (Page 2-15, Tutoria
184. fault. Panel help text: Match Scoring determines how scoring will be calculated (identity table matches identical residues only). Gap Scoring defines a penalty for inserting a gap (creation penalty) and for extending the gap one residue (extension penalty). Choosing Upper path or Lower path for the Traceback path will usually give alternative alignments having the same score. Z scores indicate significance of alignments but take extra time to calculate. Figure 4-12: Global Alignment Panel for Nucleic Acids. the alignment score for each comparison. One possibility is to define all matches as having a score of 1, but also give a score of 0.5 for purine matching purine or pyrimidine matching pyrimidine. Using the identity table option allows you to specify the value to be added for an identical match and the value to be subtracted for a mismatch (just enter a negative value for the mismatch score, as shown). Gap scoring allows you to define how the score should be adjusted to compensate for creating and elongating gaps. The gap creation penalty is the value subtracted from the score for the creation of a single nucleotide gap in one of the sequences. The gap extension penalty is the value subtracted from the score for each nucleotide in the gap. Depending on the underlying biology, sometimes it makes more sense to have a gap extension penalty of zero (an insertion or deletion of a segment of DNA that occurs as one event rather than a series of individual nucleotide
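To make the match and gap scoring concrete, the following Python sketch scores two rows of an alignment that already exists. It is a simplified illustration, not the program's alignment routine: the function name, the default values, and the reading that a gap of length L costs one creation penalty plus L extension penalties are assumptions made for this example.

```python
def score_alignment(row_a, row_b, match=1.0, mismatch=-1.0,
                    gap_creation=2.0, gap_extension=0.5):
    """Score two equal-length aligned rows that use '-' for gap positions.

    Assumption: a gap of length L costs gap_creation + L * gap_extension.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    score = 0.0
    open_gap = None  # which row currently has an open gap: "a", "b", or None
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot be a gap in both rows")
        if x == "-" or y == "-":
            row = "a" if x == "-" else "b"
            if open_gap != row:
                score -= gap_creation  # charged once, when the gap is opened
                open_gap = row
            score -= gap_extension     # charged for every position in the gap
        else:
            open_gap = None
            score += match if x == y else mismatch  # mismatch is a negative value
    return score


if __name__ == "__main__":
    print(score_alignment("ACGT--AC", "ACGTTTAC"))
    # With an extension penalty of zero, a long gap costs the same as a short one.
    print(score_alignment("ACGT--AC", "ACGTTTAC", gap_extension=0.0))
```

Setting the extension penalty to zero models an insertion or deletion as a single event, so the score no longer depends on how long the gap is.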
185. ferent parts of a user table are shown in Figure 6-32. Figure 6-32: A User Table (columns NaCl mM and Activity, with the column headers labeled). Show/Hide Column Headers: Show/Hide Column Headers will toggle the display to either show or hide the column headers (the top horizontal row). Show/Hide Row Headers: Show/Hide Row Headers will toggle the display to either show or hide the row headers (the left vertical column). Insert Row/Column: The Insert Row/Column menu option will be enabled whenever either a row or a column is selected. It will insert either a row above the currently selected row or a column to the left of the currently selected column. Delete Row(s)/Column(s): The Delete Row(s)/Column(s) menu option will delete the selected rows or columns. (Page 6-53, Menu Items) Add Column(s) At Right: The Add Column(s) At Right menu option will place additional columns at the far right end of the table. This is not the same as inserting a row or a column internally in the table. Add Row(s) At Bottom: The Add Row(s) At Bottom menu option will place additional rows at the very bottom of the table. This is not the same as inserting a row or a column internally in the table. Adjust Size To Contents: The Adjust Size To Contents menu item will adjust the size of the table object so that it will exactly contain the entire contents. Thi
186. g the Enable popup menu specifies which files are shown in the files to open list. The popup menu allows you to define the format of the file to be imported. Importing is discussed in detail in Importing Sequences, page 3-19. If you are unsure about the format of the contents of a file, you can select it in the scrollable list and then press the File Info button (Mac) or just look at the bottom of the dialog box (Windows). The Gene Inspector will attempt to read the file in the format indicated and will provide you with appropriate information about the file contents. In the case shown in this figure, a GCG sequence file was being viewed. Import Nucleic Acid Sequence: Selecting this choice will provide the same dialog box as for Import Peptide Sequence. You can import text sequence files into a new sequence window or add them to sequence windows already open. About Importing Sequences: Please note that Textco BioSoftware has obtained information about the file formats from a number of sources and has used that information to design the import functionality in Gene Inspector. However, any other vendor has a right to change their file formats, as does Textco, to better suit their applications. We do our best to keep up with changes, but sometimes we might miss a change. If you find an incompatibility in the importing, please let Textco know about it. By far the safest way to import and export sequences is through one of the more
187. genicity analysis is based on the partitioning of model peptides on an HPLC column. Based on Parker et al., Biochemistry 25:5425 (1986). The Hydropathy and Antigenicity analyses using this table are identical. Sweet and Eisenberg: Based on data from Sweet & Eisenberg, J. Mol. Biol. 171:479 (1983). This table of data is derived by correlating data from a number of other hydropathy tables and from observed amino acid replacement rates. This Hydropathy analysis is identical to the Optimal Matching Hydrophobicity analysis. (Page A-3, Appendix: Tables) Thornton et al.: This antigenicity table is based on side chain protrusion from the protein backbone. This is based on Thornton et al., EMBO J. 5(2):409 (1986). The Antigenicity analysis with the Thornton table is identical to the Side Chain Protrusion analysis. von Heijne: Based on data from von Heijne, Eur. J. Biochem. 116:419 (1981). This table is based on the known occurrences of specific amino acids in both prokaryotic and eukaryotic signal sequences. This Hydropathy analysis is identical to the Signal Sequence analysis. Welling et al.: This antigenicity table is based on known antigenic regions in a sample protein set. See Welling et al., FEBS Letts. 188(2):215 (1985). Wolfenden et al.: This hydropathy analysis is based on free energy of transfer between vapor phase and solution for amino acid side chain analogs (Wolfenden et al., Science 206:575, 1979). This Hydro
188. > Hide Overview. This will hide the overview pane at the top of the sequence window, which is no longer helpful because all sequences being displayed are now the same length. Figure 2-18: Aligned Sequences (callouts: scoring row, consensus row, shading). 6. Choose Sequence > Consensus > Show Consensus Row. This will add a new row above the sequences showing the consensus sequence (the most common character in that particular position of the alignment). 7. Choose Sequence > Consensus > Show Scoring Row. This will add a row containing a histogram indicating how good the match is at each position. 8. Finally, choose Sequence > Consensus > Show Shading. This will add shading to the document, which should now look similar to Figure 2-18. The shading indicates which residues match the consensus sequence residue. The more residues that match the consensus residue, the more intense the shading will be. 9. Click on the word SCORE in the name column to select the entire scoring row. Try choosing different colors and patterns using the Format menu. Notice how the shading changes to reflect your choices. 10. You can also choose to place custom adornments on the aligned sequences. Choose Sequence > Consensus > Custom Score Adornments and try some of the options for depicting the aligned sequences in the exact arrangement you want. For more details on custom score adornments, see Sequen
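The consensus row and scoring row described in steps 6 and 7 can be illustrated with a short Python sketch. This is a minimal example under stated assumptions, not the program's method: the function name is hypothetical, the score is taken to be the fraction of rows that match the consensus character, and ties are resolved in favor of the character seen first in the column.

```python
from collections import Counter


def consensus_and_scores(aligned_rows):
    """Return (consensus string, per-column match fractions).

    The consensus character is the most common character in each column.
    Assumption: the score is the fraction of rows matching that character.
    """
    if not aligned_rows or len({len(row) for row in aligned_rows}) != 1:
        raise ValueError("need at least one row, all of the same length")
    consensus = []
    scores = []
    for column in zip(*aligned_rows):
        residue, count = Counter(column).most_common(1)[0]
        consensus.append(residue)
        scores.append(count / len(aligned_rows))
    return "".join(consensus), scores


if __name__ == "__main__":
    rows = ["ACGT", "ACGA", "ATGA"]
    consensus, scores = consensus_and_scores(rows)
    print(consensus)
    print(scores)
```

A fraction of this kind could drive both a histogram height and a shading intensity, since both are described as growing with the number of residues that match the consensus.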
189. > Speak Typing to have the program speak each character as you enter it. By using this approach, you can concentrate on reading the sequence by eye from your gel or printed sequence and entering it without having to view the screen to see if you have made any mistakes. If you err, the program will either beep at you for pressing an illegal character or will speak the wrong character you entered to make you aware of the error. (Page 3-7, The GI Sequence Editor) The speech parameters can be changed by choosing Sequence > Speech Prefs. This will bring up Figure 3-7. Figure 3-7: Speech Preferences Dialog (Sounds: Review or Edit, with an on-screen keyboard; Speed: Slow, Medium, or Fast; Pause between groups, in seconds). The bottom part of the window allows you to define the reading speed the program will use to read sequences. You can choose either Slow, Medium, or Fast by pressing the corresponding radio button. When a sequence is being read back directly from the sequence window (see Confirming Sequences, page 3-9), it will pause between each group. The length of the pause can also be set in this dialog. Figure 3-8: Recording a Sound. The top part of the window allows you to either Review or Edit the sounds associated with each letter. Selecting the Review radio button and then pressing any o
190. h translation tables and codon preference tables contain information about the translation of codons into amino acids for a given organism. The codon preference tables contain additional information dealing with the relative frequency of occurrence of each codon in an organism's DNA for those genes already sequenced. If you are working with an organism for which you do not already have a table, you will need to create one. This section explains how to do this.
[Figure 4.9: Creating a New Translation Table]
You cannot edit the built-in tables, but you can create new ones by choosing Analysis > Tables > Create New. This will produce the dialog shown in Figure 4.9. The list on the left of the dialog shows the different kinds of analyses for which tables are used. Clicking on an item in the left list (Open reading frames in this case) will bring up a list of tables currently available for the particular
191. he formatting information for displaying the analysis output. If the amount of disk space used is large and you are finished experimenting with the thresholds, you can choose to discard data, as discussed in the text related to Figure 4.23 (page 4-29). This will leave a picture of the output in the notebook and will allow you to recalculate the analysis, but you will not be able to alter the thresholds for this particular output object.

Sharing Setups With Colleagues
You may find that you have defined a number of Analysis Setups that work well for your purposes and contain all the parameters you find most useful. If you would like to share a setup with a colleague, it is easy to do. Each Analysis Setup is stored as a file on your hard disk. The setups are stored in the Analysis Setups folder, which resides in the GI Data folder. The GI Data folder is in the same folder as your Gene Inspector application. To send a setup to a colleague, just copy the setup file from the Analysis Setups folder and give it to your colleague. When your colleague receives it, he/she should place it into the Analysis Setups folder on his/her hard disk. The next time Gene Inspector is started, the analysis setup will be available under the Analysis menu along with all the other stored setups.

Printing and Viewing Large Objects
Very often you might have large objects such as long lists of restriction sites or
192. he values in the table the higher the value will be for the plot. The output from a CodonPreference analysis is shown in Figure 4.20. One curve is drawn for each reading frame. Any values above the cutoff line are likely to be coding regions with a 95% confidence level. In this case a clear coding region can be seen in reading frame 3, stretching from about 1600 to about 3700 nt. Other options are also available to be displayed in the CodonPreference plot.
[Figure 4.20: CodonPreference Output (Dros hsp70); x-axis: Nucleotide]
Some codons are used only rarely in an organism (usually less than 10% of the time for any set of synonymous codons). These codons are likely to be found only infrequently in any real coding region. By plotting rare codon appearance along the length of the plot, one can often identify likely coding regions. Notice how the rare codon frequency tick marks are much sparser within the actual coding region in reading frame 3 than anywhere else in the plot. This provides an additional indication that the region is a true coding region. You can specify that you want to plot rare codons in the setup panel (Figure 4.19, page 4-23).
Finally, the plot draws actual open reading frames (ORFs) as arrows. ORFs represent segments of DNA that do not contain any stop codons. You can specify the minimum length th
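The rare-codon idea above (a codon used less than about 10% of the time within its set of synonymous codons) can be sketched as follows. This is only an illustration: the function name `rare_codons`, the input format (a dictionary of codon counts), and the strict "less than cutoff" comparison are assumptions, and the standard genetic code is assumed throughout.

```python
from collections import defaultdict

# Standard genetic code in the usual TCAG ordering ('*' marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rare_codons(codon_counts, cutoff=0.10):
    """Return the set of codons used in less than `cutoff` of their synonymous family."""
    family_totals = defaultdict(int)
    for codon, count in codon_counts.items():
        family_totals[CODON_TABLE[codon]] += count

    rare = set()
    for codon, count in codon_counts.items():
        total = family_totals[CODON_TABLE[codon]]
        if total > 0 and count / total < cutoff:
            rare.add(codon)
    return rare


# Hypothetical leucine codon counts, for illustration only.
leu_counts = {"TTA": 2, "TTG": 8, "CTT": 10, "CTC": 20, "CTA": 5, "CTG": 55}
print(sorted(rare_codons(leu_counts)))
```

The counts in `leu_counts` are made up for the example; a real table would come from the codon usage of genes already sequenced for the organism.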
193. headers to set their text properties. Tables may be useful for listing lanes in an electrophoresis gel, recipes for buffers, results of assays, or other data you want to format as a table. You might also want to create tables containing buffer recipes and then place the tables into appendices (see Appendix Objects, page 5-16). This will allow the buffers to be recalled at any time from any location in the GI Notebook.
Once a table is created, you can shrink the table object to a size smaller than the space required by the table columns and rows. To do this, select the table (click once with the mouse on the table object) and then resize it by dragging one of the object's handles with the mouse. This is useful for very large tables. To view information in the table that is cropped out of the viewing area, hold down the option key and drag the mouse to scroll the table cells within the table object area. Note that with the option key held down, the mouse cursor turns to a hand, indicating that you can push the table around. You can also open the table for editing (see Open for Editing in the next section).
Copying and pasting can be used to transfer tables of information between GI and other applications. User Tables can be targeted and the values copied as tab-delimited text. This can be pasted into other applications that can receive such text. Conversely you have tab delimite
194. helices. This hydropathy analysis is identical to the Engelman et al. Transmembrane Helix analysis.

Hopp and Woods
This analysis is based on free energy changes in amino acid side chains between water and ethanol (Proc. Nat. Acad. Sci. USA 78:3824, 1981). It was designed to determine antigenicity but has become popular as a standard hydropathy analysis. The Antigenicity and Hydropathy analyses using this table are identical.

Janin
Based on values in Janin (Nature 277:491, 1979), which determined the surface accessibility of amino acids. The ratios of buried/accessible values in the paper (Table 1, column 4) were converted to the fraction accessible. This Antigenicity analysis is identical to the Accessible Surface Area analysis.

Kyte and Doolittle
This hydropathy analysis table is based on an aggregate scale obtained by several methods and fine-tuned manually. The original paper recommends a window of 7, but 19-21 is also useful for determining membrane-spanning segments (J. Mol. Biol. 157:105, 1982).

Manavalan & Ponnuswamy
Based on data from Manavalan & Ponnuswamy (Nature 275:673, 1978). This data indicates the likelihood that an amino acid will be surrounded by hydrophobic amino acids. Values in Table 1, col. d were standardized to Gln as suggested by the authors. This Hydropathy analysis is identical to the Surrounding Hydrophobicity analysis.

Parker et al
This hydropathy or anti
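The tables above are all used in the same way: a window slides along the peptide and the table values inside the window are averaged. A minimal sketch of that idea, using the published Kyte and Doolittle values and a window of 7, is shown here. The function name `hydropathy_profile` is hypothetical, and details such as how Gene Inspector treats the sequence ends are not covered.

```python
# Kyte and Doolittle hydropathy values (J. Mol. Biol. 157:105, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(peptide, window=7, table=KYTE_DOOLITTLE):
    """Return the mean table value for every full window along the peptide."""
    values = [table[residue] for residue in peptide.upper()]
    if window < 1 or window > len(values):
        return []
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


# Positive averages suggest hydrophobic stretches, negative ones hydrophilic.
print(hydropathy_profile("MKTLLLTLVVVTIVCLDLGYT", window=7))
```

Passing a different dictionary as `table` would give the corresponding analysis for another scale, and a window of 19 to 21 could be tried when looking for membrane-spanning segments.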
195. here as well. If you drag a sequence segment to a different sequence window, you will be able to drop it into any sequence in the target sequence window. Drag & Drop of a segment into the same sequence is like Cut & Paste. Drag & Drop of a segment into a different sequence is like Copy & Paste.

Entering and Checking Sequences
Entering a sequence into the Gene Inspector is done through the keyboard or by importing files saved on disk in other formats. Once a sequence is entered, there are two ways to verify it: by reentering the sequence or by having the Gene Inspector read the sequence back to you.

Mapping the Keyboard
To facilitate sequence entry for DNA, you can reconfigure the keyboard to use additional keys for entering nucleotides. Choose Sequence > Map Keys to bring up Figure 3.6.
[Figure 3.6: Mapping Keys. Key mappings: A = 1, C = 2, G = 3, T/U = 4, N = 5]
In this case the keyboard has been reconfigured so that typing a 1 or an A will enter an A in the active sequence document, a 2 will generate a C, 3 a G, 4 a T (U in RNA), and 5 an N. This makes it easier to enter sequences without making mistakes and without straining your fingers. This option is not available for entering peptide sequences.

Defining Speech Preferences (Mac only)
You can also have aural feedback as you enter sequences (this is a Mac-only feature because there is no built-in speech generator on Windows). Choose Sequence
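The key mapping in Figure 3.6 amounts to a small lookup table. The sketch below is an illustration under stated assumptions: the names `KEY_MAP`, `LEGAL_BASES`, and `translate_keystrokes` are hypothetical, and the set of legal characters is deliberately limited to A, C, G, T, U, and N, which is narrower than what the program itself may accept.

```python
# Mapping from Figure 3.6: 1 -> A, 2 -> C, 3 -> G, 4 -> T (U in RNA), 5 -> N.
KEY_MAP = {"1": "A", "2": "C", "3": "G", "4": "T", "5": "N"}
# Assumption: only these characters count as legal for this illustration.
LEGAL_BASES = set("ACGTUN")


def translate_keystrokes(typed, rna=False):
    """Convert typed keys to a sequence; return (sequence, rejected keys)."""
    sequence = []
    rejected = []
    for key in typed:
        base = KEY_MAP.get(key, key.upper())
        if base not in LEGAL_BASES:
            rejected.append(key)  # the real program beeps at illegal characters
            continue
        if base in ("T", "U"):
            base = "U" if rna else "T"
        sequence.append(base)
    return "".join(sequence), rejected


print(translate_keystrokes("1234a5x"))
```

Letter keys continue to work alongside the number keys, which matches the description that typing a 1 or an A enters an A.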
196. hold box, press the Change Threshold button to change the 20 to a 30. Next select the 0 from the list and change it to 20 by typing 20 in the threshold box and pressing Change Threshold. Convert -20 to 10 and -40 to 0 using the same procedure. You have now changed the thresholds from 40, 20, 0, -20, -40 to 40, 30, 20, 10, 0 and left your color scheme intact. Press Reformat and notice how this changes the display. This reformatting ability lets you play with the dot matrix results and find the best settings to display your data.
[Figure 2.43: Selecting a Region in a Dot Matrix Plot]
9. Double-click the dot matrix output object to make it the target.
10. Use the mouse to drag a rectangle within the plot from about the middle of the plot down to the lower right corner, selecting the diagonal line as shown in Figure 2.43. You have now selected a range of each sequence that shows similarity in the dot matrix plot.
11. Choose Object > Align Selection. This allows you to do a sequence alignment directly from the dot matrix plot. Set up the alignment parameters to match those shown in Figure 2.44, page 2-61 (see Align 2 Sequences Global, page 4-16, for more detail) and then Run the alignment. This will generate an alignment of the regions of the two sequences you selected in the dot matrix plot. Note that you have just used an output object
197. how this works, choose Notebook > Tool Extensions > gel table and click the mouse in the notebook to place a new gel table. The table will be placed with its top left corner at the location of the mouse click. Tool extensions provide an easy way to keep frequently used objects, whether they be graphic objects, tables, or even analyses.
18. Feel free to explore other aspects of this notebook to see how they might be useful to you in your work. Notice how the Gene Inspector sequence analyses are integrated with the discussion and experimental results. This provides an easy way to keep track of all your work in one place rather than having to run multiple applications, each with its own purpose. When you are finished exploring the notebook, close it by choosing File > Close or by clicking the close box in the top left corner of the window. When asked if you want to save changes, just say Don't Save so that others can go through this tutorial in the future.

TUTORIAL 2: EDITING SEQUENCES
This tutorial focuses on the sequence editor. Together with Tutorial 1 (Tour of a Gene Inspector Notebook) and Tutorial 3 (Using Analysis Setups), this tutorial provides an overview of the three main parts of the Gene Inspector.
1. In this tutorial you will learn how to open and manipulate sequences stored within Gene Inspector sequence files
198. hown in Figure 5.9 (page 5-11). For analysis output objects like this one, you can see the analysis name, the sequence analyzed, and information about any table used in the analysis. Comments can be typed directly into the Comments box, and the Title can be changed by typing in the Title box.

Text Objects (Sidebar Text)
Text objects are separate text boxes into which you can type text. This text is separate from the background text, will not flow with the background text, and represents a separate GI Notebook object. These kinds of objects are sometimes called sidebar text in other programs. To create a text object, select the T tool from the Notebook Tools menu (Figure 5.5, page 5-7) and use the mouse to drag a rectangle in the GI Notebook corresponding to the area into which you want to type. Let go of the mouse button and then type in the text.
Text objects are useful for creating titles for figures or titles for whole GI Notebook sheets. They can span multiple printer pages and can serve as titles for multiple columns of text. Text objects can also be used to annotate specific figures with descriptive text; this avoids the possible problem of having text describing a figure move when you edit background text. Text objects will remain at the same location on a GI Notebook sheet, just like other GI Notebook objects, even if the background text is edited.

Table Objects (User Tables)
The Gene Insp
199. iately on the sheet. The Save as default page layout button in the bottom left of the dialog allows you to save the current configuration for notebook layout, as defined in this window, as the default configuration for all new GI Notebooks that you create. Note that after making changes in the layout parameters, you need to click in the box in the upper right to see the new layout as a graphic view. Clicking in that area tells the program to update the display. This is necessary to prevent the display from updating automatically after you enter each parameter and before you have completely redefined the new layout.

Bookmarks
Bookmarks are designed to help you navigate to different locations within the GI Notebook. GI bookmarks function in the same way that paper bookmarks function: you can use them to mark specific locations in the GI Notebook. Each bookmark is anchored to its location in the notebook by being attached to a notebook object; any object can have a bookmark attached to it.
To create a bookmark, select an object to use as the anchor for the bookmark and then choose Notebook > Bookmarks > Attach Bookmark. You will be asked to name the bookmark. Once the bookmark is named, it will appear in the Bookmarks submenu with its own name. Simply choosing the bookmark from the Bookmarks menu will take you to that location in the notebook. This is illustrated in Tutorial 19: Using Bookmarks in the GI Notebook, page 2-62. One
200. ic acid sequences you can enter comments about the sequence and can specify the starting position of the first residue.

• New Sequence
Choosing the New Sequence menu item will create a new empty sequence in the currently active sequence window. You will be given an opportunity to name the new sequence.

• Insert Xs/Insert Ns
Insert Xs/Insert Ns will allow you to insert ambiguous characters into your sequences: Ns into nucleic acid sequences and Xs into peptide sequences. After a warning notice telling you that the operation is not undoable, you will see the dialog box shown (for nucleic acid sequences) in Figure 6.28. You can specify the number of characters to be inserted.
[Figure 6.28: Inserting Ambiguous Characters]
Inserting characters is a convenient way of putting in placeholder sequences. For example, you might know that there are 2300 nucleotides between two restriction sites in a DNA, but the sequence itself is not known. By inserting 2300 Ns in the sequence, the overall proportions of the sequence are maintained and the known restriction map is preserved even though you do not know the actual sequence. Of course, sequence analysis on this sequence will not be all that informative.

• Generate Random
The Generate Random menu item is similar to the Insert Xs/Insert Ns command in that it inserts new char
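The placeholder example above (2300 unknown nucleotides between two restriction sites) can be sketched in a few lines. The function name `insert_ambiguous` and its arguments are hypothetical; the sketch only assumes that N is the ambiguous nucleotide and X the ambiguous amino acid, as stated in the text.

```python
def insert_ambiguous(sequence, position, count, peptide=False):
    """Insert `count` ambiguous characters (N, or X for peptides) at `position`.

    `position` is a 0-based index into the sequence; the original residues
    on either side are preserved.
    """
    if count < 0:
        raise ValueError("count must not be negative")
    if not 0 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    filler = ("X" if peptide else "N") * count
    return sequence[:position] + filler + sequence[position:]


# Two known restriction sites (EcoRI and BamHI recognition sequences)
# separated by 2300 nucleotides of unknown sequence.
known = "GAATTC" + "GGATCC"
placeholder = insert_ambiguous(known, 6, 2300)
print(len(placeholder))
```

The distance between the two sites is now correct in the placeholder sequence, so map positions downstream of the gap keep their proper coordinates.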
201. [Figure 4.67: Protein Cleavage Setup. Choose a specific Cleavage List using the popup menu. Select cutters to use from the Available Cutters list on the left and move them to the right list of Sites to Mark. Sites can be marked at either the actual cut site (Mark cut sites) or the beginning of the recognition site (Mark recognition sites).]
for restriction enzyme digests of DNA (page 4-42). The sites that are used in this analysis are listed in the Appendix in Protein Cleavage Sites, page A-6. The output can be displayed either graphically or as a table and is similar to the restriction enzyme digest output (Figure 4.39, page 4-43).

Protein Interior
This sliding window analysis uses data from Engelman and Steitz (Cell 23:411, 1981), which indicates the likelihood that an amino acid will lie in the interior of a protein. This analysis is identical to a Hydropathy analysis using the Engelman and Steitz table.

Side Chain Flexibility
This analysis uses the algorithm of Karplus & Schulz (Naturwissenschaften 72:212, 1985), which examines the side chain flexibility of a peptide segment. The flexibility is an indication of antigenicity. Flexibility of peptide segments was determined by examining crystal structures of 31 different
202. ifies the size of the dot to be drawn for each match. The numbers represent the size in pixels for each square dot. For plots that are to be used for posters or slides, larger dot sizes are often more visible, so the image can be seen easily from a distance.
Running the analysis will give results like the one shown in Figure 4.22. You can change axis parameters and other items in this output object just as you can in other graphical outputs. However, the dot matrix output provides a number of additional and interactive features. Because the actual comparison data is stored with the output object, you can target the dot matrix output and change thresholds by choosing Object > Reformat and then choosing the Threshold icon in the reformat window. In this way you can fine-tune the display to show only the data you are interested in displaying. By experimenting with different thresholds, it is possible to maximize your signal-to-noise ratio.
[Figure 4.22: DNA Dot Matrix Output]
One consequence of storing all the data with each plot is that it can take time to write all the data to disk, and the resulting analysis output can be very large. In order to provide the flexibility of allowing changes in thresholds in
[Dot Matrix Optimizations dialog: Discarding the data generated by the analysis will reduce the size of the notebook document on disk. A picture of the output will be drawn, but you will need to recalculate the
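The reason thresholds can be changed without rerunning the analysis is that the comparison scores are kept separately from the picture. A minimal sketch of that separation follows. It assumes a simple identity score (the number of matching positions in a window), which stands in for the table-based scoring the program uses; the function names `dot_matrix_scores` and `apply_threshold` are hypothetical.

```python
def dot_matrix_scores(seq_a, seq_b, window=3):
    """Score every window pair as the number of identical positions.

    Assumption: identity scoring stands in for a real scoring table.
    """
    rows = max(len(seq_a) - window + 1, 0)
    cols = max(len(seq_b) - window + 1, 0)
    return [
        [
            sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def apply_threshold(scores, threshold):
    """Decide which cells get a dot, without recomputing any scores."""
    return [[score >= threshold for score in row] for row in scores]


scores = dot_matrix_scores("ACGTACGT", "ACGTTCGT", window=3)  # slow step, done once
strict = apply_threshold(scores, 3)                            # fast, repeatable
relaxed = apply_threshold(scores, 2)
for row in strict:
    print("".join("*" if dot else "." for dot in row))
```

Storing `scores` is what costs disk space; discarding it keeps only the rendered picture, which is why thresholds can no longer be altered afterwards without a recalculation.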
203. igma, Sigma, Boehringer, Boehringer, Amersham, In Vitrogen, In Vitrogen, Stratagene, Boehringer, Pharmacia, Pharmacia, U S Biochemicals USB, U S Biochemicals USB, Sigma, Sigma, Clontech, In Vitrogen, Clontech

428 pZeoSv, In Vitrogen
429 pZeoSVLacZ, In Vitrogen
430 pZEro, In Vitrogen
431 rpDR2, Clontech
432 rpSE937, Clontech
433 Yep24, New England Biolabs
434 Yip5, New England Biolabs

Index
A
about this Manual, 1-1
absorption coefficient, see protein physical characteristics
accessible surface area, 4-48
Add Another Analysis (Analysis Menu), 6-25
Add Column(s) at Right (Table Menu), 6-54
Add Row(s) at Bottom (Table Menu), 6-54
Add Setup to Menu (Analysis Menu), 6-26
Adjust Size to Contents (Features Menu), 6-43
Adjust Size to Contents (Sidebar Menu), 6-52
Adjust Size to Contents (Table Menu), 6-54
adjusting table
204. in standard values from the literature. Creating tables is also discussed in Tutorial 20: Creating Your Own Analysis Tables, page 2-63, and Editing Translation and Codon Preference Tables, page 4-13.

Create New
To create a new table, you must first specify what kinds of analyses the table will be used for. As shown in Figure 6.14, you must choose an analysis from the list on the left, which will specify what dimensions the table needs to be. Once a selection is made on the left, you can choose to create an empty table, or you can create a table filled with values copied from another pre-existing table. In the case shown here, the BLOSUM100 table is chosen as the starting point for the new table. You may also paste table values in from the clipboard. Any tab-delimited text (e.g., from Excel) will be placed appropriately in the table, using the cell you have selected as the top left cell for pasting in the table data from the clipboard.
In this instance, the new table window that is created is given a default name of BLOSUM100 Copy because the BLOSUM100 table was used to fill in the values in the table. This is shown in Figure 6.15. The Edit Info button brings
[Figure 6.15: A New Table Window, with the BLOSUM100 Copy Info dialog (Title: BLOSUM100 Copy; Min Adjective: Similar; Max Adjective: Dissimilar; Description: 21x21 Peptide similarity table)]
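The clipboard behavior described above (tab-delimited text placed into the table starting at the selected top left cell) can be sketched as follows. The function name `paste_tab_delimited` is hypothetical, the table is modeled as a list of rows, and the choice to silently ignore values that fall outside the table is an assumption, not documented program behavior.

```python
def paste_tab_delimited(table, text, top=0, left=0):
    """Write tab-delimited numeric text into `table`, anchored at (top, left).

    Assumption: values that would land outside the table are ignored.
    """
    for r, line in enumerate(text.splitlines()):
        for c, cell in enumerate(line.split("\t")):
            row, col = top + r, left + c
            if row < len(table) and col < len(table[row]):
                table[row][col] = float(cell)
    return table


# A 3 x 3 table of zeros with a 2 x 2 block pasted at row 1, column 1.
grid = [[0.0] * 3 for _ in range(3)]
paste_tab_delimited(grid, "1\t2\n3\t4", top=1, left=1)
for row in grid:
    print(row)
```

Text copied from a spreadsheet has this shape: tabs between columns and a line break between rows.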
205. ind dialog, pasting in the text, and then starting the search.

Replace
The Replace option will perform a search to find a given key word and will then replace that text with a new string of characters. The Replace All button will replace all occurrences of the search text. You should use this option with caution because it is not undoable. The Replace Find Next button will replace the currently selected match and then find and highlight the next match.

• Drag & Drop Options
This menu item allows you to define how you want drag & drop to work within Gene Inspector. As shown in Figure 6.9, you can specify which operations will utilize drag & drop within Gene Inspector. To use drag & drop, first select the item you wish to move and then click and drag the selected object to the new location.
[Figure 6.9: Drag & Drop Editing Dialog, with checkboxes: Enable drag & drop editing of text; Enable drag & drop of notebook objects; Enable drag & drop of sequences]

• Show Clipboard
Show Clipboard opens a new window that will display the current clipboard contents. This is viewed through the Gene Inspector application. When you leave the Gene Inspector, only TEXT or picture information gets passed along to other applications, because other applications cannot recognize the Gene Inspector's internal parameters.

• Show/Hide Page Breaks
It is pos
206. individually and choosing to recalculate each one individually. Selecting Recalc Selected Items is equivalent to targeting an object and choosing Object > Recalculate without changing any of the parameters. If you only want to update a few hot-linked objects in the notebook instead of performing an autorecalc on the whole notebook, this is the best way to do it.

Show Dependencies
Each analysis output object is linked to a sequence, but it is often easy to lose track of which sequence is connected to which analysis object. By selecting an output object and choosing the Show Dependencies menu item, the dialog box shown in Figure 6.23 will appear. The dialog lists the full path name to the sequence that is linked to the output object. It also lists the date the sequence was last modified (September 4, 1995, 7:23 PM in this case).
[Figure 6.23: Show Dependencies Dialog]

• Page Break
The Page Break menu item will place a page break at the location of the insertion point in the background text. This is indicated by an omega (Ω) symbol if Show Invisibles (page 6-33) is turned on. The text following the page break will start on a new printer page. Note that a new printer page is not necessarily the same as a new notebook sheet. See Notebook Layout
207. ing or pending low priority analyses will be put on pause and will wait to execute until the high priority analysis is completed (see also Temporarily Pausing Long Running Analyses, page 7-3). This capability might be useful if you are running a time-consuming analysis (alignment or dot matrix on large sequences) but have a quicker analysis you would like to see without waiting. Starting the quick analysis as a high priority analysis will let it run, and then the slower analysis will resume without having to start over again.
[Figure 6.13: Setting High Priority for an Analysis]

• Tables
The items in this submenu deal with analysis tables, which you can create.
[Figure 6.14: Creating a New Table]
You are not allowed to edit the tables that are built into the program because these conta
208. insertions or deletions, and at other times it might be more appropriate for the gap penalty to have a non-zero value.
If you want the two sequences to be aligned along their entire lengths, starting and ending at the ends of each DNA, then choose to Treat unaligned ends as gaps. Placing an x in this checkbox will cause a gap penalty to be subtracted from the alignment score if one sequence starts or ends before the other. In other words, the non-aligned end segment will be treated as a gap. If this option is not checked, then ends will not be forced to be aligned.
Usually it is possible to generate several alignments having the same score. By choosing a different traceback path (either upper or lower traceback paths), you can see either of the two most different alignments capable of generating this maximum score. The traceback path is a technical term that corresponds to the way the program actually generates an alignment.
Finally, you have the option of calculating a Z score. This is a time-consuming process, but it provides an indication of the significance of an alignment score. When the two sequences are aligned, a score is calculated based on the scoring table chosen and the resulting alignment produced by the analysis. The score is shown in the output object, but often it is difficult to assess the significance of the alignment based on the score (e.g., what does 247.4 mean?). To address this p
209. int Xa 3, 48 PinPoint C, 49 pSl, 50 pSP64 polyA, 51 pSP64, 52 pSP65, 53 pSP70, 54 pSP71, 55 pSP72, 56 pSP73, 57 pSPluc, 58 pSPluc NF, 59 pSV B GAL

Sigma
1 ColE1, 2 pBK614, 3 pBR325, 4 pMB9, 5 pTZ18U, 6 pTZ19U, 7 pUB110, 8 pUC8, 9 pUC9, 10 pYAC4, 11 pYAC55

Stratagene
1 M13 PhageScript, 2 pBC KS, 3 pBC KS, 4 pBC SK, 5 pBC SK, 6 pBlueScribe KS, 7 pBlueScribe KS, 8 pBlueScribe M13, 9 pBlueScribe M13, 10 pBlueScribe M13, 11 pBlueScribe SK, 12 pBlueScribe SK, 13 pBluescript II KS, 14 pBluescript II KS, 15 pBluescript II SK, 16 pBluescript II SK, 17 pBluescript KS, 18 pBluescript KS, 19 pBluescript SK, 20 pBluescript SK, 21 pBS, 22 pBS, 23 PhageScript SK, 24 pRS403, 25 pRS404, 26 pRS405, 27 pRS406, 28 pRS413, 29 pRS414, 30 pRS415, 31 pRS416, 32 pT3T7 lac, 33 pT3T7BM, 34 pWE15

U S Biochemicals USB
1 pAX4a, 2 pAX4a, 3 pAX4b, 4 pAX4b, 5 pAX4c, 6 pAX4c, 7 pAX5, 8 pAX5, 9 pCF20, 10 pMEX5, 11 pMEX6, 12 pMEX7, 13 pMEX8, 14 pT7 0, 15 pT7 1, 16 pT7 2, 17 pTRXN, 18 pTRXN, 19 pXPRS, 20 pXPRS

List of all Vectors Included With Gene Inspector
Vector Name, File Name
210. l (page A-3), Thornton et al. (page A-4), and Welling et al. (page A-4). Each of these tables was created by the authors based on different physical properties of amino acids and peptides.

CF Structure Prediction
This analysis uses the algorithm originally developed by Chou and Fasman (Biochemistry 13:222, 1974; J. Mol. Biol. 115:135, 1977) and later updated in Prediction of Protein Structure and the Principles of Protein Conformation (ed. G. D. Fasman, Plenum Press, New York, 1989, p. 391). Unlike the GOR algorithm (GOR Structure Prediction, page 4-58), this approach looks for nucleation sites to start the formation of alpha helical, beta sheet, or turn structures and then tries to extend the structure from the nucleation site. There are no parameters to be entered by the user for this analysis.
One view of the output is shown in Figure 4.51 on page 4-55. There are three plots indicating the probabilities of being alpha, beta, or turn structures. These three plots are used to calculate the structures predicted in the bottom part of the analysis output object. The blocks in the lower part of the figure indicate the predicted structures. After targeting the object, you can view the output as a Squiggles plot instead of a graph. In the squiggles plot, any segment of the protein not defined as alpha, beta, or turn is called a coil. This will give you an output similar to that for the GOR analysis (Figure 4.56, page 4 5
211. l the objects in the GI Notebook will be selected. For a sequence editor document, if the insertion point is in a sequence, Select All will select all of the residues of that sequence. If a sequence name is selected, then Select All will select all sequences in the document. If the selection is within a targeted object, Select All will behave according to the rules of that particular object.

• Show Selection
Sometimes, if you have a large amount of information or some of the material in the window is not on screen, it is difficult to find out where the cursor or selection is. Choosing this option will bring the selected information (or the cursor, if nothing is selected) into view. If possible, the selection will be centered in the window.

• Find & Replace
Find
Find allows you to search either a sequence document or a GI Notebook. For notebooks, the search dialog is shown in Figure 6.7.
[Figure 6.7: Notebook Find Dialog]
The check boxes allow you to be specific to the case of the matching words (Case sensitive), to continue to search after the end of the document is reached by continuing the search at the beginning of the document (Wrap around), and to search starting at the current site and working towards the beginning of the document (Search backwards). When a match is found, the match is highlighted an
212. larger and larger groups of sequences during the multiple alignment. The final multiple sequence alignment is then accomplished by aligning the various alignments of similar sequences with each other.
You must first choose a table to use for comparing the different sequences. This is done using the popup Table menu in the same way you would choose a table for any analysis. Unless you really know what you are doing or are interested in experimenting, we suggest that you leave intact the default values for the other parameters in the panel. Improper use of the parameters may yield misleading results, especially for the step 1 parameters, so be careful. Varying the step 2 parameters may lead to different alignments and might alter the score of the alignment. What follows is a brief description of what each parameter means, based on the documentation accompanying the Clustal V code.

Step one: pairwise grouping
• k-tuple word size: Can be 1 or 2 for proteins, 1 to 4 for DNA. Increase this to increase the speed of the analysis; decrease the word size to improve sensitivity (detect sequences more distantly related).
• maximum gap length: The number of diagonals around each top diagonal that are considered. Decrease for speed of analysis; increase to improve sensitivity.
• gap penalty: The number of matching residues that must be found in order to introduce a gap. This should be larger than the k-tuple size. This has little effect on speed
213. [Figure 5.3: Notebook Layout Dialog. Controls include the layout style (side-by-side layout or poster sheet layout), poster layout size (measured in inches or in printer pages), columns (printer page columns or full sheet columns, columns per page, space between columns in inches), margins in inches (left, right, top, bottom, binding), and the Save as default page layout button.]
sections to this dialog. Note that the box on the top right contains a graphic indicating the layout of the text in the GI Notebook. The dashed line down the center of this area in the figure shows a printer page break, and the gray area on each printer page indicates the available area for text. The line around the border of the sheet indicates the available printable area for the currently selected printer. (The printed area is defined by using the File > Page Setup dialog.) This particular layout is called a side-by-side layout, as chosen in the Notebook layout style box at the top left. The Poster layout size box allows you to set dimensions that might be appropriate for designing posters. The number of columns of text and the margins can be set in the Columns box and Margins box, respectively. You can change these settings even after you have entered text and objects into the GI Notebook. The text will re-wrap around any objects and the objects will be placed appropr
214. leic Acid Codes A 7 IUPAC Standard Amino Acid Codes A 8 Vectors by Supper A 9 Amersha M Ae cere awe yee eee hee eae eae Pare es A 9 Boehringer hi A as A 9 A Ee e Ee A 9 COMEC aint a ias E lei A 10 page 8 TABLE OF CONTENTS Bl Eh Ree Ss sce e aio le A A A 12 INVITOB EN soto Sede setts o E A 12 New England Biolabs NED A 13 Nova GEN ia tada a a ai we a Re ae A 14 O A 16 Pharmacia simi NEE EN NEEN EEN NEIEN EEN a A ea A 16 d lu HEEN A 18 Sigman Ate a ld stalin a o Sek hide ahs A 19 Stratagene sett hada ees at aS ets Ae eae A 20 U S Biochemicals USB oooooococooconccn ees A 21 List of all Vectors Included With Gene Inspector A 22 Index page 9 TABLE OF CONTENTS page 10 Getting Started with Gene Inspector Chapter 1 Getting Started with Gene Inspector About This Manual This manual contains a number of sections describing the Gene Inspector Because the program was designed from the very beginning to function the way a molecular biologist would think it often does things that other programs fail to do or else might do things differently from the way other programs do To get started with Gene Inspector and to get an overall feel for the program we strongly urge you to go through the Tutorials They will point out some of the differences between the Gene Inspector and other programs and provide you with an overview of the general philosophy of the application
215. length ORF to consider amino acids 100 be J Ss Display C Only ORFs Table None ORFs and rare codons Cutoff 10 00 Table Standard Drosophila melanogaster Input Sequences Output Location zi V Show Icons Style Default The Method box ORFs can be found between stop codons Only Stop Codons button or between start and stop codons Start and Stop Codons button The Display box lets you show ORFs or ORFs and rare codons Choose a codon table appropriate for the DNAs being analyzed using the Fable popup menu The tare codon Cutoff value is based on the frequency of occurence in synonymous codons Figure 4 35 Open Reading Frame Panel ORFs Dros muse AchRec 1 501 1001 1501 2001 2501 3001 3501 4001 4501 Nucleotide Figure 4 36 Open Reading Frame Output this case there is a long ORF in reading frame 3 from about 250 2700 Notice how in this reading frame there are not very many rare codons tick marks which suggests that this ORF may truly be a coding region This in an interactive output object which allows you to extract either the DNA or the peptide sequence corresponding to an ORF To do this target the Page 4 40 Analyses analysis output by double clicking on it and then click once on the ORF of interest This is shown in Figure 4 37 page 4 41 Once the ORF is selected ORFs Dros muse AchRec 501 1001 1501 2001 2501 3001 3501 4001 4501 Nucleotide Figure 4 37 ORF
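The two Method options described above (ORFs between stop codons, or between start and stop codons) and the minimum ORF length can be sketched as follows. This is a minimal illustration under stated assumptions, not Gene Inspector's algorithm: the function name `find_orfs`, the use of the standard stop codons TAA/TAG/TGA and ATG as the only start codon, and the restriction to the three forward reading frames are all choices made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def find_orfs(dna, min_codons=100, require_start=False):
    """Return (frame, start, end) tuples for the three forward frames.

    frame is 1, 2 or 3; start and end are 0-based, end-exclusive
    nucleotide positions, and the terminating stop codon is excluded.
    With require_start=False an ORF is any stretch between stop codons;
    with require_start=True it begins at the first ATG in that stretch.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        # Position just past the last complete codon in this frame.
        frame_end = frame + ((len(dna) - frame) // 3) * 3
        seg_start = frame
        for pos in range(frame, frame_end + 1, 3):
            at_end = pos == frame_end
            if at_end or dna[pos:pos + 3] in STOP_CODONS:
                start = seg_start
                if require_start:
                    # Move to the first ATG in the segment, if there is one.
                    start = next(
                        (p for p in range(seg_start, pos, 3)
                         if dna[p:p + 3] == "ATG"),
                        pos,
                    )
                if (pos - start) // 3 >= min_codons:
                    orfs.append((frame + 1, start, pos))
                seg_start = pos + 3
    return orfs


if __name__ == "__main__":
    example = "ATGAAACCCTAGGG"
    print(find_orfs(example, min_codons=3))
    print(find_orfs(example, min_codons=3, require_start=True))
```

The reverse strand and the rare-codon tick marks shown in the output figure are left out to keep the sketch short.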
216. like that shown in the Analysis Monitor in Figure 4-3. In this case, the low priority dot matrix analyses are put on Pause while the high priority base composition analysis is being run. As soon as the high priority analysis finishes, the low priority analyses will resume.
Input Sequence Panel
Figure 4-4 (page 4-5) shows the Input Sequence panel in the Analysis Setup Window. Because the Input Sequence icon is selected in the list at the left, information relevant to selecting a sequence is shown on the right. The panel contains a list of sequences chosen for analysis (just pBR322 in this figure) and can also be used to define segments of the whole sequence for analysis. This is done using the Entire sequence and Segment radio buttons in the Range section of the panel. When a sequence is selected in the top list, the Segment button can be chosen, and then the to and from fields can be used to define which range of nucleotides or amino acids is to be included in the analyses. For circular DNA sequences, it is possible to select a segment of DNA that spans the origin. Whether the sequence is linear or circular is indicated in the Range section shown in Figure 4-4.
[Analysis Setup window screenshot: Chosen sequences list showing chick musc AchRec (1-2486) and Human musc Ach Rec (1-1913)]
217. ll be making some changes in the sequences in this document and do not want to accidentally change the original sequences 3 Click on the sequence name lamprey rhodopsin in this sequence window to select that sequence Analysis Setup Analyses 1 Inputs 3 Outputs 3 F High Priority Close Run Calculate the occurrence of Mononucleotides Output type 6 rap Display results as Number of occurences Percent of all occurences Je Show Icons Style Default z Use the Calculate the occurence of popup menu to specify mono di or trinucleotide composition The Style popup menu allows a predefined style to be applied to the results Output type specifies whether the results should be presented as a graph or as a table Results can be displayed as Number of occurences or Percent of all occurences using the Display results as buttons Figure 2 12 Simple Base Composition Setup 4 Hold down the shift key and then click on the xenopus rhodopsin Page 2 19 Tutorials Hotlinking Analysis Results sequence name This is a standard method for extending a selection in this case we are extending our selection from just being lamprey rhodopsin to being lamprey octopus and xenopus rhodopsin 5 Leave these three sequences selected in the sequence document and choose Analysis gt New Analysis to start a new analysis Choose to perform a nucleic acid analysis and then select Base Composition If you are not sure how to do this
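The Simple Base Composition options described above (mono-, di- or trinucleotide composition, reported as a number of occurrences or as a percent of all occurrences) amount to counting short words in the sequence. The sketch below is only an illustration; the function name `composition` is hypothetical, and counting overlapping words is an assumption, since the manual does not say whether Gene Inspector counts overlapping or non-overlapping di- and trinucleotides.

```python
from collections import Counter


def composition(seq, n=1, as_percent=False):
    """Count words of length n (1 = mono-, 2 = di-, 3 = trinucleotides).

    Overlapping words are counted (an assumption for this sketch).
    With as_percent=True, each count is reported as a percentage of
    all words counted.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    if not as_percent:
        return dict(counts)
    total = sum(counts.values())
    return {word: 100.0 * c / total for word, c in counts.items()}


if __name__ == "__main__":
    dna = "AACGTTAACG"
    print(composition(dna, n=1))
    print(composition(dna, n=2))
    print(composition(dna, n=1, as_percent=True))
```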
218. lot to be 2 pixels on a side 5 Colors now need to be defined for each threshold Select the 40 in the threshold list and make it red using the Format gt Color menu Next choose the 20 and make it magenta Make 0 green 20 cyan and 40 blue Note how the Color Ranges thermometer on the right reflects your changes Start the analysis by choosing Run 6 When the dot matrix analysis is run the actual comparison data is saved Dot matrix Dros hsp22 amp Dros hsp23 Dros hep22 151 4 T T T T 1 26 51 76 101 126 151 176 Figure 2 42 Initial Dot Matrix Output with the output object so changes in the display can be made easily and quickly after the analysis is completed This allows you to fine tune the dis play to show exactly what you want Your initial output should look like Figure 2 42 Each dot represents a point of similarity between the two sequences 7 Double click on the dot matrix analysis object to make it the target and then choose Object gt Reformat Select the Threshold Settings icon on the left to see a panel similar to what you saw in the initial setup similar to Figure 2 41 page 2 58 You cannot change window size but you can change the thresholds and or their colors By redefining the thresholds you can present the data differently This is useful for reducing the noise level in the plot and highlighting the data of interest 8 Select the 20 in the threshold list and type a 30 into the thres
219. ls Using Analysis Setups Analysis Setup Analyses 1 Inputs 0 Outputs 0 D High Priority Close gt run A See A AA ie Chosen sequences a b range n ambiguities A modified Transmembrane helices Ti o Output Location Add Remove Remove All Range Entire sequence Segment From 0 Start We To 0 Path i show Icons Add and Remove buttons are used to define the Chosen sequences Clicking on a sequence in the list displays its location Path Entire sequence will run all analyses using the entire length of the chosen sequence The Segment button allows the analyses to operate on only a part of the entire sequence A e in the Chosen sequences list indicates that the segment is truncated by a stop codon Figure 2 9 The Input Sequence Panel Choose Peptide Sequences Look in O Peptide Sequences sl e ez ES a acetylcholine recpts pep a rat globins pep a actins pep Je a cytochromes b5 pep el Dros hsps pep a E_coli beta galactosidase pep a lactate dehydrogenases pep File name tnodopsins pep Files of type Peptide Sequence Files v Cancel Add Files or Folders gt gt Show Open Files Done Sequences in document Chosen files and sequences bacteriorhodopsin Halobacterium archaerhodopsin Lamprey rhodopsin Octopus rhodopsin Xenopus rhodopsin Add gt gt Remove Figure 2 10 The Sequence Chooser right Fo
220. ment This results in vertical space between the object 5 This same alignment dialog can be used to superimpose graphs to see how different plots compare One use might to make one protein plot red and another blue then superimpose them for a direct comparison To superim pose the plots one needs to align objects at their tops and at their left edges while defining the same widths and heights for all objects 6 Save the notebook using File Save You will need the notebook for a later tutorial see Taking Notes Using Background Text page 2 36 This concludes this tutorial If you choose to continue to the next tutorial close all open windows now Page 2 32 Tutorials Aligning Analysis Objects 151 201 251 301 351 Amino add S Accessible surface ardh Octopus rhodopsin H 1 101 201 301 401 Amino add LI LI a Accessible surface are Xenopus rhodopsin s 1 51 101 151 201 251 301 351 Amine add Figure 2 26 Output Objects After Alignment Page 2 33 Tutorials Customizing Gene Inspector Menus TUTORIAL 8 CUSTOMIZING GENE INSPECTOR MENUS 1 Choose Format gt Color gt Add Color To Menu This will bring up the dialog B Add Color Ham uf cu url ad Ww nwn mz Jas t menty sel et d caine Chase en nom paler aial Figure 2 27 Add Color Dialog shown in Figure 2 27 This dialog allows you to type in a name for a color you wish to create 2 You may choose to name colors by using the nam
221. mparisons between nucleotide characters. We will create a table in which identities score 1 and matches between pyrimidines or between purines score 0.5. Mismatches will score 0.
[Footnote: You can also use copy and paste to move a table from other applications into the GI Table editor.]
[Figure 2-46: Nucleic Acid Table Editor]
3. The GI table editor is designed to allow you to press the tab key to move through the table and enter values. As you type a value in one cell of the table and then tab to move to the next cell in the table, the same value is placed in the symmetrically located corresponding cell. In the figure, the selected cell is row 2, column 4, while the corresponding symmetric cell is row 4, column 2. Fill in the table to match that shown in Figure 2-46 by entering values and tabbing to the next cell.
4. You can adjust the width of any of the columns by placing the mouse cursor over one of the vertical table lines (the cursor will change shape) and then dragging left or right to move the dividing line. If you hold down the shift key when you drag the line, all columns in the table will be made the same width as the column just to the left of the line you are dragging. If you hold down the Option (Alt) key while you drag to adjust the width, all columns in the table will be adjusted by the same amount, i.e.
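The scoring scheme this tutorial step builds (identities score 1, purine-purine or pyrimidine-pyrimidine matches score 0.5, other mismatches score 0) and the mirrored cell entry can be sketched as a small lookup table. The function name `build_dna_table` and the dictionary layout are assumptions for illustration; the GI table editor's own file format is not described here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = "ACGT"


def build_dna_table():
    """Build a symmetric nucleotide scoring table as a dict keyed by pairs."""
    table = {}
    for i, x in enumerate(BASES):
        for y in BASES[i:]:
            if x == y:
                score = 1.0   # identity
            elif {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
                score = 0.5   # purine/purine or pyrimidine/pyrimidine
            else:
                score = 0.0   # any other mismatch
            # Enter the value once and mirror it, as the editor does
            # for the symmetrically located cell.
            table[(x, y)] = score
            table[(y, x)] = score
    return table


if __name__ == "__main__":
    table = build_dna_table()
    print("    " + "    ".join(BASES))
    for x in BASES:
        print(x, "  ".join(f"{table[(x, y)]:.1f}" for y in BASES))
```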
222. n Page 5 1 The GI Notebook e608 Sample Notebook Co 1 2 Le 4 tab icons justification icons Hine spacing icons A Sample GI Notebook 20 TM Helix Argos et al Octopus rhodopsin This is text in the notebook It is in 15 the background eu and will H automatically 05 wrap around any object you put 0 0 T T T T Y T r into the a a 7 S notebook It can be used to describe analyses and to discuss the results For example the green arrows in this analysis point to regions in octopus rhodopsin that might be transmembrane helices 4apsog 1X3 4apsog abpd You can also use the notebook to contain pictures and other objects created m other programs such as the scanned in image of a gel shown here Notice that a frame and a shadow have been added to the gel image to enhance its appearance The frame can be modified or removed from the object by using the Format gt By using tables like the one the left it is easy to Y document experiments This particular table shows the enzymes used to prepare the samples d Shown in the gel well not really but this is just _ 4 to illustrate the concept You can easily create a table with many columns to specify a series of complex reactions 4 Restriction maps are easy to create as shown Sheet 1 Figure 5 1 Gl Notebook Window ments will be printed Conditional Text One interesting capability available in the Gene Inspector is
223. n 157 pET23b NovaGen 158 pET23c NovaGen 159 pET23d NovaGen 160 pET24 NovaGen 161 pET24a NovaGen 162 pET24b NovaGen 163 pET24C NovaGen 164 pET24d NovaGen 165 pET25b NovaGen 166 pET26b NovaGen 167 pET27b NovaGen 168 pET28a NovaGen 169 pET28b NovaGen Page A 25 Appendix 170 pET28c 171 pET29a 172 pET29b 113 PETZIG 174 pET3 175 pET30a 176 pET30b 177 pET30c 178 pET31b 179 pET32a 180 pET32b 181 pET32c 182 pET3a 183 pET3b 184 pET3c 185 pET3d 186 pET3xa 187 pET3xb 188 pET3xc 189 pET5 190 pET5a 191 pET5b 192 pET5c 193 pET7 194 pET9 195 pET9a 196 pET9b 197 PETC 198 pET9d 199 pEUK C1 200 pEX1 201 pEX2 202 pEX3 203 pExCell 204 pEXlox 205 pEZZ18 206 pGAD10 207 pGAD424 208 pGBT9 209 pGEM 11Z 210 pGEM 11Zf 211 pGEM 13Zf 212 pGEM 15Z Page A 26 List of all Vectors Included With Gene Inspector NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen NovaGen Clontech Clontech Boehringer Boehringer Pharmacia NovaGen Pharmacia Clontech Clontech Clonte
224. n Window reduces the entire Gl Notebook to fit in a single window on the screen This option is best used to shrink a whole sheet down to the size of a window You might want to do this if you are creating a poster of a size larger than that which can fit on your screen If you have a notebook with many sheets this option will reduce the size of each sheet to a very tiny size that will prove to be of very little use e Tool Extensions Tool extensions were discussed in Tool Extensions page 5 17 This menu will allow manipulation of tool extensions Page 6 28 Menu Items Add Extension To Menu Add extension To Menu provides a way to add a new extension to the Tool Exten sions menu After naming the extension it will be added to the end of this menu Most GI Notebook objects can be added as tool extensions Remove Extension From Menu Remove Extension From Menu will remove a custom tool extension from the menu If you do not have any custom extension which have been added to the menu this option will be disabled Custom Extensions After the Add and Remove options will be a list of all the custom extension which have been defined in the application e Open For Editing Open For Editing is available whenever a notebook analysis object is selected It is similar to targeting an object see Selection vs Target page 2 1 and page 5 6 but instead of letting you edit the object in place in the n
225. n addition to the standard ones to specific nucleotide characters e Show Hide Sequence Monitor The Show Hide Sequence Monitor menu item will either open the sequence monitor which will stay visible as a palette in front of other windows or it will hide the sequence monitor The sequence monitor was discussed on page 3 9 e Display The Display menu item allows you to define how the sequence window is dis played what information is shown and what information is hidden The sequence window and its parts is shown in Figure 3 1 page 3 1 Display of each part of the window is controlled by the submenus under the Display menu Show Hide Overview Show Hide Overview will toggle the appearance of the overview pane to be shown or hidden If your sequence file contains only a single sequence or if you are viewing a multiple sequence alignment where all the sequences are the same length you might want to hide the overview pane to make more room for the sequence s itself Note that the overview pane can also be used as a navigation tool see The Overview Pane page 3 1 for more details Show Hide Ruler The ruler is the position indicator along the top of the sequence Show Hide Page 6 47 Menu Items Ruler will toggle the appearance of the ruler to be shown or hidden Show Hide Names Show Hide Names will toggle the appearance of the names of the sequences along the left side of the window to be shown or hidden
226. n alias to any notebook object not just an appendix object including graphics that you import from other programs This provides a convenient way to refer to the same analysis or object from multiple loca tions in the notebook You might also put buffers at specific locations and have aliases to them from anywhere in the notebook Putting buffers into an appendix is a convenient way to do this This concludes this tutorial You may quit or continue on to the next tutorial If you choose to continue close all open windows now Page 2 45 Tutorials Customizing and Saving Analysis Setup Suites TUTORIAL 13 CUSTOMIZING AND SAVING ANALY SIS SETUP SUITES 1 Select Analysis gt New Analysis choose the Nucleic Acid Analysis button choose Base Composition and then press the OK button Select Dinucleotides from the popup menu and choose to Display results as Number of occur rences Specify the output type as Graph Do not run this analysis yet 2 Choose Analysis gt Add Another Analysis and add Base Distribution In the Windows version of Gene Inspector the Add Another Analysis selection is accessible through the right mouse button menu Do a purine distribution by checking just the A and G boxes Set the parameters to a window of 20 with an offset of 2 Do not run this setup panel yet 3 Choose Analysis gt Add Another Analysis again and add Restriction Enzyme Digests In the Windows version of Gene Inspector the Add Another
227. n an output object to make it the selection, and note the appearance of 8 black square dots, called handles, along the edges of the object. The selected object can be moved around on the page by dragging with the mouse. The handles can be used to resize the object by clicking and dragging one of the handles with the mouse. This is similar to the way objects behave in many drawing programs.
10. Save the notebook in a location on your hard disk that you will remember by choosing File > SaveAs, and give it a name you will remember. You will need the notebook to continue with the tutorial Aligning Analysis Objects, page 2-31.
This concludes this tutorial. You may quit or continue on to the next tutorial.
TUTORIAL 4: HOTLINKING ANALYSIS RESULTS
It is often desirable to have the results of an analysis directly connected to the sequence being analyzed, in such a way that if the sequence is changed, the output object is recalculated automatically without your having to remember to do it manually. The sequence might represent one that you are refining in the lab, a multiple aligned consensus sequence, or a contig. The Gene Inspector provides this ability through hotlinks, as described in this tutorial.
1. Open the DNA sequence file called rhodopsins.
2. Choose File > SaveAs and save this sequence document as rhodopsins2 in a location on your hard disk that you will remember. You wi
228. n the protein. Since this is a probability table, comparisons accepting 40 point mutations per 100 amino acids can be obtained by multiplying the PAM1 matrix by itself 40 times to give the PAM40 matrix. The Gene Inspector provides PAM tables of PAM40, PAM120, and PAM250. It is actually possible to recognize sequences that are related even after 250 amino acid changes for every 100 amino acids in the sequence. Sequences closely related should be compared using lower-value PAM tables, while the higher-value PAM tables should be used to compare more distantly related sequences.
Table 22: PAM Values vs. Change in Sequence*
Evolutionary Distance in PAMs    Observed Percent Difference
  1                                1
  5                                5
 11                               10
 17                               15
 23                               20
 38                               30
 56                               40
 80                               50
112                               60
159                               70
246                               80
* This table is from "Molecular Sequence Comparison and Alignment" by J. F. Collins and A. F. W. Coulson, in Nucleic Acid and Protein Sequence Analysis: A Practical Approach, ed. M. J. Bishop and C. F. Rawlings, IRL Press, Washington, D.C., 1987, p. 323.
As shown in the table above, as more mutations are allowed to accrue, they recur in the same position, so that at PAM246 (246 mutations per 100 residues) only 80% of the amino acids in the sequence are altered. The remaining 20% are enough to recognize sequences as having some degree of similarit
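The statement above that a higher PAM table is obtained by multiplying the PAM1 probability matrix by itself can be sketched in a few lines. The matrix used here is a deliberately artificial assumption: a uniform toy model in which each residue changes with probability 0.01 and all replacements are equally likely. It is not the Dayhoff PAM1 matrix, so the percentages it prints will not reproduce Table 22; the sketch only shows the mechanics and the fact that observed difference grows more slowly than the number of accepted mutations. All function names are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(m, n):
    """Return the matrix product of n copies of m (n >= 1)."""
    result = m
    for _ in range(n - 1):
        result = mat_mul(result, m)
    return result


def toy_pam1(size=20, change=0.01):
    """Toy 1-PAM matrix: 1% chance of change, spread evenly (assumption)."""
    off = change / (size - 1)
    return [[1.0 - change if i == j else off for j in range(size)]
            for i in range(size)]


def observed_difference(m):
    """Fraction of positions expected to differ, averaged over residues."""
    n = len(m)
    return 1.0 - sum(m[i][i] for i in range(n)) / n


if __name__ == "__main__":
    pam1 = toy_pam1()
    for n in (1, 40, 120, 250):
        diff = observed_difference(mat_power(pam1, n))
        print(f"PAM{n}: about {100 * diff:.1f}% observed difference (toy model)")
```

Because repeated changes can hit the same position, the observed difference in this toy model stays well below 100% even at 250 PAMs, which mirrors the qualitative point made by the table.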
229. nacancn non nana n nn nana cono nannnnanaananans 6 53 insert rOwZ colum Lesoto dorado ACEN Nee Ee dE EU 6 53 Show hide column headers ccceccesccesccasecesceaeceecceeeeceeecueecueeseceseeseessees 6 53 show hide row headers ccseceeceeceecseceecaeceuceccecaeceuceeecueeeecaeeeesaeeeesaeeeees 6 53 tables BANG EE 4 13 SAMO ard EE 4 14 ue la E e WEE 4 14 LEE zea ci ed 4 14 Tables Analysis Menu ccccccccssssssccceceeeesessseeaeeececesesesaeaeeececeescssseaeeeseeeess 6 23 tearing Off MENUS EEN EEN 5 7 temperature factor erences ence cee denncbueecadecbutedeccel sseanclesvendacuaceandadbaveanctuese 4 70 temporary Space caen eap e a BEE 7 1 ELE RE 4 44 LOX TOW EE 5 10 6 30 Text Flow Notebook Menu oooncccccccccnnccncnononnnncncncnncnonn ana na nano nononanana ana nananns 6 30 EA El EE 2 4 LX SLAM GOT Z a A A LS a A a ias ds 5 10 Thornton vet alt table ii eege ee ege SAS AE A 4 ANS O A O 5 17 Tool Extensions Notebook Menu ccccconononccccnnoconononanananonanancnnnanananananananana 6 28 Tools Notebook Menu ococccccccccnononccnncnnnononononn cnn cnn conan nan cn cnn cn nano na nana nncnnnanns 6 27 transferring tables to and from Gl EEN 5 14 Translate Features Menu cononnnccccnncnnnnncnannnncnanancn ono nana na nana nn nano nan ana nanananana 6 40 translating DNA for a selected ORF ae 4 41 translation ACFOSS IN TONS serierna eaaa alaa a E N Aa aaa ia 3 17 translation EE 4 13 4 14
230. nalysis based on data from Sweet & Eisenberg, J. Mol. Biol. 171:479 (1983). This table of data is derived by correlating data from a number of other hydropathy tables and from observed amino acid replacement rates. This analysis is identical to a Hydropathy analysis using the Sweet & Eisenberg table.
pH/pI
The pH/pI analysis presents the charge on a peptide as a function of pH. There are no parameters to enter in the setup panel. The output is shown in Figure 4-61, page 4-64. Total positive charge, total negative charge, and net charge are each plotted as a function of pH. The exact pI can be obtained by selecting this object and choosing Notebook > Get Info. You can customize the output by targeting the plot and then selecting the legend item you wish to modify. Once the legend item is selected, you can change its color, font, and pattern by using the various options available under the Format menu.
Physical Characteristics
This analysis calculates a number of physical properties of the selected peptides. There are no user-definable parameters in the setup panel. The results are shown in Figure 4-62.
[Figure 4-61: pH/pI Output Plot]
[Figure 4-62: Physical Characteristics Output (molar extinction coefficient, length, molecular weight, isoelectric point, net charge, and mass and absorbance conversions)]
Prosite Motif S
231. ncated by a stop codon.
[Figure 2-32: Adding Analyses]
7. You can add and remove analyses from any Analysis Setup using the approach described in this tutorial. It does not matter how the Analysis Setup was opened: as a new analysis, from a recalculation, or from the Setup Menu option.
8. Press the Run button to start the analyses.
This concludes this tutorial. You may quit or continue on to the next tutorial. If you choose to continue, close all open windows now.
TUTORIAL 12: APPENDICES: HIDING LARGE AMOUNTS OF DATA
Sometimes analyses produce a large amount of data that you might not want to keep displayed in the GI Notebook at all times. Yet it would be nice to be able to keep the data and refer to it as needed. You can do this by creating an Appendix. Each appendix object resides in its own window, which is hidden within the notebook but can be made visible when you want to see it.
1. Choose Analysis > New Analysis and choose to do a protein analysis that is called Align 2 Sequences (global). If you are not sure how to do this, see Tutorial 3: Using Analysis Setups, page 2-14.
2. In the setup panel for this analysis, press the button to use scoring table and select the PAM120 table in the Table popup menu. Leave the rest of the settings in this panel at their default values.
3. For input se
232. nd the next occurrence of X in a protein sequence, or one of the ambiguous nucleotide characters in a nucleic acid sequence (including Y, R, N). The bottom part of the dialog allows you to define what is to be searched. You can choose to search the active sequence in the current window, meaning the sequence in which the insertion point currently resides. You could alternatively choose to search all sequences in the current file. Also, for nucleic acid sequences, you can specify that you would like to Search bottom strand as well as top strand. The check boxes at the bottom of the dialog allow you to continue the search at the beginning of the document after the end of the document is reached (Wrap around), and to search starting at the current site and working towards the beginning of the document (Search backwards).
Find Next
Find Next (keyboard equivalent ⌘G) will find the next occurrence of the key word(s) specified in the Find dialog. It is a very convenient way to quickly go through a notebook or a sequence document and find each occurrence of the key word(s). Use Find Next in conjunction with Find for rapid searching.
Find Selection
Find Selection initiates a Find operation using the text that is currently selected. This is an easy way to look for the next occurrence of any text or sequence you highlight. It is equivalent to copying the selected text, opening the F
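The Find options described above for nucleic acid sequences (searching the bottom strand as well as the top strand, and wrapping around to the beginning of the document) can be sketched as follows. This is an illustration only, not the program's search code: the function names, the 0-based positions, and the choice to report a bottom-strand hit by the top-strand position of its leftmost base are all assumptions made for this example.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_next(seq, pattern, start=0, wrap_around=False, both_strands=False):
    """Return (position, strand) of the next hit at or after start, or None.

    A bottom-strand hit is found by searching the top strand for the
    reverse complement of the pattern.
    """
    seq = seq.upper()
    pattern = pattern.upper()
    candidates = [(pattern, "top")]
    if both_strands:
        candidates.append((reverse_complement(pattern), "bottom"))

    def search(begin, end):
        hits = []
        for pat, strand in candidates:
            pos = seq.find(pat, begin, end)
            if pos != -1:
                hits.append((pos, strand))
        # Earliest hit wins; on a tie the top strand is listed first.
        return min(hits, key=lambda h: h[0]) if hits else None

    hit = search(start, len(seq))
    if hit is None and wrap_around:
        # Continue from the beginning, covering hits that start before `start`.
        hit = search(0, min(len(seq), start + len(pattern) - 1))
    return hit


if __name__ == "__main__":
    dna = "AAAACCCGGGTTTT"
    print(find_next(dna, "AACCC", start=3, both_strands=True))
    print(find_next(dna, "AACCC", start=3, wrap_around=True))
```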
233. ndividual sequence analysis in the format shown in Figure 4 31 page 4 36 first target the summary output object select the sequence s you want to examine and then choose Object gt Search Selected Sequences You will see an analysis setup panel with just the one sequence entered Running this analysis will show the single sequence inverted repeat analysis This is a convenient way to rapidly examine a num ber of sequence analysis in one output object GC Coding Prediction The GC Analysis uses the algorithm described by Bibb et al Gene 30 157 1984 This analysis takes advantage of the fact that organisms prefer to Page 4 37 Analyses Analysis Setup Analyses 1 Inputs 1 Outputs 1 O High Priority a Chosen sequences a b range n ambiguities A modihed gt chick muscAchRec IAB OZ GC coding prediction jee E e Output Location Add Remove gt Remove All Range A Entire sequence O Segment Linear sequence From 1 Start 1 To 2486 Path NewHampshire2 Applications Gene Inspector LG SW Show Icons Seqs DNA sequences acetylcholine receptors chick Add and Remove buttons are used to define the Chosen sequences Clicking on a sequence in the list displays its location Path Entire sequence will run all analyses using the entire length of the chosen sequence The Segment button allows the analyses to operate on only a part of the entire sequence A e in
234. ne of the items from this submenu it is possible to specify how the background text will flow around or through the selected object s Try a few of the options and move the object around on the note book page to see how the text flows around or through the object Page 2 36 Tutorials Taking Notes Using Background Text 5 Save the notebook by choosing File gt Save You will need the notebook to continue with Tutorial 10 Creating and Using Style Sheets page 2 38 This concludes this tutorial You may quit or continue on to the next tutorial Page 2 37 Tutorials Creating and Using Style Sheets TUTORIAL 10 CREATING AND USING STYLE SHEETS 1 If it is not already open choose the notebook created in Tutorial 9 Tak ing Notes Using Background Text page 2 36 2 Select a word of background text in the notebook and change it to 14 point Helvetica bold condensed and magenta use choices in the Format menu to make these changes Add this style to the Style Sheet menu for future use by selecting the text whose style you just changed and then choosing For mat gt Style Sheets gt Add Style Sheet In the dialog box that appears give the style sheet the name Magenta Text Press the OK button to add the style sheet to the Format gt Style Sheets menu 3 Click once on the first analysis output object in the notebook to select it Now choose Format gt Style Sheets gt Magenta Text Note that all the text in the out
235. nformation about a sequence when its name is selected This is shown for a nucleic acid sequence in Figure 3 2 page 3 3 This window gives you infor mation about the sequence and allows you to type in text as comments and redefine the position of the first nucleotide For nucleic acids you can choose to display the sequence as either DNA or RNA show Us instead of Ts You can also define the sequence as being circular or linear which will affect how some analyses are run For example if a DNA is defined as circular restric tion enzyme digests will find sequences that cross the origin like the Eco A Page 3 2 The GI Sequence Editor __ Dros hsp70 Info Tue Beem Description DNA Sequence Sequence created Wed Jun 21 1995 1 23 Sequence modified Wed Jun 21 1995 1 23 Sequence starting offset 1 Sequence length 5066 Number of ambiguities O Locus DROHSP7D1 S066 bp ds DNA 2 NV 15 MAR 1985 DEFINITION D melanogaster heat shock locus 8701 distal hsp7O genes ACCESSION 01104 JO1105 KEYWORDS gene duplication heat shock nentain Figure 3 2 Nucleic Acid Sequence Information site in pBR322 The protein information is shown in Figure 3 3 This contains essentially the B Pig musc Ach Recpt Info Title Pig muse Ach Recpt Description Peptide Sequence Sequence created Mon Jan 09 1995 11 31 Sequence Mon Jan 09 1995 11 31 Sequence starting offset 1 Sequence length 460 Number of
236. nk the magnification of the image so that the entire notebook will fit on the screen.
6. Choose Notebook > Notebook Layout and set the display to be side by side.
7. While still in reduced mode, move the different analysis output objects to fit on the notebook sheets the way you want them to.
8. Now choose Notebook > Reduction > Enlarge To Full Size so that you can view the graphic results.
This concludes this tutorial. You may quit or continue on to the next tutorial. If you choose to continue, close all open windows now.
TUTORIAL 15: RESTRICTION ENZYME DIGESTS
1. Open the GI Notebook you saved in Tutorial 13: Customizing and Saving Analysis Setup Suites, page 2-46, and select the restriction enzyme analysis output object. Make it bigger by dragging the lower right corner handle down and to the right.
2. Double click it to make it the target and then select Object > Edit Display Parameters. This will bring up Figure 2-35. Set the display to show only unique cutters by checking the two check boxes at the bottom of the window and then placing a 1 in each text box, as in Figure 2-35. Press OK. This will display only those enzymes which cut exactly once.
[Figure 2-35: Restriction Enzyme Display Parameters (Show: 3' overhangs, 5' overhangs, blunt ends; with at least N sites, no more than N sites)]
3
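The "at least / no more than" site-count filter used in step 2 above to show only unique cutters can be sketched as a simple count-and-filter. This is an illustration under stated assumptions, not the program's implementation: the function names are hypothetical, only three enzymes are listed, the sequence is treated as linear, and because the three recognition sequences shown are palindromic, only the top strand is searched.

```python
# Recognition sequences for three common enzymes (all palindromic).
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def count_sites(seq, site):
    """Count occurrences of site in seq, allowing overlaps."""
    seq = seq.upper()
    count = 0
    pos = seq.find(site)
    while pos != -1:
        count += 1
        pos = seq.find(site, pos + 1)
    return count


def filter_enzymes(seq, enzymes, at_least=1, no_more_than=1):
    """Return enzyme names whose site count lies within the given limits."""
    return sorted(
        name for name, site in enzymes.items()
        if at_least <= count_sites(seq, site) <= no_more_than
    )


if __name__ == "__main__":
    dna = "AAGAATTCAAGGATCCAAGGATCC"
    # Both limits set to 1 corresponds to showing only unique cutters.
    print(filter_enzymes(dna, RECOGNITION_SITES, at_least=1, no_more_than=1))
```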
237. nnnnnnnnnnonnnnnnnnnnnnnnnannnnnnnnanos A 8 IUPAC standard nucleic acid codes ccccccceeeeeeceeeeeeeeeeeeeeeeeeaaeeeeesaaeeeeesaneeees A 7 J Janin tablet Laa ia ia A 3 K Kyte and Doolittle table como re A 3 L Ji Le UN conti A ida ataca 6 16 Lines Format Menu ENEE 6 16 Links Notebook Menu ccccccccccnconononannncnnnonononananana na nan nn conan na nan nn anar n na nn 6 34 Index 9 M M Make Aliases Notebook Menu u ccccccccccsceeeeeeececeseeeseuseueeeeeeseueeuseueeanaueaes 6 29 Manavalan Ponnuswamy table cececeececceceeeeecesenaeeeceeeeeeeeeenneeeeeeeeeneees A 3 Manipulate Sequence Menu ssssesesesesesesesesessssssnsnsssnsnssrsnereeenees 6 48 Map Keys Sequence Menu ooccccccccnononcnoonnnoncnnnnonannoncnononnnnnnannnonononnnnnnannns 6 47 mapping the keyboard ui 3 7 6 47 Mark Sites Features Menu ccnnnnnnnnccnnnnccnnnoncnnnnanana nan cn no nann nn anar nana nan 6 39 medial SISI idas doo 4 11 membrane buried regions 4 63 mesh size in median sieving siressa a aa ee eeaeee 4 13 molecular weight see protein physical characteristics moving objects to appendices AE 2 43 5 16 multiple sequence alignment custom score adronments ccccccceceescceeecaecceeceaeceecceeeecaueceeecueeaseceeesaeessers 2 25 parameter ic tias 3 10 tutorial Li a A di ce 2 24 2 26 multiple sequence features object cccceseececeeeeeeeeeeeeeeeeeeaeeeeeeesece
238. Parker et al table A 3
Paste (Edit Menu) 6 9
paste picture 6 10
pausing analyses 7 3
peptide numbering style 2 52
perform autorecalc 2 21
Previous Setup (Analysis Menu) 6 21
Print (File Menu) 6 7
printing large objects 7 5
Prosite language definitions A 5
prosite motif search 4 64
protein analyses
  accessible surface area 4 48
  align 2 sequences (global) 2 43, 4 49
  align multiple sequences 4 52
  amino acid composition 4 53
  antigenicity 4 54
  BLAST 4 70
  Chou Fasman structure prediction 4 54
  dot matrix
Index 12
    aligning sequences from within 2 60
    description 4 54
    reformatting
239. notebook (by shift-clicking, for example), it is possible to align the selected objects by using the Notebook > Arrangement > Align Objects menu item. This presents the dialog shown in Figure 5 8. Objects can be aligned vertically and/or horizontally and can be made to have the same height and/or width. Aligning objects is also discussed in Tutorial 7, Aligning Analysis Objects (page 2 31).
Page 5 10 The GI Notebook
Figure 5 8 Align Objects Dialog (alignment controls to the left of and above a diagram; options to make widths the same as the widest or narrowest object and to make heights the same as the tallest object; Cancel button)
Getting Information About Objects
Each object in the GI Notebook has information associated with it. Different objects may carry different amounts of information. At the very minimum, each object can have a name and textual information associated with it. The information is viewed by selecting the object and then choosing the Notebook > Get Info menu item.
Figure 5 9 Get Info Dialog (Title: Antigenicity Octopus rhodopsin; Analysis: Antigenicity; Sequence: Octopus rhodopsin; Table: Emini et al; Created and Modified: Mon Nov 6 2006 12:29 AM; current size in memory 5482 bytes; current size on disk 3592 bytes; Comments field)
A Get Info box for an Antigenicity analysis is s
240. nsus row gap space
Figure 6 31 Aligned Sequences
Page 6 49 Menu Items
help illustrate sequence alignments. The parts of the consensus window are shown in Figure 6 31 (page 6 49). The consensus row shows the most common character in that position. The scoring row presents a histogram of the extent of matching in each position to the consensus sequence. Shading is illustrated in Figure 3 11 (page 3 13).
Show/Hide Consensus Row
Show/Hide Consensus Row toggles the consensus row between visible and hidden.
Show/Hide Scoring Row
Show/Hide Scoring Row toggles the scoring row between visible and hidden.
Show/Hide Shading
Show/Hide Shading toggles the shading between visible and hidden.
Update Scores
The score for the alignment is shown in the upper left corner of the window. This score is the percent of characters matching the consensus sequence. Only identical characters count as a match. Selecting Update Scores adjusts the score to account for any editing changes that have been made. Note that this does not cause any realignment of the sequences; it just recalculates the score.
Automatic Updating
Turning on Automatic Updating will cause the score to be updated constantly. This might be a convenient way to fine-tune a multiple sequence alignment by hand. However, automatic updating requires some CPU time and might
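The consensus row and the alignment score described above can be sketched in a few lines of Python. This is an illustration of the stated definitions (most common character per position; score as the percent of characters identical to the consensus), not the program's own code. The function name is hypothetical, and two choices are assumptions the manual does not confirm: gap characters are ignored when counting, and ties go to the character seen first in the column.

```python
from collections import Counter

def consensus_and_score(rows, gap="-"):
    """Return (consensus string, percent of non-gap characters identical to it)."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned rows must all have the same length")
    consensus, matches, total = [], 0, 0
    for column in zip(*rows):
        residues = [c for c in column if c != gap]   # assumption: gaps are not scored
        if not residues:
            consensus.append(gap)
            continue
        top, n = Counter(residues).most_common(1)[0]  # most common character in the column
        consensus.append(top)
        matches += n                                  # only identical characters count
        total += len(residues)
    score = 100.0 * matches / total if total else 0.0
    return "".join(consensus), score

# Hypothetical three-row alignment.
consensus, score = consensus_and_score(["ACGT", "ACGA", "TCGT"])
```

Because the score depends only on the current columns, recomputing it after a manual edit never moves any residue, which matches the behavior described for Update Scores.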
241. nt analyses. J. Mol. Biol. 179, 125 (1984).
Emini et al
Based on the paper by Emini et al., J. Virol. 55(3), 836 (1985). This paper uses the values in Janin et al., J. Mol. Biol. 125, 357 (1978). To indicate surface accessibility, the values in column 5 of table 1 are subtracted from 1. This analysis is different from the Surface Probability analysis, which uses a different calculation method.
Engelman & Steitz
Data from Engelman and Steitz, Cell 23, 411 (1981). These data indicate the likelihood that an amino acid will lie in the interior of a protein. This Hydropathy analysis is identical to the Protein Interior analysis.
Page A 1 Appendix Tables
Engelman et al
This hydropathy table is based on Engelman et al., Ann. Rev. Biophys. Biophys. Chem. 15, 321 (1986). This is identical to the Hydropathy analysis with the same table.
Fauchere & Pliska
This hydropathy table is based on free energy changes in amino acid side chain analogs between water and 1-octanol. Eur. J. Med. Chem. Chim. Ther. 18, 369 (1983).
Fraga
Based on the data from Fraga, Can. J. Biochem. 60, 2606 (1982). These values are modified from the original Hopp-Woods table to include recognition factors. This represents the ability of an amino acid to be recognized by other amino acids.
GES
From Engelman, Steitz, and Goldman, Ann. Rev. Biophys. Biophys. Chem. 15, 321 (1986). This is sometimes called the GES scale. It is designed to show transbilayer
242. o acids to be displayed along an x axis, changing the placement of axis tick marks, changing the names of the x and y axes, changing the font styles for axis labels and titles, etc. Different analyses may also allow you to reformat different parts of the display. For example, in the dot matrix analysis you can change the thresholds for different colors in the output display.
Recalculate gives you the ability to adjust analysis parameters and then recalculate the analysis. Choosing Recalculate will give you an Analysis Setup Window (see The Analysis Setup Window, page 4 2), which contains all the parameters used to run the analysis initially. This window can be used to make any changes you want, even allowing you to change the sequence being analyzed. Once you have made one or more changes in the analysis setup, the analysis will be rerun and will be placed back in the GI Notebook in the same location and at the same size it occupied when it was first selected. Thus, any analysis output object can be used to rerun the analysis.
Footnote e: The exception to this is that multiple sequence alignments will create new sequence editor documents for their outputs.
Page 5 15 The GI Notebook
If you copy and paste an analysis output object, you can change the parameters of the copy and rerun the analysis for comparison with the original analysis output object. Analysis output objects can also be stored as tool extensions (see page 5 17).
Featu
243. o analyses remain OK.
12. To update the analysis, choose Notebook > Links > Perform Auto Recalc Now. You should then see the dialog shown in Figure 2 16. Click Recalculate Now.
13. Save the notebook and close it. This will save all of the analyses in their current state, including the Xenopus base composition based on the sequence in the currently open rhodopsins2 sequence file.
Page 2 21 Tutorials Hotlinking Analysis Results
Figure 2 15 Hotlinked Object Needs Updating (base composition graphs for Lamprey, Octopus, and Xenopus rhodopsin; the Xenopus graph is marked "update needed")
Figure 2 16 An Auto Recalculate Dialog (message: "The following objects seem to be out of date: Base composition Xenopus rhodopsin. Recalculate the listed objects now?"; Cancel and Recalculate Now buttons)
14. Close the rhodopsins2 sequence document but do not save the changes. This will leave the sequences in the state they were in before any changes were made during this session.
15. Open the notebook you just saved. You should see a dialog like the one shown in Figure 2 16. This happens because the analysis object in question was created using the changed Xenopus sequence, but the rhodopsins2 sequence file was closed without saving the changes made while the document was open. Thus the version of the sequ
244. o rapidly examine a number of sequence analyses in one output object.
Page 4 33 Analyses
Find Repeats
This analysis will search DNA sequences for repeats of any defined length. It is similar in setup to the Inverted Repeats analysis (page 4 31). The repeats can have mismatches, and the maximum distance between the two parts of the repeat can be specified. The output looks like the outputs shown in Figure 4 26 (page 4 32) and Figure 4 27 (page 4 33) for inverted repeats. The only difference is that the table output does not have the last column shown for inverted repeats. Find Repeats can also be run as a Summary analysis.
Find Sequence
The Find Sequence analysis allows you to define and search for a complex query sequence in a target DNA. The setup panel is shown in Figure 4 29.
(Analysis Setup panel for Find Sequence: query segments tataaa and caat; Segment box; Max number of mismatches in this segment: 0; Gap before next segment: Min 0, Max 100; Search Both Strands; Show summary results; Add Segment, Insert Segment, Remove Segment, and Edit Find Menu buttons.)
Panel help text: Enter a sequence in the Segment box, define the maximum number of mismatches and the distance before the next segment, and press the Add Segment button. The Edit Find Menu button allows complex queries to be saved, edited, or removed. Insert Segment and Remove Segment buttons
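A query of the kind the Find Sequence panel describes (a series of segments, each with a mismatch allowance and a minimum and maximum gap before the next segment) can be sketched as a small recursive search. The Python below is an illustration under stated assumptions, not the program's algorithm: the function name and the tuple layout are hypothetical, only the given strand is searched, and the target string is invented for the example.

```python
def find_query(target, segments):
    """Find every placement of a multi-segment query in target.

    segments is a list of (sequence, max_mismatches, min_gap, max_gap) tuples,
    where the gap limits apply between this segment and the next one.
    Returns a list of tuples holding the start position of each segment.
    """
    target = target.lower()
    hits = []

    def extend(index, lo, hi, starts):
        if index == len(segments):
            hits.append(tuple(starts))
            return
        seq, max_mm, min_gap, max_gap = segments[index]
        seq = seq.lower()
        last_start = min(hi, len(target) - len(seq))
        for pos in range(lo, last_start + 1):
            window = target[pos:pos + len(seq)]
            mismatches = sum(a != b for a, b in zip(seq, window))
            if mismatches <= max_mm:
                end = pos + len(seq)
                # The next segment must start between min_gap and max_gap bases later.
                extend(index + 1, end + min_gap, end + max_gap, starts + [pos])

    extend(0, 0, len(target), [])
    return hits

# Segments as in the panel: tataaa, then caat within 0 to 100 bases, no mismatches.
query = [("tataaa", 0, 0, 100), ("caat", 0, 0, 0)]
hits = find_query("ggtataaaccccaattt", query)
```

Searching both strands, as the panel's check box offers, would require running the same search on the reverse complement as well; that step is omitted here.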
245. o the new Scratch Data folder. It will be named Scratch Data Alias.
Drag the Scratch Data Alias icon you just created on the new Scratch Volume into the original GI Data folder (the one in the same location as the Gene Inspector application).
Rename the alias that you just moved into the GI Data folder in step 6 to Scratch Data. You will now have a GI Data folder containing a Scratch Data icon that is an alias to the Scratch Data folder on the new Scratch Volume. The Gene Inspector will now use the Scratch Data folder on the new Scratch Volume to store the temporary files during analyses.
Discard the Scratch Data Alias from the Scratch Volume. It is not needed any longer.
To return to the previous state and discontinue using the scratch volume for temporary data, just remove the Scratch Data alias from the GI Data folder. The next time Gene Inspector is run, it will create a new Scratch Data folder to be used for analyses.
Footnote: The Scratch Volume should contain the extra hard disk space you want to use for holding temporary files during analyses. You can use any local disk that is mounted on your desktop, including removable media disks. You should not use a remote file server volume, because it will be extremely slow due to the large amount of data that needs to be transferred to and from the scratch folder.
Page 7 2 Tips For Using The Gene Inspector
Analyses That Take a Long Time
The Gene Inspector is an interactive
246. Figure 2 6 Sequence Editor with Multiple Sequence File (sequence editor window showing rows of peptide sequence for Lamprey rhodopsin, Octopus rhodopsin, Xenopus rhodopsin, bacteriorhodopsin, and Halobacterium archaerhodopsin)
the GI Seqs folder. This file contains multiple sequences and is shown in Figure 2 6. Note the overview pane of the sequence window. It now displays multiple sequences and indicates their relative lengths. The segment indicator box indicates which segment of each peptide is being dis
247. oosing Notebook > Open For Editing. An example is shown in Figure 5 11, which shows the Transmembrane Helix analysis from the Sample Notebook. Note that the analysis is placed in a separate window, and the original object in the notebook is not visible as an analysis but is labeled with text indicating the name of the analysis that is currently opened. This object does not have scroll bars because it is all visible in the window. For large tables or sequence alignments, having the ability to scroll through the table or alignment is quite useful. Changes made in the open window will be preserved in the object once it is returned to the notebook. Any open-for-editing windows will be closed automatically when the notebook is closed.
Analysis Output Objects
Each time you perform an analysis in the Gene Inspector, it will create an analysis output object in the GI Notebook. These objects are similar to the other GI Notebook objects discussed previously in this chapter in that they can be moved around the notebook sheet and can be resized. Once they are targeted, however, they acquire new properties. Extra options for analysis objects can be accessed through the Object menu that appears when any analysis output object is targeted. The Object menu will contain at least two items: Reformat and Recalculate.
The Reformat item will let you change the formatting of different parts of the display, such as changing the range of nucleotides or amin
248. or an object contains information about the frame (if any), so you can create style sheets that have specific frames you may find useful.
Median Sieving (Data Sieving)
Many of the analyses can use a process called median sieving, also referred to as data sieving. This unique way of filtering data uses medians instead of means when performing sliding window kinds of analyses such as hydropathy and surface probability (J. A. Bangham, Anal. Biochem. 174, 142-145 (1988)). Using a standard sliding window that calculates a mean tends to smooth out the results and therefore lose details (see Sliding Window, page 4 68). In sliding window analyses, each position in the sequence (each amino acid or nucleotide) has a value assigned to it by looking it up in a table designed for that analysis. A segment of characters of defined length (a window) is moved along the sequence, and a mean value for the residues in that window is calculated. The results are plotted.
Footnote e: A drop shadow box consists of dark, thick lines to the right and below the object. These lines are meant to look like shadows cast by the object when a light source is in the upper left corner of the screen.
Page 4 11 Analyses
Think about two adjacent residues having values of 10 contained within a region of 8 other residues having values of 1: the mean will be only 2.8, which is not a significant peak. Using median sieving, however, the 10s will stand out. Median sieving is not ju
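The arithmetic in the example above is easy to check, and the contrast between a windowed mean and a median filter can be shown with a short Python sketch. This is only an illustration: it applies a single running median, whereas Bangham's data sieve is described in the cited paper as a more elaborate procedure, and the manual does not say how Gene Inspector maps its mesh size onto a window width. The helper name `sliding` and the window widths are assumptions.

```python
from statistics import mean, median

def sliding(values, window, reducer):
    """Apply reducer (for example mean or median) to every full window."""
    return [reducer(values[i:i + window]) for i in range(len(values) - window + 1)]

# Two adjacent residues scoring 10 inside a run of eight residues scoring 1.
values = [1, 1, 1, 1, 10, 10, 1, 1, 1, 1]

# One ten-residue window: (2 * 10 + 8 * 1) / 10 = 2.8, so the peak is flattened.
window_mean = sliding(values, 10, mean)

# A three-residue running median keeps the pair of 10s at full height.
running_median = sliding(values, 3, median)

# An isolated single spike, by contrast, is removed by the same median filter.
single_spike = sliding([1, 1, 10, 1, 1], 3, median)
```

The design point is that a median responds to how many residues in the window share a value, not to their sum, so a short run of extreme values either survives intact or disappears instead of being averaged into a low bump.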
249. or sirarite ipai eege eed ane eee 3 9 Sequence TEE 2 9 Shading WEE 2 25 3 12 Speak typing ee a aA ied cea lee eee eee tet dee ae 3 7 speech preferences deed de ce diia DITA Edge 3 8 Sequence Info Sequence Menu omoococccccccccncccncnnonenocononononenonncononanononononons 6 45 Sequence Info Sequences Menu onocococonononcncnonononononcnonnononenonocononaninononanons 6 44 Sequence Men idad ita 6 45 AMA MEN a E 6 49 eg EE 6 49 display dat traida ee egen 6 47 Val Le le 6 51 Index 15 S generate Tandon siksa iaraa a de ee ee EEN id 6 46 gota POSON EE 6 46 insert Cd EE 6 45 ul 6 48 Map E 6 47 UE He E 6 45 seguente INNO EE 6 45 Show hide sequence monitor cc cece cee eee cece cece eee e eee e eee eeeaee ee aa nennu nennen nnee 6 47 Speak typing enee Er A AO ena 6 46 speech Pele iria e nd 6 47 USO extra c o torcida 6 51 SEQUENCE Monti 2 10 3 9 6 47 SEQUENCE NUMDETING eocccccoonnccnnnnnnnnnnnnonnnnnnnnnnnnnnn nn ee rre 3 2 sequences GETING INTO EE 2 11 3 2 la e BEE 3 4 opening SEQUENCE files ceceeceeeeeeeceeeeeeeeeeeeeeeceaeeeeeeeeaaeeeeeeaaeeeeeeeaaees 2 9 2 15 ele E le EE 3 3 TANSIATINGA esses pe waco EE 2 11 3 4 Sequences MENU EE 3 17 6 44 CONSENSUS E 6 44 el 6 44 forinat SeQUENCES EE 6 44 SEQUENCE e EE 6 44 Shading sees Se wees ee eet ate ence ve Renee Sete ee ee aa 2 25 3 12 Nee e le E EE 2 46 Show Clipboard Edit Menu ccccecccccccceececececececeeeeececeeeseeeeeeeeeeeeeee
250. otebook itself, the object is opened for editing in its own window. The advantage of doing this is that you can use scrollbars to move around the object and review its component parts. Closing the editing window will return the object to its place in the GI Notebook. See also Open for Editing (page 5 14). This is a very useful way to view large objects, such as sequence alignments, that might not fit on a single notebook sheet (see Printing and Viewing Large Objects, page 7 5).
• Make Alias
This is similar to making an alias in the Finder. The aliases that are created are quite useful and can even be used between notebooks. Let's say that you have a recipe for a buffer in a table in notebook 1. You can create an alias to this table and then copy the alias and paste it into notebook 2. In the future, when you are in notebook 2, you could double-click on the alias and it will open notebook 1 and select the table of interest. Aliases can also be used to point to other locations in the same notebook. See also Appendix Objects (page 5 16).
Page 6 29 Menu Items
• Find Original
When an alias is selected, Find Original will bring the object pointed to by the alias to the front. If the notebook containing the original is closed, it will be opened.
• Bookmarks
A bookmark may be named and attached to any object in the notebook. Its name will be added to the Bookmarks menu and can be used to navigate to specific locations
251. other tables of data. If these objects are larger than one printer page, you can define your GI Notebook sheet to be much larger than a single printer page (see GI Notebook Layout, page 5 4) and print the object as part of the notebook in this way. Alternatively, you can open the object for editing (see Open for Editing, page 5 14) and print it from within its own private window. Large objects can be viewed by choosing Notebook > Open For Editing. This will place the object in its own window along with scroll bars. By choosing Show/Hide Page Breaks (page 5 14), you can see where the printer page breaks will occur.
Page 7 5 Tips For Using The Gene Inspector
Page 7 6 Appendix Tables
Appendix Tables
Argos et al
This table is based on the statistical distribution of specific amino acids in membrane vs. non-membrane segments for a sample set of proteins. Argos et al., Eur. J. Biochem. 128, 55 (1982). This Transmembrane Helix analysis is identical to the Membrane Buried Regions analysis.
Bull & Breese
This table is based on variations in surface tension as a function of amino acid concentration. This is related to the free energy of transfer between surface and solution. Arch. Biochem. Biophys. 161, 665 (1974).
Eisenberg et al
This table is based on consensus values obtained in a number of ways, optimized for alpha helical membrane domains. These values are often used in hydrophobic mome
252. ove it from the list of appendices.
Custom Appendices
After the menu items above, there will be a list of all the appendices in the notebook. Selecting a name from this list will cause the corresponding appendix to open.
• Links
The Links menu is enabled when an output object is selected. Each analysis output object is the result of analyzing one or more specific sequences. A link remains between the output object and the sequence that was analyzed to generate it. The Links menu provides a way for you to specify how the output object is or is not affected when the original sequence is changed. See Tutorial 4, Hotlinking Analysis Results (page 2 19), for more details on how to use links.
Automatic
Automatic updating makes the connection between the sequence and the output object into a hot link. Hot-linked objects will have a small symbol in their upper right corner indicating the state of the output object. If no updating is needed, a plain green circle appears, as shown in Figure 6 20.
Page 6 34 Menu Items
Figure 6 20 Hot Link No Update Needed (Dinucleotide Composition graph for Bovine rhodopsin)
Figure 6 21 Hot Link Updating Needed (Dinucleotide Composition graph for Bovine rhodopsin)
If the sequence has changed since the output
253. pBR322. The name in brackets at the end of the path is the actual sequence name, while the last part of the path name that is not in brackets is the name of the file containing the sequence chosen. In this case they are both called pBR322. The length of the sequence is also shown.
The Output Location Panel
The output location panel allows you to specify where the results of the analysis will be placed. This is shown in Figure 4 6.
Figure 4 6 Output Location Panel (Analysis Setup window with radio buttons for replacing an existing notebook output or creating new outputs in a notebook, here "untitled"; a note that sequence outputs will be placed in a new sequence editor; Show Icons check box)
Panel help text: The Output Location allows the destination for the analysis outputs to be defined. Depending on the situation, this panel can be used to replace a current analysis output or to create new outputs in various locations.
In this particular instance the output will be placed into an open notebook called untitled. The popup menu will show a list of all the open notebooks and also let you specify that you want to create a new notebook for the output. The grayed radio button would be active if the analysis setup were being shown through a recalculation of an existing output object.
Page 4 7 Analyses
One exception to the output location indicator is fo
254. pathy analysis is identical to the Hydration Potential Analysis.
Page A 4 Appendix Prosite Language Definitions
Prosite Language Definitions
The Prosite language was developed to enable searching of databases for very specific patterns. It has the elegance of being very specific yet general. Prosite patterns are described using the following conventions:
• The standard IUPAC one-letter codes for the amino acids are used.
• The symbol x is used for a position where any amino acid is accepted.
• Inclusive ambiguities are indicated by listing the acceptable amino acids for a given position between square brackets [ ]. For example, [ALT] stands for Ala or Leu or Thr.
• Exclusive ambiguities are indicated between a pair of curly brackets { }. The amino acids that are not accepted at a given position are placed in the brackets. For example, {AM} stands for any amino acid except Ala and Met.
• Each element in a pattern is separated from its neighbor by a hyphen (-).
• Repetition of an element of the pattern can be indicated by following that element with a numerical value or a numerical range between parentheses. Examples: x(3) corresponds to x-x-x; x(2,4) corresponds to x-x or x-x-x or x-x-x-x.
• When a pattern is restricted to either the N- or C-terminal of a sequence, that pattern either starts with a < symbol or ends with a > symbol, respectively.
• A period ends the pattern.
Examples: [AC]-x-V-x(4)
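The conventions listed above map almost one to one onto regular expression syntax, which a short Python sketch can make concrete. This is an illustration of the notation, not the search engine Gene Inspector uses. The function name `prosite_to_regex` and the example patterns and peptides are assumptions chosen for the demonstration, and the sketch handles only the conventions listed here.

```python
import re

# An element is a core (letter, x, [..] or {..}) with an optional (n) or (n,m) repeat.
_ELEMENT = re.compile(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?")

def prosite_to_regex(pattern):
    """Translate a Prosite-style pattern into a Python regular expression string."""
    pattern = pattern.strip().rstrip(".")          # a period ends the pattern
    pieces = []
    for element in pattern.split("-"):             # elements are separated by hyphens
        prefix = suffix = ""
        if element.startswith("<"):                # pattern anchored at the N-terminal
            prefix, element = "^", element[1:]
        if element.endswith(">"):                  # pattern anchored at the C-terminal
            suffix, element = "$", element[:-1]
        core, low, high = _ELEMENT.fullmatch(element).groups()
        if core == "x":                            # any amino acid
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"         # exclusive ambiguity
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high is not None else "") + "}"
        pieces.append(prefix + core + repeat + suffix)
    return "".join(pieces)

# Hypothetical pattern: Ala, Leu or Thr; two to four of anything; then not Ala or Met.
regex = prosite_to_regex("[ALT]-x(2,4)-{AM}.")
match = re.search(regex, "GGLKKWGG")
```

Square-bracket sets pass through unchanged because regular expressions use the same notation, so only the x wildcard, the curly-bracket exclusions, the repeat counts, and the terminal anchors need translating.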
255. pe in a new value in the High, Mid, or Low threshold box on the left. Alignment indicators are of more use in peptide alignments.
Footnote g: Alignment indicators are not shown in Figure 4 13 but are shown in the peptide example, Figure 4 48 (page 4 52).
Align Multiple Sequences
Multiple sequence alignments can be initiated either as an analysis or directly from within the sequence editor. See the tutorial Multiple Sequence Alignments (page 2 24) for more details. This is also discussed in the Sequence Editor chapter, Multiple Sequence Alignments (page 3 10). Multiple sequence alignments initiated with an Analysis Setup Window can contain any number of sequences from any number of files. Starting a multiple sequence alignment from within a sequence document can only be done on all the sequences in that document.
Page 4 20 Analyses
Base Composition
Base Composition analysis determines the composition of mono-, di-, or trinucleotides in the sequences being analyzed. The setup panel is shown in
(Analysis Setup panel for Base Composition: "Calculate the occurrence of" popup, here set to Dinucleotides; Output type: Graph or Table; Display results as: Number of occurrences or Percent of all occurrences; Show Icons check box)
Panel help text: Use the "Calculate the occurrence of" popup menu to specify mono-, di-, or trinucleotide composition. Th
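As a rough illustration of what a mono-, di-, or trinucleotide composition involves, the Python sketch below counts every k-letter word in a sequence and reports both display options from the panel (number of occurrences and percent of all occurrences). It is not the program's implementation. The function name is hypothetical, and counting overlapping words (a window that advances one base at a time) is an assumption the manual does not confirm.

```python
from collections import Counter

def base_composition(sequence, k=2):
    """Count k-nucleotide words (k=1, 2 or 3) and express each as a percent of the total.

    Assumption: words overlap, so the window advances one base at a time.
    """
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    percents = {word: 100.0 * n / total for word, n in counts.items()} if total else {}
    return counts, percents

# Hypothetical six-base sequence: five overlapping dinucleotides, each seen once.
counts, percents = base_composition("AACGTT", k=2)
mono_counts, _ = base_composition("AACGTT", k=1)
```

A table output would list the counts or percents per word, while a graph output would plot the same numbers as bars.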
256. ping is shown for nucleic acid sequences, and a number of shortcuts are available using the sequence monitor instead of the Sequence menu. Also, it is possible to map keys and to turn speaking on or off using the buttons in the sequence monitor. The bottom two buttons in the sequence monitor (Figure 3 9, page 3 9) allow you to verify the current sequence.
Pressing Read Sequence will cause the program to start speaking the sequence from the position of the insertion point in the sequence. As the Gene Inspector reads each residue in the sequence, the position is updated in the sequence monitor and the individual character in the sequence editor is highlighted as it is spoken.
Footnote a: a special kind of window that cannot be hidden.
Page 3 9 The GI Sequence Editor
Pressing Confirm Re-entry will allow you to type the sequence into the computer a second time. As you type each character, it will be compared with what you entered the first time. If there is agreement between what you typed the first and second times, the program moves on to the next character. A disagreement will result in the computer beeping at you and keeping the character selected in the sequence editor.
Using either of these procedures makes it easy to enter and confirm any sequence by typing. However, if you already have a sequence in a file on a disk, you can import it easily as shown in Importing Sequences (page 3 19).
Multiple Sequence Alignments
The sequence
257. played in the window. In this particular case, the first three sequences have four lines displayed and the last two sequences (Octopus and Xenopus rhodopsins) have only three lines displayed in the sequence pane. Scroll the window through the sequence file using the scrollbar on the right side of the window and note how the segment indicator indicates the visible sequences.
9. Click in the overview. Note how the position of the segment indicator changes and how the sequence that is the target of the click in the overview pane is now selected in the sequence editor part of the window. This navigational tool becomes more useful as you put more and longer sequences into a single sequence file.
10. You can try other formatting options in this window using available items in the Format menu; for example, you can change the color or style. To select an entire sequence, click on the name of the sequence in the name column. Note that in the multiple sequence file each sequence line has a name next to it, so that you are never confused about what you are viewing.
Footnote f: Your display might look slightly different from this figure.
Page 2 12 Tutorials Editing Sequences
This concludes this tutorial. You may quit or continue on to the next tutorial. If you choose to continue, close all open windows now.
Page 2 13 Tutorials Using Analysis Setups
TUTORIAL 3 USING ANALYSIS SETUPS
This tutorial focuses on running analyses. Together with
258. porarily to the sequence window. The Match Adornments window will remain in front so you can experiment with different methods of displaying your sequence. The Cancel button will leave the settings as they were in the sequence window. The OK button will accept the new custom adornments that have been defined.
Page 3 15 The GI Sequence Editor
Creating a Features Object View of a Sequence
The text on page 3 13 describes a quick and easy way to capture part of the sequence editor window for display in the GI Notebook or in another program as a picture, by using the option key and dragging a selection in the sequence editor to be copied. If you want to actually use sequence data and alter the display of the output, you can create a Features Object in the GI Notebook. This Features object can contain one or more nucleic acid or protein sequences along with a translation and cleavage site indicators. Like other GI Notebook objects, its appearance can be altered.
(Figure: a Features object showing a nucleotide sequence with its translation, marked sites, and an intron labeled.)
The first step is to select, in the sequence editor window, the sequence segment you want to display in the GI Notebook as a Features object. Choose
259. program. This means that you are free to do whatever you want, whenever you want to do it. For example, when an analysis is running, you can continue to take notes or draw in the GI Notebook. You can launch additional analyses or even switch to another application while Gene Inspector continues to run the analyses you already started. This provides you with a great deal of flexibility in how you work with the program. You are never locked out from doing something else because an analysis is running.
The trade-off for having an interactive program is that some of the operations are slower, because the computer must constantly be watching the keyboard and the mouse for any user input. To work effectively with the Gene Inspector, you should learn to start analyses and then continue with other work in GI or in other applications. This might take some getting used to, because there is a natural tendency to watch an analysis run. Because GI does not prevent you from doing other work, you are never slowed down by its operation.
Temporarily Pausing Long Running Analyses
Sometimes you might have a time-consuming analysis executing and realize that you need to get some other result before waiting for the running analysis to complete, yet you do not want to cancel the analysis that has been running for a while. You can do this by taking advantage of the High Priority option (see also page 4 3). This allows you to put on hold any running
260. proteins.
Signal Sequence
This is a type of sliding window hydropathy analysis designed to reflect regions of a sequence that have potential for lipid-protein interactions. It is best used to examine membrane proteins and signal sequences on peptides. The analysis is based on values from von Heijne, Eur. J. Biochem. 116, 419 (1981). The Gene Inspector does not limit matches to amino-terminal ends of proteins but will find matches in any location along the peptide.
Sliding Window
Figure 4 68 Sliding Window Setup (Analysis Setup panel with Calculation method: Sliding Window Average with a Window Size, or Median Sieving with a Mesh Size; Table: User charged amino acids; Show Icons check box)
Panel help text: Window size is the number of adjacent amino acids whose property is calculated in each iteration. After calculating the value for the first window of amino acids, the window is moved one residue along the sequence and a value is calculated again for the new window of amino acids. Median Sieving emphasizes data having a specific distribution (J. A. Bangham, Anal. Biochem. 174, 142 (1988)).
A sliding window analysis scans along the length of a sequence and evaluates the residues for a specific property. The property being determined is based on the values in a user-defined table. Thus the analysis displays a property of a peptide
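The sliding window average described in the panel help text can be sketched as follows. This Python example is illustrative only: the function name, the tiny charge table, and the test peptide are assumptions, and residues missing from the table are scored as 0, a choice the manual does not specify.

```python
def sliding_window_average(peptide, table, window=10, default=0.0):
    """Average a per-residue property over each window, moving one residue at a time."""
    values = [table.get(residue, default) for residue in peptide.upper()]
    if window < 1 or window > len(values):
        return []
    total = sum(values[:window])               # value for the first window
    averages = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]  # slide one residue along the sequence
        averages.append(total / window)
    return averages

# Hypothetical user-defined table: +1 for basic residues, -1 for acidic ones.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}

profile = sliding_window_average("KKAADE", CHARGE, window=3)
```

Each output value corresponds to one window position, so a peptide of length n analyzed with window size w yields n - w + 1 points to plot.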
261. pup allows you to choose the database to be searched. Figure 4-45 shows the dialog for the nucleic acid BLAST search. The Query popup menu allows you to define the database that is to be searched. Align best matches will, in addition to a list of top matches, present a number of alignments of these top matches. Filter out low complexity matches will perform filtering as described on the BLAST web site; it basically removes matches with sequences that occur frequently in the database. These may or may not be important to

[Figure 4-45: BLAST DNA search. Panel fields: Query (nucleic acid vs nucleic acid sequence database), Filter out low complexity matches, Expect value, Word Size, Gap Cost (Existence, Extension), Align best matches (number of best matches to align), Number of hits to keep. Query defines the database to be searched. Align best matches aligns the top matches indicated in the box to the right. Filter out low complexity matches ignores matches found frequently in the database. Request Results as HTML allows you to save results to open in a web browser to take advantage of links. You can specify the subject for the return email and have the results sent to your email address or a GI notebook.]
262. put object changes to match the style you just defined. The style sheet defined the style of text and, when applied to the output object, it caused all of the text in the object to take the new style.

4. Double-click on the second output object to make it the target. Let's change the styles of some of the other parts of the graph. Click once on the part of the targeted object you want to alter (e.g. the title or an axis label) and then use the Format menu to try font, style, and color combinations. You can select multiple items in the targeted object by shift-clicking: click on one object and then hold down the shift key while clicking on additional objects; all objects clicked will be selected. Once you have the graph looking the way you like, choose Format > Style Sheets > Add Style Sheet and, if presented with a choice, use the entire object button to define a style sheet named Graph Format. This style sheet represents all the formatting information of all the items in the output object.

5. Select a different graph object (click once) and then apply the new style sheet you just created by choosing it from the Style Sheets submenu, as before.

h. If any part of the targeted object is currently selected, you will be given a choice of adding styles corresponding to the selection or to the entire object. If nothing is selected in the targeted object, you can only create a style sheet corresponding to the entire object.
263. quence. It provides you with a graphic overview of the matches found and serves as a starting point for further analysis. Without using the Summary Analysis option in the setup panel, the results would have been displayed as nine separate Find Sequence analyses.

[Figure 2-21: Choosing Sequences for Summary Analysis]

[Figure 2-22: Summary Result Output Object (Find sequence summary for the acetylcholine receptors file, with Matches and Locations columns)]

7. Double-click on the output object to target it and then click on the Chick musc Ach Recpt item. Now choose the Object > Search Selected Sequ
264. quence Editor document for easy reference. This has been done in the vector sequence files provided with Gene Inspector: all vectors from a given vendor are grouped into a single file. At a more advanced level, you can take advantage of the Sequence Editor's ability to contain multiple sequences to generate calculated sequences, such as a consensus sequence, and to perform multiple sequence alignments (see Align Multiple Sequences, page 4-20 and page 4-52). Hotlinks (see Links, page 6-34) enable a sequence to be actively connected directly to one or more analyses. By hotlinking a contig or a consensus sequence to a set of analyses, it is possible to generate new and up-to-date analyses automatically from these generated sequences as your contig grows or as you refine the consensus sequence.

Analysis Setups
Analysis Setups provide a container in which single or multiple analyses on one or more sequences can be defined. Through the ability to install new Analysis Setup Windows as menu items, you can assemble a set of custom analyses (an analysis suite) having all the parameters defined in a way which works best for your purposes. The entire analysis suite can be reused simply by selecting the item from a menu. Because of its similarity to the way you are used to working in other applications and operating systems, you already know how to use Gene Inspector's Analysi
265. quences, choose the peptide file Drosophila Hsps and select both Dros hsp26 and Dros hsp27 for the alignment. Please refer to previous tutorials if you are not sure how to do this.

4. Run the analysis by pressing the Run button.

5. The output object that gets placed into your notebook is rather large to begin with, and some of the alignment is invisible because it is below the bottom edge of the object. Double-click on the output object to select it. Hold down the option key (Mac) or alt key (Windows) and drag the mouse vertically over the alignment object. The cursor changes to a hand and allows you to move the contents of the object around within the object's borders. Option/alt-dragging will work in any output object that has more data than is visible within the object boundary.

6. One alternative to option/alt-dragging is to put the object into an appendix window. Select the alignment object and then choose Notebook > Appendices > Move Object to Appendices. The dialog shown in Figure 2-33, page 2-44, will appear. The action you are taking will create a new appendix window to con

i. Another way to do this is to open the object in its own window (see Open for Editing, page 5-14).

[Figure 2-33: Move To Appendix dialog, with a "Replace object with alias to appendix" checkbox and a field to specify the text for the alias]
266. quences from Sequence Editor alignments directly within the sequence editor document.

1. Open the peptide sequence file called rhodopsins.

2. Choose Sequence > Alignment > Align All Sequences (see note g). This will bring up Figure 2-17. The parameters displayed in this dialog are described in more detail elsewhere (Multiple Sequence Alignments, page 3-10) and represent values needed by the algorithm to perform the alignment. For now, choose the BLOSUM62 table and leave all the parameters at their default values. The analysis table is chosen using a popup menu. Press and hold the mouse button down on the word None, next to the word Table at the top left of the analysis panel. This will cause the appearance of a popup menu. Choose Standard > BLOSUM62.

3. Press the Align button to start the process of aligning all the sequences in the document. Progress is indicated in the upper left corner of the sequence window, just above the column of sequence names.

4. After the alignment is complete, you will see the sequences aligned in the sequence document window. Our goal will be to use the Gene Inspector features to change the view to look like that shown in Figure 2-18.

5. Choose Sequence > Display

g. You can align a subset of sequences in a given file, or sequences from multiple files, by performing a multiple sequence analysis (Multiple Sequence Alignments, page 3-10).
267. quit or continue on to the next tutorial. If you choose to continue, close all open windows now.

TUTORIAL 17: TESTCODE, AN INTERACTIVE ANALYSIS
The Gene Inspector is an interactive application. This means that you can perform almost any function at almost any time. It also means that you can interact with some output objects to alter their appearance, rerun an analysis, or even continue the analytical process, as described in this tutorial for TestCode and in Tutorial 18: Dot Matrix Analysis, Another Interactive Analysis, page 2-58.

1. The output from the TestCode analysis can be used to generate additional information or to launch further analyses. This ability is in addition to the ability to recalculate each analysis. Choose Analysis > New Analysis and select the nucleic acid analysis called TestCode.

2. TestCode is used to determine if an open reading frame is likely to actually code for a protein. More details about the analysis can be found in TestCode, page 4-44. The TestCode panel is shown in Figure 2-39. We will not

[Figure 2-39: TestCode setup panel. Fields: Window size (nucs), Replace ambiguous characters with, Method (Start and Stop Codons, or Only Stop Codons), Min length ORF to consider (amino acids), Display]
268. r multiple sequence alignment analyses. In this analysis, the aligned sequences are placed in a new sequence window.

Adding Analyses to an Analysis Setup Window
This topic was the focus of Tutorial 11: Adding More Analyses to a Setup, page 2-40. After an Analysis Setup Window is created and is still open, additional analyses can be added to it. This is done using the Analysis > Add Another Analysis menu item. Additional analyses are added using the Analysis Chooser, just as for creating a new analysis. Analyses can be removed from the analysis setup by selecting the analysis icon to be removed on the left of the Analysis Setup Window and then using the Analysis > Remove Analysis menu item. As analyses and sequences are added to or removed from the Analysis Setup Window, the total number of analyses, total number of input sequences, and total number of analysis output objects are continually displayed in a box at the top of the Analysis Setup Window (see Figure 4-4, page 4-5).

Adding Analysis Setups to the Menu
This topic was discussed in Tutorial 13: Customizing and Saving Analysis Setup Suites, page 2-46. Once you have defined all the parameters for a given Analysis Setup Window and all its panels, you can add the entire Setup to the Analysis menu, where it will be easily available for future access. Choose Analysis > Add Setup to Menu As and then name the Analysis Setup when prompted to do so. You can later recall
269. r a query sequence in both upper and lower case letters and then you allow mismatches to occur, the mismatches will only be allowed to occur in the lower case characters. For example, if the search sequence tataaa were entered as TATAaa and one mismatch was allowed, the matches discovered in Figure 4-31 would only consist of the match starting at 572, the only one having a mismatch in the last two characters.

Two other points should be mentioned about the output object. Notice that the output position indicators are shaped like golf clubs (or hockey sticks, if you are from a colder climate). The horizontal part of the output indicator corresponds to the length of the search sequence segment. The vertical part of the indicator is the actual site of the match. As with other outputs in the Gene Inspector, if the entire table or graphical output does not fit in the output object box, you can drag the data within the output box by holding the option key down and dragging with the mouse.

The Show summary results checkbox in Figure 4-29 on page 4-34 will create a single output containing the found matches in all the sequences you have chosen. The output from a summary inverted repeat analysis is shown in Figure 4-32.

[Figure 4-32: Find sequence summary analysis]

For this analysis, the summary results are presented in table format. To see the result of an i
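The upper/lower case rule for mismatches can be illustrated with a short Python sketch. It is a plain brute-force scan written for this example, not Gene Inspector's search algorithm; the function name, the 1-based positions it returns, and the target sequence are all hypothetical.

```python
def find_matches(query, target, max_mismatches):
    """Return 1-based start positions where query matches target.

    Upper-case query characters must match exactly; mismatches are
    tolerated only at lower-case query positions.
    """
    target = target.upper()
    hits = []
    for start in range(len(target) - len(query) + 1):
        mismatches = 0
        for q, t in zip(query, target[start:start + len(query)]):
            if q.upper() == t:
                continue
            if q.isupper():
                # A mismatch at a "locked" upper-case position rejects the site.
                mismatches = max_mismatches + 1
                break
            mismatches += 1
        if mismatches <= max_mismatches:
            hits.append(start + 1)
    return hits


# Made-up target sequence for illustration only.
target = "GGTATAACCTATAAAGTCTAAA"
print(find_matches("TATAaa", target, 1))  # mismatches allowed only in the last two bases
print(find_matches("tataaa", target, 1))  # mismatches allowed anywhere
```

Entering the whole query in lower case relaxes every position, so the second call can report sites that the first call rejects.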
270. r this tutorial just double-click on the octopus rhodopsin sequence in the bottom left. Octopus rhodopsin will appear in the bottom right list of Chosen files and sequences. One or more sequences can be chosen from one or more sequence files. The analysis will be performed on every sequence in this list on the right. Also choose Lamprey rhodopsin and add it to the Chosen files and sequences list on the right. Press the Done button to indicate that you have no more sequences to be analyzed.

5. Finally, you need to specify a location for the output from the analysis. This is done using the Output Location panel, which can be selected from the icon list on the left and is shown in Figure 2-11.

[Figure 2-11: Output Location Panel. The Output Location allows the destination for the analysis outputs to be defined. Depending on the situation, this panel can be used to replace a current analysis output or to create new outputs in various locations.]

Using the popup menu in this panel, you can specify that the output generated will be placed in a
271. raphic object that was created in another program and pasted into the notebook to serve as a recognizable marker, in this case as a graphic that can easily identify the location of a bookmark. Graphic objects like this one can be stored within the notebook for easy access. Choose Notebook > Tool Extensions > dot marker. The mouse cursor will change to a tool extension pointer. Click the mouse button on the notebook sheet to the right of the background text. A new dot marker appears. You can easily add tool extensions of your own creation as a means of storing graphics and other objects you might want to use repeatedly.

6. Bookmarks can be used to navigate to different locations in the notebook. Choose Notebook > Bookmarks > Analyzing the Peptide. This bookmark will take you to a location in the notebook which contains peptide analysis results. These analyses were generated by the Gene Inspector and represent the results of analyses of a peptide coded for by the DNA being cloned in the cloning project that is the subject of this notebook. Performing analyses is discussed in Tutorial 3: Using Analysis Setups, page 2-14. For this tutorial we are focusing just on the GI Notebook.

7. Let's return to the top of the notebook again by choosing Notebook > Bookmarks > Objectives. Since bookmarks can be given meaningful names and attached to any object in the
272. re slightly different, as shown at the right.

[File menu screenshot; legible items include Import, Export, Page Setup, Print Notebook and Appendixes, Set Alias Resolution Rules, and Exit (Ctrl+Q)]

The menu items are:

• New
New allows you to create new GI Notebooks, nucleic acid sequence documents, or protein sequence documents. It will bring up the dialog box shown in Figure 6-1. Use the radio buttons in the top part of this dialog to choose the document type you wish to create and then enter the name to be used for the new window. Note that this does not create a file on disk and does not save what you will enter into the window. This just creates a new window for you. To save the contents, choose File > Save As (see Save As, page 6-3).

[Figure 6-1: Create New Dialog, with file type options Notebook, Peptide Sequence, and Nucleic Acid Sequence]

• Open
Use Open to open an existing document file on disk. As shown in Figure 6-2, page 6-3, you can open any of the three types of documents Gene Inspector can create. The check boxes at the bottom of the window will determine the kinds of document which will appear in the scrolling area of the window. If you have only Notebooks checked, then all you will see in the

[Open dialog screenshot listing DNA sequence files]
273. refer to Tutorial 3: Using Analysis Setups, page 2-14. For our purposes in this tutorial, we will do only a simple mononucleotide composition as a graph, as shown in Figure 2-12, page 2-19.

6. Because the sequences were selected in the sequence editor document, they are already entered as sequences to be analyzed, as shown in Figure 2-13. If you had selected a range of nucleotides in a sequence, that range would be indicated and the Segment button would be on.

[Figure 2-13: Sequences Are Already Chosen. The Chosen sequences list shows Lamprey rhodopsin, Octopus rhodopsin, and Xenopus rhodopsin. Add and Remove buttons are used to define the Chosen sequences. Clicking on a sequence in the list displays its location (Path). Entire sequence will run all analyses using the entire length of the chosen sequence. The Segment button allows the analyses to operate on only a part of the entire sequence. A symbol in the Chosen sequences list indicates that the segment is truncated by a stop codon.]

7. Run th
274. res Objects
Features objects are designed to enable the display of annotated sequence information within the GI Notebook. One or more protein or DNA sequences can be displayed and then enhanced. This object is discussed in detail in Creating a Features Object View of a Sequence, page 3-16. If you have multiple sequences in your features object, you will see a Sequences menu (Sequences Menu, page 6-44), while a single sequence features object has a Features menu (Features Menu, page 6-39).

Appendix Objects
Any GI Notebook object can be moved into an appendix. When the object is moved to an appendix, an alias to the appendix is created in the notebook. The alias in the notebook can be used to access the appendix. An appendix contains the full object in a separate window which is associated with the notebook. Double-clicking on the alias will open the corresponding appendix, in the same way an alias can open a file in the Finder. An object can be moved to an appendix by selecting the object and then choosing Notebook > Appendices > Move Object to Appendices. A list of all appendices will be available as part of the Notebook > Appendices menu. Selecting the appendix name in the Appendices menu will open that appendix window. This process is described in more detail in Tutorial 12: Appendices, Hiding Large Amounts of Data, page 2-43. Appendices can be used to store large objects s
275. rial. You may quit or continue on to the next tutorial. If you choose to continue, close all open windows now.

TUTORIAL 20: CREATING YOUR OWN ANALYSIS TABLES
The Gene Inspector allows you to create a number of different kinds of tables for use in analyses. The Gene Inspector's built-in table editor makes it easy.

1. Choose Analysis > Tables > Create New and you will see a dialog like that shown in Figure 2-45. For this tutorial, choose to create an empty nucleic acid table by selecting the items shown in the figure. In this case we will be creating a scoring table for aligning 2 sequences. The table we are creating is for nucleic acids, but the same procedure would be followed for amino acid tables as well. Press the New button when you are ready. If you had selected the Nucleotide Identity table in the list on the right, the values from the Nucleotide Identity table would be entered into the new table for you to modify. An <empty> table will be filled with zeros.

[Figure 2-45: Creating a New Table]

2. Pressing the New button generates a nucleotide comparison table. This table contains all pairwise co
276. roblem and to provide some statistical information about the alignment, Gene Inspector provides a way to calculate a Z score. To generate a Z score, several steps are performed. First, one of the sequences (call it sequence A) is shuffled: the bases in the sequence are scrambled into a random order. This preserves the base composition but not the sequence. An alignment is now done between the shuffled sequence A and the non-shuffled sequence B, and an alignment score is calculated. The process is then repeated with a new version of a shuffled A being compared to a non-shuffled B. The process is repeated a number of times, and a mean score and a standard deviation of these alignment scores are calculated. The Z score is the number of standard deviations the true alignment score is away from the mean score of the shuffled alignments. Gene Inspector also shows the number of alignments with the shuffled sequence that had a better score than the true alignment using the unshuffled sequences. Using the input panel (Figure 4-12, page 4-17), you can decide whether or not to calculate a Z score and how many iterations should be done. Calculating a Z score can take quite a bit of time because the program needs to perform many additional alignments. Note that alignment times can be quite lengthy if you choose long sequences. The time to perform an alignment calculation is proportional to the product of the lengths of
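The shuffling procedure for the Z score can be sketched as follows. This is a minimal illustration under stated assumptions, not Gene Inspector's implementation: `identity_score` is a deliberately simple stand-in for a real alignment score (it just counts identical positions without gaps), and the use of the sample standard deviation, the fixed random seed, and all function names are choices made for the example.

```python
import random
import statistics


def identity_score(a, b):
    """Stand-in score: number of identical positions (no gaps)."""
    return sum(1 for x, y in zip(a, b) if x == y)


def z_score(seq_a, seq_b, score=identity_score, iterations=100, seed=0):
    """Return (z, better): the Z score of the true alignment score and the
    number of shuffled comparisons that scored better than the true one."""
    if iterations < 2:
        raise ValueError("at least two iterations are needed")
    rng = random.Random(seed)
    true_score = score(seq_a, seq_b)
    shuffled_scores = []
    for _ in range(iterations):
        residues = list(seq_a)
        rng.shuffle(residues)  # keeps the composition, scrambles the order
        shuffled_scores.append(score("".join(residues), seq_b))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.stdev(shuffled_scores)
    better = sum(1 for s in shuffled_scores if s > true_score)
    # The Z score is undefined when every shuffled score is identical.
    z = (true_score - mean) / sd if sd > 0 else None
    return z, better


z, better = z_score("ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGT")
print(z, better)
```

Each iteration costs one extra alignment, which is why the text warns that Z scores take time: with a real alignment score, every iteration costs time proportional to the product of the sequence lengths.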
277. run

• Set Alias Resolution Rules (Mac only)
Many users on the Macintosh store files remotely. When the operating system tries to find a file on a remote volume and the volume is not readily found, the OS will try a number of ways of accessing the volume. Sometimes this can take a long time. This dialog will allow you to specify how hard the Mac should try to find the wayward volume.

• Quit (Mac) / Exit (Windows)
Quits the application and closes all open windows. If any changes have been made, you will be offered an opportunity to save them before the windows are closed.

Edit Menu
The items in the Edit menu are standard operating system options.

[Edit menu screenshot; legible items include Undo, Cut, Special Paste, Select All, Show Selection, Find & Replace, Drag & Drop Options, and Show Clipboard]

• Undo
Undo allows you to undo the last operation you performed.

• Cut
Cut transfers the current selection to the clipboard. Once on the clipboard, it can be pasted elsewhere. If pasted into a document from an application other than Gene Inspector, you will be able to retrieve either the text of the clipboard object or a picture of the clipboard object, depending on the receiving application. Other applications do not understand the information that is used internally within the Gene Inspector (how to format a sequence, parameters for running analyses, style sheets, etc.) and
278. s Setup Window. There is a scrollable list of icons on the left representing different functions. When an icon is selected from this list, relevant information about the selected icon's function is displayed in a panel on the right. Through this mechanism you can choose an Analysis icon, define analysis parameters, specify sequences to be analyzed, and define output locations. Each analysis has its own icon and corresponding panel. The Input icon allows you to define which sequence or sequences are to be used in the analyses. You can choose sequences from one or more files. Every sequence chosen will be analyzed by every analysis listed in the Analysis Setup Window. The Output icon gives you the ability to decide which GI Notebook will be used to receive the analysis results. You may also specify where within a given GI Notebook the results should be placed. Because an Analysis Setup Window can be named and added to the Analysis menu once a set of parameters has been optimized for your particular needs and output styles have been defined, it can be accessed by anyone in the lab, even new members of your group who may not be all that familiar with what is important to your particular analyses.

GI Notebook
The GI Notebook is designed to serve three main functions:
• It can serve as a day-to-day electronic laboratory notebook.
• It is the place where output from analyses is placed.
• It c
279. s are shown in Figure 4-64. The site names are on the left, and vertical tick marks along the horizontal lines are used to identify the locations of sites. A row can be selected by clicking with the mouse after targeting the object (as shown for CAMP_PHOSP in Figure 4-64, page 4-65), and the information about the motif obtained by choosing Object > Get Info About Selection. This brings up Figure 4-65, describing what is known about the particular motif. You can also view the output as a table by choosing Object > View as Table when the output is targeted. This is shown in Figure 4-66. The position of the

[Motif information window: CK2_PHOSPHO_SITE]
Casein kinase II (CK-2) is a protein serine/threonine kinase whose activity is independent of cyclic nucleotides and calcium. CK-2 phosphorylates many different proteins. The substrate specificity [1] of this enzyme can be summarized as follows:
(1) Under comparable conditions Ser is favored over Thr.
(2) An acidic residue (either Asp or Glu) must be present three residues from the C-terminal of the phosphate acceptor site.
(3) Additional acidic residues in positions +1, +2, +4 and +5 increase the phosphorylation rate. Most physiological substrates have at least one acidic residue in these positions.
(4) Asp is preferred to Glu as the provider of acidic determinants.
(5) A basic residue at the N-terminal of the acceptor site decreases the phosphorylation rate, while an acidic one will increase it
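Rules (1) and (2) above amount to a Ser or Thr acceptor followed, three residues later, by Asp or Glu. A minimal Python sketch of that minimal rule is shown below, using a regular expression with a lookahead so that overlapping candidate sites are all reported. The function name and the test peptide are made up for this example, and the sketch ignores the rate-modulating rules (3) to (5).

```python
import re

# Acceptor (S or T), any two residues, then an acidic residue (D or E).
# The zero-width lookahead lets overlapping candidate sites be found.
CK2_SITE = re.compile(r"(?=[ST]..[DE])")


def ck2_candidate_sites(peptide):
    """Return 1-based positions of candidate phosphate acceptor residues."""
    return [m.start() + 1 for m in CK2_SITE.finditer(peptide.upper())]


# Made-up peptide for illustration only.
print(ck2_candidate_sites("AASAAEAATGGD"))
```

A match only marks a candidate; as the motif description notes, additional acidic or basic residues around the site change the phosphorylation rate.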
280. s code available to the community so that sequence analysis programs can be made to read each other's files, and we are using it with his permission. Currently you can import the following 10 text file formats, in addition to the Gene Construction Kit and DNA Inspector IIe formats: DNAStrider, EMBL, Fitch, GCG, GenBank/GB, IG/Stanford, NBRF, Pearson/Fasta, PIR/CODATA, and Plain Text. Note that it is also possible to import sequences that are just plain text DNA or protein sequences using this method; they count as plain text in the Plain Text category. When sequences are imported, all inappropriate characters are filtered out. Note also that if you have the wrong format, you might end up with some characters from comments as part of the sequence. So if you are unsure of the format of your sequence file, edit it first with a word processor, save it as a text file, and then import it. Sequences that are imported can be placed into a new sequence document or can be added to existing sequence documents. Use the radio buttons at the bottom of the dialog (Figure 3-16, page 3-19) to define what the program will do with the sequence information it imports.

Generating Sequences
There are two methods of generating sequences in a sequence editor document. The first one is to insert Ns into a nucleic acid sequence or to insert Xs into an amino acid sequence. This is accomplished by choosing Sequence > InsertNs or InsertXs. You will be
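The filtering of inappropriate characters during plain text import, mentioned earlier in this passage, can be pictured with the small Python sketch below. The allowed alphabet (the IUPAC nucleotide codes) and the function name are assumptions made for the example; the manual does not say which characters Gene Inspector keeps.

```python
# Assumed alphabet: IUPAC nucleotide codes (hypothetical choice for this sketch).
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def filter_sequence(text, alphabet=IUPAC_NUCLEOTIDES):
    """Keep only characters that belong to the sequence alphabet."""
    return "".join(ch for ch in text.upper() if ch in alphabet)


# Digits, spaces and line breaks are dropped.
print(filter_sequence("  1 acgtnn acgt 10\n"))
# A comment word read as sequence leaves stray letters behind.
print(filter_sequence("ORIGIN"))
```

The second call shows why the text recommends checking the file format first: letters from a comment or header line that happen to be valid codes survive the filter and end up in the sequence.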
281. s information about the set of options in the current Analysis Setup Window. This includes the number of analyses chosen, the number of input sequences, and the number of output objects that would be generated by this analysis setup. These were discussed in Tutorial 3: Using Analysis Setups, page 2-14.

The Analysis Monitor
The analysis monitor provides information about the analysis you are running and is shown in Figure 4-3, page 4-4. You can view the analysis monitor by choosing Analysis > Show Analysis Monitor. This window provides information about the state of the analysis, the name of the analysis in the queue, the percentage complete, and whether the analysis is High or Low priority.

Figure 4-3: The Analysis Monitor
State   Analysis Name                         Finished   Priority
Run     Dot matrix Dros hsp70 & Dros hsp27    28         High
Run     Dot matrix Dros hsp70 & Dros hsp26    23         High
Pause   Dot matrix Dros hsp70 & Dros hsp83    0          Low
Pause   Dot matrix Dros hsp70 & Dros hsp27    0          Low
Pause   Dot matrix Dros hsp70 & Dros hsp26    0          Low

The High Priority button in the setup panel can be used to temporarily pause any running low priority analysis and start running the high priority analysis defined in the current Analysis Setup (see also Analyses That Take a Long Time, page 7-3). Pressing Run with the high priority button checked will create a situation
282. s is a useful command if the editing operations you perform on the table cause it to shrink or grow.

Chapter 7: Tips For Using The Gene Inspector

Using Extra Disk Space for Analyses
The Gene Inspector has been designed to allow you to work with very large sequences. We have done this in anticipation of results from the Human Genome Project. Many other sequence analysis packages limit you to working with what fits in the RAM of your computer. Handling analysis of sequences larger than your available memory requires storing sequences on disk and storing temporary data on disk while the analysis is being run. Some analyses, like dot matrix and global sequence alignment, require a significant amount of disk space, roughly proportional to the product of the lengths of the sequences. As a consequence of using disk space instead of RAM for storing files during analysis, some of the analyses (those which require a lot of reading and writing of information) will be slower in the Gene Inspector than in other programs. This is a trade-off for allowing you to work with large sequences. The Gene Inspector stores its temporary working files in the GI Data folder, in a folder called Scratch Data. The GI Data folder needs to be in the same folder as the Gene Inspector itself. All temporary files are stored in the Scratch Data folder during an analysis. If you have a lot of extra disk spa
283. s with a given peptide sequence (see Prosite Motif Search, page 4-64). It is an alternative to the Find Sequence analysis. The setup is shown in Figure 4-52. In this setup panel you can define query sequences using the language shown in the window (see also Prosite Language Definitions, page A-5). In the example shown here, the search is for [RKDE](2,5). This means that we are looking for a stretch of 2 to 5 charged amino acids (Arg, Lys, Asp, Glu). As the other examples given in this window illustrate, you can be very specific in your search criteria. As shown in Figure 4-53, the output indicates positions of the matches. By choosing Object > View As Table, you can see the positions as shown in Figure 4-54. The query sequence is shown at the top of the second column, and every subsequent line shows a match with that query sequence. This analysis can be run as a summary analysis. See page 4-37 for more details.

[Figure 4-53: Find Sequence Prosite Style Graphic Output (Chick LDH; position axis from 1 to 301)]

[Figure 4-54: Find Sequence Prosite-style Table Output (Chick LDH; columns: First aa, Matching sequence)]

GOR Structure Prediction
The GOR analysis is based on the paper by Garnier, Osguthorpe and Robson (J. Mol. Biol. 120:97-120, 1978), which
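A query such as the charged-residue example above, written here in PROSITE-like notation as [RKDE](2,5) (the exact typography of the original is an assumption), maps naturally onto a regular expression. The Python sketch below translates a small subset of that notation: residue sets in square brackets, excluded sets in braces, x for any residue, and repeat counts in parentheses. It is not the Gene Inspector query language, whose full definition is in the appendix cited in the text; the function names and the test peptide are hypothetical.

```python
import re

# One pattern element: a residue set, an excluded set, a single residue or x,
# optionally followed by a repeat count such as (2) or (2,5).
ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = match.groups()
        if residue == "x":
            core = "."                          # any residue
        elif residue.startswith("{"):
            core = "[^" + residue[1:-1] + "]"   # any residue except these
        else:
            core = residue                      # literal residue or [set]
        if low is None:
            repeat = ""
        elif high is None:
            repeat = f"{{{low}}}"
        else:
            repeat = f"{{{low},{high}}}"
        parts.append(core + repeat)
    return "".join(parts)


def find_motif(pattern, peptide):
    """Return (1-based first residue, matching sequence) for each match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(peptide.upper())]


# Made-up peptide for illustration only.
print(find_motif("[RKDE](2,5)", "AAKDEAAARKAA"))
```

The result pairs mirror the two columns of the table view described above: the position of the first amino acid and the matching sequence.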
284. se the rest of this panel will be as it was originally; however, all you need to do is select a new sequence for analysis. If no analysis has been run, this menu option will not be enabled.
• Show/Hide Analysis Monitor
Show/Hide Analysis Monitor allows you to either show or hide the Analysis Monitor. The Analysis Monitor is shown in Figure 6-12 (page 6-22). It contains a list of the analyses which are currently scheduled to run. The top item in the list is the analysis currently in progress. The percentage complete is also indicated. One can click on an analysis name to select it and then press Cancel to remove that analysis from the queue. Analyses can be classified as low or high priority. By default, all analyses enter the list as low priority analyses. Starting an analysis as high priority can be done as shown in Figure 6-13.
Page 6-21
Figure 6-12. The Analysis Monitor (columns: State, Analysis Name, Finished, Priority)
When the high priority analysis is Run, it will enter the analysis queue at the top and will start to execute immediately. Any runn
285. ... 6-14
Show Selection (Edit Menu) 6-11
show summary results
  find sequence 4-37
  inverted repeats 4-32
Show/Hide Analysis Monitor (Analysis Menu) 6-21
Show/Hide Column Headers (Table Menu) 6-53
Show/Hide Page Breaks (Edit Menu) 6-14
Show/Hide Row Headers (Table Menu) 6-53
Show/Hide Sequence Monitor (Sequence Menu) 6-47
side chain flexibility 4-67
side chain protrusion 4-69
Sidebar Menu
  adjust size to contents 6-52
Index 16
Sidebar ... 2-4, 5-11
... 4-68
site markers, show/hide in features object 3-18
Size (Format Menu) 6-17
... 6-16
sliding window
  accessible surface area 4-48
  antigenicity 4-54
  hydration potential 4-62
  hydropathy 4-60
  membrane buried regions 4-63
  optimal matching hydrophobicity 4-63
  overview 4-68
  ... 4-67
  side chain protrusion 4-69
  ... 4-68
286. Figure 2-33. Creating an Appendix
tain the analysis object and will also create a new alias (Mac) or shortcut (Windows) in the GI Notebook to the appendix object. The alias will be placed in the GI Notebook instead of the original object and will point to the original object, just like an alias in the Finder. Fill in the text in the dialog box to suit your needs and then press OK. You will see an alias that looks like Figure 2-34.
Figure 2-34. A Notebook Alias Object (callouts: type of object pointed to by the alias; text you entered to describe the alias)
7. The resulting GI Notebook alias can be framed or styled just like any other notebook object. To edit the alias text, double-click on the text in the alias and then edit the text.
8. To view the contents of the appendix, either double-click on the appendix icon or choose the name of the appendix from the Notebook > Appendices submenu. Any appendix you create will be in this menu. By placing information into an appendix, it can be viewed from any place in the notebook. This is a convenient way to store often-used information like buffers.
9. To return an appendix to the GI Notebook and remove it from the appendix area, choose Notebook > Appendices > Return Appendix to Notebook.
Page 2-44
10. You can make a
287. sible to create GI Notebook sheets that are larger than one printer page in size. If you would like to see how the objects in a notebook sheet will be distributed over a printer page boundary, you can Show Page Breaks. With page breaks displayed, you can arrange objects on the GI Notebook sheet in such a way as to minimize the number of objects that actually cross the printer page boundary. The placement of the printer page indicators is calculated using the information provided to the Gene Inspector through the Page Setup dialog (page 6-7). Page breaks in sequence documents are also useful to determine where a sequence will be broken when you print it. By changing font and font size in a sequence document and monitoring the location of page breaks, you can get exactly what you want to print.
Page 6-14
Windows Menu
• Stack Windows
Stack Windows will organize all open windows within the Gene Inspector and stack them neatly with all their title bars visible. This is sometimes desirable when many windows are open, to help you find windows easily.
• Current Window Names
After the Stack Windows item will be a list of all currently open windows. Selecting one of the window names will bring that window to the front to be edited.
Page 6-15
Format Menu
The Format menu contains formatting options that pertain to the formatting of a large number of objects or par
288. sly this is not a true biological intron, but is being used here just to illustrate the functioning of the Features object.
Page 2-51
sequence.
9. Your display should now look like that in Figure 2-36 (page 2-52). Notice how the translation skips over the intron and how, even though the codon is interrupted by the intron, it is reconstructed by the program.
Figure 2-36. Features Object with Translation
10. We need to adjust the grouping of the DNA sequence characters to align better with the translation. Select nucleotides 1-17 and choose Features > Grouping > Groups Of Three. Notice how the intron grouping is adjusted to remain in groups of ten, but now the grouping starts with the first character of the intron.
11. Select nucleotides 98-150 in the second exon and group it by threes as you did in the previous step.
12. Select the peptide sequence by clicking on it once. Using the Format menu, change the peptide sequence to Arial 9 point italic and color it red. Notice that the numbering style is set to match the actual peptide sequence. This is the only way the peptide numbering style can be changed.
13. Now let's add some restric
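The behavior noted in step 9, where a codon interrupted by the intron is reconstructed in the translation, can be illustrated with a short sketch. It joins the exon segments before translating, so the split codon is read across the exon boundary. The toy sequence, the coordinates, and the function names are hypothetical and are not the tutorial's data; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def splice(sequence: str, exons):
    """Join exon segments given as 1-based inclusive (start, end) pairs."""
    return "".join(sequence[start - 1:end] for start, end in exons)


def translate_exons(sequence: str, exons) -> str:
    """Translate the spliced exons; a trailing partial codon is ignored."""
    coding = splice(sequence.upper(), exons)
    usable = len(coding) - len(coding) % 3
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    # Exon 1 ends with "GC"; exon 2 starts with "T", completing the codon GCT.
    toy = "ATGGC" + "GTAAGTCTCAG" + "TCTGTAA"
    print(translate_exons(toy, [(1, 5), (17, 23)]))
```

Because translation happens after splicing, no special case is needed for the interrupted codon; the intron simply never appears in the string being translated.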
289. st a sliding median though, and really represents a sieve that emphasizes features of a certain size. A sliding median is good at presenting rapid transitions within long-term events in signals in a noisy background. However, sometimes a median window can distort the data and present results in a misleading way by presenting sharp peaks that just represent an anomalous high value. Median sieving addresses this problem by looking at segments in a way in which multiple high values will keep their values and not be averaged, while at the same time requiring more than one high point to cause a peak in the plot. Bangham states this by saying that there are two disadvantages to sliding means. First, peculiar residues that do not share the properties of most of the amino acids in the domain may prevent its identification. Second, as a low-pass frequency filter, the running mean smooths sudden transitions from one domain or phase to another. Data sieving is based on a running median and is characterized by a single parameter, the mesh size, which controls its resolution.
Figure 4-8. Median Sieving (two Kyte & Doolittle hydropathy plots of Chick musc Ach Recpt; x-axis: Amino acid, y-axis: Hydrophobic)
Data sieving is very good at smoothing noisy data while maintaining an ability to detect
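The contrast between a running mean and a running median can be seen on a small made-up signal. This sketch shows only the running-median building block on which data sieving is based; it is not Bangham's sieve or the Gene Inspector's implementation, and the function names and sample values are hypothetical.

```python
from statistics import fmean, median


def running_mean(values, window):
    """Mean of each run of `window` adjacent values, moved one step at a time."""
    return [fmean(values[i:i + window]) for i in range(len(values) - window + 1)]


def running_median(values, window):
    """Median of each run of `window` adjacent values, moved one step at a time."""
    return [median(values[i:i + window]) for i in range(len(values) - window + 1)]


if __name__ == "__main__":
    # One anomalous high value (9), then a sudden transition from 0 to 5.
    signal = [0, 0, 0, 9, 0, 0, 0, 5, 5, 5, 5, 5]
    print("mean:  ", [round(v, 2) for v in running_mean(signal, 3)])
    print("median:", running_median(signal, 3))
```

With a window of 3, the mean smears the single outlier across three positions and rounds off the step, while the median ignores the lone outlier and keeps the transition sharp, which matches the two disadvantages of sliding means quoted above.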
290. st on the left and then pressing the Move >> button. To transfer all enzymes to the Sites to mark list, press the All >> button. The Mark cut sites or Mark recognition sites radio buttons will do as their names suggest, placing the position indicator in the output at either the cut site or the first nucleotide in the recognition site. The output is shown in Figure 4-39 (page 4-43). Each enzyme is represented on a single line, and a vertical tick mark represents either the cut site or the first nucleotide of the recognition sequence, depending on what you chose in the setup panel.
Page 4-42
Figure 4-39. Restriction Enzyme Digest Output (B. caldotenax LDH)
You can target the object and resize columns of information by dragging the border between the enzyme name and the actual map using the mouse. Choosing Object > View as Table will produce Figure 4-40 (page 4-43).
Figure 4-40. Restriction Enzyme Digest Output Table (B. caldotenax LDH; columns include Number of sites and Recognition site)
You can view any data not visible in this object by option-dragging in the table. Another method to view more
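The difference between marking cut sites and marking recognition sites can be sketched as follows. The recognition sequences and cut offsets for AluI (AG^CT) and EcoRI (G^AATTC) are standard; the function name, the enzyme dictionary, and the convention that a cut position is the 1-based number of the last nucleotide before the cut are assumptions for illustration, not the Gene Inspector's documented behavior.

```python
# Recognition sequence and number of bases before the cut on the top strand.
ENZYMES = {
    "AluI": ("AGCT", 2),     # AG^CT
    "EcoRI": ("GAATTC", 1),  # G^AATTC
}


def find_sites(sequence: str, enzyme: str, mark: str = "cut"):
    """Return 1-based positions of every site for `enzyme` on the top strand.

    mark="recognition" reports the first nucleotide of the recognition site;
    mark="cut" reports the last nucleotide before the cut (an assumed convention).
    """
    site, cut_offset = ENZYMES[enzyme]
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1 if mark == "recognition" else start + cut_offset)
        start = sequence.find(site, start + 1)  # step by one to allow overlaps
    return positions


if __name__ == "__main__":
    dna = "TTAGCTGGAATTCAGCT"
    for name in ENZYMES:
        print(name, find_sites(dna, name, "recognition"), find_sites(dna, name, "cut"))
```

Only the top strand is searched, which is sufficient for palindromic sites such as these two; enzymes with non-palindromic sites would also need a search of the reverse complement.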
291.   surrounding hydrophobicity 4-69
  temperature factor 4-70
  transmembrane helices 4-70
... 7-3
speak typing 3-7
Speak Typing (Sequence Menu) 6-46
Special Paste (Edit Menu) 6-10
speech parameters, defining 3-7
speech preferences 3-7
Speech Prefs (Sequence Menu) 6-47
Stack Windows (Windows Menu) 6-15
Standard tables 4-14
Style (Format Menu) 6-17
style sheets
  discussion 4-9
  overview 5-3
  tutorial 2-38, 2-39
Style Sheets (Format Menu) 6-20
summary ... 2-27
surrounding hydrophobicity 4-69
Sweet and Eisenberg table A-3
System requirements 1-4
Index 17
T
Table Menu 6-53
  add column(s) at right 6-54
  add row(s) at bottom 6-54
  adjust size ... 6-54
  delete row(s)/column(s)
292. t no introns are defined in the selected sequence.
• Display
The items in the Display submenu define how the features object is displayed. This is accomplished by showing or hiding different parts of the features object and by defining other characteristics of the display.
Show/Hide Double Strands
The Show/Hide Double Strands menu item is only available for nucleic acid sequences and will toggle between showing the sequence in double-stranded format and showing it in single-stranded format.
Page 6-40
Show/Hide Site Markers
The Show/Hide Site Markers menu item will determine if the restriction enzyme (nucleic acid sequence) or cleavage (protein sequence) markers will be displayed. Hiding the markers does not remove them from the features object; it just causes them not to be displayed. To remove markers, first select them and then press the Delete key.
Show/Hide Translations
The Show/Hide Translations menu item will determine if the translated sequence is displayed along with the nucleic acid sequence. Hiding the translation does not remove it from the features object; it just causes it not to be displayed. To create a new translation, see Translate (page 6-40).
Show/Hide Left Positions
The Show/Hide Left Positions menu item will either show or hide the sequence position indicators at the left side of the sequence lines. This includes both nucleic acid and translated amino acid positions
293. tained by several methods and fine-tuned manually. The original paper recommends a window of 7, but 19-21 is also useful for determining membrane-spanning segments (J. Mol. Biol. 157: 105-132, 1982). Window size is the number of adjacent amino acids whose property is calculated in each iteration. After calculating the value for the first window of amino acids, the window is moved one residue along the sequence and a value is calculated again for the new window of amino acids. Median Sieving emphasizes data having a specific distribution (J. A. Bangham, Anal. Biochem. 174: 142, 1988).
Figure 4-59. Hydropathy Analysis Setup Panel
the About the analysis box. In addition to any tables you might create, the standard supplied tables are Bull & Breese (page A-1), Engelman & Steitz (page A-1), GES or Engelman et al. (page A-2), Fauchere & Pliska (page A-2), Hopp and Woods (page A-2), Kyte and Doolittle (page A-3), Manavalan & Ponnuswamy (page A-3), Sweet and Eisenberg (page A-3), von Heijne (page A-4), and Wolfenden et al. (page A-4).
Page 4-61
Figure 4-60. Hydropathy Analysis with Median Sieving (two Kyte & Doolittle plots of Lamprey rhodopsin; x-axis: Amino acid)
A sample ou
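The window calculation described in the setup panel text can be sketched as a sliding mean over the published Kyte and Doolittle hydropathy values. The function name and the sample peptide are hypothetical, and the sketch reports one value per window without deciding which residue the value is plotted against, since the manual does not say how the Gene Inspector positions each point.

```python
# Kyte & Doolittle (1982) hydropathy values, one-letter amino acid codes.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_hydropathy(peptide: str, window: int = 7):
    """Average hydropathy of each window, moving one residue at a time.

    Raises KeyError for residues that are not in the table.
    """
    scores = [KYTE_DOOLITTLE[residue] for residue in peptide.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # A hydrophobic stretch followed by a charged stretch (made-up peptide).
    for value in sliding_hydropathy("LLIVALLKKDEKR", window=7):
        print(f"{value:6.2f}")
```

A window of 7 follows the recommendation quoted above; passing `window=19` or `window=21` on a full-length protein corresponds to the setting suggested for membrane-spanning segments.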
294. te
TABLE OF CONTENTS
  Grouping 6-42
  Adjust Size To Contents 6-43
Sequences Menu 6-44
  Sequence ... 6-44
  Format Sequences 6-44
  Display 6-44
  Consensus 6-44
Sequence Menu 6-45
  Sequence Info 6-45
  New Sequence 6-45
  Insert Xs / Insert Ns 6-45
  Generate Random 6-46
  Go To Position 6-46
  Speak Typing 6-46
  Speech Prefs 6-47
  Map Keys 6-47
  Show/Hide Sequence Monitor 6-47
  Display 6-47
  Manipulate 6-48
  Alignment 6-49
  Consensus 6-49
  Format Sequence 6-51
  Use Extra Caution 6-51
Sidebar Menu 6-52
  Adjust Size To Contents 6-52
Table Menu 6-53
  Show/Hide Column Headers 6-53
  Show/Hide Row Headers 6-53
  Insert Row/Column 6-53
  Delete Row(s)/Column(s)
295. tebook menu has many special features unique to the Gene Inspector. Since this tutorial is meant to be an overview, it will illustrate some of these features but will not go into details of how to create them (see elsewhere in the manual for that).
Page 2-3
Figure 2-2. The Tour Notebook (callouts: ruler, text object title, frame around text object, graphic object, notebook text)
4. The title with the colorful border across the top of the notebook is a text object into which the title text was typed. Text objects (sometimes called sidebar text) are extra blocks of text you can create in addition to what is in the background (body) text of your notebook. Text objects can be located any place on a notebook sheet and can be any size, from tall and skinny to short and wide, like the text object containing the title here. We have placed a red, blue, and green frame around the text object to make it stand out.
5. The small elliptical object to the left of the word OBJECTIVES is a g
296. the BLAST setup panel (Figure 2-47). You can learn more about the various options in this dialog elsewhere in this manual (BLAST Search, page 4-47). For now, enter parameters as shown in Figure 2-47 (page 2-65).
Figure 2-47. BLAST setup panel. The help text in the panel reads: Query defines the database to be searched. Align best matches aligns the top matches indicated in the box to the right. Filter out low complexity matches ignores matches found frequently in the database. Request Results as HTML allows you to save results to open in a web browser to take advantage of links. You can specify the subject for the return email and have the results sent to your email address or a GI notebook.
Page 2-65
2. Click on the Input Sequences icon and choose the Drosophila 5S sequence, or one of your own if you prefer.
3. Make sure you are connected to the Internet, either directly or through a modem. Press the Run button to send the query to the BLAST server. The query will be sent, and GI will monitor for results at regular intervals.
4. You will see notification of what GI is doing in the BLAST output object that appears in your GI notebook. When results are
297. the Chosen sequences list indicates that the segment is truncated by a stop codon.
Figure 4-33. GC Coding Prediction Setup
use some codons over other synonymous codons (see CodonPreference, page 4-23). The result of this bias is that, in the third position of the codons in a specific reading frame, there is often an extreme skewing of the G+C content. The GC Analysis setup panel is shown in Figure 4-33 (page 4-38). The analysis output shows the distribution of G+C in different positions of the DNA, as shown in Figure 4-34 (page 4-39). The first curve represents the G+C content of every third nucleotide starting at position 1 (reading frame 1), the second curve represents the G+C content of every third nucleotide starting at position 2 (reading frame 2), and the third curve that of every third nucleotide starting at position 3 (reading frame 3). In this case there is a clear plateau in reading frame 1, corresponding to the ORF from 750-2700. Because of its ability to be specific for a single reading frame, a GC Analysis can be used to identify sequencing errors which cause frame shifts. A sequencing error in a coding region could result in a shift in the reading frame; this would show up as two separate elevated regions in the GC Analysis output plot. Along the bottom of the plot are illustrated the actual open reading frames and rare codon usage, if requested (Figure 4-33). This area of the plot was discussed in Tutorial 17: Testcode An Interactive Analysis (page 2-55)
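The three curves described above can be sketched by sampling every third nucleotide from each starting position and measuring G+C in a sliding window over that sample. The function names, the window measured in sampled nucleotides, and the toy sequence are assumptions for illustration; the manual does not give the Gene Inspector's exact windowing.

```python
def every_third(sequence: str, start_position: int) -> str:
    """Nucleotides at start_position, start_position + 3, ... (1-based)."""
    return sequence.upper()[start_position - 1::3]


def gc_fraction(nucleotides: str) -> float:
    """Fraction of G and C in a string; 0.0 for an empty string."""
    if not nucleotides:
        return 0.0
    return sum(base in "GC" for base in nucleotides) / len(nucleotides)


def gc_profile(sequence: str, start_position: int, window: int):
    """Sliding G+C fraction over every third nucleotide from start_position."""
    sampled = every_third(sequence, start_position)
    return [
        gc_fraction(sampled[i:i + window])
        for i in range(len(sampled) - window + 1)
    ]


if __name__ == "__main__":
    # Toy sequence in which only every third nucleotide from position 3 is G or C.
    dna = "AAGAACAAGAAC" * 3
    for start in (1, 2, 3):
        print(start, gc_profile(dna, start, window=4))
```

In a real coding region the curve whose sampled positions coincide with the third codon positions tends to stand apart from the other two, and a frame shift would move that elevated signal from one curve to another partway along the sequence.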
298. the analysis and/or change the sequence(s) being analyzed. You can even add or remove analyses in the Setup Window when you choose to recalculate an analysis. The options which are unique to each output object are discussed in the context of that analysis later in this chapter.
Style Sheets
Style sheets were the subject of Tutorial 10: Creating and Using Style Sheets (page 2-38). Individual components of an output object can be formatted, enabling you to modify the output object to appear the way you want. The complete set of display options can be saved as a Style Sheet. A Style Sheet contains information about the color, font, and style of each text item in the object, as well as specific formatting information about the data display itself. For example, a style sheet for a sliding window analysis (Sliding Window, page 4-68) might specify that the title be in blue 24 point Helvetica, italic and underlined; the x and y axis labels should be in green 10 point Times bold; the axis numbering should be in green 7 point Courier; and the plot itself should be red with a dotted pen pattern for the graph. Once saved as a Style Sheet, all the stylistic information can be applied to other output objects.
Footnote: The exception to this is multiple sequence alignments, whose output goes into a sequence editor document.
Page 4-9
When an output object is selected, its style can be added to the Format menu by choosing Format > Style
299. the same number of pixels will be added to or removed from the width of each column in the table.
5. Press the Edit Info button to enter a Title for this new table. The new title you enter in the Info Dialog will appear in the title bar of the window.
6. Pressing OK will create a new table in the User Table folder, which is found in the GI Data folder. This User Table will be available in any analysis for which the table is appropriate.
Page 2-64
TUTORIAL 21: BLAST SEARCHING
Gene Inspector allows you to perform a BLAST search using the Internet. This analysis requires that you set up your Internet connection parameters first so that the program knows how to communicate with the database servers. If you do not have an Internet connection, you will not be able to do this tutorial. Note that different user locations might connect to the Internet in different ways, so what follows might not work at your particular location, although it works at most sites. If you have problems connecting, please contact your network administrator for help. This is something that Textco cannot help you with, because every site can have slightly different ways of connecting to the Internet and only the network administrator at your site will know the best way for you to connect.
1. Now choose to do a new nucleic acid analysis and select BLAST Search as the analysis (note that this is at the end of the analysis list). You will see
300. this, select the name SCORE and then use the Format menu to change color and pattern. If you want to capture a part of the multiple sequence alignment to display in the GI Notebook as a picture, you can do so. Identify the area of the alignment that you wish to capture and make sure it is all visible in the sequence view. Hold down the option key, click at the top left corner of the sequence area you want to capture, and then, keeping the option key depressed, drag the mouse until you have selected the region of the sequence window you want to capture. Let go of the mouse button and the option key, and you will see the selected area outlined. Choose File > Copy to copy a picture (PICT) of the selected region to the clipboard. You can now paste this into other applications or into the GI Notebook. Figure 3-12 shows the result of this operation. Note that this technique just captures a picture of the selection; it does not contain any actual sequence data.
Page 3-13
Figure 3-12. Alignment Picture in Notebook
Using Custom Score Adornments
Custom Score Adornments are display properties that can be adjusted to highlight important information in multiple sequence alignments. Figure 3-13 shows the dialog box you can use to change the multiple sequence
301. tion site markers. Choose Features > Mark Sites. Select the Commercial_4 enzyme list and move the first six enzymes in the list (all start with A) to the right-hand Sites to Mark list by pressing the Move >> button. Your dialog box should look like Figure 2-37. Press Find to mark all sites for the enzymes you have selected.
14. Select one of the site markers by clicking on it once. Choose Edit > Select All to select all the site markers. Using the Format menu, change the selected site marker text to Times 10 point bold and color it blue.
15. Choose Features > Adjust Size To Contents to expand the features object again. You should now have something that looks like Figure 2-38.
Page 2-52
Figure 2-37. Features Mark Sites
Figure 2-38. Features object Completed
16. Try using some of the other options in the Features menu to adjust the display to look exactly the way you want it to. You might select the Box Around
302. to be found in non-coding DNA. Fickett defines eight numerical parameters that can be used to evaluate the likelihood of a segment of DNA actually coding for a peptide. One advantage of using TestCode compared to CodonPreference is that it does not require the use of a codon preference table, so TestCode analysis can be performed on any DNA, not only those for which a codon preference table is known.
Page 4-44
Figure 4-42. TestCode Setup Panel. The help text in the panel reads: Window size is the length of the segment of DNA analyzed in this sliding window analysis (200 is recommended by Fickett). ORFs can be found between stop codons or between start and stop codons (Method box). The Display box lets you show ORFs or ORFs and rare codons. Choose a codon table appropriate for the DNAs being analyzed. The rare codon Cutoff value is based on the frequency of occurrence in synonymous codons.
TestCode output plot (Dros hsp70)
303. to zoom in to that region and see a more detailed plot of the region. This is discussed in detail in Tutorial 18: Dot Matrix Analysis Another Interactive Analysis (page 2-58).
Page 4-30
The dot matrix analysis therefore provides a very good starting point for comparing sequences. Regions of similarity that are of interest can be investigated in more detail directly from the dot matrix plot. Selecting the region of interest and zooming in or aligning the sequences provides more detail about the matching regions of the sequences.
Find Inverted Repeats
This analysis will search DNA sequences for inverted repeats of any defined length. The analysis can be used to identify regions of potential secondary structure in DNA or in the transcribed RNA.
Figure 4-25. Inverted Repeats Panel. The help text in the panel reads: Window size defines the size of the inverted repeat. Maximum number of mismatches defines the allowable number of mismatches between the two parts of the inverted repeat. The Distance between matching segments of the inverted repeat can be set using the maximum distance buttons and text fields in this box.
The setup pan
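The three settings named in the Inverted Repeats panel (window size, maximum mismatches, and distance between the matching segments) map naturally onto a brute-force search. The sketch below is one straightforward way to perform such a search and is not the Gene Inspector's algorithm; the function names, parameter names, and toy sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(sequence, window, max_mismatches=0, min_gap=0, max_gap=None):
    """Return (first start, second start, mismatches) with 1-based starts.

    The second segment must lie downstream of the first without overlapping,
    separated by between min_gap and max_gap nucleotides.
    """
    sequence = sequence.upper()
    n = len(sequence)
    hits = []
    for i in range(n - 2 * window - min_gap + 1):
        target = reverse_complement(sequence[i:i + window])
        first_j = i + window + min_gap
        last_j = n - window if max_gap is None else min(n - window, i + window + max_gap)
        for j in range(first_j, last_j + 1):
            candidate = sequence[j:j + window]
            mismatches = sum(a != b for a, b in zip(target, candidate))
            if mismatches <= max_mismatches:
                hits.append((i + 1, j + 1, mismatches))
    return hits


if __name__ == "__main__":
    # Left arm GAAGC, a 4-nucleotide spacer, right arm GCTTC (its reverse complement).
    print(find_inverted_repeats("CCGAAGCTTTTGCTTCAA", window=5))
```

The running time grows with the product of the sequence length and the allowed distance range, so limiting `max_gap` is the main way to keep the search fast on long sequences.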
304. tom style sheets you no longer need, you can use Remove Style Sheet From Menu to remove them from the Style Sheets menu.
Page 6-20
Analysis Menu
The Analysis Menu deals with starting, modifying, and storing analyses and the tables used in these analyses. The four items at the bottom of the Mac menu shown at the right represent custom analysis setups (see Custom Analysis Setups, page 6-26).
• New Analysis
New Analysis is the starting point for launching analyses. You will be presented with a dialog that allows you to choose to do either a nucleic acid analysis or a peptide analysis. This was discussed in Tutorial 3: Using Analysis Setups (page 2-14) and in Starting an Analysis (page 4-1). You should examine those sections for more information.
• Previous Setup
Previous Setup will bring up the Setup Window which was used last. It will include all the analyses in the window as well as any sequences or sequence segments. Note that if you ran an analysis on a sequence from an open sequence document and that document was never saved, the Previous Setup will not be able to retrieve the sequence. Becau
305. tput from a Hydropathy analysis is shown in Figure 4-60 (page 4-62). As labeled, this analysis was done using the Kyte and Doolittle table. The top part of the figure shows the standard analysis output using a sliding mean, while the bottom plot is an analysis of the same data but using median sieving (page 4-11). Median sieving can be applied to any of the sliding window analyses by choosing Object > Apply Sieving. Median sieving will often resolve peaks that cannot be resolved by any other method.
Hydration Potential
The Hydration Potential analysis is a sliding window analysis based on the free energy of transfer between vapor phase and solution for amino acid side chain analogs (Wolfenden et al., Science 206: 575-577, 1979). The higher the hydration potential, the more likely it is that a particular region of the peptide will be on the outside of the peptide (that is, hydrophilic). This analysis is identical to the Hydropathy analysis using the Wolfenden table.
Page 4-62
Membrane Buried Regions
The Membrane Buried sliding window analysis is based on the statistical distribution of specific amino acids in membrane vs. non-membrane segments for a sample set of proteins (Argos et al., Eur. J. Biochem. 128: 55, 1982). The analysis is identical to a Transmembrane Helix analysis using the Argos table (Argos et al., page A-1).
Optimal Matching Hydrophobicity
The Optimal Matching Hydrophobicity analysis is a sliding window a
306. transmembrane helices 2-14, 4-70
U
Undefine Intron (Features Menu) 6-40
Undo (Edit Menu) 6-9
Update Setup (Analysis Menu) 6-25
Index 18
Updating Gene Inspector 1-3
Use Extra Caution (Sequence Menu) 6-51
User tables 4-14, 5-12
V
vectors
  ... A-9
  ... A-9
  BRL A-9
  Clontech A-10
  ... A-12
  ... A-12
  New England Biolabs A-13
  NovaGen A-14
  ... A-16
  Pharmacia A-16
  ... A-18
  ... A-19
  Stratagene ...
  ... A-20
  ... A-21
viewing large objects (see large objects, printing and viewing) 7-5
von Heijne table A-4
W
Welling et al.'s table A-4
Windows Menu
  current window names 6-15
  stack windows 6-15
Wolfenden et al.'s table A-4
Z
Index 19
Index 20
307. ts
Ungroup
When a grouped object is selected and Ungroup is chosen, the grouped objects will be converted into individual objects, each of which can be manipulated separately. The menu item is not enabled if no grouped object is selected.
Align Objects
This menu item provides a means to align and adjust the sizes of a collection of selected objects. Aligning has been discussed in Tutorial 7: Aligning Analysis Objects (page 2-31) and in Aligning Objects (page 5-10).
Save Preferred Size
The preferred size of an object is a defined size for an object that can be recalled later on. Each window will open at a preferred size. You may enlarge the window with the grow box at the top right corner, but clicking on the grow box again returns the window to its preferred size. Save Preferred Size defines a preferred size for the selected object(s). It is used in conjunction with Restore Preferred Size.
Restore Preferred Size
Restore Preferred Size will restore the size of any selected object(s) to its preferred size as set in Save Preferred Size. If no preferred size has been defined, the object will revert to the size it had when it was originally created.
• Display
The Display menu deals with how the notebook and its contents are displayed. You can customize the appearance of the notebook by showing or hiding different components of the notebook window.
Show/Hide Ruler
Show/Hide Ruler will show or hide the
308. ts of objects. Some of the Format submenus may not be available at any given time because they might not pertain to the current selection. For example, when a rectangle is selected, the Font submenu will not be available. • Fill: Fill contains a number of patterns that can be used to fill in currently selected objects. It is shown in Figure 6-10 (The Fill Submenu). The NoFill selection will make the filled portion of the object transparent. • Lines: Lines allows you to set properties of lines. It is shown in Figure 6-11, page 6-17. When your selection contains a line (note a), you can choose the line thickness using the different thickness lines in this menu or using the Pick Line Width choice. For simple lines, the arrowhead options are also available. You can choose to place an arrowhead on one or both ends of the line(s). The Size Arrowhead option lets you adjust the appearance of the arrowheads on any selected lines containing arrowheads. Note a: This includes not only simple lines but also rectangles, ellipses, and other graphic objects. The line thickness is equivalent to the pen width.
309. u You name the color menu item and then define its color using either the Color Picker or the color of the currently selected object. This is discussed in Tutorial 8: Customizing Gene Inspector Menus, page 2-34. Remove Color From Menu: Remove Color From Menu allows you to remove any custom colors you had previously added to the Color menu. If you have not added any custom colors, this menu option will be disabled. • Frames: Frames are adornments that can be attached to any GI Notebook object. Adornments modify the appearance of the objects they adorn. Frames consist of one or more borders and, optionally, a shadow. The items in this menu allow you to create and modify frames. See Framing GI Notebook Objects, page 5-8, for more details. Edit Frame: Edit Frame provides a means to edit the frame of a selected object. Note that it is possible to have no visible frame on an object and still edit the frame. In this case, the object can be considered to have a frame that consists of no border and no shadow. The dialog is shown in Figure 5-6, page 5-9. Add Frame To Menu: Once you have designed a frame you would like to use again in the future, you can add it to the menu using this option. If you have an object selected and choose Add Frame To Menu, the dialog will allow you to add the frame from the selected object to the Frames menu. If you do not have an object selected, this option will be disabled
310. u wish to add in the bottom left list and press the Add >> button, or double-click on the sequence in the bottom left, to add it to the list on the bottom right. One or more sequences can be chosen from any number of files. All analyses will be performed on each sequence in this list. This method of choosing sequences involves an extra step compared to the standard way of opening files because you need to specify not only a file but also a particular sequence in that file (see also Tutorial 3: Using Analysis Setups, page 2-14). Note b: The same result could be achieved by double-clicking on the file name. [Screenshot: Choose Object dialog, "Select sequences for analysis", with a file browser, a "Sequences in documents" list (Dros hsp22, Dros hsp23, Dros hsp26, Dros hsp27, Dros hsp70), a "Chosen files and sequences" list (Drosophila HSPs), and the buttons Add Selected Files and Folders, Add Selected Sequences, and Remove Selected Items.]
311. uch as lists of cut sites in a sequence, or perhaps aligned sequences or database search results. You can also use it to store buffer recipes or other frequently needed information. Collapsing large objects into appendices will therefore save space in the GI Notebook and make the background text more readable. At the same time, you will still have access to the information in the appendix if you need it later on. Appendices are also useful because they are available from any location in the notebook. Tool Extensions: The Gene Inspector allows you to create your own custom items to paste into the GI Notebook. This is done in the form of a tool extension. Once a notebook object is selected, it can be added to the Tools menu as an extension by choosing the Notebook > Tool Extensions > Add Extension menu item. You will be asked to provide a name for the extension, after which it will become available through a menu (the original object is still left in the notebook). As seen in Figure 5-5, page 5-7, when the Tools menu is torn off to create a Tools Palette, the tool extensions appear as a popup menu in the palette. In this particular case, the tool extension called Bookmarker is being shown in the palette. Selecting a tool extension from the Tools menu or Tools Palette will enable you to place a copy of the tool extension in a notebook. Clicking the mouse Note f: Gene Inspector's aliases behave similarly to the Finder's aliases. It is not an
312. ure 3-2, page 3-3, for nucleic acids and Figure 3-3, page 3-3, for proteins. • Format Sequences: This option is the same as the Format Sequence option in the sequence editor window (see Figure 3-4, page 3-5). • Display: This is the same as Display, page 6-47. • Consensus: This provides the same flexibility as in the sequence editor window. See Consensus, page 6-49. Sequence Menu: The Sequence menu is enabled whenever you have a sequence document as the active document. It allows for manipulation of sequences. [Menu screenshot: Sequence Info, New Sequence, Generate Random, Insert Ns, Map Keys, Show Sequence Monitor, Speak Typing, Speech Prefs, Manipulate, Alignment, Consensus, Display, Format Sequence, Use Extra Caution.] Sequence Info: When a sequence is selected in the sequence editor, you can obtain information about that sequence by choosing the Sequence Info menu item. Slightly different information windows are seen for nucleic acid sequences (Figure 3-2, page 3-3) and for peptide sequences (Figure 3-3, page 3-3). The information windows allow you to set a nucleic acid sequence to linear or circular, or to DNA or RNA. For peptide or nucle
313. ure 3-5 (Using Extra Caution). sequence display operate only on all the sequences in the document; for displaying sequences with full features, see Creating a Features Object View of a Sequence, page 3-16. If you would like to be alerted to such events, choose Sequence > Use Extra Caution. This will bring up the dialog box shown in Figure 3-5. If you know your way around the program and do not want to be disturbed with these dialogs, choose the No button. If you want to be made aware of what the program is about to do, choose the Yes button. Drag and Drop Sequence Editing: Sequence data can be moved within the sequence editor window using standard copy and paste operations. However, if you have enabled Drag & Drop editing (Drag & Drop Options, page 6-13), you will be able to manipulate sequences using this faster method. To use Drag & Drop editing, you must first make a selection. If you select a sequence name by clicking once (mouse down > mouse up), then when you click on the selected sequence name again you can drag it (mouse down > drag) up or down to a different location in the sequence editor window. You can even drag the selected sequence to a different sequence window, where it will become a new sequence in that window, or to a GI Notebook, where it will become a Features object in that GI Notebook. You can also select a piece of a sequence and drag it elsew
314. ure 4-1 (The Analysis Chooser). left, a brief description of the analysis is shown on the right, in the Information about selected analysis section of the window. This gives you an idea of what each analysis can do and helps you find an appropriate method to answer biological questions about your sequence. The Draw icons check box to the right of the list will present the list of analyses as icons rather than as a text list. Figure 2-7, page 2-14, shows what an icon list looks like. To perform an analysis, first select it in the list on the left and then press the OK button to continue, or double-click on the analysis name. The Analysis Setup Window: After choosing an analysis, you will see the Gene Inspector's Analysis Setup Window, as shown in Figure 4-2, page 4-3, for Transmembrane Helices. Selecting an item in the list on the left of the window will present you with a panel on the right of the window, which is used to enter information needed by the icon on the left. Three kinds of panels can be found in the Analysis Setup Window. An Input Sequence panel allows you to choose which sequence(s) to use in the analyses. Every sequence chosen will be analyzed by each of the analyses in the Analysis Setup Window. The Output Location panel allows you to define where the results of the analyses are to be placed. In general, there will be
315. [Figure 3-13: Custom Sequence Adornments Dialog, with options to replace the characters that match or do not match the consensus sequence with a chosen character, to draw boxes around the characters that match or do not match the consensus sequence, and to include sequence gaps with non-matching characters; buttons Try Out, Cancel, and OK.] button is chosen characters. Inverting in this case means the complementary color. Thus red becomes cyan, green becomes magenta, black becomes white, etc. • Replace the characters that: will replace any character that matches the consensus character (or does not match, depending on which radio button is chosen) with the character that is chosen from the popup menu. The characters that are available in the popup are those that cannot be found in the sequence itself. • Draw boxes around the characters that: will draw an enclosing line around all the characters that match the consensus character (or do not match, depending on which radio button is chosen). The bounding line will include all adjacent characters that qualify as match or non-match. The last check box, Include sequence gaps with non-matching characters, allows you to specify whether the gaps should be considered part of the matching characters or part of the non-matching characters. The Try Out button will apply your choices tem
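The color inversion described above (red becomes cyan, green becomes magenta, black becomes white) can be illustrated with a minimal sketch. It assumes colors are stored as 8-bit RGB triples; the function name `complement` is hypothetical and is not part of Gene Inspector.

```python
def complement(rgb):
    """Return the complementary color of an 8-bit (r, g, b) triple.

    Assumption: each channel is an integer in 0..255, so the
    complement of a channel value v is 255 - v.
    """
    r, g, b = rgb
    return (255 - r, 255 - g, 255 - b)


# The three examples named in the text.
RED, GREEN, BLACK = (255, 0, 0), (0, 255, 0), (0, 0, 0)
CYAN, MAGENTA, WHITE = (0, 255, 255), (255, 0, 255), (255, 255, 255)

for name, color in [("red", RED), ("green", GREEN), ("black", BLACK)]:
    print(name, "->", complement(color))
```

Applying the function twice returns the original color, which is why inverting a selection a second time restores its appearance.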
316. useful approach for using bookmarks is to create a bookmark graphic object using the drawing tools in the notebook. This might be a rectangle of a specific color or pattern. We have provided you with a default graphic object named bookmarker as a tool extension (Tool Extensions, page 5-17). To use this object as a bookmark, just place the bookmarker object on the page and attach a bookmark to it. You can also paste in pictures from other applications that could serve as a bookmarker icon. You might want to attach bookmarks to locations that you refer to often, perhaps a restriction map or a database search list. GI Notebook Objects: In addition to the descriptive background text, which can be used for storing comments about experiments and discussing results, several different kinds of objects can be placed into the GI Notebook. Objects can be pasted in from other programs, can be generated as analysis objects by the Gene Inspector itself, or can be drawn using the drawing tools in the Gene Inspector. All GI Notebook objects share some properties and behaviors. These common features are discussed first, followed by a description of each of the GI Notebook object types. Selecting vs. Targeting: Clicking once on an object makes it the selection and causes the appearance of eight handles (little boxes) around the edges of the object, as shown in Figure 5-4
317. [Figure 2-31: Selecting a Sequence Segment for Analysis.] 3. Double-click on the resulting Helical Wheel output object in the notebook to make it the target and try some of the options under the Object menu. This analysis shows how the amino acid side chains would be distributed if you were to look down the central axis of an alpha-helical segment of a peptide. Functions that are specific to each analysis output object will always be found under the Object menu. 4. Choose Object > Recalculate and rerun the analysis using a different table (Sweet and Eisenberg). Note that you can also create your own tables for use in any appropriate analysis. Use the Analysis > Tables option to create your own tables. Creating tables is discussed in Tutorial 20: Creating Your Own Analysis Tables, page 2-63. 5. Double-click on the Helical Wheel output object and choose Object > Recalculate again. Remove the
318. [Figure 4-5: The Sequence Chooser.] The Open Sequences and Saved Sequences buttons (icons in the upper right of the Sequence Chooser window) allow you to perform analyses on sequences selected in different ways. Pressing the Open Sequences button will present a list of all sequences in open Sequence Editor windows. If you have a range of nucleotides selected in a Sequence Editor window, that range will be the default segment for that sequence in the Input Sequence panel. The Saved Sequences button allows you to select sequences for analysis from files on disk that are not currently open in the Sequence Editor. Note that if you plan to rerun the analysis at a later time or to hot link the analysis (see Links, page 6-34, or Hotlinking Analysis Results, page 2-19), you should work with the saved sequences, because Gene Inspector might not be able to find the Open Sequences at a later date. Pressing the Done button returns you to the Input Sequence panel (Figure 4-4, page 4-5), with all the sequences you chose now appearing in the Chosen files and sequences list. Pressing the Cancel button returns you to the Input Sequence panel without any sequences being added. Note that in Figure 4-4, the Chosen files and sequences list contains not only the name of the sequence but also the name of the file that contains the sequence, which is indicated as Path Working Files Gl f Gl Seqs DNA f pBR322
319. w of a peptide segment looking down the axis of an alpha helix. The distribution of side chains, which stick out from the helix axis, is readily apparent in this view. This analysis is described by Schiffer & Edmundson (Biophys. J. 7:121, 1967). The setup panel is shown in Figure 4-57, page 4-60. The only information you need to supply here is the table to be used in calculating the properties of the side chains. The output for this analysis on lamprey rhodopsin amino acids 230-250 is shown in Figure 4-58, page 4-61. [Figure 4-57: Helical Wheel Setup Panel, with Table set to Kyte & Doolittle. Panel help text: Helical wheels display side chain properties as viewed by looking down the axis of an alpha helix. They can be calculated using any set of values for amino acids. Choose one of the tables to specify which set of values to use. Typically Hopp & Woods or Kyte & Doolittle are used. Note also that it is best to use fewer than 30 residues of the peptide sequence for the analysis.] Notice that since the analysis is meant to look down an alpha-helical segment of a peptide, you should limit the segment length you are examining to a reasonable size for an alpha-helical structure, perhaps by first running a Chou-Fasman (page 4-54) or GOR (page 4-58)
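The geometry behind a helical wheel can be sketched in a few lines. The sketch below is not Gene Inspector's implementation; it assumes the textbook alpha-helix value of 3.6 residues per turn, so that each successive residue is rotated 100 degrees around the helix axis. The function name `helical_wheel` and the example segment are hypothetical.

```python
def helical_wheel(segment, first_position=1, degrees_per_residue=100.0):
    """Place each residue of a peptide segment on a helical wheel.

    Assumption: an ideal alpha helix with 3.6 residues per turn,
    i.e. 360 / 3.6 = 100 degrees of rotation per residue.
    Returns a list of (sequence position, residue, angle in degrees).
    """
    return [
        (first_position + i, residue, (i * degrees_per_residue) % 360.0)
        for i, residue in enumerate(segment)
    ]


# Hypothetical 8-residue segment, numbered from position 230.
for position, residue, angle in helical_wheel("LAVKTMEG", first_position=230):
    print(f"{position} {residue} {angle:5.1f}")
```

With this spacing the pattern of angles repeats exactly after 18 residues (five full turns), which is one reason short segments give the clearest wheels. A property table such as Kyte & Doolittle would then supply a value for each residue plotted at its angle.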
320. will flow around objects in the GI Notebook. You can specify exactly how the text will flow around an object by first selecting the object and then using the Notebook > Text Flow submenu item. The available options are: • FlowThrough: text does not recognize that an object is present and overwrites the whole width of the text column, running completely through the object. • BothSides: text jumps across the object and is placed on both the right side and the left side of the object. • LeftSide: text will be placed only to the left side of the object. • RightSide: text will be placed only to the right side of the object. • WidestSide: text will be placed only on the side of the object that has the greatest distance between the object and the border of the text column. If the object is moved, the text will flow only to the widest side. • NeitherSide: text is not placed on either side of the object and jumps from above the object to below the object. [Figure 5-7: Setting Text Standoff Distance. Dialog text: Specify the space between the text and the graphic in pixels. Horizontal standoff: 6. Vertical standoff: 2.] Notebook > Text Flow > Set Text Standoff can be used to set the number of pixels that will be maintained between the object and the surrounding background text, both vertically and horizontally, as shown in Figure 5-7, page 5-10. Aligning Objects: When more than one object is selected in the
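The WidestSide rule and the horizontal standoff can be expressed as a short sketch. This is only an illustration of the rule as described above, not the program's code; the function names, the pixel coordinates, and the tie-breaking choice (right side on a tie) are assumptions.

```python
def widest_side(column_left, column_right, object_left, object_right):
    """Pick the side of an object with the larger gap to the column border.

    All arguments are horizontal pixel coordinates. On a tie the right
    side is returned; that choice is arbitrary and assumed here.
    """
    left_gap = object_left - column_left
    right_gap = column_right - object_right
    return "left" if left_gap > right_gap else "right"


def usable_width(column_left, column_right, object_left, object_right,
                 horizontal_standoff=6):
    """Width left for text on the widest side after the standoff is kept clear."""
    if widest_side(column_left, column_right, object_left, object_right) == "left":
        gap = object_left - column_left
    else:
        gap = column_right - object_right
    return max(gap - horizontal_standoff, 0)


# A 500-pixel text column with an object occupying pixels 300 to 400.
print(widest_side(0, 500, 300, 400))   # text goes to the left of the object
print(usable_width(0, 500, 300, 400))  # 300-pixel gap minus the 6-pixel standoff
```

Moving the object toward the left border flips the result, which matches the note that the text follows the widest side when the object is moved.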
321. windows that are associated with a GI Notebook. An appendix window can be visible or hidden. Appendix markers (aliases to appendices), which are attached to a GI Notebook, can contain a great deal of data (e.g., a table, a long list of matches from a database search, references, etc.). Any GI Notebook object can be converted into an appendix, and multiple locations in the GI Notebook can point to the same appendix. This allows large amounts of information to remain available for occasional viewing without interfering with displaying information and reading the GI Notebook. One use might be to store buffer recipes as appendix objects and then refer to the appendix whenever the buffer is discussed by using an alias to the appendix containing the buffer. Chapter 2: Tutorials. This chapter contains a number of tutorials introducing you to Gene Inspector. Although you should not feel obligated to go through all the tutorials, you should do as many of them as you can because they are designed to provide an overview of how the program works. Doing the tutorials now will save you many hours in the future. The Gene Inspector has a number of unique features you might not have seen in any other application; the tutorials provide a way for you to learn about these unique capabilities. In many locations in this chapter and throughout the manual, you will be asked to select items in menus. To make your choices as
322. [Figure 3-11: Shading to Examine Sequence Similarities. Aligned sequences of bacteriorhodopsin, Halobacterium archa, lamprey rhodopsin, octopus rhodopsin, and Xenopus rhodopsin, shown with SCORE and CONSENSUS rows, first unshaded and then shaded.] ing will be the same color and intensity as in the scoring row. If half the characters match, the intensity of the highlighting will be only half that of the scoring row. Sequence Adornments: To change the color or pattern of the shading, you must change the color or pattern of the scoring row, which is the basis for the shading. To do
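The shading rule described above (full intensity when every character in a column matches, half intensity when half of them match) amounts to a per-column match fraction. A minimal sketch follows, assuming equal-length aligned sequences and a consensus string of the same length; the function name and the toy sequences are hypothetical, and the program's actual scoring may differ.

```python
def match_fractions(sequences, consensus):
    """Fraction of sequences that match the consensus in each column.

    Assumption: all sequences are already aligned and have the same
    length as the consensus. A fraction of 1.0 corresponds to the full
    intensity of the scoring row and 0.5 to half of that intensity.
    """
    fractions = []
    for column, consensus_char in enumerate(consensus):
        matches = sum(1 for seq in sequences if seq[column] == consensus_char)
        fractions.append(matches / len(sequences))
    return fractions


# Toy alignment of four short segments (illustrative only).
aligned = ["GFPV", "GILG", "GLPI", "GFTV"]
print(match_fractions(aligned, "GFPV"))
```

In this toy case the first column matches in every sequence and would be shaded at full intensity, while the remaining columns match in half of the sequences and would be shaded at half intensity.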
323. y A different set of tables are the BLOSUM tables (S. Henikoff & J. G. Henikoff, Proc. Natl. Acad. Sci. USA 89:10915, 1992). The BLOSUM tables are based on blocks of aligned sequence segments from over 500 groups of proteins known to be related. It has been demonstrated that BLOSUM62 provides one of the best ways to compare sequences and yields results superior to comparisons using the PAM matrices (Henikoff & Henikoff, Proteins 17:49, 1993). BLOSUM tables have proven to be more accurate than projecting similarities of distantly related proteins based on known similarities of closely related proteins, which is how the PAM matrices were developed. In general, it is best to start off using a BLOSUM62 matrix for your comparisons. In contrast to the PAM tables, the more closely related the sequences are, the higher-numbered BLOSUM table you should use. Sequences that are more distantly related should be compared using lower-numbered BLOSUM tables. The output from this alignment is shown in Figure 4-48. In this instance [Figure 4-48 output: Align 2 sequences (global), Dros hsp22 & Dros hsp23. First sequence: Dros hsp22. Second sequence: Dros hsp23. Scoring table: BLOSUM62. Gap insertion penalty: 2.50. Gap extension penalty: 0.30. Unaligned ends not treated as gaps. Traceback: Upper path. Score: 364.00. The aligned residues of the two sequences follow.]
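The two gap parameters in the alignment output (insertion 2.50, extension 0.30) suggest an affine gap cost. The sketch below shows one common convention, in which a gap of n residues costs the insertion penalty once plus the extension penalty for each gapped residue. Whether Gene Inspector charges the extension for the first gapped residue is not stated in the text, so this convention and the function name `gap_penalty` are assumptions.

```python
def gap_penalty(length, insertion=2.50, extension=0.30):
    """Affine gap cost: pay `insertion` once, plus `extension` per gapped residue.

    Assumption: the extension penalty applies to every residue in the
    gap, including the first. Other programs charge it only from the
    second residue onward.
    """
    if length <= 0:
        return 0.0
    return insertion + extension * length


for n in (1, 3, 10):
    print(n, round(gap_penalty(n), 2))
```

Under this convention one long gap is cheaper than several short gaps of the same total length, which favors alignments with fewer, longer gaps.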
324. yle has previously been defined. Choose a different word in the background text and then select Format > Style Sheets > Section Title to convert the selected text into the new style. Once you have created a style sheet, it can be used from within any notebook you open; it becomes part of the application's menus. Creating style sheets is described in Tutorial 10: Creating and Using Style Sheets, page 2-38. 10. Use the vertical scrollbar on the right side of the notebook window to scroll down to the next notebook sheet, which says SEQUENCING SUBCLONED SEGMENT OF PBG123. The descriptive text on the left side of the notebook page discusses the current set of experiments, while the CodonPreference data analysis on the right indicates a possible problem in the sequencing project (see the January 28th notes). The ability to mix your notes with analysis results and other notebook objects provides a convenient way to keep a running commentary on your experiments, just as in a paper lab notebook but with added flexibility and convenience. 11. Scroll down the page further and you will see a scanned image of a restriction digest gel. Note also that lane 11 in the legend has conditional text, which is actually part of this text object. This is another use of a text object (the notebook title was the first example). Also note that the figure legends for the figures in this notebook are actu
