User's Manual - GeneHarbor, Inc.
Contents
1. VI. Non-Similarity Search and Display

The Non-Similarity Search and Display utility is designed to search for the regions of a sequence that are not similar to any sequence in a pool of related sequences. It is intended to identify unique exons in a transcript among a group of alternatively spliced variants; however, it can also be used to identify non-homologous regions of a sequence against the transcripts of a gene family or against random sequences. This function has a practical use. As mentioned before, the .xon file is a sequence format containing exon-intron information. With the Non-Similarity Search function, users can now create a sequence file containing both exon-intron and non-similarity data. With this sequence file it is possible to design primers that amplify DNA fragments that not only span introns but are also unique to a splicing variant. The Non-Similarity Search and Display in ExonTracker 2.0 can process either .seq files created by other programs (no intron and exon information) or .xon files created using ExonTracker 2.0.

Procedure: Create a
2. Protocol

ExonTracker User's Manual

A key step in establishing the relationship between a transcript and its genomic sequence using ExonTracker 2.0 is to upload the nucleotide sequence of the transcript into the program. The sequence information serves at least two purposes. First, the system uses it to detect the length of the transcript and its open reading frame (ORF), and subsequently uses these data to draw diagrams in various functions. Second, the sequence or its ID is used as the query for genomic BLAST. To perform the analysis, a user must have one of the following three things: (1) a nucleotide sequence in FASTA format; (2) an accession number (ACCN) or a GenBank identification number (GI) of the transcript; or (3) the ACCN or GI of the protein that the transcript encodes.

I. Preparation for launching the genomic BLAST

Launch ExonTracker 2.0 by double-clicking on the program icon on the Desktop of your computer. The Data Entry form will appear (Fig. 1). Based on the initial information you have, choose one of the three approaches described below to begin.

Fig. 1. The Data Entry form with the three types of querying examples.
3. Select Export Alignment With Amino Acid under Export to display the whole sequence alignment in a text editor (Fig. 18).

Fig. 18. Sample of exported whole sequence alignment with amino acid sequences. [Figure: aligned sequences with the translated amino acid residues. Callouts: click on the rows to show the exon number; predicted exon junction; amino acid residue.]
4. Transcriptional regulator; RING finger (Really Interesting New Gene).

2. The Region items in the protein Entrez document are read and entered into the spreadsheet. The data from one Region occupy one row and are treated as one domain. Each positive detection of a hydrophobic region is also treated as one domain. To add customized domains, enter the data (Shape, Source, Location, Domain, and Abbreviation) in the row after the last domain row, following the rules (see Note below).
Note: The format in the Location column is critical. The number before the separator is the beginning point of the domain and the number after it is the ending point of the domain.
3. The number in the Shape column is the shape ID assigned by the program. It can be changed manually to any integer from 1 to 6. To do so, delete the old number and type a new one. To hide a domain, delete the shape ID.
4. Use the Domain Label Option to select either the content of the Domain column or the Domain Abbreviation as the domain label.
5. Click on the Shape It button to draw the domain diagram.
Note: The color and style of each preset shape can be customized. Use the following method to change a preset shape:
a. Select a shape by clicking on the shape ID under the shape. The shape is selected, as indicated by the yellow shape border.
b. Select the Shape Color Style menu under the Effect
5. [Figure residue from Fig. 20. Callouts: intron length label; intron location label; non-similarity region.]

3. Click on the Browse button to show the File Selection and Loading form (Fig. 21), a special file-open form for loading .xon or .seq files.

Fig. 21. The File Selection and Loading form.

Use the Drive browser to locate the folder containing the sequence files to be analyzed. The file names in the folder will appear in the file list box. Use the file type option menu above the file list box to select either the .seq or the .xon extension. Click on the OK button to load all files in the folder that have the selected extension. The form will close and the file names will be entered into the Loaded Sequences pull-down menu (Fig. 20), which holds the loaded file names, up to a maximum of 2,000. All or a subset of the loaded sequences can be used in a search. Click on Add All to use all sequences for searching. To select a subset of the sequences, use the Loaded Sequences pull-down menu to select a sequence and click on the Add One button. The selected sequence
6. Length: the length of the query sequence.
Q Location: the first nucleotide position of the query sequence in the segment. The column content is presorted in ascending order, based on the position of a fragment in the query sequence and within a contig region.
S Location: the first nucleotide position of the subject sequence in the segment.
Identical: data read from the Blast return (the two sequences).
Tot bases of query: data read from the Blast return (sequence in the segment).
Contig ID and Segment ID: assigned by the program.
The description of each contig.

2. Identify the aligned segments corresponding to the real exons that compose the transcript. Follow the steps described below.
Note: Identifying the aligned segments (exons) and ordering them are the most critical operations in assembling the genomic sequence comprising the transcript. In our experience, the majority of transcripts are easy to assemble; however, some transcripts with multiple copies of homologous sequences in the same contig do require effort from the user. ExonTracker 2.0 provides many means to help users deal with such difficult situations.
a. Scroll through the spreadsheet and identify the block containing several segments with the highest identities (usually near 100%) as the candidate region in a contig. Note the orientation of the subject strand (plus or minus). Click on the headin
7. 2. Click on the Manual option. Make a small change to the numbers in the exon scale and intron scale boxes.
3. Click on OK to exit. Click on the heading of the ID column (Click Here) to redraw the diagram using the new scales. Try different scales until the dimensions of the diagram are satisfactory.
4. To draw two different diagrams at the same scale: draw the first diagram, open the Scale setting form, and record the two numbers. Before drawing the second diagram, open the Scale form, select Manual, and enter the scales recorded from the first drawing; then draw the second diagram. The two diagrams will have the same scales.
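The effect of reusing one pair of scales for two diagrams can be sketched in Python. This is only an illustration under stated assumptions: the manual does not give the unit of the scale values, so the sketch assumes each scale means bases per pixel, and the names `diagram_width`, `exon_scale`, and `intron_scale` are hypothetical, not part of ExonTracker.

```python
def diagram_width(exon_lengths, intron_lengths, exon_scale, intron_scale):
    """Return the drawn width (in pixels) of a transcript diagram.

    Assumption: each scale is expressed as bases per pixel, so a larger
    scale value draws a shorter feature. Exons and introns use separate
    scales, as described in the manual.
    """
    if exon_scale <= 0 or intron_scale <= 0:
        raise ValueError("scales must be positive")
    exon_px = sum(length / exon_scale for length in exon_lengths)
    intron_px = sum(length / intron_scale for length in intron_lengths)
    return exon_px + intron_px


# Two transcripts drawn with the same pair of scales remain comparable.
first = diagram_width([200, 100], [5000], exon_scale=2, intron_scale=100)
second = diagram_width([400], [10000], exon_scale=2, intron_scale=100)
```

With identical scales, a feature of the same length occupies the same width in both drawings, which is why the procedure asks you to record the two numbers from the first diagram.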
8. Copy Oligo button. The copied rows can be pasted into other document environments such as MS Excel. Click on the Print Oligo button to print the spreadsheet containing the primer information. A saved sequence file can also be loaded directly by using the Browse button. Any change in the sequence triggers an update of the ORF information and of the restriction enzyme list in the pull-down menu. The program uses an arbitrary scoring system to evaluate the primers: the lower the penalty score, the better the primers. There are sixteen pre-selected, frequently used enzymes for 5' addition:

BamH I GGATCC       Sac I GAGCTC
Bgl                 Sac II CCGCGG
EcoR I GAATTC       Sal
Hind                Sca
Kpn I GGTACC        Sma I CCCGGG
Not I GCGGCCGC      Spe I ACTAGT
Pst I CTGCAG        Xba I TCTAGA
Pvu II CAGCTG       Xho I CTCGAG

Appendix A. Web Link Update

The URLs of the preset links in the package are frequently used by researchers and are managed by NCBI. The collection may not be broad enough to accommodate every user's needs, and the links are subject to future changes by their administrators. This package includes the Web Link Update utility to give users the ability to add new URLs or to modify the URL of a preset link.

Procedure
1. Start ExonTracker 2.0 by clicking on the program icon on the PC Desktop, if the program is not open.
2. Select Web Link URL under th
9. Selection of a Coding Region

The coding region of a transcript is the region encoding the protein, as defined by the initiation codon (ATG Position) and the termination codon (Stop Position). The coding information is important in many diagrams. The program detects the coding region as the longest open reading frame (ORF). It is also read from the Entrez document if available. If the coding region detected by the program differs from that read from the Entrez document, the program uses the latter. To use the longest ORF or another region, type the ATG and Stop positions manually.

Procedure
1. Select Set the Coding Region Manually under the Parameter menu in the Data Processing and Integration form. The form will appear (Fig. 25).
2. Type the ATG location and the Stop location in the labeled text boxes, then click on OK. The system will use the entered coding information.

Fig. 25. Manual ATG form for entering a desired coding region.

Appendix C. Set Exon Intron Scales

The diagram in the Data Processing and Integration form is drawn using scales calculated from the length of the transcript and the length of the genomic sequence involved. To reflect the relative sizes of introns and exons, the program uses two different scales, one for the exons and one for the introns. The program automatically sets the
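For readers who want to check an ATG position and Stop position by hand before typing them in, longest-ORF detection can be sketched in Python. This is a minimal sketch, not ExonTracker's own routine: it assumes that only the three forward frames are scanned, that an ORF starts at ATG and ends at the first in-frame stop codon, and that positions are 1-based with the Stop position pointing at the last base of the stop codon. The manual does not state these conventions, and the function name `longest_orf` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (atg_position, stop_position, length) for the longest ORF.

    Positions are 1-based. atg_position is the A of ATG and stop_position
    is the last base of the stop codon (an assumed convention). Only the
    three forward frames are scanned. Returns None if no ORF is found.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None  # index of the first ATG since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == "ATG" and start is None:
                start = i
            elif codon in STOP_CODONS and start is not None:
                length = i + 3 - start
                if best is None or length > best[2]:
                    best = (start + 1, i + 3, length)
                start = None
    return best


example = longest_orf("CCATGAAATAGGG")
```

If the result disagrees with the coding region annotated in the Entrez document, the annotated region is usually the safer choice, which matches the program's stated behavior.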
10. V. Data Export

The data extracted and generated by ExonTracker 2.0 can be exported in multiple formats designed for various research purposes and data presentations. The data export functions are organized under the Export menu in the Data Processing and Integration form (Fig. 15).

Fig. 15. The data export utilities under the Export menu in the Data Processing and Integration form. [Figure: Export menu items include Copy Diagram to Clipboard, Copy Table Content, Query Sequence With Intron Info, Genomic Copy of Query Sequence, and Alignment.]

1. Copy Diagram to Clipboard. Select the menu under Export. The diagram in the picture box is copied to the clipboard and is ready to be pasted into third-party picture software, including Microsoft PowerPoint and Adobe Photoshop.
2. Copy Table Content. Highlight the desired rows or columns in the spreadsheet and then select Copy Table Content under Export. The selected table contents are copied to the clipboard and can be pasted into a Microsoft Excel sheet.
3. Save the Spreadsheet. Select Save Data Table under the File menu to save the spreadsheet as an Excel file, if MS Office is installed on the computer.
4. Query Sequence With Intron Info. This function must be done after the Def
11. With the help of human intelligence, it is possible to use ExonTracker 2.0 to correctly process transcripts with many repeats in the genome, which are difficult to resolve by other means. The Filter function under the Maneuver menu is a very effective tool for dealing with difficult transcripts (Fig. 10). The Strand option deletes either all segments with a plus orientation or all segments with a minus orientation. Since the exons of one transcript should have the same orientation, this function can remove all unwanted segments whose orientation is opposite to that of the real exon segments. The Identities option removes alignments with lower identities, which usually are not true exon segments of the transcript. The Exon Length option is useful for removing short repeats.

Fig. 10. Filter form. Check Delete all plus strand rows to remove all segments with a plus orientation, and check Delete all minus strand rows to remove all segments with a minus orientation. To remove segments within a range of Identities values, use the Identities panel to set the range. To remove segments within a range of sizes, use the Exon Length panel to set the range. Click on OK to accept the settings or Cancel to quit. [Figure: form controls include Delete rows with identities above, Delete rows with identities below, Delete exons longer than, Delete exons shorter than, and Delete all minus strand rows.]

IV. Addition of Protein Domain Information to the Diagram in Data Proces
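The three filter criteria described above (strand, identities, exon length) can be expressed as a short Python sketch. It is an illustration only: the `Segment` record and the `filter_segments` function are hypothetical names, and the sketch covers just the lower-bound thresholds, whereas the Filter form also offers upper bounds.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    strand: str       # "plus" or "minus"
    identity: float   # percent identity reported by BLAST
    length: int       # aligned length in bases


def filter_segments(segments, drop_strand=None, min_identity=0.0, min_length=0):
    """Keep only segments that pass the strand, identity, and length checks."""
    kept = []
    for seg in segments:
        if drop_strand is not None and seg.strand == drop_strand:
            continue  # wrong orientation for this transcript
        if seg.identity < min_identity:
            continue  # low identity: unlikely to be a true exon
        if seg.length < min_length:
            continue  # short repeat
        kept.append(seg)
    return kept


segments = [
    Segment("plus", 100.0, 150),
    Segment("minus", 99.0, 200),
    Segment("plus", 85.0, 120),
    Segment("plus", 100.0, 12),
]
exons = filter_segments(segments, drop_strand="minus",
                        min_identity=95.0, min_length=30)
```

Applying the strand filter first usually removes the most rows, since every real exon of one transcript shares the same orientation.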
12. a dozen variants. Interestingly, a spliced variant is often found to be expressed in a tissue-specific manner, and the proteins encoded by the transcript variants have diversified biological functions. The fact that a gene can express multiple transcripts increases the amount of genetic information carried by the gene. This phenomenon provides both opportunities and challenges for scientists trying to understand the biological functions of genes. In recent years, bioinformatics has advanced tremendously along with progress in genome research; however, most of the effort has gone into sequence analysis and gene decoding. Desktop tools for analyzing alternatively spliced variants are extremely rare. With more cDNA sequences becoming available, molecular biologists now frequently encounter genes with multiple transcripts. Managing and understanding multiple transcripts requires sophisticated and user-friendly tools. GeneHarbor, Inc. has devoted its resources to developing a tool for this purpose. We are proud to present ExonTracker 2.0. ExonTracker 2.0 is a Windows-based software tool that enables users to take the analysis of typical genome BLAST data much further. Some of the unique features in the package, which we believe will enhance the ability of molecular biologists to study the exon-intron structure of a gene, make a graphic comparison of alternatively spliced variants, and evaluate the effects of an exon replace
13. folder and save all related sequence files in the folder.
2. Start ExonTracker 2.0, if it is not already running, and select Non-Similarity Search and Display under the Function menu of the ExonTracker 2.0 main form. The Non-Similarity Search and Display form will appear (Fig. 20).

Fig. 20. The Non-Similarity Search and Display form with sample data. [Figure: the form shows the Import File, Load Sequences, Browse, Add One, Add All, Clear All, and Selected Sequences controls with sampleseq.xon loaded. Non-similarity regions of the displayed sequence are colored red.]
14. forward primer and reverse primer. Use the two pairs of guidelines to set the boundaries. To design a primer pair that spans exon junctions, move the four guidelines so that the exon junctions lie between the two 3' guidelines. The boundaries can also be set to single-base precision using the four pull-down menus. Set the desired annealing temperature for the primer pair using the Tm pull-down menu. For subcloning (optional), you may add enzyme sites from the enzyme selection pull-down menus (RE site) for the forward and reverse primers. The enzyme sites listed in the menu do not occur in the sequence between the 5' end of the forward boundary and the 3' end of the reverse boundary, and the list is updated dynamically whenever the positions of the guidelines change. Add a few bases to the 5' end of the restriction site to ensure complete digestion of the PCR product with the selected enzyme. Select the number of primer pairs to be designed in the Oligo Returned pull-down menu. Click on the Design button to begin. It takes a few seconds for a primer pair to be displayed in the spreadsheet. The best pairs are always listed at the top of the table. Clicking on a primer sequence in the table highlights it in the sequence box, so that you can verify the primer sequence and examine the adjacent bases.

Notes: Highlight the rows you want to copy and then click on the
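The rule that the RE site menu offers only enzymes whose sites are absent from the bounded region can be sketched in Python. This is a hedged illustration, not the program's code: `usable_enzymes` and `ENZYME_SITES` are hypothetical names, the region is given as assumed 1-based inclusive coordinates, and the dictionary holds only the twelve recognition sequences that are legible in this manual. Because all twelve sites are palindromic, scanning one strand is sufficient.

```python
# Recognition sequences legible in the manual's enzyme list (all palindromic).
ENZYME_SITES = {
    "BamHI": "GGATCC", "EcoRI": "GAATTC", "KpnI": "GGTACC",
    "NotI": "GCGGCCGC", "PstI": "CTGCAG", "PvuII": "CAGCTG",
    "SacI": "GAGCTC", "SacII": "CCGCGG", "SmaI": "CCCGGG",
    "SpeI": "ACTAGT", "XbaI": "TCTAGA", "XhoI": "CTCGAG",
}


def usable_enzymes(sequence, start, end, sites=ENZYME_SITES):
    """Return enzymes whose site does not occur in sequence[start..end].

    start and end are 1-based, inclusive positions bounding the region
    that the PCR product will cover (an assumed coordinate convention).
    """
    region = sequence[start - 1:end].upper()
    return sorted(name for name, site in sites.items() if site not in region)


seq = "AAAGGATCCTTTGAATTCAAA"
whole = usable_enzymes(seq, 1, len(seq))      # BamHI and EcoRI excluded
partial = usable_enzymes(seq, 10, len(seq))   # only EcoRI excluded
```

Moving a boundary changes the region and therefore the list, which mirrors the dynamic update of the menu described above.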
15. menu. The Shape Parameter form will appear (Fig. 13).
c. Use the buttons and pull-down menus to adjust the parameters of the shape: back color, fill color, fill style, and shape.
d. Click on the OK button to accept the change, or on Cancel to exit without any change.

Fig. 13. The Shape Parameter form. [Figure: controls for Back Color, Fill Color, Style (for example, 5 Downward Diagonal), and Shape (for example, 2 Rounded Rectangle).]

6. To try different shapes for a domain, change the shape ID by typing, or by double-clicking on the shape number after selecting the number to be changed. Click on Shape It to redraw the diagram.
7. To copy the diagram, click on the Copy Drawing button to copy it to the clipboard, and then paste it into another picture editor such as MS PowerPoint or Adobe Photoshop.
8. To copy the table content, highlight the rows to be copied and click on the Copy Table button.
9. Click on the Merge button to transfer the diagram to the Data Processing and Integration form.
10. The protein domain diagram will merge with the existing diagram in the picture area of the Data Processing and Integration form (Fig. 14). If the domain labels are stacked on each other, move the mouse pointer to a label, hold down the left mouse button, and drag to separate them. Click on the ID column to select the row or rows corresponding to a domain. The dotted lines will connect exons with the corresponding region of
16. name will appear in the Selected Sequences pull-down menu. One by one, add all desired names to the Selected Sequences pool. Use the pull-down menu under Selected Sequences to select the sequence you wish to analyze. The sequence will appear in the text box below. This sequence is used as the query, and the other sequences whose names are in the Selected Sequences pool are used as the subjects to compare it with. Click on the Search button. The program starts the non-similarity search, which may take from a few seconds to several minutes depending on the number of sequences in the selected pool. When the process ends, the identified non-similarity regions of the sequence are marked with red lines in the diagram. If the sequences used are .xon files, the intron lengths and their locations in the transcript sequence are labeled; the program thus creates a diagram showing the coding information, the non-similarity information, and the exon-intron information (Fig. 20). The non-similarity regions of the sequence in the text box are also colored red. Click on the Copy Diagram button to copy the diagram to the clipboard, and then paste it into another picture-editing environment such as Photoshop or MS PowerPoint. To save the sequence with intron and non-similarity information, click on the Save Sequence button to show the Save file window. Select .doc as the file extension to save the sequence as a Microsoft Word document. This sequence file can be r
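The idea of comparing one query against a pool of subjects can be illustrated with a simple k-mer sketch in Python. The manual does not describe the algorithm ExonTracker uses, so everything here is an assumption made for illustration: a query base counts as similar if it lies inside any exact k-mer shared with at least one subject, and the remaining runs are reported as non-similarity regions in 1-based inclusive coordinates. The name `non_similar_regions` and the parameters `k` and `min_len` are hypothetical.

```python
def non_similar_regions(query, subjects, k=12, min_len=1):
    """Return 1-based inclusive (start, end) runs of the query that are not
    covered by any exact k-mer shared with a subject sequence."""
    query = query.upper()
    subject_kmers = set()
    for subject in subjects:
        subject = subject.upper()
        for i in range(len(subject) - k + 1):
            subject_kmers.add(subject[i:i + k])

    # Mark every query base that falls inside a shared k-mer.
    covered = [False] * len(query)
    for i in range(len(query) - k + 1):
        if query[i:i + k] in subject_kmers:
            for j in range(i, i + k):
                covered[j] = True

    # Collect the uncovered runs.
    regions = []
    start = None
    for pos, flag in enumerate(covered):
        if not flag and start is None:
            start = pos
        elif flag and start is not None:
            if pos - start >= min_len:
                regions.append((start + 1, pos))
            start = None
    if start is not None and len(query) - start >= min_len:
        regions.append((start + 1, len(query)))
    return regions


shared_a, unique, shared_b = "ATGGCCATTG", "TTTTTTTT", "CGCGATAGCA"
regions = non_similar_regions(shared_a + unique + shared_b,
                              [shared_a + shared_b], k=4)
```

Exact k-mer matching is much stricter than an alignment-based comparison, so a real tool would tolerate mismatches; the sketch only shows how a query, a subject pool, and reported regions relate to one another.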
17. sample of Blast Return in the Data Browser form. [Figure: the Data Browser form showing a BLAST alignment of the query against the subject (Identities 3429/3429, 100%; Strand Plus/Minus). Callouts mark the pull-down option menu with Exon Info selected and the Extract Data button.]

III. Data Processing and Integration

1. Wait until the Data Processing and Integration form appears (Fig. 6). The form has three parts. The top part is a picture box for drawing diagrams. When the form first appears, it already contains a diagram depicting the query sequence, with length and coding region labels based on the transcript information. The middle part shows the sequence alignment dynamically, in response to the movement of the mouse pointer within the picture box. The positions of the junctions of two adjacent aligned fragments and the locations of base pairs are also labeled according to the alignment position. Click on the picture box once to stop the movement and click again to resume. The bot
18. 19. Whole sequence alignment without amino acid sequence. [Figure: the Whole Sequence Alignment Report window (menus: File, Edit, Juncture) showing the alignment of NM_134144.1 against NT_039692.1. Callouts: click on Juncture to display the predicted junctions; predicted junction.]
19. 7295 as an example to demonstrate the process (see Fig. 1); its GI, 6552300, can also be used. Click on the Submit button next to the text box. This queries the nucleotide Entrez database at NCBI and returns the Entrez document of NM_007295 in the Data Browser form. This process may take from several seconds to several minutes. Wait until the entire page is completely downloaded (Fig. 2).

Fig. 2. The Data Browser form with the nucleotide Entrez content. [Figure: the Entrez record for NM_007295 (Homo sapiens BRCA1 transcript variant, GI 6552300). Callouts mark the pull-down option menu with Transcript Info selected and the Extract Data button.]
20. b. The Select and Copy functions are under the Edit menu, while Save and Print are under the File menu.
7. To export a whole sequence alignment without labeled amino acid residues, select Alignment Without Amino Acid.
Note: Unlike the previous function, it can be used to display the alignments of any selected number of exon segments (Fig. 19).

Fig
21. a. When the Select by Exon option is selected, clicking on the nucleotide sequence highlights the entire exon sequence; when Free Selection is in effect, the selection is not confined to an exon. To copy the selected sequence, select Copy under the Edit menu. The copied sequence can be pasted into another text editor.
22. GeneHarbor ExonTracker Version 2.0 User's Manual. © 2003 GeneHarbor, Inc. www.geneharbor.com

Copyright 2003 GeneHarbor, Inc. All rights reserved. GeneHarbor, ExonTracker, and xon are trademarks of GeneHarbor, Inc. All other trademarks and registered trademarks are the property of their respective owners.

License Agreement

GeneHarbor, Inc. grants a license to use the accompanying software and printed material to you, the original purchaser. This is a binding Agreement between you and GeneHarbor, Inc. Use of the software shall constitute your acceptance of this Agreement. Copying of the software is strictly prohibited, and adherence to this requirement is your sole responsibility. GeneHarbor, Inc. reserves the right to modify and update the software and printed material without obligation to notify you, the original owner, of any change in the software and printed material.

Limited Warranty

If the performance of the software does not meet the standard described in the documentation, GeneHarbor, Inc. will replace the software if notified within 30 days of purchase. In the event of a replacement agreed to by GeneHarbor, Inc., the original software, accessories, and documentation must be received by GeneHarbor, Inc. before a replacement is sent to you. Original users can upgrade to a new version of the software free of charge if one is available. Under circumstances sha
23. Click on the Extract Data button to show the Nucleotide Info form (Fig. 3). The form has three major parts. The top part displays several pieces of information about the transcript, including items read directly from the Entrez document (accession number, protein ID, description, tissue, length, and coding region) and two items generated by the program (the longest open reading frame (ORF) and the GC content). The middle part of the form is a simple diagram indicating the length and the coding region of the transcript, based on the information from the Entrez document. The lowest part is a text box containing the nucleot
24. and select Transcript Info from the pull-down menu next to the Extract Data button. There are three options: Transcript Info, Protein Info, and Exon Info. The program can usually detect the page contents, but it is recommended to make sure that the selected item corresponds to the content in the Data Browser form.

Fig. 3. The Nucleotide Info form with extracted data. [Figure: fields shown include Accession, Length, Product, Coding Region, Tissue, Longest ORF, and Definition. Callouts: to launch genomic Blast; select coding info; to query protein Entrez; copy sequence.]
25. below the picture area.

Fig. 23. Primer Design Form. [Figure: the form shows the Forward 5', Forward 3', Reverse 5', and Reverse 3' boundary guidelines, the Tm and RE site menus, the Additional base boxes, the Oligo Returned menu, and the sequence box (Browse, Paste, Clear) with NM_013982 loaded. Callouts: intron length; move mouse here to show the location in the transcript sequence; set the boundary of forward oligo; set the boundary of reverse oligo.]

Sample result rows shown in the figure (columns: Num, Name, Oligo Sequence, Direction, Location, cDNA Size, Genomic Size, Penalty, P Dimer):
Pair 1, NM_013982, CAGAAGAGGGTCCTGACCATC, Forward, location 1456, penalty 20, primer dimer No
Pair 1, NM_013982, CACGGAATCGTGATAGGGTGG, Reverse, location 1989, cDNA size 534, genomic size 4126, penalty 20, primer dimer No

Define the boundaries of the
26. between two adjacent exons, the program predicts the splicing junctions based on the splicing rules and then removes the overlapping base pairs. The program processes all junctions, starting from the row representing the 5' end of the query sequence.

Note:
a. If there are no more redundant segments in the region at the end of the process, the background color of the middle portion changes to indicate that data processing is complete. The data in the spreadsheet and the location labels in the alignment display are updated to reflect the segment data after trimming.
b. If the program detects a gap (a missing exon), it gives a warning message indicating the total number of missing base pairs and their location, and then fills the gap.
c. If the program detects a long stretch of overlapping base pairs, indicating a potential redundant exon in between, it gives a warning message with the location of the potential redundant piece and offers three options (Fig. 9). Abort closes the Data Processing and Integration form. Retry returns to the state before Define Exon Junction was used; in this case, previously made deletions and sorting remain in effect. Ignore continues the process despite the warning. Select Retry and check whether there is indeed a redundant exon between the two segments detected by the program. If there is one, delete it using the Delete function mentioned above. Repeat the
27. above process until there are no more redundant segments in the region. If you cannot find a redundant segment at the location detected by the program, the overlap may be naturally long and has been treated by the program as if there were a redundant segment (a false warning). Repeat Define Exon Junction and click on Ignore to dismiss the warning. The process will pass this junction by removing the overlapping base pairs.

Fig. 9. Warning message box. [Figure: the message asks the user to check whether there is a redundant exon between exon 1 and exon 2 and to try sorting the S Location first.]

d. The Define Exon Junction utility accomplishes two things: removing overlapping base pairs and predicting the exact splicing junction. The program tries to make the best judgment based on the sequence in the overlapping region. For the first goal, it can correctly remove the redundant base pairs at each junction. For predicting splicing junctions, the accuracy is near 100% when the number of overlapping base pairs is larger than one. When the overlap equals one base pair, there is not enough information to make a prediction based on the splicing rules, and the program removes one base pair at random. In this case, some predictions will be off by one base pair. Please take note of this issue. If the position of a splice junction is critical to your analysis, use other means to define it d
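How an overlap between two adjacent aligned segments can be resolved with a splicing rule is sketched below in Python. The manual does not say which splicing rules ExonTracker applies, so the sketch assumes the common GT...AG rule (the intron begins with GT and ends with AG) and 0-based genomic coordinates; the function name `candidate_splits` and its parameters are hypothetical.

```python
def candidate_splits(genome, a_end, b_start, overlap):
    """Return every split s (0..overlap) that yields a GT...AG intron.

    a_end:   0-based exclusive end of the upstream segment on the genome.
    b_start: 0-based start of the downstream segment on the genome.
    overlap: number of query bases covered by both segments.

    A split s assigns the first s overlapping bases to the upstream exon
    and the remaining overlap - s bases to the downstream exon.
    """
    genome = genome.upper()
    splits = []
    for s in range(overlap + 1):
        intron_start = a_end - (overlap - s)
        intron_end = b_start + s  # exclusive
        if intron_end - intron_start < 4:
            continue  # too short to hold both GT and AG
        donor = genome[intron_start:intron_start + 2]
        acceptor = genome[intron_end - 2:intron_end]
        if donor == "GT" and acceptor == "AG":
            splits.append(s)
    return splits


# Exon 1 "ATGCCAA", intron "GTAAGTTTTCAG", exon 2 "GTTCAA".
# The upstream alignment runs 2 bases into the intron because exon 2
# also starts with "GT", so the two segments overlap by 2 query bases.
genome = "ATGCCAA" + "GTAAGTTTTCAG" + "GTTCAA"
splits = candidate_splits(genome, a_end=9, b_start=19, overlap=2)
```

When more than one split satisfies the rule, or none does, the sequence alone cannot settle the junction, which is consistent with the manual's advice to confirm critical junctions by other means.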
28. e Setting menu to show the Link Setting form (Fig. 24). To add a new link, paste or type the link name in the text box just under "Select or Enter a Site Name", then type or paste the URL into the text box under "Enter a New Link". You can also click Use Current Link to input the Web link currently shown in the Data Browser form. Click the Add New button, then the Apply button. The new link is added to the links stored in the pull-down menu for genomic BLAST. To update the URL of an existing link, select the name from the pull-down menu under "Select or Enter a Site Name", then type or paste the updated URL into the URL box, or use the Use Current Link button. Then click Update, then Apply. The URL of the selected link is updated. To delete an existing link, select the link name and click the Delete button, then the Apply button. To restore all links to the original settings provided with the package, click Set All to Default, then Apply. Note: No change takes effect until you click the Apply button. All changes take effect immediately after clicking Apply and persist after the computer is restarted. Fig. 24. The Set Web Links form. Appendix B Manual
29. ead by the Primer Design in our previously released package, GeneLooper 2.0. With the sliding bars in the Primer Designer utility of GeneLooper 2.0, one can conveniently design primers that are specific to a unique region and that span introns. This is an extremely useful feature for studying the expression of splice variants. VII. Primer Design: Take Exon Junctions Into Consideration. RT-PCR has been widely used to detect gene expression. Successful DNA amplification depends partly on the pair of primers used. Optimized primers can increase the yield of amplified DNA and reduce the background caused by non-specific reactions. In addition to the general criteria, the locations of the primers are also critical for producing reliable data. Specifically, designing primers to span exon junctions can eliminate false-positive data caused by amplification of genomic DNA contaminating the mRNA used in the reaction. The primer design utility in ExonTracker was developed based on data from thousands of PCR reactions and has proven to be very reliable for designing optimal primers. Combined with the transcript sequence annotated with exon junctions and/or our exclusive primer design layout, one can easily design optimal primers that produce a fragment spanning exon junctions, thus obtaining unequivocal gene expression data. Procedure. Note: The primer design utility in ExonTracker 2.0 can process either
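The idea of placing primers so that the amplified fragment spans exon junctions can be checked with a few lines of code. This is only a sketch under stated assumptions: exon lengths are given in transcript order, positions are 1-based transcript coordinates, and the function names are hypothetical rather than anything exposed by ExonTracker.

```python
from itertools import accumulate


def junction_positions(exon_lengths):
    """Transcript position (1-based) of the last base of every exon but the final one."""
    return list(accumulate(exon_lengths))[:-1]


def junctions_in_amplicon(exon_lengths, fwd_start, rev_end):
    """Junctions lying inside the amplicon fwd_start..rev_end (inclusive).

    A junction at position j sits between bases j and j + 1, so it is
    spanned when fwd_start <= j and j + 1 <= rev_end.
    """
    return [j for j in junction_positions(exon_lengths) if fwd_start <= j < rev_end]


def genomic_product_length(exon_lengths, intron_lengths, fwd_start, rev_end):
    """Length of the same amplicon if it were amplified from genomic DNA."""
    junctions = junction_positions(exon_lengths)
    spanned = junctions_in_amplicon(exon_lengths, fwd_start, rev_end)
    extra = sum(intron_lengths[junctions.index(j)] for j in spanned)
    return (rev_end - fwd_start + 1) + extra
```

An amplicon that spans at least one junction yields a longer product (or none at all) from contaminating genomic DNA, which is what makes the two templates distinguishable.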
30. ect was completed. The direct outcome of the project is that the arrangement of the nucleotide base pairs comprising the human genome is now completely known. Its historic significance may not be fully understood at present, but it immediately provides an answer to a very important question: how many genes exist in the human genome? To the surprise of many of us, results from several groups indicate that humans have only about 35,000 to 45,000 genes, far fewer than originally thought. This number is not significantly higher than that of a lower eukaryotic organism such as the fly. Considering the much greater complexity of the human body, the amount of genetic information required for a human should conceivably be much larger than that for any lower eukaryotic organism. The explanation for this discrepancy may well lie in the regulation of gene expression. One of the major differences between eukaryotic and prokaryotic organisms is that the majority of mRNAs in eukaryotes are transcribed as pre-RNAs from intron-containing genes and processed into mature RNAs through RNA splicing. Evidence has shown that many genes have multiple forms of transcripts, which are formed by different combinations of exons. With the progress in cDNA cloning, it has become clear that the majority of genes have multiple transcript isoforms. Some genes have been found to express more than
31. g of S Location (blue colored) to select the entire column. c. Open the Sort menu under the Maneuver menu and select Sort Ascending if the orientation is plus, or Sort Descending if it is minus. The data in the spreadsheet will be sorted according to the S Location column. d. Examine the Q Location column and search for the smallest number, which marks the first segment within the candidate region (Fig. 7). Highlight all rows above the row just identified. Because the selected rows are not true exon segments, delete them using the Delete Row function under the Maneuver menu. Fig. 7. The spreadsheet with the upper non-exon segments selected after sorting the S Location column. [Screenshot: spreadsheet rows, with the first segment of the query sequence in the candidate block marked.] e. Highlight all rows below the last segment (the last exon). To identify it, add the number in Q Location to the number in Tot Bases and check whether the sum roughly equals the length of the query sequence, Q Length (Fig. 8). Delete the selected rows as described above, because they are also non-exon segments. Fig. 8. The spreadsheet with the lower non-exon segments selected after sorting the S Location column. [Screenshot: spreadsheet, continued in the next fragment.]
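Steps c to e above amount to a sort followed by two cuts, which can be sketched as follows. The row dictionaries, their keys (`q_loc`, `s_loc`, `tot_bases`), the function name, and the tolerance are illustrative assumptions; the program itself leaves the choice of first and last segment to the user, so this is a simplification, not the program's algorithm.

```python
def trim_candidate_block(rows, strand, q_length, tolerance=5):
    """Keep only the rows between the first and last exon segment.

    rows      -- list of dicts with 'q_loc', 's_loc' and 'tot_bases'
    strand    -- 'plus' sorts S Location ascending, 'minus' descending
    q_length  -- length of the query sequence (Q Length)
    tolerance -- how far q_loc + tot_bases may be from q_length
    """
    if not rows:
        return []
    ordered = sorted(rows, key=lambda r: r["s_loc"], reverse=(strand == "minus"))
    # Step d: the first segment is the row with the smallest Q Location.
    first = min(range(len(ordered)), key=lambda i: ordered[i]["q_loc"])
    # Step e: the last segment ends roughly at the end of the query.
    last = len(ordered) - 1
    for i in range(first, len(ordered)):
        end = ordered[i]["q_loc"] + ordered[i]["tot_bases"]
        if abs(end - q_length) <= tolerance:
            last = i
            break
    return ordered[first:last + 1]
```

If several unrelated hits share the smallest Q Location, this sketch simply takes the first one after sorting, whereas a user inspecting the spreadsheet would pick the one inside the candidate block.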
32. g the order of the transcript and incorporating amino acid sequences translated from the transcript and the subject sequence, a useful feature for comparing a query sequence with the genomic sequence at the amino acid level. Users who have performed sequence BLAST and Entrez queries will have little difficulty using ExonTracker 2.0. The specialized Web browser in the software has many shortcuts to frequently used query pages. ExonTracker 2.0 has convenient links for transferring data from one type to another. Users can perform data analysis online, or offline using previously saved genomic BLAST data and query nucleotide and/or protein Entrez documents. To help users understand the logic and data flow in ExonTracker 2.0, we have created a flowchart, shown on page 6, that gives an overview of the software. In the flowchart, the data sources and functional operations are presented and linked by arrows and lines. The detailed procedure for each utility can be found in the later chapters of the manual. [Flowchart: data sources and functional operations of ExonTracker 2.0.]
33. [Screenshot for Fig. 1: the Data Entry form with a sample FASTA sequence and the Approach labels.] Approach I: Beginning with a nucleotide sequence in FASTA format. 1. Paste a nucleotide sequence into the text box from the clipboard, or use the Browse button to load a sequence file stored on your computer. It is now ready to BLAST against a genomic database. Approach II: Beginning with a nucleotide accession number. 1. Enter a nucleotide accession number in the text box titled "Enter a nucleotide ID" in the Data Entry form. Here we use the accession number for human breast cancer 1, early onset (BRCA1), NM 00
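Approach I expects a nucleotide sequence in FASTA format. The following minimal sketch shows one way to check such a file before pasting it into the text box; the function names are hypothetical and the accepted alphabet (A, C, G, T, U, N) is an assumption, since the manual does not say which characters the program tolerates.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def is_nucleotide(seq):
    """True if the sequence is non-empty and uses only nucleotide letters."""
    return bool(seq) and set(seq) <= set("ACGTUN")
```

A protein sequence pasted by mistake fails `is_nucleotide`, which is a cheap check to run before submitting a long BLAST job.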
34. he custom installation method. All components are required by the program and should be kept in the designated folder at all times. After installing ExonTracker 2.0, the interface automatically continues to install the licenser device driver required to run ExonTracker 2.0. Follow the instructions to finish the process. An icon for ExonTracker 2.0 will be placed on the desktop of your computer. After the installation, attach the licenser key to your computer. Double-click the ExonTracker 2.0 icon on the desktop to run ExonTracker 2.0. Overview. ExonTracker 2.0 is designed to help researchers understand how a transcript is assembled through RNA splicing. With the software, users can easily answer basic questions such as the number of exons that form a transcript, the distance between two adjacent exons (the intron length), the exons encoding a particular protein domain, and so on. ExonTracker 2.0 relies entirely on the resources and data in the public databases created by the National Center for Biotechnology Information (NCBI). In particular, it uses the BLAST data returned from querying a transcript against a genomic database. It also uses the annotated information presented in the Entrez documents for both nucleotide and protein sequences. While the data from NCBI already provide tremendous help to researchers, ExonTracker 2.0, taking advantage of the processing ability and flexibilit
35. ide sequence, designed for copying and saving the nucleotide sequence. 5. Click the To Genomic BLAST button next to the nucleotide accession number. This action inputs the accession number into the text box in the Data Entry form for genomic BLAST. Note: You may wonder why we do not simply put the accession number directly in the text box for genomic BLAST. The reason for going through the steps described above is to retrieve the actual sequence of the transcript from NCBI and store it in the program for later use. Approach III: Beginning with a protein accession number. 1. Type the protein accession number in the text box titled "Enter a protein ID". Here we use the accession number for human breast cancer 1, early onset (BRCA1), NP_009226, as an example. When its GI, 6552301, is used, the results will be the same (Fig. 1). 2. Click the Submit button next to the text box. This queries the protein Entrez database at NCBI and returns the Entrez document of NP_009226 in the Data Browser form (figure not shown). Wait until the entire page has downloaded completely. 3. Click the Extract Data button to show the Protein Info form (Fig. 4). The top part of the form contains several pieces of information about the protein, including items read directly from the Entrez document (Accession number, Nucleotide Accession Number, Definition, Cytogenetics, Length) and a new item generated by the program: the calculated molecular weight of the protein. Other c
36. ine Exon Junction. To use this exclusive data format created by GeneHarbor, Inc., select the menu under Export to display the Marked Query Sequence form (Fig. 16). Fig. 16. The Marked Query Sequence form with a sample dot file. [Screenshot: the Marked Query Sequence form, showing a sample sequence marked with exon junctions and intron lengths.]
37. ll GeneHarbor, Inc., its officers, or its distributors be liable for any indirect, incidental, consequential, or exemplary damages arising from the use of or the inability to use the software, even if they were aware of the possibility of such damages. GeneHarbor, Inc. info@geneharbor.com www.geneharbor.com Table of Contents. Copyrights. Introduction. CD-ROM Installation. Overview. Protocol: I. Preparation for launching the genomic BLAST; II. Genomic BLAST; III. Data Processing and Integration; IV. Addition of Protein Domain Information; V. Data Export; VI. Non Similarity Search and Display; VII. Primer Design: Take Exon Junctions Into Consideration. Appendix: A. Web Link Update; B. Manual Selection of A Coding Region; C. Set Exon and Intron Scales. Introduction. Several years ago the human genome proj
38. ment on the protein functional domains. The innovative dot file (.xon) is extremely useful for designing PCR primers that span exon junctions or lie within an exon. We hope you will enjoy using ExonTracker 2.0 in your research, and we would like to hear your opinions on the software and your suggestions for future improvement. GeneHarbor, Inc. January 31, 2004. Installation. System Requirements. The recommended system and configuration for ExonTracker 2.0 (Component: Minimum Requirement). Processor: Intel Pentium III or compatible, 500 MHz or greater. RAM: 256 MB or greater. Display: 800 x 600 resolution, small fonts setting, and 256 colors or greater. Operating System: Windows 95, Windows 98, Windows Me, Windows NT 4.0, Windows 2000, or Windows XP. Drives: CD-ROM drive. CD-ROM Installation. If you have Autoplay turned on, your computer will automatically run the CD-ROM interface; otherwise, follow these directions. Insert the CD-ROM into your CD-ROM drive. From the Windows desktop, double-click the My Computer icon. Double-click the CD-ROM icon. Double-click the setup icon to start the installation interface. Follow the instructions for each step of the installation process. ExonTracker 2.0 will be installed into a default folder assigned by the program. If you do not want to use the default folder, you can install it to another location by choosing t
39. [Screenshot: spreadsheet rows 23 to 25, with the last segment of the query sequence in the candidate region marked.] f. Click the heading of ID (Click Here) (blue colored) to select the entire ID column. This action causes the program to draw all exons (rectangles) and introns (lines labeled with intron lengths) above the existing transcript diagram. Each exon is connected to the corresponding location on the query sequence by a pair of dotted lines. Steps d and e remove all unrelated segments (non-exon matches) upstream of the first exon and downstream of the last exon, but not the unrelated matches inside the exon region. These can be removed manually by examining the diagram: if more than one genomic rectangle points to the same location on the transcript, there is a redundant non-exon segment in the region. Delete the row to eliminate the non-exon segment, then click the heading again to redraw the diagram. The next step can detect a redundant segment between two true exons automatically. g. Select Define Exon Junction under the Maneuver menu. By examining the overlapping sequences
40. ontents of the form will be discussed in later chapters. Fig. 4. The Protein Info form with extracted data. [Screenshot: the Protein Info form for NP_009226.1, showing the accession, length (1863), coding nucleotide accession (NM_007295.1), molecular weight, cytogenetics, definition, and a domain table listing a RING finger domain and two Transcriptional regulator (KOG4362) domains.] 4. Double-click the Nucleotide accession number to query the nucleotide Entrez database at NCBI and retrieve the Entrez document of NM_007295. The Data Browser form with the nucleotide Entrez content will appear (figure not shown). 5. Follow the steps described in Approach II, beginning from step 3, to complete the data extraction and submit the ID to the Data Entry form for BLAST. Genomic BLAST. The operation of Genomic BLAST is the same as for the regular genomic BLAST provided by NCBI. The Data Entry form serves simply as a customized Web browser with many shortcuts to some frequently used genomic database
41. b. To save the sequence file, select Save As to open the standard Save As window. The file can be saved as or txt file. c. To print the file, select the Print function under the File menu. d. Click the Remove Number button to remove all numbers inserted in the sequence. 5. Genomic Copy of Query Sequence. The assembled genomic sequence matching the transcript is exported by selecting the menu under Export. The sequence is displayed in a text editor similar to the previous one. Each exon region is colored and can be selected by exon or freely (figure not shown). Like the previous file, it can be copied, saved, and printed. 6. Export Whole Sequence Alignment With Amino Acid Sequences. This function can be used only after finishing Define Exon Junction and with all exon rows selected (Fig. 17). Fig. 17. The required status of the Data Processing and Integration form just before exporting the whole-sequence alignment with amino acids labeled. [Screenshot: the Data Processing and Integration form with all exon rows selected.]
42. s. The genomic BLAST links preset in the pull-down menu under the text box can be deleted, added, and updated. Refer to Appendix A for additions or modifications. 1. After the last step of any of the approaches described in the previous chapter (Fig. 1), there should be a nucleotide sequence or an ID in the text box for BLAST. Select the desired genomic database from the pull-down menu under the text box. 2. Click the Submit button to launch BLAST. This action transfers the content of the text box in the Data Entry form to the input box of the regular genomic BLAST page. 3. Wait until the content appears in the input box of the standard genomic BLAST page provided by NCBI. Uncheck the MegaBlast option and set the Filter option to because BLAST with these two options sometimes results in the loss of short fragment alignments and, consequently, the loss of short exons. 4. Click the Begin Search button to submit your BLAST request. This procedure is identical to the regular BLAST procedure. Wait until the BLAST data have returned completely; the Data Extract button gradually becomes clear and the pull-down menu next to it displays Exon Info. If it does not display the item, select Exon Info manually (Fig. 5). 5. Click the Data Extract button to extract the BLAST data and transfer them to the Data Processing and Integration form. It may take a few seconds to process the data and display the form if the file is large. Fig. 5. A
43. scales to draw the diagram so that it fits well in the drawing area. To compare two or more transcripts graphically, it is better to draw them with the same set of scales. The program includes a utility that lets users set the scales manually. Procedure. 1. Select Exon Intron Scales under the Setting menu of the Data Entry form, or under the Parameter menu of Data Processing and Integration, to show the Exon Intron Scales form (Fig. 26). Fig. 26. Setting the Exon Intron scales manually. [Screenshot: the Exon and Intron Scales form, with Autoscale and manual selection options, over the Data Processing and Integration form.]
44. sequence files (.seq) or dot xon files (.xon). This function can be accessed through Data Entry under the Function menu (Fig. 22A) or through Data Processing and Integration under the Export menu (Fig. 22B). The following procedure shows how to start primer design using the latter after the completion of Define Exon Junction. Fig. 22B. [Screenshot: the Export menu with the Primer Design item.] 1. Upon completion of Define Exon Junction, select Primer Design under Export in the Data Processing and Integration form. The query sequence, annotated with exon junctions, is transferred to the Primer Design form (Fig. 23). Note: In the picture area of the form, a horizontal line depicts the length of the input sequence. The coding region of the sequence is marked by lines labeled ATG and Stop. Each exon junction is indicated by a vertical line below the horizontal line, and the corresponding intron length is also labeled. Above the horizontal line, four vertical lines (guidelines) are set in place for defining the regions from which to select primers. The lines can be moved by pointing the mouse at a line label, holding down the left mouse button, and dragging to the desired location. The position of each line can also be adjusted precisely using the corresponding pull-down menus located just
45. sing and Integration form. This section describes how to add protein domain information to the diagram created during Data Processing and Integration. If you do not want this information in your analysis, you can go directly to Data Export, because the protein domain information is not essential to other operations in the analysis. The Protein Info form was introduced briefly in a previous section. Follow the procedure described below to draw a protein diagram with its domain information. 1. Use the method described previously to retrieve a protein Entrez document into the Data Browser form. Check the check box next to CDD and click the Display button to get the annotated protein domain information (Fig. 11). Click the Extract Data button, with the Protein Info item selected in the pull-down menu, to display the Protein Info form (Fig. 12). Fig. 11. Method to get the pre-annotated domain information. [Screenshot: the Data Browser form showing the NCBI protein Entrez page for NP_009226, with the callouts 1. Check, 2. Click, and 3. Click.]
46. the transcript and then with the protein domain. The panel on the right of the middle part will display the beginning and ending exons involved. 11. To hide the connection lines or the intron labels, select the corresponding functions under the Effect menu. Fig. 14. Data Processing and Integration form with protein domain information. [Screenshot: the transcript diagram with the RING finger, Transcriptional regulator, and New Discovery domains drawn above the alignment of NM_007295.1 with NT_010755.14 and spreadsheet rows 12 to 17.]
47. tom part is a spreadsheet for storing and arranging the extracted data. The column contents of the spreadsheet are described in Table I. In addition to displaying the extracted data, the spreadsheet also serves as an operation panel for data processing. Fig. 6. The Data Processing and Integration form with extracted sample data. [Screenshot: the form with the candidate block marked in the spreadsheet.] Table I. The descriptions of the column contents in the spreadsheet. Query: the accession number of the query sequence; Query_seq_1 is used if there is no accession number. Subject: the accession number of the subject sequence; one accession is used for all aligned fragments found in the contig. Query
48. [Screenshot, continued: the Entrez record for NP_009226 (breast cancer 1, early onset; Homo sapiens; 1863 aa; GI 6552301).] Fig. 12. The Protein Info form with sample data. In the form, the general information about the protein, read directly from the Entrez document, is displayed in the top panel. The panel also includes the molecular weight calculated by the program. The annotated protein domain information from Entrez is displayed in the spreadsheet. The program also detects hydrophobic regions in the protein using Kyte & Doolittle's method and displays each positive detection in the spreadsheet as one item. [Screenshot: the Protein Info form for NP_009226, with the RING finger and Transcriptional regulator (KOG4362) domains and a customized item, New Discovery, at 920 to 980.]
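The hydrophobic-region detection mentioned above can be sketched with the published Kyte and Doolittle hydropathy scale. The window length of 19 residues and the cutoff of 1.6 used here are commonly quoted values for membrane-spanning segments and are assumptions; the manual does not state which parameters ExonTracker uses, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy of every full window along the protein."""
    scores = [KD[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophobic_regions(protein, window=19, threshold=1.6):
    """Merge windows at or above the threshold into 1-based (start, end) spans."""
    regions = []
    for i, score in enumerate(hydropathy_profile(protein, window)):
        if score >= threshold:
            start, end = i + 1, i + window
            if regions and start <= regions[-1][1] + 1:
                # Overlapping or adjacent window: extend the current region.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions
```

Because a region is reported as the union of whole windows, its boundaries are somewhat wider than the hydrophobic stretch itself. Residues outside the 20 standard letters raise a `KeyError` in this sketch.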
49. y of a user's computer and the user's intelligence, provides the means for users to analyze sequence data further according to their needs and to get the most out of a query sequence. To complete these tasks, ExonTracker 2.0 retrieves three pieces of information related to a transcript sequence (the nucleotide Entrez document, the protein Entrez document, and the alignment data obtained by BLASTing the transcript against a genomic database) and then processes and integrates the data to create a dynamic, graphics-rich model depicting the gene, the transcript, and the protein. Using the model, one can easily identify the number of exons that compose the transcript, the lengths of the introns in the gene, and, more importantly, the exon or exons corresponding to a particular protein domain. The most remarkable thing about the software is its ability to export data in multiple formats that researchers have long sought. One example is the so-called dot file. It is a transcript sequence in FASTA format mosaicked with the intron information. With this simple format, the exon and intron information of a transcript can be stored, transmitted, and reproduced with a simple interpreter, along with its sequence information. It is extremely useful for designing primers that produce PCR products covering multiple exons or lying within a single exon. The program can also precisely assemble the fragmented alignments provided by NCBI to create a whole-sequence alignment followin