
RNApasta User Manual, 2009 Dec 3rd


Contents

1. [Screenshot: pasta alignment with partition tags on the sequence labels (Pse put, She put, Alt hal, Mar hyd, and others), shown beside the Alignment Tools menu (Delete Gap Cols, Del Seq by Reg Size, Partition by Stems, Partition by Loops).]
2. [Screenshot: interpreted Rfam alignment shown beside the Analysis Tools menu (BasePair Freq Reg, Entropy By Pos, Stack Doublets, PseudoFreqs, Region Freq, Loop Summary) and the Alignment Tools menu.]
3. [Screenshot: the four-pane interface, showing the Interpreted pane ("Pairing regions displayed here"), the Output pane ("Output displayed here"), the Input pane, and the function list with Analysis Tools, Alignment Tools, and Phylogenetic Tools.]

Pane 1: The top menu bar. This pane provides functions to save and load Stockholm or Pasta files. It also holds some commonly used functions, such as Push Analysis Up, whose use is explained in a later section.

Pane 2: The file input textbox. This pane acts primarily as a medium for viewing, interpreting, parsing, and verifying the Stockholm or Pasta files loaded into the application. In most cases the user can ignore this portion. However, in the web start version of RNApasta the File Load feature is disabled for security reasons, so the file input pane is where the user pastes the Stockholm or RNApasta text into the application.

Pane 3: File output textboxes. This pane displays the results of the interpretation and is where the user can view results from the various functions executed within the application (functions from Section 4).

Pane 4: The Function Selector. We designed this portion according to the classification of the
4. [Screenshot: an RNase P Rfam file (RNAseP BactA) loaded in RNApasta. The input pane shows the Stockholm header (Bacterial RNase P class A; The Ribonuclease P Database, PMID 9847214) and the Interpreted pane shows the aligned sequences.]
5. [Screenshot: Interpreted pane showing the pairs and index lines (numseq 87) above the aligned sequences, beside the Basic Tools menu (Interpret Input, Draw Arcs, Pasta Edit, Pseudoknot Removal, Remove Partition Tag, Highlight Stems, Remove All Highlights).]
6. [Screenshot: the Phylogenetic Tree window (Multiple Selection, Stop Selection, Remove Selected) showing the tree from The_mar to Esc_col, above the pasta file with its pairs and index lines.]
7. [Screenshot: interpreted alignment before highlighting.]

To help you visualize the stems, click on Highlight Stems. The application will highlight all the regions within each sequence that are capable of forming a stem.

[Screenshot: the Highlight Stems entry under Basic Tools.]

Once you click on Highlight Stems, you will get the following result:

[Screenshot: Interpreted pane with the stem regions highlighted.]
8. [Screenshot: pasta output with partition tags on the sequence labels (Dei rad, Por gin, Chl tep, Chl tra, Chl mur).]
9. [Screenshot: end of the pasta output with partition tags.]

Once the user clicks on Save Tree, a dialog will prompt for saving the tree as an image. The following is the full image output.

[Figure: phylogenetic tree of all sequences in the alignment, from The mar and Aqu aeo through Myc cap and Myc gen. Scale bar: 0.1 expected substitutions per site.]
10. [Screenshot: pasta output with partition tags on the sequence labels (The the, Dei rad, Por gin).]
11. [Screenshot: an alignment of 602 sequences (numseq 602) in the Input and Interpreted panes, with the Alignment Tools menu showing Extract PK, Extract Regions, and Erase Region Label.]
12. [Screenshot: the Extract Region Options dialog, with Start Region and End Region selectors and the Afterwards options (Copy bottom to top, Reinterpret pasta).]

Specify the start and end regions and you will get a new tab with the extracted region.

[Screenshot: the new tab showing the extracted region for X07545.1, M21086.1, X01588.1, M16530.1, U05019.1, X05870.1, and AF034619.1.]

If you want to perform the analysis on this new seq
13. methods.

Region Freq: This measures the frequency with which each pairing region appears in the sequence alignment, not counting sequences that have no bases in that region of the alignment.

Stem Summary: This function reports the length distributions of the subregions of stems. The subregions are the length from the beginning of the sequence, the length of the 5' stem, the central loop, the length of the 3' stem, and the length to the end of the sequence. Summary statistics are followed by the complete length distributions.

Loop Summary: This function calculates the length distributions of each non-paired sequence region. These are named by the flanking pairing regions, so that non-paired region AB lies between pairing regions A and B.

We also provide functions to assist with editing and aligning pasta sequences, e.g., extracting a portion of a pseudoknot, or removing a particular stem or loop region of the pasta sequence.

Note: Some of these functions can result in a new, modified RNApasta sequence alignment, which is why we have a Push Analysis Up function. See the example below.

[Screenshot: RNApasta main window with Push Analysis Up on the top menu bar.]
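As a rough illustration of the Region Freq idea, the following sketch counts, for each pairing letter, the fraction of sequences that have at least one base in that region's columns. It is not RNApasta's code: the function name `region_frequency`, the gap characters, and the choice to treat upper- and lowercase letters as separate regions while ignoring the index line are all assumptions made for the example.

```python
def region_frequency(pairs, sequences, gaps="-. "):
    """Fraction of sequences with at least one base in each pairing region.

    pairs     -- pairing indicator line, one symbol per alignment column
    sequences -- list of aligned sequences (same column layout as pairs)
    gaps      -- characters treated as "no base" (an assumption)
    """
    # Collect the alignment columns that belong to each pairing letter.
    columns = {}
    for col, symbol in enumerate(pairs):
        if symbol.isalpha():
            columns.setdefault(symbol, []).append(col)

    freq = {}
    for symbol, cols in columns.items():
        # A sequence "has" the region if any of its columns holds a base.
        present = sum(
            1 for seq in sequences
            if any(c < len(seq) and seq[c] not in gaps for c in cols)
        )
        freq[symbol] = present / len(sequences)
    return freq
```

For example, with `pairs = "AA..aa"` and the two sequences `"GC..GC"` and `"--..GC"`, region `A` is present in half of the sequences and region `a` in all of them.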
14. the Pasta lines, as it will have nothing with which to pair. The Push Analysis Up function can then be used to copy the extracted regions to the upper input textbox, after which Parse Pasta can be used to generate an analysis of the extracted regions.

Alignment Subdivision Functions

Load an RFam file into RNApasta.

[Screenshot: the Alignment Tools menu (Non Canonical, Extract PK, Delete Gap Cols, Del Seq by Reg Size, Partition by Stems, Partition by Loops).]

Click on Extract Regions.

[Screenshot: the Interpreted and Extract Reg tabs, showing the pairs and index lines (numseq 602) above the aligned sequences.]
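The column arithmetic behind a region extraction can be sketched as follows. This is only an illustration under stated assumptions: `extract_regions` is a hypothetical helper, regions are identified by their pairing letter alone (the index line is ignored), and the sketch does not remove pairing letters whose partner falls outside the cut, which the manual notes must be dealt with.

```python
def extract_regions(pairs, sequences, start, end):
    """Cut out the columns from the first `start` column to the last `end` column.

    pairs     -- pairing indicator line
    sequences -- dict mapping label -> aligned sequence
    start/end -- pairing letters marking the first and last region to keep
    """
    first = pairs.index(start)    # raises ValueError if the letter is absent
    last = pairs.rindex(end)
    if last < first:
        raise ValueError("end region lies before start region")
    cut = slice(first, last + 1)
    # Apply the same column slice to the pairs line and to every sequence.
    return pairs[cut], {label: seq[cut] for label, seq in sequences.items()}
```

The result could then be treated like any other alignment, which is the role Push Analysis Up plays in the application.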
15. [Screenshot: pasta output with partition tags on the sequence labels (Aqu aeo, The mar, The the).]
16. [Screenshot: interpreted alignment (M16530.1, U05019.1, X05870.1, AF034619.1).]

Move the mouse beside a column, with the cursor to the right of the stem, and click on Pair Col Highlight.

[Screenshot: the alignment with a paired column highlighted.]

To provide a clean visualization of the structure of the RNA we pro
17. [Screenshot: interpreted alignment with stems highlighted.]

For a lighter version of the highlight, the user can also choose to highlight one column at a time by performing a right mouse click. This is useful for looking at specific nucleotides for covariance.

[Screenshot: the right-click menu (Find Next Selection, Copy Selection, Paste, Highlight Stems, Pair Col Highlight) over the alignment of 602 sequences.]
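To make the covariance idea concrete, the sketch below finds the column paired with a chosen column and tallies the base pairs observed there. It is an illustration, not the application's implementation. It assumes that an uppercase letter marks the 5' half of a stem and the matching lowercase letter the 3' half, that each letter labels a single stem (the index line is ignored), and that the halves pair from the outside in; `partner_column` and `pair_counts` are hypothetical names.

```python
from collections import Counter

def partner_column(pairs, col):
    """Return the column paired with `col`, or None if it is unpaired."""
    symbol = pairs[col]
    if not symbol.isalpha():
        return None                                   # unpaired column
    upper = [i for i, s in enumerate(pairs) if s == symbol.upper()]
    lower = [i for i, s in enumerate(pairs) if s == symbol.lower()]
    if len(upper) != len(lower):
        return None                                   # halves do not match up
    # The k-th column of one half pairs with the k-th-from-last of the other.
    if symbol.isupper():
        return lower[len(lower) - 1 - upper.index(col)]
    return upper[len(upper) - 1 - lower.index(col)]

def pair_counts(pairs, sequences, col):
    """Count the base pairs observed at `col` and its partner column."""
    mate = partner_column(pairs, col)
    if mate is None:
        return Counter()
    left, right = sorted((col, mate))
    return Counter(seq[left] + seq[right] for seq in sequences)
```

A column where, say, both `GC` and `AU` occur across the sequences is the kind of compensating change one would look for when inspecting a highlighted pair of columns.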
18. [Screenshot: interpreted Rfam alignment shown beside the Alignment Tools menu (Extract PK, Extract Regions, Erase Region Label, Delete Gap Cols, Partition by Stems, Partition by Loops).]
19. [Screenshot: the file chooser listing the tmRNA A1Z1 and tRNA files, with an RNAseP BactA rfam file entered as the file name.]

By this time the application will have opened the file in the input pane and will ask whether the user wants to interpret the file as pasta or Stockholm format. If the user presses OK, RNApasta will attempt to parse the input and write the result to the output section.

[Screenshot: the Input Data Format dialog (Choose Data Format: Pasta, Stockholm (Rfam), Cancel) over the loaded Stockholm file for bacterial RNase P class A.]
20. [Screenshot: the Copied From Bottom tab showing the alignment after the analysis has been pushed up.]

Erase Region Labels: This will erase a pairing region (stem) from the pairing indicator line. It does not alter any of the sequences themselves. After the erasure, an automatic copy bottom to top (Push Analysis Up) and re-interpret are available options.

Pseudoknot Removal: This specifically removes all the pseudoknots in the structure line to create an alignment that can be used by programs that do not model pseudoknots.

Del Gap Cols: This function will remove columns from the alignment that contain a gap in all sequences and also in the pairing indicator line. The Copy B to T function can then be used to start more analyses.

Partition By Stems: This allows the user to divide the data set in 2 ba
21. [Screenshot: pasta output showing the pairs and index lines above sequences with partition tags on their labels (Aqu aeo, The mar, The the).]
22. [Screenshot: pasta output with partition tags on the sequence labels (Pse put, She put, Alt hal, Mar hyd, Pse hal, Vib cho, Aer sal, Esc col).]

To initiate the phylogenetic routine, click on Phylogenetic Stem Loop Parse.

Note: In order to use this routine, the user must obtain a Newick tree file in which each Newick node name is the same as the title of the corresponding fasta tag.

[Screenshot: the tmRNA A1Z1 pasta file loaded in RNApasta.]
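Before running the phylogenetic routine it can help to check that the tree and the alignment use the same names. The sketch below is one hypothetical way to do that outside RNApasta. It assumes a plain Newick string without quoted names or comments, and the helper names `newick_leaf_names` and `unmatched_labels` are made up for the example.

```python
import re

def newick_leaf_names(newick):
    """Leaf names of an unquoted Newick string."""
    # A leaf name directly follows "(" or ","; internal labels follow ")".
    found = re.findall(r"[(,]\s*([^(),:;]+)", newick)
    return [name.strip() for name in found if name.strip()]

def unmatched_labels(newick, labels):
    """Return (labels missing from the tree, leaves missing from the alignment)."""
    leaves = set(newick_leaf_names(newick))
    labels = set(labels)
    return sorted(labels - leaves), sorted(leaves - labels)
```

Two empty lists mean every sequence label has a matching leaf and vice versa.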
23. [Screenshot: illegible in this copy.]

Now let's take a step back to the previous screen and consider the scenario in which the user clicks on Into Label Line.

[Screenshot: the Partition Stem dialog ("Partition sequences by stem lengths"), with a data viewer (Name, Length), an A1 stem length selector, and Select Length for All Regions. It offers two actions:
Partition Into Two Subsets: the data will be divided into 2 sets, above and below the selected length.
Indicate on The Label Line: an indicator will be added to the fasta label line if the sequence is above or below the selected length.]

The following partition tag will be added to the new pasta output. This is crucial for the phylogenetic study pipeline.

[Screenshot: pasta output with partition tags on the sequence labels (Dic nod, Fra tul, Xyl fas, Leg pne, Aci fer).]
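The two partition modes in the dialog can be sketched roughly as below. This is an illustration under stated assumptions rather than the program's logic: stem length is taken to be the number of non-gap bases in the columns carrying the region's letter, the tag text appended to the label is a made-up format, and the function names are hypothetical.

```python
def stem_length(pairs, seq, region, gaps="-. "):
    """Number of non-gap bases in the columns labelled `region`."""
    return sum(1 for sym, base in zip(pairs, seq)
               if sym == region and base not in gaps)

def partition_by_stem(pairs, sequences, region, cutoff):
    """Mode 1: split {label: sequence} into (above cutoff, at or below cutoff)."""
    above, below = {}, {}
    for label, seq in sequences.items():
        target = above if stem_length(pairs, seq, region) > cutoff else below
        target[label] = seq
    return above, below

def tag_labels(pairs, sequences, region, cutoff):
    """Mode 2: keep one set and append a 1/0 indicator to each label."""
    tagged = {}
    for label, seq in sequences.items():
        flag = int(stem_length(pairs, seq, region) > cutoff)
        # Hypothetical tag layout: "<label> <region>><cutoff>=<flag>"
        tagged[f"{label} {region}>{cutoff}={flag}"] = seq
    return tagged
```

Keeping the indicator on the label line, rather than splitting the file, leaves all sequences in one alignment, which is convenient when the tags are later compared against a tree.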
24. RNApasta User Manual, 2009 Dec 3rd
By Tim Shaw

Introduction

RNApasta is a Java application that calculates a variety of useful statistics related to RNA stem-loop and pseudoknot structures. It also performs functions related to alignment editing, primarily the generation of subsets of data where the original data set is heterogeneous with respect to some structural feature. The input data may be in a pasta-formatted file (a variant of the fasta format); alternatively, the program will also accept Stockholm-formatted files as downloaded from the Rfam database. The pasta format is defined in the Definition Section.

Design

RNApasta has a simple 4-pane user interface design (see the figure below, with the red number labels 1 to 4). Feel free to skip to the demo section, as it is always much easier to learn about the application through examples.

[Screenshot: the RNApasta main window with the four panes labelled 1 to 4: the top menu bar (Load File, Save Interpreted, Clear Bottom Analysis, Push Analysis Up, Options, Help), the input pane ("Input file displayed here"), the output panes, and the function selector (Basic Tools, Analysis Tools).]
25. [Screenshot: pasta output with partition tags on the sequence labels (Chl tep, Chl tra, Chl mur).]

On the top right there is a combo box in which the user can choose which partitio
26. [Screenshot: pasta output with partition tags on the sequence labels (Dei rad, Por gin, Chl tep, Chl tra).]
27. [Screenshot: the RNAseP_BactA Rfam file loaded from disk, with the interpreted alignment in the Output pane and Phylogenic Stem Loop Parse in the function list.]

The user could also just copy and paste the text from the Stockholm or Pasta file straight into the input textbox. Howev
28. [Screenshot: pasta alignment with partition tags on the sequence labels (Bor per, Bor bro am, Fra tul, Xyl fas, Aci fer, Pse aer), shown beside the Analysis Tools and Alignment Tools menus.]
29. [Screenshot: a Stockholm file for DsrA RNA (RF00014) pasted into the input pane, with the Interpreted pane ("Pairing regions displayed here") and Output pane still empty.]
30. e pseudoknot at the 3' end of turnip yellow mosaic
#=GF RT virus RNA in minus strand synthesis by the viral RNA-dependent RNA
#=GF RT polymerase.
#=GF RA Deiman BA, Kortlever RM, Pleij CW;
#=GF RL J Virol 1997;71:5990-5996.
AF035635.1/619-641 UGAGUUCUCGAUCUCUAAAAUCG
M24804.1/82-104 UGAGUUCUCUAUCUCUAAAAUCG
J04373.1/6212-6234 UAAGUUCUCGAUCUUUAAAAUCG
M24803.1/1-23 UAAGUUCUCGAUCUCUAAAAUCG
#=GC SS_cons [structure line illegible in this copy]

RNApasta Demo

In the next few sections we will introduce some of the functions of RNApasta. Once you open the application, press Load File, which is located in the top left corner of the application.

[Screenshot: the RNApasta main window with Load File on the top menu bar.]

Open the file that contains either the Stockholm format or the Pasta format.

[Screenshot: the Open Pasta File dialog listing the RNAseP BactA, RNAseP BactB, and tmRNA Rfam files.]
31. er the user will need to press Interpret Input to continue with the processing. This is particularly convenient for Rfam users, who can copy the text from the database and paste it into RNApasta for analysis.

[Screenshot: RNApasta main window with empty panes labelled "Input file displayed here", "Pairing regions displayed here" and "Output displayed here". The Tools panel lists Basic Tools (Interpret Input, Draw Arcs), Pasta Edit, Analysis Tools, Alignment Tools and Phylogenetic Tools.]

Once the user has selected all the text on the top screen, the user can paste the Stockholm or Pasta text into the Input section and click Interpret Input.

[Screenshot: RNApasta main window.]
32. function. The user can perform an array of functions: sequence and structure editing, statistical analysis, alignment analysis and phylogenetic studies.

End User

This program is designed specifically for ncRNA researchers interested in studying the structural features and variation within the same ncRNA family.

Availability

RNApasta is available at http www uga edu RNA Informatics software RNApasta. The program can be run either through Java Web Start or from source. RNApasta was compiled using the Sun Java Development Kit version 1.6. Compiled versions (jar files) are available for Sun Java 1.5 or 1.6. The RNApasta src zip file contains the Java source code, which you can compile yourself if needed or desired. The files compile.bat (Windows) and compile.sh (Linux) contain a command for compiling the code, while a jar file (Java archive) may be created from the compiled class files using jarcreate.bat (Windows) or jarcreate.sh (Linux).

Format Definition

Pasta Format

The Pasta format is built upon a FASTA sequence alignment with the addition of one or two lines of pairing indicators that describe the RNA secondary structure. The example below starts with a comment line.

>pairs   (the next line contains pairing indicator letters)
AAAA BBBB aaaa bbbb AAAA aaaa
>index   (the next line contains index subscript numbers)
[index line illegible in source]
>sequence label 1
GCUCAACCCAGUCAUUUGCCGGUUC AAUGGCUAAACCCCGGUUG
33. g the cladogram representation of the phylogeny on top and the Pasta sequence on the bottom.

[Screenshot: Phylogenetic Tree window with the menu items File, View, Stop Selection, Remove Selected and Save Tree. The upper panel shows the cladogram, with taxa running from The mar and Aqu aeo through Esc col and Sal_par. The lower panel shows the Pasta file: the pairs and index lines followed by sequences whose label lines carry per-region partition tags.]
34. >sequence label 2
UCGCAACCC UCAUUUCGCGGUUCCAGAAUGGAUCAACCGCGGUUU

The pairing indicators are upper- and lower-case letters in the >pairs line and numbers in the >index line. Regions that pair with each other are indicated by corresponding upper- and lower-case letters, while the numbers are used as subscripts so that more than one pairing region can use the same alphabetic letter. A spacer character separates pairing indicators, and a separate gap character indicates an alignment gap or structural bulge in the sequences. The base in each sequence in the column beneath the first A1 will pair with the corresponding base in the column beneath the last a1. The base beneath the last B1 will pair with the base beneath the first b1. In the example shown, pairing regions A1/a1 and B1/b1 form a pseudoknot.

Stockholm (Rfam) files are converted to Pasta by changing the <<>> notation into the AA aa notation and by resolving the interleaved sequence format. The user may wish to compare the Stockholm pair structure line with the computed Pasta format line to ensure the conversion makes sense.

Stockholm Format

This is a multiple sequence alignment format commonly used by Rfam, a database containing information on non-coding RNA. The Rfam database can be accessed at http://rfam.janelia.org

# STOCKHOLM 1.0
#=GF ID   UPSK
#=GF SE   Predicted; Infernal
#=GF SS   Published; PMID 9223489
#=GF RN   [1]
#=GF RM   9223489
#=GF RT   The role of th
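The pairing rule described for the Pasta format (the first column of A1 pairs with the last column of a1, and so on inward) can be sketched in Python. This is only an illustration and not RNApasta's own code: the function name `pasta_pairs`, the use of `.` as the spacer character, and the assumption that each subscript is a single digit aligned under its letter are all choices made for this example.

```python
from collections import defaultdict


def pasta_pairs(pairs_line, index_line):
    """Return sorted (i, j) column pairs implied by a >pairs and >index line.

    Assumptions (not taken from the manual): any non-letter character in
    pairs_line is a spacer, and index_line holds one subscript character
    per column, aligned with pairs_line.
    """
    upper = defaultdict(list)  # columns of the 5' half, e.g. A1
    lower = defaultdict(list)  # columns of the 3' half, e.g. a1
    for col, (letter, subscript) in enumerate(zip(pairs_line, index_line)):
        if not letter.isalpha():
            continue  # spacer column, carries no pairing information
        key = (letter.upper(), subscript)
        (upper if letter.isupper() else lower)[key].append(col)

    result = []
    for key, left in upper.items():
        right = lower.get(key, [])
        if len(left) != len(right):
            raise ValueError(f"region {key} has unequal halves")
        # First upper-case column pairs with last lower-case column.
        result.extend(zip(left, reversed(right)))
    return sorted(result)


# A1/a1 and B1/b1 interleave, which is the pseudoknot arrangement
# mentioned in the text.
example_pairs = "AAAA.BBBB.aaaa.bbbb"
example_index = "1111.1111.1111.1111"
print(pasta_pairs(example_pairs, example_index))
```

Grouping by letter and subscript together is what lets the same letter be reused for a second region, as the format description allows.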
35. lights

[Screenshot: Tools panel showing the Analysis Tools menu and the Alignment Tools menu (Non Canonical, Extract PK, Extract Regions, Erase Region Label, Delete Gap Cols, Del Seq by Reg Size, Partition by Stems, Partition by Loops), followed by the Phylogenetic Tools menu.]

Non Canonical

This function searches for non-canonical base pairs and marks them with a marker character beneath them. Non-canonical is defined as anything other than A-U, G-C or G-U. Bulges and gaps are also indicated, by b and g below the positions.

Extract PK

This function will extract one or more of the pseudoknot regions, creating an alignment of just those regions. The Copy B to T function can then be used to copy the extracted pseudoknots to the upper input textbox, after which Parse Pasta can be used to generate an analysis of just this extracted pseudoknot.

Extract Regions

This function will extract one or more of the pairing regions, creating an alignment of just those regions. If there is a pseudoknot crossing region which begins within the area being extracted and which would pair with a region outside the area being extracted, then that pseudoknot pairing region is erased from
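The Non Canonical definition given in this section (anything other than A-U, G-C or G-U) is easy to express as a small check. The sketch below is an illustration under assumptions, not the program's implementation: the function names are hypothetical, pairs are passed in as 0-based column tuples, and gaps are not given the separate g flag that the tool reports.

```python
# The three pair types the manual treats as canonical, order-independent.
CANONICAL = {frozenset("AU"), frozenset("GC"), frozenset("GU")}


def is_canonical(base1, base2):
    """True if the two bases form A-U, G-C or G-U in either orientation."""
    return frozenset((base1.upper(), base2.upper())) in CANONICAL


def non_canonical_pairs(sequence, pairs):
    """Return the (i, j) column pairs of `sequence` that are non-canonical.

    `pairs` is an iterable of 0-based column tuples (an assumption of this
    sketch). A gap character paired with a base is simply reported as
    non-canonical here.
    """
    return [(i, j) for i, j in pairs if not is_canonical(sequence[i], sequence[j])]


# Made-up 7-column sequence with two declared pairs: G-C and C-U.
print(non_canonical_pairs("GCAAAUC", [(0, 6), (1, 5)]))
```

Using a `frozenset` for each pair means G-U and U-G are treated identically without listing both orientations.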
36. logenetic Functions

Phylogenetic study of stem-loop length variation is one of the most novel features of RNApasta. Press Partition by Stems.

[Screenshot: Alignment Tools and Phylogenetic Tools menus.]

Select the stem that will be used to partition the sequences and click Select Length for One Region. Note: at this step the user can also choose Select Length for All Regions, which presents a series of histograms.

[Screenshot: Partition dialog with the Stem Length selector, the Select Length for All Regions button, and the two output choices, Partition into Two Subsets and Indicate on The Label Line.]

Select the criterion value for the partition and press OK. At this step the user can also choose to save the image.

[Screenshot: Partition Histogram for region A1, x axis Region Length. A left mouse click selects the value for partitioning.]

The Data Viewer will show the stem that will be partitioned.

[Screenshot: "Partition sequences by stem lengths" dialog. The Data Viewer lists Name A1 and Length 3.0, with the buttons Select Length for One Region and Select Length for All Regions.]
37. ned stem or loop the user wants to examine. Once the selection is finished, the user can click Intersection Union to proceed.

[Screenshot: Phylogenetic Tree window with the menu items File, View, Multiple Selection, Stop Selection, Remove Selected, Save Tree and Intersection Union. The upper panel shows the cladogram, with taxa running from The mar and Aqu aeo through Esc col and Sal_par. The lower panel shows the Pasta file with its pairs and index lines.]
38. [Screenshot: Input pane holding alignment rows copied from the bottom pane, with the panes labelled Copied From Bottom, Input and Output.]

The application will prompt the user to input a Newick TRE file. Programs capable of producing a Newick file include Phylip, R Coffee and many other phylogenetic programs.

[Screenshot: file chooser with the TRE file "neighbor" selected.]

Once the user selects a valid file a screen will pop open with a graph showin
39. of a pseudoknot which the program detects as actually crossing another stem and is indicated by an XXX in one of the comment lines. Second, the base pair frequencies are given for each of the labelled regions of the alignment (A, B, C, etc.).

BasePair Freq Stem Pos

For each region this calculates the frequency of each base pair as a function of position within a stem, with positions reported as outer, middle or inner. If there are 4 base pairs in a stem, the middle 2 are averaged. The overall base pair frequency by stem position over all positions is also calculated.

Entropy By Pos

This computes relative and absolute entropy by alignment position, with and without gaps. Entropy is $H(X) = -\sum_i P(X_i) \log P(X_i)$ and relative entropy is $H(P \| Q) = \sum_i P(X_i) \log \frac{P(X_i)}{Q(X_i)}$ (see Durbin et al., pages 305-308). This uses the natural log base.

PseudoFreqs

This will recalculate the base and base pair frequencies using pseudofrequencies. The user can choose which set of pseudofrequencies to use, or define their own, and then which method to use to average the pseudofrequencies with the measured frequencies. Zero offset adds 1 to any 0 count. Fifty weights the pseudofrequencies as if they were from 50 sequences. Square Root uses the square root of the number of sequences in the alignment as the weight. Minimal Risk uses a modification of Square Root developed by Wu et al. (1999, J Comp Bio 6:219-235). Wu et al discusses each of these
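The two entropy definitions under Entropy By Pos can be made concrete with a short Python sketch that works on a single alignment column. It is an illustration only: the function names, the choice of `-` and `.` as gap characters, and the background distribution passed to `relative_entropy` are assumptions of this example, not details taken from RNApasta.

```python
import math
from collections import Counter


def column_frequencies(column, include_gaps=False, gap_chars="-."):
    """Observed symbol frequencies in one alignment column."""
    symbols = [c.upper() for c in column if include_gaps or c not in gap_chars]
    counts = Counter(symbols)
    total = sum(counts.values())
    return {s: n / total for s, n in counts.items()}


def column_entropy(column, include_gaps=False):
    """Absolute entropy H(X) = -sum_i P(x_i) ln P(x_i), natural log."""
    freqs = column_frequencies(column, include_gaps)
    return -sum(p * math.log(p) for p in freqs.values())


def relative_entropy(column, background):
    """Relative entropy H(P||Q) = sum_i P(x_i) ln(P(x_i) / Q(x_i)).

    `background` maps each symbol to Q(x_i); gaps are ignored. Every
    observed symbol must have a non-zero background frequency.
    """
    freqs = column_frequencies(column)
    return sum(p * math.log(p / background[s]) for s, p in freqs.items())


# A column holding two A and two U, compared against a uniform background.
uniform = {base: 0.25 for base in "ACGU"}
print(column_entropy("AAUU"), relative_entropy("AAUU", uniform))
```

Counting gaps as a fifth symbol or dropping them is controlled by `include_gaps`, which mirrors the "with and without gaps" wording of the description.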
40. sed upon the size distribution of a stem length. The dialog allows one to select the stem of interest; Select By Histogram then generates a histogram of the stem length distribution. A left mouse click selects the partition value. The user has a choice of outputs. In one case the data is divided into 2 sets of sequences, one above and one below the selected partition length for the selected stem. In the other case, whether a given sequence is above or below the selected partition length for that stem is indicated by text added to the standard label line:

D1>4.6 1

This example indicates that for stem D1 the sequence that follows has a length greater than 4.6. Remove Partition Tags will remove these.

Partition By Loops

This allows the user to divide the data set in 2 based upon the size distribution of a loop length. The dialog allows one to select the loop of interest; Select By Histogram then generates a histogram of the loop length distribution. A left mouse click selects the partition value. The user has a choice of outputs. In one case the data is divided into 2 sets of sequences, one above and one below the selected partition length for the selected loop. In the other case, whether a given sequence is above or below the selected partition length for that loop is indicated by text added to the standard label line:

c1D1>3.6 0

This example indicates that for loop c1D1 the sequence that follows has a length less than 3.5.

Phy
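The "divided into 2 sets" output described for both partition functions amounts to a threshold split on a per-sequence length. A minimal Python sketch follows, assuming the lengths have already been measured; the function name, the sequence names and the lengths are made up for illustration, and placing ties in the lower set is an assumption, since the text does not say how a length equal to the partition value is handled.

```python
def partition_by_length(lengths, threshold):
    """Split sequences into those above and those not above `threshold`.

    `lengths` maps a sequence name to the length of the chosen stem or
    loop in that sequence. Ties go to the lower set (an assumption).
    """
    above, below = {}, {}
    for name, length in lengths.items():
        (above if length > threshold else below)[name] = length
    return above, below


# Hypothetical stem lengths for three sequences, split at 4.5.
stem_lengths = {"seq1": 3, "seq2": 5, "seq3": 4}
above, below = partition_by_length(stem_lengths, 4.5)
print(sorted(above), sorted(below))
```

Choosing a fractional threshold, as the examples in the text do, keeps integer lengths from ever landing exactly on the partition value.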
41. [Screenshot: RNApasta main window with the Tools panel and empty panes labelled "Input file displayed here", "Pairing regions displayed here" and "Output displayed here".]

Base Freq

This calculates the frequency of each base as a function of position in the alignment, and overall.

BasePair Freq Pos

This calculates the frequency of each possible base pair as a function of position in the alignment. The pairing used is that indicated by the pairing indicator line in the Pasta format. If the pairing line indicates that position 2 pairs with position 670, then this function will calculate the observed base pair frequencies for these positions, omitting the alignment-induced gap-gap pairs. See below for details. This function calculates the frequencies for both ends of the pair: for the pair read from position 2 to position 670 it will report a certain GC frequency, and for the pair read from position 670 to position 2 it will report the same frequency as CG (asymmetric).

BasePair Freq Reg

This reports the frequency of each possible base pair as a function of region. First, overall frequencies of base pairs are reported for the whole alignment, the pseudoknots and the non-pseudoknot stem loops. A crossing helix is the stem
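The BasePair Freq Pos calculation can be illustrated with a short Python sketch for one pair of columns. It is a sketch under assumptions rather than the tool's code: the function name, the 0-based column indices, the `-` gap character and the toy alignment are all invented for the example.

```python
from collections import Counter


def basepair_freq(sequences, i, j, gap="-"):
    """Frequency of each base pair observed at columns (i, j).

    Rows where both columns hold a gap are skipped, following the
    description of omitting gap-gap pairs. Pairs are read in the order
    (column i, column j), so swapping i and j swaps GC for CG.
    """
    counts = Counter()
    for seq in sequences:
        first, second = seq[i].upper(), seq[j].upper()
        if first == gap and second == gap:
            continue  # alignment-induced gap-gap pair, not counted
        counts[first + second] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


# Toy alignment: columns 0 and 4 are declared to pair.
alignment = ["GAAAC", "GAAAC", "AAAAU", "-AAA-"]
print(basepair_freq(alignment, 0, 4))
print(basepair_freq(alignment, 4, 0))
```

Reading the pair in both directions reproduces the behaviour described in the text, where one direction reports GC and the other reports the same frequency as CG.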
42. uence press Push Analysis Up.

[Screenshot: RNApasta main window with SSRNA.rfam.txt loaded. The Input pane shows the Stockholm alignment (SS_cons line and sequences such as X07545.1/505-619, M21086.1/8-123 and X01588.1/5-119); the Interpreted pane shows the result of Extract Regions.]

After you reinterpret the sequence you will be able to perform analysis on the new sequence. The process can be repeated recursively, depending on what the user is trying to achieve.

[Screenshot: RNApasta main window with SSRNA.rfam.txt loaded.]
43. vided the capability of forming an arc diagram representation of the structure, which can be accessed by clicking Draw Arcs.

[Screenshot: arc diagram of the paired regions drawn above the alignment.]

Analytical Functions

This section goes over the more technical routines used within the application. The Analysis Tools section contains a list of functions for obtaining structural information. The more frequently used functions are Stem Summary and Loop Summary. They provide the user with a summary of the size of each stem and the distances between stems.

[Screenshot: Tools panel showing Basic Tools, Pasta Edit and the Analysis Tools list (Base Freq, BasePair Freq Pos, BasePair Freq Reg, BasePair Freq Stem Pos, Entropy By Pos, Stack Doublets, PseudoFreqs, Region Freq, Stem Summary, Loop Summary).]
44. wo Subsets: the data will be divided into 2 sets, above and below the selected length (button: Into Two Sets). Indicate on The Label Line: an indicator will be added to the fasta label line if the sequence is above or below the selected length (button: Into Label Line).

At the bottom you have the choice of either partitioning the entire list of sequences into two datasets or putting a partition tag on each of those sequences. We will explain the utility of the partition tag later.

[Screenshot: "Partition sequences by stem lengths" dialog. The Data Viewer lists Name A1 and Stem Length 3.0, with the buttons Select Length for One Region, Select Length for All Regions, Into Two Sets and Into Label Line.]

Once the user clicks Into Two Sets, the program places the two datasets into two Pasta-formatted texts. The user then has the option of performing further analysis on the subsequences.

[Screenshot: the two resulting Pasta-formatted datasets.]
45. yed here

At this point, if you are not familiar with the Pasta format, we encourage you to look at the Interpreted section. A particular utility of RNApasta is its capability to break apart various sections of the RNA structure and align the sequences according to the stems to which they correspond. See the yellow tags.

[Screenshot: Interpreted pane showing the pairs line (regions B, A, a, b, D, d, c) and index line above sequences L27343.1/3-116, L27168.1/1-120, X72588.1/6990-7093, X02128.1/24-139, X14441.1/5-123 and L27162.1/2-122, each split into columns by pairing region.]
