
RNA-seq analysis with CANEapp User Manual


Contents

1. Add Groups. Type in the group name and press Add Group. Please do not use spaces in group names. If you want to remove a group from the list, select the group and press Remove Group. Once you have added at least one group, you can go to the next tab and add samples.

[Screenshot: Manage Projects tab, Create Project dialog with Project Name and Project Location fields and the Create New Project, Load Existing Project and Delete Project buttons]

[Screenshot: Add Groups tab with the Add Group and Remove Group buttons]

3. Adding samples: step one

On the Add Samples tab, first select the experimental group from the list on the right and type in the name of the first sample. Then you have two options: either upload raw data files from your computer or use the files you already have on the server.

[Screenshot: Add Samples tab with the sample name field, group selection, Upload Read Files from Computer and Use Read Files on Server options, and Single End or Paired End library type]

4. Adding samples: step two

Specify whether your sequencing is single-end or paired-end. If you chose to upload the data from the computer, browse to the raw read file. The accepted format is fastq, but you can us
2. and also includes a streamlined RNA-seq analysis pipeline that efficiently manages the computational resources, parallelizes computation and automates the entire analysis. CANEapp operates on a variety of UNIX servers, including Amazon Cloud and high-performance computing servers, and requires no interaction with the server, the command line or any installation operations. CANEapp is free, open-source software distributed under the GNU General Public License.

Prerequisites

To use CANEapp for RNA-seq analysis you need just two things: the CANEapp package downloaded from our webpage and a Linux server. Make sure you have the latest version of Java installed and Python version 2.7 on the server to initiate the pipeline. Since RNA-seq analysis is computationally demanding, you will need a server with at least 30 GB of RAM. CANEapp can be used with a variety of Linux operating systems (Ubuntu, CentOS, RedHat, Fedora), a cloud server such as Amazon EC2, or a Linux cluster using the LSF job scheduling system.

If you are using Amazon Cloud, just search for the CANEapp Amazon Machine Image, create a new instance based on it and use it as the server for CANEapp. If you are not using an Amazon Cloud server but have administrative rights, run CANEapp as the root user and all prerequisites will be installed automatically. Otherwise, you need to contact your system administrator to install the prerequisites. You can simply provide the administrator with shell script wi
3. for job scheduling, check the Server Uses Job Scheduler option and specify the cluster queue, the amount of memory and number of cores to be used for a job, and the maximum time to run a job.

Using Amazon Cloud Instance

If you are using Amazon EC2 to perform the analysis, the easiest way is to use the CANEapp Amazon Machine Image (AMI) to create a new instance. Search for the CANEapp AMI and create an instance with as many resources as you need. Then, in the CANEapp GUI, provide the public key for your instance together with the instance IP address. Make sure the instance is running before submitting the analysis.

Before submitting the analysis, make sure you have at least 30 GB of RAM (more for large projects) and enough disk space for the analysis. As a rule of thumb, you will need free space equal to 3 times the size of the raw data to safely run the analysis.

Finally, click the Submit Analysis button. If it is the first time you are using CANEapp on the current server, it will take a minute to transfer the pipeline files to the server. After that, you will see a file transfer window with a progress bar. Make sure the computer does not go into sleep mode while the files are transferring. Once the files have been transferred, you will see a notification window. Now you can check the status of the project or close the GUI. The rest of the process will take place on the server side. CANEapp will utilize all available resources
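The rule of thumb above (free space of about 3 times the size of the raw data) can be checked before submitting. The following is a minimal Python sketch, not part of CANEapp; the function names are hypothetical, and the check is only meaningful when run on the machine whose disk will hold the analysis.

```python
import os
import shutil


def required_free_bytes(raw_file_sizes, factor=3):
    """Free space suggested by the rule of thumb: factor x total raw data size."""
    return factor * sum(raw_file_sizes)


def has_enough_space(raw_file_sizes, free_bytes, factor=3):
    """True if free_bytes covers the rule-of-thumb requirement."""
    return free_bytes >= required_free_bytes(raw_file_sizes, factor)


def check_files(paths, target_dir="."):
    """Compare the size of the given raw read files with the free space in target_dir."""
    sizes = [os.path.getsize(p) for p in paths]
    free = shutil.disk_usage(target_dir).free
    return has_enough_space(sizes, free)
```

For example, check_files(["sample1_R1.fastq", "sample1_R2.fastq"], "/home/user") would return True only if the home folder has at least 3 times the combined size of the two files free; the file names and folder here are placeholders.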
4. validation of the gene expression estimated with RNA-seq or confirm the presence of novel, previously unannotated genes. For that purpose, CANEapp includes a primer design tool. To use it, open the tab-delimited output file from CANEapp and select the genes you are interested in validating. Copy the first column containing the gene IDs (XLOC). Then navigate to the Primer Design tab and paste the IDs in the window. Press Submit Gene List and wait until the primers are designed.

[Screenshot: Primer Design tab with gene IDs XLOC_000022, XLOC_000027, XLOC_000029 and XLOC_000034 pasted in and the Submit Gene List button]

Primers will be designed preferentially to span a splice junction common to all the isoforms of a gene or to span a common exon. If there are no common junctions or exonic regions, no primers will be designed. Once primer design is complete, the file containing primer sequences will appear in your project folder.

Retrieving logs

If you experience any problems with CANEapp, for instance if the analysis stops on a specific step, click the Get Logs button on the Manage Projects tab. Archived logs will be downloaded to your project folder on the local machine. Please email the archive and the output.txt file in your project folder to Dmitry Velmeshev (dvelmeshev@med.miami.edu) with the sub
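Copying the first column of gene IDs for the Primer Design tab can also be done with a few lines of Python instead of a spreadsheet. This is a hedged sketch: the function name is hypothetical, and it assumes only what the manual states, namely that the output file is tab-delimited and that gene IDs starting with XLOC_ are in the first column. Any header row is skipped simply because it does not start with that prefix.

```python
import csv
import io


def extract_gene_ids(table_text, prefix="XLOC_"):
    """Return the unique first-column IDs that start with prefix, in file order."""
    ids = []
    for row in csv.reader(io.StringIO(table_text), delimiter="\t"):
        # Skip empty rows, header rows and duplicate IDs.
        if row and row[0].startswith(prefix) and row[0] not in ids:
            ids.append(row[0])
    return ids
```

Reading the file with open(path).read(), passing the text to extract_gene_ids and joining the result with newlines gives a list that can be pasted into the Primer Design window.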
5. RNA-seq analysis with CANEapp: User Manual

Dmitry Velmeshev, Patrick Lally
Faghihi's lab, University of Miami

Contents

What is it
Prerequisites
Installation
Testing
Analysis quick guide
1. Creating a new project
2. Adding experimental groups
3. Adding samples: step one
4. Adding samples: step two
5. Adding samples: specifying RNA-seq library
6. Specifying analysis settings
7. Setting up differential gene expression analysis
8. Submitting the analysis
9. Checking status of a running project
10. Retrieving the data
11. Primer design
Retrieving logs

What is it

CANEapp (application for Comprehensive automated Analysis of Next-generation sequencing Experiments) is a software tool that strives to provide biologists with no background in bioinformatics and computational science with an easy way to perform cutting-edge analysis of large-scale RNA-seq data. It also minimizes the hands-on time needed to perform RNA-seq analysis by automating all the analysis steps. CANEapp comes with a Graphical User Interface (GUI) that makes the experimental design and analysis setup easy and user-friendly
6. e tar.gz, tar, gz or bz2 compressed fastq files, as well as SRA (NIH Sequence Read Archive) files. In case your files are already on the server, specify the full path to the file, including the file name, on the server.

[Screenshot: Add Samples tab with groups A and B, Location of Read Files options, Single End or Paired End library type, and the Read file and Library Type fields]

5. Adding samples: specifying RNA-seq library

Now you need to specify the type of RNA-seq library prep you used for your experiment. You can select from a list of predesigned libraries if you know which prep you used, or specify a custom library prep. It is necessary to specify the strand selection used in the prep; the other parameters, such as adaptor sequences and adaptor lengths, are optional. In most cases Default will work just fine, but if you have additional information about your library, it will help with the analysis accuracy. For additional options, see below.

If you want to modify library settings, uncheck the Default checkbox to the right of the library. Adaptor sequences are important if you choose to trim the adaptor sequences before performing read alignment; if you performed size selection and your fragments are bigger than the read length, trimming is not required. If you don't know which adaptors were used, standard Illumina adaptors will be u
7. ject "CANEapp issue".
8. n. Click the button and you will see a window with a bar showing the progress of the output file download. Once the files have been downloaded, you can locate them in the local project folder.

[Screenshot: Manage Projects tab with project mouse_cortex, Status: Done, and the Check Status, Get Logs and Retrieve Output Files buttons]

The files will include one tab-delimited text file for each pairwise comparison between groups containing all the genes, another tab-delimited file containing only differentially expressed genes based on FDR, and a third file containing genes filtered by both expression and FDR. These files can be opened in Excel and contain information including the gene ID, gene name, gene classification, raw read counts for each sample (first column for each sample) and FPKM (second column for each sample), log of fold change between the groups, and statistical values for differential expression. The other two output files are the GTF (Gene Transfer Format) files for all genes and for only differentially expressed genes. These files can be used to visualize reconstructed transcripts and loci in IGV (Integrative Genomics Viewer).

Primer design

Once the analysis has been completed and you have identified genes differentially expressed in your experiment, you might want to perform qRT-PCR
9. [Screenshot text, continued: Advanced Options with Use defaults and Use custom settings]

Setting up differential gene expression analysis

On the next tab you can select from three alternative workflows for differential gene expression analysis: Cuffdiff, edgeR or DESeq2. If you wish, you can run all three in parallel. You can use default options for Cuffdiff or specify custom options. For edgeR, you can select from two approaches to differential expression testing: Generalized Linear Models (GLM) or the exact test. You can also use them in parallel. For edgeR and DESeq2, you need to select the pairwise combinations of the groups you want to compare.

Warning: in order to use edgeR or DESeq2, you need at least two replicates per experimental group.

[Screenshot: DGE tab with Cuffdiff, edgeR and DESeq2 checkboxes; Cuffdiff options (Use defaults, Use custom settings); edgeR options (Exact test, GLM) and DESeq2 options, each with a Comparisons list with ADD and DELETE buttons, FDR set to 0.05 and FDR Correction Method set to BH]

8. Submitting the analysis

Now proceed to the final tab. You need to specify your user name, server address, home folder and either a password or a public key to access the server.

High-performance computing servers using IBM Platform LSF Session Scheduler

If your server is a cluster using LSF system
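The pairwise group comparisons for edgeR and DESeq2, together with the two-replicate requirement from the warning above, can be enumerated ahead of time. This is an illustrative Python sketch only; the function name and the dictionary layout (group name mapped to a list of sample names) are assumptions and do not reflect how CANEapp stores projects.

```python
from itertools import combinations


def pairwise_comparisons(samples_by_group, min_replicates=2):
    """List every pair of groups, refusing groups with too few replicates."""
    too_small = sorted(
        group for group, samples in samples_by_group.items()
        if len(samples) < min_replicates
    )
    if too_small:
        raise ValueError(
            "groups with fewer than %d replicates: %s"
            % (min_replicates, ", ".join(too_small))
        )
    # Sort group names so the order of the pairs is reproducible.
    return list(combinations(sorted(samples_by_group), 2))
```

With three groups A, B and C of two samples each, the function returns the pairs (A, B), (A, C) and (B, C), which are the comparisons you would then add on the DGE tab.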
10. of the server, so you can run only one project at a time and should avoid running resource-demanding processes on the same server together with CANEapp. If you are using a cluster with the LSF job submission system, you can run several projects in parallel, but make sure the software and reference installation steps have been completed before starting another project.

[Screenshot: Submit Analysis tab with Username ec2-user, Server Uses Job Scheduler checkbox, Connection Address 52.88.134.214, Home Folder /home/ec2-user, Authentication Method (Password or Key File) and a Private Key File field]

9. Checking status of a running project

Once the project has been submitted, you can check its status at any time on the Manage Projects tab. Select a project and click the Check Status button. You will see the current step of the pipeline the project is at.

[Screenshot: Manage Projects tab with project single_top_pegasus at the Bowtie index preparation step, and the Get Logs and Retrieve Output Files buttons]

10. Retrieving the data

Once the project is completed, the Status will read Done. This will enable the Retrieve Output Files butto
11. sed. Adaptor lengths help in calculating the mean insert length and will help with TopHat alignment. If you have information about your library's size distribution (e.g. from a Bioanalyzer trace), specify the fragment mean and coefficient of variation (CV). Finally, click Add Sample. Proceed with the rest of the samples.

[Screenshot: Add Samples tab with a custom single-end library showing the Library Type, Mean insert size, Coefficient of Variation, Direction, Adapter Length and Default fields, and Adapter Sequence 5'-3' AGATCGGAAGAGC]

6. Specifying analysis settings

Next, navigate to the next tab and specify the analysis settings. You have to select the alignment program: TopHat or STAR. TopHat is a more conventional tool that is relatively slow but does not require a lot of resources, whereas STAR is a more recent aligner with ultrafast performance that requires a lot of RAM. Then select the species and the assembly. By default, the pipeline will perform adaptor trimming (Trim raw reads option) and will filter out single-exon transcripts (Filter transcripts option) based on what percentage of all samples (Total Filter) or samples from one group (Group Filter) expresses the transcript. The pipeline will then filter out lowly expressed genes (Filter lowly expressed genes option) based on the minimum number of reads mapping to a gene. You can modify these options; however, CANEapp was tested with the default options and demonstrated good results, so in general the defaults will work well.

The species and assembly can be selected using the drop-down menus. It is also possible to add a new species or assembly by clicking the Add species button and specifying the species name, the assembly name, the URL of the fasta genome file and the gtf file containing gene annotations.

If you are familiar with TopHat, STAR and Cufflinks and want to modify the options for these tools, click Use Custom Settings next to one of them. For TopHat or STAR, you have to enter the options the same way you would use them on the command line: for example, if you want to change the maximum insertion length for TopHat alignment to 2, you would paste --max-insertion-length 2 in the TopHat option box.

[Screenshot: Analysis Settings tab. Basic options: Trim raw reads (Yes/No), Species (human), Filter lowly expressed genes (Yes/No), Filter Transcripts (Yes/No), Alignment Program (Tophat or STAR), Assembly (GRCh38), Threshold 20, Total Filter 75, Group Filter 75, Add species/assembly button; STAR and Cufflinks Advanced Options (Use defaults, Use custom settings)]
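For the library settings, the fragment mean and coefficient of variation (CV) can be derived from a list of fragment lengths, since the CV is the standard deviation divided by the mean. The sketch below is an assumption-laden illustration: the function name is hypothetical, it uses the sample standard deviation, and it returns the CV as a fraction. The manual does not say whether CANEapp expects a fraction or a percentage, so check the GUI field before entering the value.

```python
import statistics


def fragment_mean_and_cv(fragment_lengths):
    """Return (mean, CV) for fragment lengths, with CV = sample SD / mean."""
    mean = statistics.mean(fragment_lengths)
    sd = statistics.stdev(fragment_lengths)  # needs at least two values
    return mean, sd / mean
```

For fragment lengths of 180, 200 and 220 bases, the mean is 200 and the sample standard deviation is 20, giving a CV of 0.1.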
13. th all prerequisites that can be found in the misc folder of CANEapp. For CentOS, RedHat and Fedora: CANE_library_CentOS.sh. For Ubuntu: CANE_library_Ubuntu.sh.

Installation

There is no installation for CANEapp. The Graphical User Interface (GUI) component is written in Java and works on Mac and Windows. The GUI, together with the pipeline component of CANEapp, will do all the work for you. Just download the CANEapp package, unzip it and open the JAR file.

Testing

The package includes the example folder with two samples (two small fastq files for each sample) from paired-end RNA-seq of human tissue. These files can be used to quickly test CANEapp on your system. Use them to perform analysis of paired-end data and compare two experimental groups (1 sample each) to familiarize yourself with CANEapp and test the package.

Analysis quick guide

Open the CANEapp JAR file. You will start on the Manage Projects tab. On this tab you can create new projects, check the status of running projects and remove existing projects.

1. Creating a new project

Click the Create New Project button, then type in the name of the project (please do not use spaces) and browse to a location on your computer where you want to save the files related to the project. Press Submit. Now your project is displayed in the list of recent projects and you can proceed to designing your experiment.

2. Adding experimental groups

Click on the next tab
