BecA Bioinformatics

Genome browsers let you visualize your sequencing reads aligned to the genome. Visualizing your data is incredibly important: it lets you investigate your data and get a feel for how it "looks".

Visualizing using SAMTOOLS TVIEW
Visualizing using IGV
Visualizing using TABLET

Viewing the output with TView

Before we can use TView to compare the Bowtie and BWA mappings, we need to sort the Bowtie BAM file and generate an index for it::

    cd drosophila_bowtie
    samtools sort [bowtie.bam] [bowtie.sorted]
    samtools index [bowtie.sorted.bam]

Recent samtools releases (1.3 and later) no longer accept an output prefix as the second argument. There the sort step is written::

    samtools sort -o [bowtie.sorted.bam] [bowtie.bam]

Now that we've generated the files, we can view the output with TView. We'll compare the two sorted mappings, starting with BWA::

    samtools tview ../[full_bwa.sorted.bam]

Now open an additional terminal window and load the Bowtie mapping file there as well::

    cd [path_to_bowtie_folder]
    samtools tview [full_bowtie.sorted.bam] [path_to_reference.fasta]

To view the TView help, type '?'.
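
Because both mappings need the same sort-and-index treatment, the two steps can be wrapped in a small shell function. This is only a sketch: the names `run`, `sort_and_index` and `DRY_RUN` are hypothetical helpers, not samtools features, and the `-o` form of `samtools sort` assumes a recent samtools 1.x release. With `DRY_RUN=1` the commands are printed instead of executed, which is a cheap way to check file names before running anything.

```shell
#!/bin/sh
# Sketch: sort a BAM file by coordinate and build its index.
# `run`, `sort_and_index` and DRY_RUN are illustrative names only.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

sort_and_index() {
    in_bam=$1
    # bowtie.bam -> bowtie.sorted.bam
    sorted_bam="${in_bam%.bam}.sorted.bam"
    # Recent samtools: the output file is given with -o.
    run samtools sort -o "$sorted_bam" "$in_bam" || return 1
    # Writes the index next to the sorted file.
    run samtools index "$sorted_bam"
}

# Preview the commands for a file named bowtie.bam (placeholder name).
DRY_RUN=1
sort_and_index bowtie.bam
```

Setting `DRY_RUN=0` (or leaving it unset) runs the real commands, which requires samtools on the `PATH`.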

Visualisation using IGV

The Integrative Genomics Viewer (IGV) is an efficient visualization tool for interactive exploration of large genome datasets. It supports a wide variety of data types including NGS alignments, genomic annotations, expression data, genetic variations, etc.
To become familiar with IGV's features, work through this very short tutorial:

Viewing the output with IGV

Introduction

Alternative browsers can be found on these pages: NGS Alignment Viewers and NGS Viewers Reviewed.

Start IGV from the terminal::

    Module load IGV &

The trailing "&" means that IGV runs in the background, leaving the terminal free for typing other commands.
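
The effect of a trailing `&` can be tried safely with a stand-in command. The sketch below uses `sleep` in place of the IGV launcher (an assumption made so that it runs on any machine). `$!` and `wait` are standard shell features for keeping track of the background job.

```shell
#!/bin/sh
# Start a long-running command in the background; `sleep 1` stands in for IGV.
sleep 1 &
bg_pid=$!    # process ID of the most recently started background job

# Control returns immediately, so other commands can run in the meantime.
echo "terminal is free while job $bg_pid runs"

# Optionally block until the background job ends and collect its exit status.
wait "$bg_pid"
bg_status=$?
echo "background job exited with status $bg_status"
```

For a graphical program such as IGV you would normally skip the `wait` and simply keep working in the terminal.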

Import of Custom Genomes and Annotations

Save the following genome sequence and GFF annotation files to a directory called 'myigv':

Import the new genome into IGV

File -> Import Genome
Follow instructions and name genome: Arab_test
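
A custom annotation only lines up with a custom genome when both files use the same sequence names, so it can help to compare them before importing. The following is a minimal sketch on toy data: the file names `genome.fasta` and `annotation.gff` and their contents are hypothetical stand-ins for the files saved in 'myigv'.

```shell
#!/bin/sh
# Toy inputs (hypothetical); point these at the real files in 'myigv' instead.
workdir=$(mktemp -d)
printf '>Chr1\nACGTACGT\n>Chr2\nGGCCTTAA\n' > "$workdir/genome.fasta"
printf 'Chr1\ttoy\tgene\t2\t6\t.\t+\t.\tID=geneA\n'  > "$workdir/annotation.gff"
printf 'ChrM\ttoy\tgene\t1\t4\t.\t+\t.\tID=geneB\n' >> "$workdir/annotation.gff"

# Sequence names in the FASTA: header lines without '>' and any description.
grep '^>' "$workdir/genome.fasta" | sed 's/^>//; s/[[:space:]].*$//' \
    | sort -u > "$workdir/fasta_ids.txt"

# Sequence names in the GFF: column 1 of the non-comment lines.
grep -v '^#' "$workdir/annotation.gff" | cut -f1 \
    | sort -u > "$workdir/gff_ids.txt"

# Names used by the annotation but absent from the genome sequence.
missing=$(comm -13 "$workdir/fasta_ids.txt" "$workdir/gff_ids.txt")
echo "GFF sequence names missing from FASTA: ${missing:-none}"
```

In the toy data the annotation refers to `ChrM`, which the FASTA file does not contain, so that name is reported.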

Explore IGV's genome viewing and panel functions. For instance:

Click chromosome ID and zoom in and out.
Click icons on top: home (whole genome view), refresh screen and define region of interest.
Expand Genome track to view gene regions (triangle in top left corner) then right click on a feature and copy its annotation or sequence to clipboard.
Paste gene ID AT1G34575 into search field and hit Go.

Save current session in myigv directory, close IGV and then restart this session.

File -> Save Session

Viewing Expression Data

Prebuilt genomes

Select Human hg18 in the genome drop-down menu
Import expression data and display as heatmap and line plots
File -> Load from File

Viewing NGS Data

Import NGS data from human
File -> Load from File

NA12878.SLX.chr1_sample.bam (requires *.bam file from this site)

Import the custom aligned NGS read data generated in the Galaxy exercise above
Select under prebuilt genomes A. thaliana (TAIR9)
Import aligned reads from Galaxy exercise

File -> Load from URL

http://biocluster.ucr.edu/~rsun/workshop/APIUIND.sorted.region.bam
http://biocluster.ucr.edu/~rsun/workshop/APIIND.sorted.region.bam
http://biocluster.ucr.edu/~rsun/workshop/APIUIND.sorted.region.bam.bai
http://biocluster.ucr.edu/~rsun/workshop/APIIND.sorted.region.bam.bai

Go to region

Chr1:15166146-15242215
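
The same load-and-navigate steps can also be written down as an IGV batch script, which is handy when the view has to be reproduced later. The sketch below only writes the script file. The file name `view_region.batch` is a hypothetical choice, and the genome ID `tair9` is an assumption that should be checked against the entry IGV lists for A. thaliana (TAIR9). `new`, `genome`, `load` and `goto` are IGV batch commands.

```shell
#!/bin/sh
# Write an IGV batch script that loads the two workshop BAM files and jumps
# to the region of interest. The output file name is a hypothetical choice.
batch_file="view_region.batch"
base_url="http://biocluster.ucr.edu/~rsun/workshop"
region="Chr1:15166146-15242215"
genome_id="tair9"   # ASSUMPTION: verify the ID IGV uses for A. thaliana (TAIR9)

{
    echo "new"
    echo "genome $genome_id"
    for sample in APIUIND APIIND; do
        # IGV looks for the matching .bai index next to each BAM URL.
        echo "load $base_url/$sample.sorted.region.bam"
    done
    echo "goto $region"
} > "$batch_file"

cat "$batch_file"
```

The resulting file can then be run from within IGV (Tools -> Run Batch Script in recent desktop versions).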

Create various custom tracks from Excel (or R, Perl, Python, etc.), and import them into IGV. The Recommended File Formats page contains detailed information about the formats for handling different data types, such as mutation tracks, ChIP-Seq and RNA-Seq data.
Mutation track sample (details on this format). The colors for displaying different mutation types can be changed in IGV under:
View -> Color Legends -> Mutation
A BED formatted file can be used to define ranges (details on this format).
A similar result can be achieved with the GFF3 format (details on this format).
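
Such range files are plain tab-separated text, so they can be produced with a few lines of script. The sketch below writes a two-feature BED file and converts it to GFF3. The file names, feature names and coordinates are made up for illustration. The one detail that matters is the coordinate convention: BED starts are 0-based, whereas GFF3 starts are 1-based, so the start is shifted by one during conversion.

```shell
#!/bin/sh
# Write a small BED track: chrom, start (0-based), end (exclusive), name.
# Feature names and coordinates are invented for illustration.
bed_file="custom_regions.bed"
printf 'Chr1\t15166145\t15170000\tregion_A\n' >  "$bed_file"
printf 'Chr1\t15200000\t15242215\tregion_B\n' >> "$bed_file"

# Convert to GFF3: the start becomes 1-based, so add 1 to the BED start.
# GFF3 columns: seqid, source, type, start, end, score, strand, phase, attributes.
gff_file="custom_regions.gff3"
{
    echo "##gff-version 3"
    awk 'BEGIN { FS = OFS = "\t" }
         { print $1, "custom", "region", $2 + 1, $3, ".", ".", ".", "ID=" $4 }' "$bed_file"
} > "$gff_file"

cat "$gff_file"
```

Either file can be loaded with File -> Load from File and should display the same two ranges.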

Viewing the output with Tablet

Open Tablet to load

Click on "Open Assembly".
Choose one of the filtered SAM files obtained previously as the primary assembly.

If you used reference mapping, load your FASTA reference file. Otherwise, if you used a de novo approach, use the assembled contigs as the reference.
Select the contig against which the reads were mapped in Bowtie as the reference file and click "Open".

Once the assembly has loaded (this can take a while), select the only contig on the left menu, and wait until it loads (this also may take a while).

Explore the visualization features of the program, some of which are:

the navigation tools (zoom, ...)
the various levels of highlighting the variants
the different layout styles
the different pack styles
in the advanced tab, click on "Coverage"

In a paired-end pack mode, mouse over some pairs, and have a look at the displayed information.
In the coverage mode, click in the low coverage regions in the overview window (above the main read display area).