Library Construction & Sequencing: Data Delivery

Data Delivery

Data delivery varies depending on the specifics of the project. Factors that influence the type of data returned include the species and the library construction method (e.g., WGS vs. RNA-Seq).

For Human WGS and Exome Projects:

BAM file (*.bam) - a binary SAM file that contains the sequence alignment data for a single sample, created using BWA
BAM index file (*.bam.bai) - the index file for the corresponding BAM file
Multi-sample VCF file (*.vcf) - contains all samples in a project, called at every site that is variant in at least one of the samples, as generally recommended by GATK best practices. For more information on the VCF format, go to: http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41
Sample ID lookup table (*.xls) - a cross-reference between the UW-CMG LIMS IDs generated at the sequencing center and the investigator sample IDs. It also lists the family information for the project.
Genotype data - Not all projects will have high-density genotyping data generated. If the project was genotyped, the investigator will receive a PLINK-formatted file containing the genotypes. For more information on PLINK, go to: http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml
SeattleSeq annotation files are provided. For additional information on this annotation, see the SeattleSeq website: http://snp.gs.washington.edu/SeattleSeqAnnotation138/
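
As a rough illustration of how a multi-sample VCF is laid out, the sketch below lists the sample IDs from the header line and counts, for each sample, the sites at which it carries at least one non-reference allele. It assumes an uncompressed, tab-delimited VCF 4.1 text file with a GT entry in the FORMAT column. The function name and the example records are hypothetical and are not taken from any delivered data set.

```python
import io


def read_vcf_samples_and_counts(handle):
    """Return (sample_ids, counts) from an open VCF text handle.

    counts[sample] is the number of sites at which that sample has at
    least one non-reference allele in its GT field.
    """
    samples = []
    counts = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Columns 10 onward of the header line are the sample IDs.
            samples = fields[9:]
            counts = {sample: 0 for sample in samples}
            continue
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue  # no genotype information at this site
        gt_index = format_keys.index("GT")
        for sample, call in zip(samples, fields[9:]):
            genotype = call.split(":")[gt_index]
            alleles = genotype.replace("|", "/").split("/")
            # "0" is the reference allele and "." is a missing call.
            if any(allele not in ("0", ".") for allele in alleles):
                counts[sample] += 1
    return samples, counts


# Hypothetical three-site, two-sample example (not real project data).
EXAMPLE_VCF = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n"
    "1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:12\t0/0:9\n"
    "1\t200\t.\tC\tT\t60\tPASS\t.\tGT:DP\t1/1:20\t0/1:15\n"
    "2\t300\t.\tG\tA\t40\tPASS\t.\tGT:DP\t./.:0\t1/1:8\n"
)

samples, counts = read_vcf_samples_and_counts(io.StringIO(EXAMPLE_VCF))
print(samples)
print(counts)
```

The sample IDs in the header of a delivered VCF would be the ones to match against the Sample ID lookup table. For real projects, an established VCF library or toolkit is likely a better choice than hand-parsing.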

For Human RNA Projects:

BAM files of reads mapped to the genome and to the transcriptome
Normalized expression counts
Gene and exon counts

*FASTQ files can be released for any project type.
