Differential gene expression in Polistes dominula
Daniel S. Standage, Brendel Group Meeting, 21 Nov 2013
Context
Basic differential expression analysis
Isoform-level analysis unreliable
Refocused on locus-level DE analysis
Interval loci (iLoci)
Partition genome into segments that contain
0 protein-coding genes
1 protein-coding gene
2 or more overlapping protein-coding genes
P. dominula genome contains 18,675 iLoci
8,531 with 0 genes
9,197 with 1 gene
947 with 2-5 genes
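The partitioning idea can be sketched in a few lines of Python. This is a minimal illustration of the three segment classes listed above, not the procedure actually used for the P. dominula annotation; the function name, the 1-based inclusive coordinates, and the absence of any flank handling are assumptions made for the example.

```python
def partition_into_iloci(seq_length, genes):
    """Split positions 1..seq_length into (start, end, gene_count) segments.

    genes: list of (start, end) gene coordinates, 1-based and inclusive.
    Overlapping genes are merged into a single segment; the gaps between
    gene-containing segments become segments with gene_count == 0.
    """
    # Step 1: merge overlapping genes into clusters.
    clusters = []  # each entry is [start, end, gene_count]
    for start, end in sorted(genes):
        if clusters and start <= clusters[-1][1]:
            clusters[-1][1] = max(clusters[-1][1], end)
            clusters[-1][2] += 1
        else:
            clusters.append([start, end, 1])

    # Step 2: walk along the sequence and fill the gaps.
    iloci = []
    cursor = 1  # first position not yet assigned to a segment
    for start, end, count in clusters:
        if start > cursor:
            iloci.append((cursor, start - 1, 0))
        iloci.append((start, end, count))
        cursor = end + 1
    if cursor <= seq_length:
        iloci.append((cursor, seq_length, 0))
    return iloci
```

Calling `partition_into_iloci(1000, [(100, 200), (150, 300), (500, 600)])` yields three gene-free segments, one segment with two overlapping genes, and one segment with a single gene.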
Out-of-the-box analysis
RSEM: estimate expression levels for each sample independently (uses Bowtie to align reads)
Combine expression data into a single matrix
EBSeq: normalize expression levels and identify differentially expressed genes
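The "combine into a single matrix" step might look roughly like the sketch below. It assumes each per-sample table is tab-delimited with `gene_id` and `expected_count` columns, as in RSEM's `.genes.results` output; the function name and the sample labels used with it are hypothetical, and this is not necessarily how the matrix was built for this analysis.

```python
import csv


def combine_counts(sample_tables):
    """Merge per-sample expression tables into one locus-by-sample matrix.

    sample_tables: dict mapping a sample label to an open text handle of a
    tab-delimited table with 'gene_id' and 'expected_count' columns
    (an assumption about the input layout).
    Returns (samples, matrix), where matrix maps each locus ID to a list of
    counts in the same order as samples. Loci missing from a sample get 0.0.
    """
    samples = list(sample_tables)
    matrix = {}
    for i, sample in enumerate(samples):
        reader = csv.DictReader(sample_tables[sample], delimiter="\t")
        for row in reader:
            counts = matrix.setdefault(row["gene_id"], [0.0] * len(samples))
            counts[i] = float(row["expected_count"])
    return samples, matrix
```

The resulting matrix has one row per iLocus and one column per sample, which is the shape a downstream normalization and testing step would consume.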
Results and observations
295 differentially expressed iLoci
Grouping of samples is troubling
Similar concerns as with previous analysis
Some iLoci with very many reads mapped
Some iLoci with very few reads mapped
Concerns about normalizing over such a large dynamic range
Results and observations
294 differentially expressed iLoci
Grouping of samples is troubling
Similar concerns as with previous analysis
Some iLoci with very many reads mapped
Some iLoci with very few reads mapped
Concerns about normalizing over such a large dynamic range
iLocus filtering
Filtered the iLoci based on
Number of reads mapped
Number of samples with reads mapped
Distribution of mapped reads across samples
10,043 / 18,675 iLoci (54%) passed filtering criteria
Re-ran RSEM/EBSeq procedure from scratch
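A filter on the three criteria above could be sketched as follows. The slides do not state the cutoffs that were used, so every threshold here is a hypothetical placeholder, as is the function name; the "distribution" criterion is approximated by the fraction of reads coming from a single sample.

```python
def passes_filter(counts,
                  min_total_reads=50,              # hypothetical cutoff
                  min_samples_with_reads=3,        # hypothetical cutoff
                  max_single_sample_fraction=0.9):  # hypothetical cutoff
    """Decide whether one iLocus is kept, given its per-sample read counts."""
    total = sum(counts)
    # Criterion 1: enough reads mapped overall.
    if total == 0 or total < min_total_reads:
        return False
    # Criterion 2: reads present in enough samples.
    if sum(1 for c in counts if c > 0) < min_samples_with_reads:
        return False
    # Criterion 3: reads not concentrated in a single sample.
    if max(counts) / total > max_single_sample_fraction:
        return False
    return True
```

Applied row by row to the expression matrix, a filter of this kind removes loci at both extremes of coverage before normalization.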
Analysis sans Q4
Removed the Q4 sample and re-ran EBSeq step
Verified normalization is working as we expected
Found a very clean result
Analysis sans Q4
Identified 314 differentially expressed iLoci
219 (70%) over-expressed in workers
95 contain 0 genes
197 contain 1 gene
22 contain 2 or more genes
Biological interpretation
Manual analysis of DE iLoci
xGDBvm
yrGATE
Two protein families occurred very frequently
Cytochrome P450s
NADH dehydrogenases
5 questions
How many CYP genes are in the wasp genome?
What percentage of these CYP genes are DE?
Do CYPs and NADH dehydrogenases belong to the same pathways?
Can the CYP genes in the genome be categorized?
Can reads discarded during genome assembly provide insight into mitochondrial contamination?
CYPs in Polistes dominula
Identified with a basic BLASTP search
Query: translations of Maker annotations
Database: Hymenopteran CYPs from NCBI
154 iLoci potentially contain CYP genes
Not all matched queries represent CYPs
Stricter criteria required for high-confidence count
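One way to apply stricter criteria is to post-filter the BLASTP hits. The sketch below assumes the search was saved in BLAST's standard 12-column tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); the slides do not say which output format or cutoffs were used, so the thresholds and the function name are hypothetical.

```python
def high_confidence_queries(blast_lines,
                            max_evalue=1e-20,    # hypothetical cutoff
                            min_identity=40.0,   # hypothetical cutoff (percent)
                            min_length=200):     # hypothetical cutoff (aligned residues)
    """Return the set of query IDs with at least one hit passing all cutoffs."""
    keep = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        qseqid = fields[0]
        pident = float(fields[2])
        length = int(fields[3])
        evalue = float(fields[10])
        if evalue <= max_evalue and pident >= min_identity and length >= min_length:
            keep.add(qseqid)
    return keep
```

Tightening the cutoffs trades sensitivity for confidence, so the resulting count should be read as a lower bound on the number of CYP-containing iLoci.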
Differentially expressed CYP genes
Took intersection of 2 lists
mRNAs from DE iLoci
mRNAs potentially encoding CYPs
Identified 12 putative DE CYPs
11 verified manually
9 / 11 over-expressed in queens
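The intersection step itself is a plain set operation. The sketch below uses made-up mRNA identifiers purely for illustration; the function name is also hypothetical.

```python
def putative_de_cyps(de_mrnas, cyp_mrnas):
    """Return mRNA IDs found both in DE iLoci and in the CYP candidate list."""
    return sorted(set(de_mrnas) & set(cyp_mrnas))


# Hypothetical identifiers, not taken from the actual annotation.
de_mrnas = ["mrna-0007", "mrna-0042", "mrna-0113"]
cyp_mrnas = ["mrna-0042", "mrna-0113", "mrna-0500"]
shared = putative_de_cyps(de_mrnas, cyp_mrnas)
```

Candidates produced this way still need manual review, since membership in either input list is only putative.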
DE NADH dehydrogenase genes
BLASTP search found 38 potential NADHdh genes
12-15 DE NADHdh genes
16 putative DE NADHdh genes
1 thrown out by manual examination
3 borderline
14 / 15 are over-expressed in workers