Differential Expression

From Geuvadis MediaWiki

PUTATIVE TEXT


Differential Expression

We performed gene differential expression (DE) analysis using tweeDEseq (R/Bioconductor), a method that uses a Poisson-Tweedie family of distributions and is well suited to comparing groups with more than 15 samples. After filtering out genes with fewer than 5 counts per million in all samples but one, a set of 16 583 genes remained for analysis. We performed pairwise population comparisons and population-specific comparisons (one population against the remaining four). Genes with FDR < 0.05 and log2 fold change greater than 3 were considered significant. Among population-specific DE genes, CEU and TSI show a higher proportion of up-regulated genes, 70% and 100% respectively (see figure File:Pedro DEG popspecific.pdf). GO analysis reveals that these genes are mainly involved in cell adhesion processes.



EXON INCLUSION

Exon inclusion levels were expressed as the Percentage Splice Index (PSI) [PMID:21876675,18978772], defined as the ratio of inclusion reads to the sum of inclusion and exclusion reads. Differential analysis shows a relatively low number of exons with population-specific inclusion (table below and figure File:Pedro PSI std0.15.pdf).



  • 175 210 exons with at least one PSI value



      CEU   FIN   GBR   TSI   YRI
CEU   #     15    33    15    165
FIN   #     #     21    8     158
GBR   #     #     #     8     154
TSI   #     #     #     #     132
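
The PSI definition given above can be written as a small function. This is a minimal sketch and not the code used for the analysis: the function name and the read counts are hypothetical, and the value is returned as a fraction between 0 and 1 (multiply by 100 for a percentage). Returning None when an exon has no informative reads is an assumption about how undefined values are handled.

```python
def psi(inclusion_reads, exclusion_reads):
    """Return inclusion / (inclusion + exclusion), or None if no reads cover the exon."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # PSI is undefined without any inclusion or exclusion reads
    return inclusion_reads / total


# Hypothetical (inclusion, exclusion) read counts for three exons in one sample
examples = [(30, 10), (0, 12), (0, 0)]
psi_values = [psi(inc, exc) for inc, exc in examples]
```

Under this convention, an exon with "at least one PSI value" is one for which the function returns something other than None in at least one sample.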


END OF PUTATIVE TEXT


Gene Differential Expression

To perform the differential gene expression analysis we used a method called tweeDEseq, which is based on the Poisson-Tweedie family of distributions. It improves on the negative binomial distribution by using a three-parameter distribution. This method is well suited to comparing groups with more than 15 samples. A vignette explaining how to use it can be found here:

[1] [2]


Processing

Using the read count dataset provided in the FTP dataset, we applied the normalisation and filtering steps described in the above-mentioned vignette. The normalisation is based on the TMM method (Trimmed Mean of M values; Robinson and Oshlack, 2010), and the filtering consists of removing those genes with fewer than 5 counts per million in all samples but one. From the 24 800 initial genes, a set of 16 583 genes remained for analysis.
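
The filtering rule can be illustrated with a short sketch. It is not the tweeDEseq implementation: the function names and the toy counts are hypothetical, counts per million are computed from raw library sizes (the TMM normalisation step is left out), and the rule is read as "keep a gene if at least two samples reach 5 counts per million".

```python
def counts_per_million(counts):
    """counts: dict mapping gene -> list of raw read counts, one per sample."""
    n_samples = len(next(iter(counts.values())))
    # Library size = total reads per sample (assumed non-zero for every sample)
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [1e6 * c / lib_sizes[j] for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


def filter_low_expression(counts, cpm_cutoff=5.0, min_samples=2):
    """Drop genes that reach cpm_cutoff in fewer than min_samples samples."""
    cpm = counts_per_million(counts)
    return {
        gene: row
        for gene, row in counts.items()
        if sum(value >= cpm_cutoff for value in cpm[gene]) >= min_samples
    }


# Hypothetical counts for three genes in three samples (each library sums to 1e6)
toy_counts = {
    "geneA": [900_000, 950_000, 990_000],
    "geneB": [10, 0, 1],          # reaches 5 CPM in one sample only: removed
    "geneC": [99_990, 50_000, 9_999],
}
kept = filter_low_expression(toy_counts)
```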

Results

Analyses were performed in two modes. In the first, we compared all pairs of populations, i.e. 10 pairwise comparisons, to detect genes differentially expressed between populations. In the second, we compared each population against the remaining samples, i.e. 5 comparisons in total, to obtain the population-specific differentially expressed genes. For each gene we obtain the mean value in the target population, the mean value in the compared population, the log fold-change (logFC), and the nominal and adjusted p-values. Genes are considered differentially expressed when P < 0.05 and logFC > 2 or 3 (in absolute value, since both up- and down-regulated genes are reported). We report statistics at both fold-change thresholds, so that one can focus on the results of a more or less stringent threshold.
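
The two comparison modes and the significance rule can be sketched as follows. This is an illustration only: the function name call_de is hypothetical, the threshold is applied to the absolute logFC, a positive logFC is assumed to mean higher expression in the target population, and which p-value (nominal or adjusted) is passed in is left to the caller.

```python
from itertools import combinations

POPULATIONS = ["CEU", "FIN", "GBR", "TSI", "YRI"]

# Mode 1: every unordered pair of populations
pairwise = list(combinations(POPULATIONS, 2))

# Mode 2: each population against the pooled remaining populations
one_vs_rest = [(pop, [p for p in POPULATIONS if p != pop]) for pop in POPULATIONS]


def call_de(p_value, log_fc, fc_threshold, alpha=0.05):
    """Classify one gene as 'up', 'down' or 'not DE' in the target population."""
    if p_value >= alpha or abs(log_fc) <= fc_threshold:
        return "not DE"
    return "up" if log_fc > 0 else "down"


# The same hypothetical gene evaluated at the two fold-change thresholds
relaxed = call_de(p_value=0.01, log_fc=2.5, fc_threshold=2)
stringent = call_de(p_value=0.01, log_fc=2.5, fc_threshold=3)
```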

Pairwise Comparison

logFC > 2


      CEU   FIN                     GBR                      TSI                     YRI
CEU   #     1 176 (U:935, D:242)    1 456 (U:1158, D:299)    1 189 (U:963, D:227)    751 (U:375, D:376)
FIN   #     #                       744 (U:396, D:348)       518 (U:265, D:253)      866 (U:184, D:682)
GBR   #     #                       #                        273 (U:133, D:140)      801 (U:179, D:622)
TSI   #     #                       #                        #                       478 (U:69, D:409)

Number of genes differentially expressed between pairs of populations, with the numbers up-regulated (U) and down-regulated (D) in one population relative to the other.

Pairwise Comparison

logFC > 3


      CEU   FIN                  GBR                  TSI                 YRI
CEU   #     570 (U:502, D:68)    723 (U:628, D:95)    551 (U:485, D:66)   236 (U:113, D:123)
FIN   #     #                    228 (U:120, D:108)   116 (U:58, D:58)    350 (U:48, D:302)
GBR   #     #                    #                    65 (U:30, D:35)     344 (U:39, D:305)
TSI   #     #                    #                    #                   167 (U:14, D:153)


Number of genes differentially expressed between pairs of populations, with the numbers up-regulated (U) and down-regulated (D) in one population relative to the other.


One vs All

logFC > 2

Population   Total   Up    Down
CEU          861     554   305
FIN          773     71    702
GBR          890     76    814
TSI          531     520   11
YRI          405     132   273

Number of differentially expressed genes in each population, with the numbers up- and down-regulated relative to the other populations.



logFC > 3

Population   Total   Up    Down
CEU          334     236   99
FIN          332     21    311
GBR          417     13    404
TSI          248     248   0
YRI          116     23    93

Number of differentially expressed genes in each population, with the numbers up- and down-regulated relative to the other populations.
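
The proportion of up-regulated genes per population can be derived directly from the logFC > 3 table above. The sketch below copies the table values; the variable and function names are hypothetical, and the percentage is taken as Up divided by Total.

```python
# (total, up, down) per population, copied from the logFC > 3 one vs all table
ONE_VS_ALL_LOGFC3 = {
    "CEU": (334, 236, 99),
    "FIN": (332, 21, 311),
    "GBR": (417, 13, 404),
    "TSI": (248, 248, 0),
    "YRI": (116, 23, 93),
}


def percent_up(total, up):
    """Percentage of differentially expressed genes that are up-regulated."""
    return 100.0 * up / total


pct_up = {
    pop: percent_up(total, up)
    for pop, (total, up, _down) in ONE_VS_ALL_LOGFC3.items()
}

# Populations ordered from most to least up-regulated
ranked = sorted(pct_up, key=pct_up.get, reverse=True)
```

With these values, TSI and CEU are the only populations in which more than half of the population-specific genes are up-regulated.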




Functional Enrichment

For functional enrichment analysis we considered the population-specific genes (from the one vs all comparison) that are up-regulated at the stringent threshold of logFC > 3. The DAVID functional annotation server [david.abcc.ncifcrf.gov/] was used for this analysis; it searches for enrichment in pathways, GO, and other databases. Only TSI and CEU show significant enrichment (FDR < 0.05). The significantly enriched GO terms are listed below (BP = biological process, MF = molecular function). In both populations, functions related to cell adhesion appear as enriched.

GO TERMS enriched for TSI

category       term                                                   genes  Percentage  pval     pvalAdj
GOTERM_BP_FAT  cell adhesion                                          26     13.5        3.1E-8   3.6E-5
GOTERM_BP_FAT  biological adhesion                                    26     13.5        3.2E-8   1.9E-5
GOTERM_BP_FAT  cell-cell adhesion                                     15     7.8         7.8E-7   3.1E-4
GOTERM_BP_FAT  homophilic cell adhesion                               9      4.7         5.5E-5   1.6E-2
GOTERM_MF_FAT  calcium ion binding                                    27     14.1        1.3E-6   4.3E-4


GO TERMS enriched for CEU

category       term                                                   genes  Percentage  pval     pvalAdj
GOTERM_BP_FAT  cell adhesion                                          26     14.4        1.4E-8   1.6E-5
GOTERM_BP_FAT  biological adhesion                                    26     14.4        1.5E-8   8.0E-6
GOTERM_BP_FAT  axonogenesis                                           13     7.2         4.0E-7   1.5E-4
GOTERM_BP_FAT  cell-cell adhesion                                     15     8.3         5.0E-7   1.4E-4
GOTERM_BP_FAT  cell projection organization                           17     9.4         5.8E-7   1.3E-4
GOTERM_BP_FAT  cell morphogenesis involved in neuron differentiation  13     7.2         9.3E-7   1.7E-4
GOTERM_BP_FAT  neuron development                                     16     8.8         1.1E-6   1.7E-4
GOTERM_BP_FAT  neuron projection morphogenesis                        13     7.2         1.1E-6   1.6E-4
GOTERM_BP_FAT  neuron differentiation                                 18     9.9         1.2E-6   1.5E-4
GOTERM_BP_FAT  cell morphogenesis involved in differentiation         13     7.2         4.7E-6   5.1E-4
GOTERM_BP_FAT  cell projection morphogenesis                          13     7.2         4.9E-6   4.9E-4
GOTERM_BP_FAT  cell part morphogenesis                                13     7.2         7.6E-6   7.0E-4
GOTERM_BP_FAT  neuron projection development                          13     7.2         7.6E-6   7.0E-4
GOTERM_BP_FAT  axon guidance                                          8      4.4         8.3E-5   7.0E-3
GOTERM_BP_FAT  cell morphogenesis                                     13     7.2         1.9E-4   1.5E-2
GOTERM_BP_FAT  cellular component morphogenesis                       13     7.2         5.1E-4   3.7E-2
GOTERM_BP_FAT  potassium ion transport                                8      4.4         9.7E-4   6.5E-2
GOTERM_MF_FAT  ion channel activity                                   14     7.7         1.1E-4   3.9E-2
GOTERM_MF_FAT  substrate specific channel activity                    14     7.7         1.5E-4   2.6E-2
GOTERM_MF_FAT  channel activity                                       14     7.7         2.1E-4   2.5E-2
GOTERM_MF_FAT  passive transmembrane transporter activity             14     7.7         2.2E-4   1.9E-2
GOTERM_MF_FAT  potassium channel activity                             8      4.4         3.4E-4   2.4E-2
GOTERM_MF_FAT  cation channel activity                                11     6.1         3.9E-4   2.3E-2
GOTERM_MF_FAT  metal ion transmembrane transporter activity           12     6.6         4.0E-4   2.0E-2
GOTERM_MF_FAT  gated channel activity                                 11     6.1         9.8E-4   4.3E-2
GOTERM_MF_FAT  inward rectifier potassium channel activity            4      2.2         1.3E-3   4.8E-2
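
To give an idea of what a GO enrichment p-value measures, the sketch below computes a plain hypergeometric upper-tail probability: the chance of seeing at least k annotated genes in a list of n genes drawn from a background of N genes, of which K carry the annotation. It does not reproduce the DAVID values in the tables above, since DAVID reports a modified Fisher exact test (the EASE score). The function name is hypothetical, and apart from the 26 hits and the 16 583 analysed genes, the numbers used (list size, annotated background genes) are hypothetical too.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N genes, K annotated, n drawn)."""
    upper = min(n, K)
    total = comb(N, n)
    # Exact integer sum of the favourable draws, divided once at the end
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical example: 26 of 200 list genes annotated with a term that
# covers 900 of the 16 583 background genes
p_enriched = hypergeom_upper_tail(k=26, n=200, K=900, N=16583)

# Fewer annotated genes in the list give weaker evidence (a larger p-value)
p_weaker = hypergeom_upper_tail(k=15, n=200, K=900, N=16583)
```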

Brief Discussion

In general, we observed that CEU is the population that departs most from the other populations in terms of differentially expressed genes. At the other extreme is the YRI population. Interestingly, when we compare CEU with FIN, GBR and TSI (logFC > 2), 79%, 79% and 80% of the differentially expressed genes, respectively, are up-regulated in CEU. In the same comparisons, YRI shows 78%, 77% and 85% of genes down-regulated relative to these three populations.
