Splicing

From Geuvadis MediaWiki

Meta-page for all analyses related to splicing.


Overview

Page By Studies/Plots Messages TODO
Simple transcript variation measures Jean boxplots (nr. expressed spliceforms) vs (nr. detected spliceforms) for all populations,

density scatters for the consistency of the nr. of expressed spliceforms selected population pairs, GO of genes with most variable spliceforms, boxplots with major isoform ratio as a function of the nr. of detected spliceforms, boxplots with entropy as a function of the nr. of detected spliceforms

number of expressed spliceforms (>0.01 RPKM) grows significantly slower than number of detected spliceforms (>0 RPKM) => supports deconvolution, relative expression of major isoform saturates at ~0.4 => supports deconvolution, also Shannon entropy saturates, no striking population-specific differences in the latter studies but GO of genes with most differences in spliceforms detected between populations shows "adhesion" => support other studies linear regression of nr. expressed spliceform medians? determine supremum of Shannon entropy medians / infimum of rel. expression level of major isoform? Overlap genes with significant GO hits with other studies
Contribution of alternative splicing on the transcript variability Jean histograms for the distribution of CV, R... distance and 2 different Vs/Vt measurements in each population,

corresponding scatter plots for pairwise comparisons of selected populations with these measurements, GO analysis of genes that are most variable according to Vs/Vt

In general, splicing variability seems to be conserved across the populations: the Pearson correlation coefficient is very high. Compared to the study in Mar Gonzalez-Porta's article, a greater number of genes is studied. CEU and YRI are the populations with the most divergent genes (older cell lines?). Tune scatter plot?
Qualitative and quantitative mRNA variation Jean (summary of main results from the last two pages)
Alternative splicing analysis Jean scatter plot with splicing dispersion CEU vs. others

Venn diagrams with common and population-specific genes (according to dispersion) stacked barplots with dispersion bins for each population (the above 3 types of plots also when assessing splicing variability by a different measurement/threshold)

Results are similar across the different populations: around 50-55% of the transcript variability is estimated to be due to gene expression, so the contribution of alternative splicing is substantial. Even though the general distribution is similar, comparison at the gene level shows that some genes behave very differently across the populations. GO study of most variable genes points to the cell surface. Overlap Venn diagrams with Mar's / Pedro's results. A specific study of genes with very different behaviour across the populations. Overlap GO results on gene level with other studies.
Differential Expression Pedro tables/barplots of population-specifically regulated (up/down) genes,

GO analysis of population-specific genes

In general we observed that CEU is the population that departs most from the other populations in terms of differentially expressed genes. At the other extreme is the YRI population. Interestingly, when we compare CEU with FIN, GBR and TSI (logFC>2), 79%, 79% and 80% of the genes, respectively, are up-regulated in CEU. In the same comparison, YRI shows 78%, 77% and 85% of genes down-regulated relative to these three populations. Overlap GO analysis with other studies.

Discuss up-/down-regulated genes w.r.t. cell-line age: "FIN and GBR are recent, a couple of years. TSI a bit older, then YRI, and CEU is the oldest"

Percentage Splicing Index Pedro schema for PSI computation,

tables of exons that are specific to a certain population pair (one-PSI, all-PSI), heatmap/hier.clustering of PSI exon scores

hierarchical clustering with site/exon variants instead of populations?
Ubiquity vs specificity of gene and isoform expression across populations

and

Differential isoform usage

Mar Beautiful Venn Diagrams showing the number of genes/transcripts that are specific for certain (groups of) population(s), and the ones that are common to all (i.e., ubiquitous).

Pie charts showing the distribution of genes/transcripts that are specific for 1..4 populations, or for all 5 populations (i.e., ubiquitous)

Most of gene expression / alternative transcript usage remains unchanged across individuals.

GO analysis of specifically expressed genes (cell surface, mutagenesis site/host-virus interaction, cell cycle/apoptosis, transcription regulation) and transcript forms (protein transport, translation, RNA binding, RNA processing, cell cycle, protein degradation)

interpretation of GO analysis with respect to other analyses
MRNA Quantification

and

A3 mRNA Variation

Thasso/Emilio/Micha Transcript quantification,

Splice junction quantification, Intron quantification, Gene Discovery across Individuals

Qualitative gene discovery does not saturate,

Split Mappings harbor many novel splice sites

gene discovery plot that works in smaller space
Inter-Individual Variability of Splicing Anna/Micha (incl. results of Pedro/Matthias) Histograms as barplots Variants distinguished by model-based classification according to their impact on splice sites also show differences in their allele frequency distributions,

Splice-site-modifying variants at the flanks of alternative exons influence their inclusion level, Split mappings predict rare and not yet annotated introns; however, taken together across all individuals/populations, these rare variants roughly double the number of splice sites to be considered in the human transcriptome

nr of variants (motif) <> variation, connection with LoF variants/regulatory variants
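The simple transcript variation measures named in the overview (number of detected and expressed spliceforms, relative expression of the major isoform, Shannon entropy) can be sketched per gene as follows. This is only an illustrative sketch, not the code used for the plots: the function name `spliceform_measures` is hypothetical, the thresholds are the ones quoted in the overview (>0 RPKM detected, >0.01 RPKM expressed), and computing the ratio and the entropy over all detected spliceforms with a base-2 logarithm is an assumption.

```python
import math

EXPRESSED_RPKM = 0.01  # "expressed" threshold quoted in the overview
DETECTED_RPKM = 0.0    # "detected" threshold quoted in the overview


def spliceform_measures(rpkms):
    """Summarise the spliceform RPKM values of one gene in one sample.

    Assumption: major-isoform ratio and Shannon entropy are computed over
    all detected spliceforms, with entropy in bits (log base 2).
    """
    detected = [x for x in rpkms if x > DETECTED_RPKM]
    expressed = [x for x in rpkms if x > EXPRESSED_RPKM]
    total = sum(detected)
    if total == 0:
        # Gene not detected at all: ratio and entropy are undefined.
        return {"detected": 0, "expressed": 0,
                "major_ratio": None, "entropy": None}
    fractions = [x / total for x in detected]
    entropy = -sum(p * math.log2(p) for p in fractions)
    return {
        "detected": len(detected),
        "expressed": len(expressed),
        "major_ratio": max(fractions),
        "entropy": entropy,
    }
```

For a gene with two equally abundant spliceforms, the major-isoform ratio is 0.5 and the entropy is 1 bit; a spliceform at 0.005 RPKM counts as detected but not as expressed.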

Expression Variability across Individuals

Ubiquity vs. Specificity

(Jean, Pedro, Mar, Micha)

Overall, the picture of gene expression as well as spliceform choice is largely similar across samples of the same tissue from different individuals.

  • Scatter Plot of CEU-xyz pairwise comparisons (Jean)
  • Pie Charts summarizing the number of genes that are specific for 1,2,..all populations (Mar)
  • Number of genes that are up-down regulated (Pedro)
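As a rough illustration of how the up-/down-regulation percentages in the overview could be tallied for one population pair, the sketch below counts genes beyond a log fold-change cutoff. The function name `regulation_summary` and the input layout (gene name mapped to logFC) are hypothetical, and applying the cutoff symmetrically (logFC > 2 for up, logFC < -2 for down) is an assumption; the overview only states "logFC>2".

```python
def regulation_summary(log_fcs, threshold=2.0):
    """Count up- and down-regulated genes for one population comparison.

    log_fcs: dict mapping gene name -> log fold change (hypothetical layout).
    Assumption: the cutoff is applied symmetrically to both directions.
    """
    up = sum(1 for v in log_fcs.values() if v > threshold)
    down = sum(1 for v in log_fcs.values() if v < -threshold)
    total = up + down
    # Share of up-regulated genes among all genes passing the cutoff.
    pct_up = 100.0 * up / total if total else None
    return {"up": up, "down": down, "pct_up": pct_up}
```

Genes with a log fold change inside the cutoff are ignored, so the percentage refers only to genes called differentially expressed.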


Characterisation of Specifically Expressed and Processed Genes

(Jean, Pedro, Mar, Micha)

Common attributes of genes that constitute the minor changes in configuration between individuals

  • GO study of genes that exhibit population-selectivity in their expression (Mar)
  • GO study of genes that exhibit population-selectivity in their spliceform choice (Jean / Mar)


Inter-Individual Variability of Splicing

Splice Site Variants

Variants distinguished by model-based classification according to their impact on splice sites also show differences in their allele frequency distributions

  • Histogram of allele frequencies for the 5 splice site variant classifications, i.e., activating, improving, neutral, deteriorating and inhibiting variants


Alternative Exon Inclusion

Splice-site-modifying variants at the flanks of alternative exons influence their inclusion level

  • Histogram of PSI-Scores for exons with 0, 1, 2 deleterious variants compared to 0, 1, 2 neutral/improving variants
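The schema used for the PSI computation is documented on the Percentage Splicing Index page and is not reproduced here. As a hedged sketch, the code below uses a commonly used definition (reads supporting exon inclusion divided by reads supporting inclusion or exclusion) and groups exon scores by the number of deleterious flanking variants, which is the input such a histogram needs. The function names and the tuple layout are hypothetical.

```python
from collections import defaultdict


def psi(inclusion_reads, exclusion_reads):
    """Inclusion level of an exon as a fraction in [0, 1].

    Assumption: PSI = inclusion / (inclusion + exclusion); the schema
    actually used in the project may differ.
    """
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # no informative reads, PSI undefined
    return inclusion_reads / total


def psi_by_variant_count(exons):
    """Group PSI scores by the number of deleterious variants per exon.

    exons: iterable of (n_deleterious, inclusion_reads, exclusion_reads),
    a hypothetical layout chosen for this example.
    """
    groups = defaultdict(list)
    for n_deleterious, inc, exc in exons:
        score = psi(inc, exc)
        if score is not None:
            groups[n_deleterious].append(score)
    return dict(groups)
```

Exons without informative reads are dropped rather than scored as 0, so they do not bias the low end of the histogram.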


Novel Introns Variations

Split mappings predict rare and not yet annotated introns; however, taken together across all individuals/populations, these rare variants roughly double the number of splice sites to be considered in the human transcriptome

  • Histogram of known/novel intron usage in the Geuvadis individuals