Experts in proteomics from Utrecht University and in next-generation DNA sequencing from the Hubrecht Institute/UMC Utrecht collaborated on a study to systematically determine the consequences of genetic variation for the transcriptome and proteome. The researchers applied ultra-deep quantitative proteomics, whole-genome sequencing and RNA-seq to liver tissue from different inbred rat strains. One of these strains, the spontaneously hypertensive rat (SHR), is a widely used disease model for hypertension.
Proteomics experiments are normally hampered by identification methods that rely on a reference genome, so protein sequences specific to an individual strain can go undetected. The authors therefore designed a personalized protein database for each rat strain examined. They did so by integrating small genomic variants detected by whole-genome sequencing with novel splicing and editing variants detected by RNA-seq. This extra information, combined with the use of five different proteases to extend proteome coverage, resulted in the largest proteome to date (~13,000 proteins), with over 30% more proteins identified in a single sample than the current standard. In addition, hundreds of novel genes, editing sites and transcript isoforms were identified at the protein level for the first time.
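The idea behind a personalized protein database can be illustrated with a minimal sketch. It is not the authors' pipeline: the coding sequence, the variant and all function names below are hypothetical, and trypsin stands in for the five proteases, which are not named here. The sketch applies a single-nucleotide variant to a reference coding sequence, translates both versions, and digests them in silico to show which peptide would be missed by a reference-only search.

```python
# Toy illustration only: sequences, variant and function names are hypothetical.

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def apply_snvs(cds, snvs):
    """Apply single-nucleotide variants given as (0-based position, ref, alt)."""
    seq = list(cds)
    for pos, ref, alt in snvs:
        if seq[pos] != ref:
            raise ValueError(f"reference base mismatch at position {pos}")
        seq[pos] = alt
    return "".join(seq)


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def digest_trypsin(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


# Hypothetical reference coding sequence and one strain-specific variant.
REFERENCE_CDS = "ATGGCTAAAGGTGAACTGCGTTCTTGGTAA"
STRAIN_SNVS = [(13, "A", "C")]  # changes codon GAA (E) to GCA (A)

reference_protein = translate(REFERENCE_CDS)
variant_protein = translate(apply_snvs(REFERENCE_CDS, STRAIN_SNVS))

reference_peptides = set(digest_trypsin(reference_protein))
variant_peptides = [
    p for p in digest_trypsin(variant_protein) if p not in reference_peptides
]

print(reference_protein)  # MAKGELRSW
print(variant_protein)    # MAKGALRSW
print(variant_peptides)   # ['GALR']
```

In this toy case the peptide GALR exists only in the strain-specific sequence, so a search against the reference alone could not match its spectrum. Adding the variant protein to the search database is what makes that identification possible.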
Beyond this impressive gain in protein identifications, subsequent integrated quantitative comparisons of RNA and protein levels provided novel insights into disease biology. Four differentially expressed genes emerged that had previously been associated with hypertension. One of them, Cyp17a1, was previously identified as one of the top hits in human genome-wide association studies (GWAS) of hypertension.