Inferring the sex based on transcriptome sequencing

Analogous to my previous post on inferring the sex of individuals based on exome sequencing, I’ll now show you how to do the same for transcriptome sequencing. In the example, I use data from Lexogen QuantSeq, but this is most likely equally applicable to other RNA-seq approaches. This is a useful QC step and can detect roughly 50% of sample swaps in your experiment, namely those between samples of opposite sex (assuming a roughly balanced sex ratio); swaps between two samples of the same sex go unnoticed.

This code is a function in my DEA.R R script for reproducible and convenient differential expression analysis from the command line. I use XIST as a female-specific gene and four chrY genes for male-specific expression (based on Staedtler et al. 2013). The function takes a vector of expected sexes, a count matrix (e.g. from featureCounts) and a vector of sample names. The plot is made using ggplot2.
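As a rough sketch of the idea (not the DEA.R code itself), the snippet below builds a small table of XIST reads versus summed chrY marker reads per sample and flags samples whose inferred sex disagrees with the expected one. Everything in it is an assumption made for illustration: the function name `sex_marker_table`, the four chrY gene symbols, the simple "more XIST than chrY reads" rule, and the toy count matrix with made-up numbers.

```r
# Hypothetical gene sets: XIST as the female marker, four chrY genes as male
# markers. The chrY symbols are an assumption; substitute the ones you use.
female_gene <- "XIST"
male_genes  <- c("RPS4Y1", "EIF1AY", "DDX3Y", "KDM5D")

# counts: genes x samples matrix with gene symbols as row names
# expected_sex: character vector ("F"/"M"), one entry per sample
# sample_names: character vector, one entry per sample
sex_marker_table <- function(counts, expected_sex, sample_names) {
  stopifnot(ncol(counts) == length(expected_sex),
            ncol(counts) == length(sample_names),
            female_gene %in% rownames(counts))
  present_male <- intersect(male_genes, rownames(counts))
  stopifnot(length(present_male) > 0)
  data.frame(
    sample   = sample_names,
    expected = expected_sex,
    xist     = as.numeric(counts[female_gene, ]),
    chrY     = as.numeric(colSums(counts[present_male, , drop = FALSE])),
    stringsAsFactors = FALSE
  )
}

# Toy count matrix (made-up numbers, purely for illustration)
counts <- matrix(
  c(250,   3, 310,   1,   # XIST
      0, 120,   2,  95,   # RPS4Y1
      1,  40,   0,  33,   # EIF1AY
      0,  60,   1,  48,   # DDX3Y
      0,  25,   0,  30,   # KDM5D
    900, 850, 990, 700),  # an unrelated gene
  ncol = 4, byrow = TRUE,
  dimnames = list(c("XIST", "RPS4Y1", "EIF1AY", "DDX3Y", "KDM5D", "ACTB"),
                  c("S1", "S2", "S3", "S4"))
)
expected <- c("F", "M", "F", "F")   # S4 is deliberately mislabeled

tab <- sex_marker_table(counts, expected, colnames(counts))
# Simple rule: more XIST than summed chrY reads means "F", otherwise "M"
tab$inferred <- ifelse(tab$xist > tab$chrY, "F", "M")
tab$mismatch <- tab$inferred != tab$expected
print(tab)

# Build the scatter plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(tab, ggplot2::aes(x = xist + 1, y = chrY + 1,
                                         colour = expected, label = sample)) +
    ggplot2::geom_point(size = 3) +
    ggplot2::geom_text(vjust = -1, show.legend = FALSE) +
    ggplot2::scale_x_log10() +
    ggplot2::scale_y_log10() +
    ggplot2::labs(x = "XIST reads + 1", y = "summed chrY marker reads + 1")
  if (interactive()) print(p)   # draw it when run in an interactive session
}
```

In the toy data, S4 is labeled female but shows almost no XIST reads and plenty of chrY reads, so it ends up in the male cluster and is flagged as a mismatch. On real data a fixed rule like this may be too crude, and looking at the plot is the safer check.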



UPDATE: Devon Ryan suggested normalizing the read counts, so I have added the resulting plot (and the code) below. There is indeed some improvement, but not much.
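One common way to normalize by library size is counts per million (CPM); whether this matches the exact normalization used for the plot is an assumption. A minimal sketch, with a hypothetical helper `cpm_normalize` and a made-up toy matrix:

```r
# Scale raw counts to counts per million (CPM) using each sample's total reads
cpm_normalize <- function(counts) {
  lib_sizes <- colSums(counts)
  stopifnot(all(lib_sizes > 0))
  sweep(counts, 2, lib_sizes, FUN = "/") * 1e6
}

# Toy matrix with made-up numbers: sample B was sequenced twice as deeply as A
counts <- matrix(c(100,  200,
                   900, 1800),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(c("XIST", "other"), c("A", "B")))

cpm <- cpm_normalize(counts)
print(cpm)
```

After normalization the two samples have the same XIST value even though their raw counts differ by a factor of two, which is why points from samples with different sequencing depths move closer together in the plot.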



2 thoughts on “Inferring the sex based on transcriptome sequencing”

    1. @Devon Ryan: I’ve updated my post and added the suggested plot using library-size normalized counts. There is a slightly tighter bunching of the points, but not by a lot.

