Back in 2002, we pioneered a new strategy for more accurate normalization of mRNA expression data from RT-qPCR studies. Since then, the use of multiple stably expressed reference genes has become the gold standard method (see also the MIQE guidelines). Today, more than 11,000 papers have cited our seminal paper (according to Google Scholar)!
The most appropriate mRNA reference genes are typically identified in a geNorm pilot study, in which at least 8 candidate reference genes are measured in at least 10 representative samples (e.g. 5 from each of 2 groups). Sufficient sets of candidate reference genes are available in the literature or from life science companies. Setting up a geNorm pilot experiment does not require much wet lab work (typically one 96-well plate), and data analysis is straightforward when using qbase+, which contains an improved version of the original geNorm algorithm. Identifying the most stably expressed mRNA reference genes and using at least 2 of these in the normalization procedure significantly improves the accuracy of the RT-qPCR results.
However, when it comes to microRNA data normalization, there are no predefined sets of candidate reference microRNAs. Historically, qPCR users relied on small nuclear or small nucleolar RNAs for normalization (e.g. U6 or RNU44). But, similar to our arguments against ribosomal RNA for mRNA normalization, we advise against these internal controls. rRNA and sn(o)RNA are transcribed by a different RNA polymerase and have different functions than mRNA and miRNA, respectively. Even without these arguments, it is never wise to simply rely on one or a few (popular) reference genes; you really should test whether the candidate reference genes are stably expressed in your experimental conditions.
In 2009, we published a new method for even more accurate normalization when a large, unbiased set of genes is measured. We applied the method to the normalization of microRNA expression profiling studies, in which typically a few hundred microRNAs are measured. The method has since been improved by attributing equal weight to each individual miRNA during normalization. The improved method is also built into qbase+ as the 'global mean normalization method'.
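To make the idea concrete, here is a minimal Python sketch of the basic global mean principle: each sample is normalized against the mean of all miRNAs detected in that sample. It is not the qbase+ implementation and does not include the equal-weight refinement. It assumes Cq values as input, perfect (100%) amplification efficiency, and `None` for undetected miRNAs; the function name and data layout are illustrative only.

```python
from statistics import mean

def global_mean_normalize(cq):
    """Normalize Cq values per sample against the global mean Cq.

    cq: {sample: {mirna: Cq value, or None if not detected}}
    Returns {sample: {mirna: log2 normalized expression}}, where a
    higher value means higher expression relative to the sample mean.
    """
    normalized = {}
    for sample, values in cq.items():
        # Only miRNAs detected in this sample contribute to its mean.
        detected = {m: v for m, v in values.items() if v is not None}
        sample_mean = mean(detected.values())
        # A lower Cq means more template, so subtract Cq from the mean.
        normalized[sample] = {m: sample_mean - v for m, v in detected.items()}
    return normalized

# Made-up Cq values for two samples, purely for illustration.
example_cq = {
    "sample1": {"miR-x": 20.0, "miR-y": 24.0, "miR-z": None},
    "sample2": {"miR-x": 21.0, "miR-y": 25.0, "miR-z": 29.0},
}
result = global_mean_normalize(example_cq)
```

Because the sample mean only carries information when many miRNAs contribute to it, a sketch like this is meaningful only for large, unbiased profiling datasets.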
While the method referenced above works well, it requires many microRNAs to be measured. For follow-up studies, one is typically only interested in validating (part of) the differentially expressed microRNAs. To normalize that type of data, we recommend the use of multiple stably expressed microRNAs.
We propose the following procedure to find such stably expressed candidate reference microRNAs:
- Import your microRNA profiling data into qbase+ and normalize the data using the global mean method (this only works well if you have measured a large and unbiased set of miRs). If you do not have access to qbase+, start from normalized microRNA expression data.
- Export the result table from qbase+ in log scale format.
- Open this file in e.g. Excel and calculate the standard deviation per microRNA.
- Select at least 3 candidate microRNAs (5 or more is better) that:
  - have data for all samples
  - have the lowest standard deviation
  - do not belong to the same miR family[1] (use only the best miR per family)
- Verify in your final experiment that these candidate miRs are stably expressed (low M values; guidance is offered in the geNorm report in qbase+).
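For readers who prefer a script over a spreadsheet, the selection steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the input is already normalized and in log scale, missing values are represented as `None`, and the `family` mapping is a hypothetical lookup you would build yourself (e.g. from miRBase family annotation). The miRNA names and values are made up.

```python
from statistics import stdev

def select_candidates(log_expr, family, n=3):
    """Pick the n most stable miRNAs, at most one per miR family.

    log_expr: {mirna: [log-scale normalized value per sample, None if missing]}
    family:   {mirna: family name}; miRNAs not listed form their own family
    """
    # Keep only miRNAs with data for all samples.
    complete = {m: v for m, v in log_expr.items()
                if all(x is not None for x in v)}
    # Rank by standard deviation across samples, lowest first.
    ranked = sorted(complete, key=lambda m: stdev(complete[m]))
    chosen, seen_families = [], set()
    for mirna in ranked:
        fam = family.get(mirna, mirna)
        if fam in seen_families:
            continue  # a more stable member of this family was already picked
        seen_families.add(fam)
        chosen.append(mirna)
        if len(chosen) == n:
            break
    return chosen

# Made-up example data: five miRNAs measured in three samples.
example = {
    "miR-A1": [1.0, 1.1, 0.9],
    "miR-A2": [1.0, 1.2, 0.8],
    "miR-B":  [2.0, 2.5, 1.5],
    "miR-C":  [0.0, None, 0.1],
    "miR-D":  [3.0, 3.3, 2.7],
}
example_family = {"miR-A1": "famA", "miR-A2": "famA"}
candidates = select_candidates(example, example_family, n=3)
```

In this toy dataset, miR-C is dropped because of a missing value and miR-A2 is skipped because miR-A1 from the same family is more stable. The selected candidates still need to be verified in the final experiment, as described in the last step.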
If you do not have access to large scale miR profiles, consider profiling a few representative samples, after which you can follow the procedure outlined above. If that is not an option, then you should set up a classic geNorm pilot experiment with sn(o)RNAs and published candidate reference miRs (ideally in the same type of samples). Typically, 8 candidate small RNAs are measured in at least 10 representative samples. The geNorm report in qbase+ will help you to identify the most stably expressed genes and will suggest how many genes to use to achieve optimal normalization.
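To give a feel for what a stability value expresses, the sketch below computes an M value along the lines of the originally published geNorm measure: for each candidate, the average standard deviation of its log2 expression ratio with every other candidate across samples. It is a simplified illustration, not the improved algorithm in qbase+, and it omits the stepwise exclusion of the least stable gene. The input layout and function name are assumptions.

```python
from statistics import mean, stdev

def genorm_m(log2_expr):
    """Compute a geNorm-style stability value M per candidate gene.

    log2_expr: {gene: [log2 relative quantity per sample]}, with the same
               sample order for every gene and no missing values
    Returns {gene: M}; a lower M indicates more stable expression.
    """
    genes = list(log2_expr)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # A log2 ratio is the difference of log2 quantities, per sample.
            log_ratios = [a - b for a, b in zip(log2_expr[j], log2_expr[k])]
            pairwise_sd.append(stdev(log_ratios))
        # M is the mean pairwise variation with all other candidates.
        m_values[j] = mean(pairwise_sd)
    return m_values

# Made-up data: g1 and g2 move in parallel across samples, g3 does not.
m = genorm_m({
    "g1": [0.0, 1.0, 2.0],
    "g2": [1.0, 2.0, 3.0],
    "g3": [0.0, 2.0, 0.0],
})
```

Note that two candidates with a constant expression ratio get a pairwise variation of zero even if both change between samples, which is why the measure is only meaningful with a sufficient number of independent candidates.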
[1] miR families can be inspected in a special miRBase file
Editor's note: This post was originally published in April 2013 and has been completely updated for accuracy.