Data Analysis and BioInformatics in real-time qPCR (3)
Overview: How to do successful gene expression analysis using real-time PCR Stefaan Derveaux, Jo Vandesompele, Jan Hellemans Methods Vol 50, Issue 4, April 2010, in The ongoing Evolution of qPCR edited by Michael W. Pfaffl, Pages 227-230 Reverse
transcription quantitative PCR (RT-qPCR) is considered
today as the gold standard for accurate, sensitive and
fast measurement of gene expression. Unfortunately,
what many users fail to appreciate is that numerous
critical issues in the workflow need to be addressed
before biologically meaningful and trustworthy
conclusions can be drawn. Here, we review the entire
workflow from the planning and preparation phase, over
the actual real-time PCR cycling experiments to
data-analysis and reporting steps. This process can be
captured with the appropriate acronym PCR:
plan/prepare, cycle and report. The key message is
that quality assurance and quality control are
essential throughout the entire RT-qPCR workflow; from
living cells, over extraction of nucleic acids,
storage, various enzymatic steps such as DNase
treatment, reverse transcription and PCR
amplification, to data-analysis and finally reporting.
Quantitative real-time RT-PCR data analysis: current concepts and the novel "gene expression's CT difference" formula. Schefe JH, Lehmann KE, Buschmann IR, Unger T, Funke-Kaiser H. Center for Cardiovascular Research (CCR)/Institute of Pharmacology and Toxicology, Charité-Universitätsmedizin Berlin, Hessische Strasse 3-4, 10115, Berlin, Germany. J Mol Med. 2006 84(11):901-10. Epub 2006 Sep 14. For
quantification of gene-specific mRNA, quantitative
real-time RT-PCR has become one of the most frequently
used methods over the last few years. This article
focuses on the issue of real-time PCR data analysis
and its mathematical background, offering a general
concept for efficient, fast and precise data analysis
superior to the commonly used comparative CT
(DeltaDeltaCT) and the standard curve method, as it
considers individual amplification efficiencies for
every PCR. This concept is based on a novel formula
for the calculation of relative gene expression
ratios, termed GED (Gene Expression's CT Difference)
formula. Prerequisites for this formula, such as
real-time PCR kinetics, the concept of PCR efficiency
and its determination, are discussed. Additionally,
this article offers some technical considerations and
information on statistical analysis of real-time PCR
data.
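Illustration (not part of the abstract): the abstract contrasts the comparative CT (DeltaDeltaCT) method with approaches that use an individual amplification efficiency per PCR. The sketch below shows that contrast with a generic efficiency-corrected ratio. It is not the GED formula itself, which the abstract does not spell out. The function names and the example CT values are assumptions made for illustration.

```python
def ratio_ddct(ct_target_control, ct_target_treated,
               ct_ref_control, ct_ref_treated):
    """Comparative CT (2^-ddCT): assumes perfect doubling in every reaction."""
    d_ct_control = ct_target_control - ct_ref_control
    d_ct_treated = ct_target_treated - ct_ref_treated
    return 2.0 ** -(d_ct_treated - d_ct_control)


def ratio_efficiency_corrected(e_target, e_ref,
                               ct_target_control, ct_target_treated,
                               ct_ref_control, ct_ref_treated):
    """Generic efficiency-corrected ratio.

    e_target and e_ref are amplification factors per cycle
    (2.0 means perfect doubling, i.e. 100 % efficiency).
    """
    target_change = e_target ** (ct_target_control - ct_target_treated)
    ref_change = e_ref ** (ct_ref_control - ct_ref_treated)
    return target_change / ref_change


if __name__ == "__main__":
    # Hypothetical CT values: target drops by 2 cycles, reference is unchanged.
    print(ratio_ddct(25.0, 23.0, 20.0, 20.0))
    print(ratio_efficiency_corrected(1.9, 2.0, 25.0, 23.0, 20.0, 20.0))
```

With both amplification factors set to 2.0 the two functions agree; lowering the target factor to 1.9 shrinks the estimated ratio, which is the kind of bias the efficiency-aware methods aim to remove.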
Multiway real-time PCR gene expression profiling in yeast Saccharomyces cerevisiae reveals altered transcriptional response of ADH-genes to glucose stimuli. Ståhlberg A, Elbing K, Andrade-Garda JM, Sjögreen B, Forootan A, Kubista M. TATAA Biocenter, Odinsgatan 28, 411 03 Göteborg, Sweden. anders.stahlberg@neuro.gu.se BMC Genomics. 2008 9:170. BACKGROUND: The large sensitivity, high reproducibility and essentially unlimited dynamic range of real-time PCR to measure gene expression in complex samples provides the opportunity for powerful multivariate and multiway studies of biological phenomena. In multiway studies samples are characterized by their expression profiles to monitor changes over time, effect of treatment, drug dosage etc. Here we perform a multiway study of the temporal response of four yeast Saccharomyces cerevisiae strains with different glucose uptake rates upon altered metabolic conditions. RESULTS: We measured the expression of 18 genes as function of time after addition of glucose to four strains of yeast grown in ethanol. The data are analyzed by matrix-augmented PCA, which is a generalization of PCA for 3-way data, and the results are confirmed by hierarchical clustering and clustering by Kohonen self-organizing map. Our approach identifies gene groups that respond similarly to the change of nutrient, and genes that behave differently in mutant strains. Of particular interest is our finding that ADH4 and ADH6 show a behavior typical of glucose-induced genes, while ADH3 and ADH5 are repressed after glucose addition. CONCLUSION: Multiway real-time PCR gene expression profiling is a powerful technique which can be utilized to characterize functions of new genes by, for example, comparing their temporal response after perturbation in different genetic variants of the studied subject. The technique also identifies genes that show perturbed expression in specific strains.
Statistical aspects of quantitative real-time PCR experiment design Robert R. Kitchen, Mikael Kubista, Ales Tichopad Methods Vol 50, Issue 4, April 2010, in The ongoing Evolution of qPCR edited by Michael W. Pfaffl, Pages 231-236 Experiments using
quantitative real-time PCR to test hypotheses are
limited by technical and biological variability; we
seek to minimise sources of confounding variability
through optimum use of biological and technical
replicates. The quality of an experiment design is
commonly assessed by calculating its prospective
power. Such calculations rely on knowledge of the
expected variances of the measurements of each group
of samples and the magnitude of the treatment effect;
the estimation of which is often uninformed and
unreliable. Here we introduce a method that exploits a
small pilot study to estimate the biological and
technical variances in order to improve the design of
a subsequent large experiment. We measure the variance
contributions at several 'levels' of the experiment
design and provide a means of using this information
to predict both the total variance and the prospective
power of the assay. A validation of the method is
provided through a variance analysis of representative
genes in several bovine tissue-types. We also discuss
the effect of normalisation to a reference gene in
terms of the measured variance components of the gene
of interest. Finally, we describe a software
implementation of these methods, powerNest, that gives
the user the opportunity to input data from a pilot
study and interactively modify the design of the
assay. The software automatically calculates expected
variances, statistical power, and optimal design of
the larger experiment. powerNest enables the
researcher to minimise the total confounding variance
and maximise prospective power for a specified maximum
cost for the large study.
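Illustration (not part of the abstract): the idea of combining variance contributions from several levels of a nested design into a total variance and a prospective power can be sketched as follows. This is not the powerNest implementation. The three levels (subject, RT, qPCR), the function names and the normal approximation for power are assumptions chosen to keep the example short.

```python
from statistics import NormalDist


def variance_of_group_mean(var_subject, var_rt, var_qpcr,
                           n_subjects, n_rt, n_qpcr):
    """Variance of a group mean in a balanced nested design.

    Each subject gets n_rt reverse transcriptions, and each of those
    gets n_qpcr qPCR replicates (assumed level structure).
    """
    return (var_subject / n_subjects
            + var_rt / (n_subjects * n_rt)
            + var_qpcr / (n_subjects * n_rt * n_qpcr))


def approximate_power(effect, var_mean_per_group, alpha=0.05):
    """Two-sided power for comparing two group means (normal approximation)."""
    nd = NormalDist()
    se_diff = (2.0 * var_mean_per_group) ** 0.5   # both groups share the design
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(effect) / se_diff
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical pilot-study variance components (in Cq units squared).
    for subjects in (4, 8, 16):
        v = variance_of_group_mean(1.0, 0.5, 0.25, subjects, 2, 2)
        print(subjects, round(v, 4), round(approximate_power(1.0, v), 3))
```

Because the subject-level variance is divided only by the number of subjects, adding subjects reduces the total variance more than adding qPCR replicates does, which is why a pilot estimate of each component helps when allocating a fixed budget.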
The Prime Technique - Real-time PCR Data Analysis Mikael Kubista, Institute of Molecular Genetics and TATAA Biocenter, Sweden Radek Sindelka, Institute of Molecular Genetics, Czech Republic G.I.T. Laboratory Journal 9-10/2007, pp 33-35, GIT VERLAG GmbH & Co. KG, Darmstadt For measuring gene expression there is only one technique: PCR. But how can it be used with maximum efficiency? This article tries to give the answer to that question.
Gene expression profiling – Clusters of possibilities Anders Bergkvist, Vendula Rusnakova, Radek Sindelka, Jose Manuel Andrade Garda, Björn Sjögreen, Daniel Lindh, Amin Forootan, Mikael Kubista Methods Vol 50, Issue 4, April 2010, in The ongoing Evolution of qPCR edited by Michael W. Pfaffl, Pages 323-335 Advances in qPCR
technology allow studies of increasingly large systems
comprising many genes and samples. The increasing data
sizes allow expression profiling both in the gene and
the samples dimension while also putting higher
demands on sound statistical analysis and expertise to
handle and interpret its results. We distinguish
between exploratory and confirmatory statistical
studies. In this paper we demonstrate several
techniques available for exploratory studies on a
system of Xenopus laevis development from egg to
tadpole. Techniques include hierarchical clustering,
heatmap, principal component analysis and
self-organizing maps. We stress that even though
exploratory studies are excellent for generating
hypotheses, results have not been proven statistically
significant until an independent confirmatory study
has been performed. An exploratory study may certainly
be valuable in its own right, and there are often not
enough resources to report both an exploratory and a
confirmatory study at the same time. However,
exploratory and confirmatory studies are intimately
connected and we would like to raise that awareness
among qPCR practitioners. We suggest that scientific
reports should always have a hypothesis focus. Reports
are either hypothesis generating, from an exploratory
study, or hypothesis validating, from a confirmatory
study, or both. In either case, we suggest the
generated or validated hypotheses be specifically
stated.
Download latest Genex version here => http://genex.gene-quantification.info/
Validation of differential gene expression algorithms: application comparing fold-change estimation to hypothesis testing. Yanofsky CM, Bickel DR. Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology, and Immunology, University of Ottawa, Ottawa, Ontario, Canada. BMC Bioinformatics. 2010 Jan 28;11:63. BACKGROUND: Sustained research on the problem of determining which genes are differentially expressed on the basis of microarray data has yielded a plethora of statistical algorithms, each justified by theory, simulation, or ad hoc validation and yet differing in practical results from equally justified algorithms. Recently, a concordance method that measures agreement among gene lists has been introduced to assess various aspects of differential gene expression detection. This method has the advantage of basing its assessment solely on the results of real data analyses, but as it requires examining gene lists of given sizes, it may be unstable. RESULTS: Two methodologies for assessing predictive error are described: a cross-validation method and a posterior predictive method. As a nonparametric method of estimating prediction error from observed expression levels, cross validation provides an empirical approach to assessing algorithms for detecting differential gene expression that is fully justified for large numbers of biological replicates. Because it leverages the knowledge that only a small portion of genes are differentially expressed, the posterior predictive method is expected to provide more reliable estimates of algorithm performance, allaying concerns about limited biological replication. In practice, the posterior predictive method can assess when its approximations are valid and when they are inaccurate. Under conditions in which its approximations are valid, it corroborates the results of cross validation. Both comparison methodologies are applicable to both single-channel and dual-channel microarrays. For the data sets considered, estimating prediction error by cross validation demonstrates that empirical Bayes methods based on hierarchical models tend to outperform algorithms based on selecting genes by their fold changes or by non-hierarchical model-selection criteria. (The latter two approaches have comparable performance.) The posterior predictive assessment corroborates these findings. CONCLUSIONS: Algorithms for detecting differential gene expression may be compared by estimating each algorithm's error in predicting expression ratios, whether such ratios are defined across microarray channels or between two independent groups. According to two distinct estimators of prediction error, algorithms using hierarchical models outperform the other algorithms of the study. The fact that fold-change shrinkage performed as well as conventional model selection criteria calls for investigating algorithms that combine the strengths of significance testing and fold-change estimation.
Automated validation of polymerase chain reaction amplicon melting curves. Mann TP, Humbert R, Stamatoyannopolous JA, Noble WS. Department of Genome Sciences, University of Washington, Seattle, WA, USA J Bioinform Comput Biol. 2006 4(2):299-315. The polymerase
chain reaction (PCR) is a fundamental tool of
molecular biology. Quantitative PCR is the
gold-standard methodology for determination of DNA
copy numbers, quantitating transcription, and numerous
other applications. A major barrier to large-scale
application of PCR for quantitative genomic analyses
is the current requirement for manual validation of
individual PCRs to ensure generation of a single
product. This typically requires visual inspection
either of gel electrophoreses or temperature
dissociation ("melting") curves of individual PCRs--a
time-consuming and costly process. Here we describe a
robust computational solution to this fundamental
problem. Using a training set of 10 080 reactions
comprising multiple quantitative PCRs from each of
1728 unique human genomic amplicons, we developed a
support vector machine classifier capable of
discriminating single-product PCRs with better than
99% accuracy. This approach has broad utility, and
eliminates a major bottleneck to widespread
application of PCR for high-throughput genomic
applications.
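Illustration (not part of the abstract): the manual check that the authors automate amounts to asking whether a melting curve shows one product or several. The sketch below uses a much simpler heuristic than their support vector machine: it counts peaks in the negative derivative of fluorescence with respect to temperature. The function names, the 20 % peak threshold and the synthetic curves are assumptions for illustration only.

```python
import math


def negative_derivative(temps, fluor):
    """-dF/dT between consecutive temperature readings."""
    return [-(fluor[i + 1] - fluor[i]) / (temps[i + 1] - temps[i])
            for i in range(len(temps) - 1)]


def count_melting_peaks(temps, fluor, min_fraction=0.2):
    """Count local maxima of -dF/dT that reach min_fraction of the tallest one."""
    d = negative_derivative(temps, fluor)
    if not d or max(d) <= 0:
        return 0
    threshold = min_fraction * max(d)
    peaks = 0
    for i in range(1, len(d) - 1):
        if d[i] >= threshold and d[i] > d[i - 1] and d[i] >= d[i + 1]:
            peaks += 1
    return peaks


def synthetic_melt_curve(temps, melting_points, width=0.8):
    """Hypothetical test data: equal-weight sum of sigmoidal fluorescence drops."""
    return [sum(1.0 / (1.0 + math.exp((t - tm) / width)) for tm in melting_points)
            / len(melting_points)
            for t in temps]


if __name__ == "__main__":
    temps = [70.0 + 0.5 * i for i in range(51)]          # 70 to 95 degrees C
    single = synthetic_melt_curve(temps, [84.8])          # one product
    double = synthetic_melt_curve(temps, [78.2, 86.3])    # two products
    print(count_melting_peaks(temps, single), count_melting_peaks(temps, double))
```

A fixed threshold like this is fragile on noisy instrument data, which is one reason a trained classifier is attractive for high-throughput use.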
Statistical models in assessing fold change of gene expression in real-time RT-PCR experiments Fu WJ, Hu J, Spencer T, Carroll R, Wu G. Department of Epidemiology, Michigan State University, East Lansing, MI 48824, USA. Comput Biol Chem. 2006 30(1): 21-6. Real-time RT-PCR
has been frequently used in quantitative research in
molecular biology and bioinformatics. It provides
remarkably useful technology to assess expression of
genes. Although mathematical models for gene
amplification process have been studied, statistical
models and methods for data analysis in real-time
RT-PCR have received little attention. In this paper,
we briefly introduce current mathematical models, and
study statistical models for real-time RT-PCR data. We
propose a generalized estimation equations (GEE) model
that properly reflects the structure of repeated data
in RT-PCR experiments for both cross-sectional and
longitudinal data. The GEE model takes the correlation
between observations within the same subjects into
consideration, and so avoids producing false
positives or false negatives. We further demonstrate
with a set of actual real-time RT-PCR data that
different statistical models yield different
estimations of fold change and confidence interval.
The SAS program for data analysis using the GEE model
is provided to facilitate easy computation for
non-statistical professionals.
The Importance of Quality Control During qPCR Data Analysis Barbara D’haene, Ph.D. & Jan Hellemans, Ph.D. Biogazelle & Ghent University Drug Discovery - August/September 2010 Introduction: Since its introduction in 1993, qPCR has paved its way towards one of the most popular techniques in modern molecular biology [1]. Despite its apparent simplicity, which makes qPCR such an attractive technology for many researchers, final results are often compromised due to unsound experimental design, a lack of quality control, improper data analysis, or a combination of these. To address the concerns that have been raised about the quality of published qPCR-based research, specialists in the qPCR field have introduced the MIQE guidelines for publication of qPCR-based results [2]. The main purpose of this initiative is to make qPCR-based research transparent, but the MIQE guidelines may also serve as a practical framework to obtain high-quality results. Within the guidelines, quality control at each step of the qPCR workflow, from experimental design to data analysis, is brought to the attention as a necessity to ensure trustworthy results. Numerous papers have been written about assay and sample quality control [3], but less attention has been spent on quality control on post-qPCR data. This article summarizes recommendations for this latter type of quality control including: detection of abnormal amplification, inspection of melting curves, control on PCR replicate variation, assessment of positive and negative control samples, determination of reference gene expression stability, and evaluation of deviating sample normalization factors.
Error bars in experimental biology Geoff Cumming (1), Fiona Fidler (1), and David L. Vaux (2); (1) School of Psychological Science and (2) Department of Biochemistry, La Trobe University, Melbourne, Victoria, Australia 3086 Error bars commonly appear in figures in publications, but experimental biologists are often unsure how they should be used and interpreted. In this article we illustrate some basic features of error bars and explain how they can help communicate data and assist correct interpretation. Error bars may show confidence intervals, standard errors, standard deviations, or other quantities. Different types of error bars give quite different information, and so figure legends must make clear what error bars represent. We suggest eight simple rules to assist with effective use and interpretation of error bars.
Automatic Genomics: a user-friendly program for the automatic designing and plate loading of medium-throughput qPCR experiments Callejas S, Alvarez R, Dopazo A. Genomics Unit, Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, Spain. Biotechniques. 2011 50(1):46-50. Quantitative PCR (qPCR) remains the method of choice for gene and microRNA (miRNA) expression studies. Many laboratories wish to automate some or all of the steps of medium-throughput qPCR experiments through the use of various types of liquid handling robots. However, it is not uncommon to find cases in which scripts provided by the robot supplier are too rigid for user-specific applications, do not include all the desired options, or are too complicated to be modified by a nonprofessional programmer. Here, we present Automatic Genomics, a program that allows users with a limited programming background to automate medium-throughput qPCR experiments by using commercially available liquid-handling robots. The user is able to optimize the plate design in terms of number of genes, number of samples, and controls.
Interactive analysis of systems biology molecular expression data Zhang M, Ouyang Q, Stephenson A, Kane MD, Salt DE, Prabhakar S, Burgner J, Buck C, Zhang X. Bindley Bioscience Center, Purdue University, West Lafayette, IN 47907, USA. BMC Syst Biol. 2008 2:23. BACKGROUND: Systems biology aims to understand biological systems on a comprehensive scale, such that the components that make up the whole are connected to one another and work through dependent interactions. Molecular correlations and comparative studies of molecular expression are crucial to establishing interdependent connections in systems biology. The existing software packages provide limited data mining capability. The user must first generate visualization data with a preferred data mining algorithm and then upload the resulting data into the visualization package for graphic visualization of molecular relations. RESULTS: Presented is a novel interactive visual data mining application, SysNet that provides an interactive environment for the analysis of high data volume molecular expression information of most any type from biological systems. It integrates interactive graphic visualization and statistical data mining into a single package. SysNet interactively presents intermolecular correlation information with circular and heatmap layouts. It is also applicable to comparative analysis of molecular expression data, such as time course data. CONCLUSION: The SysNet program has been utilized to analyze elemental profile changes in response to an increasing concentration of iron (Fe) in growth media (an ionomics dataset). This study case demonstrates that the SysNet software is an effective platform for interactive analysis of molecular expression information in systems biology.
Roadmap for developing and validating therapeutically relevant genomic classifiers Simon R. National Cancer Institute, 9000 Rockville Pike, MSC 7434, Bethesda, MD 20892, USA J Clin Oncol. 2005 23(29):7332-41. Epub 2005 Sep 6.
Oncologists need
improved tools for selecting treatments for individual
patients. The development of therapeutically relevant
prognostic markers has traditionally been slowed by
poor study design, inconsistent findings, and lack of
proper validation studies. Microarray expression
profiling provides an exciting new technology for
relating tumor gene expression to patient outcome, but
it also provides increased challenges for translating
initial research findings into robust diagnostics that
benefit patients and physicians in therapeutic
decision making. This article attempts to clarify some
of the misconceptions about the development and
validation of multigene expression signature
classifiers and highlights the steps needed to move
genomic signatures into clinical application as
therapeutically relevant and robust diagnostics.
Cluster analysis and display of genome-wide expression patterns Eisen MB, Spellman PT, Brown PO, Botstein D. Department of Genetics, Stanford University School of Medicine, 300 Pasteur Avenue, Stanford, CA 94305, USA. Proc Natl Acad Sci U S A. 1998 95(25):14863-8. A system of
cluster analysis for genome-wide expression data from
DNA microarray hybridization is described that uses
standard statistical algorithms to arrange genes
according to similarity in pattern of gene expression.
The output is displayed graphically, conveying the
clustering and the underlying expression data
simultaneously in a form intuitive for biologists. We
have found in the budding yeast Saccharomyces
cerevisiae that clustering gene expression data groups
together efficiently genes of known similar function,
and we find a similar tendency in human data. Thus
patterns seen in genome-wide expression experiments
can be interpreted as indications of the status of
cellular processes. Also, coexpression of genes of
known function with poorly characterized or novel
genes may provide a simple means of gaining leads to
the functions of many genes for which information is
not available currently.
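Illustration (not part of the abstract): arranging genes by similarity of expression pattern can be sketched with a small agglomerative clustering that uses Pearson correlation as the similarity measure and average linkage to merge groups. This is a minimal stand-in written for this page, not the authors' software. The gene names and values are invented, and the quadratic search is only suitable for small examples.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Distance between genes is 1 - Pearson r; distance between clusters
    is the average over all cross-cluster gene pairs.
    """
    clusters = [[name] for name in profiles]

    def distance(c1, c2):
        return mean([1.0 - pearson(profiles[a], profiles[b])
                     for a in c1 for b in c2])

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: distance(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    # Hypothetical log-expression profiles over four time points.
    profiles = {
        "g1": [1.0, 2.0, 3.0, 4.0],
        "g2": [2.0, 4.0, 6.0, 8.5],
        "g3": [4.0, 3.0, 2.0, 1.0],
        "g4": [8.0, 6.0, 4.0, 2.5],
    }
    print(average_linkage(profiles, 2))
```

Using correlation rather than Euclidean distance groups genes by the shape of their response, so g1 and g2 end up together even though their absolute levels differ.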
Biomarker Discovery via RT-qPCR and Bioinformatical Validation Christiane Becker, Irmgard Riedmaier, and Michael W. Pfaffl Book chapter 18 in PCR Technology - Current Innovations, Third Edition, Pages 259–270 Editors: Tania Nolan and Stephen A. Bustin; CRC Press 2013, Print ISBN: 978-1-4398-4805-0 There is a growing interest in life science research in the use of expressed transcripts that form the basis of biological markers (biomarkers) and in addressing some of the challenging statistical issues that arise when attempting to validate them. Biomarkers have extensively been used across diagnostic and therapeutic areas of many life science disciplines, including clinical, physiological, biochemical, developmental, morphological, and molecular applications. Biomarkers have been defined as “cellular, biochemical or molecular alterations that are measurable in biological media such as human tissues, cells, or fluids.” The official definition, developed by the “Biomarkers definitions working group” of the NIH is: “A biomarker is a characteristic that is objectively measured and evaluated as an indicator of normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention.” More recently the definition has been broadened to include more biological characteristics that can be objectively measured and evaluated as a biological indicator. A biomarker can refer to any measurable molecular, biochemical, cellular, or morphological alterations in biological media such as human tissues, cells, or fluids.
How to perform RT-qPCR accurately in plant species? A case study on flower colour gene expression in an azalea (Rhododendron simsii hybrids) mapping population. De Keyser E, Desmet L, Van Bockstaele E, De Riek J. Institute for Agricultural and Fisheries Research (ILVO)-Plant Sciences Unit, Caritasstraat 21, 9090, Melle, Belgium. BMC Mol Biol. 2013 Jun 24;14(1):13 BACKGROUND: Flower colour variation is one of the most crucial selection criteria in the breeding of a flowering pot plant, as is also the case for azalea (Rhododendron simsii hybrids). Flavonoid biosynthesis was studied intensively in several species. In azalea, flower colour can be described by means of a 3-gene model. However, this model does not clarify pink-coloration. The last decade gene expression studies have been implemented widely for studying flower colour. However, the methods used were often only semi-quantitative or quantification was not done according to the MIQE-guidelines. We aimed to develop an accurate protocol for RT-qPCR and to validate the protocol to study flower colour in an azalea mapping population. RESULTS: An accurate RT-qPCR protocol had to be established. RNA quality was evaluated in a combined approach by means of different techniques e.g. SPUD-assay and Experion-analysis. We demonstrated the importance of testing noRT-samples for all genes under study to detect contaminating DNA. In spite of the limited sequence information available, we prepared a set of 11 reference genes which was validated in flower petals; a combination of three reference genes was most optimal. Finally we also used plasmids for the construction of standard curves. This allowed us to calculate gene-specific PCR efficiencies for every gene to assure an accurate quantification. The validity of the protocol was demonstrated by means of the study of six genes of the flavonoid biosynthesis pathway. No correlations were found between flower colour and the individual expression profiles. However, the combination of early pathway genes (CHS, F3H, F3'H and FLS) is clearly related to co-pigmentation with flavonols. The late pathway genes DFR and ANS are to a minor extent involved in differentiating between coloured and white flowers. Concerning pink coloration, we could demonstrate that the lower intensity in this type of flowers is correlated to the expression of F3'H. CONCLUSIONS: Currently in plant research, validated and qualitative RT-qPCR protocols are still rare. The protocol in this study can be implemented on all plant species to assure accurate quantification of gene expression. We have been able to correlate flower colour to the combined regulation of structural genes, both in the early and late branch of the pathway. This allowed us to differentiate between flower colours in a broader genetic background as was done so far in flower colour studies. These data will now be used for eQTL mapping to comprehend even more the regulation of this pathway.
External oligonucleotide standards enable cross laboratory comparison and exchange of real-time quantitative PCR data. Vermeulen J, Pattyn F, De Preter K, Vercruysse L, Derveaux S, Mestdagh P, Lefever S, Hellemans J, Speleman F, Vandesompele J. Center for Medical Genetics, Ghent University Hospital, Belgium. Nucleic Acids Res. 2009 Nov;37(21):e138 The quantitative polymerase chain reaction (qPCR) is widely utilized for gene expression analysis. However, the lack of robust strategies for cross laboratory data comparison hinders the ability to collaborate or perform large multicentre studies conducted at different sites. In this study we introduced and validated a workflow that employs universally applicable, quantifiable external oligonucleotide standards to address this question. Using the proposed standards and data-analysis procedure, we obtained a perfect concordance between expression values from eight different genes in 366 patient samples measured on three different qPCR instruments and matching software, reagents, plates and seals, demonstrating the power of this strategy to detect and correct inter-run variation and to enable exchange of data between different laboratories, even when not using the same qPCR platform.
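Illustration (not part of the abstract): one common way to put Cq values from different runs on a shared scale is to fit a standard curve to a dilution series of a quantified standard in each run and convert sample Cq values to copy numbers. The sketch below shows that generic calculation, including the usual relation between slope and amplification efficiency. It is not the authors' workflow. The function names and the dilution-series numbers are hypothetical.

```python
import math


def fit_standard_curve(log10_copies, cq):
    """Least-squares line Cq = slope * log10(copies) + intercept."""
    n = len(cq)
    mx = sum(log10_copies) / n
    my = sum(cq) / n
    sxx = sum((x - mx) ** 2 for x in log10_copies)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log10_copies, cq))
    slope = sxy / sxx
    return slope, my - slope * mx


def efficiency_from_slope(slope):
    """Amplification efficiency as a fraction (1.0 means 100 %, perfect doubling)."""
    return 10.0 ** (-1.0 / slope) - 1.0


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate starting copies for a sample."""
    return 10.0 ** ((cq - intercept) / slope)


if __name__ == "__main__":
    # Hypothetical tenfold dilution series of an external standard.
    log10_copies = [2.0, 3.0, 4.0, 5.0, 6.0]
    cq = [31.36, 28.04, 24.72, 21.40, 18.08]
    slope, intercept = fit_standard_curve(log10_copies, cq)
    print(round(slope, 3), round(intercept, 2))
    print(round(efficiency_from_slope(slope), 3))
    print(round(copies_from_cq(24.72, slope, intercept)))
```

Because each run is calibrated against the same physical standard, differences in instrument, reagents or threshold settings are absorbed into the run-specific slope and intercept rather than into the reported quantities.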
SPUD qPCR assay confirms PREXCEL-Q software's ability to avoid qPCR inhibition. Gallup JM, Sow FB, Van Geelen A, Ackermann MR. Department of Veterinary Pathology, Iowa State University, Ames, 50011-1250, USA. Curr Issues Mol Biol. 2010; 12(3): 129-134 Real-time quantitative polymerase chain reaction is subject to inhibition by substances that co-purify with nucleic acids during isolation and preparation of samples. Such materials alter the activity of reverse transcriptase (RT) and thermostable DNA polymerase enzymes on which the assay depends. When removal of inhibitory substances by column or reagent-based methods fails or is incomplete, the remaining option of appropriately, precisely and differentially diluting samples and standards to non-inhibitory concentrations is often avoided due to the logistic problem it poses. To address this, we invented the PREXCEL-Q software program to automate the process of calculating the non-inhibitory dilutions for all samples and standards after a preliminary test plate has been performed on an experimental sample mixture. The SPUD assay was used to check for inhibition in each PREXCEL-Q-designed qPCR reaction. When SPUD amplicons or SPUD amplicon-containing plasmids were spiked equally into each qPCR reaction, all reactions demonstrated complete absence of qPCR inhibition. Reactions spiked with about 15,500 SPUD amplicons yielded a Cq of 27.39 ± 0.28 (at about 80.8% efficiency), while reactions spiked with about 7,750 SPUD plasmids yielded a Cq of 23.82 ± 0.15 (at about 97.85% efficiency). This work demonstrates that PREXCEL-Q sample and standard dilution calculations ensure avoidance of qPCR inhibition.
Mathematical analysis of the Real Time Array PCR (RTA PCR) process. J. Frits Dijksman and Anke Pierik Chemical Engineering Science (2012) vol. 71 March 26, 2012. p. 496-506 Real Time Array PCR is a recently developed biochemical technique that measures amplification curves (like quantitative real time Polymerase Chain Reaction (qPCR)) of a multitude of different templates in a sample. It combines two different techniques to profit from the advantages of both techniques, namely qPCR (real time quantitative detection) with microarrays (high multiplex capability). This enables the quantitative detection of many more target sequences than can be done by qPCR. Thereby, the concentration of the many different target molecules originally present in a sample can be measured. Labeled primers are used that are first elongated to form labeled amplicons in the bulk and these can hybridize to capture probes immobilized on the surface of the microarray. During each PCR cycle, there is a time window available during which the formed labeled amplicons can hybridize to the target sequences on the microarray surface. By detection of the fluorescence of the spots on the microarray, amplification curves comparable to real time PCR can be obtained, which can be used to deduce the information needed on the presence and the amount of targets originally present in the sample. We present a mathematical model that provides fundamental insights in the different steps of Real Time Array PCR and that can be used to optimize the different biochemical processes taking place. At the microarray surface specific molecules are captured and taken away from the solution, causing a concentration gradient that powers a material flow towards the microarray surface. Only the labeled strand of the amplicon is captured by the probes on the microarray surface and as a result locally the PCR process is not symmetric anymore. Moreover, in course of the process more and more ssDNA renatures, leaving relatively less strands and complexes available for hybridization. We found that to a large extent, however, the surface fluorescence scales with the bulk concentration. Important parameters to optimize are the enzyme concentration and degradation, the primer concentration and the capture probe decay rate. Also the surface hybridization time is critical since the time to reach a steady state is at least one order of magnitude longer compared to the timing of the bulk processes in qPCR.
Selecting control genes for RT-QPCR using public microarray data. Popovici V, Goldstein DR, Antonov J, Jaggi R, Delorenzi M, Wirapati P. Bioinformatics Core Facility, Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland. BMC Bioinformatics. 2009 Feb 2;10:42 BACKGROUND: Gene expression analysis has emerged as a major biological research area, with real-time quantitative reverse transcription PCR (RT-QPCR) being one of the most accurate and widely used techniques for expression profiling of selected genes. In order to obtain results that are comparable across assays, a stable normalization strategy is required. In general, the normalization of PCR measurements between different samples uses one to several control genes (e.g. housekeeping genes), from which a baseline reference level is constructed. Thus, the choice of the control genes is of utmost importance, yet there is not a generally accepted standard technique for screening a large number of candidates and identifying the best ones. RESULTS: We propose a novel approach for scoring and ranking candidate genes for their suitability as control genes. Our approach relies on publicly available microarray data and allows the combination of multiple data sets originating from different platforms and/or representing different pathologies. The use of microarray data allows the screening of tens of thousands of genes, producing very comprehensive lists of candidates. We also provide two lists of candidate control genes: one which is breast cancer-specific and one with more general applicability.
Two genes from the breast cancer list which had not been previously used as control genes are identified and validated by RT-QPCR. Open source R functions are available at http://www.isrec.isb-sib.ch/~vpopovic/research/ CONCLUSION: We proposed a new method for identifying candidate control genes for RT-QPCR which was able to rank thousands of genes according to some predefined suitability criteria and we applied it to the case of breast cancer. We also empirically showed that translating the results from microarray to PCR platform was achievable.
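Illustration (not part of the abstract): ranking candidate control genes by a suitability criterion can be sketched with a deliberately simple score, the standard deviation of log-scale expression across samples, combined with a minimum mean expression filter. This stand-in is not the scoring method of the paper and not its R code. The function name, the filter and the example values are assumptions.

```python
from statistics import mean, stdev


def rank_control_candidates(log_expr, min_mean=0.0):
    """Return gene names ordered from most to least stable.

    log_expr maps a gene name to its log-scale expression values across
    samples. Genes whose mean falls below min_mean are dropped, since very
    low signals tend to be unreliable as references.
    """
    scored = []
    for gene, values in log_expr.items():
        if mean(values) < min_mean:
            continue
        scored.append((stdev(values), gene))
    return [gene for _, gene in sorted(scored)]


if __name__ == "__main__":
    # Hypothetical log2 expression values for three genes in four samples.
    data = {
        "A": [10.0, 10.1, 9.9, 10.0],
        "B": [8.0, 9.5, 7.0, 10.0],
        "C": [2.0, 2.0, 2.1, 2.0],
    }
    print(rank_control_candidates(data))
    print(rank_control_candidates(data, min_mean=5.0))
```

Any ranking obtained this way from array data is only a shortlist; as the abstract notes, candidates still need to be validated by RT-QPCR.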