Data Analysis and BioInformatics in real-time qPCR (2)
Posttranscriptional Expression Regulation -- What Determines Translation Rates? Regina Brockmann, Andreas Beyer, Juergen J. Heinisch, Thomas Wilhelm. Theoretical Systems Biology, Leibniz Institute for Age Research, Fritz Lipmann Institute, Jena, Germany; Fachbereich Biologie/Chemie, AG Genetik, Universitaet Osnabrueck, Osnabrueck, Germany; Department of Bioengineering, University of California San Diego, La Jolla, California, USA. Recent analyses
indicate that differences in protein concentrations
are only 20%–40% attributable to variable mRNA
levels, underlining the importance of
posttranscriptional regulation. Generally, protein
concentrations depend on the translation rate (which
is proportional to the translational activity, TA)
and the degradation rate. By integrating 12
publicly available large-scale datasets and
additional database information of the yeast
Saccharomyces cerevisiae, we systematically analyzed five factors contributing to TA: mRNA concentration, ribosome density, ribosome occupancy, the codon adaptation index, and a newly developed "tRNA adaptation index". Our analysis of the functional relationship between the TA and measured protein concentrations suggests that the TA follows Michaelis–Menten kinetics. The calculated TA, together with measured protein concentrations, allowed us to estimate degradation rates for 4,125 proteins under standard conditions. A significant correlation to recently published degradation rates supports our approach. Moreover, based on a newly developed scoring system, we identified and analyzed genes subjected to the posttranscriptional regulation mechanism "translation on demand". Next we applied these findings to publicly available data of protein and mRNA concentrations under four stress conditions. The integration of these measurements allowed us to compare the condition-specific responses at the posttranscriptional level. Our analysis of all 62 proteins that have been measured under all four conditions revealed proteins with very specific posttranscriptional stress responses, in contrast to more generic responders, which were nonspecifically regulated under several conditions.
The concept of specific and generic responders is
known for transcriptional regulation. Here we show
that it also holds true at the
posttranscriptional level.
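To make the quantitative logic concrete, the following short sketch (hypothetical numbers and a generic SciPy fit, not the authors' pipeline) fits a saturating Michaelis–Menten-type curve to protein concentration as a function of TA and derives relative degradation rates from the steady-state assumption dP/dt = k_syn*TA - k_deg*P = 0.

```python
# Illustrative sketch only: hypothetical values, not the authors' data or pipeline.
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical translational activities (arbitrary units) and measured protein levels
ta = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0])
protein = np.array([120., 230., 410., 700., 1050., 1320., 1480.])

def michaelis_menten(x, vmax, km):
    """Saturating Michaelis-Menten-type relationship."""
    return vmax * x / (km + x)

params, _ = curve_fit(michaelis_menten, ta, protein, p0=[protein.max(), ta.mean()])
vmax, km = params
print(f"fitted Vmax = {vmax:.1f}, Km = {km:.2f}")

# Steady state dP/dt = k_syn*TA - k_deg*P = 0 gives k_deg proportional to TA/P
k_deg = ta / protein          # relative degradation rates (arbitrary units)
print("relative degradation rates:", np.round(k_deg, 4))
```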
How does gene expression clustering work?
Patrik D'haeseleer, Nature Biotechnology 23, 1499 - 1501 (2005) Patrik
D'haeseleer is in the Microbial Systems Division,
Biosciences Directorate, Lawrence Livermore National
Laboratory, PO Box 808, L-448, Livermore, California
94551, USA.
Clustering
is often one of the first steps in gene expression
analysis. How do clustering algorithms work, which
ones should we use and what can we expect from them?
Our ability to
gather genome-wide expression data has far
outstripped the ability of our puny human brains to
process the raw data. We can distill the data down
to a more comprehensible level by subdividing the
genes into a smaller number of categories and then
analyzing those. This is where clustering comes in.
The goal of clustering is to subdivide
a set of items (in our case, genes) in such a way
that similar items fall into the same cluster,
whereas dissimilar items fall in different clusters.
This brings up two questions: first, how do we
decide what is similar; and second, how do we use
this to cluster the items? The fact that these two
questions can often be answered independently
contributes to the bewildering variety of clustering
algorithms. Gene expression
clustering allows an open-ended exploration of the
data, without getting lost among the thousands of
individual genes. Beyond simple visualization, there
are also some important computational applications
for gene clusters. For example, Tavazoie et al.1
used clustering to identify cis-regulatory sequences
in the promoters of tightly coexpressed genes. Gene
expression clusters also tend to be significantly
enriched for specific functional categories—which
may be used to infer a functional role for unknown
genes in the same cluster. In this
primer, I focus specifically on clustering genes
that show similar expression patterns across a
number of samples, rather than clustering the
samples themselves (or both). I hope to leave you
with some understanding of clustering in general and
three of the more popular algorithms in particular.
Where possible, I also attempt to provide some
practical guidelines for applying cluster analysis
to your own gene expression data sets.
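The two ingredients described here, a similarity measure and a rule for grouping, can be illustrated with a minimal hierarchical-clustering sketch on a random expression matrix; the correlation distance and average linkage below are illustrative choices, not recommendations from the primer.

```python
# Minimal gene-expression clustering sketch: correlation distance + average-linkage hierarchy.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
expr = rng.normal(size=(100, 12))      # 100 genes x 12 samples (placeholder data)

dist = pdist(expr, metric="correlation")   # 1 - Pearson correlation between gene profiles
tree = linkage(dist, method="average")     # agglomerative clustering on that distance
labels = fcluster(tree, t=6, criterion="maxclust")  # cut the tree into 6 clusters

for k in range(1, 7):
    print(f"cluster {k}: {np.sum(labels == k)} genes")
```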
Evaluation of gene-expression clustering
via mutual information distance measure.
BACKGROUND: The definition of a distance measure
plays a key role in the evaluation of different
clustering solutions of gene expression profiles. In
this empirical study we compare different clustering
solutions when using the Mutual Information (MI)
measure versus the use of the well known Euclidean
distance and Pearson correlation coefficient. RESULTS:
Relying on several public gene expression datasets, we
evaluate the homogeneity and separation scores of
different clustering solutions. It was found that the
use of the MI measure yields a more significant
differentiation among erroneous clustering solutions.
The proposed measure was also used to analyze the
performance of several known clustering algorithms. A
comparative study of these algorithms reveals that their best solutions are ranked almost oppositely under the different distance measures, despite the correspondence found between these measures when analysing the averaged scores of groups of solutions. CONCLUSIONS: In view of the results, further attention should be paid to the selection of a proper distance measure for analyzing the clustering of gene expression data. Priness I, Maimon O, Ben-Gal I. BMC Bioinformatics. 2007 Mar 30;8(1):111.
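A minimal sketch of an MI-based dissimilarity between two expression profiles is shown below; the histogram binning and the conversion of MI into a distance are illustrative choices and not necessarily the exact normalization used by Priness et al.

```python
# Histogram-based mutual information between two expression profiles,
# turned into a dissimilarity; binning and scaling are illustrative choices.
import numpy as np

def mutual_information(x, y, bins=8):
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def mi_distance(x, y, bins=8):
    # Simple dissimilarity: high MI -> small distance (one possible normalization)
    return 1.0 / (1.0 + mutual_information(x, y, bins))

rng = np.random.default_rng(1)
a = rng.normal(size=50)
b = a + rng.normal(scale=0.3, size=50)   # profile correlated with a
c = rng.normal(size=50)                  # unrelated profile
print(mi_distance(a, b), mi_distance(a, c))
```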
How to infer gene networks from expression profiles.
Mukesh Bansal, Vincenzo Belcastro, Alberto Ambesi-Impiombato & Diego di Bernardo Telethon Institute of Genetics and Medicine, Via P Castellino, Naples, Italy, European School of Molecular Medicine, Naples, Italy, Department of Natural Sciences, University of Naples ‘Federico II’, Naples, Italy and Department of Neuroscience, University of Naples ‘Federico II’, Naples, Italy These authors contributed equally to this work Systems Biology Lab, Telethon Institute of Genetics and Medicine, Via P Castellino 111, Naples 18131, Italy. Molecular Systems Biology 13 February 2007 Inferring, or
‘reverse-engineering’, gene networks can be defined
as the process of identifying gene interactions from
experimental data through computational analysis.
Gene expression data from microarrays are typically
used for this purpose. Here we compared different reverse-engineering algorithms for which ready-to-use software was available and that had been tested on experimental data sets. We show that reverse-engineering algorithms are indeed able to correctly infer regulatory interactions among genes, at least when one performs perturbation experiments complying with the algorithm requirements. These algorithms are superior to classic clustering algorithms for the purpose of finding regulatory interactions among genes.
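As a deliberately naive baseline for what "identifying gene interactions from experimental data" can mean, the toy sketch below scores gene pairs by absolute correlation across perturbation experiments and keeps the strongest edges; it is not one of the reverse-engineering algorithms compared in the paper.

```python
# Toy illustration of expression-based network inference: score gene pairs by
# absolute correlation across perturbation experiments and keep the strongest edges.
import numpy as np

rng = np.random.default_rng(2)
expr = rng.normal(size=(20, 15))   # 20 genes x 15 perturbation experiments (placeholder)

corr = np.corrcoef(expr)           # gene-by-gene correlation matrix
np.fill_diagonal(corr, 0.0)
edges = np.argwhere(np.abs(corr) > 0.6)   # hypothetical cutoff

for i, j in edges:
    if i < j:
        print(f"putative interaction: gene{i} -- gene{j} (r = {corr[i, j]:+.2f})")
```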
Distribution-insensitive cluster analysis in SAS on real-time PCR gene expression data of steadily expressed genes. Tichopad A, Pecen L, Pfaffl MW. Comput Methods Programs Biomed. 2006 Apr;82(1):44-50. Epub 2006. Cluster analysis is a tool often employed in microarray techniques but used less in real-time PCR. Herein we present core SAS code that uses the correlation coefficient, rather than the Euclidean distance, as a dissimilarity measure. The dissimilarity measure is made robust by using a rank-order correlation coefficient rather than a parametric one. There is no need for an overall probability adjustment as in scoring methods based on repeated pair-wise comparisons. The rank-order correlation matrix gives a good basis for the clustering of gene expression data obtained by real-time RT-PCR as it disregards the different expression levels. Associated with each cluster is a linear combination of the variables in the cluster, which is the first principal component. A large set of variables can then be replaced by the set of cluster components with little loss of information. In this way, distinct clusters containing unregulated housekeeping genes along with other steadily expressed genes can be disclosed and utilized for standardization purposes. Simulated data, in parallel with data from a biological experiment, were used to validate the SAS macro. For both cases, good intuitive results were obtained and, although further improvements are needed, the approach has reached a level of performance that makes it practically useful.
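The idea of the macro can be re-sketched outside SAS; the Python fragment below (placeholder data and an arbitrary number of clusters) uses Spearman rank correlation as the robust similarity, clusters on the resulting dissimilarity, and summarizes each cluster by its first principal component.

```python
# Idea of the SAS macro re-sketched in Python: Spearman rank correlation as the
# robust similarity, hierarchical clustering, and the first principal component
# of each cluster as its summary variable. Data and cluster count are placeholders.
import numpy as np
from scipy.stats import spearmanr
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(3)
expr = rng.normal(size=(12, 20))     # 12 genes x 20 samples (placeholder values)

rho, _ = spearmanr(expr, axis=1)     # rank-order correlation between gene profiles
dissim = 1.0 - rho                   # correlation-based dissimilarity (ignores expression level)
np.fill_diagonal(dissim, 0.0)
tree = linkage(squareform(dissim, checks=False), method="average")
labels = fcluster(tree, t=3, criterion="maxclust")

for k in np.unique(labels):
    block = expr[labels == k]
    centered = block - block.mean(axis=1, keepdims=True)
    # First principal component of the cluster = first right singular vector
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    print(f"cluster {k}: {block.shape[0]} genes, PC1 over samples:", np.round(vt[0], 2))
```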
A new molecular breast cancer subclass defined from
a large scale real-time quantitative RT-PCR study.
Maïa Chanrion, Hélène Fontaine, Carmen Rodriguez, Vincent Negre, Frédéric Bibeau, Charles Theillet, Alain Hénaut and Jean-Marie Darbon. INSERM U868, Cancer Research Centre, CRLC Val d'Aurelle-Paul Lamarque, Montpellier, France; UMS 2293 CNRS, University of Evry-Val d'Essonne, France. BMC Cancer 2007, 7:39. Background: Current histo-pathological prognostic factors are not very helpful in predicting the clinical outcome of breast cancer due to the disease's heterogeneity. Molecular profiling using a large panel of genes could help to classify breast tumours and to define signatures which are predictive of their clinical behaviour. Methods: To this aim, quantitative RT-PCR amplification was used to study the RNA expression levels of 47 genes in 199 primary breast tumours and 6 normal breast tissues. Genes were selected on the basis of their potential implication in hormonal sensitivity of breast tumours. Normalized RT-PCR data were analysed in an unsupervised manner by pairwise hierarchical clustering, and the statistical relevance of the defined subclasses was assessed by Chi2 analysis. The robustness of the selected subgroups was evaluated by classifying an external and independent set of tumours using these Chi2-defined molecular signatures. Results: Hierarchical clustering of gene expression data allowed us to define a series of tumour subgroups that were either reminiscent of previously reported classifications or represented putative new subtypes. The Chi2 analysis of these subgroups allowed us to define specific molecular signatures for some of them, whose reliability was further demonstrated by using the validation data set. A new breast cancer subclass, called subgroup 7, that we defined in this way was particularly interesting, as it gathered tumours with specific bioclinical features including a low rate of recurrence during a 5-year follow-up. Conclusion: The analysis of the expression of 47 genes in 199 primary breast tumours allowed us to classify them into a series of molecular subgroups. Subgroup 7, which was highlighted by our study, was remarkable as it gathered tumours with specific bioclinical features including a low rate of recurrence. Although this finding should be confirmed using a larger tumour cohort, it suggests that gene expression profiling with a minimal set of genes may allow the discovery of new subclasses of breast cancer that are characterized by specific molecular signatures and exhibit specific bioclinical features.
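The statistical relevance assessment mentioned above boils down to contingency-table testing; the fragment below shows a generic Chi2 test of whether a cluster-defined subgroup is associated with a clinical feature, using invented counts rather than the study's data.

```python
# Hypothetical contingency table: cluster-defined subgroup membership versus a
# clinical feature (e.g. recurrence within 5 years); numbers are made up for illustration.
from scipy.stats import chi2_contingency

#                 recurrence  no recurrence
table = [[ 3, 27],   # tumours in the candidate subgroup
         [45, 124]]  # all other tumours

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.3g}")
```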
Transcriptional regulatory network analysis of developing human erythroid progenitors reveals patterns of coregulation and potential transcriptional regulators.
Deciphering the molecular basis for human
erythropoiesis should yield information benefiting
studies of the hemoglobinopathies and other
erythroid disorders. We used an in vitro erythroid
differentiation system to study the developing red
blood cell transcriptome derived from adult CD34+
hematopoietic progenitor cells. mRNA expression
profiling was used to characterize developing
erythroid cells at six time points during
differentiation (days 1, 3, 5, 7, 9, and 11).
Eleven thousand seven hundred sixty-three genes
(20,963 Affymetrix probe sets) were expressed on
day 1, and 1,504 genes, represented by 1,953 probe
sets, were differentially expressed (DE) with 537
upregulated and 969 downregulated. A subset of the
DE genes was validated using real-time RT-PCR. The
DE probe sets were subjected to a cluster metric
and could be divided into two, three, four, five,
or six clusters of genes with different expression
patterns in each cluster. Genes in these clusters
were examined for shared transcription factor
binding sites (TFBS) in their promoters by
comparing enrichment of each TFBS relative to a
reference set using transcriptional regulatory
network analysis. The sets of TFBS enriched in
genes up- and downregulated during erythropoiesis
were distinct. This analysis identified
transcriptional regulators critical to erythroid
development, factors recently found to play a
role, as well as a new list of potential
candidates, including Evi-1, a potential silencer
of genes upregulated during erythropoiesis. Thus
this transcriptional regulatory network analysis
has yielded a focused set of factors and their
target genes whose role in differentiation of the
hematopoietic stem cell into distinct blood cell
lineages can be elucidated. Keller MA, Addya S, Vadigepalli R, Banini B, Delgrosso K, Huang H, Surrey S. Cardeza Foundation of Hematologic Research, Jefferson Medical College, Philadelphia, Pennsylvania 19107, USA. Physiol Genomics. 2006 Dec 13;28(1):114-128.
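Enrichment of a TFBS in a gene cluster relative to a reference set can be scored with a standard overrepresentation test; the sketch below uses Fisher's exact test on invented counts and is not the specific network-analysis pipeline of the study.

```python
# Enrichment of one transcription-factor binding site (TFBS) in a gene cluster
# relative to a reference set, scored with Fisher's exact test. Counts are invented.
from scipy.stats import fisher_exact

cluster_with_site, cluster_without = 40, 160        # promoters in the cluster
reference_with_site, reference_without = 300, 4500  # promoters in the reference set

odds, p = fisher_exact([[cluster_with_site, cluster_without],
                        [reference_with_site, reference_without]],
                       alternative="greater")
print(f"odds ratio = {odds:.2f}, enrichment p = {p:.2e}")
```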
An approach for clustering gene expression data with error information.
BACKGROUND:
Clustering of gene expression patterns is a
well-studied technique for elucidating trends
across large numbers of transcripts and for
identifying likely co-regulated genes. Even the
best clustering methods, however, are unlikely to
provide meaningful results if too much of the data
is unreliable. With the maturation of microarray
technology, a wealth of research on statistical
analysis of gene expression data has encouraged
researchers to consider error and uncertainty in
their microarray experiments, so that experiments
are being performed increasingly with repeat spots
per gene per chip and with repeat experiments. One
of the challenges is to incorporate the
measurement error information into downstream
analyses of gene expression data, such as
traditional clustering techniques. RESULTS: In this study, a clustering approach is presented which incorporates both gene expression values and error information about the expression measurements. Using repeat expression measurements, the error of each gene expression measurement in each experimental condition is estimated, and this measurement error information is incorporated directly into the clustering algorithm. The algorithm, CORE (Clustering Of Repeat Expression data), is presented and its performance is validated using statistical measures. By using error information about gene expression measurements, the clustering approach is less sensitive to noise in the underlying data and is able to achieve more accurate clusterings. Results are described for both synthetic expression data and real gene expression data from Escherichia coli and Saccharomyces cerevisiae. CONCLUSION: The additional information provided by replicate gene expression measurements is a valuable asset in effective clustering. Gene expression profiles with high errors, as determined from repeat measurements, may be unreliable and may associate with different clusters, whereas gene expression profiles with low errors can be clustered with higher specificity. Results indicate that including error information from repeat gene expression measurements can lead to significant improvements in clustering accuracy. Tjaden B. Computer Science Department, Wellesley College, Wellesley, MA 02481, USA. BMC Bioinformatics. 2006 Jan 12;7:17.
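One simple way to let measurement error inform clustering (not the CORE algorithm itself) is to down-weight noisy conditions by their replicate-derived variance when comparing two profiles, as in the sketch below.

```python
# Not the CORE algorithm itself: a minimal illustration of letting replicate-derived
# measurement error down-weight noisy conditions when comparing two expression profiles.
import numpy as np

def error_weighted_distance(x, y, var_x, var_y):
    """Chi-square-like distance: squared differences scaled by summed measurement variance."""
    return float(np.sum((x - y) ** 2 / (var_x + var_y)))

rng = np.random.default_rng(4)
reps_a = rng.normal(loc=[1, 2, 3, 4], scale=0.2, size=(3, 4))  # 3 replicates x 4 conditions
reps_b = rng.normal(loc=[1, 2, 3, 4], scale=0.2, size=(3, 4))

dist = error_weighted_distance(reps_a.mean(0), reps_b.mean(0),
                               reps_a.var(0, ddof=1), reps_b.var(0, ddof=1))
print(f"error-weighted distance: {dist:.2f}")
```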
Expression profiles and biological function.
Expression arrays facilitate the monitoring
of changes in expression patterns of large
collections of genes. It is generally expected
that genes with similar expression patterns would
correspond to proteins of common biological
function. We assess this common assumption by
comparing levels of similarity of expression
patterns and statistical significance of
biological terms that describe the corresponding
protein functions. Terms are automatically
obtained by mining large collections of Medline
abstracts. We propose that the combined use of the
tools for expression profiles clustering and
automatic function retrieval, can be useful tools
for the detection of biologically relevant
associations between genes in complex gene
expression experiments. The results obtained using
publicly available experimental data show how, in
general, an increase in the similarity of the
expression patterns is accompanied by an
enhancement of the amount of specific functional
information or, in other words, how the selected
terms became more specific following an increase
in the specificity of the expression patterns.
Particularly interesting are the discrepancies
from this general trend, i.e. groups of genes with
similar expression patterns but very little in
common at the functional level. In these cases the
similarity of their expression profiles becomes
the first link between previously unrelated genes. Oliveros JC, Blaschke C, Herrero J, Dopazo J, Valencia A. Protein Design Group, Centro Nacional de Biotecnología (CNB-CSIC), Campus de Cantoblanco, 28049 Madrid, Spain. Genome Inform Ser Workshop Genome Inform. 2000;11:106-117. Smoking and cancer-related gene expression
in bronchial epithelium and non-small-cell lung cancers.
Tobacco smoking is the leading cause of lung
cancer worldwide. Gene expression in surgically
resected and microdissected samples of
non-small-cell lung cancers (18 squamous cell
carcinomas and nine adenocarcinomas), matched
normal bronchial epithelium, and peripheral lung
tissue from both smokers (n = 22) and non-smokers
(n = 5) was studied using the Affymetrix U133A
array. A subset of 15 differentially regulated
genes was validated by real-time PCR or
immunohistochemistry. Hierarchical cluster
analysis clearly distinguished between benign and
malignant tissue and between squamous cell
carcinomas and adenocarcinomas. The bronchial
epithelium and adenocarcinomas could be divided
into the two subgroups of smokers and non-smokers.
By comparison of the gene expression profiles in
the bronchial epithelium of non-smokers, smokers,
and matched cancer tissues, it was possible to
identify a signature of 23 differentially
expressed genes, which might reflect early
cigarette smoke-induced and cancer-relevant
molecular lesions in the central bronchial
epithelium of smokers. Ten of these genes are
involved in xenobiotic metabolism and redox stress
(eg AKR1B10, AKR1C1, and MT1K). One gene is a
tumour suppressor gene (HLF); two genes act as
oncogenes (FGFR3 and LMO3); two genes are involved
in matrix degradation (MMP12 and PTHLH); three
genes are related to cell differentiation (SPRR1B,
RTN1, and MUC7); and five genes have not been well
characterized to date. By comparison of the
tobacco-exposed peripheral alveolar lung tissue of
smokers with non-smokers and with adenocarcinomas
from smokers, it was possible to identify a
signature of 27 other differentially expressed
genes. These genes are involved in the metabolism
of xenobiotics (eg GPX2 and FMO3) and may
represent cigarette smoke-induced, cancer-related
molecular targets that may be utilized to identify
smokers with increased risk for lung cancer. Woenckhaus M, Klein-Hitpass L, Grepmeier U, Merk J, Pfeifer M, Wild P, Bettstetter M, Wuensch P, Blaszyk H, Hartmann A, Hofstaedter F, Dietmaier W. Department of Pathology, University of Regensburg, Franz-Josef-Strauss-Allee 11, D-93053 Regensburg, Germany. J Pathol. 2006;210(2):192-204.
Error propagation in relative real-time
reverse transcription polymerase chain reaction quantification
models: The balance between accuracy and precision.
Oddmund Nordgård, Jan Terje Kvaløy, Ragne Kristin Farmen, Reino Heikkilä. Department of Hematology and Oncology, Stavanger University Hospital, 4068 Stavanger, Norway; Department of Mathematics and Natural Science, University of Stavanger, 4036 Stavanger, Norway; Division of Research and Human Resources, Stavanger University Hospital, 4068 Stavanger, Norway. Real-time reverse transcription polymerase chain reaction (RT-PCR) has gained wide popularity as a sensitive and reliable technique for mRNA quantification. The development of new mathematical models for such quantifications has generally paid little attention to the aspect of error propagation. In this study we evaluate, both theoretically and experimentally, several recent models for relative real-time RT-PCR quantification of mRNA with respect to random error accumulation. We present error propagation expressions for the most common quantification models and discuss the influence of the various components on the total random error. Normalization against a calibrator sample to improve comparability between different runs is shown to increase the overall random error in our system. On the other hand, normalization against multiple reference genes, introduced to improve accuracy, does not increase error propagation compared to normalization against a single reference gene. Finally, we present evidence that sample-specific amplification efficiencies determined from individual amplification curves primarily increase the random error of real-time RT-PCR quantifications and should be avoided. Our data emphasize that the gain of accuracy associated with new quantification models should be validated against the corresponding loss of precision.
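For a generic efficiency-corrected ratio model, first-order (delta-method) error propagation can be written in a few lines; the sketch below uses hypothetical Ct values, efficiencies and standard deviations and is not a reproduction of the paper's exact expressions.

```python
# First-order (delta-method) error propagation for an efficiency-corrected ratio
# R = E_t**dCt_t / E_r**dCt_r. A generic sketch, not the paper's exact expressions;
# all Ct values, efficiencies and SDs below are hypothetical.
import math

E_t, E_r = 1.95, 1.90                 # amplification efficiencies (target, reference)
dCt_t, dCt_r = 3.2, 0.4               # Ct(control) - Ct(sample) for target and reference
sd_dCt_t, sd_dCt_r = 0.25, 0.20       # SDs of the Ct differences (from replicates)

ratio = E_t ** dCt_t / E_r ** dCt_r
# ln R = dCt_t*ln(E_t) - dCt_r*ln(E_r), so to first order:
var_lnR = (math.log(E_t) * sd_dCt_t) ** 2 + (math.log(E_r) * sd_dCt_r) ** 2
cv = math.sqrt(var_lnR)               # approximate relative (fractional) SD of R

print(f"ratio = {ratio:.2f}, approximate CV = {100*cv:.1f}%")
```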
A quantitative model of error
accumulation during PCR amplification.
E. Pienaar, M. Theron, M. Nelson, H.J. Viljoen. Department of Chemical Engineering, University of Nebraska, Lincoln, NE 68588, USA; Department of Human Genetics, University of the Free State, NHLS, Bloemfontein 9330, South Africa; Megabase Research Products, Lincoln, NE 68504, USA. The
amplification of target DNA by the polymerase
chain reaction (PCR) produces copies which may
contain errors. Two sources of errors are associated
with the PCR process: (1) editing errors that
occur during DNA polymerase-catalyzed
enzymatic copying and (2) errors due to DNA thermal
damage. In this study a quantitative model of
error frequencies is proposed and the role of
reaction conditions is investigated. The
errors which are ascribed to
the polymerase depend on the efficiency of its
editing function as well as the reaction
conditions; specifically the temperature and
the dNTP pool composition. Thermally induced
errors stem mostly from three sources: A+G
depurination, oxidative damage of guanine to 8-oxoG
and cytosine deamination to uracil. The
post-PCR modifications of sequences are
primarily due to exposure of nucleic acids to
elevated temperatures,
especially if the DNA is in a single-stranded
form. The proposed quantitative model predicts
the accumulation of errors over the course
of a PCR cycle. Thermal damage contributes
significantly to the total errors; therefore
consideration must be given to thermal
management of the PCR process.
Standardisation of data from real-time quantitative PCR methods – evaluation of outliers and comparison of calibration curves. Malcolm J Burns, Gavin J Nixon, Carole A Foy and Neil Harris. Bio-Molecular Innovation, LGC Limited, Queens Road, Teddington, Middlesex, TW11 0LY, UK. Background: As
real-time quantitative PCR (RT-QPCR) is increasingly
being relied upon for the enforcement of legislation
and regulations dependent upon the trace detection
of DNA, focus has increased on the quality issues
related to the technique. Recent work has focused on
the identification of factors that contribute
towards significant measurement uncertainty in the
real-time quantitative PCR technique, through
investigation of the experimental design and
operating procedure. However,
measurement uncertainty contributions made during
the data analysis procedure have not been studied in
detail. This paper presents two additional
approaches for standardising data analysis through
the novel application of statistical methods to
RT-QPCR, in order to minimise potential uncertainty
in results. Results: Experimental data was generated
in order to develop the two aspects of data handling
and analysis that can contribute towards measurement
uncertainty in results. This paper describes
preliminary aspects in standardising data through
the application of statistical techniques to the
area of RT-QPCR. The first aspect concerns the
statistical identification and subsequent handling
of outlying values arising from RT-QPCR, and
discusses the implementation of ISO guidelines in
relation to acceptance or rejection of outlying
values. The second aspect relates to the development
of an objective statistical test for the comparison
of calibration curves. Conclusion: The preliminary
statistical tests for outlying values and
comparisons between calibration curves can be
applied using basic functions found in standard
spreadsheet software. These two aspects emphasise
that the comparability of results arising from
RT-QPCR needs further refinement and development at
the data-handling phase. The implementation of
standardised approaches to data analysis should
further help minimise variation due to subjective
judgements. The aspects described in this paper will
help contribute towards the development of a set of
best practice guidelines regarding standardising
handling and interpretation of data arising from
RT-QPCR experiments.
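Both aspects can indeed be reproduced with basic statistics; the sketch below shows one common ISO-style outlier check (Grubbs' test) on Cq replicates and a simple t-test comparing the slopes of two calibration curves, using invented numbers and not necessarily the paper's exact procedures.

```python
# Illustrative spreadsheet-style checks (not necessarily the paper's exact procedures):
# (1) Grubbs' test for a single outlying Cq replicate, (2) a t-test comparing the
# slopes of two calibration curves. All numbers are hypothetical.
import numpy as np
from scipy import stats

def grubbs_statistic(x, alpha=0.05):
    x = np.asarray(x, float)
    n = len(x)
    g = np.max(np.abs(x - x.mean())) / x.std(ddof=1)
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t ** 2 / (n - 2 + t ** 2))
    return g, g_crit

cq = [24.1, 24.3, 24.2, 25.6]                  # one suspicious replicate
g, g_crit = grubbs_statistic(cq)
print(f"Grubbs G = {g:.2f} (critical {g_crit:.2f}) -> outlier: {g > g_crit}")

log_copies = np.array([1, 2, 3, 4, 5], float)  # calibration points (log10 copies)
cq_run1 = np.array([33.1, 29.8, 26.4, 23.1, 19.7])
cq_run2 = np.array([33.4, 30.2, 26.9, 23.8, 20.5])
fit1 = stats.linregress(log_copies, cq_run1)
fit2 = stats.linregress(log_copies, cq_run2)
t_stat = (fit1.slope - fit2.slope) / np.hypot(fit1.stderr, fit2.stderr)
df = len(cq_run1) + len(cq_run2) - 4
p = 2 * stats.t.sf(abs(t_stat), df)
print(f"slope run1 = {fit1.slope:.2f}, run2 = {fit2.slope:.2f}, p(slopes equal) = {p:.2f}")
```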
Accurate and statistically verified quantification of relative mRNA abundances using SYBR Green I and real-time RT-PCR. Julie H. Marino, Peyton Cook, Kenton S. Miller. Faculty of Biological Sciences, The University of Tulsa, 600 S. College Avenue, Tulsa, OK 74104-3189, USA; Kervin Bovaird Center for Studies in Molecular Biology and Biotechnology, The University of Tulsa, Tulsa, OK 74104-3189, USA; Department of Mathematical and Computer Sciences, The University of Tulsa, Tulsa, OK 74104-3189, USA. Journal of Immunological Methods 283 (2003) 291-306. Among the many methods currently available for quantifying mRNA transcript abundance, reverse transcription-polymerase chain reaction (RT-PCR) has proved to be the most sensitive. Recently, several protocols for real-time relative RT-PCR using the reporter dye SYBR Green I have appeared in the literature. In these methods, sample and control mRNA abundance is quantified relative to an internal reference RNA whose abundance is known not to change under the differing experimental conditions. We have developed new data analysis procedures for the two most promising of these methodologies and generated data appropriate to assess both the accuracy and precision of the two protocols. We demonstrate that while both methods produce results that are precise when 18S rRNA is used as an internal reference, only one of these methods produces consistently accurate results. We have used this latter system to show that mRNA abundances can be accurately measured and strongly correlate with cell surface protein and carbohydrate expression as assessed by flow cytometry under different conditions of B cell activation. Customized
Molecular Phenotyping by Quantitative Gene Expression
and Pattern Recognition Analysis.
Shreeram Akilesh, Daniel J. Shaffer, and Derry Roopenian. The Jackson Laboratory, Bar Harbor, Maine 04609, USA. Description of
the molecular phenotypes of pathobiological
processes in vivo is a pressing need in genomic
biology. We have implemented a high-throughput real-time PCR strategy to establish quantitative expression profiles of a customized set of target genes. It enables rapid, reproducible data acquisition from limited quantities of RNA, permitting serial sampling of mouse blood during disease progression. We developed an easy-to-use statistical algorithm—Global Pattern Recognition—to readily identify genes whose expression has changed significantly from healthy baseline profiles. This
approach provides unique molecular signatures for
rheumatoid arthritis, systemic lupus erythematosus,
and graft versus host disease, and can also be
applied to defining the molecular phenotype of a
variety of other normal and pathological processes.
Mathematical Model of Real-Time PCR Kinetics. Jana L. Gevertz, Stanley M. Dunn, Charles M. Roth. Department of Biomedical Engineering, Rutgers University, Piscataway, New Jersey; Department of Chemical & Biochemical Engineering, Rutgers University, 98 Brett Road, Piscataway, New Jersey 08854. Biotechnol Bioeng. 2005 Nov 5;92(3):346-55. Abstract:
Several real-time PCR (rtPCR) quantification
techniques are currently used to determine the
expression levels of individual genes from rtPCR
data in the form of fluorescence intensities. In
most of these quantification techniques, it is
assumed that the efficiency of rtPCR is constant.
Our analysis of rtPCR data shows, however, that even
during the exponential phase of rtPCR, the
efficiency of the reaction is not constant, but is
instead a function of cycle number. In order to
understand better the mechanisms underlying this
behavior, we have developed a mathematical model of
the annealing and extension phases of the PCR
process. Using the model, we can simulate the PCR
process over a series of reaction cycles. The model
thus allows us to predict the efficiency of rtPCR at
any cycle number, given a set of initial conditions
and parameter values, which can mostly be estimated
from biophysical data. The model predicts a
precipitous decrease in cycle efficiency when the
product concentration reaches a sufficient level for
template–template reannealing to compete with
primer-template annealing; this behavior is
consistent with available experimental data. The
quantitative understanding of rtPCR provided by this
model can allow us to develop more accurate methods
to quantify gene expression levels from rtPCR data.
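The qualitative behaviour predicted by the model (efficiency falling once product is abundant enough for template-template reannealing to compete with primer annealing) can be mimicked by a toy per-cycle simulation; the functional form and parameters below are arbitrary and are not the authors' biophysical model.

```python
# Toy per-cycle simulation of the qualitative behaviour described above: amplification
# efficiency drops as product accumulates. Parameter values are arbitrary placeholders.
import numpy as np

template = 1e3          # starting copies
half_level = 1e10       # product level at which efficiency has fallen to half (arbitrary)
copies, eff = [template], []

for cycle in range(40):
    e = 1.0 / (1.0 + copies[-1] / half_level)   # efficiency as a function of product level
    eff.append(e)
    copies.append(copies[-1] * (1.0 + e))

for c in (10, 20, 25, 30, 35):
    print(f"cycle {c:2d}: efficiency = {eff[c]:.2f}, copies = {copies[c]:.2e}")
```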
Evaluation of absolute quantitation by nonlinear regression in probe-based real-time PCR. Rasmus Goll, Trine Olsen, Guanglin Cui and Jon Florholmen. Institute of Clinical Medicine, University of Tromso, Tromso, Norway; Department of Gastroenterology, University Hospital of Northern Norway, Tromso, Norway. BMC Bioinformatics 2006, 7:107. doi:10.1186/1471-2105-7-107. In
real-time
PCR data analysis, the cycle threshold (CT) method is
currently the gold standard. This method is based on
an assumption of equal PCR efficiency in all
reactions, and precision may suffer if this condition
is not met. Nonlinear regression analysis (NLR) or
curve fitting has therefore been suggested as an
alternative to the cycle threshold method for absolute
quantitation. The advantages of NLR are that the
individual sample efficiency is simulated by the model
and that absolute quantitation is possible without a
standard curve, releasing reaction wells for unknown
samples. However, the calculation method has not been
evaluated systematically and has not previously been
applied to a TaqMan platform. Aim: To develop and
evaluate an automated NLR algorithm capable of
generating batch production regression analysis. Total
RNA samples extracted from human gastric mucosa were
reverse transcribed and analysed for TNFA, IL18 and
ACTB by TaqMan real-time PCR. Fluorescence data were
analysed by the regular CT method with a standard
curve, and by NLR with a positive control for
conversion of fluorescence intensity to copy number,
and for this purpose an automated algorithm was
written in SPSS syntax. Eleven separate regression
models were tested, and the output data was subjected
to Altman-Bland analysis. The Altman-Bland analysis
showed that the best regression model yielded
quantitative data with an intra-assay variation of 58%
vs. 24% for the CT derived copy numbers, and with a
mean inter-method deviation of ×0.8. NLR can be
automated for batch production analysis, but the CT
method is more precise for absolute quantitation in
the present setting. The observed inter-method
deviation is an indication that assessment of the
fluorescence conversion factor used in the regression
method can be improved. However, the versatility
depends on the level of precision required, and in
some settings the increased cost effectiveness of NLR
may justify the lower precision.
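The core idea of NLR, fitting each amplification curve individually, can be illustrated with a generic four-parameter logistic fit; the sketch below uses simulated fluorescence data and is not the paper's SPSS implementation or any of its eleven specific regression models.

```python
# Generic four-parameter logistic fit to one simulated amplification curve, illustrating
# per-sample curve fitting (NLR). Simulated data and starting values are placeholders.
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, fmax, x_half, slope, baseline):
    return baseline + fmax / (1.0 + np.exp(-(x - x_half) / slope))

cycles = np.arange(1, 41, dtype=float)
rng = np.random.default_rng(5)
true = sigmoid(cycles, fmax=1200, x_half=24, slope=1.6, baseline=50)
fluorescence = true + rng.normal(scale=10, size=cycles.size)     # simulated raw data

params, _ = curve_fit(sigmoid, cycles, fluorescence,
                      p0=[fluorescence.max(), 25, 2, fluorescence.min()])
fmax, x_half, slope, baseline = params
print(f"fitted midpoint cycle = {x_half:.2f}, per-sample slope = {slope:.2f}")
```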