Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin 53226

ABSTRACT

The Cannon lecture this year illustrates how knowledge of DNA sequences of complex living organisms is beginning to shape the landscape of physiology in the 21st century. Enormous challenges and opportunities now exist for physiologists to relate the galaxy of genes to normal and pathological functions. The first extensive genomic systems biology map for cardiovascular and renal function was completed last year, together with a new hypothesis-generating tool ("physiological profiling") that enables us to hypothesize relationships between specific genes responsible for the control of regulatory pathways. Techniques of chromosomal substitution (consomic and congenic rats) are beginning to confirm statistical results from linkage analysis studies, narrow the regions of genetic interest for positional cloning, and provide genetically well-defined control strains for physiological studies. Patterns of gene expression identified by microarray and mapping of expressed genes to chromosomal sites are adding to the understanding of systems physiology. The previously unimaginable goal of connecting ~36,000 genes to the complex functions of mammalian systems is indeed well underway.
Keywords: consomic; congenic; systems physiology

INTRODUCTION

THE GOAL of this year's Cannon Lectureship was to provide a current overview of our ongoing research efforts to connect the nearly 36,000 genes of mammalian organisms to the understanding of integrated function of complex biological systems and the genetic basis of complex polygenic diseases. Although it would be presumptuous to suggest that great progress has yet been made, it is important to acknowledge what has already been accomplished toward this end and to recognize the enormous challenges and opportunities for advancing our understanding of biological function and disease in this postgenomic period. The work presented here has resulted from the fledgling efforts of a dedicated and highly interactive group of scientists, trained in physiology, genomics, and the computational sciences, working together at the Medical College of Wisconsin (see acknowledgments).
One need not contrive a reason to recognize the contribution of Walter Cannon in this Cannon Lecture. The parallelisms between the challenges faced by the physiologist in Walter Cannon's time and those at this moment in history are remarkable. Dr. Cannon, who dominated physiology during the first half of the 20th century, developed the overarching concept that biological systems were designed in a way that "trigger physiological responses to maintain the constancy of the internal environment in face of disturbances of external surroundings," a process that he called homeostasis. To fully understand these processes, he emphasized the need and ultimate goal of reassembling the rapidly expanding data on the reduced components of the biological system into the context of understanding whole organism function (3, 70). Although this goal of integration was obscured by the momentum of reductionism exemplified by molecular biology in the latter half of the 20th century, it has indeed emerged as the major imperative of 21st century biology.
Dr. Guyton's pioneering computer model of the cardiovascular system was one of the first comprehensive efforts to quantitatively integrate the complex pieces of cardiovascular data into an understanding of whole body function. From this computer simulation at the dawn of digital computing in 1972 (15), there emerged a vision of how computational biology could be used to integrate the detailed workings of the system and predict the behavior of complex biological systems. This work showed how the tools of high-speed computing could bring about a realization of Walter Cannon's goals of reassembling the various components of the biological system into the context of understanding whole organism function. With genomes of the human and commonly used domestic and laboratory species either sequenced or soon to be completed, it was no coincidence that the March 1st, 2002, issue of Science was dedicated to systems biology (23). The time has returned for systems integration. The quantitative cybernetic approaches developed by Guyton and his associates in the early 1970s were considered only curiosities by many at the time. They now, however, are being driven strongly not only by the vast amount of DNA sequence data, but, as presented here, by efforts to obtain functional data related to variations in gene sequences. The gulf between reductionism and systems science has never been more painfully apparent and it is a gulf that is increasingly less acceptable.

HOW DID WE GET HERE?

The major physiological interest of the scientists who engaged in the work presented in this lecture has been related to the regulation of regional blood flows, to the short- and long-term regulation of vascular resistance, and to the role of the kidney in fluid and electrolyte regulation. This work has in many ways tested the hypothesis proposed by Arthur Guyton's computer simulations predicting that the long-term level of arterial blood pressure was determined by the function of the kidneys and respective regulation of body fluid volumes (8, 15).
During the early 1990s as we were working on many detailed aspects of renal and cardiovascular function, the Human Genome Project took flight. There had been much grumbling about the folly of this effort and the large funds that the work pulled from areas of science that were perceived by many to be more meaningful. Yet the wheels were in motion and it was time to reflect on what one could or would do if this sequencing task was indeed to be completed within one's scientific lifetime. It was clear that the genomes would ultimately have to become connected to the complex pathways of biological function, but it was far from clear when or how these efforts should be undertaken. The challenge was trying to define the elusive boundary between imagination and pragmatism. The massive genome project had not yet made much progress by 1993 when we began our efforts, but was beginning to show signs of life as the power of robotics, automated sequencers, and high-speed computers began to be applied to this historically unparalleled biological project (11, 32). As we thought about how one would go about functionally annotating a mammalian genome, it was assumed that the human and other mammalian genomes contained nearly 100,000 genes, a frightening number of variables for an integrative biologist to contemplate. Although the publication of the first rough draft of the human genome arguably reduced this number to ~36,000 genes, it is believed that >120,000 proteins of the proteome are modified posttranslationally by phosphorylation, glycosylation, oxidation, and disulfide structures (31, 66).

GENERALLY USED APPROACHES TO ANNOTATE GENOMES

Many approaches quickly evolved in the 1990s to define the function of genes. The male pronucleus of fertilized oocytes was microinjected with transgenes to generate gain-of-function mutations. Targeted mutagenesis approaches and gene titration approaches (62) were developed. Null-mutant mice (knockout mice) were generated using mutant genes introduced into the germ line of mice using pluripotent embryonic stem (ES) cells isolated from the C57BL/6 strain of mice at the blastocyst stage. This approach, now in common use, introduces into ES cells a modified gene of interest that undergoes homologous recombination. A varying number of transfected ES cells contain a wild-type allele of the endogenous gene. These ES cells are then microinjected into blastocyst-stage embryos, which are implanted into the uterus of foster mothers that give birth to chimeric mice (60).
More than 1,000 targeted mutations have been described since this technique was first introduced by Thomas and Capecchi in 1987 (64). These mutation models have proven to be of great utility to confirm hypotheses about the function of known genes (6). There are, however, clearly strategic limitations to this general approach. The process is very slow and its utility has not been proven for the routine discovery of genes with unknown function. One company (Lexicon Genetics) has directed its efforts toward producing and profiling 500-1,000 knockouts per year. This will require 25 years to complete the work for 36,000 genes. There is also the need to house 36,000 lines of mice or wait for a colony of mice to be derived from a cryo-preserved embryo or sperm stock before carrying out a study. Other limitations relate to the knockout approach itself. One of these is the genetic heterogeneity and variability among strains produced due to genetic background effects. ES cells are available for a limited number of different strains used for the host blastocyst embryo injections. The resulting gene knockout, therefore, may or may not result in a change of phenotype depending on the genetic background effects. Another limitation relates to the remarkable lack of baseline biology in the mouse. This represents a critical bottleneck and makes it particularly difficult to assess changes from normal function in response to the genetic manipulation. It is also apparent that the loss of a single gene is not likely to be the predominant cause of a multifactorial disease. Finally, it is also difficult to determine, after knockout of a gene, whether any observed alterations in function are a result of the altered gene or due to compensations that occur in the well-controlled closed-loop biological system. Mutagenic toxins (ENU) or X-rays are now also being used to produce discrete point mutations in genes of zebrafish, flies, and mice (68). 
When used with broad phenotypic screens, the mutants with the largest phenotypic effects can be identified. These approaches, however, result in very large numbers of progeny for screening and many of the inherent problems seen in knockout mice. For these and other reasons, we have put our energies into developing alternative strategies to attach function to the genome as described in the remainder of this lecture.

USE OF INBRED RATS FOR COMPLEX GENETIC LINKAGE ANALYSIS

One of the first critical decisions made by our research group was to use the rat as the model system for annotating the mammalian genome. This was based on the fact that techniques to study rat physiology were already well developed and there existed an enormous repository of baseline scientific data related to the function of this rodent. Furthermore, the rat was the model organism of choice of renal and cardiovascular physiologists, our major field of interest. As important, there were more than 200 well-defined strains of rats inbred for a wide variety of interesting complex traits and diseases relevant to human conditions, such as hypertension (48, 53), obesity, hyperlipidemia, type I and II diabetes, and insulin resistance (14, 19, 26), to name a few. Finally, there existed a wealth of published pathophysiological data related to these model systems in rats.
Our work began by using the classic genetic linkage approach in which one determines if a trait of interest segregates with (or transfers with) the inheritance of a genetic marker. This approach had been used successfully to map the chromosomal location of known genes or their genetic markers. Two inbred strains exhibiting a high degree of polymorphism in genetic markers and many phenotypic differences were therefore intercrossed. The strains selected for this purpose were the salt-insensitive BN (Brown Norway) rat and the SS (salt-sensitive) rat (49). The first generation (F1) rats from such a cross are genetically identical, having received one copy of each chromosome from the mother and one from the father. Intercrossing of this F1 generation produces an F2 generation in which the genome is scrambled by multiple chromosomal recombinations that occur during meiosis within the F1 generation rats. Genotyping of markers that exhibit polymorphic differences in the parental strains can then be performed to determine if a quantitative trait of interest, such as blood pressure, segregates with the inheritance of the genetic marker allele from the normotensive or hypertensive parental strain.
Such a segregation analysis defines the chromosomal location (the quantitative trait loci; QTL) of a particular trait and this has long been the traditional approach to mapping genes. These early studies, however, had taken up the challenge of mapping and identifying genes responsible for hypertension and other complex diseases at a time when it was necessary to choose a single marker or candidate gene to see if they segregated with blood pressure (47, 48). The uniqueness of the initial study that we undertook was based on two key elements. First were the development and application of microsatellite markers based on nucleotide repeats such as CACA (1, 17, 28). Second, rather than mapping only one or several traits such as blood pressure as traditionally done, hundreds of likely determinants of this complex trait were measured in each of the F2 generation rats. Each of these phenotypes was chosen to represent some aspect of renal, vascular, cardiac, or neuroendocrine function, each of which is in some manner involved in the final determination of arterial pressure. In this sense, the experimental design represented a systems biological approach to the understanding of the genomic regulation of the homeostatic processes involved in the regulation of cardiovascular function and arterial pressure.
The use of microsatellite markers for mapping of traits first requires the determination of polymorphic or allelic differences between the parental strains. This is determined by gel electrophoresis, whereby alleles with longer CACA repeats migrate more slowly than alleles with fewer CACA repeats. Indeed, more than 10,000 microsatellite markers have now been mapped on the rat genome (http://www.rgd.mcw.edu). The markers that are polymorphic (i.e., exhibiting sequence length differences between the 2 parental inbred strains) can be selected at relatively uniform distances on each chromosome to distinguish the parental alleles in the F2 generation of rats of the linkage analysis. Unlike most previous linkage studies that had used known candidate genes suspected to be linked to a trait, total genome scans with many anonymous microsatellite markers provide a powerful tool for the possible discovery of novel genetic pathways involved in the homeostatic regulation of complex biological pathways. In our own laboratory, total genome scans are generally run using ~200-250 polymorphic markers spaced at ~10-cM intervals across the ~2,000-cM genome. Maximum-likelihood analysis using the MAPMAKER program (29) is then used to determine which of the markers correlate with a given trait in the F2 generation.
The maximum likelihood analysis can be easily understood if one plots each marker against the trait of interest to determine whether there is a significant correlation between the level of a quantitative trait, such as blood pressure, and the marker genotype. For example, if those rats of the F2 generation that are homozygous for the allele from the BN rat at a specific marker on chromosome 1 all have low levels of blood pressure, and those that are homozygous for the allele from the SS strain are hypertensive, then a high level of correlation will be found between the blood pressure trait and this genetic marker. This would then be identified as a QTL for blood pressure, the significance of which is expressed by the log of the odds ratio (LOD score). Each of the polymorphic markers used in the analysis is thereby analyzed by a maximum likelihood interval analysis to determine the regions within the chromosome that correlate with the trait of interest (i.e., where the LOD score represents the greatest likelihood of a linkage). As described by Lander and Kruglyak (30), LOD scores >2.8 are considered suggestive of linkage between the marker and trait and those >4.3 are considered highly significant.
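To make the LOD calculation concrete, the following minimal sketch computes a single-marker LOD score for an F2 intercross by comparing a model with genotype-specific trait means against a model with one common mean. It is a simplified stand-in for the interval mapping performed by MAPMAKER/QTL, not that program's algorithm. The function name, the genotype coding (0 = BN/BN, 1 = heterozygous, 2 = SS/SS), and the blood pressure values are hypothetical and chosen only for illustration.

```python
import math


def single_marker_lod(genotypes, trait):
    """LOD score for one marker: genotype-specific means versus one common mean.

    Assumes normally distributed residuals with equal variance, in which case
    LOD = (n / 2) * log10(RSS_null / RSS_genotype).
    """
    n = len(trait)
    grand_mean = sum(trait) / n
    rss_null = sum((y - grand_mean) ** 2 for y in trait)

    # Group trait values by genotype class at this marker.
    groups = {}
    for g, y in zip(genotypes, trait):
        groups.setdefault(g, []).append(y)

    rss_geno = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        rss_geno += sum((y - mean) ** 2 for y in values)

    if rss_geno == 0:
        return math.inf  # genotype explains the trait perfectly
    return (n / 2) * math.log10(rss_null / rss_geno)


# Hypothetical F2 rats: genotype at one marker and mean arterial pressure (mmHg).
genotypes = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
pressures = [102, 108, 105, 118, 125, 121, 130, 150, 158, 149]

lod = single_marker_lod(genotypes, pressures)
print(f"LOD = {lod:.2f}")
```

In this invented data set the SS/SS rats are clearly hypertensive and the BN/BN rats are not, so the LOD score exceeds the 4.3 threshold; a marker whose genotype is unrelated to pressure would give a score near zero.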
As we began our work in 1994, we therefore had several advantages. First, there were by then several hundred anonymous microsatellite markers with which to carry out total genome scans in rats. Second, there existed within our group a broad understanding of cardiovascular systems physiology and phenotyping skills to obtain a rich data set with which to correlate these many markers. The challenge was whether we could carry out a large enough study to provide the statistical power to genetically map important determinants of complex cardiovascular function. Could we simultaneously carry out a linkage analysis on so many determinants of arterial pressure and map these traits onto the rat genome? Could we indeed develop a first generation genetic map of cardiovascular function?

GENETIC MAP OF CARDIOVASCULAR FUNCTION

The successful mapping of many determinants of complex cardiovascular function was reported for the male F2 rats in Science, November 2001 (57). In this study, 239 measured or derived phenotypes were determined in each individual rat of the F2 generation of the intercross between the BN/Mcw and SS/Mcw inbred rats. The study was designed to capture a broad array of traits related to and considered to be determinants of arterial pressure. Only a brief description of the experimental protocol can be provided herein. All rats were maintained on a low-salt (0.4%) diet until 5 wk of age to allow for normal development and then placed on a lower salt diet (0.1% NaCl) until 9 wk of age. At this time, when the postnatal development of the kidney was complete, the salt content in the rat chow was increased to 8%. After 3 wk on a high-salt diet, rats were anesthetized and surgically instrumented with indwelling arterial catheters. Daily 3 h blood pressure measurements were begun 1 wk after surgical recovery with the conscious rats unrestrained in their home cages (9). Blood pressure data were also collected for time series analysis. Blood was obtained for determinations of electrolytes, serum creatinine, renin activity, triglycerides, total cholesterol, high-density lipoproteins, protein, and hematocrit. Urine was collected to determine 24-h protein and creatinine excretion. A diuretic challenge with furosemide was then given and the rats were returned to the low-salt (0.1%) diet and blood pressure was again measured for several days. Rats were then anesthetized and instrumented with electromagnetic flow probes on the renal arteries to determine the renal vascular responses together with peripheral vascular responses to acute infusions of angiotensin, norepinephrine, acetylcholine, and NG-nitro-L-arginine methyl ester. 
A number of morphometric measurements were determined after completion of these acute studies, including the evaluation of heart and kidney size and weight, renal glomerular diameters, vascular endothelial cell damage, and the degree of renal glomerular sclerosis.
The total genome scan carried out for this linkage analysis used markers chosen to have an average of 10-cM spacing. After testing the phenotype data for normality, 166 traits were analyzed parametrically using MAPMAKER/QTL (29) and the remaining 73 traits were analyzed using the nonparametric mapping algorithm. In all, 90 of these traits mapped to 19 different chromosomes, yielding 115 QTL with LOD scores >2.8 (suggestive significance) and 11 with LOD scores >4.3 (highly significant) (57; a user interface to the complete data set of the male linkage analysis can be found at http://brc.mcw.edu/phyprf). The relevance of these thresholds for significance, proposed by Lander and Kruglyak (30) for a map with an infinite number of genetic markers, was tested by carrying out a permutation analysis of all of the data to determine the actual threshold for false positives in this multiple correlation analysis. The correlation analysis was repeated 125 times by correlating randomly assigned genotype values with phenotype values to set the thresholds of suggestive and significant linkage. This permutation analysis indicated that the 2.8 and 4.3 threshold values were appropriate assignments to conservatively avoid false positive correlations.
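The logic of such a permutation analysis can be sketched as follows. Phenotype values are shuffled relative to the genotypes, the genome scan is repeated, and the largest LOD score from each shuffled scan is recorded; an upper percentile of these maxima then serves as an empirical threshold. This is a minimal illustration under stated assumptions, not the analysis actually run: it uses a simple single-marker LOD rather than MAPMAKER/QTL interval mapping, takes the genome-wide maximum per permutation (an assumption about the procedure), and all function names and simulated data are hypothetical. Only the count of 125 permutations comes from the text.

```python
import math
import random


def lod(genotypes, trait):
    """Single-marker LOD: genotype-specific means versus one common mean."""
    n = len(trait)
    mean = sum(trait) / n
    rss_null = sum((y - mean) ** 2 for y in trait)
    groups = {}
    for g, y in zip(genotypes, trait):
        groups.setdefault(g, []).append(y)
    rss_geno = sum(
        sum((y - sum(v) / len(v)) ** 2 for y in v) for v in groups.values()
    )
    if rss_geno == 0:
        return math.inf
    return (n / 2) * math.log10(rss_null / rss_geno)


def permutation_threshold(marker_genotypes, trait, n_perm=125, quantile=0.95, seed=0):
    """Empirical LOD threshold from scans of randomly reassigned phenotypes."""
    rng = random.Random(seed)
    shuffled = list(trait)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any true genotype-phenotype link
        max_lods.append(max(lod(g, shuffled) for g in marker_genotypes))
    max_lods.sort()
    index = min(len(max_lods) - 1, int(quantile * len(max_lods)))
    return max_lods[index]


# Hypothetical data: 60 F2 rats, 20 unlinked markers, a trait with no true QTL.
rng = random.Random(42)
n_rats, n_markers = 60, 20
markers = [
    rng.choices([0, 1, 2], weights=[1, 2, 1], k=n_rats) for _ in range(n_markers)
]
trait = [rng.gauss(120, 15) for _ in range(n_rats)]

threshold = permutation_threshold(markers, trait)
print(f"95th percentile of the maximum LOD under the null: {threshold:.2f}")
```

A QTL from the real scan whose LOD score exceeds such an empirical threshold is unlikely to be a false positive arising from the large number of marker-trait comparisons.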
As summarized in Fig. 1, many of the QTL involved in determining blood pressure were found to aggregate within broad regions of specific chromosomes. Specifically, six or more QTL with overlapping 95% statistical confidence intervals were found on rat chromosomes 1, 2, 7, and 18 of the male F2 rats. In most cases, the phenotypes were independently correlated, suggesting that the aggregate cluster of traits was likely to be a result of separate genes rather than a single gene influencing more than one trait (pleiotropy) (44). It has been proposed that when two or more traits are affected by a single mutation, a single optimal genetic sequence may become common, which may account for the relatively low within-population variation for many proteins. One of the most interesting and yet unexplained observations found in our linkage analysis was related to the gender differences. Even though the rats used in this linkage analysis were all of the same age and were brothers and sisters of the same litters of the F2 population, the QTL aggregates for males were clustered on different chromosomes from females (40). These findings indicate that the genes and probably the interaction of the genes responsible for the complex regulation of cardiovascular and renal function differed between the genders, a finding reported by Yagil et al. (72). The implication of these observations is profound and will require substantial research to explain.
In brief, the cardiovascular genetic map has provided the first rough approximation of the regions of the genome that are linked to the mechanisms of the cardiovascular and renal function related to the homeostatic control of sodium and water excretion and arterial pressure. Within these broad QTL regions reside hundreds of genes that may act alone or in concert with each other to determine their relationship to the traits with which they segregate. For example, the genetic distance of 1 cM represents ~2,000,000 base pairs, and the QTL regions in this initial study range generally from 10 to 30 cM in length. Because the estimated number of genes in the mammalian genome ranges from 30,000 to 40,000, these regions likely contain several hundred genes. For this reason, other approaches described below are required to more finely map these regions.
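The estimate of several hundred genes per QTL region can be checked with a short calculation. The sketch below uses only figures given in the text (a genome of ~2,000 cM, ~2,000,000 base pairs per cM, 30,000 to 40,000 genes, and QTL regions of 10 to 30 cM) and assumes, as a simplification, that genes are spread uniformly along the genetic map. The constant and function names are illustrative.

```python
GENOME_LENGTH_CM = 2_000        # approximate genetic length of the rat genome
BASE_PAIRS_PER_CM = 2_000_000   # ~2,000,000 bp per cM, as stated in the text
GENE_COUNT_RANGE = (30_000, 40_000)
QTL_LENGTH_RANGE_CM = (10, 30)


def genes_in_interval(length_cm, total_genes, genome_cm=GENOME_LENGTH_CM):
    """Expected gene count in an interval, assuming uniform gene density."""
    return total_genes * length_cm / genome_cm


fewest = genes_in_interval(QTL_LENGTH_RANGE_CM[0], GENE_COUNT_RANGE[0])
most = genes_in_interval(QTL_LENGTH_RANGE_CM[1], GENE_COUNT_RANGE[1])
size_mb = [cm * BASE_PAIRS_PER_CM / 1e6 for cm in QTL_LENGTH_RANGE_CM]

print(f"QTL region size: {size_mb[0]:.0f} to {size_mb[1]:.0f} Mb")   # 20 to 60 Mb
print(f"Expected genes per region: {fewest:.0f} to {most:.0f}")      # 150 to 600
```

Under these assumptions a typical QTL region spans 20 to 60 Mb and holds roughly 150 to 600 genes, which is why linkage analysis alone cannot identify the causal gene.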

GENETICALLY DEFINED RISK OF SYNDROME X

One aggregate of QTL on chromosome 18 was of particular interest to our research group. This grouping of QTL could be divided into three functional groups: blood pressure salt sensitivity, plasma lipid concentrations, and renal function (Fig. 2). Blood pressure salt sensitivity within this aggregate of traits accounted for 17% of the overall variance in salt sensitivity, indicating that other chromosomal regions also contribute importantly to this trait. However, this collective profile of phenotypes on chromosome 18 is particularly interesting because it resembles Syndrome X or the "metabolic syndrome" in humans (13). In addition, QTL related to blood pressure variability as determined by a time series analysis (22) and the change in blood pressure in response to an alerting stimulus were also found in the same region. Such functional cassettes have been observed for QTL in agriculture (65) and other areas of biomedical research (2, 42).
It was reassuring to find that this QTL region of rat chromosome 18 related to salt sensitivity corresponded to the same region that Jacob and colleagues (18) determined in an F2 cross of spontaneously hypertensive rats (SHR) and Wistar-Kyoto (WKY) rats. Although affirmation of this observation is reassuring, it should be recognized that one often fails to find common areas of linkage between different crosses of rats. This is to be anticipated given the differences in the genetic backgrounds of the parental strains used to generate these intercrosses. Nevertheless, this genetic diversity may result from different "causal" genes or result in gene-gene interactions in these intercross progeny whereby salt-sensitive alleles may result in hypertension when expressed on one of the genetic backgrounds but not on the other. That is, different levels of genetic susceptibility to environmental stressors (e.g., salt intake) exist not only between rat progeny used in different intercross studies but also individual F2 rats within a specific intercross study. For these same reasons, QTL found in one linkage analysis may not appear in another study when one intercrosses other strains of rats or mice. It is not surprising that different genes contribute to salt-sensitive hypertension and the expression of salt-sensitive genes in different genetic backgrounds. This conforms to our physiological understanding of the many complex pathways and neuroendocrine controllers that can influence sodium homeostasis.

RELEVANCE OF RODENT GENOMICS TO HUMAN PHYSIOLOGY AND DISEASE

The extent to which animal model systems are applicable to our understanding of function and disease in humans has been long debated. Genomics, coupled with detailed physiological characterizations, is beginning now to provide a better understanding of the relevance and application of genetic animal models to human correlates of various diseases (24, 52). Genomics has demonstrated that there is >85% similarity in the coding regions of the rat genome compared with the human genome (27, 37). Given the high degree of conserved gene order, disease genes identified in the rat can be predicted to be in regions of synteny (homologous chromosomes) in humans and mice. There is evidence that the QTL mapped using natural variant models (inbred strains) of complex diseases such as hypertension found in rat and mouse strains (56, 69, 71) may be able to predict the regions containing genes for the same traits on human chromosomes. Such homology mapping is now possible using the high-density integrated genetic linkage and radiation hybrid maps of the laboratory rat that have recently been developed (56, 69). Using these in silico approaches, 57 QTL for 33 blood pressure traits in the progeny of seven F2 rat intercrosses for genetic hypertension have been identified (57). Using comparative mapping strategies it was found that there were 26 homologous regions in the human genome. Furthermore, it was found that five of the six known QTL for human hypertension were correctly predicted from the genetic studies of hypertension in rat (58). Nearly all of these QTL were also predicted by linkage studies in mice (59). It is clear that the relevance of rats or mice to humans will not ultimately be revealed until human genetics and physiology are better understood. In the meantime, however, comparative mapping approaches may provide valuable insights to confirm the results obtained from human population studies, to provide clues for regions of likely importance in humans, and to reveal pathways and hypotheses for experimental studies.
The influence of sodium intake on arterial blood pressure in the human population remains one of the unresolved and contentious health and public policy issues of our time. It has been difficult to establish a strong relationship between salt intake and arterial pressure in human populations, despite many epidemiological and randomized control studies (63). As seen by the genetic linkage analysis described above and as will be shown by the results obtained by chromosomal substitution studies described below, the SS/Mcw rat strain (Dahl salt-sensitive hypertensive) provides a particularly robust model for the genetic dissection of the pathways that contribute to salt-sensitive forms of hypertension. The genetic linkage analysis in rodents overcomes many of the problems faced in human studies of complex genetic traits in which multiple genes, moderate gene penetrance, environmental interactions, and genetic heterogeneity (allelic: differences within the same gene; locus: different gene combinations causing the same phenotype) are present in the population. Hereditary hypertension in rats is a result of naturally occurring allelic variance captured during the selection and inbreeding process. Use of inbred strains removes the problem of heterogeneity within a given cross. Taken together, the utility of these approaches is obvious and our results indicate that by comparative mapping strategies, data obtained from rodent studies may be useful in determining potentially important genomic regions, genes, and pathways controlling complex function in humans and other mammals.

PHYSIOLOGICAL PROFILING

Efforts to understand how the higher levels of organization of the biological system relate to the underlying genetic background of the organism have now begun in our group. Toward this end, we recently developed a computational approach referred to as "physiological profiling" as illustrated in Fig. 3. Although the physiological profile represented in this figure resembles a photograph of a fluorescently tagged gene microarray, it is not. It is the color-coded results of an analysis of the complex correlational relationships between 125 of the measured phenotypes that were linked to the genome in our F2 linkage analysis in which the correlation of each trait is determined against all other traits. The color of each small block represents the level of the respective correlations. Red colors represent the highest level of positive correlation, blue colors represent the highest level of negative correlation, and brown or black color represents little or no correlation. Physiological profiles provide a way to relate genetic information to functional pathways whereby one can estimate within the F2 generation of rats the effect of a specific allelic substitution (different genotypes) upon the complex functional relationships of an organism.
In one respect, this approach represents nothing more than a traditional multiple cross-correlation and linkage analysis, because it represents the likelihood that specific regions of the genome are related to a set of measured traits. However, as these relationships are examined closely, it becomes apparent that these analyses enable one to go well beyond the simple concept of linkage. They enable the prediction, by rapid visualization of these colored matrices, of how all of the complex correlational relationships of the homeostatic determinants of arterial pressure can be influenced by genes within these QTL regions. The utility of this approach was demonstrated recently (57) using the QTL for mean arterial pressure and renal blood flow responses to acetylcholine that mapped to chromosome 10 and to chromosomes 4 and 12. Interestingly, these three QTL regions harbored the genes for nitric oxide synthase (chromosome 4-NOS III, chromosome 10-NOS II, and chromosome 12-NOS I). We therefore investigated the impact of BN and SS alleles of all three NOS genes on the physiological profiles of the mapped traits. This analysis demonstrated that rats that were homozygous for the SS allele for the marker related to NOS II (chromosome 10) exhibited a positive correlation between blood pressure responses to infused norepinephrine and angiotensin. In contrast, F2 rats that were homozygous for BN at this same marker exhibited only weak correlations among these phenotypes. The shattering of these correlations seen in BN homozygous rats at this marker site demonstrates that a gene or genes within this region have a major effect upon the functional responses of these rats to adrenergic or angiotensin stimulation. Furthermore, the blood pressure responses before and after the infusion of norepinephrine were significantly different in rats that were homozygous for BN at this locus in that pressure fell transiently below control values when the infusion was ended.
Physiological profiling therefore represents a tool whereby one is able to determine the effect of an allele of a gene or gene region characterized by a genetic marker on the complex homeostatic relationships of the organism. That is, rats partitioned on the basis of the marker for the NOS II gene exhibited different blood pressure responses to norepinephrine (and angiotensin) infusions. Although the outcome of the example used for this specific analysis was consistent with the known importance of nitric oxide to the control of vascular tone and blood pressure, the observation of an extended hypotension in response to norepinephrine was novel. Indeed, a number of other relationships can be and were derived from similar computations that were more unexpected and demonstrated how the systems biology map and physiological profiling can be used to find novel relationships and testable hypotheses (57). A user interface to the complete data set showing the physiological profiles for each marker of the F2 rat generation in this linkage study is found at http://brc.mcw.edu/phyprf, and anyone can use this complete data set to evaluate the relationship between any genotype and profile of interest.
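The partitioning step behind physiological profiling can be illustrated with a minimal Python sketch. It assumes that each F2 rat is a record holding its genotype at a marker and its measured traits; the marker name, trait names, and values below are hypothetical and are not taken from the study. The sketch groups rats by genotype at one marker and computes the pairwise trait correlations within each group, which is the comparison described above for the NOS II marker.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def profile_by_genotype(rats, marker, traits):
    """Group rats by genotype at `marker`; return the pairwise trait
    correlations within each genotype group."""
    groups = {}
    for rat in rats:
        groups.setdefault(rat["genotype"][marker], []).append(rat)
    profiles = {}
    for genotype, members in groups.items():
        matrix = {}
        for i, a in enumerate(traits):
            for b in traits[i + 1:]:
                matrix[(a, b)] = pearson(
                    [m[a] for m in members], [m[b] for m in members]
                )
        profiles[genotype] = matrix
    return profiles


# Hypothetical toy data: marker name, trait names, and values are invented.
MARKER = "chr10_marker"
rats = [
    {"genotype": {MARKER: "SS"}, "ne_response": 10, "ang_response": 12},
    {"genotype": {MARKER: "SS"}, "ne_response": 20, "ang_response": 19},
    {"genotype": {MARKER: "SS"}, "ne_response": 30, "ang_response": 33},
    {"genotype": {MARKER: "SS"}, "ne_response": 40, "ang_response": 41},
    {"genotype": {MARKER: "BN"}, "ne_response": 10, "ang_response": 30},
    {"genotype": {MARKER: "BN"}, "ne_response": 20, "ang_response": 12},
    {"genotype": {MARKER: "BN"}, "ne_response": 30, "ang_response": 35},
    {"genotype": {MARKER: "BN"}, "ne_response": 40, "ang_response": 18},
]

profiles = profile_by_genotype(rats, MARKER, ["ne_response", "ang_response"])
for genotype, matrix in profiles.items():
    for pair, r in matrix.items():
        print(genotype, pair, round(r, 2))
```

In this toy data set the SS group shows a strong positive correlation between the two responses and the BN group shows only a weak one, mirroring the qualitative pattern reported in the text. A real analysis would repeat this for every marker and every trait pair and would need to account for group size and multiple comparisons.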
| |
DEVELOPMENT AND APPLICATION OF CONSOMIC AND CONGENIC RAT STRAINS |
|---|
|
|
|---|
The genetic linkage analysis described above yielded the first extensive genomic systems biology map for cardiovascular function. It identified many chromosomal regions that appear to segregate with traits that are important determinants of arterial blood pressure and other cardiovascular functions. This analysis represents only the first attempt to determine the specific chromosomal regions that are participating in the complex regulation of cardiovascular function. Nevertheless, this work has set the stage for the development of the next discovery phase required to fill in more detailed mechanistic, functional pathways and to identify the specific genes and proteins responsible for the regulation of these functional pathways. It is now our task to confirm these QTL associations, to narrow the regions of genetic interest for the identification of specific genes, and to carry out more detailed physiological studies at greater levels of complexity. The relationship between the genes and the physiological regulation of these complex pathways must ultimately be unraveled.
Toward this end, we are currently using techniques of chromosomal substitution to inbreed strains of consomic and congenic rats. As described and used by others (43), a congenic strain is developed by the integration of an individual piece of a chromosome containing a QTL, or a region with candidate genes of interest, into the genomic background of the recipient strain. Congenic strains are developed to confirm and narrow QTL regions of interest (7, 12, 21, 25, 41, 51, 55), and these model systems have proven very useful in the deconvolution of complex traits and the identification of candidate genes. During the past year, three genes involved in the etiology of complex diseases have been positionally cloned using this approach (36, 67, 73). However, this work has been hampered by the time and expense involved in producing these informative recombinant rats. Even with the use of marker-assisted selection (MAS; 28) to identify the rats best suited for backcrossing in each generation, we found that the process of developing an inbred congenic strain generally requires 2 to 3 years and ~10 generations of backcrosses to achieve rats that are sufficiently isogenic to make meaningful comparisons.
To overcome these limitations, we have begun developing panels of
consomic inbred rat strains in which an entire chromosome is
introgressed into the genomic background of the recipient strain (see
Fig. 4). In a panel of consomic rats,
single chromosomes are replaced, one at a time, so that the
contribution of genes on each chromosome can be assessed by phenotyping
the consomic strain for the traits of interest. Chromosomes are
transferred using traditional backcross breeding techniques in
combination with MAS. For example, to transfer chromosome 1 from the
BN/Mcw strain onto the background of the parental SS/Mcw rat strain, one takes the first generation intercross pups (F1 generation) that are
heterozygous for all different parental alleles and backcrosses these
rats to the SS/Mcw strain. According to Mendelian genetics and
exponential decay, each backcross would reduce the number of loci that
are heterozygous by 50%, requiring 10 generations to complete. With
the use of total genome scans it is possible, however, to select only
rats for backcrossing that possess the greatest numbers of alleles that
are homozygous for the parental (background) SS/Mcw strain while
maintaining the "targeted chromosome" (i.e., the one being
transferred between strains) heterozygous for each of these
backcrosses. Rats found to be isogenic SS/Mcw at all chromosomes except
the "targeted chromosome," are intercrossed to produce a large F2
generation. Of this F2 generation a small percentage of the animals
will be homozygous (BN/BN) for the targeted chromosome.
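The "exponential decay" of background heterozygosity mentioned above can be made concrete with a short Python sketch. It assumes unlinked loci, an F1 generation that is heterozygous at every locus where the parental strains differ, and no marker-assisted selection; the 0.1% threshold used to stop is an illustrative assumption, not a figure from the text.

```python
import math


def expected_heterozygous_fraction(backcrosses):
    """Expected fraction of background loci still heterozygous after the
    given number of backcrosses, starting from a fully heterozygous F1."""
    return 0.5 ** backcrosses


def backcrosses_needed(threshold):
    """Smallest number of backcrosses for which the expected heterozygous
    fraction is at or below `threshold`."""
    return math.ceil(math.log2(1 / threshold))


for n in (1, 2, 5, 7, 10):
    print(n, f"{expected_heterozygous_fraction(n):.4%}")

# With an assumed cutoff of 0.1% residual heterozygosity, random
# backcrossing needs 10 generations.
print(backcrosses_needed(0.001))
```

Under these assumptions the expected residual heterozygosity falls below 0.1% only at the tenth backcross, which is consistent with the 10 generations cited for unassisted backcrossing. Marker-assisted selection shortens this by choosing, in each generation, the rats that happen to carry fewer heterozygous background loci than the expectation.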
|
Once derived, these strains can be bred to provide renewable inbred lines of consomic rats (e.g., SS.BN1 in this example). Using MAS (28) we are presently constructing consomic lines in six to seven generations of backcrossing, individually substituting each chromosome into the background of the SS/Mcw rat. Although the process of generating the consomic panels is expensive and time consuming, once they are developed there are many advantages to their use, ranging from screening to determine the chromosomal location of genes contributing to any phenotype or drug response, to rapidly generating congenics for positional cloning, to developing control strains with a standard genome background for well-defined physiological studies as discussed later. Our survey of 48 commonly used inbred strains of rats revealed that on average there are six flavors or alleles (range 1-13 alleles) for a given simple sequence repeat genetic marker. We are currently developing two consomic panels from three strains of rats (BN/Mcw × SS/Mcw; and BN/Mcw × FHH/EurMcw) that on average capture three alleles or 50% of the average genetic variance. This work at our institution, PhysGen (http://pga.mcw.edu/), is supported by one of the 11 Programs for Genomic Applications (PGA) of the National Heart, Lung, and Blood Institute of the National Institutes of Health. Consomic mice have also been generated by Nadeau et al. (43).
The MCW-PhysGen consomic rat panels on completion will consist of 22 inbred strains in which the chromosomes of the BN/Mcw rats have been systematically transferred into the genomic background of the SS/Mcw and 22 inbred strains in which BN/Mcw chromosomes have been individually transferred into the background of the FHH/EurMcw (Fawn Hooded) strain of rats (counting the separate strains for each of the X and Y chromosomes). Up to 3,500 genotypes each week are carried out for the selection of rats needed for the breeding of these consomic panels. These consomic rat panels provide the ability to assess the contribution of genes specific to a chromosome to defined traits of interest. Importantly, this PGA program provides a mechanism for the distribution of these consomic strains. After completion of the characterization of genotypes and phenotypes, each of the consomic strains is sent to Charles River Laboratories for breeding and commercial distribution to the scientific community. Inasmuch as each of the parental and congenic strains has a uniform genetic background, they enable an investigator to carry out studies using rats with a validated genomic background. Genomic scans of each of these derived strains and the parental strains are routinely carried out using 200 microsatellite markers by contractual agreement in Dr. Howard Jacob's laboratory to ensure genetic fidelity. The availability of these consomic rat panels will now provide a renewable national resource that investigators can use to understand the impact of allelic variance and interactions with the environment on complex function and disease.
Environmental stressors used in the characterization of each consomic panel include chronic and acute hypoxia, acute hypercapnia, exercise, and high-salt diets. These stressors are used in our PGA program to unmask deficiencies in the normal homeostatic mechanisms and idiopathic mechanisms that contribute to disease. The consomic panels are being characterized using more than 200 phenotypes specific to the heart, lung, kidney, vasculature, and blood function. The results of this work are posted quarterly at http://pga.mcw.edu/. The measured phenotypes were selected to reveal alterations in a diverse set of underlying pathways that control these higher level functions. Nearly 3,700 rats of both genders are being phenotyped annually (15 strains per year) using a variety of parallel high-throughput phenotyping techniques. For example, vascular aortic ring studies assess the effects of vasodilator and constrictor substances and acute hypoxia using a 16 aortic ring bath setup for parallel analysis of eight rats each day. Cardiac dynamics and responses to acute global ischemia of six isolated hearts are assessed in parallel each day. Responses of hearts of rats subjected to 20 days of either 12 or 21% inspired O2 are assessed. In other groups of rats, airway sensitivity, pulmonary vascular mechanics, pulmonary endothelial angiotensin-converting enzyme activity, and differences in pulmonary endothelial redox status are determined in similarly conditioned rats. Patterns of breathing and lung function are assessed in custom-designed plethysmographic chambers in unanesthetized rats subjected to acute hypercapnia and acute hypoxia. Arterial blood pressure is determined using unanesthetized rats with indwelling arterial catheters maintained on either a low (0.4%)- or a high (4%)-salt diet. Maintenance of these rats in metabolic cages enables the collection of 24-h urines for the determination of creatinine clearance and daily excretion of protein to assess renal function. 
Indwelling arterial catheters also enable the assessment of blood gases for the respiratory studies and enable the strains to be assessed for blood glucose, creatinine, bilirubin, cholesterol, total protein, albumin, urea nitrogen, alkaline phosphatase, blood electrolytes, and other blood chemistries.
| |
APPLICATIONS OF CONSOMIC RAT STRAINS |
|---|
|
|
|---|
There are many uses of consomic strains once they have been developed. First, they can be used to validate the functional significance of QTL; second, they can be used for fine mapping of genes influencing physiological traits by rapid development of congenic-inbred strains; and third, they can provide a genetically well-defined control strain for physiological studies.
Validation of functional significance of QTL.
The phenotyping of a number of consomic strains of rats has been
completed and the data publicly released. An example of how one can
validate the results of the genetic linkage analysis is shown in Fig.
5. In this case, substitution of
chromosome 18 from the BN/Mcw strain into the SS/Mcw genomic background
significantly reduced the level of hypertension and proteinuria
achieved in these rats when fed a high (4%)-salt diet compared with
the parental SS/Mcw strain. The importance of the QTL cluster
(Syndrome X) found by linkage analysis in male rats on chromosome 18 (Fig. 2) was therefore validated by the SS.BN18 consomic.
|
Chromosomal localization of physiological traits without genotyping. One of the most useful applications of consomic rat strains is the ability that they provide to identify which chromosomes determine the phenotype of interest. Although substantial genotyping is required in the development of the consomic inbred strains, once the chromosomal substitution is complete, it is necessary only to make careful and well-designed physiological measurements to determine which chromosome(s) is influencing the trait of interest. Unlike a single rat within the F2 generation of a linkage analysis, which is literally a single experiment (n = 1), consomic strains are inbred and replicate biological measurements can be carried out to achieve the level of statistical confidence required to reach firm conclusions. Furthermore, they provide a renewable source of inbred rats for many purposes.
Figure 5 illustrates how the trait of blood pressure salt sensitivity can be localized within the genome. Neither substitution of BN chromosome 9 (SS.BN9) nor 20 (SS.BN20) into the background of the SS/Mcw rat strain reduced the level of hypertension compared with the SS/Mcw strain in response to 3 wk of a high (4%)-salt diet. In contrast, consomic SS.BN13, SS.BN16, and SS.BN18 strains all exhibited significant reductions in blood pressure salt sensitivity. Without even genotyping these rat strains, it is evident that these three chromosomes (13, 16, and 18) contain genes of interest to blood pressure salt sensitivity. These data also illustrate that the effects of weak quantitative loci can be identified in fewer consomic rats than are required with segregating crosses and many other mapping methods, as also emphasized by Nadeau et al. (42). It is also evident that chromosomes were revealed that were not apparent in the linkage analysis. For example, chromosome 18 was the chromosome that appeared to be the one of major interest with respect to the trait of blood pressure salt sensitivity in the F2 intercross study described above. The phenotyping of the consomic strains has revealed other important chromosomes related to this trait (e.g., chromosomes 13 and 16) that were not identified in the F2 cross. The added power of using consomic (and congenic) strains to localize traits can be explained by the well-known limitation of an F2 linkage study whereby each individual rat of the F2 generation has a unique genetic background due to the random recombinations that occur in the population as a large population is generated from the mating of many sibling pairs of the F1 generation. These various chromosomal recombinations within the F2 population result in variations in the genetic background of each of the F2 rats. These genetic heterogeneities can either increase or decrease susceptibility to salt sensitivity and other traits in each individual rat. 
Although positive data obtained from an F2 linkage analysis are highly useful, one cannot conclude from lack of linkage that a chromosome does not contain genes of interest.
Rapid development of congenic-inbred strains from consomics. Congenic strains provide a powerful tool for studying complex diseases, as pioneered by Snell in his Nobel laureate work on the major histocompatibility complex (54). Congenic rats are isogenic with a small region of the genome from one strain introgressed onto the genomic background by repeated backcrossing, in the same manner that selective backcrossing was used to generate consomic strains. The congenic strains are the ultimate goal and the major reason for developing the inbred lines with a complete chromosomal substitution. It is necessary to build a series of congenic strains with overlapping regions within the chromosome so that the narrowest region containing the trait of interest can be localized. A narrow region of a single chromosome is identified that retains a complex trait or the elements of complex disease. Traditionally, these narrowed congenic regions have been used to identify candidate genes for positional cloning. As emphasized below, candidate genes of interest within this region can be identified by gene expression differences using DNA microarray approaches, Northern blots, etc., or by direct sequencing of all genes in a region and looking for causal mutations, all part of the positional cloning process. Like the consomic strains, these congenic strains also represent a renewable resource.
The use of congenic strains has been very limited until this time because development of a congenic line typically requires 10-12 generations of backcrossing, after completion of the crossbreeding studies to define the QTL of interest. The advantage gained in developing a congenic strain using a defined consomic strain relates to the strategic issues of speed and cost. Because the consomic animal is already isogenic, the target chromosome can be rapidly subdivided by developing congenic substrains on the identical genomic background (see Fig. 5). All of the chromosomal alleles except the target chromosome are already homozygous and identical to the original parental strains. This means that the backcrossing and genotyping normally required to achieve homozygosity for each of the individual chromosomes have already been completed in the consomic strains before one begins to narrow the regions of interest on the target chromosome. Indeed, when starting this process using a consomic strain, only two generations of backcrosses are required to develop the narrowed region of interest (6 mo compared with several years if one starts using the parental strains alone) and only one additional intercross is then required to harvest the ~25% of rats of the next generation that are homozygous within this narrowed region or a single intercross if it is large enough.
Consomic and congenic strains provide genetically well-defined
control strains for physiological studies.
An inbred consomic or congenic strain that has been shown to exhibit a
difference in phenotype(s) of interest compared with the parental
strain provides a powerful model system for physiological studies. In
the past, it has been very difficult to select an appropriate control
strain for comparison to an inbred strain that had been selected for a
specific trait of interest such as hypertension. For example, in the
case of the SHR and Dahl salt-sensitive rats, the so-called
normotensive "control" strains were chosen based on the common
ancestral origins to the strain to be studied and largely because the
strain was normotensive. Genome scans of these and other commonly used
"control" strains (see PGA PhysGen web site) have now shown that
these traditionally used "control" strains (WKY, Dahl-R, etc.) have
undergone considerable genetic divergence from the inbred hypertensive
strains. Using nearly 1,541 microsatellite markers, 77% allelic
differences were found to be present between BN/Mcw and SS/Mcw rats,
48% between SHR and WKY, 52% between Sprague-Dawley and SS/Mcw, 57%
between ACI and SS/Mcw, and 30% between Dahl-R and SS/Mcw rats. By
comparison, the SS.BN13 consomic strain exhibits only
1.95%
variation from the parental SS/Mcw strain, as chromosome 13 is 3% of
the genome. It is evident that even more stringent experimental control
strains will be developed and used as the genetic region most
specifically responsible for the trait of interest is narrowed within
the chromosome of interest. However, we have found it of great utility
to first substitute large segments or the entire chromosome to
initially determine whether any effect can be seen within the broad
region before beginning the laborious task of narrowing the region.
|
| |
EXPRESSION OF GENES THAT "TRIGGER THE PHYSIOLOGICAL RESPONSES TO
MAINTAIN THE CONSTANCY OF THE INTERNAL ENVIRONMENT IN FACE OF
DISTURBANCES OF EXTERNAL SURROUNDINGS" (3) HOMEOSTASIS |
|---|
|
|
|---|
Patterns of gene expression with microarrays and mapping of expressed genes to chromosomal sites are beginning to provide new insights and generate new hypotheses for the understanding of complex homeostatic processes. Gene expression determined by microarray approaches represents a powerful tool that can be used for two quite different and important purposes. The first is for screening, where the goal is to uncover suggestive genes and related biochemical pathways involved in a complex trait of interest. Such screens are particularly useful in identifying possible candidate genes of interest within a QTL region of a congenic strain of rats to provide guidance for positional cloning efforts. Screening has been the major use for the large arrays in part because of concerns regarding the reliability of microarray results and because these screens can be carried out using pooled tissue samples to reduce the cost of large array hybridizations. However, microarrays can also be used for a second purpose, as true tools of discovery and not merely for screening. To do this, one must apply the same rigid standards of any biological assay and use an appropriate number of replicate samples together with acceptable statistical methods to identify false positives. In addition, when possible, the results of differentially expressed genes of interest should be validated using other independent methods, such as Northern blot or real-time RT-PCR analysis. Although this approach requires a greater investment of time and resources, the method allows the analysis of expression differences of a remarkable number of genes in parallel.
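One way to control false positives across many genes tested in parallel is a false discovery rate procedure. The Python sketch below implements the Benjamini-Hochberg step-up procedure as an illustration only: the text does not state which statistical method was used in these studies, and the p-values and the 5% level below are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices of the hypotheses rejected by the
    Benjamini-Hochberg step-up procedure at false discovery rate `alpha`."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its threshold.
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical per-gene p-values from replicate arrays (invented numbers).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(p_values, alpha=0.05)
print(significant)
```

With these invented values only the first two genes pass, even though five have p < 0.05 individually, which shows why uncorrected thresholds overstate the number of differentially expressed genes. Genes that pass such a filter would still be candidates for independent validation by Northern blot or real-time RT-PCR, as described above.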
Microarray discovery type studies were recently carried out in our laboratory related to our interest in the role of the renal medulla in sodium and water homeostasis and hypertension. These studies illustrate not only how microarrays, but also other tools of physiological genomics, are converging to help us better understand the regulation of a complex biological system. The search to elucidate the controllers of sodium and water homeostasis has long been the focus of physiologists. A number of important pathways involved in this process have been discovered and laboriously studied over the past half century, such as the renin-angiotensin-aldosterone system and autonomic reflex pathways. It is likely, however, that we have only scratched the surface in understanding the true complexity of these homeostatic pathways as well as the secondary pathophysiological consequences of a high-salt diet.
Our entry point into the cDNA discovery field began by using a custom-stamped cDNA microarray developed in our department containing ~2,000 genes. The genes on this array represented ~80% of all of the rat genes that had been assigned defined names at the time of this study (2001-2002), most of which were genes with some known function. The goal of the study was to identify genes in the renal medulla of SS/Mcw rats stimulated by a high-salt (4%) diet and determine which of these genes would be expressed differently in SS.BN13 rats, the consomic strain that exhibited reduced hypertension, renal interstitial fibrosis, and glomerular sclerosis compared with the SS/Mcw strain. Intermediate sequential responses at 18 h, 3 days, and 2 wk were recently reported by Liang et al. (34). Individual responses of three rats at each time point with duplicate sets of genes on each array were analyzed in this study, with >58 total microarrays being hybridized. A conservative but robust method was used to eliminate false positives, and the validity of the differentially expressed genes was supported by gene resequencing and by Northern blot analysis.