Chapter 7: Investigating Gene Function – Gene Expression Pattern
in situ hybridization and reporter constructs
Introduction
Starting with this chapter, we begin to look not just at techniques, but at their applications and the types of biological questions we are trying to answer when we design a genetically engineered construct. In Chapter 7 we’ll look at RNA in situ hybridization, a technique that allows us to visualize the expression pattern of a gene. We’ll begin by briefly discussing what expression pattern means and then go over the technique of in situ hybridization. The genetic engineering focus is on making the construct you will use to transcribe the labeled RNA probe. In Chapter 8 we’ll talk about making reporter constructs that give us similar information. Both are valuable approaches.
If you find the discussion of gene expression confusing, please review transcription and translation to help keep the ideas fresh in your mind.
Contents
Learning Outcomes
A. Gene expression
A-1. Transcription
A-2. Between transcription and translation
A-3. Translation and after
B. When and where is the gene expressed?
B1-i. Making a labeled probe
B1-ii. Preparation of the tissue
B1-iii. Hybridization
B1-iv. Detection
Learning Outcomes
- Define and describe what we mean by gene expression
- Outline the various levels of control of gene expression
- Explain the usefulness of knowing the expression pattern of a gene
- Describe the use of in situ hybridization to elucidate the expression pattern (when and where the gene is expressed) of a particular gene
- Explain why we need to know the orientation of the insert in a plasmid to make the desired RNA
- Given an image of a polylinker, determine how to clone the insert into the vector in the desired orientation for a particular application
- Given an image of a polylinker and insert, select the RNA polymerase to use to make either the sense or antisense RNA
A. Gene expression
When we talk about gene expression we are really talking about all aspects of the process of transcription and translation – and beyond. There are many levels at which a cell can control which functional proteins (and functional RNAs, too, but we will focus on protein coding genes in this course) are being made.
What follows is a very simple version of the levels of control of gene expression. Think about how this information intersects with the content of your other courses, especially previous or concurrent genetics and molecular biology courses. If you have taken or are taking Developmental Biology, you will find many examples of gene regulation at all of the following levels. Connecting your learning in one course to your learning in other courses is the most efficient way to learn.
A-1. Transcription
Gene expression begins with transcription. The core promoter needs to be available for the binding of RNA polymerase and the general transcription factors. The core promoter is the TATA box and the promoter proximal elements. It has to be in an open conformation so that the transcription factors can bind to it. You will learn about chromatin and chromatin structure in other courses. The regulatory sequences that are commonly found upstream from a gene can bind transcriptional activators, proteins that promote transcription, or repressors, proteins that reduce or prevent transcription. The bound proteins, either activators or repressors, work together with the general transcription factors to greatly increase transcription, or to reduce or prevent it. Sometimes there is even a competition between activators and repressors, and whichever are more abundant “win”. If the gene has been activated, transcription will proceed and an mRNA will be made.
A-2. Between transcription and translation
Transcripts must be processed and modified before export from the nucleus (we are focussing on eukaryotic genes in this section). Transcripts that remain in the nucleus cannot be translated. Processing includes:
- Adding a 5′ modified G nucleotide as a cap
- Adding a poly A tail to the 3′ end of the message
- Splicing out introns (non-coding sequences) to ensure that the message encodes the correct amino acid sequence for the protein to be produced
The poly A tail and 5′ cap help to protect the message from degradation and assist with export from the nucleus.
In particular, the poly A tail is associated with message stability in the cytoplasm – a very short tail is soon degraded by nucleases and the message may be translated only a few times before the nucleases begin degrading it. A very long poly A tail takes a longer time to degrade and thus the message may be translated many times before the nucleases start degrading the actual coding sequence. So the length of the poly A tail in part regulates the amount of protein made.
Some mRNAs are released from the nucleus but are then sequestered in a location so that they are only translated in part of the organism or at a particular time. Some of you will already know about this from your Developmental Biology classes; this is one way that timing of anterior-posterior specification is regulated in early fly embryos. And in the same system there are cytoplasmic proteins that can bind to certain mRNAs and prevent their translation in some regions of an organism.
Correct splicing is important: if a message is not completely spliced, there may be splicing complexes (of RNA and proteins) that remain attached to the transcript and prevent it from moving out of the nucleus. This is not a common way of controlling gene expression levels but has been reported in the literature.
In some cases a process called nonsense-mediated decay can recognize and degrade mRNA inside the nucleus that has retained an intron or lost an exon and is therefore frame-shifted. If the reading frame of a gene shifts, most likely a stop codon will occur soon after the point where the reading frame changed. If a stop codon ahead of the real stop codon of the gene (a premature stop codon) is detected, the message will be degraded before release from the nucleus.
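To see why a frameshift usually brings a stop codon into frame early, it can help to scan a sequence codon by codon. The short Python sketch below is only an illustration: the coding sequence is made up, and the function name first_stop_codon is hypothetical, not part of any library. As a rough guide, in random sequence with equal base frequencies 3 of the 64 possible codons are stops, so a shifted frame would be expected to reach one about every 21 codons on average.

```python
# Illustrative sketch only: the sequence below is invented for this example.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(cds: str):
    """Return the 0-based codon index of the first in-frame stop codon.

    The sequence is read in frame from its first base (the A of ATG).
    Returns None if no in-frame stop codon is found.
    """
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Hypothetical coding sequence: ATG GAT GAC CTA GCA TAA
normal = "ATGGATGACCTAGCATAA"

# Simulate a frameshift by deleting one base (the G that begins the second codon).
# The reading frame after the deletion becomes: ATG ATG ACC TAG ...
shifted = normal[:3] + normal[4:]

normal_stop = first_stop_codon(normal)
shifted_stop = first_stop_codon(shifted)

print("Stop codon position, normal message:  codon", normal_stop)
print("Stop codon position, shifted message: codon", shifted_stop)
```

In this made-up example the real stop codon is the sixth codon, but after the one-base deletion a TAG appears in frame as the fourth codon, which is the kind of premature stop codon described above.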
There are other means of control of gene expression that are post-transcriptional, but pre-translational.
The production of double-stranded RNA (ds-RNA), described in the next section, is technically speaking a pre-translation mechanism.
A-3. Translation and after
Some genes are regulated by a process in which small non-coding RNAs called microRNAs (miRNAs) are transcribed. These are complementary to part of the mRNA of the genes they control, and the miRNAs bind to that mRNA. This produces a double-stranded region of RNA in the transcript, which cannot be translated while it is double-stranded, and which also triggers a mechanism to clear the ds-RNA from the cell (a protection against RNA viruses, discussed in more detail in Chapter 9). The double-stranded RNA is only produced during certain stages; at other times the gene is expressed. miRNAs sometimes affect gene regulation in other ways too; you will learn more about this type of gene regulation mechanism in other courses, particularly developmental biology.
Beyond translation, there are other ways the production of a functional protein can be controlled. Proteins must be correctly folded to be active. Some proteins are produced as a proenzyme that must be cleaved to become active and others may need to be modified by adding or removing a molecule to the protein. Two examples are glycosylation/deglycosylation (sugar molecule) and phosphorylation/dephosphorylation (phosphate molecule) and there are quite a few others. Proteins may also only function when in the correct sub-cellular location. An active transcription factor can only perform its gene regulation function if it is in the nucleus, for instance, and so transport of the protein to the correct location is a necessary step in gene expression. A cell surface protein cannot perform its function until it has been transported to and localized in the cell membrane.
Although there are many levels of gene regulation, much of the regulation occurs at the level of transcription. In situ hybridization, discussed next, detects the presence of mRNA in the cytoplasm of cells and so tells us if the gene is being transcribed, but cannot tell us whether a functional protein is produced in the cell or not. This is an important caveat to keep in mind.
B. When and where is the gene expressed?
Simply having a nearly fully sequenced genome does not tell us the functions and interactions of the genes. Understanding the time and place that a gene is expressed can give some clues to its function. Genes expressed in all cells at all times are likely to be “housekeeping genes”. Genes such as actin, tubulin, ribosomal proteins, and many others are expressed at high levels in most or all cells. The products of these genes are involved in processes common to all cells. Genes expressed only in a precise life stage of an organism, such as the early embryo, are likely to be involved in some process that is unique to that stage, such as the establishment of the basic body plan in the case of animals. If a gene is expressed only in the photosynthetic cells of a plant, we might guess that its function is likely to relate to photosynthesis. We may not be correct about that, but it is a starting point for investigating the function of the gene. If we want to knock out a gene to assess its role in an organism, knowing the expression pattern tells us where to look for the phenotype – the effect of reducing or eliminating the gene’s function. If a gene is expressed primarily in the larval stages of a fly, then this is the stage we examine to see the effect of eliminating the function of the gene.
B-1. In situ hybridization
A direct way to determine when and where a gene is expressed is to detect the mRNA in the cells of the organism by in situ hybridization. The term in situ refers to the fact that the gene’s mRNA is being detected in its natural place in the organism. A labeled RNA probe is made that is complementary to the mRNA of the gene we are studying; it is hybridized to fixed tissue, either sections or whole mount, and then it is detected in various ways. My experience of in situ hybridization is that it is a long procedure with a lot of steps and you have to be very careful throughout. But seeing the results of your week of work is very satisfying. There are many different protocols available for doing in situ hybridization, so I’m not focussed on you memorizing the details of one or two such procedures. Instead, we’ll look briefly at the processes common to all procedures. All include the making of a labeled probe by in vitro transcription, the preparation of the tissue, the hybridization of the probe to the tissue, and then the detection of the probe. This course is focused on the genetic engineering aspect of this technique, so pay particular attention to the orientation of the insert in the plasmid and the selection of the polymerase to use for making the probe.
B1-i. Making a labeled probe
The RNA probe we make is meant to detect the mRNA that is being produced in the cell. The mRNA is called the sense RNA because it contains the codons to make the protein product of the gene. Therefore we want to transcribe the other strand when we are making the probe: the antisense strand. (Review the terms: coding, non-coding, sense, antisense and template strands as they apply to DNA and sense and antisense as they apply to RNA if you are confused by the terms). The probe is made by in vitro transcription of a plasmid that contains the gene we are interested in studying. Some plasmids have just one promoter, to one side of the polylinker. Others have different promoters on each side of the polylinker. In both cases we must know the orientation of our insert in the plasmid in order to be sure we produce the antisense RNA as a probe. The RNA polymerases most often used are T3, T7 and SP6 RNA polymerases, all derived from bacteriophages of the same name. In the image below, the polylinker of a plasmid is shown with the DNA insert and the start and stop codons in bold. The gene is way too short – but assume that there are many more Ns in the real gene. To either side of the insert is a promoter (the rest of the plasmid is there too, but only this part of the plasmid is shown).
In this orientation, the sense strand of the DNA is the top strand and the antisense is the bottom strand. You can tell the top strand is the sense strand because the start (ATG) and stop (TAA or TAG or TGA) codons are shown in bold. The T3 RNA polymerase binds to the bottom strand of the DNA to the left of the insert, reads the bottom strand and makes an RNA that is complementary to it. Is it the sense or the antisense RNA that is produced? The T7 RNA polymerase recognizes and binds to the upper strand of the DNA and makes an RNA that is complementary to the upper strand. Is it the sense or the antisense RNA that is made in this case? Note that the RNA polymerases will move along the strand they are copying from 3′ to 5′ and will make an RNA that is made 5′ to 3′. Draw it out for clarification.
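If drawing it out is not enough, the same logic can be checked with a few lines of Python. This is only a sketch under stated assumptions: the insert sequence is invented (a real gene is far longer), the layout is the one described above (T3 promoter to the left of the insert, T7 to the right, sense strand on top), and the function names are hypothetical. Note that it works out the answers to the two questions asked above, so try them yourself first.

```python
# Illustrative sketch: a made-up insert, written as the top strand 5'->3'.
# It starts with ATG and ends with the stop codon TAA, so the top strand is the sense strand.
TOP_STRAND = "ATGGCCAAGCTGTAA"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, also written 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]


def transcribe_from_template(template_5to3: str) -> str:
    """Return the RNA (5'->3') made by copying the given template strand.

    The polymerase moves along the template 3'->5' and builds the RNA 5'->3',
    so the RNA is the reverse complement of the template, with U in place of T.
    """
    return reverse_complement(template_5to3).replace("T", "U")


def classify(rna: str) -> str:
    """Label an RNA as sense (same sequence as the mRNA) or antisense."""
    mrna = TOP_STRAND.replace("T", "U")
    return "sense" if rna == mrna else "antisense"


bottom_strand = reverse_complement(TOP_STRAND)

# T3 (left of the insert) uses the bottom strand as its template.
t3_rna = transcribe_from_template(bottom_strand)
# T7 (right of the insert) uses the top strand as its template.
t7_rna = transcribe_from_template(TOP_STRAND)

print("T3 product:", t3_rna, "->", classify(t3_rna))
print("T7 product:", t7_rna, "->", classify(t7_rna))
```

With the insert in this orientation, the RNA copied from the bottom strand has the same sequence as the mRNA, and the RNA copied from the top strand is complementary to the mRNA, which is what a probe needs to be. If the insert were cloned in the opposite orientation, the roles of the two polymerases would swap.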
It is usual to linearize the plasmid before doing the in vitro transcription because some of the RNA polymerases are “promiscuous” and can bind to the promoter for other polymerases. This is the term we use, but the reality is likely just that the promoters for the polymerases are all quite similar in sequence and so can be recognized by multiple polymerases. In any case, we don’t want to make both strands of RNA. If we did, we would make double-stranded RNA, which would not find and bind the mRNA in the tissue we are studying. Also, if we have no transcriptional terminator in our plasmid (some don’t), then the polymerase won’t necessarily stop at the end of the gene sequence and we could wind up with a probe, a lot of which is actually plasmid sequence instead of the gene. To linearize the plasmid we cut it with a restriction enzyme that cuts once in the polylinker at the end of the gene. So for example, if we want to use the T3 polymerase to make RNA (in the picture above), we need to linearize the plasmid by cutting it on the T7 side. Then the T3 polymerase transcribes to the end of the gene and cannot transcribe further. To use the T7 polymerase, we cut the polylinker on the T3 side of the insert.
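The choice of polymerase and the choice of where to linearize go together, and the rule can be written down as a small decision function. The Python sketch below is a hedged illustration, not a protocol: it assumes a plasmid with one promoter on each side of the polylinker (T3 on the left and T7 on the right by default, as in the example above), and the names ProbePlan and plan_probe are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbePlan:
    polymerase: str     # RNA polymerase to use in the in vitro transcription
    linearize_on: str   # side of the insert where the plasmid should be cut


def plan_probe(sense_on_top: bool, want: str = "antisense",
               left: str = "T3", right: str = "T7") -> ProbePlan:
    """Pick the polymerase and the linearization side for the desired RNA.

    Assumes the left promoter transcribes rightward through the insert,
    producing RNA with the same sequence as the top strand, and the right
    promoter transcribes leftward, producing the complementary RNA.
    """
    if want not in ("sense", "antisense"):
        raise ValueError("want must be 'sense' or 'antisense'")
    left_makes = "sense" if sense_on_top else "antisense"
    if want == left_makes:
        # Transcribe from the left promoter, so cut at the far (right) end.
        return ProbePlan(polymerase=left, linearize_on=f"{right} side")
    # Otherwise transcribe from the right promoter and cut at the left end.
    return ProbePlan(polymerase=right, linearize_on=f"{left} side")


# Insert cloned with the sense strand on top; we want the antisense probe.
print(plan_probe(sense_on_top=True))
# Same plasmid, but making sense RNA instead.
print(plan_probe(sense_on_top=True, want="sense"))
```

The cut is always on the side opposite the promoter being used, so that the polymerase runs off the end of the template at the end of the insert. This is why the orientation of the insert has to be known before the probe is made.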
The linearized plasmid, the appropriate buffers and a ribonucleotide mix are all part of the in vitro transcription reaction. There are equal numbers of all four ribonucleotides, and a small proportion of one of these – C for instance – is labeled. The label is either radioactive or a fluorescent label (called a fluorophore) or another molecule that can be detected with antibodies later in the procedure (see detection, below). Radioactive labelling is not done as often as it used to be. We use a very small proportion of labeled nucleotides because the fluorophores or other molecules that are attached to them are large and bulky and could impede the synthesis of the probe if there are too many labeled nucleotides close to each other.
RNase inhibitors are used in the production of the labelled RNA to ensure that the RNA of the probe is not degraded. Below is a video recording of my presentation on in situ hybridization. The PowerPoint slides presented in the video are also available.
B1-ii. Preparation of the tissue
RNA in situ hybridization is not done on live tissue: the tissue must be fixed and then permeabilized. Fixation is a process which toughens the tissue somewhat because there is a lot of handling in the hybridization procedures, and the tissues need to be protected from possible damage due to handling. Fixation makes crosslinks between the macromolecules (proteins, lipids, nucleic acids) in the tissue. If proteins are bound to a membrane they stay there, and if RNA is in the cytoplasm of a cell, it stays there too – it is not able to diffuse away. This is important because if the treatment of the tissue caused components such as the RNA of each cell to leak out and diffuse into other cells, we could never be sure which cell produced that RNA. Formaldehyde and paraformaldehyde are common components of fixation solutions. A buffer is used to keep pH and salt levels in the normal range.
If we only fixed the tissues and then did our hybridization, the probe would never be able to penetrate the hardened tissues. So after fixation we permeabilize the tissue – disrupt cell membranes – so that the probe can get inside the cells. In fly work, which I am most familiar with, we used protease and detergent treatment. This was carefully timed to disrupt the cell membranes without damaging the tissue too much. In the fly lab, we re-fixed the tissue after permeabilization. Now the cell membranes are “open” but the tissue is hardened again so that it can withstand all the handling during the long in situ process.
If we are preparing whole mounts, that is, intact organisms or structures (whole embryos most commonly), it can be tricky to strike the right balance between fixation and permeabilization. The latter is usually done more vigorously than for sectioned tissues. Sectioned tissues are fixed too, and then thin slices are taken of the structure and mounted on slides. Because the slices are very thin, the permeabilization and re-fixation steps are shorter and less intense. And we are usually more confident that our probe reached all parts of a sectioned sample. When you have a whole mount structure or embryo, the tissues are much thicker and if you don’t see hybridization in a region, you may wonder if the probe was just unable to reach that region or if the gene is really not expressed there.
B1-iii. Hybridization
The hybridization procedure requires an incubation of the fixed tissues with the probe. The hybridization solution doesn’t usually need RNase inhibitors because the hybridization buffer doesn’t really favour RNase activity; however, if RNase contamination has been a problem, RNase inhibitors can be added. There are lots of different protocols, and the details are not important here – you will learn about this if you work in a lab where this is done, and the procedure is likely to be quite different from the ones I know. The important point is just that the hybridization buffers promote the formation of RNA:RNA duplexes. These are more stable than DNA duplexes. The hybridization solution contains an appropriate buffer, such as saline sodium citrate (SSC), and it will also contain formamide, or a similar substance, which promotes stringency of base pairing. This ensures that the probe only binds to the mRNA it is complementary to. (That is what stringency means: it refers to how accurately the probe binds the target sequences. We only want it to bind the correct sequence!) We include bovine serum albumin as a “blocking” agent. This prevents non-specific binding of the probe, especially to proteins that have nucleic acid binding sites on them. We also sometimes add a large amount of sheared (broken randomly into small pieces) salmon sperm DNA or herring testis DNA, also to reduce non-specific binding of the probe. The blocking substances are added before the probe. They bind to non-specific targets so that these targets are not available to the probe.
Each procedure is optimized for a particular tissue and organism to promote the probe only binding the intended mRNA and to minimize the likelihood of background signal, which can obscure the expression pattern of the gene.
After hybridization there are many wash steps and these remove excess probe, and wash away components of the hybridization solution. The probe is detected through autoradiography, fluorescence or antibodies.
B1-iv. Detection
If your probe was labeled with 32P or 33P, autoradiography is used to detect the probe in your specimen. We place a piece of X-ray film over the sample, and as the isotope decays it emits particles which leave a dark spot on the film. After sufficient exposure, we develop the film and determine where in the tissue the signal was found. This is very rarely done now.
If the probe was labeled directly with a fluorophore, the specimen can be examined using fluorescence microscopy to see the pattern of expression. This is more commonly done, as is the procedure below.
Some labels are not themselves directly visible. Digoxigenin (DIG) is a molecule produced by the foxglove plant. If the probe has been labeled with DIG, we can wash away excess probe and then incubate the specimen with an antibody to DIG. The antibody is conjugated with (attached to) the enzyme alkaline phosphatase. After the tissue is incubated with the antibody, a series of wash steps removes all the unbound antibody. When we then add a staining solution to the tissue, the alkaline phosphatase enzyme interacts with a component of the staining solution to make a blue stain.
Some antibodies are also conjugated to fluorophores of different colours. This allows the detection of multiple RNAs in a sample at once. If three different probes are made to three different genes, each with its own label (for example, DIG for one, fluorescein for the second, and dinitrophenol (DNP) for the third), you can then incubate with anti-DIG antibodies carrying one fluorophore (red), anti-FLUOR antibodies carrying a different fluorophore (blue), and anti-DNP antibodies carrying yet another fluorophore (green). And you can get spectacular fluorescence images that show multiple RNA expression patterns at the same time in the same specimen. It is not quite that straightforward: each label is separately detected and there are many additional steps. You only need to understand the logic of the detection: how we make the probes, and then how we detect them.
If you are interested, this video shows a Harvard grad student and her supervisor doing an in situ experiment on developing flowers. It doesn’t show a lot of details but does give a sense of why they are doing the hybridization, how it works in general, and it also gives a sense of the “ridiculous number” of washes in various solutions that take place.
This is the second recording describing the in situ hybridization procedure. The PowerPoint slides presented in the video are also available.
Once you have done the in situ hybridization, you might be unsure of the results. Is the pattern you see the real expression pattern or has the probe been unable to reach some parts of the sample? It is worthwhile to repeat the procedure multiple times to ensure the pattern is the same each time and to try to get the best images possible if the work is to be published.
But how can we ensure that we can see the full pattern of expression of the gene? We can use the regulatory region of that gene in a construct that includes a reporter gene (one whose product is easily visualized) to test whether the expression we see in an organism containing this construct is the same as the in situ results. If you have made a transgenic organism, ALL cells will have your construct so you can tell which cells are transcribing the construct. Once you have this working well, you can also use the technique to determine which sequences in the upstream region of a gene are responsible for the observed expression pattern. This is the topic of Chapter 8.