Investigating Gene Function Part 2 – Reporter Constructs

kathleef

9 Investigating Gene Function Part 2 – Reporter Constructs

Introduction

Reporter constructs are another way to approach the question of the expression pattern of a gene. In this approach we use the presumed regulatory sequences of a gene to “drive” expression of a reporter gene. A reporter gene encodes a protein that we can visualize relatively easily in an organism to view which cells are expressing that protein. Since the regulatory sequences contain all the instructions about when and where the gene is supposed to be expressed, the reporter construct provides a readout of the information of the regulatory sequences we have cloned into the construct.

It is essential to ensure that the reporter construct we make and introduce into the organism is present in every cell. There are ways of doing so and we’ll cover them in a later section, toward the end of the semester. For now, assume that every cell in the organism contains the reporter construct. If it does not, how could you tell if the lack of reporter protein in a cell was because the gene was not transcribed? It might just be that the cell did not contain the construct.

Learning Outcomes

Explain the difference in information we get from transcriptional and translational fusion constructs
Describe the steps of making a transcriptional fusion construct
Compare and contrast the features of Lac Z, GUS, GFP as reporter genes
Define the characteristics that make a good reporter gene
Describe the steps of making a translational fusion construct
List and explain the key concerns with translational fusion constructs
Practice designing a transcriptional and translational fusion construct for your assigned gene (Bioinformatics assignment #5)

A. Transcriptional Fusions:

A-1. What directs gene expression?

What directs when and where the gene is expressed? It is the regulatory sequences which are usually upstream of the gene (5′ of the coding strand). This is where transcription factors, activators and suppressors bind to direct transcription of the gene. Some regulatory sequences are occasionally found inside the gene, usually in an intron, and even downstream of the gene (to the 3′ end of the coding strand). Remember in the previous section that we talked about other types of control of gene expression, but we will begin with this one: the control of transcription.

In order to determine when and where a gene is expressed (or to confirm what we’ve found using in situ hybridization), we need to test the regulatory sequences. These sequences contain the information about which cells the gene will be transcribed in and at what stage. So we make a reporter construct; one which will give us a visual read out of where the gene is expressed. The read out comes from attaching a reporter gene to the regulatory sequence of our GOI. This reporter gene can be easily visualized either as a fluorescent protein in the living organism, or through fixing and staining the tissue for the presence of the protein. We don’t usually know all of the regulatory sequences for our gene, so we often begin by testing a large section of upstream (5′ to the sense strand) sequence – up to 5 kb.

Once we determine the expression pattern of the gene, or confirm the expression pattern seen with in situ hybridization, we can begin to test subsets of the regulatory sequences to define the smallest region we can that gives a wild type expression pattern for the gene of interest. This is how upstream enhancer sequences can be identified.

A-2. Reporter genes

Reporter genes are genes that encode proteins that can be assayed relatively easily to see when and where a gene is being expressed, and if you make a translational fusion (section B), where the protein encoded by the gene of interest is located inside the cell. We’ve already talked about one type of reporter gene in Chapter 6: lacZ. The lacZ gene encodes the β-galactosidase enzyme, which interacts with the X-gal substrate (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside) to produce a blue or purplish colour. The lacZ gene is used as a reporter gene in animal tissues and the GUS (uidA) gene in plants encodes an enzyme called β-glucuronidase. It interacts with X-gluc (5-bromo-4-chloro-3-indolyl-β-D-glucuronic acid) and produces a blue or purplish stain. Note that it is the “X” part of these molecules, the “5-bromo-4-chloro-3-indolyl” part that is responsible for the blue colour. The GUS assay will not work in vertebrates or molluscs, as these organisms have GUS activity naturally and you would not be able to distinguish between the activity of the reporter gene and the activity that the organism naturally possesses.

These two reporter genes require histological staining. Tissues must be fixed and stained to see the expression pattern. If you want to follow the changes in gene expression over time, you cannot follow it in living organisms over a long period of time; you must fix and stain many different stages of the organism and make a sort of time line of expression from multiple individuals each of which represents a different stage of development.

Some reporter genes express fluorescent proteins that do not require fixing and staining of tissues but instead fluoresce in the living organisms. You can observe the changes in expression over time in individuals or in cultured cells when these are used. Green fluorescent protein (GFP) is found in the jellyfish Aquorea victoria. GFP fluoresces green (a slightly yellowish green) upon exposure to light in the UVA range (395 nm). Many derivatives of GFP have been developed that fluoresce different colours. This was done through site directed mutagenesis. Enhanced GFP, eGFP, fluoresces more brightly and in a slightly more greenish shade. It was also developed by site directed mutagenesis.

There are other reporter genes that you may learn about but these will be sufficient for this course. If you are interested, you can look up Ds-Red and luciferase which are also commonly used reporters.

A key point is that the reporter gene cannot be a gene that is normally found in the organism- it must come from a different organism. This is why we cannot use GUS staining in vertebrates. The product of the reporter gene- the protein it encodes – also needs to stay in the cell it is transcribed and translated in. If it diffuses from cell to cell we won’t get a clear and accurate read out of the gene’s expression pattern.

In addition, many tissues have autofluorescence; molecules that also emit light when stimulated by blue light, or UV or other wavelengths of light. It is usually possible to distinguish autofluorescence from GFP or some of the other derivatives of GFP but it is something to be aware of. Sometimes staining with GUS or lacZ may be a better approach.

Here is part of my lecture on the topic (Click here for the powerpoint slides presented in the video below):

A-3. Transcriptional fusion constructs

Transcriptional fusions are relatively straightforward to make. The key is selecting the reporter, and generating the regulatory sequence you want to test.

Many vectors are available that already have the reporter gene most often lac Z or GFP. They also have a terminator, so the only sequence missing is a promoter. The cloning site is immediately upstream of the GFP or lac Z gene in the plasmid. So you PCR amplify the presumed regulatory sequence of your gene and clone that sequence into the vector.

In most genes, regulatory sequences are found upstream of the coding region. However, in some genes, regulatory elements (aka cis elements) may also be located in introns (see translational fusions, below), and in a very few cases, downstream of the transcription unit. We won’t cover situations like this; we will focus on the majority of regulatory sequence being upstream with the possibility that some may be inside the gene itself. We don’t usually know where exactly all the relevant regulatory sequences are for the gene we are studying.

A simple approach that can be taken is to use PCR to amplify a large segment of DNA upstream of (5′ to) the start codon and up to about 4 kilobases (kb) further upstream. This fragment will contain the transcription start site, the core and proximal promoter elements binding all the transcription factors for basal level of transcription. Most of the time it will also contain most or all enhancer and suppressor elements that are required for cell, tissue, organ, developmental stage or other specific patterns of gene expression. We generally don’t include neighbouring genes in this amplicon so if another gene is nearby, we usually just amplify the sequence between the two genes. This 4 kb (or smaller) fragment is then cloned into the plasmid with a transcription unit for GFP, lac Z (animals) or GUS (plants). Since the promoter should generate a transcript of the reporter gene, this type of fusion is known as a transcriptional fusion. For many genes this will give complete expression information. But the normal protein product of the gene itself won’t be made because the coding sequence is not included in the sequence that we amplify. So we will see the reporter gene’s protein product in the cytoplasm of the cells in which it is expressed.

B. Translational Fusions:

Translational fusions are more work to make and we have to be careful about stop codons and reading frame (see below). But we can get additional information with a translational fusion – which leads to production of a fusion protein – the protein of our gene of interest fused to the reporter protein. If we make the full length protein and tag it with GFP or another protein, we can see the subcellular location of the protein. Knowing this information can help us make reasonable guesses about what the protein does. If it localizes to the mitochondrial membrane we would guess it might be related to ATP production or programmed cell death perhaps. If it is found on the cell surface it could play a role in cell adhesion, signalling, or transport between cells.

If there are regulatory elements inside the gene that affect the overall expression pattern, we can find this out using the translational fusion – we might see the gene expressed in more or fewer tissues or stages of the organism. And there is additional regulatory information we might find out. Perhaps the gene is expressed – transcribed- but then microRNAs or regulatory proteins prevent translation of the mRNA. We would be able to find this out with a combination of transcriptional fusion (which tells us the gene is transcribed – we would see the reporter protein in the cytoplasm) and translational fusion (though the gene is transcribed, the RNA is not translated so we see little or no protein in the cell). Recall the section on gene expression in Chapter 7A. We may find that the patterns of production of the RNA for the gene and production of the protein of that gene are not the same.

The construct for a translational fusion includes all the same regulatory sequence from the transcriptional fusion but also the transcription unit of the GOI. Because of this, we must PCR amplify a very large amplicon. Special PCR kits are available for making large PCR products. It can be difficult to clone very large inserts. Sometimes we have to assemble the construct in pieces step by step, to make the handling easier. This is where alternative cloning methods like USER, gateway and SLIC can be used effectively. We cover these methods in several later chapters. We can use the same vectors for making translational fusions as we did transcriptional, keeping in mind that there are some extra considerations in the cloning itself (see below- reading frame and stop codons).

When we have made our translational fusion and tested the expression and localization of the protein, there are some additional things to consider. Suppose adding a bunch of amino acids (of the GFP protein) to the C-terminus of the protein we are studying causes it to fold incorrectly and it therefore does not go to its normal location nor does it perform the function it is supposed to perform. We might see the tag on the protein in the cytoplasm and draw the wrong conclusion about where the protein should be in the cell.

To address this we perform a rescue experiment. The details are outside the range of this course but you will know about rescue if you took 302W and/or other MBB courses. In short, if we want to find out if the protein is doing its job we delete the normal gene in the organism so that the only copy the organism has is the tagged copy. If the organism is wild type we know that the protein fused with GFP (the tagged copy) has done its job correctly and we can feel more confident that the expression pattern we are seeing is the correct one.

We don’t HAVE to put the reporter gene at the 3′ end of our gene of interest and thus the C-terminus of the protein. We can put it at the 5′ end of the gene and thus the N-terminus of the protein. If the active site of the protein is closer to the C-terminus it would make more sense to add the reporter at the other end. We could also clone it into the middle of the gene, making sure not to put it into an intron. If we did that it would likely be spliced out after transcription and there would be no tag on our protein. There are different vectors available with cloning sites either before or after the reporter gene. If we want to clone the reporter inside the gene of interest, we generally have to make the GOI-reporter fusion first and then clone the whole thing into the plasmid.

There are two key considerations beyond where to add the reporter gene. The first is reading frame. In your first two bioinformatics assignments a strong focus was finding the reading frame of the gene you were assigned. That was partly in preparation for this. If you are combining two genes in the hopes of making a fusion protein they must be fused in-frame. That means that the reading frame of the first protein leads straight into the reading frame of the second. If the frame is shifted by one or two nucleotides, then during translation the second protein will not be made. Whatever amino acids are added after the first protein will be incorrect and quite soon the ribosomes will encounter a stop codon. And the second key point is that the stop codon between the two genes must be removed. Otherwise the fused genes will be transcribed but translation will stop at the end of the first gene. Ribosomes don’t know what protein you are intending to make; they just translate whatever they encounter. So we have to be 100% correct with the instructions we provide them. Below is a sketch that shows two genes to be fused together. Only the first few and last few codons are shown for each of them. Whether we use restriction enzymes or alternative methods, the aim is to produce the fusion sequence that is shown below them. The stop codon of the first gene has been eliminated and the reading frame of one gene matches the frame of the second. All the correct amino acids will be produced. Imagine if there is just on extra A between the two genes. Figure out which amino acids will be made starting at that A. You can see that the reading frame will be shifted and the amino acids will all be wrong.

To be sure that you clone your gene into the vector in the correct frame you can buy kits that contain 3 versions of the vector. You cut the vector with the restriction enzyme you are going to use and each version of the vector is in a different reading frame. So one of these is going to be the correct reading frame for your intended fusion construct. Once you have figured out reading frame for a construct, you get a better feel for it and it isn’t not so awful. The first time, it is quite difficult but it gets easier with practice.

Below is the second part of my lecture on this topic (click here for the powerpoint slides presented in the video):

There will be lots of practice in planning the design of translational fusions this semester. You will have several assignments, worksheets etc. that will allow you to get more familiar with this kind of design.

Once we know where our gene is normally expressed, we can start to probe more deeply into its function by either misexpressing it or by reducing or eliminating its expression. Each of these manipulations may provide information about the potential function based on the phenotype we see resulting from the change we made in the gene’s expression. That is the subject of Chapter 9.

Previous (Chapter 7 ) Next (Chapter 9)

License

Icon for the Creative Commons Attribution-NonCommercial 4.0 International License