Chapter 17: Introducing DNA into Eukaryotic Cells

Introduction

Here we will briefly cover how we get DNA, in the form of our engineered constructs, into eukaryotic cells. We’ve already talked about transformation of bacteria in a previous chapter. Some of the same techniques can be used in eukaryotic cells, but there are additional methods too.

There is some useful terminology to review here:

Transformation = plasmid into bacteria

If the vector is a virus, it is called transduction. If we introduce the construct into mammalian cells by non-viral means, it is called transfection (though the term is often used loosely for viral delivery into mammalian cells as well). Go figure.

We will start with a short overview of expressing proteins in bacteria: the advantages and disadvantages, and some strategies for isolating the protein. For some types of protein, eukaryotic expression systems (mammalian cells, insect cells, yeast, etc.) are necessary. That leads us into the introduction of DNA into eukaryotic cells.

 


Learning Outcomes

  • Explain the advantages of using bacteria to express a protein
  • Explain the issues with eukaryotic protein expression in bacteria, and the approaches that help
  • Give examples of promoters used in expression vectors
  • Describe the methods of introducing DNA into eukaryotic cells
  • Distinguish between screening and selection
  • Distinguish between the four types of viral vectors

A. Expressing proteins in bacteria:

A-1. Pros and cons of bacterial expression

It is very cost efficient to make a construct that allows the expression of a protein in bacteria. They are not expensive to grow in large amounts, and the constructs are not difficult to make. We know quite a lot about bacteria and thus they are fairly easy to work with.

If you want to express a human or other eukaryotic protein in bacteria, you must clone the cDNA, not the genomic sequence, into the construct. Bacterial genes generally lack introns, and bacteria have no machinery for splicing eukaryotic introns. Consider what happens when an intron is left in the sequence introduced into bacteria, and the DNA is then transcribed and translated.
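To make this thought experiment concrete, here is a minimal Python sketch. The exon and intron sequences are invented for illustration (they are not from any real gene), and the `translate` helper is our own: it reads codons from the first base until it meets a stop codon, which is roughly what would happen to a message that is never spliced.

```python
from itertools import product

# Standard genetic code, built from the usual TCAG ordering of the codon table.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy gene: two short exons separated by one small intron.
exon1 = "ATGGCTGAA"
intron = "GTAAGTTAATAG"   # starts with GT and ends with AG, like many real introns
exon2 = "CGTTGGAAATAA"

cdna = exon1 + exon2              # what a spliced mRNA would give us
genomic = exon1 + intron + exon2  # what bacteria would see without splicing

print(translate(cdna))     # MAERWK
print(translate(genomic))  # MAEVS
```

With these toy sequences the unspliced version picks up two intron-encoded amino acids and then hits a stop codon inside the intron, so the protein is truncated. A real intron could instead shift the reading frame; either way the product is not the protein you wanted.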

There are some other challenges as well.

Codon usage: Most amino acids can be specified by at least two different codons, and three of them (arginine, leucine and serine) by 6. Different organisms tend to favour just some of the codons for a particular amino acid. For example, there are 6 codons for arginine: CGT, CGC, CGG, AGG, AGA, CGA. In bacteria such as E. coli, the last three are not used very often, so the cells make little (if any) of the corresponding tRNAs. This can greatly reduce the amount of protein that can be produced if your gene of interest has a lot of these last three codons.

It is worthwhile checking the codons used in the gene you are studying. Sometimes it is worth the effort of synthesizing a “codon-optimized” version of the gene you want to express. This would involve site-directed mutagenesis (or synthesis of the whole gene) to change the codons that bacteria tend not to use into codons more often used by bacteria. If you were planning to synthesize human insulin, for use in treating diabetes, it would be worth the time and expense to alter the gene sequence to allow expression of this protein.
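A quick way to check a coding sequence for the three rarely used arginine codons mentioned above might look like the following. This is only a sketch: the function name and the toy sequence are made up for the example, and real codon-usage analysis would look at every amino acid and at the usage table of the specific host strain.

```python
# The six arginine codons from the text; the last three are the rare ones in bacteria.
ARG_CODONS = {"CGT", "CGC", "CGG", "AGG", "AGA", "CGA"}
RARE_ARG_CODONS = {"AGG", "AGA", "CGA"}


def split_codons(cds):
    """Split a coding sequence into codons, reading in frame from the first base."""
    return [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]


def rare_arg_report(cds):
    """Return (number of rare Arg codons, total number of Arg codons)."""
    codons = split_codons(cds.upper())
    arg = [c for c in codons if c in ARG_CODONS]
    rare = [c for c in arg if c in RARE_ARG_CODONS]
    return len(rare), len(arg)


# Hypothetical coding sequence: ATG AGA CGT AGG CGC CGA TAA
toy_cds = "ATGAGACGTAGGCGCCGATAA"
print(rare_arg_report(toy_cds))  # (3, 5)
```

If a large fraction of the arginine codons in your gene are rare ones, that is a hint that codon optimization may be worth considering.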

Some proteins require post-translational modification before they are active. They may need to be cleaved, or have sugar groups added to certain amino acids (called glycosylation). Additional types of modifications include the addition of acetyl or methyl groups, ubiquitination, phosphorylation, the addition of lipid groups and others. Bacteria such as E. coli cannot make most of these eukaryotic modifications. In the case of human insulin, the protein is made first as preproinsulin, which is 110 amino acids long. The first 24 amino acids form a signal peptide that directs the protein into the secretory pathway and so out of the cell. This is cleaved off, leaving the 86 amino acids of proinsulin. The proinsulin is then folded, crosslinked by disulfide bonds, and cleaved into an A chain (21 amino acids), a B chain (30 amino acids) and a connecting peptide (the C-peptide, 31 amino acids once the basic residues flanking it are trimmed away). The construct used to express insulin in bacteria has the coding sequences for the A and B chains cloned separately, so that none of the cleavage steps that normally take place are needed, since bacteria cannot perform them.

This video shows how human insulin is produced in bacteria. It is a good brief description of how researchers got around the problems of post-translational modification of the insulin protein. You are quite accustomed to computer voices at this point in the semester.

Some proteins may not fold correctly in bacteria. In eukaryotic cells, some proteins don’t fold into the correct conformation on their own; special proteins called chaperones assist the correct folding of these proteins. Bacteria have chaperones of their own, but these cannot always stand in for the eukaryotic ones. So the mis-folding of a protein expressed in bacteria is a problem that may not be easily fixed. A eukaryotic expression system might be necessary.

 


A-2. Goals of bacterial expression

When we express proteins by recombinant DNA technology, we have simple goals. We want a large amount of the protein, we want it to be soluble and properly folded, and we want to be able to isolate it easily.  Though these goals are simple and straightforward, sometimes they are not easily achieved.

When we express large amounts of proteins that are not normally found in bacteria, these proteins sometimes don’t function as you would expect. Sometimes the large quantities of these proteins inhibit the bacterial growth. In that case, although there is a fairly good amount of protein in each cell, the cells don’t divide well enough for us to get a good yield.  Sometimes the large amounts of proteins being expressed in the cells lead to the proteins aggregating, sticking together, which may make them difficult to isolate. In such a case it is a good idea to have an “inducible” system (described more later). We put our gene of interest under the control of a promoter (like the lac promoter, see below) that we can control so that we can first grow large cultures of cells, and then turn on the high expression of the gene once we have enough cells. This avoids the problem of high expression of the protein inhibiting cell growth – we separate the growth of the cells from the expression of the protein with an inducible promoter.

It can also be difficult to separate the protein we are expressing from all the bacterial proteins.  To deal with this, we can add a “tag” to the proteins to facilitate their isolation.



A-3. Promoters

The promoters used in expression constructs vary, but are frequently strong viral promoters such as: T3, T7, SP6. These come from the T3 bacteriophage, the T7 bacteriophage and the SP6 bacteriophage (surprise!).  E. coli promoters are also used sometimes, such as the lac operon promoter. The advantage of this promoter is that it is inducible – we can turn on the expression of the protein at a time of our choosing.

It is also possible to combine promoters to gain the features of both. The T7 promoter is a very strong promoter that gives robust expression of the gene of interest, but it is recognized only by the T7 RNA polymerase. If the T7 polymerase gene is put under the lac operon promoter, we can turn on expression of our gene (which is under the T7 promoter) by inducing expression of the T7 polymerase. Recall from an earlier chapter that the lac operon is negatively regulated by the lac repressor. This protein binds the operator (right next to the promoter) and prevents expression of the operon genes. When lactose is present, it binds the repressor (in the form of its isomer, allolactose) and removes it from the operator to allow strong expression of the operon genes. In a culture of bacterial cells we can use IPTG, a lactose analogue that the cells do not break down, to bind the repressor and remove it from the operator. The cells we use in this case MUST have a functional lacI gene, which encodes the repressor. In cells that have had this gene deleted, the lac promoter is on all the time and is no longer inducible.
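The two-step logic of this combined system can be written out as a small truth table. The sketch below is deliberately simplified and the function names are our own: it treats expression as all-or-nothing and ignores the low-level "leaky" expression that real cultures show.

```python
def t7_polymerase_made(laci_functional, iptg_present):
    """The T7 polymerase gene sits behind the lac promoter and operator."""
    # The repressor blocks the operator only if the cell makes functional
    # repressor and no inducer (IPTG) is there to pull it off.
    repressor_on_operator = laci_functional and not iptg_present
    return not repressor_on_operator


def gene_of_interest_expressed(laci_functional, iptg_present):
    """The gene of interest sits behind the T7 promoter, so it needs T7 polymerase."""
    return t7_polymerase_made(laci_functional, iptg_present)


for laci in (True, False):
    for iptg in (False, True):
        state = "ON" if gene_of_interest_expressed(laci, iptg) else "off"
        print(f"lacI functional={laci!s:5}  IPTG={iptg!s:5}  ->  gene of interest {state}")
```

Only the first row (functional lacI, no IPTG) gives "off", which is the state we want while the culture is growing. In cells without functional lacI, the gene is on whether or not IPTG is added, so there is nothing left to induce.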

 


A-4. Types of fusion proteins

In order to address problems of solubility and ease of isolation, we can make various types of fusion proteins, by – of course – modifying the genes so that we can produce these fusions. Here are a few examples (there are others, which you may learn about in other courses):

We can add some amino acids to our protein that will facilitate its isolation. Two examples are GST and 6X His. In the former, the gene for glutathione-S-transferase is fused in frame with the GOI, and in the latter the nucleotides to encode 6 histidines are added to the end of the gene sequence. In both cases, a column purification is performed to isolate the proteins.

In the 6X His method, the histidines are strongly attracted to certain metal ions such as nickel or cobalt. The cells, with their expressed proteins, are lysed and the lysate is applied to a column packed with beads that carry nickel ions (or some other metal ion). The histidines attach to the beads while most other proteins flow through. The column is washed multiple times to remove all but the most strongly bound proteins, which should be the expressed proteins. Then the proteins are eluted off the beads by lowering the pH (to about 4.5), which protonates the histidines and reduces the attraction between the protein and the nickel ions. Alternatively, a competitor such as imidazole is used. Imidazole is the same ring structure found in the histidine side chain, so at high concentration it competes for the nickel ions and displaces the protein from the beads. The protein is then collected in the eluate.
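At the DNA level, adding a 6X His tag to the end of a protein means inserting six histidine codons in frame, just before the stop codon. Here is a minimal sketch of that idea. The function name and the toy coding sequence are hypothetical, and a real construct would usually get its tag from the expression vector itself, often with a linker or protease site in between.

```python
HIS_CODON = "CAC"                  # CAT also encodes histidine
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_c_terminal_his_tag(cds, n_his=6):
    """Insert n_his histidine codons in frame, immediately before the stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    stop = cds[-3:]
    if stop not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    return cds[:-3] + HIS_CODON * n_his + stop


# Hypothetical toy coding sequence: ATG GCT GAA CGT TGG AAA TAA
toy_cds = "ATGGCTGAACGTTGGAAATAA"
tagged = add_c_terminal_his_tag(toy_cds)
print(tagged)
```

Because the inserted stretch is a whole number of codons, the reading frame is preserved and the original stop codon still ends the protein, now six histidines later.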


In the GST fusion example, you have produced a fusion protein consisting of your protein of interest and the GST protein. There is an endopeptidase motif between them: a few amino acids that can be cleaved by a certain peptidase. This is how you will separate the GST from the protein once the isolation is complete. The cells are lysed and the lysate is applied to a column as before, but in this case the column has glutathione on the beads. The GST enzyme binds to the immobilized glutathione, which is the substrate for the reaction it catalyzes. Other, non-tagged proteins flow through the column and are washed away in several wash steps. The fusion protein is eluted off the column by the addition of large amounts of free glutathione, to which the GST will bind. The fusion protein, with glutathione bound to its GST portion, can now be collected in the eluate. The GST is cleaved from the protein using the endopeptidase, as mentioned above.

Some proteins that are produced in bacteria can form inclusion bodies, which are dense, insoluble aggregates of misfolded protein, and this makes them really hard to isolate. GST is a very soluble protein; the use of a GST fusion protein helps promote the solubility of our protein product as well as facilitating the isolation of the protein and reducing the chances that the protein will aggregate or form inclusion bodies.

A signal peptide can be added to the protein of interest to secrete it from the bacterial cells. If necessary the signal peptide can then be cleaved from the protein after isolation.

Also, sometimes we want to isolate proteins that naturally embed in membranes and these are particularly difficult to isolate.  The GST fusion method can help.  Depending on exactly what we want to do with the protein it might also be possible to generate a version of the protein that lacks the membrane targeting sequences. This is only going to be feasible occasionally, because most of the time you want the complete and functional protein.

 



B. Introducing DNA into Eukaryotic Cells:

This second section of the chapter covers some of the methods used to introduce DNA into eukaryotic cells. For some protein expression we need to use eukaryotic cells, and we work with eukaryotes experimentally as well, particularly in medical research.

In prokaryotes, we use heat shock or electroporation to transform the cells. We can use these methods for some eukaryotic cells too, but there are multiple additional methods used in eukaryotes, some of which may surprise you.

The methods are divided into biological and physical/chemical methods.

Biological Methods:

B-1. Viruses and mammalian cells

Viruses are efficient for getting DNA constructs into mammalian cells; nearly all the virus particles used infect cells. This means this method is suitable for producing large amounts of protein in cultured cells, or for RNAi, and also for gene therapy. We won’t have time to cover gene therapy this semester, but these viral procedures are the method by which DNA can be introduced into cells such as bone marrow cells of patients with certain cancers in an attempt to treat the illness.

The disadvantage of working with viruses is that it is a lot of work to generate the virus particles that contain your construct. There are many safety considerations as well when working with viruses. You can purchase kits that make it simpler to produce the desired viral vector and to transfect the cells. These are quite expensive, which for most labs is a consideration.

B1-i. Types of viral vectors

DNA VIRUSES

Adenoviruses are DNA viruses that cannot integrate into the genome of the host cell. These commonly infect mammalian cells and thus can cause a rapid immune response in the cells. This is an important consideration because if you are running an experiment you must be aware of the possibility of the immune response confounding the results. Human adenoviruses are small, and are associated with many different types of infections but they generally cause respiratory infections and gastrointestinal tract infections.

Adeno-associated viruses are also DNA viruses. The vectors derived from them rarely integrate into the host genome, and because these viruses are not known to cause disease, they are less likely to raise a strong immune response.

Because these two types of viruses don’t have the ability to insert their DNA into the host cells’ genome, they are used for transient assays- described more below.

RNA VIRUSES

γRetroviruses are RNA viruses that deliver reverse transcriptase along with their RNA genome into their host cells. The RNA is reverse transcribed and then the DNA is integrated into the host genome. The viral DNA cannot cross an intact nuclear envelope, so it can only be integrated during cell division, when the nuclear envelope has been dismantled. This means that γretroviral vectors work only in dividing cells.

Lentiviruses are also RNA viruses that provide reverse transcriptase along with their RNA genome into the host cells. Their DNA can integrate into the host’s genome too, but can do so in both dividing and non-dividing cells.  HIV (human immunodeficiency virus) is an example of a lentivirus. There are related viruses that cause immunodeficiency in cats (FIV) and in monkeys (SIV).

In some cases, the integration of the DNA into the host genome is a drawback because you cannot control where the integration takes place. The insertion of the transgene into or near a gene in the host cell genome could inactivate that gene or change its expression. This complicates interpretation of experimental results and can be a negative consequence of gene therapy. In the late 1990s, an attempt to treat severe combined immunodeficiency disease (SCID) by providing a wild type copy of the gene in the stem cells of the bone marrow was partially successful, but a number of the children who were treated this way eventually developed leukaemia. The transgene delivered by the retroviral vector had inserted near a gene called LMO2 and activated it. This led to the development of the cancer. Research is ongoing in this area: currently there are many projects working toward making the use of viral vectors for gene therapy safer. And there are some promising cases of gene therapy, for example for cystic fibrosis, using pigs as a model system.

The host range of both types of RNA viruses is determined by their envelope proteins. These viruses infect cells by binding to receptors on the cell membrane; their membrane envelope fuses with the cell membrane, allowing the viral capsid to enter the cell and release the RNA genome into it (this is called uncoating). Reverse transcription produces a DNA copy of the viral genes that can integrate into the host’s genome. The viral genes are transcribed; some transcripts are translated to produce the capsid proteins, the reverse transcriptase and other enzymes that are packaged into the newly formed viruses, and the envelope proteins that will be embedded in the membrane that surrounds the virus. Full-length transcripts also serve as the viral RNA genome, which is incorporated into the newly formed virion. The virion then buds from the host cell: as it leaves, it is enclosed in cell membrane, with the envelope proteins embedded in it. These proteins will help the virus infect a particular type of host cell in subsequent infections.


The genome of retroviruses (including lentiviruses) is bordered by two long terminal repeat (LTR) sequences. These are needed for the integration of the viral genome into the host genome. The psi sequence (ψ), at one end of the molecule, is a packaging signal: it is needed to put the viral genome into the newly formed capsid. The gag region encodes coat proteins for producing the capsid, inside which the RNA genome is held. The pol region encodes reverse transcriptase, and sometimes additional proteins such as an integrase and other enzymes needed for viral replication. And the env region encodes the viral envelope proteins. As stated already, these dictate the host range: the types of host cells that the virus can successfully infect.

When making viral vectors, it is important to be very safe. You want to ensure that once inside the cell for therapeutic or research purposes, the introduced sequences cannot somehow reconstitute infectious virus particles.  For this reason the functions of the viral genome are divided between multiple vectors so that when the viral vector with your gene of interest is produced, it is able to infect the cells and introduce your gene into the host cells but cannot produce more virus particles. In this way we use the efficiency of viral infection but can avoid the virulence.

The vectors made include:

  • The transfer vector which contains the LTRs, packaging signal and the genes to be transfected into the host cell between the LTRs, instead of the viral genes.
  • The packaging vector, which contains the gag and pol regions of the viral genome and thus produces the proteins needed to make new virus particles, except these will contain your gene or genes to be introduced into the host cells.
  • The envelope vector, that contains the env region, and thus produces the envelope proteins to direct the virus to the desired host cells.

The three vectors are mixed and introduced together into cultured cells. This turns the cells into a virus factory that releases many virus particles, all of which carry your gene(s) of interest between the LTRs of the genome. The particles are collected from the media surrounding the cells. They can now be used to infect the target cells, and the gene(s) of interest can be incorporated into the genome of the host. Here the infection process ends, because no new viruses can be made.
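The reasoning behind splitting the viral functions across three vectors can be sketched with simple sets. This is only a bookkeeping model with names we made up for the example; real vector systems contain additional elements and safety features that are not shown here.

```python
# Elements carried by each of the three vectors described above.
transfer_vector = {"LTR", "psi", "gene_of_interest"}
packaging_vector = {"gag", "pol"}
envelope_vector = {"env"}

# What a genome would need to carry in order to make new virus particles.
NEEDED_FOR_NEW_VIRUS = {"LTR", "psi", "gag", "pol", "env"}


def can_make_new_virus(elements):
    """True if this set of elements includes everything needed to produce virus."""
    return NEEDED_FOR_NEW_VIRUS <= elements


# In the producer cells all three vectors are present together.
producer_cell = transfer_vector | packaging_vector | envelope_vector

# Only a molecule with the psi packaging signal ends up inside the particles.
vectors = [transfer_vector, packaging_vector, envelope_vector]
packaged_genomes = [v for v in vectors if "psi" in v]

print(can_make_new_virus(producer_cell))        # True
print(can_make_new_virus(packaged_genomes[0]))  # False
```

The producer cells have everything they need, so they release particles. The particles themselves carry only the transfer vector sequence, so a cell they infect receives the gene of interest but none of gag, pol or env.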

There is a lot more detail that we could go into, but I think this is enough as an introduction.



B-2. Plants: Agrobacterium-mediated transformation

B2-i. T-DNA

Agrobacterium tumefaciens is a soil dwelling bacterium that can gain entry into a variety of plants and transfer DNA from its Ti plasmid into the genome of the plant cell. This causes formation of a crown gall tumour: a group of undifferentiated plant cells that are reprogrammed to produce nutrients (specialized sugars and amino acids) to feed the bacteria that are living inside the tumour. We’ve covered this process in some detail in a previous chapter, so you may wish to review this information before proceeding.

The Ti plasmid is very large – over 200 kb in length. The DNA that is normally transferred to the plant genome is between two short segments of DNA called the left border (LB) and right border (RB).  That DNA contains the genes that direct tumour formation and the production of nutrients for the bacteria.

How the T-DNA (the part that is transferred to the host genome) is actually integrated into the genome is not fully understood, but it is thought that when a DNA break occurs and is undergoing repair, there is an opportunity for the T-DNA to pair with the DNA at the damage site. If this occurs, it can be integrated into the DNA at the site of the damage, probably through a homology dependent repair mechanism. The amount of homologous sequence leading to the base pairing between the host DNA and the T-DNA need only be a few base pairs; it is not an extensive region of homology. Because only the LB and RB sequences are required for T-DNA integration into the host genome, it is likely that the short regions of homology are limited to these regions.

The process of Agrobacterium infection of plant cells has been re-engineered for the introduction of genes of interest into plant cells. The genes for production of plant hormones and opines are removed and replaced with an antibiotic resistance gene for selection, and the gene or genes of interest. The Ti plasmid is very large, and furthermore, it is a low copy number plasmid, which makes it difficult to work with. This issue has been addressed by making a binary system with two plasmids in the bacterium. One is a Ti plasmid, which contains the genes for virulence (but typically no longer the T-DNA), so that the Agrobacterium can still infect the plant cell. The other is called the binary plasmid, and it contains the DNA to be introduced into the host genome, between the RB and LB sequences. This binary plasmid is much smaller (around 10 kb to 25 kb) than the Ti plasmid.

What features are found in this binary plasmid?

  • There are two different origins of replication, one that is specific for E. coli, and one that is specific for A. tumefaciens. When we are making the vector and growing up large amounts of it, we work with it in E. coli. When we want to infect plants with it, we must do so using A. tumefaciens.
  • The plasmid is a high copy number plasmid, so it is easy to purify large amounts of it to use in cloning experiments
  • The plasmid is small, around one tenth the size of the wild type Ti plasmid, and it contains a small number of unique restriction sites. These cut only once in the plasmid and facilitate cloning the gene(s) of interest into the vector.
  • The LB and RB sequences; the gene of interest is cloned between these, along with a selectable marker gene to be used to select plant cells that contain the construct
  • There is also an antibiotic resistance gene outside the right border and left border sequences, for selection in bacteria.
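The idea of a "unique restriction site" from the list above is easy to check by counting. The sketch below counts recognition sites in a circular plasmid and reports the enzymes that cut exactly once. The plasmid sequence is a short invented string, not a real binary vector, and only three well known enzymes are listed; in practice you would use a curated enzyme database or your cloning software.

```python
# Recognition sequences for three commonly used restriction enzymes.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def count_sites(plasmid, site):
    """Count occurrences of a recognition site in a circular plasmid."""
    # Append the first few bases again so sites spanning the "end" are found.
    extended = plasmid + plasmid[:len(site) - 1]
    return sum(
        1 for i in range(len(plasmid)) if extended[i:i + len(site)] == site
    )


def unique_cutters(plasmid, enzymes=SITES):
    """Return the names of enzymes that cut the plasmid exactly once."""
    plasmid = plasmid.upper()
    return sorted(
        name for name, site in enzymes.items() if count_sites(plasmid, site) == 1
    )


# Hypothetical toy plasmid: one EcoRI site, two BamHI sites, and one HindIII
# site that spans the point where the circle was "opened" to write it down.
toy_plasmid = "CTTACGAATTCACGGGATCCTTTCGGATCCACAAAG"
print(unique_cutters(toy_plasmid))  # ['EcoRI', 'HindIII']
```

BamHI cuts this toy plasmid twice, so using it would release a fragment rather than simply open the vector. That is why only the single cutters are useful for cloning in the gene(s) of interest.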


B2-ii. Transformation of the plant cells

There are several approaches that are taken to transformation of plants, with different plants needing different methods.

In some cases, a tissue culture approach is needed. Explants are made, in which a part of the plant, usually a leaf (seeds are also commonly used), is wounded and then grown on a special type of agar that induces callus formation. Calli are undifferentiated plant tissues that can be removed and cultured on another type of media that contains plant hormones to stimulate the differentiation of the callus into an embryo. The plant cultures are exposed to A. tumefaciens that contain your gene(s) of interest, and after a short time they are selected on an antibiotic such as hygromycin. Only transformed plant cells will grow on this media. The cultured plant cells are then transferred to a series of types of media that promote formation of small plants that can then be grown into larger ones. The entire process takes several months.

Plants such as Arabidopsis, a commonly used genetic model organism, can be transformed in a quicker process, called the floral dip method. In this case, the immature flowers of a plant are dipped into a solution containing the Agrobacteria, which will infect some of the cells in the flowers. If an egg cell is infected, the resulting seed will contain the construct in its genome. And the seed will be able to sprout and grow into a plant on media containing the antibiotic.

In other experiments, the Agrobacteria can be introduced into leaves by force, using a syringe (without a needle). This is generally done for transient assays. For instance, if you just wanted to test the effect of your gene on the phenotype of leaves, you could see the effect for a short time after transformation. In this case, the transgene is expressed in the leaf cells, and may even be inserted into the genome of some of them, but these cells will not form seeds, so no stable line of plants carrying your transgene is produced.



C. Introducing DNA into Eukaryotic Cells – Physical and chemical methods:

C-1. Screening and selection

We’ll start by revisiting transient vs stable introduction of DNA into recipient cells. Transient assays are temporary – they are much easier to do and they allow us to introduce some DNA and check the immediate response to it in the cells (this is usually done in cultured cells). You have a few hours to maybe a few days to observe the effect. The introduced DNA is not stable in the cells and is degraded by nucleases.

A stable introduction of DNA into a cell involves the DNA being integrated into the genome of the cells. It is replicated and passed on to daughter cells along with the rest of the genes on the chromosomes of the organism. It is more work to make and characterize the transgenic organisms or cells, but you then have a permanent strain or line of the organism for further investigations. Because the integration of the DNA into the organism’s genome is relatively rare, we use screening and selection to identify the organisms in which this event has occurred.

In selection, we provide the transgenic organisms the means to survive under selective conditions, in which the non-transgenic ones will not survive. You are familiar with selection already because we use it all the time when plating transformed cells on plates containing agar with antibiotics added to it. In plant and animal cells, the gene for hygromycin resistance, called Hygromycin phosphotransferase (HPT) is commonly used. In screening, we use a visual method to identify cells that have been transformed. Often it is the expression of a marker gene, also called a reporter gene. Technically the two uses are different but the genes are generally the same ones, GFP, lacZ, etc. and people tend to use the two terms interchangeably.

The chemical methods of transformation include the cation/heat shock method you are already familiar with, and the use of liposomes, lipid vesicles, to deliver DNA into cells.

The physical methods may surprise you. One is electroporation which we have already discussed, another is injection, but we can also use microprojectile bombardment. This last is literally shooting the DNA into prepared plant cells!


C-2. Heat Shock: Yeast

Yeasts are very interesting organisms because they are eukaryotes, but they are single-celled and grow in culture much like bacteria. We can work with them on a large scale, like bacteria, but with the advantage that yeasts can splice introns and make at least some of the post-translational modifications (and folding) we want for our expressed proteins. Some human insulin is manufactured in yeast cells at large scale.

We can heat shock yeasts to introduce our plasmids into them. The procedure is very similar to that of bacteria except that the steps take longer and the temperatures are different. We grow bacteria at 37 °C but we usually grow yeasts at 30 °C. Instead of growing them overnight, we grow them for 3 days before colonies are formed. Heat shocks for bacteria are usually 45-60 seconds, whereas for yeasts, 5-15 minutes is common, though I’ve seen protocols that recommend 45 minutes!


 


C-3. Lipofection

When we use liposomes to introduce DNA into cells, it is called lipofection (more new terminology). The DNA we want to add to our cells is incubated in a solution with positively charged lipids (cationic lipids). This causes the spontaneous formation of lipid vesicles that surround the negatively charged DNA. This DNA/lipid complex is called a lipoplex. When the lipoplex-containing solution is exposed to cells, the cells can take up the lipoplexes by endocytosis. Once inside the cell, the endosome is degraded, which releases the DNA. The DNA can find its way to the nucleus during mitosis, when the nuclear membrane has been disassembled.

 


C-4. Microinjection (mouse embryos)

This method works in other organisms, too, but we’ll use the example of mouse embryos. The DNA to be integrated is injected into a fertilized mouse egg before the two pronuclei have fused. The DNA is injected directly into the male pronucleus, which is larger than the female pronucleus, and is perhaps an easier target. The injected DNA is integrated into the nuclear DNA, most likely during repair of DNA breaks. Multiple repeats of the transgene are sometimes incorporated into the chromosome of the host cell. The injected embryos are then transferred to the oviducts of a mouse that is “pseudopregnant”; she has been mated with a sterile male, so that her hormonal state can support the development of the transferred embryos.

To screen the offspring, a coat colour marker can be used. If the injected embryos come from a strain with white fur, and the injected DNA construct includes a gene for brown fur, then offspring carrying the transgene will have brown fur.

Injection is used in many other organisms, and if you have taken developmental biology you may remember some of them.



C-5. Electroporation

Electroporation works as a transformation mechanism in many cell types. The principle is the same as with bacteria. The cells are put into an electroporation cuvette in solution and a short (milliseconds) electric pulse of 200 – 2000 volts is applied. This induces temporary pores or channels in the cell membrane through which the DNA construct can enter the cell. Careful cleaning of the cuvettes between uses is necessary and the amount of salt must be kept very low, as already described in Chapter 6.

 


C-6. Microprojectile bombardment method

In this method, we literally shoot the DNA into plant cells. The plasmid DNA is delivered on tiny particles of tungsten or gold. These particles are put on a disk inside the “gene gun”. When the trigger is pulled, the disk is pushed forward by either gas pressure or a small explosive charge. The disk moves ahead until it hits a screen; at this point the disk stops moving but the DNA coated particles continue moving at high speed and penetrate the plant cells. Some of the DNA will find its way into the nuclei of the tissue. The way the DNA is incorporated into the genome is not known, but it is probably similar to other methods, in which it is thought that the DNA is incorporated into the genome as a mistake during repair. It is known that a plasmid that has been linearized by restriction digest before being used to coat the gold or tungsten particles has a higher likelihood of being incorporated into the genome. And the transgenes are frequently incorporated as tandem (right beside each other) repeats.



 
