Creating a system for positive and negative selection of Cas9
Cas9-mediated mutation of human genes is well established, and improvements in Cas9-mediated mutagenesis continue to be developed. One route to improved creation of mutant cell lines is to use an expression system that allows one first to enrich for cells that express high levels of Cas9, and subsequently to delete the introduced Cas9 gene and select for cells that no longer express it.
Toward this end, we generated a new Cas9 expression vector, pFF4 (Fig. 1A). This plasmid contains a single PolII transcription unit controlled by the cytomegalovirus (CMV) early enhancer/promoter. The CMV promoter is followed by a pair of loxP sites that flank a single, polycistronic open reading frame (ORF). This polycistronic ORF encodes, in succession, (1) Cas9-3xNLS, a form of the SpCas9 protein containing three copies of the SV40 nuclear localization signal, (2) the 19-amino-acid porcine teschovirus 2a peptide (p2a; ATNFSLLKQAGDVEENPGP), (3) enhanced green fluorescent protein (EGFP), (4) a second copy of the p2a peptide, (5) the herpes simplex virus thymidine kinase (HSVtk), (6) a third copy of the p2a peptide, and (7) Puro, a puromycin-resistance protein. Downstream of the 3’ loxP site is a polyadenylation signal from the bovine growth hormone gene. As was done previously for SpCas9 and EGFP, the HSVtk and Puro ORFs used here (HSV TK.2 and PuroR.2) differ from the original HSV TK and PuroR gene sequences, respectively, in that they were designed for optimal human codon usage and synthesized in vitro.
Translation of the encoded polycistronic mRNA is predicted to yield four separate polypeptides due to the action of the three copies of the p2a peptide sequence, which inhibits peptide bond formation between the penultimate glycine and the terminal proline of the peptide. The four predicted protein products are, in succession: (i) Cas9-3xNLS-p2a(18), which should carry out Cas9 functions, be more efficiently imported into the nucleus due to its inclusion of three copies of the nuclear localization signal from SV40 T antigen (data not shown), and carry the first 18 amino acids of the p2a peptide (ATNFSLLKQAGDVEENPG-COOH) at its C-terminus; (ii) EGFP-p2a(18), which should confer green fluorescence on expressing cells and consist of EGFP with an additional proline at its N-terminus and the p2a remnant peptide (ATNFSLLKQAGDVEENPG-COOH) at its C-terminus; (iii) HSVtk-p2a(18), which should confer ganciclovir sensitivity on puromycin-resistant cells due to its (relatively) non-specific kinase activity, carry an extra proline at its N-terminus, and carry the p2a remnant peptide (ATNFSLLKQAGDVEENPG-COOH) at its C-terminus; and (iv) Puro, which should confer resistance to the antibiotic puromycin due to its puromycin N-acetyltransferase activity and have an extra proline at its N-terminus.
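The predicted products can be illustrated with a minimal Python sketch that splits a polyprotein between the final glycine and proline of each p2a copy. Only the p2a sequence is taken from the text; the four-letter domain strings and the function name split_at_p2a are hypothetical placeholders for illustration, not the actual protein sequences or any method used in this work.

```python
# p2a peptide sequence as given in the text (19 residues).
P2A = "ATNFSLLKQAGDVEENPGP"

# Hypothetical stand-ins for the four protein domains (NOT real sequences).
DOMAINS = {
    "Cas9-3xNLS": "MDKK",
    "EGFP": "VSKG",
    "HSVtk": "ASYP",
    "Puro": "TEYK",
}


def split_at_p2a(polyprotein, p2a=P2A):
    """Split a polyprotein between the last G and the terminal P of each p2a copy."""
    products = []
    rest = polyprotein
    while True:
        idx = rest.find(p2a)
        if idx == -1:
            products.append(rest)  # final product has no downstream p2a
            return products
        cut = idx + len(p2a) - 1  # index of the terminal proline
        products.append(rest[:cut])  # upstream product keeps 18 p2a residues
        rest = rest[cut:]  # downstream product starts with that proline


# Assemble the ORF product in the order described: domain, p2a, domain, ...
polyprotein = P2A.join(DOMAINS.values())
products = split_at_p2a(polyprotein)

for name, seq in zip(DOMAINS, products):
    print(name, seq)
```

In this sketch the first three products end in the 18-residue remnant ATNFSLLKQAGDVEENPG, and the last three begin with the extra proline, matching the predictions described above.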
The pFF4 plasmid also contains two PolIII-transcribed genes. One consists of the human 7SK promoter driving expression of a Cas9 guide RNA (gRNA) that targets exon 2 of the human CD63 gene, and the other consists of the human H1 promoter driving expression of a gRNA that targets exon 4 of the human CD63 gene.
Creation of CD63-deficient human cells
HEK293 cells were transfected with pFF4. Twelve hours later, the cells were examined by immunofluorescence microscopy, which revealed that more than half of the cells expressed detectable levels of EGFP fluorescence and that nearly all cells still expressed detectable levels of CD63. The cells were then diluted into normal growth medium containing puromycin and grown for an additional 5 days. An aliquot of the mixed pool of transfected cells was again examined by immunofluorescence microscopy, and we now observed a smaller percentage of EGFP-positive cells but a higher percentage of cells that no longer expressed detectable levels of CD63. These results are consistent with the expectations that transgene expression is highest shortly after transfection and that the introduction of null mutations in a gene is not reflected at the protein level until previously synthesized mRNA and protein have turned over. On day 5 after transfection, cells were also transferred into growth medium lacking puromycin and seeded at an average of 1 cell per well in 96-well plates to generate single cell clones (SCCs).
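As a rough guide to what seeding at an average of 1 cell per well implies, the following sketch computes the expected fractions of empty, single-cell, and multi-cell wells. It assumes, purely for illustration, that the number of cells per well follows a Poisson distribution; this assumption and the variable names are not taken from the experiments described here.

```python
import math

MEAN_CELLS_PER_WELL = 1.0  # seeding density stated in the text


def poisson_pmf(k, mean):
    """Probability of exactly k cells in a well under a Poisson model (assumed)."""
    return math.exp(-mean) * mean**k / math.factorial(k)


p_empty = poisson_pmf(0, MEAN_CELLS_PER_WELL)
p_single = poisson_pmf(1, MEAN_CELLS_PER_WELL)
p_multi = 1.0 - p_empty - p_single

# Among wells that received at least one cell, the fraction seeded by one cell.
p_single_given_occupied = p_single / (1.0 - p_empty)

print(f"empty wells:        {p_empty:.3f}")
print(f"single-cell wells:  {p_single:.3f}")
print(f"multi-cell wells:   {p_multi:.3f}")
print(f"single | occupied:  {p_single_given_occupied:.3f}")
```

Under this model roughly 37% of wells are empty, 37% receive one cell, and 26% receive two or more, so candidate clones from such a plating are generally confirmed to be clonal by downstream analysis such as allele sequencing.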
Protein lysates were prepared from 40 different SCCs and from the parental HEK293 cell line. Immunoblot analysis using antibodies specific for CD63 indicated that 9/40 SCCs lacked detectable levels of CD63 protein. Because this initial screen was uncontrolled, we performed a more detailed analysis of 4 of the candidate CD63-deficient SCCs. Specifically, we grew 3 or more separate cultures of each of the 4 SCCs and of the parental HEK293 cell line (6 separate cultures in the cases of HEK293 and SCCCD63ko-23 cells), lysed the cells, and processed the lysates for immunoblot using antibodies specific for CD63 (Fig. 1B, upper panels) and beta-actin (Fig. 1B, lower panels). We failed to detect CD63 protein in lysates from SCCCD63ko-4, SCCCD63ko-15, and SCCCD63ko-21 in each of three trials, and from SCCCD63ko-23 in each of 6 trials. In contrast, we detected actin in all samples, including the lysates generated from the SCCCD63ko-21 cell line, which grows far more slowly than the parental HEK293 cells or the other SCCs.
To determine whether these four SCCs had mutations at the expected positions in either exon 2 or exon 4 of the CD63 gene, we performed exon-specific PCR on each genomic DNA (gDNA) preparation and (i) determined the size of the PCR products by agarose gel electrophoresis, and (ii) cloned the PCR products and determined their DNA sequences by sequencing the inserts in multiple plasmid subclones. We expected to identify no more than two mutant alleles in each SCC, as CD63 is located on chromosome 12 and HEK293 cells, although triploid for the X chromosome and tetraploid for chromosomes 17 and 22, are diploid for much of the remaining genome. Amplification of exon 2 from the four CD63-deficient SCCs yielded PCR products of varying size, some of which were the same size as the fragment amplified from HEK293 gDNA, and some of which were larger or smaller. All PCR products amplified from HEK293 cells had the WT sequence.
Amplification of the exon 2-containing fragment from SCCCD63ko-4 gDNA yielded two products, one that was close to WT size and another that was ~100 bp longer. Sequence analysis showed that the longer product had a 113 bp insertion at the position noted in the figure, whereas the normal-sized PCR product had a bp deletion at the noted position. PCR products amplified from SCCCD63ko-15 gDNA all had the same sequence, indicating that this mutation may have been generated on both alleles. The SCCCD63ko-21 exon 2 PCR products also appeared to be of WT size, but sequence analysis of subclones identified two classes of PCR products, one with a 1 bp insertion and another with a 4 bp deletion. The SCCCD63ko-21 exon 2 PCR products appeared to be slightly smaller than WT, as did the inserts in the subclones that were generated from the PCR products. Sequence analysis was consistent with our gel electrophoresis results, as one set of subclones had a 13 bp deletion and the other had a 10 bp deletion. All of the mutations detected in these CD63-deficient cell lines are predicted to shift the CD63 reading frame, resulting in the potential expression of proteins that contain only the first 44–46 amino acids of CD63 followed by varying but short stretches of “junk” amino acids. Given that these mutations all introduce premature stop codons in exon 2 of this 7-exon gene, they are also predicted to induce nonsense-mediated RNA decay of the CD63 transcript.
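The frameshift prediction follows from simple arithmetic: an insertion or deletion within coding sequence shifts the reading frame whenever its net length is not a multiple of 3. The sketch below checks this for the indel sizes reported above; the list name INDELS and the helper causes_frameshift are illustrative, and the one deletion whose size is not stated in the text is omitted.

```python
# Net change in length (bp) for each exon 2 indel with a stated size.
# Positive values are insertions, negative values are deletions.
INDELS = [
    ("113 bp insertion", +113),
    ("1 bp insertion", +1),
    ("4 bp deletion", -4),
    ("13 bp deletion", -13),
    ("10 bp deletion", -10),
]


def causes_frameshift(net_change_bp):
    """An indel within coding sequence shifts the frame unless its length is a multiple of 3."""
    return net_change_bp % 3 != 0


for label, delta in INDELS:
    status = "frameshift" if causes_frameshift(delta) else "in-frame"
    print(f"{label}: {status}")
```

None of the listed sizes is a multiple of 3, so each is reported as a frameshift. This check considers length only and assumes the indel lies entirely within coding sequence.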
In contrast to what we observed at exon 2, the exon 4 PCR products amplified from control or SCC gDNAs displayed only the WT size and sequence, indicating that no mutations were introduced at this site. Control experiments in which we amplified and sequenced the exon 2 and exon 4 amplicons from HEK293 genomic DNA revealed that the sequences of these regions of the CD63 gene in HEK293 cells were identical to those reported in the human genome sequence and contained the target sequences complementary to the gRNAs encoded by pFF4.