What this is
- This research compares the genome editing capabilities of two Cas9 proteins: SpCas9 and SaCas9.
- The study evaluates their editing efficiencies, off-target effects, and optimal spacer lengths in human cells.
- Findings suggest that SaCas9 outperforms SpCas9 in terms of efficiency and specificity, making it more suitable for therapeutic applications.
Essence
- SaCas9 demonstrates superior genome editing efficiency and reduced off-target effects compared to SpCas9, particularly with optimal spacer lengths of 22 nt for SaCas9 and 20 nt for SpCas9.
Key takeaways
- SaCas9 achieves higher average editing efficiencies than SpCas9 across 11 target sites in human induced pluripotent stem cells (iPSCs) and K562 cells.
- SaCas9 edits most efficiently with 21-nt or 22-nt spacers (22 nt is recommended), whereas SpCas9 works best with 20-nt spacers; SaCas9 is also the more sensitive of the two to spacer length.
- SaCas9's off-index (total off-target reads divided by on-target reads) is approximately 20-fold lower than that of SpCas9, enhancing its potential for precise gene editing in therapeutic applications.
Caveats
- The study focuses on specific cell types (iPSCs and K562), which may limit the generalizability of the findings to other cell types or organisms.
- While SaCas9 shows improved performance, it may still be less effective than SpCas9 in certain contexts or specific target sequences.
Definitions
- CRISPR-Cas9: A genome editing technology that uses a guide RNA to direct the Cas9 endonuclease to specific DNA sequences for modification.
- NHEJ: Nonhomologous end joining, a DNA repair pathway that directly joins broken DNA ends, often leading to insertions or deletions.
- HDR: Homology-directed repair, a precise DNA repair mechanism that uses a template to guide the repair process.
Introduction
The clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR associated protein 9 (Cas9) is a powerful tool for gene editing and is widely used in basic research and clinical gene therapies [1], [2]. In this system, Cas9 endonuclease is directed by a programmable single-guide RNA (sgRNA), which is complementary to the target DNA sequence upstream of a protospacer adjacent motif (PAM) [3]. After base pairing of sgRNA and DNA, Cas9 induces DNA double-strand breaks (DSBs), which are usually repaired by nonhomologous end-joining (NHEJ), microhomology-mediated end-joining (MMEJ), or homologous recombination (HR) [4], [5]. In the presence of a homology-directed repair (HDR) template donor, precise gene knock-in or correction can be realized [6]. SpCas9 from Streptococcus pyogenes and SaCas9 from Staphylococcus aureus are the most widely used Cas9 orthologs in the CRISPR genome editing system.
The SpCas9 protein recognizes the PAM sequence of NGG, which appears every 8 bp in the genome [7]. Numerous studies have demonstrated the vigorous nuclease activity of SpCas9 in various prokaryotic and eukaryotic organisms [1], [8], [9]. However, the large size of SpCas9 (1368 amino acids, ~4.1 kb) limits its further applications for adeno-associated virus (AAV)-based in vivo gene therapy, as the limited cargo size of the AAV vector prevents co-packaging of both SpCas9 and sgRNA expression cassettes [10]. Therefore, SpCas9 is commonly used in cell-based research and is less attractive for in vivo delivery. Furthermore, multiple reports have shown that SpCas9 is more likely to cause staggered breaks, leading to NHEJ-mediated error-prone DNA repair outcomes and mutations [11], [12]. Since NHEJ is highly active throughout the cell cycle in various adult cell types, many strategies have been developed to inhibit NHEJ to increase the efficiency of precise genome editing [4], [13]. As such, Cas9 orthologs with a minor bias to staggered cleavage are expected to favor HDR-mediated gene knock-in.
SaCas9 is a protein of 1053 amino acids, which is over 300 amino acids shorter than SpCas9 [14]. Benefiting from its small size, SaCas9 can be packaged with an sgRNA expression cassette in a single AAV vector. It recognizes an NNGRRT (where R is A or G) PAM, which appears every 32 bp in the genome [15]. The unique PAM pattern of sgRNA reduces the probability of SaCas9 finding suitable target sites, leading to the postulation that SaCas9 may have higher specificity than SpCas9 [16]. Indeed, several reports have shown that SaCas9 targets DNA with high specificity compared with SpCas9 [14], [17]. However, the impact of spacer lengths on its on-target and off-target effects, the indel patterns, and the HDR editing efficiency of the SaCas9-sgRNA editing system have not been comparatively investigated.
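The two PAM frequencies quoted above (every 8 bp for NGG, every 32 bp for NNGRRT) follow from simple counting. The sketch below reproduces them under two assumptions that are not stated in the text: the four bases are equally frequent and independent (an idealization that real genomes do not satisfy), and a PAM may lie on either DNA strand. The function names are illustrative only.

```python
from fractions import Fraction

# Number of bases (out of A, C, G, T) permitted by each IUPAC code used in the two PAMs.
IUPAC_CHOICES = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "N": 4}


def pam_probability(pam: str) -> Fraction:
    """Chance that a random position on ONE strand matches the PAM
    (assumes equally frequent, independent bases)."""
    p = Fraction(1)
    for code in pam:
        p *= Fraction(IUPAC_CHOICES[code], 4)
    return p


def expected_spacing_bp(pam: str) -> Fraction:
    """Expected distance in bp between PAM occurrences when both strands count."""
    return 1 / (2 * pam_probability(pam))


print(expected_spacing_bp("NGG"))     # 8
print(expected_spacing_bp("NNGRRT"))  # 32
```

Under these assumptions the NNGRRT PAM is four times rarer than NGG, which is the basis for the reduced number of candidate target sites.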
This study rigorously compared SaCas9 with SpCas9 at 11 target sites with clinical application prospects in human induced pluripotent stem cells (iPSCs) and K562 cells. We systematically investigated their nuclease activities directed by sgRNAs with different protospacer lengths (18–21 nt for SpCas9 and 19–23 nt for SaCas9). We found that SaCas9 editing was more sensitive to spacer length, with 21-nt or 22-nt sgRNA being the most effective. In addition, SaCas9 was more potent than SpCas9 when the PAM was not NNGGAT. Furthermore, for the first time, we demonstrated that SpCas9 was more prone to NHEJ +1 editing than SaCas9, thus hampering the double-stranded oligodeoxynucleotide (dsODN) insertion or AAV donor HDR integration. Finally, the GUIDE-seq analysis revealed a considerably higher fidelity of SaCas9 compared with SpCas9. Therefore, our study demonstrates that SaCas9 is a superior nuclease for manipulating human genomes and clinical gene therapy.
Results
Using an optimized sgRNA scaffold and a novel Cas9 fusion protein improves genome editing efficiency
To quantitate the editing activities of these fusion SaCas9 vectors, we performed a reporter knock-in assay in K562 cells (Figure 1B). After electroporation with sgGAPDH, Cas9, and a double-cut donor plasmid pD-E2A-mNeonGreen-sg [22], successful HDR editing would lead the cells to fluoresce green. As expected, fluorescence-activated cell sorting (FACS) 72 h after electroporation detected no mNeonGreen-positive cells in the negative control that omitted sgGAPDH (data not shown). Consistent with early studies of SpCas9, the fusion of SaCas9 with a single BPNLS enhanced the editing efficiency by ~30% compared with SaCas9 fused with NPM·NLS (Figure 1C). However, placing BPNLSs at both the N- and C-termini did not further increase the HDR efficiency. Of note, adding an N-terminal HMGA2 together with a C-terminal BPNLS significantly increased editing efficiency (Figure 1C). Therefore, we chose the HMGA2-SaCas9-BPNLS construct for further studies.
sgRNA expression was driven by the U6 RNA polymerase III promoter, for which a TTTT stretch is sufficient for transcription pause and sometimes termination, leading to reduced transcription [23], [24]. The original sgRNA scaffolds for both SpCas9 and SaCas9 start with GUUUU, and a mutation of thymine at position 4 can significantly increase the activity of low-performance sgRNA [25]. In addition, extending the duplex with UGUCG could also improve editing efficiency [26]. For SpCas9, we adopted the sgRNA scaffold with the best performance, which contains a T4 > C mutation (GUUUC) and UGUCG addition [26]. We wondered whether the optimized sgRNA (Sa-v2) could also perform better when complexed with HMGA2-SaCas9-BPNLS (Figure 1D). Consistent with a previous study, the modified sgRNA (Sa-v2) significantly increased mNeonGreen knock-in efficiencies in GAPDH (~40%–50%; Figure 1E). We carried out the following genome editing studies using the optimized sgRNA (Sa-v2) based on the aforementioned results.
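The reasoning behind the T4 > C scaffold change can be illustrated with a small check for the TTTT stretch that pauses or terminates RNA polymerase III. This is only a sketch: the helper name is hypothetical, and only the five-nucleotide scaffold starts quoted in the text (GUUUU and GUUUC) are used, not the full scaffold sequences.

```python
def has_pol3_terminator(seq: str, run: int = 4) -> bool:
    """Return True if the sequence contains `run` or more consecutive T (or U) residues.

    Accepts DNA or RNA letters; U is treated as T.
    """
    dna = seq.upper().replace("U", "T")
    return "T" * run in dna


original_start = "GUUUU"   # start of the original scaffold, as quoted in the text
optimized_start = "GUUUC"  # start of the scaffold after the T4 > C mutation

print(has_pol3_terminator(original_start))   # True
print(has_pol3_terminator(optimized_start))  # False
```

The same check could be run on a full spacer-plus-scaffold template to flag guides whose own sequence introduces a TTTT run.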

Optimization of sgRNA structure and Cas9 NLS to improve genome editing efficiency of SaCas9. A. Schematic diagram of SaCas9 variants with different NLSs. SaCas9 expression was driven by the EF1 promoter. B. Schematic of HDR-mediated gene editing at GAPDH. sgRNA was designed to target the last intron upstream of the stop codon. A promoterless double-cut HDR donor pD-E2A-mNeonGreen-sg was used to guide HDR-mediated insertion of the mNeonGreen fluorescent protein-coding gene. The orange boxes indicate the left and right HAs (both 600 bp in length); the blue box indicates a self-cleaving linker for multicistronic expression; the red lightning symbol indicates the Cas9–sgRNA cleavage site. C. Cas9 fusion proteins enhance HDR editing efficiency of SaCas9. mNeonGreen-positive cells were determined by FACS 3 days after electroporation of K562 cells (n = 5). D. Scaffold optimization of sgRNA interacting with SaCas9. sgRNA (Sa) indicates the sgRNA with the original sgRNA scaffold for SaCas9; sgRNA (Sa-v2) indicates the optimized sgRNA including a U > A mutation and UGCUG addition. E. The scaffold-optimized sgRNA improves the HDR editing efficiency at GAPDH in K562 cells (n = 4). gN20 indicates the 21-nt sgRNA commenced with a mutant guanine to ensure the U6 promoter activation. Data are shown as mean ± SD. Significance (P < 0.05) was calculated using unpaired two-tailed Student's t-test. sgRNA, single-guide RNA; NLS, nuclear localization signal; BPNLS, bipartite nuclear localization signal; HDR, homology-directed repair; HA, homologous arm; FACS, fluorescence-activated cell sorting; KI, knock-in.
Experimental design for comparing SpCas9 and SaCas9 editing systems
To stringently compare the cleavage efficiencies of SaCas9 and SpCas9, we designed sgRNAs targeting the sites with the NGGRRT PAM, which both SpCas9 and SaCas9 can recognize. We also constructed HMGA2-SpCas9-BPNLS to compare with HMGA2-SaCas9-BPNLS, and the TTTT stretch was mutated to prevent premature transcriptional termination. We chose 11 sites from 8 genes, including AAVS1, ALB, PD1, B2M, CCR5, TRAC, CIITA, and CD326 (also known as EPCAM), due to their clinical potential in cell and gene therapy (Table S1).
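The site selection described above relies on the fact that an NGGRRT motif satisfies both the NGG PAM of SpCas9 and the NNGRRT PAM of SaCas9, so both protospacers end at the same position. A minimal sketch of such a scan is shown below; it is not the CHOPCHOP procedure used in the study, the function names are hypothetical, and the demo sequence is made up.

```python
import re

# NGGRRT with R = A or G; the lookahead lets overlapping motifs be reported.
SHARED_PAM = re.compile(r"(?=([ACGT]GG[AG][AG]T))")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def shared_sites(seq: str, protospacer_len: int = 23):
    """Yield (strand, protospacer, pam) for each NGGRRT PAM on either strand
    that has at least `protospacer_len` nt of sequence directly 5' of it."""
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for match in SHARED_PAM.finditer(s):
            start = match.start() - protospacer_len
            if start >= 0:
                yield strand, s[start:match.start()], match.group(1)


# Made-up demo sequence: a 23-nt protospacer followed by a TGGAAT PAM.
demo = "ACG" + "TACG" * 5 + "TGGAAT"
for site in shared_sites(demo):
    print(site)
```

Extracting 23 nt upstream of the PAM leaves room to derive every spacer length tested for either nuclease from the same site.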

Experimental design and data reproducibility. A. Schematic of genome editing with the CRISPR-Cas9 strategy in K562 cells and iPSCs. B. The high reproducibility of genome editing results in K562 cells and iPSCs. The indel data at 48 h and 72 h after electroporation were combined for analysis. C. The correlation of indel frequencies between K562 cells and iPSCs. Pearson linear regression analysis was conducted in (B) and (C). iPSC, induced pluripotent stem cell; Indel, insertion and deletion.
The optimal spacer length for efficient cleavage with SpCas9 is 20 nt
At the five locations showing similar editing efficiencies between the 18-nt and 19-nt groups, we observed two with a matched G and three with a mismatched g. However, at four sites displaying lower editing efficiencies with the 18-nt spacers, we observed three with a matched G and one with a mismatched g. These data suggest that the mismatched g at the 5′ end of the spacer may affect the editing efficiency in some cases, but not necessarily, given that the matched G could also result in low efficiency.
Taken together, the aggregated data showed that 20-nt sgRNAs function better than sgRNAs with shorter or longer spacers (Figure 3B). Our data demonstrate that the optimal spacer length for SpCas9 is 20 nt, but sgRNAs with a spacer of 18 nt, 19 nt, or 21 nt may occasionally have higher activity. These results highlight the importance of testing the optimal spacer length in clinical gene therapy.
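The matched G versus mismatched g distinction can be made concrete with a small helper that derives spacers of different lengths from one protospacer. This is a sketch under an assumption that the text does not spell out: the spacer is taken from the PAM-proximal end, and when its 5′ base is not G it is replaced by a mismatched g (written in lowercase, as in the gN20 notation). The function name and the demo sequence are hypothetical.

```python
def g_initiated_spacer(protospacer: str, length: int) -> str:
    """Return the PAM-proximal `length` nt of the protospacer with a 5' guanine.

    An uppercase leading 'G' matches the target; a lowercase 'g' marks a
    mismatched substitution introduced for U6 promoter activation.
    """
    if not 1 <= length <= len(protospacer):
        raise ValueError("length must be between 1 and the protospacer length")
    spacer = protospacer.upper()[-length:]
    return spacer if spacer[0] == "G" else "g" + spacer[1:]


# Made-up 23-nt protospacer, listed 5' to 3' and ending next to the PAM.
protospacer = "ACG" + "TACG" * 5
for n in (18, 19, 20, 21):
    print(n, g_initiated_spacer(protospacer, n))
```

Tabulating which lengths yield a matched G and which a mismatched g for each site is what allows the comparison made in the paragraphs above.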

The effect of spacer length on SpCas9 editing efficiency. A. Relative indel frequencies of SpCas9 in complex with sgRNAs of 18–21 nt in spacer length (n = 4–8 replicates for each spacer length). The indel values were normalized to the highest editing efficiency for each target. The sgRNA sequences of 21 nt are shown. B. Statistical analysis of relative indel frequencies of SpCas9 in complex with sgRNAs with different spacer lengths. Data are shown as mean ± SD. Significance (P < 0.05) was calculated using an unpaired two-tailed Student's t-test.
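The normalization mentioned in the caption (each indel value divided by the highest editing efficiency for that target) can be sketched as follows. The function name is illustrative and the numbers are made up, not data from the study.

```python
def relative_indel_frequencies(indels_by_length: dict[int, float]) -> dict[int, float]:
    """Scale one target's indel frequencies so the best spacer length equals 1.0."""
    best = max(indels_by_length.values())
    if best <= 0:
        raise ValueError("no editing detected; cannot normalize")
    return {length: value / best for length, value in indels_by_length.items()}


# Hypothetical indel percentages for one target, keyed by spacer length in nt.
example = {18: 20.0, 19: 40.0, 20: 80.0, 21: 60.0}
print(relative_indel_frequencies(example))
```

Normalizing per target puts sites with very different absolute efficiencies on a common scale before spacer lengths are compared across sites.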
The optimal spacer length for efficient cleavage with SaCas9 is 21–22 nt

The effect of spacer length on SaCas9 editing efficiency. A. Relative indel frequencies of SaCas9 in complex with sgRNAs of 19–23 nt in spacer length (n = 8 replicates for each spacer length). The indel values were normalized to the highest editing efficiency for each sgRNA. The sgRNA sequences of 23 nt are shown. B. Statistical analysis of relative indel frequencies of SaCas9 in complex with sgRNAs with different spacer lengths. C. Effects of different PAMs on indel frequencies. All the sites were divided into four groups based on their PAM sequences (n = 4–6 replicates for each PAM sequence). Data are shown as mean ± SD. Significance (P < 0.05) was calculated using an unpaired two-tailed Student's t-test. PAM, protospacer adjacent motif.
SaCas9 is superior to SpCas9 in generating indels

Comparison of relative indel frequencies of SpCas9 and SaCas9. A. Comparison of SpCas9 and SaCas9 gene editing efficiencies at 11 individual sites in iPSCs (top) and K562 cells (bottom). B. Comparison of the average gene editing efficiencies of SpCas9 and SaCas9 in iPSCs (top) and K562 cells (bottom) by aggregating the data shown in (A). All the indel values were normalized to the average editing efficiency of SpCas9. Data are shown as mean ± SD. Significance (P < 0.05) was calculated using an unpaired two-tailed Student's t-test.
SaCas9 has a strikingly lower potential for staggering cleavage than SpCas9

Comparison of NHEJ +1, NHEJ −1, and MMEJ frequencies after editing with the SpCas9 and SaCas9 editing systems. A. Representative repair patterns after SpCas9 (top) and SaCas9 (bottom) cleavage. The microhomologies are highlighted in the black boxes. The NHEJ +1 insertion events are shown in the red boxes. All the repair patterns of the 11 sites in both iPSCs and K562 cells are displayed in Figure S5. B. Comparison of the percentages of NHEJ +1 insertion events between SpCas9 and SaCas9. C. Comparison of the percentages of NHEJ −1 deletion events between SpCas9 and SaCas9. D. Comparison of the percentages of MMEJ events between SpCas9 and SaCas9. Wilcoxon matched-pairs tests were conducted. P < 0.05 was considered statistically significant. NHEJ, nonhomologous end-joining; MMEJ, microhomology-mediated end-joining.
SaCas9 editing favors the NHEJ-mediated dsODN insertion and HDR-mediated gene knock-in
Our recent work on the SpCas9 system has demonstrated that +1 NHEJ is the speedy pathway to repair Cas9-mediated DSBs, which outcompetes MMEJ and HDR editing outcomes [12]. Given that SaCas9-created DSBs are less likely to be fixed by +1 NHEJ, we hypothesized that SaCas9 might be a favorable nuclease for applications such as transgene knock-in.
Finally, we explored whether the TP53BP1 inhibitor can further improve HDR efficiency in our improved SaCas9 editing system. TP53BP1 is an essential regulator of the DSB repair pathway and functions to favor NHEJ over HDR by suppressing end resection [30]. Using a genetically encoded inhibitor of TP53BP1, we found that the precise HDR efficiency increased by ~40% (Figure 7F), consistent with a previous report [30].

SaCas9 editing favors NHEJ-mediated dsODN insertion and HDR-mediated gene knock-in. A. Schematic diagram of dsODN insertion through NHEJ and AAV donor knock-in through HDR after Cas9–sgRNA cleavage. B. Representative repair patterns after editing with Cas9–sgRNA and dsODN. As shown in the red boxes, dsODN could be integrated with both forward and reverse orientations. C. Similar NHEJ +1 frequencies between SpCas9 and SaCas9 editing at ALB-1 are associated with similar dsODN insertion levels. D. Considerably lower NHEJ +1 frequencies after SaCas9 editing at CCR5 and B2M1 lead to a marked increase in dsODN insertion. E. Lower NHEJ +1 frequencies after SaCas9 editing at AAVS1d and PD1 increase HDR-mediated AAV donor knock-in. F. The TP53BP1 inhibitor enhances HDR editing efficiency in our improved SaCas9 editing system. The relative HDR efficiency was normalized to the control (n = 15). Data are shown as mean ± SD. Significance (P < 0.05) was calculated using paired two-tailed Student's t-test. dsODN, double-stranded oligodeoxynucleotide; AAV, adeno-associated virus; DSB, double-strand break.
GUIDE-seq revealed superior fidelity of SaCas9 to SpCas9
We decided to use the off-index metric (total off-target reads divided by on-target reads) to quantitate the off-target effects. A comparison of the two editing systems showed that the off-index of SaCas9 was ~20-fold lower than that of its SpCas9 counterpart (Figure 8C). For editing with SpCas9, spacer lengths from 18 nt to 21 nt did not show significant differences in the off-index (Figure 8D), suggesting that truncated sgRNAs may not be able to reduce the off/on-target ratio in SpCas9-based applications. For SaCas9, at the two sites with a substantial number of off-target reads, AAVS1c and TRAC, the 22-nt sgRNAs showed a considerably lower off-index than their 21-nt counterparts (Figure 8B and E). Together, carefully designed and prescreened 22-nt sgRNAs with SaCas9 may abrogate the off-target cleavage events.
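The off-index defined above is a simple ratio, sketched here for clarity. The function name and the read counts are illustrative and are not taken from the GUIDE-seq data.

```python
def off_index(on_target_reads: int, off_target_reads: list[int]) -> float:
    """Total off-target reads divided by on-target reads."""
    if on_target_reads <= 0:
        raise ValueError("on-target reads must be positive")
    return sum(off_target_reads) / on_target_reads


# Hypothetical read counts: one on-target site and three off-target sites.
print(off_index(1000, [50, 30, 20]))  # 0.1
```

Because the metric is normalized to on-target reads, it can be compared between guides and nucleases with different absolute editing efficiencies.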

GUIDE-seq analysis revealed higher specificity of SaCas9 than SpCas9. A. Off-target sites identified by GUIDE-seq for SpCas9. B. Off-target sites identified by GUIDE-seq for SaCas9. The target sequence with PAM is shown on the top line. Mismatches found in off-targets are highlighted in color. The read counts corresponding to different spacer lengths are shown on the right. The off-targets of AAVS1c (SpCas9) are truncated, and the full list is shown in Figure S6. Sp18–Sp21 indicate different spacer lengths of sgRNAs used for SpCas9-mediated editing; Sa20–Sa23 indicate different spacer lengths of sgRNAs used for SaCas9-mediated editing. C. Comparison of the off-index values between SpCas9 and SaCas9. The off-index was calculated as the total off-target reads divided by the on-target reads. D. Spacer lengths do not affect SpCas9 specificity. E. Effects of spacer lengths on SaCas9 specificity. The off-index values of SpCas9 and SaCas9 were normalized to 20 nt and 21 nt, respectively. Data are shown as mean ± SD (n ≥ 2). Significance (P < 0.05) was calculated using paired two-tailed Student's t-test.
Discussion
SpCas9 and SaCas9 are two of the best-characterized nucleases for genome editing. Several independent studies have focused on their nuclease activities and clinical applications [14], [32]. Here, we report a novel Cas9 fusion protein with enhanced nuclease activity in mammalian cells. We systematically compared SpCas9 and SaCas9 in terms of spacer lengths, indel frequencies and patterns, knock-in efficiencies, and off-target activities using the improved design. Our results validated and extended previous reports on SaCas9 [14]. For the first time, we report that SaCas9 has a considerably lower propensity for staggered cuts, which is beneficial for dsODN insertion and HDR editing. In addition, SaCas9's on- and off-target activities are more sensitive to the spacer length of sgRNA. We recommend using SaCas9 with sgRNA of 22 nt, either gN21 or GN21, to achieve high editing efficiency and low off-target cuts.
CRISPR-Cas9 evolved in prokaryotes. In its natural setting, Cas9 does not encounter the nuclear envelope and nucleosomes. To facilitate transfer across nuclear pores, one needs to tag Cas9 with the NLS. We and others have shown that the BPNLS from SV40 is more potent than the NLS from the NPM gene. Therefore, one copy of BPNLS is sufficient for efficient editing. In addition, nucleosomes impede Cas9 access to DNA [33], particularly in less accessible chromatin. Therefore, the fusion of Cas9 with a chromatin remodeler may improve its functionality in eukaryotes. Here, we report that the HMGA2-Cas9-BPNLS performs better than the commonly used Cas9 fusion protein. In addition, we adopted the modified sgRNA scaffold to prevent premature transcription termination by mutating the T4 strip and increase the binding of sgRNA and Cas9 by extending the repeat:anti-repeat duplex [26]. With all these modifications, editing efficiencies are considerably improved.
SpCas9 was the default nuclease version in most gene editing applications. Although it has been well studied, we also investigated the effects of spacer lengths in our improved system. Consistent with previous studies, we found that in many cases, spacers of 18–21 nt showed similar editing efficiencies. As a general principle, 20 nt is the optimal length for high activity, consistent with another report that extension with one nucleotide decreases its activity [34]. However, we found that the optimal spacer size was 18, 19, or 21 nt for a few sgRNAs. Therefore, we recommend screening sgRNAs of 18, 19, 20, and 21 nt for clinical applications. In support of our finding, certain 18-nt sgRNAs are more effective than the 20-nt version for SpCas9-mediated gene knockout in hematopoietic stem cells [35]. In contrast to a previous report that truncated sgRNAs showed less off-target activities, we observed similar adverse effects for sgRNA spacers ranging from 18 nt to 21 nt. These results suggest that wild-type SpCas9 is inappropriate for applications sensitive to off-target cleavage.
SaCas9 is the ideal nuclease for in vivo gene editing due to its small size, high efficiency, and low off-target activities. We validated previous reports that 21-nt or 22-nt sgRNAs are optimal for SaCas9 editing. We also found that the best PAM sequence was NNGGGT, followed by NNGAGT and NNGAAT. Therefore, if possible, one should avoid choosing a target site with the NNGGAT PAM. We show that SaCas9 is more effective at locations other than those bearing the NNGGAT PAM. This result is consistent with the report that SaCas9 has higher cleavage activity than SpCas9 at target sites harboring the NNGGGT PAM [36]. In contrast to SpCas9, we found that SaCas9 is more sensitive to spacer length, and spacers shorter or longer than 21–22 nt significantly decreased cutting efficiency. In addition, the 22-nt spacer showed considerably lower off-target activities than the 21-nt version. Considering both efficacy and adverse effects, we recommend using 22-nt sgRNA, such as gN21 or GN21, for SaCas9-based applications. However, screening spacers of 21, 22, and 23 nt will benefit clinical applications.
In contrast to early studies using the T7E1 mismatch cleavage assay to assess indel frequencies [14], we conducted deep sequencing to analyze cleavage outcomes accurately. For the first time, we identified a distinctive feature of SaCas9, a considerably lower propensity for the staggered cut. It is well established that SpCas9 has a strong bias for inducing staggered 5′ end overhangs after cleavage, thus resulting in a high probability of +1 nucleotide insertion [11], [12]. However, SaCas9 is ~10 times less likely to generate staggered dsDNA ends. Since +1 NHEJ repair after a staggered cut occurs much faster than other repair pathways, such as HDR [12], this distinctive feature translates into a greater proportion of dsODN insertion and HDR knock-in after SaCas9 cleavage. The two Cas9 orthologs have other distinct features, e.g., SpCas9 maintains binding to the DNA several hours after cleavage [37], whereas SaCas9 releases the DNA at the distal end of the PAM immediately after cleavage [38], [39]. Therefore, SaCas9 functions as a multiple-turnover enzyme, whereas SpCas9 is a single-turnover nuclease. A crystal structure comparison between SaCas9 and SpCas9 revealed notable differences in their functional domains [16]. Further investigation into the differences between SaCas9 and SpCas9 will inspire the design of novel Cas9 proteins with outstanding features.
The identified rules for optimal SaCas9, a 22-nt spacer length, and avoiding targets with the NNGGAT PAM will ensure high-level editing and minimal off-target activity. This stringency will lead to an approximately 5-fold decrease in the number of available sgRNAs compared with SpCas9, limiting its use in base editing and correction of point mutations. However, it will find applications in creating knockout phenotypes and HDR editing by targeting introns [40]. In addition, researchers have engineered SpCas9 variants, such as SpCas9-HF1, with better specificity than wild-type SpCas9 [41], [42]. Similarly, SaCas9 variants with greater specificity have also been developed [43], [44]. It would be interesting to compare the activity and fidelity of HiFi SpCas9 and HiFi SaCas9 variants in future investigations.
We observed that the optimal spacer lengths for SpCas9 and SaCas9 were 20 nt and 22 nt, respectively. However, for a few sgRNAs, a longer or shorter spacer may display greater activity. This might be explained by the sgRNA–DNA binding energy and/or sgRNA secondary structure. Similarly, SaCas9 tended to be more potent than its SpCas9 counterpart when targeting the same sequence. However, occasionally, SaCas9 is less effective than SpCas9. These observations may be attributed to the distinct effects of chromatin structure on the survey efficiency of SpCas9 and SaCas9.
In summary, we systematically described the gene-editing results of SpCas9 and SaCas9. SaCas9 cleaves the target sequence more effectively than SpCas9. Furthermore, the unique feature of SaCas9 to create blunt ends makes it more effective for dsODN insertion and HDR knock-in. Above all, SaCas9 combined with a 22-nt sgRNA showed strikingly lower off-target activities than the SpCas9 system. Our study will provide much-needed guidance for genome editing in human cells and in vivo gene therapy.
Materials and methods
Plasmid vector design and construction
We used CHOPCHOP [45] to design sgRNAs that target 11 sites on human AAVS1, ALB, B2M1, B2M2, CIITA, PD1, TRAC, and CD326. Those sgRNAs with the NGGRRT PAM were selected for simultaneously comparing two Cas9 orthologs. The sgRNA sequences are listed in Table S1. All sgRNAs were initiated with a G to ensure U6 promoter activation. The AAV HDR vectors consisted of a backbone carrying a 141-bp AAV2 inverted terminal repeat (ITR) sequence and a 6-bp short insertion flanked by 700-bp HAs.
The gene-editing experiments were conducted through plasmid electroporation. All plasmid vectors expressing Cas9, sgRNAs, or HDR donors were constructed according to our previous description [22], [46]. Briefly, all fragments were PCR amplified from human genomic DNA or existing plasmids in our lab using KAPA HiFi polymerase (Catalog No. KK2602, KAPA Biosystems, Swiss) and purified using the Zymoclean gel DNA recovery kit (Catalog No. D4001, ZYMO Research, Irvine, CA). Then the fragments were assembled into a plasmid backbone using the NEBuilder HiFi DNA assembly cloning kit (Catalog No. M5520AA, New England Biolabs, France). Multiple colonies were picked for Sanger sequencing (Tsingke Biotechnology, China) and Nanopore sequencing (GenoStarBio, China) to identify the correct clone.
Cell culture
iPSCs used in this study were derived from human adult peripheral blood as previously described [12], [47]. Cells were cultured in mTeSR E8 medium (Catalog No. 85850, Stemcell Technologies, Canada) on Matrigel (Catalog No. 354277, BD, Becton, NJ)-coated tissue culture plates and kept in a humidified incubator at 37 °C and 5% CO2. The medium was refreshed daily, and the cells were routinely passaged using 1 mM EDTA after reaching 80% confluence.
K562 cells were maintained in RPMI-1640 medium (Catalog No. 12633020, Gibco, CA) supplemented with 10% (v/v) fetal bovine serum (Catalog No. 12664025, Gibco) and 1% (v/v) penicillin/streptomycin (Invitrogen, CA) in a fully humidified incubator at 37 °C and 5% CO2. Medium changes were usually performed 2–3 times per week.
AAV6 packaging, purification, and titering
Recombinant AAV6 vectors were produced through a PEI (polyethyleneimine) MAX 40K (Catalog No. 24765-1, Polysciences, PA) transfection system as previously described [48]. Briefly, HEK293T (ATCC) cells at a confluency of ~85% were transfected with plasmids expressing AAV6 capsid, AAV helper, and HDR donor. Benzonase (5 U/ml; Catalog No. 9025654, SCBT, Dallas, TX) was added to the medium 18 h pre-harvest to eliminate the residual plasmid. Cells were treated with 500 mM NaCl (Sigma, MO) 5 days later. The supernatant was harvested 2 h later and then sterilized with a 0.22-μm filter after centrifugation. We used the Minimate (PALL) tangential flow filtration system equipped with a 300-kDa molecular weight cutoff (MWCO) capsule to concentrate the supernatant. Then the AAV6 products were purified with iodixanol gradient centrifugation. The vector titer was analyzed by qPCR as described previously [48].
dsODN preparation
The blunt-ended dsODN used in our study was prepared by annealing two modified ssODNs (5′-P-G*T*TTAATTGAGTTGTCATATGTTAATAACGGT*A*T-3′ and 5′-P-A*T*ACCGTTATTAACATATGACAACTCAATTAA*A*C-3′, where P represents 5′ phosphorylation and * indicates a phosphorothioate linkage, IDT) [47] with the following program: 95 °C for 5 min and then slowly brought to room temperature. dsODN at 50 pmol was used in each transfection.
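As a quick sanity check on the oligo design, the two ssODN sequences above (with the phosphate and phosphorothioate annotations removed) should be exact reverse complements of equal length if they anneal into a blunt-ended duplex. The sketch below verifies this; the helper names are illustrative.

```python
# ssODN sequences from the text, 5' to 3', with modification marks removed.
SENSE = "GTTTAATTGAGTTGTCATATGTTAATAACGGTAT"
ANTISENSE = "ATACCGTTATTAACATATGACAACTCAATTAAAC"


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def is_blunt_duplex(strand_a: str, strand_b: str) -> bool:
    """True if the strands are full-length reverse complements (no overhangs)."""
    return len(strand_a) == len(strand_b) and strand_b == reverse_complement(strand_a)


print(len(SENSE))                         # 34
print(is_blunt_duplex(SENSE, ANTISENSE))  # True
```

The 34-nt length agrees with the 34-bp dsODN bait referred to in the GUIDE-seq methods.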
Plasmid electroporation
Cells were electroporated using a Lonza 2b nucleofector following the manufacturer's recommended protocol. iPSCs at 60%–70% confluency were dissociated into a single-cell suspension and electroporated by human stem cell Nucleofector kit 2 with program B-016. 10 μM ROCK inhibitor Y27632 (Catalog No. 04001210, STEMGENT, PA) was maintained in the iPSC culture on the first day after transfection. K562 electroporation was performed with the Amaxa cell line Nucleofector kit V (Catalog No. VVCA1003, Lonza, Swiss) in program T-016.
We used 1 × 10^6 to 2 × 10^6 cells for each transfection and delivered 1 μg of Cas9 and 0.5 μg of sgRNA plasmids. For iPSCs, 0.5 μg of BCL-XL plasmid was also used for transient BCL-XL overexpression to increase cell viability [46]. For AAV6-mediated HDR, the AAV6 donor was added to the culture after electroporation without further manipulation. The multiplicity of infection (MOI) was typically 10,000–50,000. The AAV6-containing medium was replaced with a fresh culture medium 24 h later.
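The amount of AAV6 donor implied by a given MOI can be estimated with a short calculation. This sketch assumes that MOI here means vector genomes (vg) per cell, which is the usual convention for AAV but is not stated in the text; the stock titer in the example is hypothetical, and the function name is illustrative.

```python
def aav_volume_ul(cells: float, moi: float, titer_vg_per_ml: float) -> float:
    """Volume in microliters of AAV stock that delivers `moi` vector genomes per cell."""
    if titer_vg_per_ml <= 0:
        raise ValueError("titer must be positive")
    vector_genomes = cells * moi
    return vector_genomes * 1000.0 / titer_vg_per_ml


# 1e6 cells at an MOI of 10,000 with a hypothetical stock of 1e13 vg/ml.
print(aav_volume_ul(1e6, 1e4, 1e13))
```

At the upper end of the stated range (2 × 10^6 cells at an MOI of 50,000), the same stock would require one hundred times more volume.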
Flow cytometry
The expression of mNeonGreen was evaluated by flow cytometry 72 h post nucleofection on a BD FACS Canto II flow cytometer. The FITC channel was used to determine the proportion of mNeonGreen-positive cells, which were considered as HDR knock-in cells. Electroporations without Cas9, fluorescent reporter HDR donors, or relevant sgRNAs were also performed as negative controls. The FACS data were analyzed by FlowJo v10.
Quantification of genome-editing events
To evaluate genome-editing efficiency, we performed PCR followed by Illumina deep sequencing. In brief, approximately 1 × 10^5 cells were harvested 72 h after electroporation for genomic DNA extraction as described previously [12]. Nested PCR was conducted to avoid HDR artifacts induced by AAV6 donors. Primers for amplifying target sequences are listed in Tables S2 and S3. DNA amplicon libraries were prepared with barcoded primers using KAPA HiFi DNA polymerase (Roche Sequencing, Swiss). Libraries were pooled and sequenced using Illumina NovaSeq6000 system (Novogene, China). The paired-end raw data were processed with Seqkit [49] and demultiplexed with Barcode-splitter (https://pypi.org/project/barcode-splitter/). The editing frequencies, HDR efficiencies, and dsODN insertion rates were analyzed and visualized using CRISPResso2 [12].
GUIDE-seq
We conducted GUIDE-seq to investigate the off-targets following published methods [31]. We transfected cells with SaCas9 or SpCas9, the corresponding sgRNA, and the dsODN bait. Genomic DNA was extracted 72 h post-transfection, and 2 μg gDNA was used for NGS library construction following the GUIDE-seq method with minor modifications. Briefly, DNA was sheared, followed by adaptor ligation and two rounds of PCR enrichment for 34-bp dsODN baits. The PCR products were pooled for 150-bp paired-end Illumina sequencing (Novogene, China). The raw data were preprocessed with Seqkit [49] and analyzed through the GUIDE-seq software workflow. Alignments were used to identify genome-wide dsODN integration sites. Off-targets bearing up to 6 mismatches within the protospacer were identified.
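The mismatch criterion in the last step can be illustrated with a simple position-by-position comparison. This is a simplified sketch, not the GUIDE-seq software itself: it assumes ungapped sequences of equal length, and the function names are hypothetical.

```python
def count_mismatches(protospacer: str, candidate: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(protospacer) != len(candidate):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(protospacer.upper(), candidate.upper()))


def within_mismatch_limit(protospacer: str, candidate: str, max_mismatches: int = 6) -> bool:
    """True if the candidate site differs from the protospacer at no more than `max_mismatches` positions."""
    return count_mismatches(protospacer, candidate) <= max_mismatches


print(count_mismatches("ACGTACGT", "ACGTTCGA"))  # 2
```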
Statistical analysis and reproducibility
We performed statistical analysis with GraphPad Prism 8. Significance was calculated using paired or unpaired two-tailed Student's t-test for normally distributed data. Wilcoxon matched-pairs tests were conducted for non-normally distributed data. All adjusted P values are indicated in the figures. P values of less than 0.05 were considered statistically significant. The data presented in this study were acquired from at least three independent experiments.
Data availability
The Illumina sequencing raw data have been deposited in the Genome Sequence Archive for Human [50] at the National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences / China National Center for Bioinformation (GSA-Human: HRA002490), and are publicly accessible at https://ngdc.cncb.ac.cn/gsa-human/.
Competing interests
The authors have declared no competing interests.
CRediT authorship contribution statement
Zhi-Xue Yang: Investigation, Methodology, Writing – original draft, Visualization. Ya-Wen Fu: Investigation, Formal analysis, Data curation, Writing – review & editing. Juan-Juan Zhao: Investigation. Feng Zhang: Investigation. Si-Ang Li: Software. Mei Zhao: Investigation. Wei Wen: Investigation. Lei Zhang: Funding acquisition. Tao Cheng: Resources, Funding acquisition. Jian-Ping Zhang: Writing – review & editing, Funding acquisition. Xiao-Bing Zhang: Conceptualization, Supervision, Project administration, Writing – review & editing, Funding acquisition. All authors have read and approved the final manuscript.