What this is
- This research investigates the AbiF system in Clostridioides difficile, a significant pathogen in antibiotic-associated infections.
- The study identifies a novel non-coding RNA (ncRNA), RCd22, which regulates the expression of the AbiF protein, a toxin that can impair bacterial growth.
- Findings suggest that RCd22 acts as an antitoxin, providing insights into bacterial defense mechanisms against phage infections.
Essence
- The study characterizes a new AbiF-like system in C. difficile, revealing that the ncRNA RCd22 regulates the toxic activity of the AbiF protein, which can inhibit bacterial growth.
Key takeaways
- The AbiF protein exhibits toxic activity, leading to reduced growth in C. difficile. Overexpression of AbiF resulted in a significant growth defect in both C. difficile and E. coli.
- RCd22 down-regulates the expression of the abiF gene, demonstrating its role as an antitoxin. Deletion of RCd22 led to a 17× increase in abiF expression compared to the wild-type strain.
- RCd22 interacts directly with the AbiF protein, preventing its toxic effects. Affinity purification identified AbiF as the main protein target of RCd22, and RCd22 co-purified with AbiF, indicating a stable RNA-protein complex.
Caveats
- The study primarily focuses on a single strain of C. difficile, which may limit the generalizability of the findings to other strains or species.
- Further research is needed to fully elucidate the mechanisms by which RCd22 regulates abiF expression and its role in phage resistance.
Definitions
- non-coding RNA (ncRNA): RNA molecules that do not encode proteins but play roles in regulating gene expression.
- Abortive Infection (Abi) system: A bacterial defense mechanism that induces cell death or dormancy to prevent phage replication.
Simplified
Introduction
Bacterial evolution has been profoundly affected by viruses, namely bacteriophages (or phages). To defend themselves against phages, bacteria have developed efficient anti-phage defense systems such as Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR), CRISPR-associated (Cas), Toxin-Antitoxin (TA), Restriction-Modification (RM), superinfection exclusion (Sie) and Abortive Infection (Abi) systems. Abi is a common defense mechanism present in most bacteria. Upon activation, Abi induces cell death (or dormancy) before the infecting phage can complete its replication cycle, thus protecting the colony from phage spreading [1]. This defense strategy is associated with numerous newly discovered immune systems [2]. The total number of defense systems described so far exceeds 130, revealing the complexity of bacterial immunity and the diversity of associated mechanisms [3].
Clostridioides difficile, a human pathogen responsible for antibiotic-associated infections, evolves in a phage-rich environment inside the intestinal tract. To cope with the presence of phages and other invaders, this bacterium has developed several defense mechanisms. In addition to a well-characterized type I-B CRISPR-Cas system [4–7], other anti-phage systems were identified such as RM [8] and superinfection exclusion [9]. A putative abiF-like gene was also found in the C. difficile phi027 prophage [9], φC2 [10], and φCD2301 [11] phage genomes, but its function and contribution to the defense mechanisms remain uncharacterized.
The AbiF system was originally described in Lactococcus lactis [12]. It shares structural similarities with AbiD and AbiD1, two other systems of L. lactis, giving a single AbiD/F group of abortive infection systems [13]. This group belongs to the superfamily Abi_2, widely distributed among bacteria. In L. lactis, AbiF provides resistance to phage infection by interfering with DNA replication [12] while AbiD1 interferes with a phage-encoded endonuclease [14,15]. Aside from what is known in L. lactis, the function and activity of this group of Abi systems remain poorly characterized in other species, and the exact mechanism of action for both AbiD1 and AbiF proteins remains to be defined.
Given the toxicity of Abi systems for the bacterial cell itself, they must be tightly regulated. Most Abi systems are composed of a sensing module responding to phage infection, and an effector module triggering cell death or dormancy [1,2]. The sensing module usually detects a conserved phage protein directly, or sometimes indirectly through the effect of phage infection on the host, like inhibition of the cell machinery [2]. In L. lactis AbiD1, a structured 5’UTR region on the abiD1 mRNA was shown to contribute to the control of abiD1 translation [15]. During phage infection, a conserved phage protein binds to the 5′UTR of the abiD1 mRNA, leading to the activation of abiD1 translation [15]. A short premature terminated transcript has been detected upstream of the abiD1 coding sequence, suggesting that the AbiD1 system could be also regulated at the transcriptional level [16]. Another example of an RNA-based mechanism associated with the abortive infection strategy is the type III toxin-antitoxin (TA) system. In this case, the antitoxin is a non-coding RNA (ncRNA) that neutralizes the toxin protein through an RNA-protein interaction [17,18]. This system is activated following the inhibition of host transcription during phage infection, leading to the release of the stable toxin [19].
ncRNAs are key components of various regulatory mechanisms that control virulence in major pathogens. They also contribute to anti-phage systems including CRISPR-Cas and type I, III and VIII TA with antitoxin RNAs neutralizing the toxin by RNA-base pairing (type I and VIII) or direct protein interaction (type III TA). Recently discovered defense mechanisms considered as Abi systems also rely on RNAs either for their activation (CBASS [20]) or function (retrons [21,22] and DRTs [23]). Numerous ncRNAs were identified in C. difficile, including CRISPR RNA, cis and trans riboregulators and riboswitches, showing the importance of RNA-dependent regulation mechanisms in this human pathogen [4]. Our previous work demonstrated that CRISPR arrays and type I TA modules are frequently associated with prophages in C. difficile. They provide efficient defense against other invading phages, and/or contribute to the stability of these regions [6,7,24]. In this study, we describe RCd22, a ncRNA specific to the hypervirulent C. difficile strain R20291. Its sequence was found upstream of a putative abiF-like gene located in the lysis module of phi027, a prophage highly conserved among ribotype 027 C. difficile isolates [25,26]. This work provides the characterization of a putative AbiF-like system in C. difficile, and reveals the tight regulation of this system by a ncRNA.
Results
Identification of an AbiF-like system conserved in hypervirulent ribotype 027 strains of C. difficile
To first assess the distribution of the AbiF system across the phylogenetic tree, Blast analyses of Abi_2/AbiF conserved domains were done on the complete genomes of 46,864 bacterial and 681 archaeal species. Results showed that this protein family is largely distributed in bacteria belonging to the group of Bacillota, comprising Bacilli and Clostridia classes. It is also found in the group of Pseudomonadota, comprising alpha, beta and gammaproteobacteria classes (Fig 1A). The phylogenetic tree also showed that the Rickettsiales order (alphaproteobacteria class), Borreliaceae family (betaproteobacteria class) and several other genera and species lack the abiF/abi_2 gene. Interestingly, when the abiF/abi_2 gene was detected in a species, it was not necessarily detected in all genomes of that species. For example, among 1,800 genomes of Staphylococcus aureus analyzed, 200 contain no abiF/abi_2 homologous genes (11%). Bacterial species generally carry between one and eight copies of a putative abi_2/abiF gene in their genome (red bars in Fig 1A), with a majority carrying one or two copies per genome.
The biological function of the AbiF system originally found on a L. lactis plasmid was characterized [12], but the function of the chromosome-encoded version remains unknown. Analysis of the gene environment and genome organization could give clues regarding the biological function of the putative AbiF/Abi_2 protein. We therefore searched for genes located near abiF/abi_2 in the 7,828 annotated genomes carrying at least one abiF/abi_2 gene homolog. We evaluated the annotated functions of gene products five positions upstream and five positions downstream from the abiF/abi_2 gene (Fig 1B). We grouped gene products according to their biological function. Genes present with an abundance of less than 10% were grouped in the category “others”. Despite the functional variability in the closest genomic environment, defense mechanisms and mobilome-associated gene functions were enriched in the proximity of the abiF/abi_2 gene at the positions -1 and +1. The presence of abiF/abi_2 in prophages or transposons could explain the large distribution of this gene in bacteria. Other copies of abiF/abi_2 (assigned to the defense mechanism functional class) were frequently found at the -1 or +1 position, suggesting potential gene duplication events. Other gene associations implying translation and posttranslational modification were found downstream from the abiF/abi_2 gene, while genes related to nucleotide metabolism and transcription were found upstream of abiF/abi_2.
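The neighborhood tally described above (categories counted per position, with anything under 10% pooled as “others”) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the input format, the function name `summarize_neighborhood` and the toy counts are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def summarize_neighborhood(records, min_fraction=0.10):
    """Summarize functional categories around a focal gene.

    records: iterable of (offset, category) pairs, where offset is the gene
    position relative to the focal gene (-5..-1 upstream, +1..+5 downstream).
    Returns {offset: {category: fraction}}; categories whose fraction at a
    given offset is below min_fraction are pooled into "others".
    """
    counts = defaultdict(Counter)
    for offset, category in records:
        counts[offset][category] += 1

    summary = {}
    for offset, counter in counts.items():
        total = sum(counter.values())
        kept = {}
        pooled = 0
        for category, n in counter.items():
            if n / total < min_fraction:
                pooled += n  # rare category, merged into "others"
            else:
                kept[category] = n / total
        if pooled:
            kept["others"] = pooled / total
        summary[offset] = kept
    return summary


# Toy input with made-up categories and counts (not data from the study).
toy = (
    [(-1, "Defense mechanisms")] * 12
    + [(-1, "Mobilome")] * 7
    + [(-1, "Transport")] * 1
)
result = summarize_neighborhood(toy)
```

In this toy case the single "Transport" gene represents 5% of position -1 and is therefore reported under "others".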
Abi systems have never been functionally characterized in clostridial species, although bioinformatics analyses revealed the presence of putative AbiF systems in prophages of C. difficile [9–11] (Fig 2). In this study, we sought to characterize the function of a putative AbiF system located in the phi027 prophage present in the chromosome of the hypervirulent strain R20291 [9] (Fig 2A) and generally well-conserved in ribotype 027 isolates of C. difficile [25,26]. Among 170 analysed genomes of C. difficile, 106 lack the abiF/abi_2 genes (62%). In C. difficile strains carrying a putative abiF gene, only one copy is present. This putative AbiF system is encoded by a single gene CDR20291_1462 (GenBank accession number: CBE04038.1), hereafter named abiFCd. A BlastP search using this protein sequence as the query revealed homologs in the genomes of C. difficile phage φC2 (NCBI accession number: YP_001110753.1, 99.88% identity) [5,10] and φCD2301 (NCBI accession number: QVW56672.1, 99.54% identity) (Fig 2B) [11]. The hypothetical AbiFCd protein belongs to the superfamily Abi_2 (cl01988), containing the Abi_2 conserved domain (pfam07751) and the AbiF conserved domain (COG4823). The AbiF protein originally identified in a L. lactis plasmid shares 34.39% identity with AbiFCd.
Of note, all putative abiF genes in C. difficile strains are located between a predicted phage holin and a gene encoding an endolysin, within the lysis module of a prophage genome (Fig 2B). The association with lysis module was also observed in other bacterial species, but the abiF/abi_2 gene was located downstream from this module. Therefore, the location of abiF inside the lysis module is specific to C. difficile strains (Fig 2A). The location of a putative anti-phage system inside a prophage is intriguing. It could provide benefits for both the bacterial host and the prophage, by protecting bacteria from infections by unrelated phages, and/or stabilizing the prophage in the bacterial chromosome, as shown for type I toxin-antitoxin systems [24,27,28]. We further investigated the colocalization of the abiF/abi_2 gene with prophage elements. Of the 7,828 annotated genomes carrying at least one abiF/abi_2 gene homolog, 2,600 representative genomes were analyzed with geNomad [29]. The coordinates of the identified prophages were then compared with the coordinates of the abiF/abi_2 gene to detect overlaps. A total of 288 prophages were found to carry an abiF/abi_2 gene homolog. Analysis of the -5 to +5 genomic environment surrounding abiF/abi_2 revealed that the category “Mobilome: prophages, transposons” was enriched in the -1 and +1 position, as well as “cell wall/membrane/envelope biogenesis” (S1A Fig). The latter category is consistent with abiF/abi_2 being closely associated with the lysis module in prophages (i.e., holin and endolysin genes). In the 2,312 genomes where abiF/abi_2 was not located on a prophage, the functions associated with “Defense mechanisms” and “Replication, recombination and repair” were enriched (S1B Fig).
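The coordinate comparison used to decide whether an abiF/abi_2 homolog lies on a predicted prophage amounts to an interval-overlap test. The sketch below is only illustrative: the tuple layout, the function names and the coordinates are hypothetical, and it assumes that any overlap on the same contig counts as a hit, which the text does not specify.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals [a_start, a_end] and [b_start, b_end] overlap."""
    return a_start <= b_end and b_start <= a_end


def genes_on_prophages(genes, prophages):
    """Return the IDs of genes that overlap at least one prophage.

    genes: list of (gene_id, contig, start, end)
    prophages: list of (contig, start, end)
    Coordinates are assumed to be 1-based and inclusive.
    """
    hits = []
    for gene_id, contig, g_start, g_end in genes:
        for p_contig, p_start, p_end in prophages:
            if contig == p_contig and overlaps(g_start, g_end, p_start, p_end):
                hits.append(gene_id)
                break  # one overlapping prophage is enough
    return hits


# Made-up coordinates for illustration only.
toy_genes = [
    ("abiF_a", "contig1", 1500, 2400),   # inside the prophage below
    ("abiF_b", "contig1", 90000, 90900), # same contig, outside the prophage
    ("abiF_c", "contig2", 1500, 2400),   # same coordinates, different contig
]
toy_prophages = [("contig1", 1000, 40000)]
on_prophage = genes_on_prophages(toy_genes, toy_prophages)
```

Genes absent from the returned list would fall into the "not located on a prophage" group.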

Fig 1. Distribution and environment of AbiF/Abi_2 systems in bacteria. (A) Distribution of AbiF/Abi_2 systems in 46,864 bacterial genomes from different phyla. The phylogenetic tree was made with iTOL v7 server by downloading the 3 files: the phylogenetic tree in newick format, the count of AbiF systems per genome added to iTOL template_simplebar, and the colors added to iTOL template_color_strip (S6 Table). Colors were defined according to taxon levels to highlight C. difficile and S. aureus and visualize taxa with no AbiF/Abi_2 system. Other taxonomic levels were colored when they constituted a balanced group according to the number of genomes in which the AbiF-like system was searched, and those remaining are shown in black. Red bars correspond to the number (1 up to 8 copies) of abiF/abi_2 genes in genomes. Black stars correspond to bacteria order/family/species without abiF/abi_2 genes. (B) Environment of abiF/abi_2 genes in bacterial genome. Plot of genes located at gene positions -5 to -1 (upstream) and +1 to +5 (downstream) of the abiF/abi_2 gene (position 0). Genes that are the most frequently located at each position are represented by the colored portions, where color identifies categories of biological function. Genes present with an abundance of less than 10% were grouped in the “others” category.

Fig 2. Identification of putative AbiF/Abi_2 systems associated with phage lysis module. (A) Localization of putative abiF/abi_2 genes in selected bacterial genomes. (B) Comparison of the genome sequence of three C. difficile prophages: phi-027 prophage (top), φC2 (middle) and φCD2301 (bottom). Predicted ORFs and the direction of transcription are indicated by arrows. Green arrows represent putative holin genes; dark blue arrows represent putative abiF-like genes; black arrows represent putative endolysin genes. Conserved regions are shaded in grey and color intensity corresponds to sequence identity level (80% to 100%). Genomic comparisons were performed using BLASTn and the figure was produced using Easyfig 2.2.5.
Identification of a ncRNA associated with the putative AbiFCd system
Comparative analysis of RNA-seq data from the C. difficile strain R20291 (ribotype 027) and the reference strain 630Δerm grown to the late exponential phase in TY medium [7] revealed the presence of a potential ncRNA gene specific to the R20291 strain and located upstream of the abiFCd gene. In particular, RNA-seq reads were detected in the intergenic region between the holin and abiFCd genes of the R20291 strain (Fig 3A). In agreement with RNA-seq data, Northern blot analyses on RNA extracts from the R20291 strain revealed the presence of a major transcript of about 100 nt, which we named RCd22, as well as a less abundant and longer transcript (Fig 3B). The profiling of RCd22 expression during growth showed that the maximum level of expression is reached during late exponential phase (6h of growth in TY medium) (Fig 3B). The laboratory reference strain 630Δerm lacks the phi027 prophage and as expected, no RNA could be detected by Northern blot using a RCd22-specific probe (Fig 3B). In silico analysis of the corresponding region in the R20291 genome revealed the presence of a Sigma A-dependent promoter consensus upstream of the RCd22 sequence. A predicted Rho-independent terminator was also detected in the RCd22 3’ end, in concordance with the estimated size of about 100–130 nucleotides for the detected transcripts (Fig 3C). The RCd22 gene transcriptional start site (TSS) was located 168 nucleotides upstream of the abiFCd ATG start codon associated with a ribosome binding site (AGGTGA) for initiation of translation. Moreover, no promoter could be predicted between the RCd22 terminator and the abiFCd gene, suggesting that RCd22 and abiFCd are co-transcribed. To map the 5’ and 3’ extremities of transcripts derived from this region, we performed a 5’/3’ Rapid amplification of cDNA ends (5’/3’ RACE) experiment. The major 5’-end was mapped to the A nucleotide position associated with the consensus elements of the SigA-dependent promoter. 
Several 3’ extremities were mapped just upstream or inside the terminator hairpin structure followed by the four-U stretch that corresponds to the full-length 132 nt-transcript and shorter degradation or processing products ranging from 88 to 115 nt in length (Fig 3C). The major 3’-end was located 8 nt upstream of the terminator stem-loop corresponding to the 98-nt transcript. This approach allowed us to confirm the presence of two groups of transcripts sharing the same TSS, a short one ranging from 88 nt to 115 nt in length and corresponding to the RCd22 sequence and a long one of 1,086 nt corresponding to the transcriptional readthrough leading to co-transcription of RCd22 and abiFCd genes (Fig 3C). RNA-seq data showed that the short form including the RCd22 sequence alone and lacking the abiFCd coding sequence is mainly transcribed under laboratory conditions (Fig 3A). These data suggest that the ncRNA RCd22 could regulate the expression of the downstream abiFCd gene through a “riboswitch-like” mechanism associated with premature termination of transcription or maturation. Moreover, a conserved abiF motif (Rfam: RF03085) has been identified in RCd22. This motif is associated with putative abiF genes in various bacterial species and especially in environmental isolates of C. difficile [30]. In RCd22, one copy of this motif is located between positions 9 and 41 according to the TSS, and a second copy is also present between positions 61 and 95 (Fig 3D). A RCd22 sequence was also found upstream of the abiFCd homologous genes in phages φC2 and φCD2301. This conserved co-localization of RCd22 and abiF suggests a functional link between RCd22 and the putative AbiF system in C. difficile.

Fig 3. A ncRNA RCd22 is associated with the hypothetical AbiFCd system. (A) RNA sequencing of C. difficile R20291 strain showing transcript reads of a ncRNA, named RCd22, upstream of the abiFCd gene, encoding a hypothetical AbiFCd protein. (B) Detection of RCd22 by Northern Blot in the R20291 strain and the absence of the signal in the C. difficile reference laboratory strain 630Δerm. RNA samples were extracted from the R20291 strain grown at exponential phase (E, 4h of growth), late exponential phase (LE, 6h of growth) or entry into stationary phase (S, 10h of growth) and from the 630Δerm strain grown at late exponential phase (LE, 6h of growth). 5S rRNA is used as a loading control. (C) Detection of two groups of transcripts by 5’/3’RACE experiments in the R20291 strain. Short transcripts corresponding to RCd22 and a long transcript corresponding to the co-transcription of RCd22 and abiFCd. Light blue letters correspond to RCd22 and light blue bold letters correspond to the two abiF motifs; nucleotides shaded in light blue correspond to the abiFCd encoding gene; black boxes correspond to the consensus -10 and -35 elements of the predicted SigA-dependent promoter; nucleotides shaded in pink correspond to the predicted Rho-independent terminators; underlined letters correspond to the predicted ribosome binding site; black arrows depict 5’ ends and red arrows depict 3’ ends, the arrow size is proportional to the number of extremities identified by 5’/3’RACE. (D) Prediction of RCd22 secondary structure by RNAfold, and identification of two abiF conserved motifs (Rfam RF03085). Blue boxes correspond to abiF conserved motifs; pink box corresponds to the predicted Rho-independent terminator; R = A or G; Y = U or C; in abiF motif: red characters correspond to 97% nucleotide identity, black characters to 90% nucleotide identity and grey characters to 75% nucleotide identity, red circles correspond to 97% nucleotide conservation.
Overexpression of abiFCd induces growth defects in C. difficile and E. coli strains
The toxins from Abi systems generally induce bacterial death or growth arrest when overexpressed from plasmids [17,31,32]. To determine whether the putative AbiFCd protein could be toxic for the cell, we analyzed the effect of its overexpression on the growth of C. difficile in broth and on agar. We first generated a plasmid for inducible overexpression of the abiFCd gene under the control of the anhydrotetracycline (ATc)-inducible Ptet promoter (p/abiFCd) (Fig 4A). The construct was transferred by conjugation into the C. difficile strain 630Δerm in which both φCD630–1 and φCD630–2 prophages were deleted to avoid potential interference (CD156). CD156 is used in this study as a control strain because it lacks the AbiFCd system. As shown in Fig 4B, after 4h of induction with the ATc inducer in liquid culture, bacterial growth was severely reduced for the strain carrying p/abiFCd plasmid as compared to the strain carrying an empty vector. A similar growth defect was observed in the R20291 strain overexpressing abiFCd from a plasmid, and also carrying the AbiFCd system on the phi027 prophage (S2A Fig). The bacteriostatic effect of abiFCd overexpression was further confirmed by monitoring CFU counts over time (S2B Fig). No growth difference was observed for C. difficile strain CD156 carrying p/abiFCd or an empty vector in the absence of ATc both in liquid medium (S3 Fig) and on TY plates (Fig 4C). By contrast, the overexpression of abiFCd caused a growth defect and reduced colony size on plates of the C. difficile strain carrying p/abiFCd in the presence of ATc as compared to the control strain (Fig 4C). To assess the toxic activity of AbiFCd in a heterologous host, we also analyzed the effect of its overexpression in E. coli grown in liquid culture. A severe growth defect was observed in the E. coli strain carrying the p/abiFCd plasmid in the presence of ATc as compared to the control strain with empty vector (S2C Fig). 
Altogether, these results demonstrate the toxic activity of the AbiFCd protein that induces growth reduction without cell death in E. coli and C. difficile.

Fig 4. Impact of abiFCd gene overexpression on C. difficile growth. (A) Schematic representation of plasmids used for these experiments. p/abiFCd corresponds to pDIA6103 plasmid carrying abiFCd under the control of the inducible Ptet promoter; p/abiFCd-RCd22cis carries both RCd22 and abiFCd under the control of Ptet promoter, separated by a Rho-independent terminator; p/abiFCd-RCd22trans carries abiFCd under the control of Ptet promoter and RCd22 under the control of its native promoter (PRCd22). (B) Growth curve of C. difficile strain CD156 (630 Δerm ΔphiCD630-1 ΔphiCD630-2) carrying pDIA6103 empty plasmid (empty), p/abiFCd, p/abiFCd-RCd22cis or p/abiFCd-RCd22trans. Induction of the Ptet promoter by 250 ng/mL ATc is indicated by the red arrow. Plotted values represent means and error bars represent standard error of the means (N = 3 biologically independent samples). (C) Spot assay of C. difficile CD156 strains carrying the different plasmid constructions on TY agar plates supplemented with Tm and with or without 250 ng/mL ATc inducer. Pictures are taken after 48h incubation at 37°C.
RCd22 down-regulates abiFCd expression at the transcriptional level and represses AbiFCd toxic activity
To investigate the regulatory mechanism of the putative AbiFCd system in the R20291 strain, we first assessed the possibility of a transcriptional control. We deleted the RCd22 sequence from the R20291 genome, including both abiF motifs and the predicted terminator, but kept the promoter of RCd22 and the RBS of abiFCd. The absence of RCd22 expression in the ΔRCd22 strain was confirmed by qRT-PCR (S4A Fig). We then measured the level of abiFCd expression by qRT-PCR in the absence or the presence of RCd22 (Fig 5A). In the strain deleted for RCd22, we observed a 17-fold increase in abiFCd expression as compared to the wild-type control strain. This shows down-regulation of abiFCd expression by RCd22. Accordingly, the deletion of RCd22 also induced a growth defect and a decreased growth yield in the R20291 strain (S4B Fig). The stronger growth defect observed in the 630Δerm strain correlated with a higher level of abiFCd expression from the plasmid (Figs 4B and S5A). To further analyze the effect of RCd22 on abiFCd expression, the RCd22 sequence was fused to the phoZ reporter gene on a plasmid. The expression was under the control of a constitutive promoter (Pcwp2) and the alkaline phosphatase activity was measured (Fig 5B). The plasmid carrying the RCd22-phoZ transcriptional fusion (p/RCd22-phoZ) and the control plasmid with phoZ under the control of the Pcwp2 constitutive promoter (p/phoZ) were conjugated in four different strains: the 630Δerm naturally lacking the AbiF system, the R20291 wild-type strain carrying the phi027 prophage and the AbiF system, the R20291 Δphi027 strain deleted for the prophage, and the R20291 ΔRCd22 strain deleted for the RCd22 gene. The presence of RCd22 upstream of the phoZ gene severely reduced the alkaline phosphatase activity in all strains analyzed (Fig 5B). These results suggest that RCd22 down-regulates abiFCd expression at the transcriptional level independently from the genetic background used. 
They also suggest that the presence of the Rho-independent terminator at the 3’-end of RCd22 leads to a strong termination of transcription, thereby reducing the read-through transcription into the downstream gene abiFCd.
We then tested the role of RCd22 as a repressor of AbiFCd toxic activity. The RCd22 sequence was cloned in the p/abiFCd plasmid, upstream of the abiFCd gene to give p/abiFCd-RCd22cis (Fig 4A). In agreement with the ability of RCd22 to down-regulate abiFCd expression, the presence of RCd22 in cis repressed the toxic activity of AbiF, leading to a complete reversion of the growth defect in liquid culture in the presence of ATc inducer (Fig 4B). Similar results were obtained on agar plates since the presence of RCd22 in cis led to the recovery of normal colony size and normal growth (Fig 4C). To get insights into the RCd22 mode of action, we also cloned it under the control of its own promoter in trans in a different location on the plasmid (p/abiFCd-RCd22trans). Co-expression of the RCd22 in trans along with the abiFCd, also led to the inhibition of the toxic activity of AbiFCd (Fig 4B and 4C). The transcript abundance of each gene was controlled by qRT-PCR after 4h of induction (S5 Fig). In agreement with reporter fusion assays (Fig 5B), RCd22 expression in cis negatively impacted the expression of abiFCd, while RCd22 expression in trans did not affect abiFCd expression (S5 Fig). These results clearly show that RCd22 can act both in cis and in trans to control AbiFCd toxic activity.

Fig 5. RCd22 down-regulates abiFCd gene expression at the transcriptional level. (A) abiFCd expression level measured by qRT-PCR, in R20291 ΔRCd22 compared to WT (N = 4, biologically independent samples). (B) Alkaline phosphatase activity of the RCd22-phoZ reporter fusion (p/RCd22-phoZ) under the control of the constitutive Pcwp2 promoter, compared to empty plasmid (p/phoZ) in different genetic backgrounds of C. difficile. Values represent the mean ± standard error of the mean (N = 3 biologically independent samples). Asterisks indicate significant differences (t test): * indicates a p-value<0.05; ** indicates a p-value<0.01; *** indicates a p-value<0.001.
RCd22 interacts with the AbiFCd protein
We then investigated the mechanism of regulation in trans by RCd22 of abiFCd. We wondered whether RCd22 interacts with abiFCd mRNA or AbiFCd protein to counteract its toxic activity. To discriminate these two hypotheses and to identify all mRNA and protein targets of RCd22, we used an MS2-Affinity Purification approach coupled with RNA-Sequencing (MAPS) or Mass Spectrometry [33]. For this experiment, we first added an MS2-tag to the 5’-end of the RCd22 ncRNA and integrated this construct into a plasmid under the control of an inducible Ptet promoter (Fig 6A). We conjugated the resulting plasmid into the R20291 ΔRCd22 strain and induced the expression of the MS2-RCd22 ncRNA during exponential growth phase in the presence of ATc inducer. We then enriched all mRNA and protein targets by affinity purification using an MBP-MS2 protein complex that interacts with the MS2-tag. The mRNA targets were identified by RNA-sequencing while protein targets were identified by mass spectrometry. We compared these results with the strain carrying the plasmid with the MS2-tag alone as a control, to eliminate targets interacting with MS2-tag. As expected, RCd22 appeared as the most enriched RNA in this analysis. In addition to CDR20291_RS16617 corresponding to a lysine riboswitch upstream of the asd (CDR20291_3084) gene, many potential mRNA targets were identified with this technique, including mRNA involved in nucleic acid metabolism (S6A Fig and S4 Table). We then used IntaRNA to predict in silico the RNA-RNA interactions between RCd22 and its potential targets. We observed that single stranded regions of RCd22 base-paired with the ORF of mRNA potential targets (S6B Fig). Six mRNAs, including the most enriched mRNAs in MAPS analysis and abiFCd mRNA containing the predicted sites of interaction with RCd22, were in vitro transcribed and mixed with radiolabeled RCd22. 
For three of them (CDR20291_1558, 2768 and 3357), the transcribed RNA was divided into a 5’ and a 3’ part for interaction analysis. Gel shift assays did not reveal any formation of RNA-RNA complexes (S6C Fig), except for the CDR20291_1558 3’ fragment, where a complex with RCd22 appeared, exhibiting a weak affinity (S6C Fig). This indicates that RCd22 generally does not act on mRNA targets in trans. For the protein targets, mass spectrometry analysis of MAPS protein fraction identified the AbiFCd protein as the main protein target of RCd22 (Fig 6B and S5 Table), with an enrichment log2 fold change (FC) of 6. The enrichment score of other potential protein targets was under a log2(FC) of 2.53. This result indicates that RCd22 interacts with the AbiFCd protein. We concluded that the main target of RCd22 is the AbiFCd protein. Thus, the ncRNA could act as an antitoxin by directly targeting and neutralizing the AbiFCd protein, similar to type III TA systems.
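To put the enrichment values above on a linear scale, a log2 fold change can be converted back to a plain fold change with 2 raised to that value. The short sketch below only illustrates this arithmetic; the function name is hypothetical and the two inputs are the values quoted in the text.

```python
def fold_change(log2_fc):
    """Convert a log2 fold change into a linear fold change."""
    return 2.0 ** log2_fc


# Values quoted in the text: log2(FC) of 6 for AbiFCd, and 2.53 as the
# upper bound for the other candidate protein targets.
abif_fold = fold_change(6)
others_max_fold = fold_change(2.53)
print(f"AbiFCd: {abif_fold:.0f}-fold; other candidates: below {others_max_fold:.1f}-fold")
```

On this scale, AbiFCd is enriched 64-fold, whereas every other candidate stays below roughly 6-fold.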
To investigate the RCd22-AbiFCd interaction in vitro, we first purified the 10xHis-tagged AbiFCd protein expressed in E. coli (S7 Fig) and confirmed that this recombinant protein conserved its toxic activity on cell growth (S7A Fig). Since the AbiFCd protein is toxic for E. coli (S2C and S7A Figs) and its production was not efficient (S7B Fig), we co-expressed RCd22 with abiFCd to counteract the toxic activity during abiFCd induction and improve AbiFCd production (S7C Fig). The analysis of elution fractions from the first HisTrap affinity purification step on SDS-PAGE revealed a protein of the expected size at 34.2 kDa (Fig 6C). In agreement with MAPS results, we were able to co-purify the ncRNA RCd22 by targeting AbiFCd in the absence of nuclease treatment. Indeed, RNA extraction followed by agarose gel and RT-qPCR analysis detected RCd22 in the HisTrap column elution fractions (Figs 6C, 6D, and S7D). However, despite numerous attempts, we were unable to obtain purified AbiFCd protein free of RCd22, suggesting the formation of a stable RNA-protein complex with a low dissociation rate.

Fig 6. RCd22 interacts with the AbiFCd protein. (A) Schematic representation of MS2 Affinity Purification experiment coupled with mass spectrometry or RNA sequencing. (B) Volcano plot of RCd22 protein targets. Red dots indicate enriched proteins targeted by RCd22 compared to the control. (C) Co-purification of RCd22 with Histag-AbiFCd (AbiF WT) or with Histag-AbiFCd carrying mutations R202D and H207D (AbiF mut). SDS-PAGE of WT or mutated Histag-AbiFCd purified fraction from HisTrap affinity column and agarose gel after RNA extraction from WT or mutated Histag-AbiFCd purified fraction. (D) Representative result of at least two independent experiments for qRT-PCR of RNA extracted from WT or mutated Histag-AbiFCd purified fraction. Several targets were tested, relative RNA quantity is estimated according to the standard curve with this formula: 10^((Cq-intercept)/slope).
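The standard-curve formula quoted in the caption, 10^((Cq-intercept)/slope), can be illustrated with a small sketch. It assumes the standard curve is an ordinary least-squares line of Cq against log10(quantity), which is the usual convention but is not stated in the caption; the function names and the dilution series are hypothetical.

```python
def fit_standard_curve(log10_quantities, cqs):
    """Least-squares fit of Cq = slope * log10(quantity) + intercept."""
    n = len(cqs)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(cqs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_quantities, cqs))
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def relative_quantity(cq, intercept, slope):
    """Relative RNA quantity from a Cq value: 10^((Cq - intercept) / slope)."""
    return 10 ** ((cq - intercept) / slope)


# Made-up tenfold dilution series: each tenfold increase lowers Cq by 3 cycles.
slope, intercept = fit_standard_curve([0, 1, 2, 3], [30.0, 27.0, 24.0, 21.0])
quantity = relative_quantity(24.0, intercept, slope)
```

With this toy curve, a sample at Cq 24 maps back to the third dilution point (log10 quantity of 2); a lower Cq gives a higher estimated quantity because the slope is negative.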
The RCd22 secondary structure and both abiF motifs are important for neutralization of AbiFCd toxic activity
In silico predictions suggested that RCd22 is a highly structured RNA with two conserved abiF motifs folding into stable hairpin structures that could constitute important functional elements of the antitoxin RNA (Figs 3D and 7A). To test this hypothesis, we first introduced mutations in both RCd22 abiF motifs and then tested their effects on the antitoxin activity of RCd22 (Fig 7A). The first type of mutation (mut1, Fig 7A) was designed to affect the stem structure stability in both RCd22 abiF motifs (abiF1 and abiF2) by replacing a G by a C, thus removing one base pair within this motif. The second type of mutation (mut2, Fig 7A) was designed to change the sequence without affecting the structure of RCd22. For this, two C and G nucleotides in the stem of the abiF1 and abiF2 motifs were inverted, thus keeping the base-pairing inside the motif. With these two types of mutations, we sought to determine whether neutralization of AbiFCd toxicity depends on the structure or on the sequence of RCd22. These mutations were introduced into plasmids co-expressing abiFCd and RCd22 in cis or in trans. The expression of both abiFCd and RCd22 was validated by qRT-PCR analysis after their introduction into C. difficile by conjugation (S5 Fig). On the one hand, the RCd22 mut2 variants that retained the stem-loop structures down-regulated the expression of abiFCd in cis, similar to WT RCd22 expressed in cis. On the other hand, the RCd22 mut1 variants, carrying destabilized stem structures, were no longer able to repress abiFCd expression (S5 Fig). The capacity of mutated RCd22 to neutralize the toxic activity of AbiFCd was then tested in liquid medium and on agar plates. RCd22 variants with mutations destabilizing the stem structure (RCd22mut1) were no longer able to neutralize the toxicity of AbiFCd. Indeed, neither the growth defect nor the colony size could be restored, in contrast to the normal growth of strains carrying unmodified RCd22 constructs (Fig 7B and 7C).
This effect was observed for both constructs expressing the mutated RCd22 in cis and in trans. A similar action in trans of mutated RCd22 was expected for both constructs, through the produced RNAs that interact with the toxin. However, for the cis-located RCd22 mutant variants, the introduced mutations could destabilize the stem structure and affect the conformation of the terminator, thus leading to abiFCd transcription and AbiFCd translation. We thus concluded that the stem structure within the two RCd22 abiF motifs was important for the interaction with the AbiFCd protein and neutralization of its toxic activity. On the contrary, compensatory mutations that only changed the RCd22 sequence inside the stem without altering its structure (RCd22mut2) did not affect neutralization of the toxic activity in cis and in trans (Fig 7B and 7C). These nucleotides might not be important for the interaction with AbiFCd, as they are base-paired in the stem and are probably not accessible for the interaction with the protein.
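The logic of the two mutation types can be illustrated with a small Python sketch that counts base pairs in a hairpin stem. The stem sequences below are hypothetical toy examples and are not the RCd22 abiF motif sequences. The function name is illustrative, and accepting G-U wobble pairs alongside Watson-Crick pairs is a simplifying assumption.

```python
# Watson-Crick pairs plus the G-U wobble pair (a simplifying assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_positions(five_arm: str, three_arm: str) -> int:
    """Count base pairs between the two arms of a stem, both written 5'->3'."""
    if len(five_arm) != len(three_arm):
        raise ValueError("both stem arms must have the same length")
    # The first base of the 5' arm pairs with the last base of the 3' arm.
    return sum((a, b) in PAIRS for a, b in zip(five_arm, reversed(three_arm)))


# Hypothetical 5-bp stem (NOT the real abiF motif sequence).
wild_type = ("GGCAC", "GUGCC")
mut1_like = ("GCCAC", "GUGCC")  # one G replaced by C: a base pair is lost
mut2_like = ("GCCAC", "GUGGC")  # G-C pair inverted to C-G: pairing is kept

for name, (five, three) in [("WT", wild_type), ("mut1-like", mut1_like), ("mut2-like", mut2_like)]:
    print(f"{name:10s} {paired_positions(five, three)} of {len(five)} positions paired")
```

In this toy stem, the mut1-like change removes one base pair, whereas the mut2-like change alters the sequence but leaves every position paired. This mirrors the structure-versus-sequence distinction tested with RCd22mut1 and RCd22mut2.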

Importance of abiF motifs and RCd22 structure for AbiFCd-RCd22 interaction. (A) Schematic representations of mutations introduced in the RCd22 conserved abiF motifs. The predicted nucleotides potentially contributing to AbiFCd-RCd22 interaction are indicated in red. (B) Growth curve of C. difficile strain CD156 carrying either the pDIA6103 empty plasmid (empty), or expressing abiFCd under the control of the inducible Ptet promoter (p/abiFCd) or co-expressing abiFCd and RCd22 (in cis or in trans, with or without mutations 1 or 2). Induction of the Ptet promoter by 250ng/mL ATc is indicated by the red arrow. Plotted values represent the mean ± standard error of the mean (N = 3 biologically independent samples). (C) Spot assay of C. difficile CD156 strains on TY agar plates supplemented with Tm and 250ng/mL ATc inducer. Pictures are taken after 48h incubation at 37°C.
In silico prediction of RCd22-AbiFCd complex structure and identification of key residues for toxicity
We further analyzed the RCd22-AbiFCd interaction to determine which amino acids and nucleotides are involved in this complex formation. We predicted the AbiFCd protein structure using AlphaFold2-Multimer (Fig 8A) and the structure of the RCd22-AbiFCd complex using AlphaFold3 (Fig 8B). The AlphaFold2 structural model for the AbiFCd protein (best out of 5 models, see Methods) suggested with very high confidence that it could form a homodimer (predicted TM-score (pTM) = 0.946 and interface predicted TM-score (ipTM) = 0.935). The AlphaFold3 structural model (best out of 25 models, see Methods) including two copies of AbiFCd and one copy of a short (98-nt) abundant transcript of RCd22 (Fig 8B) displayed overall good confidence (pTM = 0.8 and ipTM = 0.71). This was mostly driven by the very high confidence of the protein homodimer prediction (protein chain pTM = 0.87 and protein-protein ipTM = 0.87). The protein-protein homodimer predicted by AlphaFold3 could be superimposed almost exactly on the AlphaFold2 prediction (root mean square deviation (RMSD) = 0.5 Å). The RNA molecule in the AlphaFold3 prediction had much lower confidence than the proteins (RNA pTM = 0.12 and RNA-protein ipTM = 0.17-0.18). This is not uncommon in our experience with AlphaFold3 predictions involving RNA molecules with no close homologs in the Protein Data Bank. 
Our confidence in the RNA and protein-RNA predictions was nevertheless reinforced by several observations: (i) the secondary structure of the RNA stem loops in the AlphaFold3 model matched exactly the secondary structure predicted by RNAfold (Fig 3D); (ii) the two abiF conserved motifs interacted with the same zones of each monomer, although the exact binding modes were different (which was expected given the sequence difference between the two motifs); and (iii) the interaction surface created by the two AbiFCd monomers presented a conserved, positively charged groove highlighted in the electrostatic mapping (Fig 8A), where the single strand RNA region in-between the two abiF motifs is lodged in the model.
The above-mentioned conserved, positively charged groove is an interesting candidate for interaction with nucleic acids and RNase activity of toxin AbiFCd (Fig 8). To validate the importance of this region for AbiFCd toxicity, we defined two conserved arginine R202 and histidine H207 residues within the basic groove formed at the AbiFCd dimer interface for targeted mutagenesis (Figs 8C and S8). We introduced amino acid substitutions to replace R202 and H207 residues either by negatively charged aspartic acid or by neutral alanine. The corresponding plasmids were used to overexpress these protein variants and test their impact on C. difficile growth (abiFCdmut1 and abiFCdmut2 in Fig 8D and 8E). In the presence of ATc inducer, both types of mutations led to the loss of toxic activity. Little to no growth defect was observed in liquid and solid culture as compared to the severe growth inhibition following overexpression of the native AbiFCd protein (Fig 8D and 8E).
To assess whether RCd22 still binds to the mutated AbiFCd protein, we co-expressed RCd22 and mutated His-tagged AbiFCd in E. coli and performed a co-purification assay as for the native AbiFCd protein. In accordance with the loss of AbiFCd-R202D-H207D toxicity (Fig 8D and 8E), the mutated His-tagged AbiFCd protein expressed in the absence of RCd22 could be easily detected with anti-His-tag antibodies, in contrast to native His-tagged AbiFCd (S7B Fig). Western blot analysis also revealed that R202D and H207D mutations did not largely affect the cellular levels of AbiFCd produced in E. coli with co-expression of RCd22 (S7C Fig). In contrast to the AbiFCd-RCd22 co-purification assay, the elution profile of the His-tagged AbiFCd carrying R202D and H207D mutations showed a sharp decrease in the absorption signal at 260 nm corresponding to nucleic acid detection (S7D Fig). Accordingly, RCd22 was only barely detected on agarose gel (Fig 6C) and by qRT-PCR (Fig 6D) after RNA extraction from HisTrap elution fractions of mutated AbiFCd as compared to native protein co-purification, suggesting a weaker interaction between the mutated protein and RCd22.

In silico structural prediction of AbiFCd-RCd22 complex and importance of R202 and H207 residues for AbiFCd toxic activity. (A) Prediction of the homodimer AbiFCd structure by AlphaFold2-Multimer. The AbiFCd homodimer complex is represented in orange and green (left). The electrostatic potential (from red to blue, negatively to positively charged) and evolutionary conservation (from white to red, most variable to most conserved) were mapped on the surface of the homodimer and visualized with Pymol. (B) Prediction of AbiFCd-RCd22 interaction by AlphaFold3. The RCd22 sequence is in purple with the two abiF motifs in cyan. (C) Prediction of important amino acids, R202 and H207, for AbiFCd mutagenesis. Amino acids of both AbiFCd monomers are represented as sticks and indicated in red. (D) Growth curve of C. difficile strain CD156 carrying either the pDIA6103 empty plasmid (empty), or expressing abiFCd mutated variants (substitutions R202D+H207D, named abiF-R202D-H207D or R202A+H207A, named abiF-R202A-H207A) under the control of the inducible Ptet promoter. Induction of the Ptet promoter by 250ng/mL ATc is indicated by the red arrow. Plotted values represent the mean ± standard error of the mean (N = 3 biologically independent samples). (E) Spot assay of C. difficile CD156 strains on TY agar plates supplemented with Tm and 250ng/mL ATc inducer.
Discussion
In this study, we provide the first characterization of an AbiF-like system in Clostridia and unravel its regulation mechanism by the ncRNA RCd22. Abi systems are generally defined as bacterial defense mechanisms against phages, where bacteria die or stay in a dormancy state before the infecting phage can complete its lytic cycle, thus protecting the bacterial population [1]. Intriguingly, AbiFCd is encoded within the well-conserved phi027 prophage in the R027 hypervirulent strains of C. difficile and in two other C. difficile prophages, φC2 and φCD2301. AbiFCd belongs to the Abi_2 superfamily and AbiD/F group, first identified and characterized in L. lactis pNP40 plasmid [12]. AbiD, AbiD1 and AbiF systems superimpose structurally, giving a single AbiD/F group [13] involved in phage resistance [12,34]. This protein family has structural similarities to the HEPN (Higher Eukaryotes and Prokaryotes Nucleotide-binding) domains of RNA-guided RNase Cas13 in type VI RNA-targeting CRISPR-Cas systems [35]. HEPN RNases are widespread in prokaryotic defense systems including type II and type VI CRISPR-Cas, as well as type II and type VII TA systems and are parts of the eukaryotic RNA processing and degradation machineries [36,37]. In accordance with structural modeling of AbiFCd (Fig 8), homodimerization is a common feature of HEPN proteins. Two different HEPN domains in Cas13 also dimerize intramolecularly to form an active RNase site [35]. It was suggested that Cas13 nuclease domain evolved from ancestral AbiD/F ribonuclease with potential RNA-guided capabilities [35] that could be related to a conserved abiF RNA motif of 36–37 nt associated with AbiF system (RNA family: RF03085) [30]. During the preparation of this manuscript, in agreement with the proposed model, Zilberzwige-Tal et al. reported biochemical and structural characterization of the evolutionary origins of CRISPR-Cas13 from AbiF through a miniature Cas13e [38]. 
They experimentally solved the structure of Prevotellaceae bacterium AbiF (PbAbiF), which forms a homodimer, in complex with two copies of a ncRNA. Comparison of this experimental structure with our AlphaFold3 model of the RCd22-AbiFCd complex (S9 Fig) shows very good superimposition of the protein homodimer (RMSD = 3.1 Å). The interaction surfaces of the ncRNA on PbAbiF are very similar to the interaction surfaces of the abiF motifs of RCd22 on AbiFCd, despite some divergence in the ncRNA structures. The presence of two abiF motifs in a single RCd22 molecule means that the single-strand RNA region in between the two motifs has no equivalent in the PbAbiF/ncRNA structure. The amino acids R202 and H207 in AbiFCd are structurally aligned with R210 and H215 in PbAbiF, and the R210A+H215A variant of PbAbiF was shown in [38] to be inactive.
A BLAST search for abi_2/abiF domains in bacterial genomes showed that the Abi_2/AbiF domain is widely distributed, with up to eight abi_2/abiF genes per genome. Surprisingly, some of the putative abi_2/abiF genes were located within prophages, downstream from the lysis module composed of a holin-endolysin operon, whereas the gene is located inside the lysis module in C. difficile. This location could be linked with the function or activity of the AbiFCd system in C. difficile. Most Abi systems described in the literature are plasmid-encoded [39], but some were found within prophages. For example, two Abi systems were reported in L. lactis prophages with ORF852 encoding a putative AbiF protein downstream from the endolysin gene of the t712 prophage and ORF2248 encoding an AbiLi-like protein as a part of a two-component abortive infection system within L. lactis MG-4 prophage [40]. Other anti-phage systems considered as Abi were described in S. aureus [27] and in Mycobacterium sp. prophages [28,41]. Abi systems inside prophage genomes could function to protect from infection by other lytic phages or to stabilize the prophage in the bacterial chromosome, as demonstrated for type I toxin-antitoxin systems [24,42,43].
The activity of AbiD/D1/F systems was first characterized in L. lactis plasmids as abortive phage infection modules also contributing in some cases to the low temperature stress response [12,15,16,34]. Here we demonstrate that AbiFCd has a toxic activity in C. difficile and E. coli. We showed that overexpression of abiFCd induced a bacterial growth defect in liquid medium and reduced colony size on plates, suggesting a bacteriostatic effect. Even though most Abi systems induce bacterial death, growth arrest was also observed as a strategy that gives the bacteria time to develop an efficient defense against phage infection [32]. Interestingly, in relation to the evolutionary origins of CRISPR-Cas13 from the AbiF system, the Cas13 protein has a nonspecific RNase activity triggered by target recognition, and induces bacterial dormancy during phage infection, reminiscent of the abortive infection mechanism [44–46]. This strategy seems to provide robust defense against phage infection but also prevents the emergence of CRISPR-resistant mutants [44]. The collateral RNA degradation by Cas13a targets the anticodons in a subset of tRNAs, leading to the inhibition of protein synthesis and thus providing anti-phage defense [46]. In addition, this tRNA cleavage indirectly triggers the activation of RNases from type II TA targeting mRNAs, contributing to the dormancy state [46]. The conserved RNase motif RφxxxH, where φ is the polar residue N, H, or D, is associated with the catalytic site of HEPN RNases [36]. Using AlphaFold protein modeling, we defined amino acid substitutions targeting the conserved R202 and H207 residues within the RNase motif of AbiFCd, located in the basic groove formed at the AbiFCd dimer interface. When overexpressed from a plasmid in C. difficile, the AbiFCd protein variants mutated at these key positions lost their toxic activity, in agreement with the presence of the HEPN RNase RNQCAH motif.
In addition, co-purification assay showed that R202D and H207D mutations also affected the AbiFCd interaction with RCd22 in accordance with structural modeling. Further studies will be required to define the specificity of the RNase activity of AbiFCd contributing to its toxicity in C. difficile and in the heterologous E. coli host. The anti-phage activity of AbiFCd remains challenging to demonstrate in C. difficile. Indeed, all known C. difficile phages are temperate and lysogeny is frequent, therefore leading to significant lysogeny-mediated phage resistance. Based on its prophage localization, AbiFCd could also contribute to prophage maintenance as a common strategy associated with TA modules inside mobile genetic elements [47].
Since AbiFCd is toxic for C. difficile, it must be tightly regulated under normal growth conditions. In this paper, we describe the RNA-based regulation mechanism of an AbiF-like system in C. difficile, adding further evidence for the role of RNAs at the crossroads of phage-bacteria interactions. RNAs have emerged as crucial components of the numerous anti-phage defense strategies including CRISPR-Cas, TA and reverse-transcriptase-associated systems [48,49]. The presence of a ncRNA, named RCd22, upstream of the abiFCd gene was discovered by RNA-sequencing of the R20291 strain and further detected by Northern blot. 5’/3’-RACE experiments suggest that RCd22 and abiFCd genes are co-transcribed from the same SigA-dependent promoter. However, most of the transcription stops at a Rho-independent terminator upstream of the abiFCd coding sequence, leading to the accumulation of a major and abundant transcript, RCd22. The presence of RCd22 within the 5’-untranslated-region of the abiFCd mRNA impacts the expression of the downstream gene through premature transcription termination, similar to riboswitch-dependent regulations. It is tempting to speculate that during phage infection or under stressful conditions RCd22 RNA motifs could recognize specific phage- and/or stress-associated stimuli, leading to conformational changes that favor readthrough transcription and increase abiF expression. We showed that under laboratory conditions this termination of transcription is strong enough to counteract the toxic effect of AbiFCd on the bacterial cell. In addition to this antitoxin activity in cis, RCd22 was also able to counteract the AbiFCd toxic activity in trans. For the first time, we adapted the MAPS approach in C. difficile for RCd22 interactomics analysis and defined the direct RNA-protein interaction behind this antitoxin RCd22 activity in trans. We successfully identified the AbiFCd protein as the most enriched target of the ncRNA antitoxin.
This powerful approach has been previously adapted in Gram-positive bacteria, for example to identify RNA partners of ncRNAs in S. aureus [50–52]. By co-expressing the His-tag-AbiFCd protein with RCd22 in E. coli, we were also able to co-purify the ncRNA RCd22 with His-tag-AbiFCd, providing additional experimental evidence for the RNA-protein complex formation. All our attempts to purify AbiFCd alone by removing the ncRNA from the complex were unsuccessful, suggesting a tight interaction between RCd22 and AbiFCd, as previously reported for ToxIN type III TA RNA-protein complexes [53,54].
A similar two-step regulation mechanism for antitoxin action is operating in type III TA systems. In these TA systems, the antitoxin is a ncRNA transcribed from an array of short tandem repeats followed by the toxin protein coding gene [17,18]. The antitoxin and toxin are co-transcribed and separated by a Rho-independent terminator, controlling the ratio of antitoxin RNA over the toxin mRNA. This balance between toxin mRNA and antitoxin ncRNA keeps the level of antitoxin in excess compared to the toxin, as the antitoxin is less stable than the toxin protein. The toxin protein activity is inhibited by the trans-acting antitoxin through specific RNA-protein interactions [17,18]. However, the structure of both toxin and antitoxin from previously characterized type III TA systems differs from the AbiFCd and RCd22 structural model. Indeed, in the most characterized ToxIN system in the plant pathogen Pectobacterium atrosepticum [17,55] and in E. coli [56], protein-RNA interactions lead to a heterohexameric triangular assembly of three ToxN proteins with the interspersed pseudoknots of 36-nt ToxI RNAs [18]. All other identified type III TA systems, i.e., AbiQ, TenpNI and CptNI, are homologous to ToxIN [57] with a structure similar to that of the AbiQ system [58], but with a completely different heterotetrameric quaternary organization for the CptNI complex due to extended antitoxin RNA size [59]. Similar to previously characterized type III TA systems [17,18], the abiFCd gene is preceded by an array of two repeated abiF RNA motifs within the RCd22 ncRNA gene. Antitoxin repeats have been defined as a key feature of type III TA systems [60]. The ToxI or antiQ antitoxins from the ToxIN family are composed of an RNA motif of 34–36 nucleotides repeated from 2.8 to 5.5 times. The TenpNI and CptNI type III TA systems are associated with a longer RNA antitoxin of 50 and 45 nt repeated 2.1 times [60].
At least one complete RNA repeat was essential for antitoxin activity in vitro for ToxIN [17] but the number of repeats required to keep the functionality varies between systems [61,62]. In the ToxIN system, the repeated sequences in the transcript are cleaved by ToxN into individual 36-nt units followed by a self-assembly to 3 ToxN:3 ToxI complex [18]. In contrast with other type III TA with toxin-mediated processing of antitoxin RNA, all the transcripts detected by RNA-seq, Northern blot and 5’/3’RACE analysis in C. difficile carried two abiF motifs of 33 and 35-nt in length interspaced by 19 nt. Potential processing sites were identified only upstream of the terminator stem-loop structure, with no cleavage between the abiF motifs. With respect to the evolutionary history of the AbiF-like systems at the origin of CRISPR-Cas13, such repeated stem-loop motif array inside the RCd22 antitoxin could mimic the CRISPR array organization, the duplication of abiF motif representing an intermediate step towards larger arrays of repeated motifs.
The in silico predictions suggest that the two conserved abiF motifs of RCd22 [30] provide a stable structure that constitutes an important element for the functionality of the RCd22 antitoxin. We demonstrated that mutations affecting the structure of RCd22 led to an inactive form of the antitoxin RNA, unable to neutralize the toxic activity of AbiFCd in cis and in trans, while compensatory mutations had no impact on antitoxin activity. The overall RCd22 structure stabilized by two abiF motifs linked together could be important to keep the conformation of the RNA region in between accessible for the interaction with the AbiFCd protein.
In conclusion, this work presents the identification of a new member of under-studied TA modules of type III operating through tight interaction between antitoxin RNA and toxic protein encoded inside the conserved prophage of the important human enteropathogen C. difficile. The association with the phage lysis module in several Gram-positive pathogens like S. aureus, Streptococcus agalactiae and Listeria monocytogenes is intriguing. Furthermore, the unique localization inside this module in C. difficile prophages deserves further investigations to better understand the role of AbiF system during interactions with phages. Whether the induction of the system is activated by specific or general signals associated with phage infection or stress needs to be determined. The prophage-associated localization could also contribute to the dissemination of the AbiF module within the bacterial population. This study emphasizes a unique position of AbiF system in the evolutionary path on the crossroad of previously characterized type III TA and ancestral CRISPR-Cas13 sharing either functional or structural similarities. Future comparative studies will decipher the specificity of RNase activity associated with the AbiF toxicity inside the bacterial cell. Overall, our findings further highlight the increasing evidence for the role of RNAs as key components of numerous anti-phage defense systems. By characterizing a prophage-located AbiF-like system, this work is providing insights into the RNA-based bacteria-phage interaction mechanisms paving the way for future biotechnological and health applications.
Materials and methods
Bacterial strains and growth conditions
C. difficile and E. coli strains used in this study are described in S1 Table. C. difficile strains were grown in anaerobic conditions (5% H2, 5% CO2, and 90% N2), using an anaerobic chamber (Jacomex) in TY [63] or Brain Heart Infusion (BHI, Difco) medium. Thiamphenicol (Tm, 7.5 µg/mL), cefoxitin (Cfx, 25 µg/mL) and cycloserine (Cs, 250 µg/mL) were added when needed. E. coli strains were grown in aerobic conditions in LB [64] with ampicillin (Amp, 100 µg/mL) and chloramphenicol (Cm, 15 µg/mL) when necessary.
Plasmid construction and conjugation into C. difficile strains
All plasmids and primers used in this study are listed in S2 and S3 Tables, respectively. All derived plasmids were transformed into E. coli NEB10β strain and inserts were verified by sequencing. Then, they were transformed into E. coli HB101 (RP4) strain for subsequent transfer by conjugation into C. difficile strains. The HB101 donor and C. difficile recipient strains were grown overnight in their respective media supplemented with antibiotics when needed. One milliliter of donor strain was centrifuged at 3,500xg for 5 minutes and the supernatant was discarded. Then, 200 µL of C. difficile strain was used to gently resuspend the donor strain in anaerobic conditions. For the R20291 strain, the cell suspension was first heated at 50°C for 10 minutes [65]. The co-culture was then spotted onto a BHI plate and incubated for 8 hours. The cells were then collected with 600µL PBS, plated on BHI supplemented with Cfx, Cs and Tm and incubated for 48 to 72h at 37°C. Transconjugants were verified by PCR for the presence of the plasmid.
Mutant construction and mutagenesis strategy
For the generation of deletion mutants, we used an allelic exchange method [24] with the toxin from a type I TA system for counter-selection. The pMSR0 plasmid used for this method encodes the toxin CD2517.1 under the control of the inducible promoter Ptet and the antitoxin RCd8 under the control of its own promoter. The homology arms were designed to have 800–1,000 bp of upstream and downstream homology to the chromosomal sequence to delete. They were amplified by PCR from the genomic sequence of the C. difficile R20291 strain (see S2 and S3 Tables for plasmid and primer descriptions). Colonies were tested by PCR followed by sequencing to confirm the deletion. RCd22 and phi027 deletion mutants were generated in the C. difficile R20291 background and phiCD630–1 was generated in C. difficile 630Δerm ΔphiCD630–2 background [24].
Bioinformatic analyses for distribution and position of AbiF/Abi_2 systems
The presence of proteins assigned to AbiF/Abi_2 COG4823 and pfam07751 family was analyzed in 47,545 complete prokaryotic genome assemblies (chromosome level assembly, November 2023) both from Refseq and Genbank available in NCBI database with their taxonomic description (S6 and S7 Tables). The phylogenetic tree was constructed using iTOL v7 web site including the count of the AbiF/Abi_2 family proteins. The information on the flanking genes located at positions from -5 to +5 from the abiF/abi2 gene has been extracted from the genomic data. All proteins were annotated by PSIBLAST 2.16.1 version using clusters of orthologous genes (COG), conserved domain (CD) and PFAM profiles from the conserved domain (CDD) database with E-value equal to 1e-4 and other default parameters settings. Genomes in which the abiF/abi2 gene was detected were further analyzed with geNomad v1.8.0 [29] to determine if the abiF/abi2 gene was located on a mobile genetic element (S8 Table). The coordinates of the prophages identified with geNomad were compared with the coordinates of the abiF/abi2 gene to determine which were overlapping (S8 Table).
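The coordinate comparison described above amounts to an interval-overlap test. The following is a minimal Python sketch of that test, not the script used in the study. The `Region` type, the function names, the 1-based inclusive coordinate convention and the example coordinates are all assumptions for illustration, and the prophage coordinates would first have to be parsed from the geNomad output.

```python
from typing import NamedTuple


class Region(NamedTuple):
    contig: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive


def overlaps(a: Region, b: Region) -> bool:
    """True if two regions lie on the same contig and share at least one base."""
    return a.contig == b.contig and a.start <= b.end and b.start <= a.end


def genes_in_prophages(genes: dict[str, Region], prophages: list[Region]) -> dict[str, bool]:
    """Flag each gene that overlaps at least one predicted prophage."""
    return {name: any(overlaps(gene, p) for p in prophages) for name, gene in genes.items()}


# Hypothetical coordinates for illustration only.
prophages = [Region("chr", 1_500_000, 1_556_000)]
genes = {
    "abiF_like_1": Region("chr", 1_540_200, 1_541_100),  # inside the prophage
    "abiF_like_2": Region("chr", 2_800_000, 2_800_900),  # outside any prophage
}
print(genes_in_prophages(genes, prophages))
```

The strict containment of a gene within a prophage could be tested instead of overlap; overlap is used here because it is the criterion named in the text.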
Phenotypic analysis for growth kinetics and spotted growth assay
C. difficile strains were grown overnight in TY broth supplemented with Tm and then inoculated at an OD600 of 0.05 in fresh medium with Tm. Growth kinetics were measured manually, using a spectrophotometer (Fisher Scientific Cell Density Meter 40), taking OD600 readings every hour for 10 hours. Alternatively, a microplate reader (Cerillo) was used for growth curve analysis. For induction of the Ptet promoter of pDIA6103 plasmid derivatives, anhydrotetracycline (ATc, 250 ng/mL) was added at an OD600 of 0.4. For the drop assay, dilutions (from 10^-1 to 10^-6) of an exponential-phase culture were spotted onto TY agar plates containing Tm (15 µg/mL) and ATc (250 ng/mL).
Alkaline phosphatase activity assays
C. difficile strains containing the phoZ fusion plasmids were grown in TY broth supplemented with Tm and harvested at the end of the exponential phase. Samples were stored at -20°C and the alkaline phosphatase assay was performed as previously described [24]. Time elapsed for the assay was recorded (Δt in minutes). The absorbance at both OD420 and OD550 was taken after the alkaline phosphatase reaction. Activity units were calculated and normalized using the following formula: ((OD420 – (1.75 × OD550)) × 1000)/(Δt × OD600 × vol. cells (mL)).
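The normalization formula above can be written as a short Python function. This is a minimal sketch: the function name, argument names and the example readings are hypothetical, and only the arithmetic follows the formula given in the text.

```python
def alkaline_phosphatase_units(od420: float, od550: float, od600: float,
                               dt_min: float, vol_cells_ml: float) -> float:
    """Activity units = ((OD420 - 1.75 * OD550) * 1000) / (dt * OD600 * volume)."""
    denominator = dt_min * od600 * vol_cells_ml
    if denominator <= 0:
        raise ValueError("assay time, OD600 and cell volume must all be positive")
    return ((od420 - 1.75 * od550) * 1000) / denominator


# Hypothetical readings for illustration only.
units = alkaline_phosphatase_units(od420=0.5, od550=0.1, od600=0.5,
                                   dt_min=10, vol_cells_ml=0.5)
print(f"{units:.1f} activity units")
```

The OD550 term corrects the OD420 reading for light scattering by cell debris, and dividing by Δt, OD600 and cell volume normalizes the signal to assay time and the amount of cells.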
RNA extraction, qRT-PCR, Northern blot, and 5′/3′RACE
For RNA extraction, C. difficile strains were grown in TY broth and harvested at various growth phases. C. difficile strains carrying pDIA6103 plasmid derivatives were grown in TY with Tm, and ATc (250 ng/mL) was added at an OD600 of 0.4 for induction of the Ptet promoter. Cells were harvested after 4 hours of induction. RNA was extracted using Trizol (Sigma), as previously described [66]. Northern blot and 5’/3’RACE experiments were performed as previously described [4].
MAPS experiment coupled with mass spectrometry or RNA-sequencing
The MAPS experiment coupled with RNA-sequencing or mass spectrometry was adapted to C. difficile from [33]. The R20291 ΔRCd22 strain carrying the p117 (pDIA6103 carrying the MS2-tag alone, under the control of the Ptet inducible promoter) or p220 plasmid (pDIA6103 carrying MS2-tag and a 6 nucleotides spacer to the 5’-end of RCd22, under the control of the Ptet inducible promoter) was grown in 250mL of TY with 7.5µg/mL Tm until the OD600 reached 1. Then the expression of the tagged ncRNA RCd22 was induced with 100 ng/mL ATc for 10 min and the culture was cooled on ice. The culture was centrifuged (6,000xg, 15 min, 4°C) and the pellet was resuspended in 5mL ice-cold buffer A (20mM Tris-HCl pH 8, 150mM KCl, 1mM MgCl2, 1mM DTT). The mix was transferred into Lysis Matrix B tubes and lysed with a FastPrep machine (30s at setting 6.5, twice with a 2 min pause on ice). Samples were centrifuged at 15,700xg for 15 min at 4°C and the supernatant was transferred into a new tube and maintained at 4°C. For the affinity purification, a poly-prep empty chromatography column was used and all further steps were done at 4°C. First, the column was washed with ultra-pure water (RNase-free) and then 300µL of amylose resin was added. The column was washed with 10mL of buffer A. Then, 1,200pmol of MS2-MBP protein diluted in 6mL of Buffer A was loaded in the column, followed by a wash with 10mL of buffer A. The cell lysate was then loaded onto the column, followed by 3 washes with 10mL of buffer A. The complex was then eluted with 1mL of buffer E (20mM Tris-HCl pH8, 150mM KCl, 1mM MgCl2, 1mM DTT, 0.1% Triton X-100, 12mM Maltose). A volume of 1mL of each fraction was used for RNA extraction. One milliliter of phenol was added, mixed vigorously and centrifuged at 16,000xg for 10 min at RT. The upper phase was transferred into a new tube and the organic phase kept for protein precipitation.
One milliliter of chloroform/isoamyl alcohol (24/1) was added, followed by a second centrifugation step. The upper phase was transferred into a new tube and 2.5 volumes of 100% cold ethanol and 0.1 volume of 3M sodium acetate pH5.2 were added and the mix was incubated overnight at -20°C. Then, the sample was centrifuged at 4,000xg for 90 min at 4°C, the supernatant was discarded and 500µL of cold 80% ethanol was added, followed by a second centrifugation step at 4,000xg for 45 min at 4°C. The pellet was dried and resuspended in 90µL of ultra-pure water, followed by DNase treatment and inactivation of DNase (TURBO DNA-free, Invitrogen). The RNA was quantified by Qubit and reverse transcription was done on 50 ng of purified RNA with the AMV Reverse Transcriptase (Promega). The resulting cDNA was used as template for PCR to check for the enrichment of MS2-RCd22 in the MAPS eluted fraction. cDNA libraries were prepared using the ScriptSeq kit from Illumina and sequenced by Novogene.
For protein precipitation, 4 volumes of cold acetone were added to the tube containing the organic phase, and the sample was vortexed and incubated overnight at -20°C. The sample was centrifuged for 10 min at 13,000xg and the supernatant was gently discarded. The pellet was dried for 10–30 min and the sample was then processed at the I2BC facility for liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) analyses. Peak picking, database search against the R20291 proteome, protein inference, and quantitation were done with Mascot (v2.6.2), and experiment-wide grouping with protein cluster analysis was done with Scaffold (v5.0.1) by the Proteomic-Gif (SICaPS) I2BC facilities. We then used the provided “total spectrum counts” as an estimate of the protein quantities and applied the SARTools-DESeq2 (v1.8.1) statistical routines to assess differential protein enrichment.
RNA-Sequencing analysis
All code used in this study is available on GitHub (https://github.com/i2bc/RNAreg_AbiF_CDiff).
The RNA sequencing data were processed as described on the GitHub page. Differential expression analysis for enriched genes in MAPS interactomics was performed using the DESeq2-based [67] script SARTools [68], and genes were considered differentially expressed with a log2 fold change of at least 2 and an adjusted P < 0.05. The analyses were combined in a Snakemake pipeline. Briefly, the pipeline includes quality control of reads with FastQC and FastQ Screen (and correction with fastp if necessary), creation of an index from the C. difficile strain R20291 genome, mapping of reads onto the genome with Bowtie2, selection of mapped reads with Samtools, counting of these reads with the featureCounts tool of the Subread package on the coding sequences listed in the genome annotation, augmented by the list of ncRNAs selected in a previous analysis [69], and a differential gene expression analysis with the SARTools package using the DESeq2 method. All parameters and software versions are defined in an associated configuration file, enabling this pipeline to be re-used for other MAPS analyses involving other ncRNAs and/or genomes. Raw sequencing data have been submitted to ENA under the accession number PRJEB87349 (https://www.ebi.ac.uk/ena/browser/view/PRJEB87349).
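As an illustration only, the selection criteria stated above (log2 fold change of at least 2 and adjusted P < 0.05) can be sketched in a few lines of Python. This is not the code of the published pipeline, which relies on SARTools and DESeq2 in R. The field names `gene`, `log2FoldChange` and `padj`, the example values, and the choice to keep only positive fold changes (enrichment in the tagged sample) are assumptions made for this sketch.

```python
import math

def select_enriched(rows, min_log2_fc=2.0, max_padj=0.05):
    """Return the identifiers of rows passing both thresholds.

    Each row is a dict with the assumed keys 'gene', 'log2FoldChange'
    and 'padj'. Rows with a missing or NaN statistic are skipped.
    """
    selected = []
    for row in rows:
        lfc = row["log2FoldChange"]
        padj = row["padj"]
        if lfc is None or padj is None:
            continue
        if math.isnan(lfc) or math.isnan(padj):
            continue
        # Assumption: only positive fold changes count as enrichment.
        if lfc >= min_log2_fc and padj < max_padj:
            selected.append(row["gene"])
    return selected

# Made-up values, for illustration only (not results from the study).
example_rows = [
    {"gene": "gene_A", "log2FoldChange": 3.1, "padj": 0.001},
    {"gene": "gene_B", "log2FoldChange": 1.5, "padj": 0.0001},
    {"gene": "gene_C", "log2FoldChange": 2.4, "padj": 0.2},
    {"gene": "gene_D", "log2FoldChange": 2.0, "padj": float("nan")},
]

enriched = select_enriched(example_rows)
print(enriched)
```

In this toy input only `gene_A` satisfies both conditions: `gene_B` fails the fold-change cutoff, `gene_C` fails the adjusted P cutoff, and `gene_D` has no usable adjusted P value.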
Construction, production and protein (co-)purification
The abiFCd gene from the C. difficile strain R20291 was synthesized and cloned into a pET-16b vector, containing an N-terminal 10X-histidine tag, under the control of the T7lac promoter (see S1 and S3 Tables for strains and plasmids). The ncRNA RCd22 sequence with the terminator region was amplified from the R20291 strain and cloned into the same plasmid, at a different location, under the control of the T7 promoter (see S3 Table for primers). The plasmid was transformed into E. coli BL21(DE3) cells, which were grown overnight at 37°C, 180 RPM in LB medium supplemented with Cm (15 µg/mL) and Amp (100 µg/mL). A sub-culture was then grown at 37°C, 180 RPM until an OD600 of 0.7. IPTG was then added to a final concentration of 1mM and cells were incubated at 37°C, 180 RPM for 4h. Cells were harvested by centrifugation at 4,000xg for 10 min at 4°C. Cells were resuspended in lysis buffer (20mM Na2HPO4 pH7.5, 1M NaCl, 5% glycerol, 2mM β-mercaptoethanol, cOmplete mini EDTA-free protease inhibitor cocktail, Roche) and lysed by sonication. The lysate was centrifuged at 4,000xg for 15 min at 4°C, and the supernatant was passed through a 0.45µm filter and loaded on a Ni2+-NTA column. The AbiFCd-His-tag-RCd22 complex was eluted by FPLC with an AKTA system using elution buffer (Na2HPO4 pH7.5, 400mM NaCl, 5% glycerol, 2mM β-mercaptoethanol, 500mM imidazole). Fractions containing the eluted complex were pooled and diluted twofold with dilution buffer (Na2HPO4 pH7.5, 5% glycerol, 2mM β-mercaptoethanol) to decrease the NaCl and imidazole concentrations. In an attempt to purify the AbiFCd-His tag protein alone and remove RCd22, benzonase (12.5 U/mL) and MgCl2 (10 mM) were added and the sample was incubated for 1h with shaking at 180 RPM at room temperature. Then, the sample was loaded on a heparin column and the AbiFCd-His tag protein was eluted with an AKTA FPLC system using elution buffer (Na2HPO4 pH7.5, 2M NaCl, 5% glycerol, 2mM β-mercaptoethanol).
Finally, gel filtration was performed to reduce the NaCl concentration to 200mM and the protein was stored in 10% glycerol at -80°C. Production of the AbiFCd-His tag protein was verified at every step by SDS-PAGE followed by Coomassie staining and Western blot with an HRP anti-His tag antibody (Proteintech, HRP-66005). To verify copurification of the ncRNA RCd22 from the affinity column, an RNA extraction was performed directly from the eluted protein purification fraction using phenol-chloroform, followed by gel electrophoresis and RT-qPCR.
AlphaFold structure prediction
The structural model of AbiFCd protein homodimer was obtained with AlphaFold2-Multimer [70], using the ColabFold implementation [71] (v1.5.2, commit 3574273 from 24 Feb 2023, which uses AlphaFold version 2.3). First, a multiple sequence alignment (MSA) of AbiFCd homologs was obtained using MMseqs2 [72] (commit 4148e09, 30 Jan 2023) to query the UniRef30 database version 2202 [73]. Then, two concatenated copies of this MSA were used as input to AlphaFold2-Multimer (multimer_v3 parameters) to generate 5 models. The best out of 5 was selected using the AlphaFold2-Multimer combined score (0.8 ipTM + 0.2 pTM). For evolutionary conservation mapping, we used the MSA generated by MMseqs2 and retained only the first 100 sequences (already displaying divergence down to 38% sequence identity with the query AbiFCd sequence). The structural models of RNA-protein interactions were predicted using the AlphaFold3 web server [74]. The best model out of 25 models (5 independent web server runs), ranked by protein-RNA ipTM score while controlling for good overall ipTM and pTM scores, was used for display and analysis.
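The model selection rule described above, ranking by the combined score 0.8 ipTM + 0.2 pTM, can be illustrated with a short Python sketch. This is not the ColabFold ranking code itself; the dictionary keys `name`, `iptm` and `ptm` and the score values are hypothetical placeholders, not values from the actual predictions.

```python
def combined_score(iptm, ptm):
    """AlphaFold2-Multimer combined score as stated in the text: 0.8 ipTM + 0.2 pTM."""
    return 0.8 * iptm + 0.2 * ptm

def best_model(models):
    """Return the model (a dict with assumed keys 'iptm' and 'ptm') with the highest combined score."""
    return max(models, key=lambda m: combined_score(m["iptm"], m["ptm"]))

# Hypothetical scores, for illustration only (not values from the study).
models = [
    {"name": "model_1", "iptm": 0.70, "ptm": 0.80},
    {"name": "model_2", "iptm": 0.78, "ptm": 0.60},
    {"name": "model_3", "iptm": 0.60, "ptm": 0.90},
]

top = best_model(models)
print(top["name"], round(combined_score(top["iptm"], top["ptm"]), 3))
```

Because ipTM carries four times the weight of pTM, a model with a better predicted interface can outrank one with a better overall pTM, as `model_2` does here.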
Preparation of RNAs for experiments in vitro
Transcription of the RNAs used in this study (RCd22, CDR20291_1462, 1558, 2768, 0538, 1829, 3357) was achieved using PCR products containing the sequence of the RNA downstream of the T7 promoter, which was introduced with the primer (S3 Table). After purification, these PCR products were used as templates for in vitro transcription using T7 RNA polymerase. RNAs were treated with DNase I, purified on a 6% polyacrylamide-8 M urea gel, eluted with 0.5 M ammonium acetate, 0.1 mM EDTA, and 0.1% SDS, and precipitated in cold absolute ethanol. RCd22 was labeled with T4 polynucleotide kinase (Fermentas) and [γ32P] ATP, purified on a 6% polyacrylamide-8 M urea gel and eluted as described above.
Electrophoretic mobility shift assay (EMSA)
5’-end radiolabeled RCd22 (10,000 cps/sample, < 1pM) and cold mRNAs were denatured separately by incubation at 90°C for 1 min in 100 mM Tris-HCl pH 7.5, 300 mM KCl, 200 mM NH4Cl, then cooled for 1 min on ice and renatured at RT for 10 min after addition of 10 mM MgCl2. Complexes were formed at 37°C for 15 min. After the addition of 1 volume of glycerol blue loading buffer, the samples were loaded on a native 6% polyacrylamide gel containing 10 mM MgCl2 and run at 300 V and 4°C in 1X Tris Borate buffer with 10 mM MgCl2 before autoradiography.