General Peptide Information

The leaderless communication peptide (LCP) class of quorum-sensing peptides is broadly distributed among Firmicutes

Leaderless communication peptide (LCP) system is broadly distributed

To assess the distribution of SIP-like LCP-based quorum-sensing (qs) systems across bacteria, we employed a large-scale search strategy that includes: (i) a search for RopB homologs across a dataset of 129,001 bacterial genomes and 9,421 reference metagenome-assembled genomes (MAGs), (ii) probing the genomic vicinity of RopB homologs for small ORFs (sORFs) with a preceding Shine-Dalgarno ribosome-binding site (RBS) motif that is indicative of their likely translation, (iii) identification of clans of RopB homologs for which the most likely translated adjacent sORF is ultrasmall and encodes a SIP-like LCP, and (iv) functional validation by assessing the regulatory activity of a chosen subset of candidate LCPs.
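Step (ii) of this strategy can be illustrated with a minimal sketch. It is not the pipeline used in the study: the motif list, the 4 to 12 nt spacing window, the 30-codon size cap, and the function name `find_sorfs_with_rbs` are all assumptions chosen for illustration, and only the forward strand is scanned.

```python
import re

# Assumption: AGGAGG and its 4-5 nt sub-motifs stand in for a Shine-Dalgarno site.
SD_MOTIF = re.compile(r"AGGAGG|GGAGG|AGGAG|GGAG|AGGA|GAGG")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs_with_rbs(dna, max_codons=30, spacer=(4, 12), window=20):
    """Return (start, end, sd_motif) for short ATG-initiated ORFs (forward strand)
    that carry an SD-like motif ending 4-12 nt upstream of the start codon."""
    dna = dna.upper()
    hits = []
    for match in re.finditer(r"(?=ATG)", dna):  # lookahead keeps overlapping starts
        start = match.start()
        end = None
        # Walk in-frame codons until a stop codon or the size cap is reached.
        limit = min(len(dna) - 2, start + 3 * (max_codons + 1))
        for i in range(start + 3, limit, 3):
            if dna[i:i + 3] in STOP_CODONS:
                end = i + 3
                break
        if end is None:
            continue
        upstream = dna[max(0, start - window):start]
        best = None
        for sd in SD_MOTIF.finditer(upstream):
            gap = len(upstream) - sd.end()  # nt between motif and start codon
            if spacer[0] <= gap <= spacer[1]:
                if best is None or len(sd.group()) > len(best):
                    best = sd.group()
        if best is not None:
            hits.append((start, end, best))
    return hits


# Toy locus: a hypothetical upstream region followed by an sORF encoding MWLILLFL.
toy = "TTTT" + "AGGAGG" + "TTTTTTT" + "ATGTGGTTAATTTTATTATTTTTATAA"
print(find_sorfs_with_rbs(toy))  # [(17, 44, 'AGGAGG')]
```

Ranking candidate sORFs by the length of the matched motif is a crude proxy for the RBS confidence score referred to in the text; a real analysis would rely on a dedicated gene-prediction tool.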

A BLASTp search of RopB against the target dataset resulted in 19,280 hits (sequence identity ≥25%, mutual length coverage ≥70%, Supplementary Data 1) encoded in 15,776 genomes and 39 MAGs distributed across 468 taxa. These 19,280 hits correspond to 975 unique protein sequences that are predicted to harbor a C-terminal tetratricopeptide repeat (TPR) domain, a hallmark structural element responsible for peptide recognition by the RRNPPA family of receptors7 (Supplementary Data 1). Importantly, 478 out of 975 RopB homologs (~49%) are flanked by at least one candidate sORF with a high-confidence upstream RBS motif (Supplementary Data 1)13,14.
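The two hit-filtering thresholds can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes tab-separated BLAST output with the columns `qseqid sseqid pident length qlen slen qstart qend sstart send`, and it assumes that "mutual length coverage" means the aligned span covers at least 70% of both the query and the subject. The hit identifiers and values in the example are invented.

```python
# Assumed column layout of the tabular BLAST output (not stated in the text).
COLUMNS = ["qseqid", "sseqid", "pident", "length",
           "qlen", "slen", "qstart", "qend", "sstart", "send"]


def parse_hit(line):
    """Turn one tab-separated line into a dict with numeric fields converted."""
    hit = dict(zip(COLUMNS, line.rstrip("\n").split("\t")))
    hit["pident"] = float(hit["pident"])
    for key in COLUMNS[3:]:
        hit[key] = int(hit[key])
    return hit


def passes_thresholds(hit, min_identity=25.0, min_coverage=0.70):
    """Keep hits with identity >= 25% and coverage >= 70% on both sequences."""
    query_cov = (hit["qend"] - hit["qstart"] + 1) / hit["qlen"]
    subject_cov = (hit["send"] - hit["sstart"] + 1) / hit["slen"]
    return hit["pident"] >= min_identity and min(query_cov, subject_cov) >= min_coverage


def filter_hits(lines):
    """Return subject ids of the hits that satisfy both thresholds."""
    return [h["sseqid"] for h in map(parse_hit, lines) if passes_thresholds(h)]


# Invented example hits: A passes, B fails coverage, C fails identity.
example = [
    "RopB\thomolog_A\t31.5\t270\t280\t285\t5\t274\t8\t280",
    "RopB\thomolog_B\t40.0\t120\t280\t290\t1\t120\t1\t120",
    "RopB\thomolog_C\t22.0\t275\t280\t280\t1\t275\t1\t275",
]
print(filter_hits(example))  # ['homolog_A']
```

Taking the minimum of the two coverages is what makes the criterion mutual: a short fragment that aligns well to part of RopB is rejected, as is a long protein that contains only a RopB-like domain.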

To determine whether clans of RopB homologs are enriched for their association with LCPs, we inferred the phylogeny of RopB-like receptors and mapped the amino acid sequence of the flanking sORF with the strongest RBS motif onto its corresponding receptor leaf of the tree (Fig. 1b, Supplementary Data 1). The two previously well-characterized clans of RopB homologs correspond to streptococcal Rgg regulators, which recognize adjacently encoded canonical propeptides that are post-translationally processed into mature short hydrophobic peptides (SHPs)15,16. Accordingly, the two Rgg clans appeared as two well-delineated clans in the RopB tree and were correctly mapped to SHP propeptides, which validates our methodology for identifying clans of qs systems (Fig. 1b and S1, Supplementary Data 1 and 2). Remarkably, RopB and its evolutionarily closest relatives represent a distinct clan of 183 receptors that were frequently associated with an ultrasmall peptide with an amino acid composition similar to that of SIP (Fig. 1b, Supplementary Data 1 and 2). We term this new clan the RopB clan; its receptors represent a new class of qs system that uses LCPs as signaling molecules.

The refined manual assessment of the candidate cognate LCP of each receptor within the RopB clan (Fig. 2 and Supplementary Data 2) revealed that the LCP system is broadly distributed in bacterial genomes of large taxonomic diversity, spanning the Streptococcaceae, Lactobacillaceae, Enterococcaceae, Carnobacteriaceae, Bacillaceae and Staphylococcaceae families (Fig. 1c). Most of the candidate receptor-LCP pairs are encoded in bacterial chromosomes, and a non-negligible number of LCP systems are present in either plasmids or integrative and conjugative elements (ICEs) (Fig. 3). However, receptor-LCP pairs were not detected in prophages or phages. The genetic elements encoding LCP-receptor pairs are found in various host-associated microbiomes, wastewater, and fermented food products. The LCP systems are prevalent in the genomes of several clinically relevant human pathogens such as S. pyogenes and Enterococcus casseliflavus, and in animal pathogens such as S. porcinus and S. pseudoporcinus (Fig. 3).

Fig. 2: Predicted LCPs encode diverse peptide communication codes.

The phylogenetic tree of the RopB clan, rooted at mid-point for visualization purposes, comprises 182 leaves and was inferred from a trimmed alignment of 229 sites. The label of a leaf corresponds to the NCBI protein id of the putative receptor, followed by the name of the encoding species. The outermost label in bold indicates the sequence of a putative LCP encoded in the genomic vicinity of a ropB homolog (leaf) whenever detected. The three color strips correspond to (i) the genomic orientation of the candidate receptor-LCP pair, (ii) the intergenic distance between the receptor and LCP ORFs, and (iii) the confidence score, ranging from 1 to 27, of the Shine-Dalgarno RBS motif identified upstream of the LCP’s ORF (white meaning no RBS detected). Tips without labels correspond to collapsed receptors encoded by the same species and without an LCP detected in the vicinity.

Fig. 3: Taxonomic, sequence, ecological and functional diversity of candidate LCP systems.

Each row depicts a representative of a group of receptor-LCP pairs with the same LCP sequence and the same species within the RopB clan. The stacked histogram highlights the proportion of receptor-LCP encoding elements in chromosomes, plasmids, ICEs or unclassified encoding contigs within a group. The matrix shows the biomes in which the LCP systems of a group were detected. The last column indicates the adjacently located putative target regulon controlled by each representative LCP system.

Most putative LCPs are 8 to 10 amino acids long, with a few exceptions such as the LCPs from Lysinibacillus sphaericus, Staphylococcus delphini, Enterococcus durans, and Granulicatella balaenopterae. The predicted LCPs are highly hydrophobic and are composed mostly of aliphatic and aromatic amino acids (Fig. 2 and S1). The peptides encode distinct communication codes with unique LCP amino acid sequences, suggesting that LCP-mediated communication is specific among closely related bacterial species or strains. Importantly, the characterized RopB-SIP system of S. pyogenes represents only a small subset of the identified LCP systems (Fig. 2), indicating the broader distribution of LCP systems and their potentially unappreciated roles in bacterial pathophysiology.
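The compositional signature described above can be made concrete with a small sketch. The residue classes used here (aliphatic = A, V, L, I; aromatic = F, W, Y) are one common convention and an assumption on our part, as is the use of the Kyte-Doolittle hydropathy scale for the GRAVY score. The example peptide MWLILLFL is the S. salivarius LCP sequence reported later in this section.

```python
# Assumed residue classes; other conventions also count G, M or P as aliphatic.
ALIPHATIC = set("AVLI")
AROMATIC = set("FWY")

# Kyte-Doolittle hydropathy values.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(peptide):
    """Count aliphatic, aromatic and other residues in a peptide."""
    counts = {"aliphatic": 0, "aromatic": 0, "other": 0}
    for residue in peptide.upper():
        if residue in ALIPHATIC:
            counts["aliphatic"] += 1
        elif residue in AROMATIC:
            counts["aromatic"] += 1
        else:
            counts["other"] += 1
    return counts


def aliphatic_aromatic_fraction(peptide):
    """Fraction of residues that are aliphatic or aromatic."""
    counts = composition(peptide)
    return (counts["aliphatic"] + counts["aromatic"]) / len(peptide)


def gravy(peptide):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[r] for r in peptide.upper()) / len(peptide)


print(composition("MWLILLFL"))                  # {'aliphatic': 5, 'aromatic': 2, 'other': 1}
print(aliphatic_aromatic_fraction("MWLILLFL"))  # 0.875
```

Under these definitions, seven of the eight residues of MWLILLFL are aliphatic or aromatic, and the initiator methionine is the only exception.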

The structure-guided multiple amino acid sequence alignment (MSA) analyses of RopB clan receptors indicate that the LCP-binding C-terminal domain (CTD) diversified faster than the N-terminal DNA-binding domain (DBD) (Figs. S2 and S3a)7,11,17. Among the structural elements of the CTD, the α6 helix is highly conserved relative to the rest of the CTD (Figs. S2 and S3a). The α6 helix constitutes a critical structural element for LCP binding, as it forms the floor of the LCP-binding pocket and engages in intramolecular interactions that provide the scaffold for the LCP-binding pocket of RopB (Supplementary Fig. 3b)7,11,17. In accordance with the suggested functional constraint on the α6 helix to bind LCPs, the site-wise dN/dS ratio analyses indicate that the α6 helix of candidate LCP receptors is under strong purifying selection against amino acid substitutions (Supplementary Fig. 3a). Similarly, the MSA analyses of the 12 LCP-contacting amino acids of RopB with corresponding amino acids from candidate RopB-clan receptors suggest that these amino acids evolve more slowly and face stronger purifying selection than the remainder of the CTD (Supplementary Fig. 4a, b). These observations are suggestive of faster diversification of TPR motifs, likely due to their innate degeneracy18,19, relative to LCP-receptor specificity diversification. In accordance with this, pairwise comparisons of evolutionary distances between LCP receptors, LCP-contacting residues in receptors, and candidate LCPs suggest that the amino acid sequences of LCP receptors diverged faster than the LCP-contacting residues and the amino acid sequences of putative LCPs, and that both LCPs and LCP-contacting amino acids in RopB clan receptors diverge at a similar evolutionary rate (Supplementary Fig. 5). Collectively, these observations are suggestive of the function of RopB clan receptors and candidate LCPs as qs receptor-signal pairs.

To understand the co-evolution of receptors and candidate LCPs, we aligned the 12 LCP-contacting amino acids of RopB with similarly located amino acids from RopB-clan receptors and compared them with the physicochemical characteristics of the corresponding LCPs (Supplementary Fig. 4a, b). Consistent with the preponderance of aliphatic and aromatic amino acids in the majority of candidate LCPs (Supplementary Fig. 1d), the peptide-contacting amino acids are relatively well conserved among RopB clan receptors (Supplementary Fig. 4). However, compared to RopB, the peptide-contacting residues of RopB clan receptors from Ligilactobacillus muralis, Pediococcus acidilactici, Ligilactobacillus animalis, and Enterococcus casseliflavus are distinct (Supplementary Fig. 4). Accordingly, the physicochemical properties of the corresponding LCPs also deviate from the typical signature of LCPs, as they contain charged, polar, and proline residues (Supplementary Fig. 4). These observations suggest a tropism of RopB clan receptors for SIP-like LCPs, although divergence exists that achieves alternative LCP specificities through receptor-LCP co-evolution. Finally, analyses of the RopB clan receptor-LCP pairs from Bacillus cereus and Lysinibacillus sphaericus revealed an evolutionary feature that is suggestive of peptidase-mediated processing of some LCPs, similar to canonical RRNPP propeptides (Supplementary Fig. 4). The predicted LCP binding sites and the corresponding candidate LCPs of B. cereus and L. sphaericus have identical amino acid composition. However, the LCP from L. sphaericus has an additional eight amino acids at its C-terminus compared to the B. cereus LCP (Supplementary Fig. 4). This observation suggests that LCPs may exist in longer precursor forms and that the C-terminal appendages may be involved in peptidase-mediated cleavage of the precursor form to release mature LCPs.

Analyses of the genomic context of representative putative LCP systems showed that the candidate genes regulated by LCPs are predominantly in a divergent context relative to the receptor (Fig. 2) and belong to three major categories: biosynthetic gene clusters (BGCs) predicted to produce ribosomally synthesized and post-translationally modified peptides (RiPPs) as well as non-ribosomally synthesized antimicrobials, ABC-type transporters, and type VII secretion systems that are typically involved in the translocation of virulence factors (Fig. 3). These observations suggest a broader and more diverse role for LCP systems in bacterial pathogenesis, physiology, and microbial ecology.

LCP in S. salivarius mediates gene regulation

To investigate whether LCPs other than SIP also act as intercellular signals, we characterized the putative cytosolic receptor-LCP pair from S. salivarius (RopBss-LCPss). LCPss is encoded on a megaplasmid, located downstream of ropBss, and transcribed divergently (Fig. 4a). LCPss encodes an eight-amino-acid hydrophobic peptide with a predicted amino acid sequence of MWLILLFL and no additional amino acids at either end (Fig. 4b). The genetic proximity of a 14-gene operon encoding a putative non-ribosomal peptide synthase biosynthesis gene cluster (NRPS-BGC), located immediately downstream of LCPss and transcribed convergently (Fig. 4a), suggests that the NRPS-BGC is the regulatory target of the RopBss-LCPss pathway.

Fig. 4: Intercellular signaling and gene regulation by S. salivarius LCP.

a Schematic representation of genetic elements in S. salivarius (ss) encoding ropBss, LCPss, and a biosynthetic gene cluster (BGC) encoding non-ribosomal peptide synthase (nrps). The divergently transcribed ropBss and LCPss along with predicted transcription start site (PLCPss, bent arrow) of LCPss are shown. The numbers below denote the nucleotide positions relative to the first nucleotide of the start codon of nrps-BGC. b The nucleotide sequence of the ropBss-LCPss intergenic region, coding sequence of LCPss, and corresponding predicted amino acid sequence of LCPss are shown. The ribosomal-binding sites (RBS) of LCPss and RopBss are denoted by arrows. c Analysis of nrps transcript levels in the indicated strains by qRT-PCR. d Addition of full-length synthetic LCPss activates nrps expression in LCPss inactivated mutant (LCPss*). The amino acid sequences of the synthetic LCPss variants used are shown (right). e qRT-PCR based nrps transcript level analyses in LCPss* mutant grown in spent medium from the indicated strains. f Cytosolic fluorescence corresponding to FITC-labeled LCPss indicative of the import of exogenous LCPss as assessed by fluorescence measurements. Unmodified or FITC-labeled LCPss was added at a final concentration of 1 µM to either WT or oligopeptide permease-inactivated mutant (∆opp). After 30 min incubation, fluorescence in the clarified cell lysates was measured using excitation and emission wavelengths of 480 nm and 520 nm, respectively. g Analysis of the binding between purified RopBss and fluoresceinated LCPss by fluorescence polarization (FP) assay. h Ability of LCPss or SCRA peptide to compete with the FITC-labeled LCPss–RopBss complex for binding. A preformed RopB (250 nM)-labeled LCPss (10 nM) complex was challenged with the indicated unlabeled peptides. i Nucleotide sequence of the identified RopBss binding site in LCPss promoter used in the DNA-binding studies is shown. 
j Summary of the affinity of different forms of RopBss to LCPss promoter as assessed by FP assays. k Proposed model for LCPss signaling. LCPss is produced, exported, and reinternalized into the cytosol. The recognition of LCPss by RopBss promotes high-affinity interactions between RopBss and the binding site in the LCPss promoter, which leads to the upregulation of LCPss and NRPS-BGC. In (c, d, e, f), data are derived from three biological replicates and analyzed in duplicate. In (g, h), data are derived from three independent experiments. In (c–h), data graphed represent mean values ± s.e.m. P values in (c, d, e, f) were calculated by Kruskal-Wallis test. In (c), * – P = 0.0259, ** – P = 0.073. In (d), * – P = 0.014. In (e), * – P = 0.0114, ** – P = 0.0027. In (f), ** – P = 0.0016, **** – P < 0.0001. n.s – not significant. Source data are provided as a Source Data file.

In accordance with this, inactivation of ropBss or LCPss abrogated NRPS-BGC expression and cis-complementation of ∆ropBss and LCPss* mutants with ropBss and LCPss, respectively, restored NRPS-BGC expression (Fig. 4c). Similarly, the addition of synthetic LCPss containing the predicted amino acid sequence in native order (LCPss), not in scrambled order (SCRA), restored WT-like NRPS-BGC expression in the LCPss* mutant (Fig. 4d). However, supplementation with synthetic LCPss containing staggering truncations at either N-terminal or C-terminal ends (Fig. 4d) failed to activate NRPS-BGC expression in the LCPss* mutant, demonstrating that LCPss encodes a mature LCP and lacks the hallmarks of canonical bacterial peptide signals (Fig. 4d). Furthermore, supplementation with even 20X molar excess of synthetic LCPss failed to activate NRPS-BGC expression in ∆ropBss (Fig. 4d), indicating that LCPss activity requires its cognate receptor RopBss. However, despite the absence of the secretion signal sequence, the LCPss is secreted and reinternalized into the cytosol and acts as an intercellular signal. This was demonstrated by the presence of LCPss associated regulatory activity only in the secreted component of peptide-producing strains (WT, LCPss*::LCPss, and ∆ropBss::ropBss) and internalization of the exogenously added FITC-labeled synthetic LCPss (Fig. 4e, f and Supplementary Fig. 6). Furthermore, inactivation of the canonical bacterial peptide import machinery oligopeptide permease (∆opp) did not affect LCPss import (Fig. 4f)20, suggesting that unknown reimport mechanisms are involved in LCPss import.

To elucidate the molecular mechanism of LCPss-mediated signaling, we investigated the sequence-specific recognition of LCPss by cytosolic RopBss by fluorescence polarization (FP) assay using FITC-labeled LCPss. RopBss binds LCPss with high affinity (Kd ~8 nM) (Fig. 4g) and the pre-formed RopBss-FITC-LCPss complex was disrupted only by unlabeled LCPss, not by non-specific SCRA (Fig. 4h), indicating that RopBss recognition of LCPss is sequence-specific. To explain the downstream consequences of RopBss-LCPss interactions, we hypothesized that LCPss facilitates RopBss interactions with target promoters and promotes RopBss-dependent activation of NRPS-BGC expression. To map the operator sequences for RopBss in the LCPss and NRPS-BGC promoters, we performed electrophoretic mobility shift assays (EMSA) using different DNA fragments that span the LCPss and LCPss-NRPS-BGC intergenic regions (Supplementary Fig. 7). RopBss bound only to a 43-bp fragment located immediately upstream of the putative −35 hexamer of the LCPss promoter and did not interact with the NRPS-BGC promoter (Fig. 4i and Supplementary Fig. 7). These results indicate that LCPss and NRPS-BGC are likely expressed as a polycistronic transcript and that the RopBss binding site is located in the LCPss promoter (Supplementary Fig. 7). We further probed the 43-bp fragment for the presence of putative palindromes and found an inverted repeat with a 12 bp half site – 4 bp spacer – 12 bp half site motif that likely constitutes the RopBss binding site (Supplementary Fig. 7e–f). However, the RopBss binding site differs from the RopB-GAS binding site in several aspects, including the motif arrangement, length, and nucleotide composition. RopB-GAS binds to a palindrome with a 9 bp half site – 7 bp spacer – 9 bp half site motif (25 bp long)10 compared to the 12 bp half site – 4 bp spacer – 12 bp half site motif (26 bp long) of RopBss. The half site of the palindrome in the RopB-GAS binding site has a nucleotide composition of GTTACGTNT10, which differs from the RopBss binding site, whose half site has a nucleotide composition of ATGTAACATATT (Supplementary Fig. 7f). These findings indicate that the receptors recognize operator sequences of different length and nucleotide composition in the target promoters. However, consistent with the role of RopBss and RopB-GAS as transcription activators10 and their likely role in the recruitment of RNA polymerase to defective promoters21, the binding sites for both receptors are located upstream of and around the −35 region of LCP promoters.
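The search for a half site – spacer – half site inverted repeat can be sketched in a few lines. This is a generic illustration rather than the procedure used in the study: the function names, the mismatch tolerance, and the flanking and spacer nucleotides in the example are hypothetical, and only the half-site sequence ATGTAACATATT comes from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, half_len, spacer_len, max_mismatches=0):
    """Slide a window of half_len + spacer_len + half_len over seq and report
    positions where the right arm is the reverse complement of the left arm,
    allowing up to max_mismatches differences."""
    seq = seq.upper()
    total = 2 * half_len + spacer_len
    hits = []
    for i in range(len(seq) - total + 1):
        left = seq[i:i + half_len]
        right = seq[i + half_len + spacer_len:i + total]
        expected = reverse_complement(left)
        mismatches = sum(a != b for a, b in zip(expected, right))
        if mismatches <= max_mismatches:
            hits.append((i, left, right, mismatches))
    return hits


# Half site from the text; flanks ("GGC", "CCG") and spacer ("ACGT") are hypothetical.
half = "ATGTAACATATT"
fragment = "GGC" + half + "ACGT" + reverse_complement(half) + "CCG"
print(find_inverted_repeats(fragment, half_len=12, spacer_len=4))
# [(3, 'ATGTAACATATT', 'AATATGTTACAT', 0)]
```

Calling the same function with `half_len=9, spacer_len=7` would scan for the RopB-GAS arrangement, although the degenerate position (N) in that half site would require a mismatch allowance or ambiguity-aware matching.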

To delineate the influence of LCPss on RopBss-promoter interactions, we assessed RopBss-DNA interactions in the presence and absence of LCPss by FP assay using a FITC-labeled oligoduplex containing the identified RopBss binding site (Fig. 4i). The addition of LCPss resulted in high-affinity interactions between RopBss and the cognate DNA sequences (Kd ~90 nM) compared to that of apo- or SCRA-bound RopBss (Kd > 500 nM) (Fig. 4j and Supplementary Fig. 7g, h), indicating that LCPss binding promotes high-affinity interactions between RopBss and the LCPss promoter. Based on these observations, we propose a model for LCPss signaling and LCPss-dependent transcription activation of NRPS-BGC by RopBss (Fig. 4k).
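To give a feel for what the difference between these dissociation constants means, the sketch below evaluates a simple one-site binding model, fraction bound = [R] / (Kd + [R]). The model, the assumption that the labeled DNA is present at a concentration far below Kd, and the 250 nM receptor concentration used in the example are all illustrative choices and are not taken from the DNA-binding experiments.

```python
def fraction_bound(receptor_nM, kd_nM):
    """Fraction of DNA probe bound under a one-site model, assuming the probe
    concentration is negligible compared with Kd (no ligand depletion)."""
    if receptor_nM < 0 or kd_nM <= 0:
        raise ValueError("receptor_nM must be >= 0 and kd_nM must be > 0")
    return receptor_nM / (kd_nM + receptor_nM)


# Hypothetical receptor concentration for illustration only.
receptor = 250.0
with_lcp = fraction_bound(receptor, 90.0)       # LCPss-bound RopBss, Kd ~90 nM
without_lcp = fraction_bound(receptor, 500.0)   # apo RopBss, Kd > 500 nM (upper bound)
print(round(with_lcp, 3), round(without_lcp, 3))  # 0.735 0.333
```

Because the apo-form Kd is reported only as a lower limit, the second value is an upper bound on occupancy; at any given receptor concentration the peptide-bound form occupies the site at least this much more than the apo form.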

LCP system regulates streptococcal virulence factor production

To assess the functionality of LCPs in other bacteria, we first assessed the regulatory activity of LCP from the swine pathogen S. porcinus. The amino acid sequence of S. porcinus LCP (LCPsp) is identical to that of SIP (Fig. 5a, b). The coding region of LCPsp is flanked upstream by ropB in the divergent direction and downstream by a gene encoding cysteine protease (speBsp) that is transcribed convergently (Fig. 5a). Consistent with the role of LCPsp as an intercellular signal that controls speBsp expression, supplementation of S. porcinus with synthetic LCPsp triggered early induction of speBsp expression, while the non-cognate SCRAsp had no effect on gene regulation (Fig. 5c). Since the secreted cysteine protease SpeB is critical for the virulence of S. pyogenes22,23, we reason that LCPsp-mediated activation of speBsp expression may impact the pathogenic traits of S. porcinus.

Fig. 5: Diverse LCPs mediate intercellular signaling in streptococcus and enterococcus.

a Schematic representation of genetic elements in S. porcinus (sp) encoding ropBsp, LCPsp, and secreted cysteine protease speBsp. The ropBsp and LCPsp are divergently transcribed. The bent arrow above indicates the transcription start site of LCPsp. The numbers below denote the nucleotide positions relative to the first nucleotide of the start codon of speBsp. b The nucleotide sequence of the ropBsp-LCPsp intergenic region, coding sequence of LCPsp, and corresponding predicted amino acid sequence of LCPsp are shown. The ribosomal-binding sites (RBS) of LCPsp and RopBsp are marked by arrows. c Addition of synthetic LCPsp causes early induction of speB expression in WT S. porcinus. speB transcript levels were assessed by qRT-PCR and the fold change in speB expression relative to unsupplemented growth (reference) is shown. d Schematics of genetic elements in E. malodoratus (em) encoding ropBem, LCPem, and genes encoding putative T7 secretion system (T7SS) and cognate effector (T7SS-eff). The bent arrow above indicates the predicted transcription start site (PLCPem) of LCPem. The numbers below denote the nucleotide positions relative to the first nucleotide of the start codon of LCPem. e The nucleotide sequence of the ropBem-LCPem intergenic region, coding sequence of LCPem, and corresponding predicted amino acid sequence of LCPem are shown. The ribosomal-binding sites of LCPem and RopBem are underlined. f Addition of synthetic LCPem causes early induction of T7SS and T7SS-eff expression in WT E. malodoratus. Transcript levels were assessed by qRT-PCR and the fold change in gene expression relative to unsupplemented growth (reference) is shown. In c, f, data are derived from three biological replicates analyzed in duplicate and data graphed represent mean values ± s.e.m. P values in (c, f) were calculated by Kruskal-Wallis test. In (c), * – P = 0.0189, n.s not significant. 
In (f), ** – P = 0.0015, n.s – not significant; in (g), ** – P = 0.0049, n.s – not significant. Source data are provided as a Source Data file.

LCPs from different clades of the RopB clan receptors phylogeny mediate intercellular communication and gene regulation

To test whether an LCP from a non-streptococcal genus is functional, we characterized the regulatory activity of the LCP from Enterococcus malodoratus (LCPem) (Fig. 2). LCPem is 9 amino acids long and its amino acid sequence is distinct from that of SIP (Fig. 5d, e). The ropBem is divergently transcribed from LCPem. However, unlike the other characterized LCPs (above), there are no convergently transcribed genes downstream of LCPem (Fig. 5d). Instead, there are two genes encoding a T7 secretion system (T7SS)-effector pair located downstream of ropBem and transcribed convergently from ropBem. Additionally, there are two hypothetical genes located downstream of LCPem but transcribed divergently from LCPem (Fig. 5d). Since the gene arrangement is distinct from other characterized LCP systems and the regulatory influence of LCPem on these genes is unknown, we investigated the effect of synthetic LCPem on the expression of genes in both directions. Supplementation of E. malodoratus with LCPem induced only the expression of the genes encoding the T7SS and its effector, and the induction was specific for LCPem (Fig. 5f). In contrast, LCPem had no influence on the expression profile of the two hypothetical genes located downstream of LCPem (data not shown). These findings demonstrate that LCPem, which is dissimilar to SIP, acts as an intercellular signal and controls the production of an E. malodoratus T7SS and its effector.

To investigate the functionality of an LCP system from a more distantly related species, we assessed the regulatory activity of the LCP from Limosilactobacillus reuteri (LCPlr) (Figs. 2, 6). Unlike other L. reuteri strains, the L. reuteri DSM32035 strain has a naturally occurring stop codon at amino acid position 4 of the putative LCPlr (Fig. 6b). The predicted untruncated full-length LCPlr is 8 amino acids long (Fig. 6b, c) with the characteristic aliphatic and aromatic amino acid composition (Fig. 6a–c). The ropBlr is divergently transcribed from LCPlr (Fig. 6a). Two hypothetical genes encoding a putative ABC-type transporter are located downstream of LCPlr and transcribed convergently from LCPlr (Fig. 6a). Supplementation of exponentially growing L. reuteri with synthetic LCPlr activated the naturally silent LCP pathway and induced the expression of the genes encoding the ABC-type transporter. The induction was specific for LCPlr, as SCRAlr failed to activate the expression of the ABC transporter (Fig. 6c). These results indicate that the LCP from a distant species, LCPlr, functions effectively as a qs signal and mediates gene regulation.

Fig. 6: LCP from a diverse species mediates intercellular signaling in Limosilactobacillus reuteri.

a Schematic representation of genetic elements in L. reuteri (lr) encoding ropBlr, LCPlr, ABC transporter (Gene 1). The ropBlr and LCPlr are divergently transcribed. The bent arrow above indicates the transcription start site of LCPlr. The numbers below denote the nucleotide positions relative to the first nucleotide of the start codon of gene 1. b The nucleotide sequence of the ropBlr-LCPlr intergenic region, coding sequence of LCPlr, and corresponding predicted amino acid sequence of LCPlr are shown. The ribosomal-binding sites (RBS) of LCPlr and RopBlr are marked by arrows. c Addition of synthetic LCPlr causes early induction of gene 1 expression in WT L. reuteri. Gene 1 transcript levels were assessed by qRT-PCR and the fold change in expression relative to unsupplemented growth (reference) is shown. In (c), data are derived from three biological replicates analyzed in duplicate and data graphed represent mean values ± s.e.m. P values in (c) were calculated by Kruskal-Wallis test. ** – P = 0.0024, n.s not significant. Source data are provided as a Source Data file.
