PCR products were purified using the GeneJET Gel Extraction Kit (Thermo Scientific Fermentas) according to the manufacturer's instructions. The cloned DNA fragments were sequenced on an ABI 3130XL genetic analyser. Sequence walking with internal primers designed within the spacer sequences was used to complete the sequencing of the PCR fragments. A slightly modified spacer-crawling approach [29] was applied to amplify the CRISPR arrays of strains GV28 and GV33. The primers targeted cas2 and the repeat sequence within the CRISPR locus.

The resulting PCR product formed a ladder of fragments of increasing length, each differing from the next by the length of one spacer plus one repeat. The mixture of fragments was cloned into the pJET1.2 vector (Thermo Scientific Fermentas); the recombinant plasmids containing the longest DNA inserts were selected and sequenced. The next round of amplification used a primer designed from the farthest spacer sequence obtained and primers located in the flanking region downstream of the CRISPR sequence (Additional file 2). The resulting contigs were assembled with a minimum overlapping region of three spacers.

Amplification and sequencing of the cas genes

The presence of the cas genes was verified by amplification of the regions containing cas5-cas6e-cas1-cas2 (~3.6 kbp), cas3-cse1 (~3 kbp), cse2-cas5 (~2.7 kbp), cas5 (~0.88 kbp) and cse2 (~0.6 kbp). The primers used in the PCR are provided in Additional file 2. The PCR regimen included 28 cycles of denaturation at 94°C for 30 s, primer annealing at 58°C for 30 s, and extension at 72°C for 1 min per kb of PCR target. The final extension step was prolonged to 10 min. The cloned DNA fragments containing cas5 and cas2 were sequenced.

CRISPR sequence analysis

CRISPR information for the three G. vaginalis genomes (ATCC14019, 409-05, and HMP9231) was retrieved from the CRISPR database [24]. CRISPRFinder [24] was used to detect CRISPR repeat and spacer sequences. The cas genes were also identified using NCBI BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi). Each piece of CRISPR and cas information retrieved from the databases was manually proofread. The search for similarities between each spacer and the sequences deposited in GenBank was performed using BLASTn at NCBI, with the search set limited to Bacteria (taxid:2) or Viruses (taxid:10239). All matches with a bit score above 40.0, corresponding to 100% identity over at least 20 bp, were considered legitimate hits. Only the top hit was taken into consideration. Matches to sequences found within G. vaginalis CRISPR loci were discarded. Spacers were compared to one another using the MAFFT program [33]. CRISPR spacers with up to three mismatches and 100% overlap between sequences were considered identical. The consensus sequences of the CRISPR repeat and protospacer region alignments were generated by WebLogo [34].
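
The spacer identity criterion described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study, which relied on MAFFT alignments; it assumes that "100% overlap" can be approximated by requiring equal spacer lengths and an ungapped, position-by-position comparison. The function names and the `max_mismatches` parameter are hypothetical and introduced only for illustration.

```python
def count_mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def spacers_identical(a: str, b: str, max_mismatches: int = 3) -> bool:
    """Treat two spacers as identical under the stated criterion.

    Assumption: full (100%) overlap is approximated by equal length,
    so spacers of different lengths are never considered identical.
    """
    a, b = a.upper(), b.upper()
    if len(a) != len(b):
        return False
    return count_mismatches(a, b) <= max_mismatches
```

For example, under these assumptions two 33-nt spacers that differ at three positions would be grouped as one spacer, whereas a pair differing at four positions, or a pair of unequal length, would be kept as distinct spacers.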

