To validate the performance of the RTD-based alignment-free method, five data sets, listed in Table 1, were compiled by considering different factors such as,
Table 1: Summary of data sets used in the study

| Sr. no. | Data set | Taxon | OTUs (a) | Unaligned length (b) | Aligned length (c) | Identity (%) (d) |
|---|---|---|---|---|---|---|
| 1 | MLST genes | Genus Aeromonas | 115 | 4676-4957 | 4987 | 82.17-100 |
| 2 | SH gene | Mumps virus genotypes | 32 | 316-318 | 318 | 72.64-100 |
| 3 | Mitochondrial genome | Class Mammalia | 31 | 16338-17447 | 20317 | 61.42-97.78 |
| 4 | Complete genome | Genus Enterovirus | 113 | 6944-7458 | 8006 | 51.91-98.98 |
| 5 | Complete genome | Family Flaviviridae | 59 | 9406-12813 | 17163 | 33.40-90.81 |

a: number of operational taxonomic units (OTUs); b: length variation of unaligned sequences; c: length of aligned sequences; d: variation of percent identity in aligned sequences. Data sets are listed in decreasing order of % sequence identity.
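The columns of Table 1 can be derived from a multiple sequence alignment. The following is a minimal sketch of one way to do so, not the procedure used in the study: the function names `percent_identity` and `summarize`, the toy alignment, and the choice to skip columns where either sequence has a gap are all assumptions made for illustration. Other definitions of percent identity, for example counting gapped columns as mismatches, would give different ranges.

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns in which either sequence has a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def summarize(alignment, gap="-"):
    """Summarize an alignment given as a dict of name -> aligned sequence."""
    seqs = list(alignment.values())
    # Unaligned length is the aligned sequence with gap characters removed.
    unaligned = [len(s.replace(gap, "")) for s in seqs]
    # Identity range is taken over all pairs of sequences.
    identities = [percent_identity(a, b, gap) for a, b in combinations(seqs, 2)]
    return {
        "otus": len(seqs),
        "unaligned_length": (min(unaligned), max(unaligned)),
        "aligned_length": len(seqs[0]),
        "identity": (min(identities), max(identities)),
    }


# Hypothetical toy alignment, not taken from any data set in Table 1.
toy = {
    "seq1": "ACGT-ACGT",
    "seq2": "ACGTTACGT",
    "seq3": "ACGA-ACCT",
}
summary = summarize(toy)
print(summary)
```

For the toy alignment, the summary reports 3 OTUs, unaligned lengths of 8 to 9, an aligned length of 9, and pairwise identities between 75% and 100%.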