FMDVSerPred: A Novel Computational Solution for Foot-and-mouth Disease Virus Classification and Serotype Prediction Prevalent in Asia Using VP1 Nucleotide Sequence Data


Abstract

Background: Three serotypes of foot-and-mouth disease (FMD) virus have been circulating in Asia and are commonly identified by serological assays. Such tests are time-consuming and require a bio-containment facility. To the best of our knowledge, no computational solution for predicting FMD virus serotypes is available in the literature. This creates an urgent need for user-friendly tools for FMD virus serotyping.

Methods: We present a computational solution based on a machine-learning model for FMD virus classification and serotype prediction. Various data pre-processing techniques are implemented in the approach to improve model prediction. We used sequence data of 2509 FMD virus isolates reported from India and seven other Asian FMD-endemic countries for model training, testing, and validation. We also assessed the utility of the developed solution in a wet-lab setup by collecting and sequencing 12 virus isolates reported in India. The computational solution is implemented in two user-friendly tools: an online web-prediction server (https://nifmd-bbf.icar.gov.in/FMDVSerPred) and an R statistical software package (https://github.com/sam-dfmd/FMDVSerPred).
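
As a rough illustration of the kind of pre-processing that lets a machine-learning model work on VP1 nucleotide sequences, the sketch below converts a sequence into a fixed-length vector of normalized k-mer frequencies. This is an assumption made for illustration only: the abstract does not state which features or pre-processing steps FMDVSerPred uses, and the function name `kmer_frequencies` and the default `k = 3` are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import List

ALPHABET = "ACGT"


def kmer_frequencies(sequence: str, k: int = 3) -> List[float]:
    """Return normalized k-mer frequencies in fixed lexicographic order.

    Hypothetical helper: the feature encoding is an assumption for
    illustration, not the encoding documented for FMDVSerPred.
    """
    # Normalize case and map RNA 'U' to DNA 'T'.
    seq = sequence.upper().replace("U", "T")
    # All 4**k possible k-mers, so every sequence yields the same vector length.
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing ambiguous bases (e.g. 'N') are excluded from the total.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


if __name__ == "__main__":
    # Toy sequence, not real VP1 data.
    features = kmer_frequencies("ACGTACGT", k=2)
    print(len(features))
```

Vectors of this form could then be passed to any standard classifier, for example a random forest, with the serotype as the class label.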

Results: The random forest model is implemented in the computational solution because it outperformed seven other machine learning models when evaluated on ten test and independent datasets. The developed solution achieved accuracies of up to 99.87% on test data, and up to 98.64% and 90.24% on independent data from India and from its seven neighboring Asian countries, respectively. In addition, the approach successfully predicted the serotypes of field FMD virus isolates reported from various parts of India.

Conclusion: High-throughput sequencing combined with machine learning offers a promising solution for FMD virus serotyping.

About the authors

Samarendra Das

ICAR-National Institute on Foot and Mouth Disease, International Centre for Foot and Mouth Disease

Author for correspondence.
Email: info@benthamscience.net

Soumen Pal

Division of Computer Application, ICAR-Indian Agricultural Statistics Research Institute

Email: info@benthamscience.net

Samyak Mahapatra

Department of Bioinformatics, Centre for Post-Graduate Studies, Odisha University of Agriculture and Technology

Email: info@benthamscience.net

Jitendra Biswal

ICAR-National Institute on Foot and Mouth Disease, International Centre for Foot and Mouth Disease

Email: info@benthamscience.net

Sukanta Pradhan

Department of Bioinformatics, Centre for Post-Graduate Studies, Odisha University of Agriculture and Technology

Email: info@benthamscience.net

Aditya Sahoo

ICAR-National Institute on Foot and Mouth Disease, International Centre for Foot and Mouth Disease

Email: info@benthamscience.net

Rabindra Singh

ICAR-National Institute on Foot and Mouth Disease, International Centre for Foot and Mouth Disease

Author for correspondence.
Email: info@benthamscience.net

Copyright (c) 2024 Bentham Science Publishers