Predicting Loss-of-Function Impact of Genetic Mutations: A Machine Learning Approach

Arshmeet Kaur

doi:10.54364/AAIML.2024.41119

Original Research (Published On: 22-Mar-2024)

Adv. Artif. Intell. Mach. Learn., 4 (1):2091-2102

1. Arshmeet Kaur: Evergreen Valley College, Student, Transferring to Bioengineering

Article History: Received on: 10-Jan-24, Accepted on: 15-Mar-24, Published on: 22-Mar-24

Corresponding Author: Arshmeet Kaur

Email: Arka7783@stu.evc.edu

Citation: Arshmeet Kaur and Morteza Sarmadi. Predicting Loss-of-Function Impact of Genetic Mutations: A Machine Learning Approach. Advances in Artificial Intelligence and Machine Learning. 2024;4(1):119.

Abstract

The development of next-generation sequencing (NGS) techniques has significantly reduced the price of genome sequencing, lowering barriers to future medical research; it is now feasible to apply genome sequencing to studies where it would previously have been cost-inefficient. Identifying damaging or pathogenic mutations in vast amounts of complex, high-dimensional genome sequencing data may be of particular interest to researchers. This paper therefore aimed to train machine learning models on the attributes of a genetic mutation to predict LoFtool scores, which measure a gene’s intolerance to loss-of-function mutations. These attributes included, but were not limited to, the position of a mutation on a chromosome, changes in amino acids, and changes in codons caused by the mutation. Models were built by combining the univariate feature selection technique f-regression with K-nearest neighbors (KNN), Support Vector Machine (SVM), Random Sample Consensus (RANSAC), Decision Trees, Random Forest, and Extreme Gradient Boosting (XGBoost). The models were evaluated using five-fold cross-validated averages of r-squared, mean squared error, root mean squared error, mean absolute error, and explained variance. Multiple trained models achieved testing-set r-squared values of 0.97.
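
The evaluation scheme described above can be illustrated with a minimal, dependency-free Python sketch. It is not the study's code: the data are synthetic stand-ins for the mutation attributes and LoFtool scores, KNN is used as one representative of the listed models, and the function names and settings (`n_features=2`, `k=5`, the random seeds) are assumptions chosen for illustration. The F-statistic is computed from the Pearson correlation between each feature and the target, which is the usual form of univariate f-regression; in practice a library such as scikit-learn (`f_regression` with `SelectKBest`) would likely be used instead.

```python
import math
import random
from statistics import fmean


def f_scores(X, y):
    """Univariate F-statistic of each feature against y (via Pearson r)."""
    n, mean_y = len(y), fmean(y)
    ss_y = sum((v - mean_y) ** 2 for v in y)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean_x = fmean(col)
        ss_x = sum((v - mean_x) ** 2 for v in col)
        s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(col, y))
        r2 = 0.0 if ss_x == 0 or ss_y == 0 else s_xy ** 2 / (ss_x * ss_y)
        r2 = min(r2, 1 - 1e-12)  # avoid division by zero for a perfect fit
        scores.append(r2 / (1 - r2) * (n - 2))
    return scores


def knn_predict(X_train, y_train, x, k):
    """Mean target of the k training rows closest to x (Euclidean)."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row, x))
    nearest = sorted(range(len(X_train)), key=lambda i: sq_dist(X_train[i]))[:k]
    return fmean([y_train[i] for i in nearest])


def regression_metrics(y_true, y_pred):
    """The five metrics named in the abstract, for one test fold."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true, mean_err = fmean(y_true), fmean(errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    ss_err = sum((e - mean_err) ** 2 for e in errors)
    mse = ss_res / len(errors)
    return {
        "r2": 1 - ss_res / ss_tot,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": fmean([abs(e) for e in errors]),
        "explained_variance": 1 - ss_err / ss_tot,
    }


def cross_validate(X, y, n_folds=5, n_features=2, k=5, seed=0):
    """Average each metric over n_folds shuffled folds."""
    order = list(range(len(y)))
    random.Random(seed).shuffle(order)
    per_fold = []
    for f in range(n_folds):
        test_idx = order[f::n_folds]
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        # Select features on the training fold only, to avoid leakage.
        scores = f_scores(X_train, y_train)
        keep = sorted(range(len(scores)), key=lambda j: scores[j],
                      reverse=True)[:n_features]
        X_train_sel = [[row[j] for j in keep] for row in X_train]
        preds = [knn_predict(X_train_sel, y_train, [X[i][j] for j in keep], k)
                 for i in test_idx]
        per_fold.append(regression_metrics([y[i] for i in test_idx], preds))
    return {name: fmean([m[name] for m in per_fold]) for name in per_fold[0]}


# Synthetic stand-in data: 6 features, only the first two drive the target.
rng = random.Random(42)
X = [[rng.uniform(0, 1) for _ in range(6)] for _ in range(200)]
y = [3 * row[0] - 2 * row[1] + rng.gauss(0, 0.05) for row in X]

averages = cross_validate(X, y)
for name, value in averages.items():
    print(f"{name}: {value:.3f}")
```

Feature selection is repeated inside each fold, so the held-out rows never influence which features are kept. This is one reasonable design choice; the abstract does not specify how selection and cross-validation were combined in the study.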
