Organ and cell-specific biomarkers of Long-COVID identified with targeted proteomics and machine learning - Molecular Medicine

  • 📰 BioMedCentral


Using machine learning algorithms, a study published in Molecular Medicine identifies 119 important proteins that differentiate Long-COVID outpatients from other cohorts, indicating a unique protein profile.

Specifically, we measured a total of 3072 proteins in the plasma of Long-COVID, acutely ill COVID-19, and healthy control subjects. The Olink Explore 3072 library consists of multiple panels with some duplicated proteins, so the measurements cover 2925 unique proteins.

The following steps were undertaken to conduct a conservative analysis that mitigates concerns about relatively small sample sizes and about overfitting, given that Boruta feature reduction is based on Random Forest classifiers. First, the data were split into a feature reduction dataset and a testing dataset, stratified by subject group. The Boruta algorithm was then run on the feature reduction dataset to determine the most relevant features.
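The split and the shadow-feature idea behind Boruta can be sketched roughly as follows. This is an illustration on synthetic data, assuming scikit-learn and NumPy are available; it is not the study's pipeline. The helper `shadow_feature_screen` is hypothetical and performs only a single screening round, whereas the full Boruta algorithm repeats such rounds and applies a statistical test before confirming or rejecting a feature.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the proteomics matrix (rows = subjects, columns = proteins).
X, y = make_classification(
    n_samples=90, n_features=40, n_informative=6, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Stratified split: each subject group keeps its proportion in both parts.
X_reduce, X_test, y_reduce, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def shadow_feature_screen(X, y, seed=0):
    """One simplified Boruta-style round (hypothetical helper, not the full algorithm)."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Shadow features: every column shuffled on its own, which breaks any link to y.
    shadows = np.column_stack(
        [rng.permutation(X[:, j]) for j in range(n_features)]
    )
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(np.hstack([X, shadows]), y)
    importances = forest.feature_importances_
    # Keep real features that beat the best-performing shadow feature.
    threshold = importances[n_features:].max()
    return np.flatnonzero(importances[:n_features] > threshold)


# Feature reduction only ever sees the feature reduction dataset.
selected = shadow_feature_screen(X_reduce, y_reduce)
print("Columns passing the shadow screen:", selected.tolist())
```

Running the screen on `X_reduce` alone keeps `X_test` untouched for later evaluation, which is the point of splitting before feature reduction.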

To prepare an optimal model, recursive feature elimination (RFE) was used. Because a Random Forest is a set of decision trees, we were able to interrogate this collection of trees to identify the features that have the highest predictive value. Based on this characteristic, RFE starts with the reduced dataset, fits a Random Forest classifier, and determines the importance rankings. The algorithm then drops the least important feature and repeats the process until only 10 features remain.
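The elimination loop described above corresponds closely to scikit-learn's `RFE` class. A minimal sketch, assuming scikit-learn is installed and using synthetic data in place of the reduced protein set (the dataset sizes and forest settings here are illustrative assumptions, not the study's values):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Synthetic stand-in for the reduced dataset (rows = subjects, columns = proteins).
X, y = make_classification(
    n_samples=80, n_features=25, n_informative=5, n_redundant=0, random_state=0
)

# step=1 drops the single least important feature after each Random Forest fit,
# repeating until n_features_to_select features are left.
selector = RFE(
    estimator=RandomForestClassifier(n_estimators=100, random_state=0),
    n_features_to_select=10,
    step=1,
)
selector.fit(X, y)

# support_ is a boolean mask over columns; ranking_ is 1 for every retained feature.
kept = [i for i, keep in enumerate(selector.support_) if keep]
print("Retained feature columns:", kept)
```

Features eliminated earlier receive higher values in `selector.ranking_`, so the ranking also records the order in which features were dropped.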

Receiver operating characteristic (ROC) curve analyses using logistic regression were conducted to determine the sensitivity and specificity of individual molecules for predicting Long-COVID status in comparison to healthy controls and COVID-19 patients. Area-under-the-curve (AUC) was calculated as an aggregate measure of protein performance across all possible classification thresholds. Precision and recall were determined, along with their combined metric, which was calculated as their harmonic mean (the F1 score).
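For a single protein, these metrics could be computed along the following lines. This is a sketch on simulated values, assuming scikit-learn and NumPy; the variable `protein` and the group sizes are invented for illustration, and the model is scored on the same data it was fitted on purely to keep the example short.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score

rng = np.random.default_rng(0)

# Simulated data: 30 comparison subjects (0) and 30 Long-COVID subjects (1),
# with one protein whose level is shifted upward in the second group.
y = np.repeat([0, 1], 30)
protein = rng.normal(loc=y * 1.5, scale=1.0).reshape(-1, 1)

model = LogisticRegression().fit(protein, y)

# AUC summarizes ranking quality across all classification thresholds.
scores = model.predict_proba(protein)[:, 1]
auc = roc_auc_score(y, scores)

# Precision, recall, and F1 are evaluated at the default 0.5 threshold.
pred = model.predict(protein)
precision = precision_score(y, pred)
recall = recall_score(y, pred)
f1 = f1_score(y, pred)

print(f"AUC={auc:.2f} precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

The F1 value returned by `f1_score` equals `2 * precision * recall / (precision + recall)`, the harmonic mean of the two.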

A pairwise comparison using cosine similarity was conducted to determine the similarity between subjects across the selected biomarkers. Subjects with similar selected-biomarker profiles have a score closer to 1, while dissimilar subjects have a score closer to 0. The analysis was done with data Min–Max scaled between 0 and 1, and the cosine similarities were visualized using a heatmap. The machine learning analysis was conducted using Python version 3.9.
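The scaling and pairwise comparison can be illustrated with a few lines of NumPy. This is a sketch on random values, not the study's data; the matrix `profiles` (6 subjects by 10 biomarkers) is an assumption made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative matrix: 6 subjects (rows) by 10 selected biomarkers (columns).
profiles = rng.normal(size=(6, 10))

# Min-Max scale each biomarker column to the range [0, 1].
col_min = profiles.min(axis=0)
col_range = profiles.max(axis=0) - col_min
scaled = (profiles - col_min) / col_range

# Cosine similarity: dot product of every pair of length-normalized subject rows.
norms = np.linalg.norm(scaled, axis=1, keepdims=True)
unit = scaled / norms
similarity = unit @ unit.T

print(np.round(similarity, 2))
```

Because the scaled values are non-negative, every pairwise score falls between 0 and 1, which matches the interpretation given above. The resulting square matrix is what a heatmap would display, for example with matplotlib's `imshow`.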

 
