October 16, 2025

Artificial intelligence reveals novel gene mutations tied to Alzheimer’s

AI-Powered Machine Learning Unlocks New Clues to Alzheimer’s Genetics

Background

Statistical tools have long been central to understanding the genetic roots of complex diseases like Alzheimer’s. Traditionally, researchers relied on linear additive models, in which each gene variant contributes a small, separate amount of risk. However, these models often overlook the interactions between different genetic markers.

A new study published in Nature Communications breaks new ground by using machine learning (ML) to analyze genome-wide data from a large European cohort of Alzheimer’s disease (AD) patients. This approach marks a shift beyond conventional statistical models.

Why This Study Matters

Genome-wide association studies (GWAS) have already mapped many gene variants linked to Alzheimer’s. These are used to build polygenic risk scores (PRS)—tools that estimate a person’s risk based on the sum of multiple variants.
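To make the idea concrete, here is a minimal sketch of how a PRS is commonly calculated: each variant's risk-allele count (0, 1, or 2) is multiplied by a weight, usually an effect size taken from a GWAS, and the results are summed. The function name, the weights, and the genotype below are hypothetical and purely illustrative; they are not taken from the study.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele counts (0, 1, or 2) across variants."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical per-variant weights (e.g. GWAS log odds ratios); illustrative only.
weights = [0.5, 0.25, -0.125]

# Hypothetical person: two risk alleles at variant 1, one at variant 2, none at variant 3.
person = [2, 1, 0]

score = polygenic_risk_score(person, weights)
print(score)
```

Because the score is a plain sum, each variant adds the same amount no matter which other variants the person carries.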

However, PRS assumes each variant’s effect is independent and additive, which oversimplifies reality. For instance, APOE gene variants, which are strongly associated with AD, interact differently based on how they’re expressed and influence both disease traits and immune responses.
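The difference between an additive model and one with interactions can be sketched with two hypothetical variants, A and B. In the toy example below, all coefficients are invented for illustration and the interaction term is an assumption, not a result from the study.

```python
def additive_risk(a, b, beta_a=0.5, beta_b=0.25):
    """Additive model: each variant contributes independently."""
    return beta_a * a + beta_b * b


def interacting_risk(a, b, beta_a=0.5, beta_b=0.25, beta_ab=1.0):
    """Same model plus a hypothetical A-by-B interaction term."""
    return beta_a * a + beta_b * b + beta_ab * a * b


# Effect of carrying variant A, with and without variant B present.
additive_effect_without_b = additive_risk(1, 0) - additive_risk(0, 0)
additive_effect_with_b = additive_risk(1, 1) - additive_risk(0, 1)

interacting_effect_without_b = interacting_risk(1, 0) - interacting_risk(0, 0)
interacting_effect_with_b = interacting_risk(1, 1) - interacting_risk(0, 1)

print(additive_effect_without_b, additive_effect_with_b)
print(interacting_effect_without_b, interacting_effect_with_b)
```

In the additive model the effect of A is identical in both settings. With the interaction term, the effect of A grows when B is also present, which is the kind of dependence an additive score cannot represent.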

With GWAS sample sizes growing and PRS performance leveling off, this study leverages ML to capture deeper, non-linear relationships in massive genomic datasets.


Study Overview

Researchers applied three powerful ML models:

  • Gradient Boosting Machines (GBMs)

  • Neural Networks (NNs) with biological pathway knowledge

  • Model-based Multifactor Dimensionality Reduction (MB-MDR)

These models were tested on their ability to:

  1. Replicate known AD-related variants

  2. Discover new genetic loci missed by GWAS

  3. Accurately predict individuals at high risk for AD

They also carefully adjusted for variables like age, sex, genotyping methods, and population structure.


Key Findings

1. Replication of Known Variants

Even with a much smaller sample than large GWAS meta-analyses, the ML models identified 22% of known AD-linked variants. This suggests that flexible ML methods can detect relevant genetic signals using fewer samples.

2. Discovery of New Risk Genes

ML approaches consistently highlighted APOE, the best-known AD gene. Beyond that, the models pinpointed six previously unknown loci linked to AD. These include genes like:

  • ARHGAP25

  • LY6H (linked to neurotransmission)

  • COG7 (implicated in protein processing)

  • AP4E1 (linked to beta-amyloid buildup)

  • SOD1 (related to oxidative stress)

These findings were validated using separate datasets, showing robustness and biological relevance.

3. Prediction of Alzheimer’s Disease

All models were able to predict AD status with similar accuracy. GBMs performed best overall, especially in distinguishing between cases and controls.
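One common way to measure how well a model separates cases from controls is the area under the ROC curve (AUC): the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. This article does not say which metric the study used, so treating it as AUC is an assumption, and the scores and labels below are made up for illustration.

```python
def auc(scores, labels):
    """Pairwise AUC: fraction of case/control pairs ranked correctly (ties count half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical predicted risks and true status (1 = case, 0 = control).
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]

result = auc(scores, labels)
print(round(result, 3))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation. In this toy example, five of the six case/control pairs are ranked correctly.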

Interestingly, PRS and GBMs were strongly aligned, whereas NNs behaved differently, possibly due to how each model processes complex relationships.

Predictions were highly reproducible across different data splits, and sex imbalances in predictions were in line with dataset demographics.


How ML Compared to Traditional GWAS

Compared to 130 genes reported in earlier meta-analyses, ML models recovered 19 loci, including APOE and seven others identified by at least two of the models.

ML also uncovered signals concentrated in immune and glial cells, suggesting roles in inflammation and brain function:

  • Microglial activity

  • Beta-amyloid regulation

  • Protein glycosylation

  • Acetylcholine receptor modulation
