Abstract
Background: Interpreting variant effects is essential for precision medicine. Large Transformer-based genomic language models (DNABERT-2, Nucleotide Transformer) capture patterns in coding DNA but scale poorly for non-coding variant prediction because attention complexity grows quadratically with sequence length. Evidence from natural language processing shows that pruning less informative layers can reduce model size and computational load without sacrificing accuracy.
Methods: We systematically ablated each Transformer layer in DNABERT-2 and the Nucleotide Transformer to assess its contribution to variant prediction. By observing the resulting changes in performance, we built layer importance profiles and created pruned models by removing redundant layers. Pruned and full models were fine-tuned with identical hyperparameters on the Enformer eQTL causal variant dataset, a curated benchmark for non-coding variant effect prediction.
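A minimal sketch of the layer-ablation and pruning procedure described above is given below for a BERT-style encoder. The checkpoint name, the attribute path to the encoder blocks (`model.bert.encoder.layer`), and the `evaluate` callable are illustrative assumptions, not the authors' exact pipeline.

```python
# Sketch of layer-wise ablation and pruning for a BERT-style genomic LM.
# Attribute paths and the evaluation callable are assumptions for illustration.
import copy

from transformers import AutoModelForSequenceClassification


def encoder_layers(model):
    # Standard Hugging Face BERT models expose their Transformer blocks here;
    # other architectures (e.g. Nucleotide Transformer / ESM) use different paths.
    return model.bert.encoder.layer


def drop_layer(model, idx):
    """Return a deep copy of `model` with Transformer layer `idx` removed."""
    pruned = copy.deepcopy(model)
    layers = encoder_layers(pruned)
    del layers[idx]                                  # nn.ModuleList supports deletion
    pruned.config.num_hidden_layers = len(layers)
    return pruned


def layer_importance_profile(model, evaluate):
    """Ablate one layer at a time and record the drop in a validation metric.

    `evaluate` maps a model to a scalar score (e.g. AUROC on held-out eQTL variants).
    """
    baseline = evaluate(model)
    n = model.config.num_hidden_layers
    return {i: baseline - evaluate(drop_layer(model, i)) for i in range(n)}


def prune_least_important(model, profile, n_remove):
    """Remove the `n_remove` layers whose ablation hurt performance the least."""
    to_drop = sorted(profile, key=profile.get)[:n_remove]
    pruned = copy.deepcopy(model)
    layers = encoder_layers(pruned)
    for i in sorted(to_drop, reverse=True):          # delete back-to-front so indices stay valid
        del layers[i]
    pruned.config.num_hidden_layers = len(layers)
    return pruned


# Illustrative usage (checkpoint name is a placeholder assumption):
# model = AutoModelForSequenceClassification.from_pretrained(
#     "zhihan1996/DNABERT-2-117M", num_labels=2, trust_remote_code=True)
# profile = layer_importance_profile(model, my_eval_fn)
# small_model = prune_least_important(model, profile, n_remove=4)
```

The pruned model returned by `prune_least_important` can then be fine-tuned with the same hyperparameters as the full model, so that any performance difference reflects the removed layers rather than the training setup.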
Results: Layer ablation revealed that the importance of individual layers varies widely across models; some layers can be removed with little loss in performance while others are critical. After fine-tuning, pruned models achieved accuracy and area under the ROC curve comparable to the full models, while requiring substantially less training time and memory.
Conclusions: Layer-wise pruning provides a principled strategy for developing compact genomic LLMs. By identifying and removing less critical layers, we produced leaner models that preserve predictive power while lowering computational demands. These efficient models demonstrate how insights from general LLM research can advance genomic variant interpretation and make large-scale non-coding analysis more accessible in research and clinical settings. This approach complements ongoing efforts to optimise Transformer architectures for genomic data.
| Original language | English |
|---|---|
| Article number | 1358 |
| Journal | Genes |
| Volume | 16 |
| Issue number | 11 |
| DOIs | |
| Publication status | Published - 10 Nov 2025 |
Projects
- 1 Active
- Natural Language Processing for non-coding Variant Effect Prediction
  Rahman, F. (PI), Nebel, J.-C. (CoI) & Hegde, M. (Researcher)
  4/12/23 → …
  Project: Research