OmicsTweezer breakthrough: Advanced AI tool transforms cancer tissue analysis
OHSU scientists have developed OmicsTweezer, a powerful machine learning tool that uses advanced deep learning and optimal transport techniques to analyse cell type composition in human tissues. The breakthrough technology addresses long-standing batch effect challenges in cancer research, potentially improving therapeutic target identification and patient outcomes across multiple cancer types.
Zheng Xia, Ph.D., left, has developed a new tool called OmicsTweezer that uses advanced machine learning techniques to analyse large-scale biological data. Xinxing Yang, Ph.D., right, a postdoctoral scholar in Xia’s lab, is lead author of the published study. © OHSU/Christine Torres Hicks
Revolutionary approach to cellular deconvolution
Researchers at Oregon Health & Science University’s Knight Cancer Institute have created OmicsTweezer, which applies machine learning to large-scale biological data to estimate the composition of cell types in tissue samples, such as those taken from a biopsy. This allows scientists to map the cellular makeup of tumours and surrounding tissues – known as the tumour microenvironment.
The innovation addresses a critical limitation in current cancer research methodology. Traditional approaches often struggle with “batch effects” – mismatches between different types of data collected in various ways – which can make it difficult to get accurate results when comparing single-cell data with bulk tissue samples.
Overcoming technical limitations in cancer research
Senior author Dr Zheng Xia, associate professor of biomedical engineering in the OHSU School of Medicine and a member of the OHSU Knight Cancer Institute, explained: “The tumour microenvironment, made up of diverse cell types that shape tumour development and patient outcomes, has been a longstanding research priority at the Knight Cancer Institute. Our goal is to infer cell type composition using bulk data from large clinical sample sizes.”
Single-cell technologies remain expensive and technically difficult to apply to large numbers of cells within tissue samples from patients. Scientists often rely on more accessible bulk data, which averages signals from many cells, but this approach has traditionally limited researchers’ ability to understand cellular heterogeneity within tumours.
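To make the idea of bulk data as an average of many cells concrete, here is a minimal Python sketch. It assumes a hypothetical table of three cell types and four genes (all names and values are invented for illustration), builds a bulk profile as the fraction-weighted average of the cell-type profiles, and then recovers the fractions with a simple projected gradient descent. This is a generic linear deconvolution toy, not the OmicsTweezer model.

```python
# Toy illustration only: NOT the OmicsTweezer model.
# Cell-type names and signature values below are hypothetical.

# Mean expression of 4 genes in each of 3 cell types.
SIGNATURES = {
    "tumour":     [9.0, 1.0, 0.5, 2.0],
    "t_cell":     [0.5, 8.0, 1.0, 2.0],
    "fibroblast": [1.0, 0.5, 7.0, 2.0],
}
CELL_TYPES = list(SIGNATURES)
N_GENES = 4


def mix_bulk(fractions):
    """Bulk profile = fraction-weighted average of the cell-type profiles."""
    return [
        sum(fractions[ct] * SIGNATURES[ct][g] for ct in CELL_TYPES)
        for g in range(N_GENES)
    ]


def deconvolve(bulk, steps=2000, lr=0.004):
    """Estimate cell-type fractions by projected gradient descent.

    Minimises the squared error between the observed bulk profile and the
    mixture predicted from SIGNATURES, keeping every weight non-negative.
    """
    w = [1.0 / len(CELL_TYPES)] * len(CELL_TYPES)
    for _ in range(steps):
        pred = [
            sum(wk * SIGNATURES[ct][g] for wk, ct in zip(w, CELL_TYPES))
            for g in range(N_GENES)
        ]
        resid = [p - b for p, b in zip(pred, bulk)]
        grad = [
            2.0 * sum(r * s for r, s in zip(resid, SIGNATURES[ct]))
            for ct in CELL_TYPES
        ]
        # Gradient step, then clip at zero (fractions cannot be negative).
        w = [max(0.0, wk - lr * gk) for wk, gk in zip(w, grad)]
    total = sum(w)
    return {ct: wk / total for ct, wk in zip(CELL_TYPES, w)}


true_fractions = {"tumour": 0.6, "t_cell": 0.3, "fibroblast": 0.1}
bulk_profile = mix_bulk(true_fractions)
estimated = deconvolve(bulk_profile)
for ct in CELL_TYPES:
    print(f"{ct}: true={true_fractions[ct]:.2f} estimated={estimated[ct]:.2f}")
```

The recovery works cleanly here because the toy bulk profile is noise-free and generated from the same signatures used to fit it. Real bulk and single-cell data come from different platforms and batches, which is the mismatch the article goes on to describe.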
Advanced algorithmic innovation
The research team developed OmicsTweezer as a distribution-independent cell deconvolution model that integrates optimal transport with deep learning to align simulated and real data in a shared latent space, effectively mitigating data shifts and inter-omics distribution differences.
OmicsTweezer takes a more sophisticated approach than traditional tools, using deep learning – a branch of machine learning that finds non-linear patterns in complex data – and optimal transport methodology to align different types of data. Dr Xia noted: “We use optimal transport to align two different distributions – single-cell and bulk data – in the same space. In this way, we can reduce the batch effect, which has long been a challenge when working with data from different sources.”
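As a rough illustration of what aligning two distributions with optimal transport can look like, the following Python sketch computes an entropy-regularised transport plan with Sinkhorn iterations between two small sets of one-dimensional points. The point values, the function names, and the choice of the Sinkhorn algorithm are all assumptions made for this example. The study describes alignment in a learned latent space inside a deep learning model, which this toy does not attempt to reproduce.

```python
import math

# Toy illustration only: NOT the OmicsTweezer implementation.
# Point values and function names are hypothetical.


def sinkhorn_plan(source, target, eps=1.0, iters=500):
    """Entropy-regularised optimal transport plan between two point sets.

    Uses squared distance as the cost and uniform weights on both sides.
    Returns a matrix whose (i, j) entry is the mass moved from source[i]
    to target[j].
    """
    n, m = len(source), len(target)
    a = [1.0 / n] * n  # uniform mass on source points
    b = [1.0 / m] * m  # uniform mass on target points
    kernel = [[math.exp(-((x - y) ** 2) / eps) for y in target] for x in source]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def barycentric_map(plan, target):
    """Move each source point to the plan-weighted average of the targets."""
    return [sum(p * y for p, y in zip(row, target)) / sum(row) for row in plan]


# Two samples of the same underlying signal, offset by a "batch" shift.
single_cell_like = [0.0, 0.5, 1.0]
bulk_like = [2.0, 2.6, 3.1]

plan = sinkhorn_plan(single_cell_like, bulk_like)
aligned = barycentric_map(plan, bulk_like)
print("aligned source points:", [round(x, 3) for x in aligned])
```

After the mapping, the source points sit inside the range of the target points, so a model trained on one set sees inputs that resemble the other. The regularisation strength `eps` controls how sharply each source point is matched to its nearest targets.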
Validated performance across cancer types
The research team tested OmicsTweezer on both simulated datasets and real tissue samples from patients with prostate and colon cancer. The tool successfully identified subtle cell subtypes and estimated cell population changes between patient groups, which could help scientists pinpoint potential therapeutic targets.
The study was published online in Cell Genomics on 16 July 2025. Extensive evaluations on simulated and real-world datasets demonstrate OmicsTweezer’s robustness and accuracy, and applications in prostate and colon cancer showcase the tool’s ability to identify biologically meaningful cell types with clinical relevance.
The versatility of OmicsTweezer extends beyond traditional RNA analysis. As a unified deconvolution framework for multi-omics data, OmicsTweezer offers an efficient and powerful tool for studying disease microenvironments, capable of deconvolving bulk RNA sequencing, bulk proteomics, and spatial transcriptomics.
Clinical implications and future applications
Dr Xia emphasised the clinical potential: “With this tool, we can now estimate the fractions of those populations defined by single-cell data in bulk data from patient groups. That could help us understand which cell populations are changing during disease progression and guide treatment decisions.”
OmicsTweezer was developed through multidisciplinary collaboration at the OHSU Knight Cancer Institute, in partnership with Dr Lisa Coussens, Dr Gordon Mills, and the SMMART project. SMMART (Serial Measurements of Molecular and Architectural Responses to Therapy) is the flagship project of the Knight Cancer Institute’s precision oncology programme, which aims to identify new treatments that last longer and improve quality of life for patients with advanced cancer.
Transforming cancer research methodology
Dr Xia concluded: “This kind of work wouldn’t be possible without collaboration. It really reflects the strength of the team at the Knight Cancer Institute.”
The development represents a significant advancement in computational oncology, offering researchers a unified platform for analysing diverse omics data types whilst addressing fundamental technical challenges that have historically limited cancer research accuracy.
Reference
Yang, X., Zhao, F., Ren, T., et al. (2025). OmicsTweezer: A distribution-independent cell deconvolution model for multi-omics data. Cell Genomics, 5, 100950. https://doi.org/10.1016/j.xgen.2025.100950