Data enrichment
K Pro's AI toolkit enriches raw data across three stages: data generation (lab protocols and tissue sourcing), data processing (state-of-the-art cloud-based QC and ETL pipelines), and data augmentation, where AI transforms unstructured biological data into quantified, analysis-ready data.
Four augmentation axes are currently available, each described below.
AI cell detection: Histomics
Histomics is Owkin's AI-based digital pathology tool for cell detection and segmentation, including tumour-infiltrating lymphocytes (TILs) and tertiary lymphoid structures (TLS).
Key capabilities:
Detects 13 cell types, including understudied immune populations such as neutrophils and eosinophils
Trained across 5 cancer types, leveraging transfer learning to maximise efficiency
Achieves 24% better F1 classification of cells and 5% better detection using 5× fewer parameters
Built on 200,000 consensus annotations from 10 pathologists
Reference: Adjadj et al. arXiv 2025
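To make the F1 figure above concrete, here is a minimal sketch of per-class and macro-averaged F1 for cell classification. The labels are toy values, and macro averaging is an assumption: the page does not state how the reported F1 was aggregated. The function name is hypothetical.

```python
def f1_per_class(true_labels, predicted_labels):
    """Return {cell_type: F1} using F1 = 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(true_labels, predicted_labels))
    scores = {}
    for cell_type in sorted(set(true_labels) | set(predicted_labels)):
        tp = sum(t == cell_type and p == cell_type for t, p in pairs)
        fp = sum(t != cell_type and p == cell_type for t, p in pairs)
        fn = sum(t == cell_type and p != cell_type for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[cell_type] = 2 * tp / denom if denom else 0.0
    return scores


# Toy annotations (illustrative only, not Histomics data).
true_labels = ["lymphocyte", "lymphocyte", "neutrophil", "eosinophil"]
predicted_labels = ["lymphocyte", "neutrophil", "neutrophil", "eosinophil"]

per_class = f1_per_class(true_labels, predicted_labels)
# Assumption: macro average, i.e. every cell type counts equally.
macro_f1 = sum(per_class.values()) / len(per_class)
print(per_class, round(macro_f1, 3))
```

Macro averaging gives rare populations such as eosinophils the same weight as abundant ones, which is one reason it is often preferred when cell types are imbalanced.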
AI spatial prediction
K Pro can predict gene expression at each spatial spot of a spatial transcriptomics cohort using the associated H&E tile, enabling near single-cell resolution through model distillation.
The model uses a spatial neighbourhood attention architecture (multi-head attention over tile embeddings from neighbouring spots), and was benchmarked on the HEST dataset:
Model          Dataset  Score
Baseline iBOT  FFCD     0.246
H0             FFCD     0.286
H0-mini        FFCD     0.344
H0-mini        MOSAIC   0.381
Reference: Schmauch et al. arXiv 2024
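The neighbourhood attention idea described above can be sketched in a few lines. This is a simplified illustration, not the K Pro model. It assumes identity query/key/value projections (a trained model would learn these and add a regression head that outputs gene expression), and the function name and toy embeddings are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def neighbourhood_attention(center, neighbours, num_heads):
    """Mix a spot's tile embedding with those of its neighbouring spots.

    Each head attends over its own slice of the embedding, using the
    centre spot as the query (scaled dot-product attention).
    Assumption: no learned projections, to keep the sketch short.
    """
    dim = len(center)
    if dim % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    head_dim = dim // num_heads
    tokens = [center] + neighbours  # the centre spot also attends to itself
    context = []
    for head in range(num_heads):
        lo, hi = head * head_dim, (head + 1) * head_dim
        query = center[lo:hi]
        scores = [
            sum(q * k for q, k in zip(query, token[lo:hi])) / math.sqrt(head_dim)
            for token in tokens
        ]
        weights = softmax(scores)
        for d in range(lo, hi):
            context.append(sum(w * token[d] for w, token in zip(weights, tokens)))
    return context


# Toy 4-dimensional tile embeddings for one spot and two neighbours.
center = [1.0, 0.0, 0.5, 0.5]
neighbours = [[0.8, 0.1, 0.4, 0.6], [0.0, 1.0, 0.5, 0.5]]
context = neighbourhood_attention(center, neighbours, num_heads=2)
print(context)
```

The output is a context-aware embedding of the same size as the input. Each coordinate is a weighted average over the spot and its neighbours, with neighbours that resemble the centre tile receiving more weight.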
AI enhanced resolution: Deconvolution
K Pro applies deconvolution algorithms to increase the resolution of Visium spatial transcriptomics data down to single-cell level, leveraging paired modalities (H&E + scRNA-seq + spatial).
Two outputs are supported:
Spot-level cell type deconvolution: answers specific tumour microenvironment (TME) questions by identifying dominant cell types per spot
Spatialization of tumour transcriptomic clusters: maps distinct tumour areas by learning cell signatures from single-cell RNA-seq on paired samples within the same cohort
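Spot-level deconvolution against known cell signatures can be illustrated with a non-negative least-squares fit. This is a generic swapped-in technique chosen for brevity, not the algorithm K Pro runs. The signatures, gene count, step size, and function name are all hypothetical and tuned to the toy data.

```python
def deconvolve_spot(spot, signatures, steps=500, lr=0.1):
    """Estimate cell-type proportions for one spot.

    spot:       expression values, one per gene
    signatures: {cell_type: expression values, one per gene}
    Solves min ||sum_k w_k * signature_k - spot||^2 with w_k >= 0 by
    projected gradient descent, then normalises weights to proportions.
    """
    cell_types = list(signatures)
    n_genes = len(spot)
    weights = [1.0 / len(cell_types)] * len(cell_types)
    for _ in range(steps):
        predicted = [
            sum(w * signatures[ct][g] for w, ct in zip(weights, cell_types))
            for g in range(n_genes)
        ]
        residual = [predicted[g] - spot[g] for g in range(n_genes)]
        for k, ct in enumerate(cell_types):
            gradient = sum(residual[g] * signatures[ct][g] for g in range(n_genes))
            # Projection step: clip at zero to keep weights non-negative.
            weights[k] = max(0.0, weights[k] - lr * gradient)
    total = sum(weights) or 1.0
    return {ct: w / total for ct, w in zip(cell_types, weights)}


# Toy signatures over three genes (illustrative values only).
signatures = {
    "tumour": [1.0, 0.0, 0.2],
    "t_cell": [0.0, 1.0, 0.1],
    "fibroblast": [0.1, 0.1, 1.0],
}
# A synthetic spot mixed as 70% tumour and 30% T cell.
spot = [0.7, 0.3, 0.17]

proportions = deconvolve_spot(spot, signatures)
dominant = max(proportions, key=proportions.get)
print(proportions, dominant)
```

Taking the cell type with the largest proportion gives the dominant cell type per spot. In practice the signatures would be learned from paired single-cell RNA-seq rather than written by hand.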
For reference-free deconvolution, K Pro uses MixUpVI, a joint probabilistic model of pseudobulk and single-cell transcriptomics that estimates cell-type proportions. MixUpVI was published at ICML 2025 (Grouard, Ouardini, Rodriguez, Vert, Espin-Perez).
AI cell-cell communication
K Pro models local ligand-receptor (LR) interactions using spatial data, without relying on a reference dataset. Using prior-knowledge tables of ligand-receptor pairs, the pipeline computes ligand-receptor interaction (LRI) values under three diffusion modes: cell contact (no diffusion), secreted signalling (one-neighbour diffusion), and hormone signalling (two-neighbour diffusion).
Outputs include:
Ligand expression by cell type (dot plot per programme)
Cellular communication network (chord diagram of sender/receiver cell types)
Spatial map of LRI values overlaid on the tissue slide
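The three diffusion modes can be pictured as different neighbourhood radii on the spot graph. The sketch below is one hypothetical way to score a single ligand-receptor pair, not the K Pro pipeline. It averages ligand expression over the reachable neighbourhood and multiplies by the local receptor expression. Both the scoring formula and all names are assumptions made for illustration.

```python
# Number of neighbour hops a ligand may travel in each mode (from the text).
DIFFUSION_HOPS = {"cell_contact": 0, "secreted": 1, "hormone": 2}


def k_hop_neighbourhood(adjacency, start, hops):
    """Return the spots within `hops` edges of `start`, including `start`."""
    seen = {start}
    frontier = {start}
    for _ in range(hops):
        frontier = {n for spot in frontier for n in adjacency[spot]} - seen
        seen |= frontier
    return seen


def lri_scores(adjacency, ligand, receptor, mode):
    """Score one ligand-receptor pair at every spot.

    Assumed formula: mean ligand expression over the spots within reach,
    multiplied by receptor expression at the receiving spot.
    """
    hops = DIFFUSION_HOPS[mode]
    scores = {}
    for spot in adjacency:
        reach = k_hop_neighbourhood(adjacency, spot, hops)
        ligand_signal = sum(ligand[s] for s in reach) / len(reach)
        scores[spot] = ligand_signal * receptor[spot]
    return scores


# Four spots in a row: a - b - c - d (toy data).
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
ligand = {"a": 4.0, "b": 0.0, "c": 0.0, "d": 0.0}
receptor = {"a": 0.0, "b": 1.0, "c": 2.0, "d": 1.0}

contact = lri_scores(adjacency, ligand, receptor, "cell_contact")
secreted = lri_scores(adjacency, ligand, receptor, "secreted")
hormone = lri_scores(adjacency, ligand, receptor, "hormone")
print(contact["c"], secreted["c"], hormone["c"])
```

In this toy layout, spot c only receives signal from the ligand at spot a in the hormone mode, because a is two hops away. The per-spot scores are what a spatial LRI map would overlay on the tissue slide.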