Data enrichment

K Pro's AI toolkit enriches raw data across three stages: data generation (lab protocols and tissue sourcing), data processing (state-of-the-art, cloud-based QC and ETL pipelines), and data augmentation, where AI transforms unstructured biological data into quantified, analysis-ready biology.

Four augmentation axes are currently available, each described below.

AI cell detection: Histomics

Histomics is Owkin's AI-based digital pathology tool for cell detection and segmentation, including tumour-infiltrating lymphocytes (TILs) and tertiary lymphoid structures (TLS).

Key capabilities:

Detects 13 cell types, including understudied immune populations such as neutrophils and eosinophils
Trained across 5 cancer types, leveraging transfer learning to maximise efficiency
Achieves 24% better F1 classification of cells and 5% better detection using 5× fewer parameters
Built on 200,000 consensus annotations from 10 pathologists
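To make the F1 figures above concrete, here is a minimal sketch of how a per-cell-type F1 score and its macro average can be computed from paired annotated and predicted labels. It assumes each detected cell has already been matched to an annotated cell; the function names and the toy labels are illustrative and are not part of Histomics or of the evaluation protocol in the referenced paper.

```python
from collections import Counter


def per_class_f1(true_labels, pred_labels):
    """Return {cell_type: F1} for paired annotated/predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(true_labels, pred_labels):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # annotated class gains a false negative
    scores = {}
    for cell_type in set(true_labels) | set(pred_labels):
        denom = 2 * tp[cell_type] + fp[cell_type] + fn[cell_type]
        scores[cell_type] = 2 * tp[cell_type] / denom if denom else 0.0
    return scores


def macro_f1(true_labels, pred_labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(true_labels, pred_labels)
    return sum(scores.values()) / len(scores)


# Toy example (hypothetical labels, not real annotations)
annotated = ["lymphocyte", "lymphocyte", "neutrophil", "eosinophil"]
predicted = ["lymphocyte", "neutrophil", "neutrophil", "eosinophil"]
print(per_class_f1(annotated, predicted))
print(macro_f1(annotated, predicted))
```

A macro average weights rare populations such as eosinophils equally with abundant ones, which is one reason it is often preferred when class frequencies are very unbalanced.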

Reference: Adjadj et al., arXiv 2025

AI spatial prediction

K Pro can predict gene expression at each spatial spot of a spatial transcriptomics cohort using the associated H&E tile, enabling near single-cell resolution through model distillation.

The model uses a spatial neighbourhood attention architecture (multi-head attention over tile embeddings from neighbouring spots), and was benchmarked on the HEST dataset:

| Feature extractor | Training data | HEST Average (Pearson) |
| --- | --- | --- |
| Baseline iBOT | FFCD | 0.246 |
| | FFCD | 0.286 |
| H0-mini | FFCD | 0.344 |
| H0-mini | MOSAIC | 0.381 |
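For readers unfamiliar with the metric, the sketch below shows one way an average Pearson score can be computed: a Pearson correlation per gene between measured and predicted expression across spots, then a mean over genes. This is an assumption about the general shape of the metric; the exact HEST protocol (gene selection, per-task averaging) is not described here, and the gene names and values are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here for constant vectors
    return cov / math.sqrt(var_x * var_y)


def mean_gene_pearson(measured, predicted):
    """Average the per-gene Pearson correlation over all genes.

    Both arguments map gene name -> list of per-spot expression values.
    """
    scores = [pearson(measured[g], predicted[g]) for g in measured]
    return sum(scores) / len(scores)


# Toy example with two hypothetical genes over four spots
measured = {"GENE_A": [1, 2, 3, 4], "GENE_B": [4, 3, 2, 1]}
predicted = {"GENE_A": [2, 4, 6, 8], "GENE_B": [1, 2, 3, 4]}
print(mean_gene_pearson(measured, predicted))
```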

Reference: Schmauch et al., arXiv 2024
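The neighbourhood attention idea described in this section can be illustrated with a small, dependency-free sketch: the centre spot's tile embedding attends over itself and the embeddings of its neighbouring spots, with the embedding split across several heads. This is not K Pro's model. Identity projections stand in for the learned query, key and value weights, and the function name, head count and embedding values are assumptions made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def neighbourhood_attention(centre, neighbours, num_heads=2):
    """Multi-head scaled dot-product attention from the centre tile
    embedding over itself and its neighbours.

    Assumption: identity Q/K/V projections replace learned weights.
    """
    dim = len(centre)
    assert dim % num_heads == 0, "embedding size must divide across heads"
    head_dim = dim // num_heads
    tokens = [centre] + list(neighbours)
    output = []
    for head in range(num_heads):
        lo, hi = head * head_dim, (head + 1) * head_dim
        query = centre[lo:hi]
        keys = [tok[lo:hi] for tok in tokens]  # keys double as values
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(head_dim)
            for key in keys
        ]
        weights = softmax(scores)
        for d in range(head_dim):
            output.append(sum(w * key[d] for w, key in zip(weights, keys)))
    return output  # heads concatenated, same length as the input embedding


# Toy example: a 4-dimensional embedding with three neighbouring spots
centre = [1.0, 0.0, 0.5, 2.0]
neighbours = [[0.9, 0.1, 0.4, 1.8], [0.0, 1.0, 2.0, 0.1], [1.1, 0.0, 0.6, 2.1]]
print(neighbourhood_attention(centre, neighbours))
```

In a full model the attended embedding would typically feed a regression head that outputs one value per gene; that step is omitted here.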

AI enhanced resolution: Deconvolution

K Pro applies deconvolution algorithms to increase the resolution of Visium spatial transcriptomics data down to single-cell level, leveraging paired modalities (H&E + scRNA-seq + spatial).

Two outputs are supported:

Spot-level cell type deconvolution: answers specific tumour microenvironment (TME) questions by identifying dominant cell types per spot
Spatialization of tumour transcriptomic clusters: maps distinct tumour areas by learning cell signatures from single-cell RNA-seq on paired samples within the same cohort
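As an illustration of the first output, the sketch below picks the dominant cell type per spot from a table of estimated cell-type proportions. The input layout, the spot identifiers, the cell-type names and the optional `min_fraction` threshold are all assumptions made for the example; they do not describe K Pro's actual output format.

```python
def dominant_cell_types(proportions, min_fraction=0.0):
    """Return {spot_id: dominant cell type} from per-spot proportions.

    `proportions` maps spot_id -> {cell_type: estimated fraction}.
    Spots whose top fraction is below `min_fraction` map to None.
    """
    dominant = {}
    for spot, fractions in proportions.items():
        best = max(fractions, key=fractions.get)
        dominant[spot] = best if fractions[best] >= min_fraction else None
    return dominant


# Toy example with hypothetical spots and cell types
estimated = {
    "spot_1": {"tumour": 0.6, "T cell": 0.3, "fibroblast": 0.1},
    "spot_2": {"tumour": 0.2, "T cell": 0.5, "fibroblast": 0.3},
    "spot_3": {"tumour": 0.35, "T cell": 0.33, "fibroblast": 0.32},
}
print(dominant_cell_types(estimated))
print(dominant_cell_types(estimated, min_fraction=0.4))
```

A threshold of this kind is one simple way to avoid labelling a spot with a "dominant" type when the mixture is close to even.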

For reference-free deconvolution, K Pro uses MixUpVI, a joint probabilistic model of pseudobulk and single-cell transcriptomics that estimates cell-type proportions without requiring a reference. MixUpVI was published at ICML 2025 (Grouard, Ouardini, Rodriguez, Vert, Espin-Perez).

AI cell-cell communication

K Pro models local ligand-receptor interactions (LRIs) from spatial data, without relying on a reference dataset. Using prior-knowledge tables of ligand-receptor pairs, the pipeline computes LRI values under three diffusion modes: cell contact (no diffusion), secreted signalling (one-neighbour diffusion), and hormone signalling (two-neighbour diffusion).
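The three diffusion modes can be pictured with a toy scoring scheme on a graph of neighbouring spots. In the sketch below, the ligand signal at a spot is the mean ligand expression over all spots within 0, 1 or 2 neighbour hops, and the score is that signal multiplied by the receptor expression at the spot. This scoring formula is an assumption chosen for illustration; the actual LRI computation used by the pipeline is not specified here, and all names and values are hypothetical.

```python
# Number of neighbour hops the ligand may travel in each mode
DIFFUSION_HOPS = {"cell_contact": 0, "secreted": 1, "hormone": 2}


def within_hops(adjacency, start, hops):
    """Spots reachable from `start` in at most `hops` steps (start included)."""
    reached = {start}
    frontier = {start}
    for _ in range(hops):
        frontier = {n for s in frontier for n in adjacency[s]} - reached
        reached |= frontier
    return reached


def lri_scores(adjacency, ligand, receptor, mode):
    """Toy LRI score per spot: diffused ligand signal times local receptor.

    Assumption: the diffused signal is the mean ligand expression over
    the spots within the hop radius of the chosen mode.
    """
    hops = DIFFUSION_HOPS[mode]
    scores = {}
    for spot in adjacency:
        area = within_hops(adjacency, spot, hops)
        diffused = sum(ligand[s] for s in area) / len(area)
        scores[spot] = diffused * receptor[spot]
    return scores


# Toy example: three spots in a line, ligand at one end, receptor at the other
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
ligand = {"a": 3.0, "b": 0.0, "c": 0.0}
receptor = {"a": 0.0, "b": 0.0, "c": 2.0}
for mode in DIFFUSION_HOPS:
    print(mode, lri_scores(adjacency, ligand, receptor, mode))
```

In this example only the two-hop mode yields a non-zero score at spot `c`, because the ligand source is two neighbours away from the receptor.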

Outputs include:

Ligand expression by cell type (dot plot per programme)
Cellular communication network (chord diagram of sender/receiver cell types)
Spatial map of LRI values overlaid on the tissue slide
