Glossary

This glossary provides definitions for key terms used throughout the K Pro documentation. It is non-exhaustive; terms are listed alphabetically within thematic groups for ease of navigation.

Platform & Product Terms

Agentic AI An AI system capable of autonomously planning and executing multi-step tasks by selecting and coordinating specialized tools and agents in response to a user's intent. K Pro uses an agentic AI architecture to orchestrate complex biomedical analyses.

Agentic Space A grouping of K Pro agents designed to address a specific phase of the research or drug development pipeline. The three agentic spaces are Analyze, Activate, and Amplify, each available from a specific service tier.

K Agents Specialized AI modules within K Pro, each designed to perform a specific class of research task (e.g., literature review, multiomics analysis, clinical data exploration). Agents are orchestrated by K Pro's underlying system to collaborate in response to user queries.

K Pro Owkin's agentic AI platform for biomedical research. K Pro provides access to curated multimodal datasets, specialized AI agents, and visualization tools through a natural language interface. Available in four tiers: Free, Light, Standard, and Premium.

K Pro Free The no-cost entry tier of K Pro, providing individual access to core agentic capabilities and public datasets including the MOSAIC Window. Does not include data upload, collaboration, or enterprise features.

MCP (Model Context Protocol) An open protocol that enables AI assistants such as Claude to connect to external tools and data sources through a standardized interface. Owkin's MCP integration exposes the Pathology Explorer toolset to Claude.ai and Claude Desktop users.

MCP Connector The configuration entry point used to link Claude.ai or Claude Desktop to Owkin's MCP server at https://mcp.k.owkin.com/mcp. Once connected, users can invoke Pathology Explorer tools directly from their Claude interface.

Orchestration The process by which K Pro's underlying system interprets a user's intent, selects the appropriate agents and tools, sequences their execution, and consolidates the results into a coherent response.

Pathology Explorer An AI-powered tool available through K Pro and as an MCP integration, designed to transform H&E whole-slide images into granular, queryable insights. Trained on 200,000+ annotations, it detects and classifies 6 distinct cell types and supports spatial analysis of the tumor microenvironment.

Data & Datasets

Bring Your Own Data (BYOD) A K Pro capability (available from Light tier) allowing users to upload their own proprietary datasets for analysis within the platform. Data must conform to K Pro's supported modalities and file format specifications.

Bulk RNA-seq (bRNAseq) Bulk RNA sequencing. A method to assess the overall transcriptome of a tissue sample by sequencing RNA from a mixture of cells. Supported formats in K Pro: .txt, .tsv, .csv, .h5ad.

H&E (Haematoxylin and Eosin) A standard histological staining technique applied to whole-slide tissue images. H&E staining highlights cell nuclei and cytoplasm and is the primary imaging modality used by Pathology Explorer.

MOSAIC (Multi Omics Spatial Atlas In Cancer) A landmark multi-institutional dataset developed by Owkin in collaboration with academic centers (University of Pittsburgh, Gustave Roussy, Lausanne University Hospital, Erlangen University Hospital, and Charité Berlin). It contains spatial transcriptomics, single-nucleus RNA-seq, bulk RNA-seq, WES, H&E, and clinical data across 9 cancer types and 2,200+ patients.

MOSAIC Window A curated, publicly accessible subset of the MOSAIC dataset, included in all K Pro tiers. Contains multimodal data from 60 patients across five cancer types: BLCA, OV, GBM, DLBCL, and MESO.

Multimodal Data Data combining multiple biological measurement types (modalities) from the same patient or sample, such as clinical records, genomics, transcriptomics, histology, and spatial omics.

Single-cell RNA-seq (scRNA-seq) Single-cell RNA sequencing. A technique that measures gene expression at the level of individual cells, enabling identification of distinct cell populations and states within a tissue. Supported formats in K Pro: .mtx, .h5, .h5ad, .rds.

Spatial Transcriptomics (ST) A technique that measures gene expression while preserving the spatial location of cells or spots within a tissue section. K Pro supports the Visium CytAssist protocol from 10x Genomics. Supported formats in K Pro: .mtx, .h5, .h5ad, .rds.

TCGA (The Cancer Genome Atlas) A publicly funded, comprehensive cancer genomics resource comprising data from 20,000+ primary cancer samples across 33 cancer types. Available in all K Pro tiers.

TME (Tumor Microenvironment) The cellular and molecular environment surrounding a tumor, including immune cells, stromal cells, blood vessels, and signaling molecules. Characterizing the TME is a key use case for Pathology Explorer and the MOSAIC dataset.

WES (Whole Exome Sequencing) A sequencing technique that targets the protein-coding regions (exons) of the genome. The exome makes up approximately 1–2% of the total genome but harbors the majority of known disease-causing mutations. Supported format in K Pro: .vcf.

AI & Technical Terms

AI-Readiness Maturity Model Owkin's 6-level framework (0–5) for assessing how well a dataset is prepared for use with K Pro. Levels range from uncontrolled data (Level 0) to fully traceable and AI/ML-optimized datasets with reproducibility standards (Level 5).

Embedding / Latent Representation A compressed, numerical representation of high-dimensional biological data (e.g., gene expression profiles) produced by machine learning models. Used in K Pro for dimensionality reduction plots (PCA, UMAP, t-SNE).

Foundation Model A large-scale AI model trained on broad data that can be adapted to many downstream tasks. Owkin uses foundation models for self-supervised feature extraction from whole-slide images in the data ingestion pipeline.

Hallucination In the context of LLMs, the generation of plausible-sounding but factually incorrect or fabricated information. K Pro implements monitoring (Tool Call Accuracy) and technical guardrails (PubMed ID verification via RAG) to detect and reduce hallucinations.

Harmonization The process of standardizing and normalizing heterogeneous datasets — across modalities, formats, and sources — so they can be queried consistently by K Pro's agents and tools.

LLM (Large Language Model) A deep learning model trained on large text corpora, capable of understanding and generating natural language. K Pro is built on top of LLMs, which interpret user queries and coordinate agent responses.

RAG (Retrieval-Augmented Generation) A technique that combines a generative LLM with a retrieval system to ground responses in factual, source-verified content. K Pro uses RAG over 22M+ PubMed abstracts to ensure literature citations are valid and relevant.
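
As an illustration of the citation-grounding idea, the sketch below checks that every PubMed ID cited in a generated answer also appears among the records the retrieval step returned. It is a minimal sketch and not K Pro's implementation: the function name, the "PMID: 12345" citation format, and the placeholder IDs are all assumptions made for this example.

```python
import re

# Assumed citation format for this sketch: "PMID: 12345678".
PMID_PATTERN = re.compile(r"PMID:\s*(\d+)")


def unverified_pmids(answer: str, retrieved_pmids: set[str]) -> list[str]:
    """Return cited PMIDs that are absent from the retrieved set."""
    cited = set(PMID_PATTERN.findall(answer))
    return sorted(cited - retrieved_pmids, key=int)


# Placeholder IDs, not real citations.
retrieved = {"11111111", "22222222"}
answer = "Gene X is linked to outcome Y (PMID: 11111111; PMID: 33333333)."
print(unverified_pmids(answer, retrieved))  # ['33333333']
```

A non-empty result flags citations that the generator produced without retrieval support, which is the kind of case a guardrail would block or send for review.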

TCA (Tool Call Accuracy) An internal monitoring metric used by Owkin to measure the proportion of agent interactions in which the correct tool is identified and called with appropriate parameters. TCA is used to detect systematic agent errors.
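
The definition above describes a proportion, which can be sketched as follows. The record layout (expected tool, called tool, a flag for appropriate parameters) and the tool names are hypothetical; how Owkin actually logs and judges interactions is not described in this glossary.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """One logged agent interaction (hypothetical record layout)."""
    expected_tool: str
    called_tool: str
    params_ok: bool  # True if the parameters were judged appropriate


def tool_call_accuracy(calls: list[ToolCall]) -> float:
    """Fraction of interactions with the correct tool and appropriate parameters."""
    if not calls:
        raise ValueError("no interactions to score")
    correct = sum(
        1 for c in calls if c.called_tool == c.expected_tool and c.params_ok
    )
    return correct / len(calls)


calls = [
    ToolCall("literature_search", "literature_search", True),
    ToolCall("survival_analysis", "survival_analysis", False),  # right tool, bad parameters
    ToolCall("survival_analysis", "literature_search", True),   # wrong tool
    ToolCall("cell_detection", "cell_detection", True),
]
print(tool_call_accuracy(calls))  # 0.5
```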

Visualization Terms

Kaplan-Meier Plot A statistical visualization of time-to-event data (e.g., overall survival) that shows the estimated survival probability over time for one or more patient groups. Typically includes log-rank test p-values and hazard ratios.
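
For readers who want to see where the curve comes from, the following is a minimal sketch of the standard Kaplan-Meier product-limit estimator, S(t) = product over event times t_i <= t of (1 - d_i / n_i), where d_i is the number of events at t_i and n_i the number of patients still at risk. It is a generic textbook implementation written for this glossary, not the code K Pro runs, and the toy data are invented.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  -- follow-up time per patient
    events -- 1 if the event was observed, 0 if the patient was censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]  # remove events and censored patients after time t
    return curve


# Toy cohort: five patients, two of them censored (event flag 0).
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print([(t, round(s, 3)) for t, s in curve])  # [(2, 0.8), (3, 0.6), (5, 0.3)]
```

Censored patients never lower the curve themselves; they only shrink the risk set used at later event times.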

Oncoprint A matrix visualization displaying the pattern of genetic alterations (mutations, copy number variants, etc.) across a set of patients and genes. Useful for identifying co-occurring or mutually exclusive alterations.

UMAP (Uniform Manifold Approximation and Projection) A dimensionality reduction algorithm used to project high-dimensional biological data (e.g., single-cell gene expression) into 2D or 3D for visual exploration. K Pro also offers PCA and t-SNE for the same purpose.

Volcano Plot A scatter plot used to visualize differential gene expression, with statistical significance (typically −log10 of the p-value) on the y-axis and effect size (typically log2 fold-change) on the x-axis. Highlights genes that are both statistically significant and show a large change in expression.
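
The sketch below shows how a single gene could be placed and labeled on such a plot. It assumes the common conventions of log2 fold-change on the x-axis and -log10 of the p-value on the y-axis; the cutoffs (|log2 FC| >= 1, p < 0.05) are widely used defaults chosen for illustration, not K Pro settings, and the function name is hypothetical.

```python
import math


def classify_gene(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Return (x, y, label) for one gene on a volcano plot."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in (0, 1]")
    y = -math.log10(p_value)
    significant = p_value < p_cutoff and abs(log2_fc) >= fc_cutoff
    if not significant:
        label = "not significant"
    elif log2_fc > 0:
        label = "up"
    else:
        label = "down"
    return log2_fc, y, label


# Invented values for three hypothetical genes.
for name, fc, p in [("GENE_A", 2.5, 0.001), ("GENE_B", -1.8, 0.01), ("GENE_C", 0.3, 0.0001)]:
    print(name, classify_gene(fc, p)[2])
```

GENE_C illustrates why both axes matter: its p-value is very small, but the effect size is below the cutoff, so it is not highlighted.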

WSI (Whole-Slide Image) A high-resolution digital scan of a complete tissue section on a glass slide. Supported formats in K Pro: .tif, .tiff, .svs, .dcm, .ndpi, .mrxs.
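
The supported extensions listed in this glossary can be collected into a simple lookup, for example to pre-check files before an upload. This is a convenience sketch only: the modality keys and the function name are invented here, and the platform's own validation may apply further rules (content checks, compressed files, size limits) that are not covered.

```python
from pathlib import Path

# Extensions as listed in this glossary; the dictionary keys are hypothetical labels.
SUPPORTED_FORMATS = {
    "bulk_rnaseq": {".txt", ".tsv", ".csv", ".h5ad"},
    "scrnaseq": {".mtx", ".h5", ".h5ad", ".rds"},
    "spatial_transcriptomics": {".mtx", ".h5", ".h5ad", ".rds"},
    "wes": {".vcf"},
    "wsi": {".tif", ".tiff", ".svs", ".dcm", ".ndpi", ".mrxs"},
}


def candidate_modalities(filename: str) -> list[str]:
    """Return the modalities whose listed extensions match the file name."""
    ext = Path(filename).suffix.lower()  # only the final suffix is inspected
    return sorted(m for m, exts in SUPPORTED_FORMATS.items() if ext in exts)


print(candidate_modalities("sample.h5ad"))  # ['bulk_rnaseq', 'scrnaseq', 'spatial_transcriptomics']
print(candidate_modalities("slide.SVS"))    # ['wsi']
print(candidate_modalities("notes.pdf"))    # []
```

Because several modalities share extensions such as .h5ad, the extension alone narrows the options but does not identify the modality.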

Security & Compliance Terms

GDPR (General Data Protection Regulation) The European Union's primary data protection regulation, governing the collection, storage, processing, and transfer of personal data. K Pro is GDPR compliant for EU and UK users.

HIPAA (Health Insurance Portability and Accountability Act) A US federal law establishing standards for the protection of health information. K Pro's security architecture is designed to support HIPAA compliance requirements.

IAM (Identity and Access Management) The framework of policies and technologies used to control user access to systems and data. K Pro uses IAM to enforce customer-level data segregation and role-based permissions.

ISO 27001:2022 The international standard for information security management systems (ISMS). Owkin is certified to ISO 27001:2022, demonstrating systematic controls for protecting data confidentiality, integrity, and availability.

ISO 13485:2016 The international standard for quality management systems in medical device development. Owkin holds this certification, applicable to its AI model development practices.

RBAC (Role-Based Access Control) A security model in which system access is granted based on a user's role within an organization. Available in the Premium tier of K Pro.
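
The core idea of RBAC, that permissions attach to roles rather than to individual users, can be sketched in a few lines. The role and action names below are invented for illustration and do not reflect K Pro's actual roles or permission model.

```python
# Hypothetical roles and actions, for illustration only.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "analyst": {"read", "run_analysis"},
    "admin": {"read", "run_analysis", "upload_data", "manage_users"},
}


def is_allowed(role: str, action: str) -> bool:
    """Grant an action only if the user's role includes it; unknown roles get nothing."""
    return action in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("analyst", "run_analysis"))  # True
print(is_allowed("viewer", "upload_data"))    # False
```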

SSO (Single Sign-On) An authentication scheme that allows users to log in once and access multiple applications without re-entering credentials. K Pro supports SSO integration in the Premium tier.

Note: This glossary is maintained as a living reference. If you encounter a term not defined here, or believe a definition requires updating, please contact [email protected].
