
# Browse the dataset catalog

K Pro comes with a curated set of datasets ready to use from day one, and gives you the ability to discover and request access to additional datasets from the Owkin catalog. This section describes what is available out of the box in K Pro Free, and how to explore further datasets that can be integrated into your project.

Owkin's data coverage is designed around **depth and multimodality** while maximizing breadth across all domains.

By default, K Pro comes with a **foundation of public datasets that can support your research**: TCGA, GTEx, CellxGene, CPTAC and Mosaic Window (the free version of our flagship MOSAIC dataset). Other public datasets can either be uploaded into the product or integrated at your request, depending on the volume and complexity of the dataset.

Beyond these public datasets, our readily available and sublicensable data products are currently concentrated in **oncology** (11 indications, including NSCLC, breast, ovarian, DLBCL, bladder, GBM, pancreatic, head & neck, and mesothelioma), where we offer one of the most comprehensive multimodal patient-level dataset catalogs available for licensing. This focus reflects deliberate curation rather than a gap, and **ensures high data quality, rich annotation, and clinical-grade metadata across these indications**.

Beyond our core catalog, **our data sourcing offering**, based on a network of 2.5M+ patient data points, **extends coverage to additional oncology indications and key therapy areas**, including Inflammation & Immunology (e.g. IBD, SLE, RA), Neurology (Alzheimer's disease), and CVRM.

Regarding **data recency**, the majority of our datasets include patients enrolled from 2012 onwards, reflecting the period of most significant advances in molecular profiling and digital pathology. Our network infrastructure can support active refresh cycles or *de novo* access through our data sourcing offering, including data collected through the current year for select partners and indications.

Geographically, our **proprietary data products draw from both US and EU cohorts**, providing transatlantic representation that supports regulatory-relevant diversity. Our sourcing network extends to the APAC region for targeted data acquisition when geographic diversity is a project requirement.

{% content-ref url="/pages/2uwOqESrXC0iFDj8MSGN" %}
[Datasets available in K Pro Free](/explore-and-analyse-data/data-catalog/datasets-available-in-k-pro-free.md)
{% endcontent-ref %}

{% content-ref url="/pages/y9KJm7yd1lPhdY2mf5lY" %}
[Browse for additional datasets of interests](/explore-and-analyse-data/data-catalog/browse-for-additional-datasets-of-interests.md)
{% endcontent-ref %}

The following pages describe the data assets available in K Pro in more detail:

{% content-ref url="/pages/8hJ2bLOWEh1QI1mtpGvs" %}
[TCGA dataset](/explore-and-analyse-data/data-catalog/datasets-available-in-k-pro-free/tcga-dataset.md)
{% endcontent-ref %}

{% content-ref url="/pages/Tt9DSGf8Ajmp3mwBbfYT" %}
[Mosaic Window](/explore-and-analyse-data/data-catalog/datasets-available-in-k-pro-free/mosaic-window.md)
{% endcontent-ref %}

{% content-ref url="/pages/aYf7gOKLERsVCZTj5mNP" %}
[MOSAIC dataset](/explore-and-analyse-data/data-catalog/browse-for-additional-datasets-of-interests/mosaic-dataset.md)
{% endcontent-ref %}

