# Datasets available in K Pro Free

The tables below list the public and private datasets currently available in Owkin K Pro Free, along with their licenses and versions.

**1- Datasets that can be explored with the MultiOmics Agent in K Pro Free**

The MultiOmics Agent analyzes complex biological datasets across different modalities, enabling you to explore relationships between molecular features and clinical outcomes.

| Dataset Name      | Type                | Website                                                      | License Link                                         | License Name                                                 | Version |
| ----------------- | ------------------- | ------------------------------------------------------------ | ---------------------------------------------------- | ------------------------------------------------------------ | ------- |
| **MOSAIC Window** | Non-public          | <https://www.mosaic-research.com/mosaic-window>              | Terms and conditions                                 | N/A                                                          | N/A     |
| **TCGA**          | Public              | <https://www.cancer.gov/ccg/research/genome-sequencing/tcga> | <https://creativecommons.org/licenses/by-nc-nd/4.0/> | CC Attribution-NonCommercial-NoDerivatives 4.0 International | 4.0     |
| **GTEx**          | Public              | <https://www.gtexportal.org/home/>                           | <https://www.gtexportal.org/home/license>            | GTEx Portal data                                             | N/A     |

**2- Datasets used to generate pre-aggregated insights for the Knowledge Agent**

Knowledge data are a set of gene-level features derived from publicly available databases and resources. They describe the general biology of a target, independently of any specific disease or discovery context. The databases and sources currently covered are listed below:

| Dataset Name                  | Type   | Website                                                         | License Link                                                                                                                 | License Name                                                                 | Version |
| ----------------------------- | ------ | --------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------- | ------- |
| **ChEMBL**                    | Public | <https://www.ebi.ac.uk/chembl/>                                 | [Deed - Attribution-ShareAlike 3.0 Unported - Creative Commons](https://creativecommons.org/licenses/by-sa/3.0/)             | Deed - Attribution-ShareAlike 3.0 Unported - Creative Commons                | 3.0     |
| **CollecTRI**                 | Public | <https://github.com/saezlab/CollecTRI?tab=readme-ov-file>       | [The GNU General Public License v3.0 - GNU Project - Free Software Foundation](https://www.gnu.org/licenses/gpl-3.0.en.html) | The GNU General Public License v3.0 - GNU Project - Free Software Foundation | 3.0     |
| **Complex Portal**            | Public | <https://www.ebi.ac.uk/complexportal/home>                      | [Deed - CC0 1.0 Universal - Creative Commons](https://creativecommons.org/publicdomain/zero/1.0/)                            | Deed - CC0 1.0 Universal - Creative Commons                                  | 1.0     |
| **DepMap**                    | Public | <https://depmap.org/portal/>                                    | [Deed - Attribution 4.0 International - Creative Commons](https://creativecommons.org/licenses/by/4.0/)                      | Deed - Attribution 4.0 International - Creative Commons                      | 4.0     |
| **Ensembl (BioMart)**         | Public | <https://www.ensembl.org/info/data/biomart/index.html>          | [Deed - Attribution 4.0 International - Creative Commons](https://creativecommons.org/licenses/by/4.0/)                      | Deed - Attribution 4.0 International - Creative Commons                      | 4.0     |
| **gnomAD**                    | Public | <https://gnomad.broadinstitute.org/>                            | [Deed - CC0 1.0 Universal - Creative Commons](https://creativecommons.org/publicdomain/zero/1.0/)                            | Deed - CC0 1.0 Universal - Creative Commons                                  | 1.0     |
| **GTEx**                      | Public | <https://gtexportal.org/home/>                                  | [GTEx Portal Data License](https://gtexportal.org/home/license)                                                              | GTEx Portal Data License                                                     | 4.0     |
| **Hallmarks of Cancer**       | Public | <https://pubmed.ncbi.nlm.nih.gov/21376230/>                     | No license                                                                                                                   | N/A                                                                          | N/A     |
| **Human Protein Atlas (HPA)** | Public | <https://www.proteinatlas.org/>                                 | [Deed - Attribution-ShareAlike 3.0 Unported - Creative Commons](https://creativecommons.org/licenses/by-sa/3.0/)             | Deed - Attribution-ShareAlike 3.0 Unported - Creative Commons                | 3.0     |
| **IntOGen**                   | Public | <https://www.intogen.org/search>                                | [Deed - CC0 1.0 Universal - Creative Commons](https://creativecommons.org/publicdomain/zero/1.0/)                            | Deed - CC0 1.0 Universal - Creative Commons                                  | 1.0     |
| **MSigDB**                    | Public | <https://www.gsea-msigdb.org/gsea/msigdb/human/collections.jsp> | [Deed - Attribution 4.0 International - Creative Commons](https://creativecommons.org/licenses/by/4.0/)                      | Deed - Attribution 4.0 International - Creative Commons                      | 4.0     |
| **Open Targets**              | Public | <https://platform.opentargets.org/>                             | [Deed - CC0 1.0 Universal - Creative Commons](https://creativecommons.org/publicdomain/zero/1.0/)                            | Deed - CC0 1.0 Universal - Creative Commons                                  | 1.0     |
| **Reactome**                  | Public | <https://reactome.org/>                                         | [Deed - Attribution 4.0 International - Creative Commons](https://creativecommons.org/licenses/by/4.0/)                      | Deed - Attribution 4.0 International - Creative Commons                      | 4.0     |
| **UniProt**                   | Public | <https://www.uniprot.org/>                                      | [Deed - Attribution 4.0 International - Creative Commons](https://creativecommons.org/licenses/by/4.0/)                      | Deed - Attribution 4.0 International - Creative Commons                      | 4.0     |

