
# Preparing your data

K is designed to work flexibly with the data clients provide. Within its supported modalities, K does not require data to conform to a single standard, ontology, or processing pipeline. However, the more standardized and homogeneous the data, the greater the analytical value K can deliver.

To help clients choose the right level of preparation, we define three tiers of data standardization, each with a different value-to-effort trade-off:

**Data Standardization Tiers**

* **Basic:** A small set of lightweight rules that the data must satisfy. For example, every clinical table must include a column with a unique patient identifier. These rules are minimal and easy to meet, and the client is responsible for enforcing them. *(A full specification of these rules will be available soon and can be shared upon request.)*
* **Recommended:** An entry-level standardization effort focused on aligning key columns (demographics, procedural fields, and modality-specific measures such as cell counts) to a shared data model. This makes the most common analytical operations directly comparable across datasets. Owkin can provide tooling (data contracts, methodology) and services to support these transformations.
* **Full:** Complete alignment to a common data model, ontology, and preparation pipeline. This level is typically required for use cases that involve cross-dataset cohorts or analyses run on merged datasets. Delivering this level of standardization requires a high-touch professional-services engagement from Owkin.
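
As an illustration of the kind of lightweight rule the Basic tier describes, the sketch below checks that a clinical table has a patient identifier column and that no row leaves it empty. It is not Owkin tooling: the column name `patient_id` and the function names are hypothetical, and the actual rules are defined in the specification mentioned above. The check also does not assume one row per patient, since that requirement is not stated here.

```python
import csv
import io

# Hypothetical column name; the real required name is not specified on this page.
PATIENT_ID_COLUMN = "patient_id"


def check_patient_id(rows, fieldnames, id_column=PATIENT_ID_COLUMN):
    """Return a list of problems; an empty list means the check passed."""
    if id_column not in fieldnames:
        return [f"missing required column '{id_column}'"]
    problems = []
    for row_no, row in enumerate(rows, start=1):  # counts data rows, not the header
        value = (row.get(id_column) or "").strip()
        if not value:
            problems.append(f"row {row_no}: empty '{id_column}'")
    return problems


def check_clinical_csv(text, id_column=PATIENT_ID_COLUMN):
    """Parse CSV text and run the patient identifier check on it."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return check_patient_id(rows, reader.fieldnames or [], id_column)
```

Running a check like this on each clinical table before delivery is one way for a client to enforce the rule on their side; for example, `check_clinical_csv(open("clinical.csv").read())` returns a list of problems to fix, where `clinical.csv` is a placeholder file name.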

> **Optional enrichment:** Owkin has developed proprietary models that can extract additional features from raw data (for example, detecting cell types or adding bulk-deconvolution signatures), thereby augmenting the dataset before analysis.

**New modalities**

Adding aggregated tabular data for a new modality (such as a flow cytometry table) will not cause a technical failure in K. However, without a dedicated integration, the platform cannot guarantee the depth of analysis or reproducibility that a fully supported modality provides.

For this reason, we recommend that clients engage with the Owkin team before onboarding a new modality. This allows us to assess the data structure, confirm analytical coverage, and, where needed, implement the modality-specific logic required to deliver high-value, reproducible insights for the client's use case.

