Sample Data · Try Profiler

Test with real datasets

All test datasets are hosted on GitHub and reflect real omics scenarios — processed proteomics exports from DIA-NN/MaxQuant, raw instrument data from Bruker, Waters & Thermo Fisher (converted directly inside Profiler to .mzML / .mzXML), simulated multi-omics tabular data, survival datasets, and peer-review figure data.

7 folders · 4 omics types
yanisZirem / Profiler_v1_requests_datatests
Official test repository — all datasets used for Profiler development and peer review
📁 MaxQuant_data/ 5 months ago
Proteomics · MaxQuant
MaxQuant proteinGroups
Raw MaxQuant output — proteinGroups.txt directly uploadable into Profiler. The platform auto-detects LFQ intensity columns, filters reverse hits and contaminants, and maps samples automatically.
proteinGroups.txt | LFQ intensities | Auto-parsed by Profiler
Upload directly; no conversion needed.
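For readers who want to inspect proteinGroups.txt before uploading, the filtering described above can be sketched in plain Python. This is not Profiler's own parser; it is a minimal illustration that assumes the standard MaxQuant column names (`LFQ intensity <sample>`, `Reverse`, `Potential contaminant`, `Only identified by site`, `Protein IDs`). Older MaxQuant versions name the contaminant column differently, so adjust `FLAG_COLUMNS` if needed. The `demo` table is made up for illustration.

```python
import csv
import io

LFQ_PREFIX = "LFQ intensity "
# Standard MaxQuant flag columns; a "+" marks a row to discard.
FLAG_COLUMNS = ("Reverse", "Potential contaminant", "Only identified by site")


def read_protein_groups(handle):
    """Return (sample_names, rows) from a tab-separated proteinGroups table.

    rows maps each kept "Protein IDs" string to its list of LFQ intensities.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    lfq_columns = [c for c in reader.fieldnames if c.startswith(LFQ_PREFIX)]
    samples = [c[len(LFQ_PREFIX):] for c in lfq_columns]
    rows = {}
    for record in reader:
        # Drop decoys, contaminants and site-only identifications.
        if any((record.get(flag) or "").strip() == "+" for flag in FLAG_COLUMNS):
            continue
        rows[record["Protein IDs"]] = [float(record[c] or 0) for c in lfq_columns]
    return samples, rows


# Tiny made-up table: one real protein, one decoy, one contaminant.
demo = (
    "Protein IDs\tLFQ intensity S1\tLFQ intensity S2\tReverse\tPotential contaminant\n"
    "P12345\t1000\t2000\t\t\n"
    "REV__Q99999\t500\t0\t+\t\n"
    "CON__P00761\t300\t400\t\t+\n"
)
samples, rows = read_protein_groups(io.StringIO(demo))
print(samples)
print(rows)
```

With a real file, replace `io.StringIO(demo)` with `open("proteinGroups.txt", newline="")`.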
📁 DIA-NN_data/ 10 months ago
Proteomics · DIA-NN
DIA-NN pg_matrix output
DIA-NN processed output file report.pg_matrix.tsv — protein group matrix with sample intensities. Profiler auto-detects the DIA-NN format and extracts sample columns on upload.
report.pg_matrix.tsv | Protein groups | DIA proteomics
Upload pg_matrix.tsv directly.
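To see which columns of a pg_matrix file are samples, a rough stdlib sketch follows. It is not Profiler's detection logic. It assumes the annotation columns listed in `ANNOTATION_COLUMNS` (the usual DIA-NN ones; newer DIA-NN versions may add more) and treats every other column as a run, whose header is typically the full path of the raw file. The `demo` content is invented.

```python
import csv
import io
from pathlib import PureWindowsPath

# Assumed annotation columns; everything else is treated as a run column.
ANNOTATION_COLUMNS = {
    "Protein.Group",
    "Protein.Ids",
    "Protein.Names",
    "Genes",
    "First.Protein.Description",
}


def split_pg_matrix(handle):
    """Return (sample_names, matrix) from a tab-separated pg_matrix table.

    matrix maps each protein group to one intensity per run (None if missing).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    run_columns = [c for c in reader.fieldnames if c not in ANNOTATION_COLUMNS]
    # PureWindowsPath splits on both "\" and "/", so either path style works.
    samples = [PureWindowsPath(c).stem for c in run_columns]
    matrix = {}
    for record in reader:
        matrix[record["Protein.Group"]] = [
            float(record[c]) if record[c] else None for c in run_columns
        ]
    return samples, matrix


# Made-up two-run table with one missing value.
demo = (
    "Protein.Group\tGenes\tC:\\data\\run_01.raw\t/data/run_02.raw\n"
    "P12345\tGENE1\t1500.5\t\n"
)
samples, matrix = split_pg_matrix(io.StringIO(demo))
print(samples)
print(matrix)
```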
📁 Bruker_data/ 11 months ago Built-in conversion
Raw MS Data · Bruker instrument
Bruker Raw Data — Direct Conversion in Profiler
Raw mass spectrometry data from Bruker instruments (.d folders, zipped). No external tool required — Profiler includes a native Data Conversion module directly in the sidebar that converts Bruker raw files to .mzML or .mzXML in one click.
01 Open Profiler: go to the Data Conversion sidebar tab.
02 Upload .d folder: select your Bruker raw files.
03 Convert: choose .mzML or .mzXML.
04 Process & analyse: DIA-NN / MaxQuant → Profiler.
Raw .d folders | Bruker timsTOF · QTOF | → .mzML · .mzXML
No MSConvert needed; convert natively in Profiler.
📁 Waters_data/ 11 months ago Built-in conversion
Raw MS Data · Waters instrument
Waters Raw Data — Direct Conversion in Profiler
Raw mass spectrometry data from Waters instruments (.raw folders, zipped). Like Bruker and Thermo Fisher formats, Waters .raw files can be converted directly inside Profiler's Data Conversion sidebar — no third-party tool needed.
01 Open Profiler: go to the Data Conversion sidebar tab.
02 Upload .raw folder: select your Waters raw files.
03 Convert: choose .mzML or .mzXML.
04 Process & analyse: MaxQuant / DIA-NN → Profiler.
Raw .raw folders | Waters Synapt · Xevo · Vion | → .mzML · .mzXML
No MSConvert needed; convert natively in Profiler.
📁 Tabular_data_multi_omics / Binary_classes/ 2 days ago New ✦
Multi-omics · Binary classification · 4 omics types
Toy Multi-Omics — Tumor Aggressiveness (2 classes)
Four simulated datasets (one per omics type) for binary classification: Aggressive vs NonAggressive tumor samples. Ideal for testing the full Profiler pipeline — preprocessing, PCA/UMAP, volcano plots, ML classification, SHAP, ORA/GSEA enrichment and HTML report generation.
📄 toy_proteomics_tumor_aggressiveness.csv Proteomics
📄 toy_metabolomics_tumor_aggressiveness.csv Metabolomics
📄 toy_lipidomics_tumor_aggressiveness.csv Lipidomics
📄 toy_rnaseq_tumor_aggressiveness.csv Transcriptomics
CSV with Class column; upload directly to Profiler.
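As a quick sanity check on a two-class CSV, the x-axis of a volcano plot (log2 fold change) can be computed by hand. The sketch below is illustrative only and makes several assumptions not stated on this page: samples are rows, features are columns, intensities are untransformed and strictly positive, and any identifier column is named `Sample` (hypothetical). The `demo` values are invented.

```python
import csv
import io
import math
from statistics import mean


def log2_fold_changes(handle, case, control, class_column="Class", skip=("Sample",)):
    """Return {feature: log2(mean(case) / mean(control))} for each feature column."""
    reader = csv.DictReader(handle)
    features = [c for c in reader.fieldnames if c != class_column and c not in skip]
    values = {label: {f: [] for f in features} for label in (case, control)}
    for record in reader:
        label = record[class_column]
        if label in values:  # ignore rows from any other class
            for f in features:
                values[label][f].append(float(record[f]))
    return {
        f: math.log2(mean(values[case][f]) / mean(values[control][f]))
        for f in features
    }


# Made-up table: P1 is 4x higher in Aggressive, P2 is 2x lower.
demo = (
    "Sample,Class,P1,P2\n"
    "s1,Aggressive,8,2\n"
    "s2,Aggressive,8,2\n"
    "s3,NonAggressive,2,4\n"
    "s4,NonAggressive,2,4\n"
)
fold_changes = log2_fold_changes(io.StringIO(demo), "Aggressive", "NonAggressive")
print(fold_changes)
```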
📁 Tabular_data_multi_omics / Multi_classes/ 2 days ago New ✦
Multi-omics · 3-class classification · 4 omics types
Toy Multi-Omics — Tumor / Necrosis / Healthy (3 classes)
Same four omics types as the binary set, now with 3 classes: Tumor, Necrosis and Healthy. Tests multi-class volcano plots, ANOVA-based statistics, multi-class ML models (Random Forest, XGBoost) and multi-group enrichment analysis.
📄 toy_proteomics_tumor_necrosis_healthy.csv Proteomics
📄 toy_metabolomics_tumor_necrosis_healthy.csv Metabolomics
📄 toy_lipidomics_tumor_necrosis_healthy.csv Lipidomics
📄 toy_rnaseq_tumor_necrosis_healthy.csv Transcriptomics
CSV with Class column; upload directly to Profiler.
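The ANOVA-based statistics mentioned for the 3-class set rest on the one-way F statistic, which compares between-class to within-class variance. A minimal sketch for a single feature is shown here; it is a textbook computation rather than Profiler's code, and it assumes samples are rows with a `Class` column. The feature name `P1` and the `demo` values are hypothetical.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def anova_f(handle, feature, class_column="Class"):
    """One-way ANOVA F statistic for one feature, grouped by the class column."""
    groups = defaultdict(list)
    for record in csv.DictReader(handle):
        groups[record[class_column]].append(float(record[feature]))
    samples = list(groups.values())
    k = len(samples)                        # number of classes
    n_total = sum(len(g) for g in samples)  # number of samples
    grand_mean = sum(sum(g) for g in samples) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in samples)
    ss_within = sum((x - mean(g)) ** 2 for g in samples for x in g)
    # Mean square between classes divided by mean square within classes.
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Made-up values for one feature across the three classes.
demo = (
    "Class,P1\n"
    "Tumor,1\nTumor,2\nTumor,3\n"
    "Necrosis,2\nNecrosis,3\nNecrosis,4\n"
    "Healthy,5\nHealthy,6\nHealthy,7\n"
)
f_stat = anova_f(io.StringIO(demo), "P1")
print(round(f_stat, 3))
```

If SciPy is installed, `scipy.stats.f_oneway` returns the same statistic together with a p-value.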
📁 Survival_data/ last year
Survival Analysis
Survival Analysis Dataset
Dataset with time-to-event and censoring columns, designed to test Profiler's survival analysis module: Kaplan–Meier curves, log-rank test, and Cox proportional hazards model with forest plot output.
Time + Event columns | Kaplan-Meier ready | Cox model compatible
Go to the Survival tab in Profiler.
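To make the time and event columns concrete, here is a small sketch of the Kaplan-Meier product-limit estimator. It is a textbook implementation, not the code behind Profiler's Survival tab, and it assumes the event column is coded 1 for an observed event and 0 for a censored sample. The example values are invented.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events: 1 = event observed, 0 = censored (assumed coding).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every sample that shares this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Events and censored samples both leave the risk set.
        at_risk -= removed
    return curve


# Five made-up samples; the second sample at t=8 and the one at t=15 are censored.
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Censored samples never lower the curve directly; they only shrink the risk set used at later event times.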
📁 data_for_peerReview_paper/ 3 weeks ago
Peer Review · Publication data
Bioinformatics 2025 Paper Data
Datasets used for the peer-review submission of the Profiler publication in Bioinformatics (Oxford, 2025). These are the exact datasets shown in the figures of the paper.
Published · doi:10.1093/bioinformatics/btaf644 | Figures data
Reproduce published results.

How to use these datasets in Profiler

01 Download the file from GitHub (click the file → Raw → Save). For tabular multi-omics CSVs, no conversion is needed.
02 For raw instrument data (Bruker .d / Waters .raw / Thermo Fisher): use the Data Conversion tab directly in Profiler's sidebar — no external tool needed. Convert to .mzML or .mzXML, then process with DIA-NN or MaxQuant before loading.
03 Upload in Profiler — go to the Load Data tab. The format (MaxQuant, DIA-NN, generic CSV…) is auto-detected. The Class column is recognised automatically.
04 Run the full pipeline: QC → Preprocessing → Visualisation → AI Modeling → ORA/GSEA → HTML Report.
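Step 01 can also be scripted instead of clicking through GitHub. The sketch below builds a raw-file URL using GitHub's usual `raw.githubusercontent.com/<owner>/<repo>/<branch>/<path>` pattern. The branch name `main` is an assumption (this page does not state it), and the file path is inferred from the folder names listed above, so check both against the repository before relying on them.

```python
from urllib.parse import quote
from urllib.request import urlretrieve

OWNER = "yanisZirem"
REPO = "Profiler_v1_requests_datatests"
BRANCH = "main"  # assumption: the default branch name is not stated on this page


def raw_url(path, branch=BRANCH):
    """Build a raw.githubusercontent.com URL for a file path inside the repository."""
    return f"https://raw.githubusercontent.com/{OWNER}/{REPO}/{branch}/{quote(path)}"


def download(path, destination):
    """Save one repository file to a local path (needs network access)."""
    urlretrieve(raw_url(path), destination)


# Path inferred from the folder and file names shown in the dataset cards.
url = raw_url(
    "Tabular_data_multi_omics/Binary_classes/toy_proteomics_tumor_aggressiveness.csv"
)
print(url)
```

Calling `download(path, "toy_proteomics.csv")` would then save the file locally, ready for the Load Data tab.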

Preparing your own dataset? Read the Data Import Guide for column naming conventions, supported software formats, and tips for longitudinal data. The Class, _meta and Subject_ID / Time (longitudinal) columns are described in detail.