PBMC Multi-omics Demo

Run all 15 skills on PBMC data, a widely used benchmark for single-cell and multi-omics analysis.

💡 Two options: Use our built-in synthetic data (no download needed, works immediately) or download real published PBMC datasets from the links below.

Option 1 — Built-in Synthetic Demo (recommended to start)

No download needed. Generates realistic PBMC data in seconds based on published marker gene profiles.

python3 data/pbmc_demo_generator.py

# Creates 8 synthetic PBMC layers instantly:
  scRNA-seq:     3,000 cells x 2,000 genes (10 cell types)
  Bulk RNA-seq:  12 donors — Healthy vs Sepsis
  ATAC-seq:      2,000 cells x 5,000 peaks
  Proteomics:    300 plasma proteins x 12 donors
  Metabolomics:  250 serum features x 12 donors
  CITE-seq ADT:  1,000 cells x 25 surface proteins
  Metagenomics:  20 gut microbiome samples
  Genomics:      500 variants + 8 pharmacogenomic loci

python3 omics_agent.py --demo
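
To give a feel for what marker-based simulation means, here is a minimal sketch of a synthetic count generator. It is not the code in data/pbmc_demo_generator.py. The function names, the Poisson rates, and the idea of assigning each cell type a contiguous block of marker genes are all assumptions made for illustration.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def make_synthetic_counts(n_cells=3000, n_genes=2000, n_types=10,
                          markers_per_type=20, background=0.05,
                          marker_rate=5.0, seed=0):
    """Return (counts, labels) for a toy scRNA-seq layer.

    counts[i] is a sparse dict {gene_index: count} for cell i.
    labels[i] is the integer cell type of cell i.
    Genes in [t * markers_per_type, (t + 1) * markers_per_type) are treated
    as markers of type t and drawn at marker_rate instead of background.
    """
    if n_types * markers_per_type > n_genes:
        raise ValueError("not enough genes for the requested marker blocks")
    rng = random.Random(seed)  # fixed seed keeps the output reproducible
    counts, labels = [], []
    for _ in range(n_cells):
        cell_type = rng.randrange(n_types)
        lo = cell_type * markers_per_type
        hi = lo + markers_per_type
        cell = {}
        for gene in range(n_genes):
            lam = marker_rate if lo <= gene < hi else background
            value = sample_poisson(rng, lam)
            if value:  # store non-zero entries only
                cell[gene] = value
        counts.append(cell)
        labels.append(cell_type)
    return counts, labels
```

Calling make_synthetic_counts(n_cells=100, n_genes=200, n_types=5) returns a small matrix quickly. The defaults mirror the 3,000 x 2,000 x 10 shape listed above but take noticeably longer in pure Python.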

Option 2 — Real Published PBMC Datasets

Download these free, publicly available datasets and run OmicsAgent on real data.

scRNA-seq

10x PBMC 3k · Free · Published · ~50 MB

3,000 PBMCs from a Healthy Donor · 10x Genomics Chromium · hg19

The most widely used single-cell benchmark dataset. 2,700 cells after QC, 8 cell types. Used in Seurat and Scanpy tutorials worldwide.

10x PBMC 68k · Free · Published · ~1.2 GB

68,000 PBMCs from a Healthy Donor · 10x Genomics · hg19

Large-scale PBMC dataset used in the original Cell Ranger paper. Ideal for testing scalability and rare cell type detection.

Human Cell Atlas PBMC · Free · Published · ~500 MB

Pan et al. 2018 · Multiple donors · Multiple conditions

Multi-donor PBMC atlas with healthy and disease conditions. Good for batch correction and donor variability analysis.

scATAC-seq

10x PBMC scATAC 5k · Free · Published · ~200 MB

5,000 PBMCs · 10x Genomics Chromium ATAC · hg38

Standard scATAC-seq PBMC benchmark. Includes fragments file, peak matrix, and per-barcode summary. Used in ArchR and Signac tutorials.

CITE-seq (RNA + Surface Proteins)

10x PBMC 10k Multiome CITE-seq · Free · Published · ~800 MB

10,000 PBMCs · RNA + 17 antibodies (ADT) · 10x Genomics

Simultaneous RNA and surface protein measurement. Includes CD3, CD4, CD8, CD14, CD19, CD56, PD-1, and more. Ideal for WNN analysis.

Spatial Transcriptomics

10x Visium Human Lymph Node · Free · Published · ~350 MB

4,035 spots · ~17k genes · Human lymph node section · hg38

The most widely used Visium benchmark. Includes H&E image, spot coordinates, and gene expression matrix. Germinal centers and T cell zones clearly resolved.

10x Visium Human PBMC · Free · Published · ~280 MB

2,646 spots · ~17k genes · Human PBMC smear · hg38

Spatial transcriptomics applied directly to PBMCs. Useful for comparing spatial and scRNA-seq profiles of the same cell types.

Multi-omics Integration

10x Multiome PBMC (RNA + ATAC) · Free · Published · ~1.5 GB

10,000 PBMCs · Simultaneous RNA-seq + ATAC-seq · hg38

A widely used reference dataset for multi-omics integration: gene expression and chromatin accessibility are profiled in the same cells. Well suited to WNN and MOFA+ tutorials.

Gut Metagenomics

HMP2 (Human Microbiome Project 2) · Free · Published · Size varies

Lloyd-Price et al. Nature 2019 · IBD vs Healthy · Multi-omics

One of the largest published multi-omics gut microbiome studies. Includes metagenomics, metatranscriptomics, metabolomics, and proteomics from IBD patients and healthy controls.

How to Use Real Data with OmicsAgent.ai

scRNA-seq (10x PBMC 3k)

# 1. Download and extract
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/pbmc3k/pbmc3k_filtered_gene_bc_matrices.tar.gz
tar -xzf pbmc3k_filtered_gene_bc_matrices.tar.gz

# 2. Run via chat mode
python3 omics_agent.py --chat
You: Run scRNA-seq QC, normalization, clustering and cell type
     annotation on my 10x PBMC data at
     filtered_gene_bc_matrices/hg19/
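
Before handing the folder to the agent, it can help to check that the extracted files load and look sane. The sketch below reads the three files in filtered_gene_bc_matrices/hg19/ (matrix.mtx, genes.tsv, barcodes.tsv) with the Python standard library and computes basic per-cell QC metrics. It is an independent illustration, not OmicsAgent's QC code. The function names and the "MT-" mitochondrial prefix are assumptions.

```python
import csv
from pathlib import Path


def read_10x_mtx(folder):
    """Read a 10x MEX folder into (genes, barcodes, entries).

    entries is a list of (gene_index, cell_index, count), zero-based.
    """
    folder = Path(folder)
    with open(folder / "genes.tsv", newline="") as fh:
        # Column 0 is the Ensembl ID, column 1 the gene symbol.
        genes = [row[1] for row in csv.reader(fh, delimiter="\t") if row]
    with open(folder / "barcodes.tsv") as fh:
        barcodes = [line.strip() for line in fh if line.strip()]
    entries = []
    with open(folder / "matrix.mtx") as fh:
        # Skip the Matrix Market banner and comment lines.
        lines = (line for line in fh if line.strip() and not line.startswith("%"))
        n_genes, n_cells, _ = map(int, next(lines).split())
        for line in lines:
            gene, cell, value = line.split()
            # Matrix Market indices are 1-based: rows are genes, columns are cells.
            entries.append((int(gene) - 1, int(cell) - 1, int(float(value))))
    if n_genes != len(genes) or n_cells != len(barcodes):
        raise ValueError("matrix dimensions do not match genes/barcodes files")
    return genes, barcodes, entries


def per_cell_qc(genes, barcodes, entries, mito_prefix="MT-"):
    """Return {barcode: {total_counts, n_genes, pct_mito}}."""
    total = [0] * len(barcodes)
    detected = [0] * len(barcodes)
    mito = [0] * len(barcodes)
    for gene, cell, value in entries:
        if value <= 0:
            continue
        total[cell] += value
        detected[cell] += 1
        if genes[gene].startswith(mito_prefix):
            mito[cell] += value
    return {
        bc: {
            "total_counts": total[i],
            "n_genes": detected[i],
            "pct_mito": 100.0 * mito[i] / total[i] if total[i] else 0.0,
        }
        for i, bc in enumerate(barcodes)
    }
```

For example, qc = per_cell_qc(*read_10x_mtx("filtered_gene_bc_matrices/hg19/")) gives one record per barcode, which can then be compared with whatever QC thresholds the agent reports.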

Visium Spatial (Human Lymph Node)

# 1. Download matrix and spatial files
wget https://cf.10xgenomics.com/samples/spatial-exp/1.1.0/V1_Human_Lymph_Node/V1_Human_Lymph_Node_filtered_feature_bc_matrix.h5
wget https://cf.10xgenomics.com/samples/spatial-exp/1.1.0/V1_Human_Lymph_Node/V1_Human_Lymph_Node_spatial.tar.gz
tar -xzf V1_Human_Lymph_Node_spatial.tar.gz

# 2. Run spatial analysis
python3 omics_agent.py --chat
You: Analyze my Visium human lymph node data, find spatially
     variable genes, and identify tissue domains
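
The spatial tarball extracts to a spatial/ folder. As a quick sanity check on the download, the sketch below loads the in-tissue spot coordinates and scales them to the image resolution. It assumes the Space Ranger 1.x layout: a headerless tissue_positions_list.csv with columns barcode, in_tissue, array_row, array_col, pxl_row_in_fullres, pxl_col_in_fullres, plus a scalefactors_json.json. The function name is hypothetical, and this is not how OmicsAgent itself reads the data.

```python
import csv
import json
from pathlib import Path


def load_visium_spots(spatial_dir, image="hires"):
    """Return {barcode: {array_row, array_col, x, y}} for in-tissue spots.

    x and y are pixel coordinates scaled to the chosen image
    ("hires" or "lowres") using the factors in scalefactors_json.json.
    """
    spatial_dir = Path(spatial_dir)
    with open(spatial_dir / "scalefactors_json.json") as fh:
        scale = json.load(fh)[f"tissue_{image}_scalef"]

    spots = {}
    with open(spatial_dir / "tissue_positions_list.csv", newline="") as fh:
        for row in csv.reader(fh):
            if not row:
                continue
            barcode, in_tissue, arr_row, arr_col, px_row, px_col = row
            if in_tissue != "1":
                continue  # spot lies outside the detected tissue area
            spots[barcode] = {
                "array_row": int(arr_row),
                "array_col": int(arr_col),
                # Image x runs along columns, y along rows.
                "x": float(px_col) * scale,
                "y": float(px_row) * scale,
            }
    return spots
```

Calling load_visium_spots("spatial") after extraction should return roughly as many entries as the spot count quoted for the dataset. A large mismatch usually points to an incomplete download.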

Multiome RNA+ATAC Integration

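# 1. Download both modalities
wget https://cf.10xgenomics.com/samples/cell-arc/2.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_filtered_feature_bc_matrix.h5
wget https://cf.10xgenomics.com/samples/cell-arc/2.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz
# Tabix index for the fragments file; many ATAC tools (e.g. Signac) expect it alongside
wget https://cf.10xgenomics.com/samples/cell-arc/2.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz.tbi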

# 2. Run integration
python3 omics_agent.py --chat
You: Run multi-omics integration on my 10x Multiome PBMC data
     combining RNA and ATAC modalities with WNN and MOFA+
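
The fragments file is large, so a cheap check before starting the integration is to count fragments per barcode. The sketch below assumes the standard 10x fragments layout: tab-separated chrom, start, end, barcode, read support, with header lines starting with "#". The function name and the max_lines parameter are hypothetical additions for illustration.

```python
import gzip
from collections import Counter


def fragments_per_barcode(path, max_lines=None):
    """Count unique fragments per cell barcode in a gzipped fragments file.

    Each data line is one unique fragment, so the per-barcode line count
    is the number of fragments for that barcode. Set max_lines to stop
    after scanning that many lines of a large file.
    """
    counts = Counter()
    with gzip.open(path, "rt") as fh:
        for i, line in enumerate(fh):
            if max_lines is not None and i >= max_lines:
                break
            if line.startswith("#") or not line.strip():
                continue  # header comment or blank line
            fields = line.rstrip("\n").split("\t")
            counts[fields[3]] += 1  # fourth column is the barcode
    return counts
```

For instance, fragments_per_barcode("pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz", max_lines=1_000_000).most_common(5) shows the barcodes with the most fragments in the first million lines.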

Expected Results — Biological Findings

Skill         Dataset             Key Finding
scRNA-seq     PBMC 3k/68k         10 cell types: CD4 T, CD8 T, NK, B, Mono Classical, Mono Non-classical, mDC, Plasmablast, Treg, Exhausted CD8 T
scATAC-seq    PBMC 5k ATAC        SPI1/CEBPA motifs drive Monocyte chromatin; TOX/NR4A1 mark Exhausted CD8 T cells
CITE-seq      PBMC 10k TotalSeq   CD14+/HLA-DR+ marks Classical Monocytes; PD-1/LAG-3/TIM-3 co-express on Exhausted CD8 T
Spatial       Visium Lymph Node   BCL6/AID in Germinal Center; COL1A1 in Capsule; CCR7/CCL19 in T cell zone
Integration   10x Multiome        RNA-ATAC WNN reveals regulatory programs invisible to either modality alone
Metagenomics  HMP2/IBD            Faecalibacterium prausnitzii depleted in IBD; Ruminococcus gnavus expanded

Reproducibility: Every analysis exports commands.sh, environment.yml, and checksums.sha256. Run sha256sum -c checksums.sha256 to verify that your output files match the recorded checksums.
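
On systems without sha256sum, the same check can be sketched in Python. This assumes checksums.sha256 uses the usual sha256sum layout (hex digest, whitespace, file name, one entry per line) and resolves file names relative to the manifest's folder. The function name is hypothetical.

```python
import hashlib
from pathlib import Path


def verify_checksums(manifest="checksums.sha256"):
    """Return {file_name: True/False} for every entry in the manifest.

    A file is reported as False if it is missing or its SHA-256 digest
    differs from the recorded one.
    """
    manifest = Path(manifest)
    results = {}
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        digest, name = line.split(maxsplit=1)
        if name.startswith("*"):  # sha256sum marks binary mode with '*'
            name = name[1:]
        target = manifest.parent / name
        if not target.is_file():
            results[name] = False
            continue
        sha = hashlib.sha256()
        with open(target, "rb") as fh:
            # Hash in 1 MiB chunks so large outputs do not fill memory.
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                sha.update(chunk)
        results[name] = sha.hexdigest() == digest.lower()
    return results
```

all(verify_checksums().values()) is then True only when every listed file is present and unchanged.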