Explore 7.4 PB of genomics data across 6.5M files
| Dataset | Size | Description |
|---|---|---|
| Sample Genomics Data | 5.15 GB | Data from various genomics file formats (BAM, VCF, BED, etc), and sequencing technologies. |
| Human Pangenome Project | 4.46 PB | Sequencing data and analysis of 10 trios. First complete human genome assembly. |
| Genome in a Bottle | 130 TB | Reference data from several sequencing technologies. Used as ground truth for benchmarking. |
| 1000 Genomes Project | 766 TB | Sequencing data and analysis of >2,500 individuals from around the world. |
| Bio Data Zoo | 619 kB | Example genomics data for tool developers |
| Platinum Pedigree | 11.8 TB | Whole genome sequencing using five technologies on a 4-generation family |
| DeepVariant Datasets | 6.26 TB | Sample data used for testing and benchmarking the DeepVariant variant caller. |
| KinDEL dataset | 24.6 GB | DNA-Encoded Library Dataset For Kinase Inhibitors, for benchmarking machine learning models |
| Broad Public Datasets | 4.09 TB | Sample datasets from the Broad Institute for testing bioinformatics workflows. |
| Genome Ark | 1.61 PB | Data from the Vertebrate Genomes Project (VGP), featuring reference genomes for vertebrate species. |
| Ensembl FTP Site | | Explore data on the Ensembl FTP site interactively |
| Human Microbiome Project | 5.86 TB | Microbiome data of 300 healthy adults, and several individuals with disease conditions. |
| Australasian Genomes | 7.97 TB | Sequencing datasets and reference genomes of several threatened Australasian species. |
| 3000 Rice Genomes | 255 TB | Sequencing data and analysis of >3,000 rice varieties from 89 countries. |
| GATK Test Data | 1.05 TB | Test datasets for the GATK variant caller, with data from WGS, WES, and RNA-seq. |
| Element Bio Data | 535 GB | Data from the Element Bio manuscript about the Avidity instrument. |
| ONT Data | 160 TB | Oxford Nanopore benchmarking datasets from various sequencing chemistries and samples. |
| RNA-Seq Nanopore Data | 18 TB | RNA-Seq data from Nanopore sequencing, with matched short-read RNA-Seq from the Singapore Nanopore Expression Project (SG-NEx) |
| Pediatric Brain Tumor Atlas | 3.13 TB | Analysis of pediatric brain tumors: gene expression, gene fusions, somatic mutations, CNVs, and SVs. |
| Genome in a Bottle (FTP) | | Reference data from several sequencing technologies. Used as ground truth for benchmarking. |