MLPerf Inference Benchmark Data Download

Quick Start: Copy one of the commands below and run it in your terminal to download the indicated model or dataset.
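
The commands on this page pipe a remote script straight into bash. For readers who would rather save the script to a file and inspect it first, the following is a minimal sketch, assuming `curl` and `mktemp` are installed. The function name `mlc_download` and the variable `MLC_DOWNLOADER_URL` are hypothetical helpers introduced here, not part of the MLCommons tooling. Arguments are passed through unchanged, so any option that appears in the commands on this page (such as `-d`) is handed to the script exactly as written there.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the MLCommons tooling): fetch the
# downloader script to a temporary file, then run it with the given
# arguments instead of piping curl directly into bash.
MLC_DOWNLOADER_URL="https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh"

mlc_download() {
    local script status
    script="$(mktemp)" || return 1
    # -f: fail on HTTP errors, -sS: quiet but still report errors,
    # -L: follow redirects, -o: write the script to the temporary file.
    if ! curl -fsSL "$MLC_DOWNLOADER_URL" -o "$script"; then
        echo "failed to fetch downloader script" >&2
        rm -f "$script"
        return 1
    fi
    # To review the script before it runs, open "$script" in a pager here.
    bash "$script" "$@"
    status=$?
    rm -f "$script"
    return "$status"
}
```

With the function defined, `mlc_download -d model https://inference.mlcommons-storage.org/metadata/gpt-j-model-checkpoint.uri` would mirror the GPT-J command listed further down.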

Available Downloads

DeepSeek-R1 Benchmark

DeepSeek-R1-0528 model

DeepSeek-R1-0528 model for the DeepSeek-R1 benchmark (~689GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/deepseek-r1-0528.uri
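
At roughly 689GB, this download can fill a disk partway through. A minimal sketch of a free-space check to run beforehand follows, assuming a POSIX-compatible `df` and `awk`. The function name `has_free_space` is a hypothetical helper, and the 700 GiB threshold is an illustrative margin rather than an official requirement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# REQUIRED_GB gibibytes available.
has_free_space() {
    local dir="$1" required_gb="$2" avail_kb
    # df -P prints one POSIX-format line per filesystem; -k reports
    # 1024-byte blocks. Column 4 of the second line is the available space.
    avail_kb="$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')"
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $(( required_gb * 1024 * 1024 )) ]
}

# Example: warn before starting the ~689GB download into the current directory.
if ! has_free_space . 700; then
    echo "warning: less than 700 GiB free in $(pwd)" >&2
fi
```

The same check applies to the other large downloads on this page, with the threshold adjusted to the listed size.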
        

DeepSeek-R1 datasets

Full preprocessed dataset and calibration dataset for the DeepSeek-R1 benchmark (~163MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) -d ./ https://inference.mlcommons-storage.org/metadata/deepseek-r1-datasets-fp8-eval.uri
        

DLRM v2 Benchmark

Preprocessed DLRM v2 benchmark dataset

Multihot Criteo Click Logs dataset preprocessed for the DLRM v2 benchmark (~163GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/dlrm-v2-preprocessed-dataset.uri
        

DLRM v2 model weights

PyTorch model weights for the DLRM v2 benchmark (~105GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/dlrm-v2-model-weights.uri
        

DLRM v3 Benchmark

DLRM v3 streaming synthetic dataset

Streaming synthetic dataset for the DLRM v3 benchmark (~31GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/dlrm-v3-dataset.uri
        

DLRM v3 checkpoint

Checkpoint for the DLRM v3 benchmark (~1.1TB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/dlrm-v3-checkpoint.uri
        

GPT-J Benchmark

GPT-J model checkpoint

Model checkpoint for the GPT-J benchmark (~25GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) -d model https://inference.mlcommons-storage.org/metadata/gpt-j-model-checkpoint.uri
        

GPT-OSS Benchmark

gpt-oss-120b model

gpt-oss-120b model for the GPT-OSS benchmark (~196GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/gpt-oss-model.uri
        

GPT-OSS dataset

Dataset for the GPT-OSS benchmark (~644MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/gpt-oss-data.uri
        

Llama 2 70b Benchmark

Preprocessed Open Orca dataset

Open Orca dataset preprocessed for the Llama 2 70b benchmark (~418MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/llama-2-70b-open-orca-dataset.uri
        

Llama 3.1 405b Benchmark

Llama 3.1 405b calibration dataset

Calibration dataset for the Llama 3.1 405b benchmark (~57MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/llama3-1-405b-calibration-dataset-512.uri
        

Llama 3.1 405b dataset

Dataset for the Llama 3.1 405b benchmark (~892MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/llama3-1-405b-dataset-8313.uri
        

Llama 3.1 8b Benchmark

Full CNN evaluation dataset (Inference Datacenter)

CNN dataset for the Llama 3.1 8b Inference Datacenter benchmark (~267MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/llama3-1-8b-cnn-eval.uri
        

5000 samples CNN evaluation dataset (Inference Edge)

Sample CNN dataset for the Llama 3.1 8b Inference Edge benchmark (~101MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/llama3-1-8b-sample-cnn-eval-5000.uri
        

CNN-DailyMail calibration dataset

CNN-DailyMail calibration dataset for the Llama 3.1 8b benchmark (~21MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/llama3-1-8b-cnn-dailymail-calibration.uri
        

Mixtral 8x7b Benchmark

Mixtral 8x7b model checkpoint

Mixtral 8x7b model checkpoint for the Mixtral 8x7b benchmark (~187GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/mixtral-8x7b-model-checkpoint.uri
        

Mixtral 8x7b validation dataset

Validation dataset for the Mixtral 8x7b benchmark (~75MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/mixtral-8x7b-validation-dataset.uri
        

Mixtral 8x7b calibration dataset

Calibration dataset for the Mixtral 8x7b benchmark (~5.0MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/mixtral-8x7b-calibration-dataset.uri
        

RGAT Benchmark

RGAT model

RGAT model for the RGAT benchmark (~53MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/rgat-model.uri
        

Stable Diffusion Benchmark

Stable Diffusion FP32 model

Stable Diffusion XL 1.0 FP32 model checkpoint for the Stable Diffusion benchmark (~14GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/stable-diffusion-xl-1-0-fp32-checkpoint.uri
        

Stable Diffusion FP16 model

Stable Diffusion XL 1.0 FP16 model checkpoint for the Stable Diffusion benchmark (~7.0GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/stable-diffusion-xl-1-0-fp16-checkpoint.uri
        

Text to Video Benchmark

Text to Video dataset

Dataset for the Text to Video benchmark (~9.7MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/T2V-data.uri
        

Text to Video model

Wan2.2-T2V-A14B-Diffusers model for the Text to Video benchmark (~127GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/T2V-model.uri
        

Text to Video vbench extra models

Extra models used by VBench for the Text to Video benchmark (~5.6GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/T2V-vbench.uri
        

VLM Benchmark

Global Shopify Catalogue dataset

Global Shopify Catalogue dataset for the VLM benchmark (~9.5GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/VLM-data.uri
        

VLM model

Qwen3-VL-235B-A22B-Instruct model for the VLM benchmark (~472GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/VLM-model.uri
        

Whisper Benchmark

Whisper model

Whisper large-v3 model for the Whisper benchmark (~25GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) -d whisper/model https://inference.mlcommons-storage.org/metadata/whisper-model.uri
        

Whisper dataset

LibriSpeech dataset for the Whisper benchmark (~4.6GB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) -d whisper/dataset https://inference.mlcommons-storage.org/metadata/whisper-dataset.uri
        

YOLO Benchmark

YOLO COCO2017 dataset

Filtered COCO2017 dataset for the YOLO benchmark (~262MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/YOLO-COCO2017-dataset.uri
        

YOLO v11 model

YOLO v11 model for the YOLO benchmark (~52MB)

            bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) https://inference.mlcommons-storage.org/metadata/YOLO-model.uri