dataset
research-lab/multimodal-evals-v1
A richer repo page backed by a normalized hub store so the frontend can swap from seeded data to indexer output without changing page components.
Tags: dataset, evals, agentic, tools, multimodal
Research Lab (Verified publisher)
1 active flag
multimodal-evals-v1
Agentic benchmark shards with trajectory transcripts, tool traces, and evaluator labels.
Highlights
- hot storage with 3 replicas
- manifest CID bafydatasetmanifest
- funded runway 42 days
- moderation flags 1
Intended use
Local MVP demo repo rendered from the normalized indexer store shape.
Funding
Runway and storage
Funded through: May 1, 2026
Monthly burn: $38.60 / mo
Storage class: hot
Replicas: 3
Manifest: 1.3.0
Runway confidence: 47%
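The runway figure can be related to the funded-through date with simple date arithmetic. The sketch below is a minimal illustration, not the hub's actual calculation: `runwayDays` is a hypothetical helper, and the reference date of March 20, 2026 is chosen only because it yields the 42 days shown in the highlights. How the runway confidence percentage is derived is not stated on this page, so it is not modeled here.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: whole days from `now` until `fundedThrough`,
// compared at UTC midnight and clamped at zero once the date has passed.
function runwayDays(fundedThrough: Date, now: Date): number {
  const end = Date.UTC(
    fundedThrough.getUTCFullYear(),
    fundedThrough.getUTCMonth(),
    fundedThrough.getUTCDate(),
  );
  const start = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate(),
  );
  return Math.max(0, Math.round((end - start) / MS_PER_DAY));
}

// "Funded through" comes from the page; the reference date is an assumption.
const fundedThrough = new Date("2026-05-01T00:00:00Z");
const referenceDate = new Date("2026-03-20T00:00:00Z");

console.log(runwayDays(fundedThrough, referenceDate)); // 42
```

Clamping at zero keeps an expired repo from reporting a negative runway.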
Use in code
Transformers, bash, and local integration examples
Example snippets
Fetch via API (bash)
curl "http://localhost:3000/api/repos?namespace=research-lab"
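The same request can be issued from TypeScript. This is a minimal sketch assuming a runtime with a global `fetch` (Node 18+ or a browser). Only the endpoint and the `namespace` query parameter come from the curl example; `buildReposUrl` and `fetchRepos` are hypothetical helper names, and the response shape is not documented here, so it is typed as `unknown`.

```typescript
// Base URL taken from the curl example; adjust for other environments.
const API_BASE = "http://localhost:3000";

// Hypothetical helper: builds the repo listing URL with a safely encoded query string.
function buildReposUrl(namespace: string, base: string = API_BASE): string {
  const url = new URL("/api/repos", base);
  url.searchParams.set("namespace", namespace);
  return url.toString();
}

// Hypothetical helper: fetches the listing and fails loudly on non-2xx responses.
async function fetchRepos(namespace: string): Promise<unknown> {
  const res = await fetch(buildReposUrl(namespace));
  if (!res.ok) {
    throw new Error(`Request failed: ${res.status} ${res.statusText}`);
  }
  return res.json();
}

console.log(buildReposUrl("research-lab"));
// http://localhost:3000/api/repos?namespace=research-lab
```

Building the URL through `URL` and `searchParams` avoids hand-escaping namespaces that contain spaces or reserved characters.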
Read from adapter layer (typescript)
import { getRepo } from "@/app/data";
const repo = getRepo("research-lab", "multimodal-evals-v1");
console.log(repo?.manifest.artifacts);
Files and versions
3 files in current manifest
Repository files
Preview
Public sample only
Dataset splits
| Split | Samples | Format | Notes |
|---|---|---|---|
| train | 842,000 | parquet | tool traces + labels |
| validation | 55,000 | parquet | balanced benchmark slice |
| preview | 250 | jsonl | safe public sample |
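The split table can be represented as typed records for quick sanity checks. This is a sketch only: the `DatasetSplit` interface and its field names are assumptions made for illustration, not the schema used by the hub; the values are copied from the table above.

```typescript
// Hypothetical record shape mirroring the table columns.
interface DatasetSplit {
  name: string;
  samples: number;
  format: "parquet" | "jsonl";
  notes: string;
}

const splits: DatasetSplit[] = [
  { name: "train", samples: 842_000, format: "parquet", notes: "tool traces + labels" },
  { name: "validation", samples: 55_000, format: "parquet", notes: "balanced benchmark slice" },
  { name: "preview", samples: 250, format: "jsonl", notes: "safe public sample" },
];

// Total sample count across all splits.
const totalSamples = splits.reduce((sum, split) => sum + split.samples, 0);

// Sample counts grouped by file format.
const samplesByFormat: Record<string, number> = {};
for (const split of splits) {
  samplesByFormat[split.format] = (samplesByFormat[split.format] ?? 0) + split.samples;
}

console.log(totalSamples); // 897250
console.log(samplesByFormat); // { parquet: 897000, jsonl: 250 }
```

The public preview split accounts for 250 of the 897,250 samples, so it is suited to inspecting the record format rather than to evaluation.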
Community
Discussions
Manifest sync and mirror health
Status: watching
Tracking whether the latest manifest CID, billing runway, and moderation state stay aligned in local dev.
Should this repo stay visible under current filters?
Status: open
Frontend filters are driven by normalized flag records rather than hard protocol takedowns.
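One way to picture flag-driven filtering is sketched below. Everything here is an assumption made for illustration: the `FlagRecord` and `VisibilityFilter` shapes, the `status` values, and the `isVisible` helper are hypothetical and not taken from the hub's code. The point is that visibility is computed in the frontend from flag records and user-selected filters, so the underlying content is never removed.

```typescript
// Hypothetical normalized flag record, keyed by repo ID.
interface FlagRecord {
  repoId: string;
  reason: string;
  status: "open" | "resolved";
}

// Hypothetical user-controlled filter settings.
interface VisibilityFilter {
  hideFlagged: boolean;
}

// Counts flags that are still open for a given repo.
function activeFlagCount(repoId: string, flags: FlagRecord[]): number {
  return flags.filter((f) => f.repoId === repoId && f.status === "open").length;
}

// A repo is hidden only when the filter asks for it AND an open flag exists.
function isVisible(repoId: string, flags: FlagRecord[], filter: VisibilityFilter): boolean {
  if (!filter.hideFlagged) return true;
  return activeFlagCount(repoId, flags) === 0;
}

const flags: FlagRecord[] = [
  { repoId: "research-lab/multimodal-evals-v1", reason: "example reason", status: "open" },
];

console.log(isVisible("research-lab/multimodal-evals-v1", flags, { hideFlagged: true })); // false
console.log(isVisible("research-lab/multimodal-evals-v1", flags, { hideFlagged: false })); // true
```

Because the decision is a pure function of flag records and filter settings, resolving a flag or relaxing the filter restores visibility without touching stored data.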