Research Lab

Verified publisher | @research-lab

Open eval harnesses, datasets, and benchmark curation for agentic systems.

Published repos

1 repo from this publisher

research-lab/multimodal-evals-v1

Verified publisher | Updated 4 days ago

Unresolved copyright dispute

Agentic benchmark shards with trajectory transcripts, tool traces, and evaluator labels.

dataset | cc-by-4.0 | hot storage | public

dataset | evals | agentic | tools

Runway: May 1, 2026

Cost: $38.60 / mo