hazumi

Evaluating Coding Agents with Terminal-Bench 2.0

vinhnx | snorkel.ai
2 points | 0 comments

The article discusses Snorkel AI's development of Terminal-Bench, a benchmark for evaluating the capabilities of coding agents, and the company's role in building the next generation of benchmarks for advanced language models.