SlopCodeBench: Benchmarking How Coding Agents Degrade over Long-Horizon Tasks
(arxiv.org)
1 point | by FiberBundle | 2 hours ago | 1 comment
cestivan | 1 hour ago | [dead]