Open-source evaluation tooling for LLMs
Remote
Helix Labs maintains a suite of open-source evaluation and red-teaming tools that AI teams use to benchmark model quality and safety and to catch regressions before models ship. The project is community-driven, with enterprise support for teams running evals at scale.