METR seeks Research Lead – Eval Execution to work in Berkeley, CA. Draft, evaluate & manage methods for evaluating ML systems; Develop algorithms for enhancing AI systems; Identify weaknesses in evals & propose improvements; Run ML experiments on model performance; Manage team; Publish results.
Requires Master's in CS & 1 year of experience in job offered or similar managerial-level AI eval research position. Requires working knowledge of LLM agent evals & methods, scaffolding tools for LLM-based agents & knowledge of scoring functions to analyze agent performance, JSON, Docker, Node.js, PostgreSQL, Git, GitHub, Python, YAML, basic frontend development, LLM API development, deep learning, neural network architecture, alignment techniques, item response theory, experimental design, protocol analysis, factor analysis, & thematic analysis. Occasional local telecommuting permitted.
$209,181/yr + benefits. Send resume to Kris Chari, Operations, 440 N Barranca Ave. # 3345, Covina, CA 91723 or [email protected].
Please note that JOBS.NOW is an independent website. It does not post these listings on behalf of any employer, nor does it receive any compensation for them. All listings are sourced via media or internet channels required by the PERM process.