Engineering manager

Dublin

Anthropic

Engineering manager

€80,000 - €120,000 a year

Posted: 28 August

Offer description

Who you are

* Have experience managing and scaling reliability or infrastructure engineering teams
* Possess deep technical knowledge of distributed systems observability and monitoring at scale
* Understand the unique challenges of operating AI infrastructure and can guide technical decisions
* Have successfully implemented SLO/SLA frameworks and can drive adoption across organizations
* Bring experience with both traditional infrastructure metrics and AI-specific performance indicators
* Can effectively lead technical discussions while translating between ML engineers and infrastructure teams
* Have excellent leadership and communication skills, with the ability to influence at all levels
* Demonstrate strong hiring and talent development capabilities
* Have managed teams operating large-scale model training or serving infrastructure (>1000 GPUs)
* Bring hands-on experience with ML hardware accelerators (GPUs, TPUs, Trainium, etc.)
* Understand ML-specific networking optimizations and their operational implications
* Have led teams through major reliability transformations or infrastructure migrations
* Possess experience building reliability engineering practices from the ground up
* Have contributed to or led open-source infrastructure or ML tooling initiatives
* Demonstrate thought leadership in the reliability engineering community

What the job involves

Anthropic is seeking an experienced engineering leader to manage our Reliability Engineering team. This team includes Software Engineers and Systems Engineers focused on defining and achieving reliability metrics for all of Anthropic's internal and external products and services. As a manager, you'll lead the team that's significantly improving reliability for Anthropic's services while pioneering the use of modern AI capabilities to reengineer how we approach reliability engineering. This leadership role is critical to Anthropic's mission to bring groundbreaking AI technologies to benefit humanity in a safe and reliable way.

* Lead and grow a team of reliability engineers responsible for large language model serving and training systems
* Drive the development of service level objectives (SLOs) that balance availability/latency with development velocity across the organization
* Oversee the design and implementation of comprehensive monitoring systems for availability, latency and other critical metrics
* Guide your team in architecting high-availability language model serving infrastructure capable of supporting millions of external customers and high-traffic internal workloads
* Lead the strategy for automated failover and recovery systems across multiple regions and cloud providers
* Establish and manage incident response processes for critical AI services, ensuring your team drives rapid recovery and systematic improvements
* Direct cost optimization initiatives for large-scale AI infrastructure, with a focus on accelerator (GPU/TPU/Trainium) utilization and efficiency
* Partner with cross-functional teams to align reliability engineering efforts with broader company objectives
* Build a strong engineering culture focused on reliability, operational excellence, and innovation

Benefits

* Comprehensive health, dental, and vision insurance for you and your dependents
* Inclusive fertility benefits via Carrot Fertility
* Generous subsidy for OneMedical
* 21 weeks of paid parental leave
* Unlimited PTO
* Optional equity donation matching at a 3:1 ratio, up to 50% of your equity grant
* 401(k) plan with 4% matching
* $500/month flexible wellness stipend
* Commuter coverage
* Annual education stipend
* A home office improvement stipend when you first join
* Relocation support for those moving to the Bay Area

