Meta is seeking a Production Engineer with an in-depth understanding of networking, systems, automation, and tooling to join the PE Network team. This team is responsible for deploying and managing one of the world’s largest and most complex networks. Meta’s network is foundational to achieving the company's AI goals, and this role is central to supporting it. Given the scale and demands of our infrastructure, automation is critical. In this position, you will design, develop, and implement automation and tooling to streamline network operations while ensuring the scalability and reliability of Meta’s global network. You’ll collaborate with top engineers in the industry to build and maintain the systems that power this network, supporting billions of users across our applications.
Responsibilities
Conceptualize, build, and maintain automation and tools to support the next generation of network products, network deployment, release engineering and operations
Develop operational process improvements and implement them in scalable, automated workflows to enhance operational efficiency
Design and develop solutions that scale across a variety of network platforms
Lead enhancements of automation for continuous integration, validations, testing infrastructure, release, and configuration management across our global data center network fleet
Conduct thorough investigations into complex technical issues across networks, ranging from automated tooling to hardware failures and network issues
Participate in a weekly on-call rotation with the team and be an escalation contact for your service
Proactively find operational gaps that impact the efficiency of your team, develop an execution plan, and drive the project both directly and by influencing other team members
Contribute to team growth and development through peer mentorship
Minimum Qualifications
Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
Experience developing software to automate operations
5+ years of experience developing and understanding network device configurations for at least one network vendor (e.g. Arista, Juniper, Cisco, Brocade, Ciena, Infinera, Nokia)
5+ years of coding experience in at least one programming language (e.g. Python, Go, C++)
Demonstrated knowledge of TCP, IPv4/6, Routing Protocols (one or more of BGP, MPLS, ISIS, or similar), or related network services (e.g. DHCP and DNS)
Preferred Qualifications
Master's degree or graduate work experience in Computer Science, Computer Engineering, or a related technical field
6+ years of experience building software solutions for managing network infrastructure, with a focus on scalability and reliability
In-depth knowledge of software and network debugging, profiling, and instrumentation techniques to ensure optimal system performance
Proven experience designing, developing, and operating distributed systems at scale, with an in-depth understanding of the challenges and opportunities in this space
Experience designing and maintaining automated testing infrastructure to ensure the quality and reliability of our systems
Knowledge of IB/RDMA/RoCE networks, including RDMA congestion control mechanisms, and of AI training workloads and the demands they place on networks
About Meta
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology.
Individual compensation is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base rate and do not include bonus, equity or incentives. Meta offers benefits.