 
        
xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands‑on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All engineers are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.
About the Role
xAI is building at a furious pace with the latest hardware to help people understand the universe, and we need Network Development Engineers (NDEs) with at least 3 years of experience deploying or operating large‑scale production data‑center or backbone networks.
You will own the availability and/or the deployment of production networks for 𝕏 and xAI, including data‑center and backbone networks and the primary front‑end and back‑end networks that train Grok and that our customers use for inference. Deployment Engineers will own all aspects of planning and building greenfield and brownfield network deployments. Operations Engineers will own timely mitigation of network impairments at all layers of our network, and the return to service of network hardware and capacity. You will be expected to participate in a team on‑call rota and to contribute to scaling and maintenance efforts.
Responsibilities
 * Deploying or operating scalable network architectures for AI/HPC workloads, inter‑DC and backbone network fabrics.
 * Being a power user of, and iterating on, software and tooling for network operations, network deployment, and monitoring.
 * Collaborating with cross‑functional teams on data‑center and backbone build‑outs and optimizations.
 * Analyzing performance and availability metrics to identify and resolve bottlenecks, availability impairments or inefficient build processes.
 * Ensuring high availability, fast deployability, and high security of production networks.
Required Qualifications
 * A minimum of 3 years of experience deploying or operating hyperscale networks.
 * Hands‑on experience with networking protocols and tools (e.g., BGP, OSPF, ZTP).
 * Experience with Python scripting and with automating tasks, acquiring metrics, and analyzing large data sets.
 * Strong problem‑solving skills and ability to thrive in a fast‑paced, ambiguous setting.
 * Bachelor’s degree in Computer Science, Electrical Engineering, or a related field (or equivalent experience).
Preferred Qualifications
 * Experience designing hyperscale network infrastructure or large‑scale GPU clusters and automating their entire deployment process.
 * Proven track record in leading on‑call rotations, incident response, and team development in high‑stakes environments.
 * A working understanding of RoCEv2.
Interview Process
After submitting your application, the team reviews your CV and statement of exceptional work. If your application passes this stage, you will be invited to an initial interview (45 minutes – 1 hour). If you clear the initial phone interview, you may enter the main process, which consists of four interviews:
 * Coding interview
 * Network engineering technologies interview
 * Manager interview
 * Meet and greet with the team, including a presentation of a large‑scale solution or problem you owned from start to finish.
Our goal is to finish the main process within one week. We do not rely on recruiters for assessments. Every application is reviewed by a member of our technical team. All interviews will be conducted via Google Meet or in person.
xAI is an equal opportunity employer.
California Consumer Privacy Act (CCPA) Notice
Seniority Level
Mid‑Senior level
Employment Type
Full‑time
Job Function
Information Technology
Industries
Technology, Information, and Internet