Join a leading global innovator in advanced compute and connectivity technologies at the forefront of powering next-generation AI infrastructure. This role sits at the intersection of software, hardware, and large-scale data center operations, shaping how intelligent accelerator platforms are provisioned, orchestrated, monitored, and managed across complex rack environments.
You will take ownership of designing and delivering rack-level management software that enables reliable, scalable deployment of AI accelerator systems in modern data centers. Working across heterogeneous stacks, you'll collaborate with multidisciplinary teams to build the tooling, services, and automation that ensure performance, resilience, and operational visibility from day one through the entire system lifecycle.
What this role offers
This position offers the opportunity to influence how cutting-edge AI hardware is integrated into real-world infrastructure at scale. You'll work on problems spanning distributed systems, orchestration, observability, and infrastructure automation—contributing to platforms that directly enable the future of AI workloads in production environments.
Key responsibilities
* Architect and build scalable software services that manage provisioning, orchestration, monitoring, and lifecycle operations for rack-based AI systems.
* Develop high-performance components with a strong focus on concurrency, reliability, and fault tolerance in distributed environments.
* Create APIs, automation frameworks, and tooling that integrate with cloud-native and orchestration platforms to streamline infrastructure operations.
* Implement telemetry, logging, and observability capabilities to maintain system health and enable proactive issue detection.
* Apply secure design principles for multi-tenant environments, ensuring network isolation, resource control, and quality of service.
* Contribute to technical design, documentation, and continuous architectural improvements in collaboration with cross-functional engineering teams.
Qualifications and experience
* Several years' experience developing software for distributed systems, infrastructure platforms, or data center environments.
* Strong programming capability in Python and Go; familiarity with C++ is advantageous.
* Solid grounding in networking concepts and protocols, along with experience designing concurrent, high-throughput systems.
* Hands-on exposure to container orchestration and infrastructure automation tools such as Kubernetes, Terraform, or Ansible.
* Experience with monitoring and observability stacks (e.g., Prometheus, Grafana) and an understanding of how to design for operational visibility.
* Knowledge of rack or hardware management interfaces (such as Redfish or IPMI) and awareness of AI workload orchestration frameworks is beneficial.
* A strong problem-solving mindset and comfort working in fast-moving, collaborative engineering environments.
If this opportunity aligns with your experience, motivation, or career ambitions, apply now or contact - for a confidential discussion.
By applying to this role, you understand that we may collect your personal data and store and process it on our systems. For more information, please see our Privacy Notice (https://eu-).
In accordance with local employment laws, applicants must have current, valid authorisation to work in the European Union at the time of application. We are unable to sponsor employment visas for this role. Applications from individuals without existing work authorisation for the European Union cannot be considered.