My client is looking to hire a senior, highly technical Linux specialist to design, build, and operate next‑generation infrastructure platforms supporting mission‑critical systems and advanced compute workloads. This is a hands‑on engineering role focused on architecting and optimizing scalable, resilient, high‑performance infrastructure across CPU and GPU environments. You will work at the intersection of Linux systems engineering, cloud infrastructure, Kubernetes platforms, and accelerator‑based compute.
I am seeking a highly experienced Linux Kernel Engineer with deep expertise in GPU‑enabled systems, cloud infrastructure, and distributed platforms. This role combines low‑level Linux kernel engineering with modern cloud‑native infrastructure, requiring hands‑on experience across kernel modules, GPU drivers, Kubernetes, OpenStack, and large‑scale distributed systems. You will help design and operate high‑performance compute environments optimized for GPU workloads, AI/ML training, and compute‑intensive applications.
A FULL JOB SPEC AND COMPANY PROFILE ARE AVAILABLE UPON REQUEST
Key Responsibilities:
Develop, modify, and debug Linux kernel components, including networking, storage, and performance‑critical subsystems
Work with GPU kernel modules and CUDA driver stacks
Debug kernel panics, driver issues, and performance bottlenecks
Tune kernel parameters for high‑throughput, low‑latency GPU workloads
Optimize NUMA, I/O scheduling, memory management, and interrupt handling
Deploy and operate GPU‑enabled Linux servers at scale
Manage NVIDIA drivers, CUDA stack, and related kernel modules
Optimize GPU scheduling and resource isolation for multi‑tenant workloads
Support performance profiling for AI/ML and HPC workloads
Contribute to and operate OpenStack‑based infrastructure
Manage and optimize virtualization layers for GPU passthrough and SR‑IOV
Key Experience:
5+ years of Linux systems engineering experience
Strong knowledge of Linux kernel internals (networking, storage, memory management)
Experience deploying and operating GPU‑enabled Linux servers
Deep understanding of CUDA drivers and GPU kernel modules
Hands‑on OpenStack development and operations experience
Strong understanding of distributed systems theory and real‑world implementation
Experience with container runtimes (containerd, CRI‑O)
Proficiency in Python or a similar programming language
Experience with Terraform and Ansible
Experience building and maintaining observability stacks