Principal ML Engineer
The Opportunity
As a Principal ML Engineer, you will lead the technical architecture and engineering strategy for integrating sophisticated AI into high-stakes Healthcare Information Systems (HIS).
We are looking for a seasoned builder who prioritizes reliability, system performance, and automated scalability over hype.
While many focus on the "science" of modeling, your mission is the engineering of the ecosystem.
You will architect the robust MLOps pipelines and cloud infrastructure required to move models from experimental notebooks into mission-critical clinical environments.
You are the bridge between raw data and resilient, production-grade AI services.
Key Responsibilities
MLOps & System Architecture
• Production Lifecycle: Lead the design and implementation of end-to-end ML lifecycles, focusing on automated CI/CD pipelines, model versioning (MLflow/DVC), and reproducible experimentation.
• Inference at Scale: Architect high-performance serving layers for both LLMs and classical models, ensuring low latency and high availability in a secure healthcare cloud environment.
• Agentic Orchestration: Build the underlying infrastructure for agent-based reasoning systems, ensuring these "Agentic" workflows are traceable, auditable, and integrated into existing HIS.
Data Engineering & Infrastructure
• Data Reliability: Design robust data pipelines (ETL/ELT) to process healthcare-specific formats (FHIR, HL7, DICOM) into high-quality features for real-time and batch inference.
• Hybrid Infrastructure: Manage and optimize cloud-native infrastructure (AWS/Azure/GCP) using Infrastructure as Code (Terraform/Pulumi) to support heavy compute workloads.
• System Integrity: Implement comprehensive monitoring and observability frameworks to detect data drift, model decay, and system bottlenecks before they impact clinical outcomes.
Technical Leadership & Governance
• Engineering Authority: Serve as the lead architect for the ML platform, ensuring all systems are HIPAA/HITRUST compliant and follow "security-by-design" principles.
• Operational Excellence: Establish rigorous standards for code quality, containerization (Docker/Kubernetes), and system documentation across the engineering organization.
• Strategic Mentorship: Elevate the team by fostering a culture of "ML as Engineering," guiding junior engineers in building maintainable, modular, and scalable software.
Candidate Profile
Education & Experience:
• Academic Background: Master's or Ph.D. in Computer Science, Software Engineering, or a related technical field.
• Proven Track Record: 10+ years of experience in software engineering, with at least 6 years dedicated to deploying and maintaining large-scale ML systems in production (not just research or POCs).
Core Technical Stack:
• MLOps & Cloud: Expert-level experience with cloud providers (AWS/GCP/Azure) and orchestration tools (Kubernetes, Kubeflow, or Airflow).
• Engineering & Programming: Expert-level Python and Java/Go (or similar). Deep proficiency in backend frameworks and system design patterns.
• Data Engineering: Strong experience with Spark, Snowflake/Databricks, and building scalable feature stores.
• Applied AI: Hands-on experience deploying Generative AI (LLMs) and agentic frameworks (LangChain/LangGraph) within a containerized microservices architecture.
The "Principal" Edge (Preferred):
• Hardware Optimization: Experience with GPU optimization, quantization, or specialized serving frameworks (vLLM, TGI).
• Security & Compliance: Deep understanding of cybersecurity best practices within regulated industries (Healthcare, Finance, or Defense).
• Distributed Systems: Proven ability to design systems that handle massive concurrency and distributed data processing.