Overview

We are seeking a Principal Kubernetes Architect to serve as a senior technical authority for the design, evolution, and operation of InterSystems' Kubernetes-based cloud platforms.
This role is responsible for defining reference architectures, establishing engineering standards, and solving the most complex platform challenges across hybrid and multi-cloud environments.
As a Principal Architect, you will operate at both a strategic and a hands-on level.
You will design large-scale Kubernetes platforms, influence cloud and platform strategy, and mentor engineers across DevOps, Site Reliability Engineering, and platform teams.
You will play a critical role in enabling secure, resilient, and highly automated Kubernetes environments that power mission-critical managed services and SaaS offerings.
This role is intended for a deeply experienced Kubernetes and cloud professional who has designed, built, and operated production platforms at scale and who acts as a technical leader across organizations.

Key Responsibilities

Platform Architecture and Technical Leadership
- Act as a senior technical authority for Kubernetes architecture across managed services and SaaS platforms
- Define and maintain reference architectures and design patterns for Kubernetes platforms across on-premises, private cloud, and public cloud environments
- Lead architectural reviews and provide guidance on cluster design, networking, security, scaling, and multi-region strategies
- Influence long-term platform strategy, tooling selection, and architectural direction in partnership with platform and cloud leadership

Kubernetes Platform Engineering
- Design, build, and evolve Kubernetes clusters using platforms such as EKS, AKS, GKE, Rancher, or equivalent enterprise distributions
- Establish best practices for cluster lifecycle management, upgrades, multi-tenancy, and workload isolation
- Architect and implement advanced Kubernetes networking models, including CNI plugins, ingress controllers, and network policies
- Design secure RBAC models, secrets management approaches, and workload isolation aligned with enterprise security requirements
- Ensure platform changes are engineered, tested, versioned, and rolled out with the same rigor as application software

Infrastructure as Code and Automation
- Define and enforce Infrastructure as Code standards using Terraform, Helm, and GitOps-based workflows
- Design reusable and composable infrastructure modules to enable consistent and repeatable platform deployments
- Drive automation-first approaches to cluster provisioning, configuration, and lifecycle management
- Ensure platform changes are versioned, tested, and delivered through controlled CI/CD pipelines

Reliability, Observability, and Operations
- Architect observability solutions for Kubernetes platforms using Prometheus, Grafana, Loki, Fluentd or Fluent Bit, and related tooling
- Define strategies for monitoring, alerting, capacity planning, and performance optimization
- Lead troubleshooting of complex platform incidents involving cluster degradation, networking issues, or systemic failures
- Partner with Site Reliability Engineering teams to establish service level objectives, error budgets, and reliability engineering practices

Cloud, Hybrid, and Data Protection
- Design Kubernetes solutions that operate consistently across public cloud and on-premises environments
- Architect backup, restore, and disaster recovery strategies using tools such as Velero, Kasten, or Stash
- Address cloud-specific constraints related to identity, networking, storage, and cost optimization

Mentorship and Cross-Team Influence
- Mentor senior engineers and architects across platform, DevOps, and Site Reliability Engineering teams
- Act as a trusted advisor to application teams on cloud-native and containerization strategies
- Contribute to technical standards, documentation, and internal enablement
- Represent platform architecture in cross-functional design reviews and technical forums

Qualifications

Experience Required
- Ten to fifteen or more years of experience in infrastructure, cloud, platform engineering, or Site Reliability Engineering roles, with deep Kubernetes specialization
- Proven experience architecting and operating large-scale Kubernetes platforms in production environments supporting mission-critical workloads
- Strong hands-on experience with AWS, Azure, and/or Google Cloud Platform, including networking, identity, and storage services
- Expert-level knowledge of Kubernetes internals, scheduling, networking, security, and storage
- Deep experience with Infrastructure as Code and automation using Terraform, Helm, GitOps, and CI/CD systems
- Strong Linux and container runtime expertise
- Demonstrated ability to solve complex technical problems and influence architectural decisions across teams

Preferred Certifications
- Certified Kubernetes Administrator
- Certified Kubernetes Security Specialist
- Certified Kubernetes Application Developer
- AWS Certified DevOps Engineer - Professional
- Google Professional Cloud DevOps Engineer
- HashiCorp Certified: Terraform Associate
- Linux Foundation Certified System Administrator

Nice to Have
- Experience with GitOps tooling such as Argo CD or Flux
- Experience with service mesh technologies such as Istio, Linkerd, or Cilium
- Experience with Spectro Cloud or enterprise Kubernetes management platforms
- Open source contributions to Kubernetes or cloud-native projects
- Experience supporting regulated or compliance-driven environments
- Familiarity with emerging container platforms such as Incus