About This Opportunity
Are you detail-oriented with experience in software, appliances, or system engineering? Are you passionate about designing and operating cloud-native production services at scale? Do you enjoy working with open-source projects and distributed system design, focusing on operational stability and performance?
The EWS Site Reliability Engineering (SRE) team offers an excellent opportunity in a fast-paced, innovative, and highly collaborative technical environment. We invite you to explore our Ericsson Web Services (EWS) offering to learn more.
We are looking for an open-minded, self-driven team member to join a group of top-tier SREs in Shared Development Clouds. With a solution-oriented attitude and an engineering-focused culture, we are responsible for architecting, designing, implementing, deploying, and operating full-stack cloud-native infrastructure and platforms—from hardware to microservices—to enable Ericsson's cloud-native development environment and 5G business.
What You Will Do
* Integrate open-source projects within the CNCF landscape.
* Design, implement, test, and operate high-quality, failure-resistant private clouds using Kubernetes and the cloud-native ecosystem, ensuring reliability and performance at scale.
* Automate and build advanced CI/CD platforms and software supply chains.
* Develop monitoring, logging, and alerting, and respond to issues proactively.
* Build data center networks, Kubernetes CNI solutions, and distributed storage systems such as Ceph and Kubernetes CSI drivers.
* Engage in scaling, performance tuning, systematic problem-solving, cloud-native security, and Kubernetes hardening.
What You Will Bring
* A degree in Electrical Engineering, Computer Science, Software Engineering, Telecommunications, or a related technical field.
* A commitment to continuous learning and a passion for open-source technology.
* A knack for creating tools to automate routine tasks; organized and meticulous.
* Ability to thrive in a team setting at a startup pace, with an open mind and self-drive.
* Flexibility to manage on-call duties.
* Expertise in Linux (RHEL, SLES, Ubuntu); Linux kernel knowledge is a plus.
* Proficiency in at least one programming language (e.g., Go, Python, C/C++).
* Experience in research, software development, platform development, hardware or appliances, IT, service operations, and cloud operations.
* Skills in Linux system administration and network administration.
* Knowledge of data center infrastructure, replication, scaling, and performance tuning.
* Familiarity with metrics, monitoring, and integrating open-source tools.
* Experience with CI/CD tooling and release engineering.
* Experience with public clouds (Azure, AWS, GCP) and comfort with Go, Python, Bash scripting, etc.
* Experience with tools and practices such as Git, GitLab, Docker, Rancher, Jenkins, ELK, Redis, Spinnaker, GitOps, Kubeflow, and Ceph.
* Knowledge of Infrastructure as Code tools like Ansible and Terraform.
* Deep knowledge of cloud-native and Kubernetes ecosystems.
* Experience with eBPF and familiarity with Kubeflow/TensorFlow.
* Understanding of cloud technologies: compute, storage, network, database, and security.