About your skills
* Scope: You own the end-to-end development of systems by prioritizing work, understanding user requirements and understanding the tradeoffs between design decisions. You understand the needs of the organization as a whole. You collaborate with stakeholders across the organization to ensure project success.
* Decision Making: You use critical thinking to follow a defined decision-making process and consider multiple perspectives. Once you make a decision, you communicate it clearly and ensure everyone is aligned on execution.
* Coaching: You identify opportunities to improve processes and standards for the team and Engineering org as a whole. You work with stakeholders to execute on these opportunities.
About this role
* At Udemy, the SRE team manages infrastructure from our CDN all the way back to our datastores. In between, we own load balancers, Kubernetes clusters and CI/CD.
* We build tools to accommodate the needs of our internal customers.
* We respond to incidents and drive standards of reliability across the organisation.
* A principal engineer is expected to be a leader across the organisation. They contribute directly to long-term planning in close collaboration with Architects and Engineering leadership.
* The oversight of a principal engineer extends beyond the boundaries of the SRE team, and they are involved in reviewing architectural decisions for teams right across the Engineering organisation.
What you'll be doing
* You'll take the lead in identifying opportunities and in designing and executing projects.
* You'll act as a mentor to other engineers on the SRE team.
* You'll champion SRE best practices.
* You'll participate in an on-call rotation.
* You'll represent SRE across Engineering.
What you'll have
* Extensive experience managing Kubernetes clusters and cloud environments.
* Extensive experience using infrastructure as code tools to deploy infrastructure.
* Experience designing bespoke solutions for greenfield projects.
* Experience writing tools and applications using programming languages such as Python, Golang and Kotlin.
* Experience with on-call and incident management. We promote a blameless culture here at Udemy.
* Experience working with a wide variety of engineering teams to guide them on best practices.
* Good communication skills and an ability to both share and receive feedback in a responsible manner.