Sr. Site Reliability Operations Engineer
Responsibilities
* Lead incident response for high-severity incidents affecting internal business operations. Serve as Incident Commander to coordinate technical teams, establish impact, and drive rapid service restoration.
* Monitor and troubleshoot enterprise systems including infrastructure, applications, and network components. Use your technical skills to diagnose complex problems across multiple platforms and vendors before they impact users.
* Prepare executive summaries and communicate incident status to leadership up to CDO level. Translate technical details into business language for stakeholders during and after incidents.
* Drive improvements to incident management processes by updating playbooks, creating SOPs, and leading automation initiatives. Mentor junior team members on handling escalated technical issues.
* Coordinate emergency changes and infrastructure updates to resolve incidents. Work with cross-functional teams to maintain business continuity during critical situations.
* Analyze incident data and KPI metrics to identify trends. Develop actionable recommendations to reduce impact duration and improve team performance.
* Participate in on-call rotation as part of regional coverage. Lead incident review meetings and ensure accurate documentation for post‑incident analysis.
Required Experience
* 8+ years in IT operations, incident management, or site reliability work, including proven experience in a 24x7, high-availability environment with enterprise systems.
* Demonstrated ability to lead high-severity incident response under pressure: establishing impact, evaluating solutions with subject matter experts, and making decisions that balance technical and business needs.
* Excellent verbal and written communication skills for technical and executive audiences, including the ability to write clear incident updates, status reports, and executive summaries for leadership.
* Strong technical troubleshooting ability across Windows and Linux servers, networking, cloud platforms, and virtualization technologies. Diagnose problems quickly using logs, monitoring tools, and common diagnostic approaches.
* Experience leading or mentoring technical teams in incident response or operations roles.
* Experience with cloud platforms such as AWS and with monitoring IT infrastructure. Comfortable with core cloud concepts and a range of monitoring tools.
* Ability to propose and design SLIs/SLOs that ensure the reliability and performance of critical systems, in alignment with SRE best practices.
* ITIL 4 certification and a deep understanding of incident, problem, and change management processes.
* Industry certifications such as a public cloud certification (AWS, Azure, or Google Cloud), CCNA, RHCE, or a Microsoft associate-level certification.
* BS in Computer Science or equivalent practical experience. What matters most is proven ability to solve complex problems and lead technical response efforts.
Nice to Have
* Salesforce platform experience and certifications.
* Additional advanced certifications such as AWS Solutions Architect, CCNP, or RHCA.
* Scripting ability in Python, Bash, PowerShell, or similar languages. Experience leading automation initiatives to reduce manual work.
* Advanced experience with monitoring and visualization tools like Splunk, Grafana, or Tableau. Proven ability to analyze data and present insights.
* Experience with automation tools like Puppet or Chef for infrastructure management.
Unleash Your Potential
When you join Salesforce, you’ll be limitless in all areas of your life. Our benefits and resources support you to find balance and be your best, and our AI agents accelerate your impact so you can do your best. Together, we’ll bring the power of Agentforce to organizations of all sizes and deliver amazing experiences that customers love. Apply today to not only shape the future — but to redefine what’s possible — for yourself, for AI, and the world.
Accommodations
If you require assistance applying for open positions due to a disability, please submit a request via the Accommodations Request Form.
Posting Statement
Salesforce is an equal opportunity employer and maintains a policy of non‑discrimination with all employees and applicants for employment. What does that mean exactly? It means that at Salesforce, we believe in equality for all. And we believe we can lead the path to equality in part by creating a workplace that’s inclusive, and free from discrimination. Know your rights: workplace discrimination is illegal. Any employee or potential employee will be assessed on the basis of merit, competence and qualifications – without regard to race, religion, color, national origin, sex, sexual orientation, gender expression or identity, transgender status, age, disability, veteran or marital status, political viewpoint, or other classifications protected by law. This policy applies to current and prospective employees, no matter where they are in their Salesforce employment journey. It also applies to recruiting, hiring, job assignment, compensation, promotion, benefits, training, assessment of job performance, discipline, termination, and everything in between. Recruiting, hiring, and promotion decisions at Salesforce are fair and based on merit. The same goes for compensation, benefits, promotions, transfers, reduction in workforce, recall, training, and education.