Sr System Reliability Engineer (Application Support + Automation)
Who are we
Fulcrum Digital is an agile, next-generation digital acceleration company that provides digital transformation and technology services, from ideation to implementation. These services apply across a variety of industries, including banking & financial services, insurance, retail, higher education, food, healthcare, and manufacturing.
The Role
* Plan, manage, and oversee all aspects of a Production Environment
* Define strategies for application performance monitoring and optimization in the production environment.
* Respond to incidents, improve the platform based on feedback, and measure the reduction of incidents over time.
* Support deployment of code into multiple lower environments, and support current processes with an emphasis on automating everything as soon as possible.
* Design, develop and standardize monitoring and alerting mechanisms for the supported applications.
* Take a holistic approach to problem solving by connecting the dots during a production event across the various technology stacks that make up the platform, to optimize mean time to recover.
* Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation and refinement.
* Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns.
* Support services before they go live through activities such as system design consulting, capacity planning and launch reviews.
* Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead in DevOps automation and best practices.
* Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
* Scale systems sustainably through mechanisms like automation and evolving systems by pushing for changes that improve reliability and velocity.
* Work with a global team spread across tech hubs in multiple geographies and time zones.
* Share knowledge and explain processes and procedures to others.
* Mentor junior resources.
* Perform on-call duties on a rotational basis.
* Occasional off-hours work is required.
Requirements
Skills -
Must Have
* Linux
* Mainframe
* Shell Scripting
* ITIL / ITSM, Application Troubleshooting
* SQL
* Any monitoring tool (Splunk/Dynatrace preferred)
* Jenkins - CI/CD
* Groovy scripting/YAML - basic
* Git/Bitbucket - basic
* Ansible/Chef - good to have
Good To Have
* Event Framework architecture
Seniority level
* Not Applicable
Employment type
* Full-time
Job function
* Information Technology
Industries
* IT Services and IT Consulting