Job Description
We are looking for a Database Reliability Engineer to join our team. This is not a traditional DBA role — you are a DevOps engineer who happens to have deep SQL and database knowledge. You think in code, manage infrastructure via Terraform, and treat database provisioning, patching, and performance as an engineering problem — not a manual task.
You will own the database infrastructure layer across our AWS environments: provisioning, scaling, observability, and reliability — all through Infrastructure-as-Code. Our current estate includes SQL Server running on Windows EC2 instances. You will partner with application teams to ensure databases are correctly provisioned, optimised, and monitored without becoming a gatekeeper.
You Will
Provision and manage AWS RDS instances entirely through Terraform — parameter groups, subnet groups, IAM authentication, secrets rotation, multi-AZ, and read replicas
Own database reliability — design runbooks, define SLOs, set up alerting on slow queries, connection pool saturation, replication lag, and disk growth
Automate database operations — schema migrations, backup validation, failover drills, and patching via CI/CD pipelines
Improve performance — work with development teams on EXPLAIN ANALYZE, query tuning, indexing strategies, and connection pooling (PgBouncer/RDS Proxy)
Secure the data layer — enforce encryption at rest/in transit, IAM database authentication, credential rotation via AWS Secrets Manager, and least-privilege access patterns
Contribute to the platform — build reusable Terraform modules for database provisioning that application teams self-serve from the platform
Participate in on-call rotation — respond to database incidents, drive RCAs, and implement permanent fixes to prevent recurrence
Qualifications
AWS — RDS Aurora, Secrets Manager, IAM, CloudWatch, VPC networking for databases, KMS encryption
Terraform — writing and maintaining modules, state management, remote backends, version pinning
Scripting — Python or Bash for automation, operational tooling, and migration scripts
CI/CD — GitLab CI or equivalent; using pipelines for infrastructure changes and database migrations
Observability — experience with Datadog, CloudWatch, or Prometheus/Grafana for database metrics and alerting
Linux — comfortable with system-level troubleshooting in production environments
Good to Have
PgBouncer or RDS Proxy — connection pooling configuration and tuning
Kubernetes — understanding how workloads connect to databases, sidecar patterns, secrets injection
Database migrations — zero-downtime schema migrations with tools like Flyway, Liquibase, or custom scripted approaches
Financial services or regulated environments — familiarity with audit logging, data residency requirements, PCI DSS / SOC 2 controls
AWS Aurora Global Database — cross-region replication, disaster recovery patterns
What We Don't Want
Traditional Oracle/MSSQL DBAs focused on GUI-based administration
Engineers who manage databases manually rather than through code
Candidates whose primary experience is "advising" rather than building and operating
You Will Thrive Here If
You've written Terraform to provision an RDS cluster, not just clicked through the console
You've debugged a production slow query using EXPLAIN ANALYZE and fixed it
You think "how do I automate this" before "how do I do this manually"
You're comfortable being on-call and leading database incident response
You want to build platform capabilities that other engineers can self-serve, rather than being a bottleneck