Job Role

Senior DevOps Engineer (Platform / SRE / AWS)

Infrastructure & Automation
- Design and manage AWS infrastructure using Terraform (ECS, RDS, Redis, Kafka, networking, IAM).
- Own service deployment patterns on ECS / Fargate.
- Build safe, repeatable environments (dev, staging, prod).
- Manage VPC architecture, service discovery, secrets, and access controls.

Reliability & Operations
- Define and implement SLIs, SLOs, and error budgets.
- Build alerting and incident response playbooks.
- Improve system resilience against:
  - service crashes
  - network failures
  - dependency latency
  - traffic spikes
- Lead incident response and postmortems.
- Reduce MTTR through automation and tooling.
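
To make the SLO and error-budget items above concrete, here is a minimal Python sketch of the arithmetic involved. It is illustrative only: the function names `error_budget_minutes` and `budget_remaining`, the 30-day window, and the 99.9% target are assumptions chosen for the example, not requirements of the role.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of a request-based error budget still unspent.

    Returns a negative value when the budget has been overspent.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # Hypothetical numbers: a 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999, 30), 1))
    # 250 failures out of 1,000,000 requests spends about a quarter of the budget.
    print(round(budget_remaining(0.999, 1_000_000, 250), 2))
```

In practice a calculation like this would feed alerting thresholds or release gates; the sketch only shows the underlying arithmetic.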

Observability
- Implement structured logging, metrics, and distributed tracing.
- Instrument services and infrastructure for performance and reliability visibility.
- Own dashboards and alerts for critical systems.

CI/CD & Release Engineering
- Build and maintain CI/CD pipelines.
- Improve deployment safety (rollbacks, canary releases, and blue-green deployments where needed).
- Standardize build and release workflows.
- Enable high deployment velocity without compromising operational safety.

Platform Tooling & Automation
- Build internal tools for infra lifecycle management, cost monitoring, and scaling.
- Automate provisioning, scaling, and recovery workflows.
- Write Python and other scripting utilities at the boundary between infrastructure and runtime systems.