We are seeking a hands-on Senior DevOps Engineer to strengthen our Kubernetes platform operations and CI/CD ecosystem.
The engineer will help scale cloud-native infrastructure, improve deployment pipelines, enforce Infrastructure as Code (IaC) standards, and enhance operational resilience. This is a production-facing role requiring strong troubleshooting skills, an ownership mindset, and practical experience operating mission-critical workloads in AWS.
Responsibilities
- Administer and support Kubernetes clusters running production workloads
- Enhance cluster scalability, reliability, and performance through resource management, autoscaling, and workload isolation
- Advance observability by integrating metrics, logging, and tracing for greater operational insight
- Assist with onboarding and platform enablement for multiple teams
- Develop, modify, and extend GitHub Actions pipelines to improve modularity, maintainability, and governance
- Build reusable workflows and uphold repository standards
- Reduce deployment risks by automating validation and testing processes
- Optimize pipeline efficiency and manage operational costs
- Create and maintain Terraform-based infrastructure with emphasis on state management, modularity, and version control
- Enforce IaC governance and conduct reviews of infrastructure code changes
- Provide support for environment provisioning and lifecycle management
- Administer and enhance AWS services including networking, IAM, compute, and storage
- Strengthen secrets management and secure configuration practices
- Participate in cloud cost optimization efforts
- Maintain production system stability and resilience
- Improve access controls and secure handling of sensitive information
- Apply DevOps and SRE methodologies to production systems
- Troubleshoot incidents and perform root cause analysis
- Lead projects to boost system reliability and operational maturity
Requirements
- At least 3 years of experience in DevOps roles with a focus on cloud infrastructure
- Extensive hands-on experience with Kubernetes in production environments
- Demonstrated ability to manage AWS cloud infrastructure for critical workloads
- Advanced proficiency with GitHub Actions or similar CI/CD tools for pipeline automation
- Strong skills with Terraform for Infrastructure as Code, including state management and modular design
- Experience deploying and operating cloud-native systems in live production settings
- Excellent troubleshooting and debugging skills for complex technical issues
- Comprehensive understanding of DevOps and SRE principles, including reliability engineering practices
- Strong written and spoken English communication skills, at B2+ level or higher
Nice to have
- Experience working in regulated or healthcare sectors, with knowledge of compliance and security requirements
- Familiarity with observability tools such as Prometheus and Grafana for monitoring and alerting
- Understanding of cost optimization strategies in AWS for efficient resource utilization
- Knowledge of platform engineering concepts and internal developer platforms to enhance team productivity