We are searching for a detail-oriented and experienced Senior SRE Engineer to join our team, helping to build and maintain highly scalable, reliable, and automated systems using Kubernetes and AWS.
You have a curious mindset, a logical approach, persistence, and a passion for growth: the characteristics of a true tech enthusiast. You thrive by mastering your craft and creating sophisticated solutions for intricate challenges. If this resonates with you, consider joining EPAM as a Senior SRE Engineer.
Responsibilities
- Operate and enhance highly reliable systems with zero to minimal downtime
- Build and maintain sustainable, low-maintenance, and scalable infrastructure
- Troubleshoot technical issues tied to deployment, integration, or infrastructure management
- Design and implement automated processes that reduce manual intervention
- Collaborate with engineers and stakeholders to deliver reliable tools and suggest necessary improvements
- Continuously analyze and optimize system performance and scalability
- Follow security best practices, such as enforcing least-privilege access
- Maintain and refine CI/CD pipelines and deployments
- Participate in scheduled on-call rotations to ensure round-the-clock system availability
- Research emerging trends and advances in site reliability and DevOps practices
Requirements
- 3+ years of experience in DevOps, with a focus on CI/CD
- Proficiency with Kubernetes, AWS, and Terraform or comparable Infrastructure as Code tools
- Expertise in Docker and containerization technologies
- Strong Linux system administration and troubleshooting skills
- Familiarity with CI/CD pipelines and associated tools
- Understanding of TLS, mTLS, and security concepts such as the principle of least privilege
- Solid problem-solving abilities for resolving complex infrastructure issues
- Fluency in English (B2 level or higher)