Reliability Engineering Lead
Hyderabad, India | Regular | Posted on Dec. 12, 2025 | Closing on Dec. 27, 2025
Position Title: Reliability Engineering Lead
Location: Hyderabad
Role Description (Process-First Responsibilities)
1. Service Level Management & Reliability Framework
Process Owner: SLO-driven reliability decision making across digital services
Establish SLO Foundation: Define, implement, and maintain Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for critical services, ensuring alignment with business impact and patient safety requirements
Error Budget Management: Implement error budget policies that balance feature velocity with reliability, using budget consumption as the primary decision-making tool for release management and incident prioritization
Reliability Governance: Create and maintain reliability standards that comply with GxP, SOX, and other pharmaceutical regulatory frameworks while enabling innovation velocity
Business Impact Correlation: Translate technical reliability metrics into business language, demonstrating clear connections between SLO compliance and revenue, patient safety, or operational efficiency
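Illustration (error budget arithmetic): The error budget policy described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of Sanofi's tooling: it assumes a request-based SLI, and the names SloWindow, error_budget_remaining, release_allowed, and the freeze_threshold parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SloWindow:
    """Request counts observed over one SLO window (hypothetical structure)."""
    slo_target: float      # e.g. 0.999 means 99.9% of requests must succeed
    total_requests: int
    failed_requests: int


def error_budget_remaining(w: SloWindow) -> float:
    """Return the unspent fraction of the error budget (negative if overspent)."""
    if not 0.0 < w.slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if w.total_requests == 0:
        return 1.0  # no traffic yet, so nothing has been spent
    # The budget is the number of failures the SLO tolerates in this window.
    allowed_failures = (1.0 - w.slo_target) * w.total_requests
    return 1.0 - w.failed_requests / allowed_failures


def release_allowed(w: SloWindow, freeze_threshold: float = 0.0) -> bool:
    """Assumed policy: allow releases only while remaining budget exceeds the threshold."""
    return error_budget_remaining(w) > freeze_threshold


# Example: a 99.9% SLO over 1,000,000 requests tolerates about 1,000 failures.
window = SloWindow(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
print(round(error_budget_remaining(window), 2))  # 0.75
print(release_allowed(window))                   # True
```

A threshold above zero (for example 0.1) would freeze releases slightly before the budget is fully spent; the right value is a policy decision rather than something the arithmetic dictates.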
2. Incident Management & Learning Culture
Process Owner: Blameless incident response and organizational learning
Incident Command: Lead critical incident response using structured protocols, focusing on rapid detection, mitigation, and recovery while maintaining detailed audit trails for regulatory compliance
Blameless Postmortem Leadership: Facilitate blameless postmortems that focus on system improvements rather than individual accountability, creating a culture of psychological safety for honest analysis
Learning Repository Management: Maintain and curate incident learning repositories with transparent sharing across digital units, enabling pattern recognition and systemic improvement
Predictive Issue Prevention: Implement proactive monitoring and alerting systems that identify potential failures before they impact users, shifting from reactive to preventive operations
3. Toil Elimination & Engineering Balance
Process Owner: Systematic automation of operational overhead
Toil Measurement & Reduction: Keep operational work (toil) below 50% of the team's working time through systematic identification, measurement, and elimination of manual, repetitive tasks
Automation Strategy: Design and implement automation solutions using cost-benefit analysis, prioritizing work that scales linearly with service growth and requires minimal human judgment
Engineering Project Delivery: Dedicate a minimum of 50% of time to engineering projects that improve reliability, performance, or developer experience, delivering measurable improvements quarterly
Knowledge Transfer: Create self-service documentation, runbooks, and automation tools that reduce dependency on human intervention and enable team scaling
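Illustration (toil measurement): One possible way to track the 50% toil cap stated above is to tag logged work as toil or engineering and compute the share. The sketch below is only an assumption about how such tracking might look; WorkLog, toil_fraction, top_toil_sources, and the sample tasks are hypothetical.

```python
from dataclasses import dataclass

TOIL_CAP = 0.50  # cap taken from the posting: toil stays below 50% of time


@dataclass
class WorkLog:
    """One logged block of work (hypothetical structure)."""
    task: str
    hours: float
    is_toil: bool


def toil_fraction(entries: list[WorkLog]) -> float:
    """Share of logged hours spent on toil; 0.0 when nothing is logged."""
    total = sum(e.hours for e in entries)
    if total == 0:
        return 0.0
    return sum(e.hours for e in entries if e.is_toil) / total


def top_toil_sources(entries: list[WorkLog], n: int = 3) -> list[tuple[str, float]]:
    """Toil tasks ranked by total hours, as candidates for automation."""
    totals: dict[str, float] = {}
    for e in entries:
        if e.is_toil:
            totals[e.task] = totals.get(e.task, 0.0) + e.hours
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Example week with made-up numbers.
week = [
    WorkLog("manual cert rotation", 6, True),
    WorkLog("ticket triage", 8, True),
    WorkLog("autoscaler design", 20, False),
    WorkLog("manual cert rotation", 4, True),
]
print(toil_fraction(week) < TOIL_CAP)  # True (18 of 38 hours are toil)
print(top_toil_sources(week, n=1))     # [('manual cert rotation', 10.0)]
```

Ranking by hours is the simplest prioritization; a fuller cost-benefit view would also weigh how fast each task grows with service load and how much human judgment it needs.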
4. Platform Engineering Integration & AI Enablement
Process Owner: Reliability integration in AI-first platform services
AI Workload Reliability: Design and implement reliability practices for AI/ML workloads, including agent-to-agent communication systems, model serving infrastructure, and data pipeline reliability
Platform Collaboration: Partner with platform teams to embed reliability principles into Internal Developer Platforms (IDPs), enabling self-service infrastructure with built-in reliability guardrails
Agentic System Support: Provide reliability engineering expertise for Sanofi's agentic AI ecosystem, ensuring conversational AI systems meet enterprise reliability and compliance standards
Developer Experience Enhancement: Contribute to CI/CD pipeline reliability, infrastructure-as-code best practices, and observability integration that accelerates developer productivity
5. Observability & Performance Engineering
Process Owner: Comprehensive system visibility and performance optimization
Full-Stack Observability: Implement and maintain observability platforms covering metrics, logs, traces, and business KPIs, providing end-to-end visibility into service health and user experience
Performance Optimization: Conduct systematic performance engineering including capacity planning, bottleneck identification, and scalability improvements aligned with business growth projections
Intelligent Monitoring: Deploy AI-powered monitoring and alerting systems that reduce noise, provide intelligent root cause analysis, and enable predictive maintenance
Cross-System Correlation: Establish monitoring federation across diverse technology stacks (cloud, on-premises, legacy) while maintaining regulatory audit trails
6. Security & Compliance Integration
Process Owner: Reliability practices within regulatory frameworks
Secure Reliability Engineering: Implement reliability practices that enhance rather than compromise security posture, integrating DevSecOps principles with pharmaceutical compliance requirements
Compliance Automation: Automate compliance checks, audit trail generation, and regulatory reporting while maintaining system reliability and performance
Risk Assessment Integration: Conduct reliability impact assessments for changes affecting GxP systems, balancing innovation speed with regulatory validation requirements
Disaster Recovery: Design and test disaster recovery procedures that meet both technical recovery objectives and regulatory continuity requirements
7. Team Leadership
Process Owner: Represent the reliability engineering discipline
Team Development: Build and mentor a team of SREs who can work independently across the key SRE principles
Communication: Provide crisp and strategic updates to the leadership team
Lead by Example: Demonstrate expertise by taking on complex scenarios and providing innovative solutions that can be leveraged by the team, documented for knowledge sharing, and scaled across the organization to drive systematic reliability improvements
Pursue Progress. Discover Extraordinary.
Join Sanofi and step into a new era of science - where your growth can be just as transformative as the work we do. We invest in you to reach further, think faster, and do what’s never-been-done-before. You’ll help push boundaries, challenge convention, and build smarter solutions that reach the communities we serve. Ready to chase the miracles of science and improve people’s lives? Let’s Pursue Progress and Discover Extraordinary – together.
At Sanofi, we provide equal opportunities to all regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, protected veteran status or other characteristics protected by law.
Developing our talent
We believe better is out there – and that extends to how we support people. Our People Strategy prioritizes inclusivity, transparency, and efficiency in talent development. It’s allowed us to chart personalized plans to keep employees and managers on the same page when it comes to training and succession planning. From upskilling to mentorship, we prepare our teams with the resources they need to pursue progress.
Experience possibility
Our culture & values
We're the first in Pharma to have a DE&I board. We also have Employee Resource Groups that create spaces for every Sanofian to be heard. Your voice matters – use it to shape our future.
Why Sanofi
Get access to the tools, training, and support to reach your goals. By fulfilling your potential, you’ll help us achieve our aim of halving the time from discovery to therapy.
Build a career with purpose
Bring your passion to your role and impact millions of people around the world. You're in the driver's seat – just set your goals, and we'll provide the training and support that will get you there.