Senior Customer Reliability Engineer

at Isovalent
Location Dar es Salaam, United Republic of Tanzania
Date Posted January 1, 2024
Category Engineering
IT / Information Technology
Job Type Full-time
Currency TZS

Description

The Community you will join:

The Customer Reliability Engineering team sits under the Customer Success organization at Isovalent. The Customer Success team is composed of the Customer Reliability practice and the Solutions Architect practice. The goal of this community is to make Isovalent customers successful in their adoption journey of Cilium. This community drives deep customer engagement from initial architecture inception, through enablement, and finally into lowering operational anxiety as our customers head into production.

About this role:

As a Customer Reliability Engineer (CRE), you are the tip of the spear in interacting with our customers. Our CRE team adapts the best practices of Site Reliability Engineering (SRE) and applies them to our customers. As part of the role, you will gain a deep understanding of our customers, from their architecture to their various configurations. The main mission of this role is to ensure that our customers can continue running Cilium Enterprise reliably and at scale. You will work with various stakeholders, internal and external, to provide world-class support and incident resolution, and to enhance our organization’s view into the health of our customers. This role takes a proactive rather than a reactive approach to customer reliability: you will use existing data to help us and our customers stay aware of upcoming reliability risks.

A typical day:

  • Reduce our customers’ production operational anxiety to near zero
  • Collaborate with our solutions architects and engineering team to provide resolution to customer incidents
  • Develop knowledge base articles that help our customers resolve previously identified issues faster
  • Collaborate with our documentation team to promote any existing knowledge base articles to our official documentation site
  • Conduct production readiness reviews with customer success team members and customers as they prepare to go into production
  • Leverage data to assess any reliability impact on our customer base and provide critical communication to customers around Cilium Enterprise to maintain a high level of production reliability
  • Create new customer reproduction environments and, when necessary, enhance existing automation modules or create new ones
  • Lead retrospective activities for high-severity customer incidents
  • Participate in an on-call rotation including weekends as necessary

Requirements

Your Expertise:

  • In-depth understanding of advanced Kubernetes architecture and its components
  • Proven history in troubleshooting and resolving issues with Kubernetes and cloud-native technologies at a large production scale
  • Comprehensive knowledge of Customer Reliability Engineering (CRE) practices, including Production Readiness Reviews (PRRs), Customer Test Environments (CuTEs), tooling, monitoring, knowledge base creation, and retrospectives
  • Extensive experience with at least one major cloud provider (AWS, Azure, or GCP)
  • Strong familiarity with best practices in operating system security and their application in cloud-native technologies
  • Commitment to knowledge sharing through the creation of valuable content for customers and internal stakeholders
  • Emphasis on utilizing automation technologies for efficiency in handling repetitive tasks