Site Reliability Engineer - IT Operations
Yoba is looking for a Site Reliability Engineer to be our champion for availability, scalability, latency, and efficiency. You will join a team that builds and operates an innovative cloud-based financial services platform for SMEs.
Responsibilities
- Participate in the deployment of a distributed architecture, ensuring security, high availability, and scalability
- Ensure that cloud operations can be executed with no customer downtime
- Collaborate with the product teams to design and develop systems that are resilient and highly performant at scale
- Monitor infrastructure, measuring availability and system health
- Perform blameless root cause analyses of outages and ensure action items are completed
- Collaborate with customer support in recovering from outages
- Troubleshoot complex incidents in highly distributed systems
- Shorten time to detection by improving the accuracy of alarms
- Be a key stakeholder in the design of services so that they are resilient from day one
Requirements
- MS or BS in Computer Science or a related engineering field
- Minimum 2 years in an SRE or similar role
- Experience in designing resilient and fault-tolerant systems
- Experience with at least one of the following: JavaScript, Powerscript, Python
- Experience in debugging complex, distributed systems
- Love for automation
- Fluency in English, written and spoken
We Offer
- Opportunity to take part in a large-scale greenfield development project
- Competitive salary and a generous benefits package
Please send your CV and cover letter to careers@yoba.com