Job Description
Peaple Talent has partnered with a global digital consultancy that is currently recruiting for a Site Reliability Engineer based out of its offices in central London. You will work within a great team of engineers and be responsible for safeguarding production environments, performing all functions of an SRE team: architecture design (or redesign), automation, providing observability tooling, defining and monitoring SLOs, production support, and incident management.

You'll be responsible for:
• Balancing feature development velocity and reliability with well-defined SLOs.
• Running the production environment by monitoring availability and taking a holistic view of system health.
• Driving the incident management process and supporting a blameless post-mortem culture.
• Partnering with development teams to improve services via rigorous testing and release procedures.
• Participating in system design consulting, platform management, and capacity planning.
• Creating sustainable systems and services through automation and uplifts.

You'll have:
• A degree in Computer Science or a related technical field involving coding and/or systems engineering.
• Proficiency in one or more of the following: Go, Python, C, C++, Java, Perl, Ruby, or shell scripting.
• Experience with algorithms, data structures, and software design, and/or experience with UNIX operating system internals and/or networking.
• Excellent communication skills. You'll be able to act as a bridge between internal teams and external stakeholders.
• Excellent problem-solving skills.

It would be great if you have:
• Experience with distributed systems design, maintenance, and troubleshooting.
• Hands-on experience with debugging and optimizing code, as well as automation.
• Strong interpersonal skills, drive, and ownership.
• Coding skills beyond simple scripts.