Full-Time

On-Site

Lead Site Reliability Engineer

Royal Bank of Canada - Jersey City; Toronto, Canada; Royal Bank of Canada - MONTRÉAL

Description

This role is for a technical engineer specializing in Site Reliability Engineering (SRE) practices, cloud technologies, and SaaS application deployment. The individual will be responsible for ensuring high system reliability, scalability, and performance through automation and innovation. Key aspects include applying SRE principles, managing incidents, and ensuring observability to meet business and user needs.

What We're Looking For

Collaborate with Quality Engineering, DevOps, Development, IT, and Cloud teams to align SRE practices.

Design, implement, and maintain reliable, scalable systems for high availability and performance.

Monitor system health, identify bottlenecks, and proactively resolve issues.

Develop and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs).

Architect, deploy, and manage cloud-based infrastructure (AWS, Azure, GCP).

Optimize cloud resources for cost efficiency and performance.

Implement Infrastructure as Code (IaC) using tools like Terraform, CloudFormation, or Pulumi.

Set up and configure new SaaS applications and integrate with existing systems.

Automate deployment pipelines using CI/CD tools (e.g., Jenkins, GitHub Actions, GitLab CI/CD).

Write clean, efficient, and maintainable code in languages such as Python, Go, Java, or Ruby.

Develop automation scripts for repetitive tasks, monitoring, and incident response.

Lead incident response efforts, including root cause analysis and post-mortem reviews.

Implement robust monitoring and alerting systems (e.g., Prometheus, Grafana, Datadog, New Relic).

Design and implement observability solutions for deep system insights.

Implement security best practices for cloud and SaaS environments and ensure compliance (e.g., GDPR, SOC 2, ISO 27001).

Conduct regular security audits and vulnerability assessments.

Document processes, workflows, and best practices to foster knowledge sharing.

Mentor junior team members and contribute to a culture of continuous learning.

Ideal Candidate

Proficiency in programming languages such as Python, Go, Java, or Ruby.

Strong understanding of cloud platforms (AWS, Azure, GCP) and their services.

Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).

Hands-on experience with CI/CD pipelines and tools (e.g., Jenkins, GitHub Actions, GitLab CI/CD).

Knowledge of Infrastructure as Code (IaC) tools (e.g., Terraform, CloudFormation, Pulumi).

Proven experience in applying SRE principles to improve system reliability and scalability.

Experience in incident management, root cause analysis, and post-mortem processes.

Proven experience in deploying and managing SaaS applications.

Familiarity with SaaS integration and API management.

Experience with monitoring tools (e.g., Dynatrace, Prometheus, Grafana, Datadog, New Relic).

Strong scripting skills in Bash, Python, or similar languages.

Experience in automating repetitive tasks and workflows.

Bachelor's degree in Computer Science, Engineering, or a relevant field (nice-to-have).

Minimum Education

Bachelor's Degree (preferred)

Hard Skills

Python

Java

Ruby

AWS

Azure

GCP

Docker

Kubernetes

Jenkins

GitHub Actions

GitLab CI/CD

Terraform

CloudFormation

Pulumi

Dynatrace

Prometheus

Grafana

Datadog

New Relic

Bash

Salesforce

Flosum

Agile Methodology

Application Infrastructure

Atlassian JIRA

Automation

Cloud Platform

Cloud Technology

DevOps

IT Automation

IT Monitoring

Operations Support

PagerDuty

Production Support

Site Reliability Engineering

Software Development Life Cycle (SDLC)

Software Engineering

Software Product Technical Knowledge

System Applications

Systems Software

Soft Skills

Problem-solving

Troubleshooting

Communication

Collaboration

Group Problem Solving

Teamwork

Strategic thinking

Interpersonal skills

Work Hours

37.5 hours/week

Benefits

Comprehensive Total Rewards Program (bonuses, flexible benefits, competitive compensation, commissions, stock)

Leaders who support development through coaching and managing opportunities

Ability to make a difference and lasting impact

Work in a dynamic, collaborative, progressive, and high-performing team

World-class training program in financial services

Flexible work/life balance options

Opportunities to do challenging work

Also Available At

Toronto, Canada

Toronto, Ontario

Royal Bank of Canada - MONTRÉAL

MONTRÉAL, QC

About the Company

Royal Bank of Canada

Royal Bank of Canada is a global financial institution with a purpose-driven, principles-led approach to delivering leading performance. As Canada's largest bank, it provides personal and commercial banking, wealth management, and capital markets services to over 17 million clients worldwide.

Purpose-driven

Inclusive

Innovative

Collaborative

Professional
