Full-Time
On-Site

Edge AI Intern, Summer 2026

Description

This internship at RBC SA&I focuses on moving high-performance AI models from the cloud to edge devices. The intern will optimize models, particularly Large Language Models (LLMs) and computer vision models, for low-latency inference on resource-constrained devices while preserving data security and privacy. The work includes researching and implementing model compression, deploying inference engines on local hardware, and optimizing pipelines for real-time financial applications.

What We're Looking For

Research and implement techniques to compress state-of-the-art models (LLMs, CNNs) for edge deployment, utilizing methods such as quantization (INT8, INT4), pruning, and knowledge distillation.
Prototype and deploy inference engines on local hardware (e.g., mobile CPUs/NPUs, edge servers, or embedded systems) using frameworks like ONNX Runtime, TensorFlow Lite, or ExecuTorch.
Analyze and profile model performance to identify bottlenecks; optimize inference pipelines for real-time financial applications (e.g., fraud detection, biometric authentication).
Experiment with split computing strategies to intelligently divide workload between the edge device and the cloud.
Develop rigorous testing suites to measure power consumption, memory footprint, and inference speed across different hardware targets.

Ideal Candidate

Currently enrolled in a Master's program or advanced Undergraduate in Computer Science, Electrical Engineering, or a related field.
Strong proficiency in Python and C++ (specifically for high-performance inference).
Deep understanding of deep learning frameworks (PyTorch or TensorFlow) and their internal mechanics.
Experience with model compression techniques (Quantization, LoRA, etc.).
Familiarity with edge inference runtimes (e.g., ONNX, TensorRT, CoreML, or TFLite).
Research experience or publications in efficient deep learning or systems for ML (Nice-to-have).
Experience with LLM inference optimization (e.g., vLLM, llama.cpp) (Nice-to-have).
Knowledge of hardware-software co-design (understanding how memory hierarchy affects AI performance) (Nice-to-have).
Previous exposure to the financial industry or privacy-preserving technologies (e.g., Federated Learning) (Nice-to-have).

Minimum Education

Master's program or advanced Undergraduate

Hard Skills

Python
C++
PyTorch
TensorFlow
ONNX Runtime
TensorFlow Lite
ExecuTorch
ONNX
TensorRT
CoreML
TFLite
Model Compression
Quantization
Pruning
Knowledge Distillation
LoRA
LLM inference optimization
vLLM
llama.cpp
Hardware-software co-design
CSS
HTML
Java
JavaScript

Soft Skills

Curious Mindset
Innovation
Interpersonal Relationships
Personal Initiative
Taking Initiative
Task-Oriented
Teamwork

Work Hours

40 hours/week

Benefits

Impact (work on 0-to-1 projects)
Mentorship (direct access to RBC SA&I researchers and engineers)
Community (participation in RBC Student Program, hackathons, executive networking, technical lunch and learns)

About the Company

Royal Bank of Canada

Royal Bank of Canada is a global financial institution with a purpose-driven, principles-led approach to delivering leading performance. As Canada's largest bank, it provides personal and commercial banking, wealth management, and capital markets services to over 17 million clients worldwide.

Purpose-driven
Inclusive
Innovative
Collaborative
Professional