Full-Time

On-Site

Edge AI Intern, Summer 2026

Royal Bank of Canada

201 S Orange Ave, Orlando, FL

US$30 per hour

Description

This internship at RBC SA&I involves moving high-performance AI models from the cloud to edge devices. The intern will focus on optimizing models, particularly Large Language Models (LLMs) and computer vision models, for low-latency inference on resource-constrained devices while preserving data security and privacy. The work spans researching and implementing model compression, deploying inference engines on local hardware, and optimizing pipelines for real-time financial applications.

What We're Looking For

Research and implement techniques to compress state-of-the-art models (LLMs, CNNs) for edge deployment, utilizing methods such as quantization (INT8, INT4), pruning, and knowledge distillation.

Prototype and deploy inference engines on local hardware (e.g., mobile CPUs/NPUs, edge servers, or embedded systems) using frameworks like ONNX Runtime, TensorFlow Lite, or ExecuTorch.

Analyze and profile model performance to identify bottlenecks; optimize inference pipelines for real-time financial applications (e.g., fraud detection, biometric authentication).

Experiment with split computing strategies to intelligently divide workload between the edge device and the cloud.

Develop rigorous testing suites to measure power consumption, memory footprint, and inference speed across different hardware targets.

Ideal Candidate

Currently enrolled in a Master's program, or an advanced undergraduate, in Computer Science, Electrical Engineering, or a related field.

Strong proficiency in Python and C++ (specifically for high-performance inference).

Deep understanding of deep learning frameworks (PyTorch or TensorFlow) and their internal mechanics.

Experience with model compression techniques (Quantization, LoRA, etc.).

Familiarity with edge inference runtimes (e.g., ONNX, TensorRT, CoreML, or TFLite).

Research experience or publications in efficient deep learning or systems for ML (Nice-to-have).

Experience with LLM inference optimization (e.g., vLLM, llama.cpp) (Nice-to-have).

Knowledge of hardware-software co-design (understanding how memory hierarchy affects AI performance) (Nice-to-have).

Previous exposure to the financial industry or privacy-preserving technologies (e.g., Federated Learning) (Nice-to-have).

Minimum Education

Master's program or advanced Undergraduate

Hard Skills

Python

C++

PyTorch

TensorFlow

ONNX Runtime

TensorFlow Lite

ExecuTorch

ONNX

TensorRT

CoreML

Model Compression

Quantization

Pruning

Knowledge Distillation

LoRA

LLM inference optimization

vLLM

llama.cpp

Hardware-software co-design

CSS

HTML

Java

JavaScript

Soft Skills

Curious Mindset

Innovation

Interpersonal Relationships

Personal Initiative

Taking Initiative

Task-Oriented

Teamwork

Work Hours

40 hours/week

Benefits

Impact (work on 0-to-1 projects)

Mentorship (direct access to RBC SA&I researchers and engineers)

Community (participation in RBC Student Program, hackathons, executive networking, technical lunch and learns)

About the Company

Royal Bank of Canada

Royal Bank of Canada is a global financial institution with a purpose-driven, principles-led approach to delivering leading performance. As Canada's largest bank, it provides personal and commercial banking, wealth management, and capital markets services to over 17 million clients worldwide.

Purpose-driven

Inclusive

Innovative

Collaborative

Professional
