This internship at RBC SA&I involves moving high-performance AI models from the cloud to edge devices. The intern will focus on optimizing models, particularly Large Language Models (LLMs) and computer vision models, for low-latency inference on resource-constrained devices while preserving data security and privacy. The work includes researching and implementing model compression, deploying inference engines on local hardware, and optimizing pipelines for real-time financial applications.
Research and implement techniques to compress state-of-the-art models (LLMs, CNNs) for edge deployment, utilizing methods such as quantization (INT8, INT4), pruning, and knowledge distillation.
Prototype and deploy inference engines on local hardware (e.g., mobile CPUs/NPUs, edge servers, or embedded systems) using frameworks like ONNX Runtime, TensorFlow Lite, or ExecuTorch.
Analyze and profile model performance to identify bottlenecks; optimize inference pipelines for real-time financial applications (e.g., fraud detection, biometric authentication).
Experiment with split computing strategies to intelligently divide workload between the edge device and the cloud.
Develop rigorous testing suites to measure power consumption, memory footprint, and inference speed across different hardware targets.
Currently enrolled in a Master's program or an advanced undergraduate program in Computer Science, Electrical Engineering, or a related field.
Strong proficiency in Python and C++ (specifically for high-performance inference).
Deep understanding of deep learning frameworks (PyTorch or TensorFlow) and their internal mechanics.
Experience with model compression techniques (Quantization, LoRA, etc.).
Familiarity with edge inference runtimes (e.g., ONNX Runtime, TensorRT, CoreML, or TFLite).
Research experience or publications in efficient deep learning or systems for ML (Nice-to-have).
Experience with LLM inference optimization (e.g., vLLM, llama.cpp) (Nice-to-have).
Knowledge of hardware-software co-design (understanding how memory hierarchy affects AI performance) (Nice-to-have).
Previous exposure to the financial industry or privacy-preserving technologies (e.g., Federated Learning) (Nice-to-have).
Master's program or advanced Undergraduate
40 hours/week
Royal Bank of Canada is a global financial institution with a purpose-driven, principles-led approach to delivering leading performance. As Canada's largest bank, it provides personal and commercial banking, wealth management, and capital markets services to over 17 million clients worldwide.