I am a recent graduate of the Engineering Science (Robotics) program at the University of Toronto. The objective of my research is to push the boundaries of AI-driven perception and planning systems, positioning robots for success in long-horizon, complex task settings. In pursuit of this, I'll be working on multi-agent reinforcement learning in mixed reality environments at Microsoft this summer, prior to starting my Ph.D. in Computer Science at Stanford in the fall.
Ph.D. in Computer Science
Department of Computer Science, Stanford University
(Next) Sep 2021 | Stanford, CA
Stanford Graduate Fellowship - School of Engineering
B.A.Sc. in Engineering Science, Robotics
Faculty of Applied Science and Engineering, University of Toronto
Sep 2016 - May 2021 | Toronto, ON
President's Scholarship Program
NSERC Undergraduate Research Award
Dean's Honour List (2018-2021)
I am interested in bridging concepts from Robotics, Deep Learning, and Computer Vision to build improved task planning, motion planning, decision-making, and control systems. More recently, I've explored the use of unsupervised representation learning and reinforcement learning to create observational / world models that facilitate optimal planning and control.
My current focus is on the application of graph representation learning for long-horizon robot task planning in large-scale scene graphs. I've also led and contributed to perception projects related to: 3D Scene Understanding, 2D/3D Semantic Scene Completion, 2D/3D Object Detection, LiDAR segmentation, and more!
Lightweight Semantic-aided Localization with Spinning LiDAR Sensor
Yuan Ren*, Bingbing Liu, Ran Cheng, Christopher Agia
[Patented]. IEEE Transactions on Intelligent Vehicles (T-IV), 2021
PDF / IEEExplore
How can semantic information be leveraged to improve localization accuracy in changing environments? We present a robust LiDAR-based localization algorithm that exploits both semantic and geometric properties of the scene with an adaptive fusion strategy.
Deep Reinforcement Learning is effective for learning robot navigation policies in rough terrain and cluttered simulated environments. In this work, we introduce a series of techniques that are applied in the policy learning phase to enhance transferability to real-world domains.
3D Scene Graphs (3DSGs) are informative abstractions of our world that unify symbolic, semantic, and metric scene representations. We present a benchmark for robot task planning over large 3DSGs and evaluate classical and learning-based planners, showing that real-time planning requires 3DSGs and planners to be jointly adapted to better exploit 3DSG hierarchies.
Learning visual state representations can significantly reduce the strain on policy learning from high-dimensional images. In this paper, we propose a framework to inform and guide policy learning with augmented attention representations, demonstrating improved convergence speed and stability for self-driving control.
S3CNet: A Sparse Semantic Scene Completion Network for LiDAR Point Clouds
Christopher Agia*, Ran Cheng*, Yuan Ren, Bingbing Liu
[Patented]. Conference on Robot Learning (CoRL), 2020 | Cambridge, US
PDF / Talk / Video / arXiv
Small-scale semantic reconstruction methods have had little success in large outdoor scenes, owing to exponential increases in sparsity and computationally expensive designs. We propose a sparse convolutional network architecture based on the Minkowski Engine, achieving state-of-the-art results for semantic scene completion in 2D/3D space from LiDAR point clouds.
Direct methods are able to track motion with considerable long-term accuracy. However, scale-inconsistent estimates arise from random or unit depth initialization. We integrate dense depth prediction with the Direct Sparse Odometry system to accelerate convergence in the windowed bundle adjustment and promote estimates with consistent scale.
Several methods from the Conference / Journal Papers above contain patented components as well (indicated by [Patented]).
Road Surface Semantic Segmentation from LiDAR Point Clouds
Christopher Agia*, Ran Cheng, Yuan Ren, Bingbing Liu
Long-range sparsity in point clouds constitutes a challenge for accurate LiDAR-based road estimation. This invention leverages bird's eye view features learned directly from aggregated point clouds and refines them with a convolutional CRF to semantically segment roads and predict surface elevation with high precision.
Software Engineering Intern
Microsoft, Mixed Reality and Robotics
May 2021 - Present | Redmond, WA
Working at the intersection of mixed reality, artificial intelligence, and robotics.
Research in artificial intelligence and robotics. Topics include task-driven perception via learning map representations for downstream control tasks with graph neural networks, and visual state abstraction for Deep Reinforcement Learning based self-driving control.
Software Engineering Intern
Google, Cloud
May 2020 - Aug 2020 | San Francisco, CA
Designed a Proxy-Wasm ABI Test Harness and Simulator that supports both low-level and high-level mocking of interactions between a Proxy-Wasm extension and a simulated host environment, allowing developers to test plugins in a safe and controlled environment.
Research and development for autonomous systems (self-driving technology). Research focus and related topics: 2D/3D semantic scene completion, LiDAR-based segmentation, road estimation, visual odometry, depth estimation, and learning-based localization.
Search and rescue robotics - research on the topics of Deep Reinforcement Learning and Transfer Learning for autonomous robot navigation in rough and hazardous terrain. ROS (Robot Operating System) software development for various mobile robots.
Software Engineering Intern
General Electric, Grid Solutions
May 2017 - Aug 2017 | Markham, ON
Created customer-end software tools used to accelerate the transition/setup process of new protection and control systems upon upgrade. Designed the current Install-Base and Firmware Revision History databases used by GE internal service teams.
Learn by doing - I've had the opportunity to work on many interesting projects that range across industries such as Robotics, Health Care, Finance, Transportation, and Logistics.
Links to the source code are embedded in the project titles.
Bayesian Temporal Convolutional Networks
University of Toronto, CSC413 Neural Networks and Deep Learning
In this project, we explore the application of variational inference via Bayes by Backprop to the increasingly popular temporal convolutional network (TCN) architecture for time-series forecasting. Comparisons are made to state-of-the-art methods in a series of ablation studies. Project report
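At the heart of Bayes by Backprop is the reparameterization trick: each weight is sampled as w = mu + softplus(rho) * eps with eps ~ N(0, 1), and a KL penalty to the prior is added to the task loss. A minimal, framework-free sketch of those two pieces (function names are illustrative, not from the project code):

```python
import math
import random

def sample_weight(mu, rho, rng=random):
    """Draw one weight via the reparameterization trick:
    w = mu + softplus(rho) * eps, eps ~ N(0, 1).
    softplus keeps the posterior standard deviation positive."""
    sigma = math.log1p(math.exp(rho))  # softplus(rho) > 0
    eps = rng.gauss(0.0, 1.0)
    return mu + sigma * eps, sigma

def kl_gaussian(mu, sigma, prior_sigma=1.0):
    """Closed-form KL( N(mu, sigma^2) || N(0, prior_sigma^2) ):
    the 'complexity cost' added to the likelihood loss in Bayes by Backprop."""
    return (math.log(prior_sigma / sigma)
            + (sigma ** 2 + mu ** 2) / (2 * prior_sigma ** 2) - 0.5)
```

In a full implementation, every convolutional weight in the TCN gets its own (mu, rho) pair, and gradients flow through the sample because eps is drawn independently of the parameters.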
An empirical study of various 3D Convolutional Neural Network architectures for predicting the full voxel geometry of objects given their partial signed distance field encodings (from the ShapeNetCore database). Project report
Designed, built, and programmed a robot that systematically sorts and packs up to 50 pills/minute to assist those suffering from dementia. An efficient user interface allows a user to input packing instructions. Team placed 3rd/50. Detailed project documentation / YouTube video
Based on the robotics Sense-Plan-Act paradigm, we created an AI program to handle high-level (path planning, goal setting) and low-level (path following, obstacle avoidance, action execution) tasks for an automated waste-collection system to be used in fast food restaurants. 4th place Canada. Presentation
Developed a machine learning software solution to predict the triage score of emergency patients, allocate available resources to patients, and track key hospital performance metrics to reduce emergency wait times. 1st place Ontario. Presentation / Team photo
Created a logistics planning algorithm that assigns mobile robots to efficiently retrieve warehouse packages. Our solution combined traditional algorithms such as A* path planning with heuristic-based clustering. 1st place UofT. Presentation / Team photo
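A minimal sketch of the A* core such a planner builds on (4-connected grid, Manhattan heuristic; all names are illustrative, not the competition code):

```python
import heapq
import itertools

def a_star(grid, start, goal):
    """A* search on a 4-connected grid; grid[r][c] == 1 marks an obstacle.
    Returns the path from start to goal as a list of (row, col) cells, or None."""
    def h(cell):  # Manhattan distance: admissible on a unit-cost grid
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    tie = itertools.count()  # breaks ties so the heap never compares cells
    open_heap = [(h(start), 0, next(tie), start, None)]
    came_from, g_cost = {}, {start: 0}
    while open_heap:
        _, g, _, cell, parent = heapq.heappop(open_heap)
        if cell in came_from:  # already expanded with a better cost
            continue
        came_from[cell] = parent
        if cell == goal:  # reconstruct the path by walking parents back
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = ng
                    heapq.heappush(open_heap, (ng + h(nxt), ng, next(tie), nxt, cell))
    return None
```

The clustering step would sit on top of this: group nearby packages, then run A* between cluster centroids to sequence each robot's route.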
Smart Intersection - Yonge and Dundas
University of Toronto, MIE438 Robot Design
We propose a traffic intersection model which uses computer vision to estimate lane congestion and manage traffic flow accordingly. A mockup of our proposal was fabricated to display the behaviour and features of our system. Detailed report / YouTube video
Developed an AI program capable of playing Gomoku against both human and virtual opponents. The software's decision-making is driven by experimentally tuned heuristics designed to emulate the play of a human opponent.
Programmed an intelligent system that approximates the semantic similarity between any pair of words by parsing large novels and computing cosine similarities and Euclidean distances between vector descriptors of each word.