My research seeks to extend the capabilities of AI-driven perception, planning, and control systems to
position robots for success in long-horizon, complex task settings. Accordingly, my recent work has spanned
3D scene representations & planning, and robot vision & control.
Ph.D. in Computer Science
Department of Computer Science, Stanford University
Sep 2021 | Stanford, CA
Stanford Graduate Fellowship - School of Engineering
B.A.Sc in Engineering Science, Robotics
Faculty of Applied Science and Engineering, University of Toronto
Sep 2016 - May 2021 | Toronto, ON
President's Scholarship Program
NSERC Undergraduate Research Award
Dean's Honour List - 2018-2021
I'm interested in bridging concepts from Robotics, Deep Learning, and Computer Vision
to build improved task & motion planning, decision-making, and control systems. I've recently explored modern learning-based planners
and their amenability to long-horizon robotic tasks in large-scale 3D scene graphs -
details in my BASc thesis.
Before that, I researched the coupling of representation learning and reinforcement learning to create
observational models that facilitate improved control.
I've also led and contributed to projects on semantic localization, 3D scene understanding, 3D semantic scene completion,
2D/3D object detection, LiDAR segmentation, and more!
Lightweight Semantic-aided Localization with Spinning LiDAR Sensor
Yuan Ren*, Bingbing Liu, Ran Cheng, Christopher Agia
[Patented]. IEEE Transactions on Intelligent Vehicles (T-IV), 2021
PDF / IEEExplore
How can semantic information be leveraged to improve localization accuracy in changing environments? We present a robust LiDAR-based localization
algorithm that exploits both semantic and geometric properties of the scene with an adaptive fusion strategy.
Deep Reinforcement Learning is effective for learning robot navigation policies in rough terrain and cluttered simulated environments.
In this work, we introduce a series of techniques that are applied in the policy learning phase to enhance transferability to real-world domains.
3D Scene Graphs (3DSGs) are informative abstractions of our world that unify symbolic, semantic, and metric scene representations.
We present a benchmark for robot task planning over large 3DSGs and evaluate classical and learning-based planners,
showing that real-time planning requires 3DSGs and planners to be jointly adapted to better exploit 3DSG hierarchies.
Learning visual state representations can significantly reduce the strain on policy learning from high-dimensional images.
In this paper, we propose a framework to inform and guide policy learning with augmented attention representations,
demonstrating outstanding convergence speeds and stability for self-driving control.
S3CNet: A Sparse Semantic Scene Completion Network for LiDAR Point Clouds
Christopher Agia*, Ran Cheng*, Yuan Ren, Bingbing Liu
[Patented]. Conference on Robot Learning (CoRL), 2020 | Cambridge, US
PDF / Talk / Video / arXiv
Small-scale semantic reconstruction methods have seen little success in large outdoor scenes due to the exponential increase in sparsity
and their computationally expensive designs. We propose a sparse convolutional network architecture based on the Minkowski Engine,
achieving state-of-the-art results for semantic scene completion in 2D/3D space from LiDAR point clouds.
Direct methods are able to track motion with considerable long-term accuracy. However, scale-inconsistent estimates arise from random or unit depth initialization.
We integrate dense depth prediction with the Direct Sparse Odometry system to accelerate convergence in the windowed bundle-adjustment and promote estimates with consistent scale.
Several methods from Conference / Journal Papers also contain patented components, as indicated by [Patented].
Road Surface Semantic Segmentation from LiDAR Point Clouds
Christopher Agia*, Ran Cheng, Yuan Ren, Bingbing Liu
Long-range sparsity in point clouds constitutes a challenge for accurate LiDAR-based road estimation.
This invention leverages bird's eye view features learned directly from aggregated point clouds and refines
them with a convolutional CRF to semantically segment roads and predict surface elevation with high precision.
Software Engineering Intern
Microsoft, Mixed Reality and Robotics
May 2021 - Aug 2021 | Redmond, Washington
Research & development at the intersection of mixed reality, artificial intelligence, and robotics.
Created a process enabling the training and HL2
deployment of multi-agent reinforcement learning scenarios in shared digital spatial-semantic representations.
Research in artificial intelligence and robotics. Topics include task-driven perception via learning map representations for downstream
control tasks with graph neural networks, and visual state abstraction for Deep Reinforcement Learning based self-driving control.
Software Engineering Intern
Google, Cloud
May 2020 - Aug 2020 | San Francisco, CA
Designed a Proxy-Wasm ABI Test Harness and Simulator that supports both
low-level and high-level mocking of interactions between a Proxy-Wasm extension and a simulated host environment,
allowing developers to test plugins in a safe and controlled environment.
Research and development for autonomous systems (self-driving technology). Research focus and related topics: 2D/3D semantic scene completion,
LiDAR-based segmentation, road estimation, visual odometry, depth estimation, and learning-based localization.
Search and rescue robotics - research on the topics of Deep Reinforcement Learning and Transfer Learning for autonomous robot navigation in rough and
hazardous terrain. ROS (Robot Operating System) software development for various mobile robots.
Software Engineering Intern
General Electric, Grid Solutions
May 2017 - Aug 2017 | Markham, ON
Created customer-facing software tools that accelerate the transition and setup of new protection and control systems during upgrades.
Designed the current Install-Base and Firmware Revision History databases used by GE internal service teams.
Learn by doing - I've had the opportunity to work on many interesting projects that range across industries such as Robotics, Health Care, Finance, Transportation, and Logistics.
Links to the source code are embedded in the project titles.
Bayesian Temporal Convolutional Networks
University of Toronto, CSC413 Neural Networks and Deep Learning
In this project, we explore the application of variational inference via Bayes by Backprop to the increasingly
popular temporal convolutional network (TCN) architecture for time-series forecasting.
Comparisons are made against state-of-the-art baselines in a series of ablation studies.
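For intuition, here is a minimal NumPy sketch of the core Bayes by Backprop step: sampling a weight matrix via the reparameterization trick and computing the Gaussian KL penalty against a unit prior. The layer shapes, initial values, and unit-Gaussian prior are illustrative assumptions, not the project's actual TCN implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_weight(mu, rho):
    """Reparameterized sample: w = mu + softplus(rho) * eps, eps ~ N(0, 1)."""
    sigma = np.log1p(np.exp(rho))  # softplus keeps sigma positive
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps, sigma

def kl_gaussian(mu, sigma, prior_sigma=1.0):
    """KL(q(w) || p(w)) between diagonal Gaussians, summed over all weights."""
    return np.sum(
        np.log(prior_sigma / sigma)
        + (sigma**2 + mu**2) / (2 * prior_sigma**2)
        - 0.5
    )

# One stochastic forward pass through a single Bayesian linear layer.
mu = np.zeros((3, 2))          # variational means
rho = np.full((3, 2), -3.0)    # small initial sigma via softplus
w, sigma = sample_weight(mu, rho)
x = np.ones((1, 3))
y = x @ w                      # prediction uses a freshly sampled weight matrix
kl = kl_gaussian(mu, sigma)    # added to the data loss during training
```

In training, the KL term is weighted against the data likelihood, and gradients flow through `mu` and `rho` because the noise `eps` is sampled independently of them.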
An empirical study of various 3D Convolutional Neural Network architectures for predicting the full voxel geometry of objects given their partial signed distance
field encodings (from the ShapeNetCore database).
Designed, built, and programmed a robot that systematically sorts and packs up to 50 pills/minute to assist those suffering from dementia.
An efficient user interface allows a user to input packing instructions. Team placed 3rd/50. Detailed project documentation
Based on the robotics Sense-Plan-Act Paradigm, we created an AI program
to handle high-level (path planning, goal setting) and low-level (path following, object avoidance, action execution) tasks for an
automated waste collection system to be used in fast food restaurants. 4th place in Canada. Presentation
Developed a machine learning software solution to predict the triage score of emergency patients, allocate available resources to
patients, and track key hospital performance metrics to reduce emergency wait times. 1st place in Ontario. Presentation / Team photo
Created a logistics planning algorithm that assigned mobile robots to efficiently retrieve warehouse packages. Our solution combined
traditional algorithms such as A* path planning with heuristic-based clustering. 1st place at UofT. Presentation / Team photo
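The path-planning half of that solution can be sketched with a minimal grid-based A*. The 4-connected grid, Manhattan heuristic, and unit edge costs below are simplifying assumptions for illustration; the actual competition code and clustering heuristics are not shown.

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected grid; grid[r][c] == 1 marks an obstacle."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    frontier = [(h(start), 0, start, [start])]  # (f = g + h, g, node, path)
    seen = set()
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                heapq.heappush(frontier, (cost + 1 + h((nr, nc)), cost + 1,
                                          (nr, nc), path + [(nr, nc)]))
    return None  # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = astar(grid, (0, 0), (2, 0))
# The only route skirts the blocked middle row:
# [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]
```

The admissible Manhattan heuristic keeps the search optimal; a clustering step would then batch nearby package locations into per-robot routes.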
Smart Intersection - Yonge and Dundas
University of Toronto, MIE438 Robot Design
We propose a traffic intersection model which uses computer vision to estimate lane congestion and manage traffic flow accordingly.
A mockup of our proposal was fabricated to display the behaviour and features of our system.
Detailed report
Developed an AI program capable of playing Gomoku against both human and virtual opponents. The software's decision making process
is determined by experimentally tuned heuristics which were designed to emulate that of a human opponent.
Programmed an intelligent system that approximates the semantic similarity between any pair of words by parsing
large novels and computing cosine similarities and Euclidean distances between vector descriptors of each word.
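The two similarity measures can be sketched in a few lines. The co-occurrence vectors here are hypothetical toy counts, not descriptors from the project's actual corpus.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors (1.0 = identical direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def euclidean_distance(u, v):
    """Straight-line distance between two word vectors (0.0 = identical)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Toy co-occurrence vectors: counts of each word near four context words.
cat = [2, 1, 0, 4]
dog = [2, 0, 1, 4]
car = [0, 5, 4, 0]

# Semantically close words point in similar directions:
assert cosine_similarity(cat, dog) > cosine_similarity(cat, car)
```

Cosine similarity ignores vector magnitude, which makes it robust to raw frequency differences between common and rare words, while Euclidean distance does not.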