Active Learning, Data Selection, Data Auto-Labeling, and Simulation in Autonomous Driving — Part 3
Let’s see how Waymo does active learning.
Waymo
Waymo uses active learning as well. In this talk, Drago Anguelov describes the ML factory used at Waymo:
The lifecycle is very similar to what we saw for NVIDIA. Most of the collected data comes from common scenarios and does not carry enough new information for the model to learn from, so it is essential to know how to select the data. Waymo has data mining and active learning pipelines that find rare cases and situations where the models are uncertain or inconsistent over time, and those cases are sent for labeling. The labeled data then goes to model training. They also use auto-labels in their system. When you collect data, you also see the future for many objects. This knowledge of both the past and the future helps annotate the data better. The resulting labels can then be used to train the onboard model, which does not know the future, to replicate them.
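To make the selection step more concrete, here is a minimal sketch of ranking object tracks by the two signals mentioned above: model uncertainty and inconsistency over time. It is not Waymo's pipeline. The function names, the entropy-based uncertainty measure, the label-flip measure, and the weights are all assumptions chosen for illustration, and each track is assumed to hold at least one frame of class probabilities.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a class probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def temporal_inconsistency(frame_probs):
    """Fraction of consecutive frames where the top-1 class changes."""
    labels = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    if len(labels) < 2:
        return 0.0
    flips = sum(a != b for a, b in zip(labels, labels[1:]))
    return flips / (len(labels) - 1)

def select_for_labeling(tracks, budget, w_entropy=1.0, w_flip=1.0):
    """Return the ids of the `budget` tracks with the highest combined score.

    `tracks` maps a (hypothetical) track id to a list of per-frame
    class probability vectors produced by the current model.
    """
    scored = []
    for track_id, frame_probs in tracks.items():
        mean_entropy = sum(entropy(p) for p in frame_probs) / len(frame_probs)
        score = w_entropy * mean_entropy + w_flip * temporal_inconsistency(frame_probs)
        scored.append((score, track_id))
    scored.sort(reverse=True)
    return [track_id for _, track_id in scored[:budget]]

# Toy predictions over three classes, four frames per track.
tracks = {
    "confident_car": [[0.98, 0.01, 0.01]] * 4,
    "uncertain_blob": [[0.40, 0.35, 0.25]] * 4,
    "flickering": [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05]] * 2,
}
selected = select_for_labeling(tracks, budget=2)
print(selected)
```

With this toy input, the confidently and consistently classified track is left out, while the uncertain track and the track whose label flips between frames are sent for labeling.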
Waymo also released the Open Motion Dataset and ran a competition around it at CVPR 2021. The dataset was labeled by a deep learning model running in offline mode, described in the CVPR 2021 paper Offboard 3D Object Detection from Point Cloud Sequences. A model running offline is not limited by the latency constraints of the vehicle and also benefits from seeing the future, since it has access to the full scene and can go backward and forward in time. This offline labeling approach can be used to label a lot of data, which is then used to train deep learning models.
The offboard 3D object detection paper presents new techniques for automatically labeling the point clouds produced by lidar sensors. Taking advantage of the fact that different frames of a point cloud sequence capture complementary views of the same object, the team developed a labeling system built on multi-frame object detection over time. Here is their 3D auto-labeling pipeline (the pipeline is explained in the image caption):
The 3D Auto Labeling pipeline. Given a point cloud sequence as input, the pipeline first leverages a 3D object detector to localize objects in each frame. Then object boxes at different frames are linked through a multi-object tracker. Object track data (its point clouds at every frame as well as its 3D bounding boxes) are extracted for each object and then go through the object-centric auto labeling (with a divide-and-conquer for static and dynamic tracks) to generate the final “auto labels”, i.e. refined 3D bounding boxes.
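The stages in the caption can be sketched in a few lines of code. This is only a toy stand-in for the structure of the pipeline, not the paper's method: the paper uses learned networks for detection and refinement, while here the per-frame detections are given as input, the tracker is a greedy nearest-centre matcher, and the "refinement" is a median for static tracks and a centred moving average for dynamic tracks. The 2D `(x, y, length)` box format, the thresholds, and all function names are assumptions.

```python
from statistics import median

def link_tracks(frames, max_dist=2.0):
    """Greedy nearest-centre tracker over per-frame lists of (x, y, length) boxes."""
    tracks = []  # each track is a list of (frame_index, box)
    for t, boxes in enumerate(frames):
        unmatched = list(boxes)
        for track in tracks:
            last_t, (lx, ly, _) = track[-1]
            if last_t != t - 1 or not unmatched:
                continue  # track already ended, or nothing left to match
            best = min(unmatched, key=lambda b: (b[0] - lx) ** 2 + (b[1] - ly) ** 2)
            if (best[0] - lx) ** 2 + (best[1] - ly) ** 2 <= max_dist ** 2:
                track.append((t, best))
                unmatched.remove(best)
        tracks.extend([(t, b)] for b in unmatched)  # start new tracks
    return tracks

def is_static(track, motion_thresh=0.5):
    """A track is static if its centre stays within a small range."""
    xs = [b[0] for _, b in track]
    ys = [b[1] for _, b in track]
    return max(xs) - min(xs) <= motion_thresh and max(ys) - min(ys) <= motion_thresh

def refine_track(track, window=1):
    """Object-centric refinement using every frame of the track, past and future."""
    boxes = [b for _, b in track]
    length = median(b[2] for b in boxes)  # object size does not change over time
    if is_static(track):
        cx = median(b[0] for b in boxes)
        cy = median(b[1] for b in boxes)
        return [(t, (cx, cy, length)) for t, _ in track]
    refined = []
    for i, (t, _) in enumerate(track):
        neigh = boxes[max(0, i - window): i + window + 1]  # centred window
        cx = sum(b[0] for b in neigh) / len(neigh)
        cy = sum(b[1] for b in neigh) / len(neigh)
        refined.append((t, (cx, cy, length)))
    return refined

def auto_label(frames):
    return [refine_track(track) for track in link_tracks(frames)]

# Three frames: a parked car with noisy detections and a car moving along x.
frames = [
    [(0.0, 0.0, 4.4), (10.0, 5.0, 4.0)],
    [(0.1, -0.1, 4.6), (11.0, 5.0, 4.2)],
    [(-0.1, 0.1, 4.5), (12.0, 5.0, 3.9)],
]
labels = auto_label(frames)
for track in labels:
    print(track)
```

The parked car ends up with one consistent box across all frames, which is the benefit of treating static tracks separately. The moving average used for the dynamic track is crude (it is biased at the ends of the track) and only illustrates that an offline labeler can look at frames on both sides of the current one.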
And here is its performance compared to the state of the art:
Waymo, like NVIDIA, has a simulator called SimulationCity. The goal is to gain a better understanding of how the Waymo Driver responds to the full range of behaviors that it will encounter in the real world.
Assume we simulate a tailgating scenario at an intersection. To evaluate the Waymo Driver’s behavior, we want to understand as many of the possible outcomes as we can, along with their likelihood of occurring. If we chose a random tailgating scenario, the tailgater would almost certainly brake in time. However, it is critical to assess how the Waymo Driver behaves when the tailgater fails to brake in time, for example, when the tailgater is distracted or inattentive. As more variations of the same scenario are simulated, the distribution of outcomes observed in simulation converges to the one observed in the real world. Additionally, SimulationCity enables us to investigate rare events in order to create risky scenarios that the Driver has never encountered before, but that have been proven to be both realistic and extremely useful.
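The idea of sampling many variations of one scenario, and deliberately oversampling the rare dangerous ones, can be illustrated with a toy Monte Carlo simulation. This has nothing to do with SimulationCity's actual implementation. The kinematic model (both cars travel at the same speed and brake equally hard, so the follower closes the gap by speed times reaction time), the reaction-time distributions, and every number below are assumptions made up for the example.

```python
import random

def tailgater_collides(speed, gap, reaction_time):
    """With equal speeds and equal braking, the follower closes
    speed * reaction_time metres before it has stopped."""
    return speed * reaction_time > gap

def sample_reaction_time(rng, distracted):
    """Draw a reaction time in seconds; distracted drivers react much later."""
    mean = 2.5 if distracted else 0.8
    return max(0.0, rng.gauss(mean, 0.2))

def simulate(n, p_distracted, seed=0, speed=15.0, gap=20.0):
    """Run n variations and count collisions per driver type as [collisions, runs]."""
    rng = random.Random(seed)
    outcomes = {"attentive": [0, 0], "distracted": [0, 0]}
    for _ in range(n):
        distracted = rng.random() < p_distracted
        key = "distracted" if distracted else "attentive"
        hit = tailgater_collides(speed, gap, sample_reaction_time(rng, distracted))
        outcomes[key][0] += hit
        outcomes[key][1] += 1
    return outcomes

# Sampling at the assumed real-world rate: distracted tailgaters are rare.
natural = simulate(10_000, p_distracted=0.01)
# Stress test: force every tailgater to be distracted.
stressed = simulate(10_000, p_distracted=1.0)
print("natural:", natural)
print("stressed:", stressed)
```

Under the natural sampling rate only a small share of runs involves a distracted tailgater, so very few simulations exercise the dangerous outcome. Forcing the rare condition produces thousands of such cases to evaluate the planner against, and the per-condition counts can later be reweighted by how often each condition occurs in reality.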
Having a large, high-quality dataset, as well as the simulation needed to generate additional data, is critical for deep learning models and for autonomous vehicles to operate safely and handle rare cases such as the following:
In this scenario, a car runs a red light and enters the intersection while the ego car has the green light, but the Waymo vehicle handles the situation and yields to the reckless car before proceeding. This suggests that Waymo is doing an excellent job of data collection, data selection, corner case analysis, and simulation to train their models.
We will look at Tesla in the next post!