Tesla’s new patent shows path to Elon Musk’s pure vision FSD approach

Tesla received a new patent last week for “estimating object properties using visual image data.” Elon Musk recently estimated that Tesla would release the next significant version of FSD Beta in April. At the time, he also said that Tesla was going with pure vision and would not even use radar.

“According to their patent, this invention aims to address the increasing cost and complexity of vision sensors for mass-market autonomous vehicles. This method enables a vehicle to detect and interpret the distance to its surroundings using the vehicle’s image data and machine learning,” explained law firm Founders Legal to Teslarati.

Tesla’s patent describes an invention using two neural networks to gauge the distances of objects using only image data. The first neural network can determine the distance of objects from images captured by the cameras around a vehicle. The other neural network creates training material in the form of annotated images for the first neural network.

FSD Beta has now been expanded to ~2000 owners & we’ve also revoked beta where drivers did not pay sufficient attention to the road. No accidents to date.

Next significant release will be in April. Going with pure vision — not even using radar. This is the way to real-world AI.
— Elon Musk (@elonmusk) March 12, 2021

In the patent, Tesla states that there is a need to find the right number of sensors to put on an autonomous vehicle without limiting the amount of data it can capture and process. Tesla states that additional sensors, such as radar, lidar, and ultrasonic sensors, can become too costly to put in a mass-market vehicle and increase the “input bandwidth requirements” for an autonomous driving system.

The patent describes a configuration with a good balance of sensors and cameras to determine the distances of objects around a vehicle. This should allow Tesla to employ a system that could perform at a level comparable to industry leaders while keeping costs as low as possible.

“As the number and types of sensors increases, so does the complexity and cost of the system. For example, emitting distance sensors such as lidar are often costly to include in a mass market vehicle. Moreover, each additional sensor increases the input bandwidth requirements for the autonomous driving system. Therefore, there exists a need to find the optimal configuration of sensors on a vehicle. The configuration should limit the total number of sensors without limiting the amount and type of data captured to accurately describe the surrounding environment and safely control the vehicle,” Tesla wrote.

The patent also provides Tesla with a way to automatically label vision data. Considering that labeling is one of the most time-consuming parts of Tesla’s FSD development, such a system would likely accelerate the development and release of updates and improvements to the company’s Full Self-Driving and Autopilot suites.

“In various embodiments, the collection and association of auxiliary data with vision data is done automatically and requires little, if any, human intervention. For example, objects identified using vision techniques do not need to be manually labeled, significantly improving the efficiency of machine learning training. Instead, the training data can be automatically generated and used to train a machine learning model to predict object properties with a high degree of accuracy,” Tesla wrote.
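The automatic labeling idea in the excerpt above can be illustrated with a minimal sketch. It is not Tesla’s implementation: it assumes the auxiliary data is a distance reading from a sensor such as radar, and every name in it (`Detection`, `AuxReading`, `auto_label`, the time and bearing tolerances) is hypothetical. Each object detected in a camera frame is paired with the closest auxiliary reading taken at nearly the same time and in nearly the same direction, and that reading’s distance becomes the training label, with no human annotator involved.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """An object found in a camera frame (hypothetical structure)."""
    frame_id: int
    timestamp: float    # seconds
    bearing_deg: float  # direction of the object relative to the camera axis


@dataclass
class AuxReading:
    """A distance reading from an auxiliary sensor, assumed here to be radar."""
    timestamp: float    # seconds
    bearing_deg: float
    distance_m: float


@dataclass
class LabeledExample:
    """A vision detection annotated with a distance, ready for training."""
    frame_id: int
    bearing_deg: float
    distance_m: float


def auto_label(
    detections: List[Detection],
    readings: List[AuxReading],
    max_dt: float = 0.05,        # assumed tolerance, seconds
    max_dbearing: float = 2.0,   # assumed tolerance, degrees
) -> List[LabeledExample]:
    """Attach an auxiliary distance to each detection that has a close match."""
    examples = []
    for det in detections:
        # Keep only readings close to the detection in both time and direction.
        candidates = [
            r for r in readings
            if abs(r.timestamp - det.timestamp) <= max_dt
            and abs(r.bearing_deg - det.bearing_deg) <= max_dbearing
        ]
        if not candidates:
            continue  # no trustworthy label; skip rather than guess
        best = min(candidates, key=lambda r: abs(r.bearing_deg - det.bearing_deg))
        examples.append(LabeledExample(det.frame_id, det.bearing_deg, best.distance_m))
    return examples


detections = [Detection(1, 0.00, 10.0), Detection(2, 0.10, -5.0)]
readings = [
    AuxReading(0.01, 10.5, 42.0),
    AuxReading(0.01, 11.5, 60.0),
    AuxReading(0.50, -5.0, 20.0),  # too far apart in time to match frame 2
]
labeled = auto_label(detections, readings)
```

In this toy run only the first detection receives a label, because the second has no reading within the time tolerance. A distance-estimating network trained on examples like these would then need only the camera images at inference time.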

The configuration described in Tesla’s patent could significantly improve its Full Self-Driving (FSD) technology. It may reduce Tesla’s reliance on additional sensors and increase the amount of data that can be extracted from images to improve FSD Beta. Tesla’s image-based approach to FSD differs considerably from that of competitors like Waymo, but it has yielded some rather impressive results based on some FSD Beta users’ experiences thus far.

Tesla’s “Estimating object properties using visual image data” patent can be accessed below.