One of Tesla's newest patent applications shows what it takes to lead an industry. In the filing, titled “Generating Ground Truth For Machine Learning From Time Series Elements,” Tesla describes a way to enable the computer (AI) to predict, with high accuracy, the actions a vehicle will take. This is something that I do as a pedestrian when crossing the street. For Tesla to teach computers how to instinctively know when a car will cross its path is really mind-blowing.
The patent uses quite a lot of technical terms, including “ground truth” and “time series.”
Ground truth refers to the real-world information against which the results of machine learning are checked for accuracy. The term is borrowed from meteorology, where ground truth means information that is gathered on site. In machine learning, a time series is a sequence of observations taken sequentially in time. Forecasting with machine learning fits models on historical data and then uses them to predict future observations.
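To make those two terms concrete, here is a minimal Python sketch. It is not taken from the filing: the speed readings, the function names, and the choice of a straight-line fit are all assumptions made for illustration. It fits a simple model on a short historical time series, forecasts the next observation, and then checks that forecast against a ground-truth value.

```python
def fit_linear_trend(values):
    """Least-squares fit of y = slope * t + intercept for t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return slope, intercept


def forecast(values, steps_ahead=1):
    """Extrapolate the fitted trend steps_ahead samples past the last one."""
    slope, intercept = fit_linear_trend(values)
    t_future = len(values) - 1 + steps_ahead
    return slope * t_future + intercept


# Hypothetical time series: vehicle speed sampled once per second.
history = [10.0, 12.0, 14.0, 16.0]
prediction = forecast(history)

# Hypothetical ground truth: the speed actually measured one second later.
actual = 18.5
error = abs(prediction - actual)
print(f"predicted {prediction:.1f}, actual {actual:.1f}, error {error:.1f}")
```

A real system would use a far richer model than a straight line, but the loop is the same: fit on the past, predict the future, compare with what actually happened.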
According to the patent filing, the system first receives sensor data that includes a group of time series elements (sequential pictures and videos). Next, a training data set is determined, a machine learning model is developed, and the results are checked against the real world, or what should happen in the real world.
In other words, Tesla is able to teach its AI to check its results for accuracy and compare them against the real world.
Some Key Details From Tesla’s Newest Invention
There are several ways the new invention can be implemented. It can be set up as a process, an apparatus, a system, or a computer program product embodied on a computer-readable storage medium. It can also be implemented as a processor configured to execute instructions that are stored on or provided by a memory coupled to the processor.
Another key point here is that Tesla disclosed a machine learning training technique for generating highly accurate results. Using data that sensors capture about a vehicle and its environment, it creates a training data set. The filing provides an example.
“For example, sensors affixed to a vehicle capture data such as image data of the road and the surrounding environment a vehicle is driving on. The sensor data may capture vehicle lane lines, vehicle lanes, other vehicle traffic, obstacles, traffic control signs, etc. Odometry and other similar sensors capture vehicle operating parameters such as vehicle speed, steering, orientation, change in direction, change in location, change in elevation, change in speed, etc.”
Once the data is captured, it is transmitted to a training server, which creates a training data set and uses it to train a machine learning model that generates highly accurate results. In other words, data is collected, a model is trained on it, and the model is put to work. Tesla gives another example of this.
“For example, a ground truth is determined based on a group of time series elements and is associated with a single element from the group. As one example, a series of images for a time period, such as 30 seconds, is used to determine the actual path of a vehicle lane line over the time period the vehicle travels.”
“The vehicle lane line is determined by using the most accurate images of the vehicle lane over the time period. Different portions (or locations) of the lane line may be identified from different image data of the time series. As the vehicle travels in a lane alongside a lane line, more accurate data is captured for different portions of the lane line. In some examples, occluded portions of the lane line are revealed as the vehicle travels, for example, along a hidden curve or over a crest of a hill.”
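The lane line example in the quotes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Tesla's implementation: the Frame structure, the 5-meter bins, and the rule that a closer observation is the more accurate one are all hypothetical choices. The idea is to merge the lane line points seen across the whole time series, keep the best observation of each portion, and attach the merged line to a single frame as its ground truth.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    vehicle_s: float   # distance travelled along the road, from odometry (meters)
    lane_points: list  # [(s, lateral_offset)] lane line points seen in this frame


def build_lane_ground_truth(frames, bin_size=5.0):
    """Merge lane observations from all frames into one lane line.

    Assumption: a point observed from a shorter range is more accurate,
    so each road bin keeps the observation made closest to the vehicle.
    """
    best = {}  # bin index -> (range at observation, lateral offset)
    for frame in frames:
        for s, offset in frame.lane_points:
            rng = s - frame.vehicle_s
            if rng < 0:
                continue  # point is behind the vehicle, skip it
            key = int(s // bin_size)
            if key not in best or rng < best[key][0]:
                best[key] = (rng, offset)
    return [(key * bin_size, offset) for key, (_, offset) in sorted(best.items())]


# Hypothetical two-frame time series. The point at 15 m is occluded in the
# first frame and only revealed after the vehicle has moved forward.
frames = [
    Frame(vehicle_s=0.0, lane_points=[(5.0, 1.8), (10.0, 2.1)]),
    Frame(vehicle_s=5.0, lane_points=[(10.0, 1.9), (15.0, 1.7)]),
]

ground_truth = build_lane_ground_truth(frames)

# The merged lane line becomes the label for a single element of the group.
training_example = (frames[0], ground_truth)
print(ground_truth)
```

Here the first frame ends up labeled with a lane line that is longer and more accurate than anything visible in that frame alone, which is the point of using the whole time series.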
The patent also details how Tesla can select an image and apply the ground truth to different features, including lane lines, path predictions for vehicles (including other vehicles that the car sees), and the depth distances of objects such as a stop sign or even a person crossing the street up ahead.
“For example, a series of images of a vehicle in an adjacent lane is used to predict that vehicle’s path.”
By using the time series of images and the actual path that the other vehicle took, a single image from the group can be paired with that actual path and used as training data to predict the path of the other vehicle. An easier way to visualize this is imagining you are walking down the street to the store and need to cross a parking lot.
As you begin to do so, you see a car off to your left. For now, you and that car are parallel, but you can kind of sense or see the path of that vehicle and determine whether or not it is going to cross yours.
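The same pairing can be written down as a short Python sketch. The frame names, the positions, and the make_path_example helper are hypothetical and only illustrate the idea from the filing: one image from the series is the input, and the path the other vehicle actually took afterward is the label.

```python
def make_path_example(frame_ids, positions, index=0, horizon=3):
    """Pair one frame with the path the other vehicle actually took after it.

    frame_ids: identifiers of the images in the time series
    positions: (x, y) of the other vehicle in each frame, in meters
    horizon:   number of later positions used as the label (an assumption)
    """
    if len(frame_ids) != len(positions):
        raise ValueError("need exactly one position per frame")
    future = positions[index + 1 : index + 1 + horizon]
    if len(future) < horizon:
        raise ValueError("not enough later frames to form the label")
    return {"input_frame": frame_ids[index], "label_path": future}


# Hypothetical track of a car in the adjacent lane drifting toward our lane.
frame_ids = ["frame_000", "frame_001", "frame_002", "frame_003"]
positions = [(0.0, 3.5), (5.0, 3.4), (10.0, 2.8), (15.0, 1.9)]

example = make_path_example(frame_ids, positions)
print(example)
```

Because the label comes from what the other car really did, no one has to annotate the path by hand.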
For more details, you can read the full filing here.