Tesla has long seemed positioned to gain an edge in artificial intelligence. Sure, Elon Musk’s Neuralink, SpaceX, and The Boring Company are held separately from Tesla, but some seepage among the companies certainly occurs. So, when Tesla announced at its AI event last month that it would design its own silicon chips, its advantage seemed greater than ever.
The AI event culminated with a dancing human posing as a humanoid robot, previewing the Tesla Bot the company intends to build. But the more immediate and important reveal was the custom AI chip “D1,” which would be used for training the machine-learning algorithm behind Tesla’s Autopilot self-driving system. Tesla has a keen focus on this technology, with a single giant neural network known as a “transformer” receiving input from 8 cameras at once.
“We are effectively building a synthetic animal from the ground up,” Tesla’s AI chief, Andrej Karpathy, said during the August 2021 event. “The car can be thought of as an animal. It moves around autonomously, senses the environment, and acts autonomously.”
CleanTechnica’s Johnna Crider, who attended the AI event, shared that, “At the very beginning of the event, Tesla CEO Musk said that Tesla is much more than an electric car company, and that it has ‘deep AI activity in hardware on the inference level and on the training level.’” She concluded that, “by unveiling the Dojo supercomputer plans and getting into the details of how it is solving computer vision problems, Tesla showed the world another side to its identity.”
Tesla’s Foray into Silicon Chips
Tesla is the latest nontraditional chipmaker, as described in a recent Wired analysis. Intel Corporation is the world’s largest semiconductor chip maker, based on its 2020 sales. It is the inventor of the x86 series of microprocessors found in most personal computers today. Yet, as AI gains prominence and silicon chips become essential ingredients in technology-integrated manufacturing, many others, including Google, Amazon, and Microsoft, are now designing their own chips.
For Tesla, the key to silicon chip success will be deriving optimal performance out of the computer system used to train the company’s neural network. “If it takes a couple of days for a model to train versus a couple of hours,” CEO Elon Musk said at the AI event, “it’s a big deal.”
Initially, Tesla relied on Nvidia hardware for its silicon chips. That changed in 2019, when Tesla turned in-house to design chips that interpret sensor input in its cars. However, manufacturing the chips needed to train AI algorithms — moving the creative process from vision to execution — is quite a sophisticated, costly, and demanding endeavor.
The D1 chip, part of Tesla’s Dojo supercomputer system, is built on a 7-nanometer manufacturing process and delivers 362 teraflops of processing power, said Ganesh Venkataramanan, senior director of Autopilot hardware. Tesla places 25 of these chips onto a single “training tile,” and 120 of these tiles come together across several server cabinets, amounting to over an exaflop of processing power. “We are assembling our first cabinets pretty soon,” Venkataramanan disclosed.
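The exaflop figure can be checked by multiplying the numbers quoted at the event. The short sketch below does that arithmetic; the constant and function names are illustrative only, and the result is a theoretical peak that assumes every chip contributes its full 362 teraflops.

```python
# Figures quoted at Tesla's AI event (per the article).
TFLOPS_PER_D1_CHIP = 362
CHIPS_PER_TILE = 25
TILES = 120

# Unit conversion: 1 exaflop = 1,000,000 teraflops.
TFLOPS_PER_EXAFLOP = 1_000_000


def total_chips(chips_per_tile=CHIPS_PER_TILE, tiles=TILES):
    """Number of D1 chips across all training tiles."""
    return chips_per_tile * tiles


def peak_exaflops(tflops_per_chip=TFLOPS_PER_D1_CHIP,
                  chips_per_tile=CHIPS_PER_TILE,
                  tiles=TILES):
    """Theoretical peak throughput in exaflops, assuming perfect aggregation."""
    total_tflops = tflops_per_chip * total_chips(chips_per_tile, tiles)
    return total_tflops / TFLOPS_PER_EXAFLOP


if __name__ == "__main__":
    print(f"D1 chips: {total_chips()}")
    print(f"Peak throughput: {peak_exaflops():.3f} exaflops")
```

That works out to 3,000 chips and roughly 1.086 exaflops, consistent with the "over an exaflop" claim.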
CleanTechnica’s Chanan Bos deconstructed the D1 chip in detail in a series of articles (in case you missed them) and related that, according to its specifications, the D1 chip has 50 billion transistors. Among processors, that beats the current record of 39.54 billion transistors held by AMD’s Epyc Rome chip.
Tesla says on its website that the company believes “that an approach based on advanced AI for vision and planning, supported by efficient use of inference hardware, is the only way to achieve a general solution for full self-driving and beyond.” To do so, the company will:
- Build silicon chips that power the full self-driving software from the ground up, taking every small architectural and micro-architectural improvement into account while pushing hard to squeeze maximum silicon performance-per-watt;
- Perform floor-planning, timing, and power analyses on the design;
- Write robust, randomized tests and scoreboards to verify functionality and performance;
- Implement compilers and drivers to program and communicate with the chip, with a strong focus on performance optimization and power savings; and,
- Validate the silicon chip and bring it to mass production.
“We should have Dojo operational next year,” CEO Elon Musk affirmed.
The Tesla Neural Network & Data Training
Tesla’s approach to full self-driving is grounded in its neural network. Most companies that are developing self-driving technology look to lidar, which is an acronym for “Light Detection and Ranging.” It’s a remote sensing method that uses light in the form of a pulsed laser to measure ranges (i.e., variable distances) to surrounding objects. On a vehicle, these light pulses are combined with other sensor data to generate precise, 3-dimensional information about the shape of the car’s surroundings.
Tesla, however, rejected lidar, partly because of its high cost and the amount of technology required per vehicle. Instead, it interprets scenes by using the neural network algorithm to dissect input from its cameras and radar. Chris Gerdes, director of the Center for Automotive Research at Stanford, says this approach is “computationally formidable. The algorithm has to reconstruct a map of its surroundings from the camera feeds rather than relying on sensors that can capture that picture directly.”
Tesla explains on its website the protocols it has embraced to develop its neural networks:
- Apply cutting-edge research to train deep neural networks on problems ranging from perception to control;
- Per-camera networks analyze raw images to perform semantic segmentation, object detection, and monocular depth estimation;
- Birds-eye-view networks take video from all cameras to output the road layout, static infrastructure, and 3D objects directly in the top-down view;
- Networks learn from the most complicated and diverse scenarios in the world, iteratively sourced from a fleet of nearly 1M vehicles in real time; and,
- A full build of Autopilot neural networks involves 48 networks that take 70,000 GPU hours to train, and, together, output 1,000 distinct tensors (predictions) at each timestep.
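To put the 70,000 GPU-hour figure in perspective, a rough sketch can convert it into wall-clock time for a given cluster size. The GPU counts below are hypothetical, and the calculation assumes perfect linear scaling across GPUs, which real training runs do not achieve.

```python
# Training cost for a full Autopilot build, as quoted from Tesla's website.
AUTOPILOT_GPU_HOURS = 70_000


def wall_clock_hours(gpu_hours, num_gpus):
    """Idealized training time: total GPU hours split evenly across GPUs.

    Assumes perfect linear scaling (no communication or I/O overhead),
    so the result is a lower bound rather than a realistic estimate.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return gpu_hours / num_gpus


if __name__ == "__main__":
    # Hypothetical cluster sizes, chosen only for illustration.
    for num_gpus in (8, 100, 1_000):
        hours = wall_clock_hours(AUTOPILOT_GPU_HOURS, num_gpus)
        print(f"{num_gpus:>5} GPUs: {hours:,.0f} hours (~{hours / 24:,.1f} days)")
```

Even with 1,000 GPUs and ideal scaling, a full build would take 70 hours, or about 2.9 days, which helps explain why Musk calls the gap between days and hours "a big deal."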
Training Teslas via Videofeeds
Tesla gathers more training data than other car companies. Each of the more than 1 million Teslas on the road sends back to the company the videofeeds from its 8 cameras. The Hardware 3 onboard computer processes more than 40 times the data compared to Tesla’s previous-generation system. The company employs 1,000 people who label those images, noting cars, trucks, traffic signs, lane markings, and other features, to help train the large transformer.
At the August event, Tesla also said it can automatically select which images to prioritize for labeling, making the process more efficient. This is one of the many pieces that set Tesla apart from its competitors.
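Tesla did not detail how it chooses which images to label first. One common approach in machine learning, shown here purely as an illustration and not as Tesla's method, is uncertainty sampling: send labelers the images on which the current model is least confident. In the sketch below, the frame IDs, probabilities, and function names are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a probability distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def prioritize_for_labeling(predictions, budget):
    """Return the IDs of the `budget` most uncertain images.

    `predictions` maps an image ID to the model's predicted class
    probabilities for that image (a hypothetical data layout).
    """
    ranked = sorted(
        predictions.items(),
        key=lambda item: entropy(item[1]),
        reverse=True,
    )
    return [image_id for image_id, _ in ranked[:budget]]


if __name__ == "__main__":
    # Made-up model outputs for three frames over three classes.
    predictions = {
        "frame_a": [0.98, 0.01, 0.01],  # model is confident
        "frame_b": [0.40, 0.35, 0.25],  # model is unsure
        "frame_c": [0.70, 0.20, 0.10],  # somewhere in between
    }
    print(prioritize_for_labeling(predictions, budget=2))
```

Here the two frames with the most evenly spread probabilities, frame_b and frame_c, are selected first, so human labeling effort goes where the model has the most to learn.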