162. Reparameterization of Networks
Musk's main business territory lay in heavy industries such as automobiles and rockets, which at this point in time had little to do with artificial intelligence.
However, he was an extremely forward-thinking and radical person, and building ordinary cars was not his style.
Teslas had to be not only electric but autonomous!
What's more, in the face of Meng Fanqi's successive breakthroughs in vision algorithms, he had formed a bold idea.
He wanted to build a purely computer-vision system for Tesla's electric cars, without the help of any other sensing technology.
This was the main reason he had come looking for Meng Fanqi again, in pursuit of a technological breakthrough.
Personally, he was quite satisfied with last time's results, but he had set his goals so high that those results were still not enough to reach them.
Autonomous driving had produced some decent results before deep learning took off, but most of those were based on radar and other sensors.
Lidar and similar sensors detect objects and measure their distance from the vehicle.
Musk, however, felt that this was nothing like the way humans operate vehicles, and that it was simply uncool.
Think about it: how do humans drive?
When a person drives, it is essentially pure vision; looking is enough. Even the mirrors on a vehicle exist mainly to make it easier to see around and behind the car.
Hearing occasionally helps, a honking horn for instance, but it is not critical; it is mainly the visual system doing the work.
Musk called this first-principles thinking. He wanted an intelligent driving system that worked exactly by human logic rather than leaning on extra sensors; after all, humans have no such superpowers.
A vision system, however, depends entirely on a large array of cameras and relies heavily on high-precision detection algorithms, which raises plenty of problems.
What if the car encounters something that never appeared in the training dataset? Can it still be detected?
A lidar-based sensor approach, no matter what it encounters, can always register the obstacle; its principle is nothing like human perception, but at least the car will not simply drive straight into things.
A purely vision-based intelligent system is harder to vouch for: its images must first be processed and then interpreted by a network.
Once the analysis goes wrong and the system misjudges, a collision is all but guaranteed, and the resulting accident could well be fatal.
Musk's aggressive technical strategy and tastes left the AI algorithms with an enormous amount to do.
To abandon ranging sensors entirely, onboard cameras would have to be installed in every direction to guarantee clear coverage.
On top of that, there was another crucial problem: estimating distance.
Humans judge distance from a picture effortlessly, but for an artificial-intelligence vision algorithm it is anything but easy.
Under current technical conditions, extremely complex annotation was needed to associate each region and its pixels in a sample image with real-world distances.
After all, an image is flat and two-dimensional, while autonomous driving is a task that demands a firm grasp of spatial distance.
Three-dimensional space would have to be reconstructed from a large number of flat images taken from different angles, even rendered into a bird's-eye view.
For now, all of that was a castle in the air. Musk's purpose in contacting Meng Fanqi again was simple: he hoped the backbone neural network could be made a little faster, or its computational load a little lighter.
Otherwise, as things stood, Tesla could hardly afford that much compute.
In fact, Musk had not held particularly high hopes for this. In his view, the plan Meng Fanqi had delivered last time was already outrageously good.
At a time when everyone else had only just begun reproducing DreamNet, without yet understanding the principle of residuals or their variants, Meng Fanqi had already run extensive experiments across various platforms and computing devices.
By optimizing operator structure and adjusting the concrete computation, he had cut the core backbone's parameter count by nearly ten times.
Remarkably, the computation became that much faster while performance barely changed.
Musk had raised the request only casually, in private.
But his name carried too much weight, and his past deeds were too audacious; listening to that rather low, magnetic voice, Meng Fanqi took it seriously.
He genuinely believed it was a very earnest request.
"The popularity of autonomous driving is indeed getting faster, and I am doing some optimization work specifically in this area, which is not a loss."
Meng Fanqi took advantage of the rebirth to start buying the stocks of some car companies, and at the same time began to implement a clever way to accelerate and save memory.
This new optimization method is called reparameterization of the network structure.
Over the past six months, the rapid performance gains of vision methods had come from the residual technique Meng Fanqi proposed: y = F(x) became y = F(x) + x.
Written this way it looks simple, with a whole series of complex operations abstracted into F(·); in actual execution it remains fairly involved and can take quite a while to compute.
At the computational level, though, there is a subtlety. In the plain form y = F(x), once the operation starts, the variable x no longer needs to be kept in storage, because it has already been fed into F(x).
As the computation proceeds, it turns into one intermediate variable after another, and finally into the y we want.
In the residual form y = F(x) + x, however, the original input x cannot be discarded.
Memory for this x must stay occupied the whole time, because it is still waiting to be added at the end.
In more complex, high-resolution tasks, the size of this variable is considerable.
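The memory point above can be sketched in a few lines of numpy. This is an illustrative toy, not the novel's actual network: `plain_block`, `residual_block`, and the tanh transform are hypothetical stand-ins for the learned function F.

```python
import numpy as np

def plain_block(x, w):
    # y = F(x): once x has entered F, its buffer could be reused.
    return np.tanh(x @ w)

def residual_block(x, w):
    # y = F(x) + x: the original x must stay live in memory until
    # the final addition, on top of F's own intermediate buffers.
    fx = np.tanh(x @ w)   # stand-in for the learned transform F
    return fx + x         # x is still needed here

x = np.ones((1, 4))
w = np.eye(4) * 0.5
y = residual_block(x, w)
```

For a high-resolution feature map, that retained x is a full extra activation tensor, which is exactly the overhead the passage describes.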
Was there a way around this? And if it could be avoided, could the performance gains of the residual method still be preserved?
The answer, of course, was yes. It could be done.
The core idea of the structural reparameterization Meng Fanqi intended to realize was the decoupling of model training from real-world inference.
First, one set of structures (generally used for training) is constructed; its parameters are then equivalently converted into another set of parameters (generally used for inference), so that the first set of structures becomes mathematically equivalent to the second.
In real-world scenarios, training resources are generally abundant and can be marshaled on large servers.
At inference time, computing resources are often limited, so what people care about is the overhead and performance of inference.
So one trains with a large structure that has desirable properties, such as particularly good performance and high accuracy.
At inference time, the structure is converted into something smaller and faster that remains mathematically equivalent to the large one.
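That equivalence can be shown with a minimal numpy sketch, using two parallel linear branches plus an identity skip as a hypothetical stand-in for the multi-branch blocks described above; the merged weight is computed once, offline, and inference then needs neither the extra branches nor a stored copy of x.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))

# Training-time structure: two parallel linear branches plus an
# identity skip, summed (a toy stand-in for a multi-branch block).
w1 = rng.standard_normal((16, 16))
w2 = rng.standard_normal((16, 16))
y_train = x @ w1 + x @ w2 + x

# Inference-time structure: one linear layer whose weight algebraically
# merges all the branches. Same function, a single matmul, and no need
# to keep x alive for a final skip addition.
w_merged = w1 + w2 + np.eye(16)
y_infer = x @ w_merged
```

Because xW1 + xW2 + xI = x(W1 + W2 + I), the two structures compute exactly the same outputs; only the inference-time form is cheaper.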
Meng Fanqi's new approach offered exactly this possibility, and he believed that the lighter compute demands of reparameterized mobile networks would become a major catalyst in the field of autonomous driving.