Chapter 655 [Neural Network Deep Learning]

An hour later, Fang Hong came to the headquarters of Quantitative Capital again.

Chen Yu's assistant came to receive him and led him toward the reception room, saying, "Mr. Fang, Mr. Chen is in a meeting with the technical department. Please wait a moment, I'll go inform him."

Fang Hong said, "No need, take me straight to his conference room. I'll just sit in and listen."

Hearing this, Chen Yu's assistant took out his mobile phone and sent Chen Yu a message. A reply came back almost at once, and the assistant turned to Fang Hong with a smile: "Mr. Fang, this way, please."

After a while, Fang Hong arrived at the conference room where Chen Yu was. More than thirty people were present, and when they saw a strange young man walk in, everyone looked over with curiosity.

They could see that Fang Hong was about the same age as their boss Chen Yu, but unlike Chen Yu, he carried a commanding aura that none of them had at that age, and everyone realized at once that this unfamiliar young man was no ordinary person.

At this moment, Chen Yu caught sight of him. Fang Hong nodded in greeting, smiled slightly, and quietly found a seat in the conference room to listen.

Chen Yu withdrew his gaze, looked around at the participants, and continued: "...As for the basic idea behind implementing artificial intelligence, the process of machine learning is, simply put, how a computer teaches itself."

"Because all computer operations are based on mathematical operations, any machine learning idea is ultimately about turning a practical problem into a mathematical problem. In order for a computer to be able to predict or recognize something, it needs to construct a mathematical function, which is called a prediction function. ā€

It might be hard for an ordinary person to imagine: Quantitative Capital was a diversified financial company, in the eyes of most shareholders nothing more than a non-bank financial investment firm, and its head was himself engaged in investment and trading, yet here he was lecturing on subjects like these inside the company.

Fang Hong, however, was quite calm. This was actually normal; Wall Street, after all, was staffed with top mathematicians and physicists.

At this moment, Chen Yu turned to the conference screen and said: "For example, a prediction function for eating until you're full can be written as [Full = N bowls of rice]. Is this prediction accurate? What exactly is the relationship between how many bowls of rice a person eats and being full? Does one bowl fill you up, or three?"

"This needs to be actually tried, if the prediction is that two bowls of rice are full, but the actual three bowls of rice are full, the error of one bowl is the loss, and the function describing this loss is [3-N=1], which is the loss function."

"Machine learning is the process of constantly trying to minimize this error, and the way to find the minimum loss is usually gradient descent, once we find the minimum error, we will find that when [N=3], the error is the smallest, that is, the machine learning has found the real law, and the problem has been successfully solved."

Chen Yu looked back at the crowd: "So machine learning is about finding the rules hidden in data. Most of the time, its essence is to project data into a coordinate system and then have the computer, through mathematical methods, draw a line that separates or fits that data."

"Different machine learning methods are using different mathematical models to project data and draw lines, from the last century to the present, different schools have found different methods, good at solving different problems, and there are several kinds of huge impacts: linear regression and logistic regression, K-nearest neighbors, decision trees, support vector machines, Bayesian classification, perceptrons, etc."

Fang Hong sat to the side and listened in silence. He was half an insider in the field of computer science himself, and he had the added advantage of foreknowledge from his previous life, so following along posed no difficulty at all.

Chen Yu's team clearly belonged to the neural network school, but they had gone a step further into deep learning, and the predecessor of the neural network was the perceptron.

All three terms, in essence, describe the same thing.

At this moment, Chen Yu went on slowly: "The most basic idea of deep learning is to simulate the activity of the brain's neurons to construct the prediction function and the loss function."

A diagram of the structure of neurons in the brain is shown on the screen.

"This is a neuron, everyone knows its structure, this is the dendrite, this is the axon, the signals sent by other neurons enter the neuron through the dendrite, and then shoot out through the axon, this is the operating mechanism of a neuron."

"Now we mutate the tree of neurons into inputs, and the axes into outputs, and the neuron becomes a graph like this. It's even easier to turn it into a mathematical formula, [X1+X2+X3=Y], which is the formula. ā€

"That's right, it's as simple as that. The most complex things are often created by the simplest things, simple 0s and 1s shape the huge computer world, four nucleotides vacant complex life phenomena, and a simple neuronal reflex shapes our brain. ā€

Chen Yu paused for a moment and looked around at everyone again: "The key is not how simple the basic structure is, but how we use that basic structure to build an enormous world. And the reason the neuron is remarkable is that it has an activation mechanism, the so-called threshold."

"Each dendrites of a neuron are constantly receiving input signals, but not every input signal can make the axons output signals, and each dendrites have a different weight at the input."

"For example, if you pursue a girl, you take all kinds of actions tirelessly, give her a bouquet of flowers today, and treat her to a big dinner tomorrow, but you find that none of these actions can impress her. Until one day I went shopping with her for a day, and she was suddenly moved and agreed to be your girlfriend, what does this mean? ā€

"It shows that not all input weights are the same, and the weight of shopping may be the largest in girls' places, followed by the accumulation of effects is not a linear and gradual process, but quantitative changes cause qualitative changes."

"All the inputs are completely ineffective before a certain point, but once they reach a certain value, they are suddenly excited, so to imitate this activation characteristic of neurons, then modify the formula just now."

"Each input needs a certain weight, and a coefficient [W] that adjusts the weight is added in front of it, and a constant is added in the back to make it easier to adjust the threshold, so this function becomes like this."

Fang Hong also looked at the big conference screen, which now showed a new mathematical formula.

[W1X1 + W2X2 + W3X3 + b = Y]

Chen Yu looked at the formula on the screen and said: "To realize the activation process, we process the output value one step further by adding an activation function. For example: when the value is greater than 1, output 1; when it is less than 1, output 0. And this is what it looks like."

"However, this function does not seem to be rounded enough, and it is not derivable everywhere, so it is difficult to deal with, so it can be replaced with a Sigmoid function, such a simple function can handle the classification problem."

"A single perceptron is actually a line that separates two different things, and a single perceptron can solve a linear problem, but it can't do anything about the linear inseparability, which means that even the simplest XOR problem cannot be handled."

Everyone present, including Fang Hong, understood the XOR problem; it is one of the basic operations of a computer.

At this point Chen Yu posed the question himself: "If it can't even handle XOR, isn't that a death sentence?"

Then he answered it: "It's very simple: use a kernel function to lift the data into a higher dimension. The reason the perceptron was able to grow into today's deep learning is that it went from a single layer to many layers. The 'depth' in deep learning refers to the many layers of perceptrons, and we usually call a neural network with more than three hidden layers a deep neural network."

Chen Yu turned back to the screen, brought up the next slide, and said: "A computer has four basic logical operations, AND, OR, NOT, and XOR; those need no introduction. If we place XOR in a coordinate system, this is what we get."

"The origin position X is 0 and Y is 0, so 0 is taken; When X=1, Y=0, the two are different to take 1, Kone, here is also 1, and this position X and Y are equal to 1, so take 0, if we need to separate 0 and 1 on this diagram, a straight line can't do it. ā€

"What to do? Mathematically speaking, XOR operation is actually a composite operation, which can actually be obtained by other operations, and the proof process is too complicated to expand here. ā€

"If we can use the perceptron to complete the bracketed operation first, and then input the result into another perceptron to perform the outer layer of the operation, we can complete the puzzle operation, and then the XOR problem is solved so magically, and the problem of linear inseparability is solved while solving the problem."

"What does this mean? It means that in theory, no matter how complex the data, we can fit a suitable curve to separate it by adding layers, and adding layers is just the nesting of functions. In theory, no matter how complex the problem, we can compose a solution out of simple linear functions; therefore, in theory, the multi-layer perceptron can become a general method for solving machine learning problems across every domain."
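As a final illustration of "adding layers is the nesting of functions," here is a minimal sketch in which each layer is a weighted sum followed by a Sigmoid, and depth is simply function composition. The layer sizes and random weights are arbitrary illustrative choices.

```python
# Depth as function nesting: layer(layer(layer(x))).
import math
import random

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

def layer(xs, weights, biases):
    # one layer: each output neuron is a weighted sum of all inputs,
    # plus a bias, passed through the Sigmoid activation
    return [sigmoid(sum(w * x for w, x in zip(ws, xs)) + b)
            for ws, b in zip(weights, biases)]

def random_layer(n_in, n_out):
    weights = [[random.uniform(-1, 1) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

random.seed(0)
# three hidden layers: by the talk's definition, a deep neural network
l1, l2, l3 = random_layer(2, 4), random_layer(4, 4), random_layer(4, 1)

x = [1.0, 0.0]
h = layer(layer(layer(x, *l1), *l2), *l3)  # depth = nested function calls
print(h)                                    # a single output in (0, 1)
```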

……

(End of chapter)