Chapter 607: The Difficulty of Commercialization
"I don't believe Cultivation Technology can't make money. I have to ask Professor Zhang why we don't commercialize our technology. Do we really have to rely on the state to support our company?" Deng Yunji headed straight for the R&D department, and Xu Gong hurried after him.
Professor Zhang was working on algorithms in his office. Whether the field is machine learning, pattern recognition, data mining, statistical learning, computer vision, speech recognition, or natural language processing, none of them can do without algorithms.
Many kinds of algorithms are commonly used in big data, including decision tree classification, clustering, association rule mining, expectation-maximization (EM), iterative algorithms, classification algorithms, and support vector machines.
A decision tree is a decision analysis method. Given the known probabilities of various scenarios, a tree is constructed to compute the probability that the expected net present value is greater than or equal to zero, and that probability is used to evaluate project risk and judge feasibility. It is a graphical method that applies probability analysis intuitively.
Because the decision branches, when drawn out, resemble the branches of a tree, it is called a decision tree.
For example, suppose a dataset contains information on many patients, and we know various facts about each one, such as age, pulse, blood pressure, VO2 max, and family history.
These are called data attributes.
Now given these attributes, we want to predict whether a patient will develop cancer. Patients may fall into one of two categories: they will have cancer or they will not have cancer. The C4.5 algorithm tells us the classification of each patient.
Using a set of patient attributes and the corresponding patient classes, C4.5 constructs a decision tree that predicts the class of a new patient from that patient's attributes.
So what is a decision tree? Decision tree learning creates something similar to a flowchart for classifying new data. In the same patient example, one path through the flowchart could be: the patient has a history of cancer, the patient's gene expression is highly similar to that of cancer patients, the patient has a tumor, and the tumor is more than 5 cm in size.
The basic principle is that each node of the flowchart is a question about the value of an attribute, and the patient is classified according to the answers.
Is the algorithm supervised or unsupervised? It is a supervised learning algorithm, because the training data is already labeled. Given patient data that has been labeled in advance, C4.5 does not have to discover on its own whether a patient will develop cancer.
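The way C4.5 picks the question at each node can be illustrated with a minimal sketch. It assumes a tiny, entirely hypothetical set of six labeled patient records with two made-up attributes, and it computes only the gain ratio that C4.5 uses to choose a split. The real algorithm also handles continuous attributes, missing values, and pruning, none of which is shown here.

```python
from collections import Counter
from math import log2

# Hypothetical toy records (not real data): two attributes and a known label.
patients = [
    {"family_history": "yes", "tumor": "yes", "cancer": "yes"},
    {"family_history": "yes", "tumor": "yes", "cancer": "yes"},
    {"family_history": "no",  "tumor": "yes", "cancer": "yes"},
    {"family_history": "yes", "tumor": "no",  "cancer": "no"},
    {"family_history": "no",  "tumor": "no",  "cancer": "no"},
    {"family_history": "no",  "tumor": "no",  "cancer": "no"},
]

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())

def gain_ratio(rows, attribute, target="cancer"):
    """Information gain of splitting on `attribute`, divided by the split information."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    total = len(rows)
    # Weighted entropy of the labels that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy([r[attribute] for r in rows])
    if split_info == 0:
        return 0.0  # the attribute has a single value, so it cannot split the data
    return (base - remainder) / split_info

def best_attribute(rows, attributes, target="cancer"):
    """Attribute with the highest gain ratio; it becomes the question at this node."""
    return max(attributes, key=lambda a: gain_ratio(rows, a, target))

print(best_attribute(patients, ["family_history", "tumor"]))
```

In this toy data the tumor attribute separates the two classes perfectly, so it is chosen as the first question. A full tree would repeat the same choice on each resulting subset.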
In most machine learning courses, regression is the first algorithm introduced.
There are two reasons for this. First, regression is relatively simple, and introducing it first lets people move smoothly from statistics to machine learning. Second, regression is the cornerstone of several more powerful algorithms, and if you don't understand regression, you can't learn those algorithms.
There are two important subclasses of regression algorithms: linear regression and logistic regression.
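The difference between the two subclasses can be shown with a minimal sketch. The numbers and function names below are hypothetical and chosen only for illustration: linear regression fits a straight line to numeric values, while logistic regression passes a linear score through the sigmoid function to obtain a probability between 0 and 1.

```python
from math import exp

def fit_line(xs, ys):
    """Ordinary least squares for one input variable: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def sigmoid(z):
    """Logistic function: maps any real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + exp(-z))

# Hypothetical points lying exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)

# Logistic regression reuses the linear score but reads it as a probability.
print(sigmoid(slope * 0.5 + intercept))
```

Linear regression predicts a quantity, while logistic regression predicts how likely a sample is to belong to a class, which is why the latter is used for classification despite its name.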
In the era of big data, data mining is the most critical task.
Big data mining is the process of discovering valuable and potentially useful information and knowledge hidden in large-scale databases that are massive, incomplete, noisy, fuzzy, and random. It is also a decision support process.
It draws mainly on artificial intelligence, machine learning, pattern learning, and statistics. Through highly automated analysis of big data, it performs inductive reasoning and mines potential patterns, helping enterprises, merchants, and users adjust their market strategies, reduce risk, approach the market rationally, and make sound decisions.
At present, data mining can solve many problems in many fields, especially business fields such as banking, telecommunications, and e-commerce, including marketing strategy formulation, background analysis, and enterprise crisis management.
What Professor Zhang was doing was using these algorithms to optimize the underlying technology of the "gimbal", a big data system that they had continuously optimized and redesigned on the basis of the snake system.
Knock knock.
"Come in." Hearing the knock on the door, Professor Zhang didn't raise his head.
"Professor Zhang, I need to talk to you about something." Deng Yunji and the others walked into the office and sat down.
"What's the matter?" Professor Zhang glanced at him.
"The gimbal has been nationally recognized and will soon be applied across various industries, so why don't we commercialize it? As far as I know, Jiangyan put this technology into commercial development long ago, and the market is very broad. If we do the same, we won't need to apply for R&D funding from above, and you won't have to worry about funding anymore," Deng Yunji said.
"The gimbal is built for government departments. If we want to commercialize it, we need an independent data center. Building a data center isn't cheap. Can you get the funding?" said Professor Zhang.
They researched big data and cloud computing, and were currently using Weibo Cloud.
"As long as Cultivation Technology has the capability, I believe the higher-ups will invest," Deng Yunji said confidently.
Although building a data center is expensive, it is a drop in the bucket compared to the country's investment in the big data strategic plan.
Deng Yunji had connections, and he believed the higher-ups also wanted to make some money and would approve his application.
"Even with the data center problem solved, we still need commercial applications," Professor Zhang said.
"Just tell me what applications we need," Deng Yunji said.
"I'm a researcher of basic technology, and I don't know much about commercial applications," Professor Zhang said.
"You're too modest. Isn't your technical level comparable to Jiangyan Company's?" Deng Yunji said.
"It's not a matter of technical level. It's a matter of software design, which requires a good understanding of the business mindset of the Internet. I suggest you recruit some creative young people. You could go to Jiangyan's software park, where there are a lot of people who are good at this," Professor Zhang said.
"Forget about going to Jiangyan Company. We'll recruit on our own," Deng Yunji said.
"It's not that simple. You don't have a plan yet, so you won't know what to do with the people you recruit. And there is very little talent in this area. It would be best to cooperate with Jiangyan Company, since they have trained a lot of such people," Professor Zhang said.
"Yes, I'll think about it," Deng Yunji said.
Deng Yunji had given Hang Yu the cold shoulder not long ago, so how could he turn around and ask him for help? He decided to do it himself.
He posted high-paying job openings while applying for funding from the higher-ups.
The funding application went very smoothly. In less than a week, the higher-ups agreed to let them build their own data center. With the Guizhou Pilot Zone about to be established, it would be a chance to accumulate some construction experience beforehand and avoid problems when the time came.
Recruitment, however, was very slow. As Professor Zhang had said, applicants were few, and those who could pass the interview were fewer still. Half a month passed, and Cultivation Technology had recruited only five people, all of whom had technical skills but no creativity.