Chapter 302 Natural Language Processing
The proposal was thick, two or three hundred pages, and very detailed.
Xiao Yuan did not read it word by word; there was no need. He only had to grasp its overall thread and key points and understand the ideas of Tang Xinyu and Gu Wolf. Fortunately, although the proposal was thick, it was very well organized, so he had little trouble reading it.
When Xiao Yuan was halfway through the proposal, Yang Jingchen called him out to eat, so he put it down and went to eat.
After eating, he followed his mother to the study, wanting to ask her something.
"Well, what do you want Mom to do for you?" Yang Jingchen asked in the study. She obviously thought Xiao Yuan was going to bring up the open source community again.
"No, I wanted to ask: how much do you know about Chinese natural language processing, and is anyone in the lab studying it?" Xiao Yuan asked.
"What made you think to ask about this?" Yang Jingchen asked.
"Xuanni is going to build a full-text search engine. The goal is to let users type everyday language into the search box; the search engine should automatically analyze and understand it, then find the information they want among the huge number of web pages on the Internet. That requires natural language processing," Xiao Yuan said.
"A full-text search engine?" Yang Jingchen's eyes lit up, and she said sincerely, "That's a good idea. If it can be done, it will be more promising than the Xuanni Firewall that Xuanni is mainly working on now, and it will be easier to grow into a big industry."
"Yes, we see that too, but some parts of it need both theoretical support and a lot of groundwork. Natural language processing is one of them, and an important one. You're an AI expert, so I thought I'd ask you," Xiao Yuan said.
"Strictly speaking, natural language processing is an interdisciplinary research field with artificial intelligence at its core. It involves not only computer science and artificial intelligence but also linguistics, psychology, and other social sciences. The field started in the West very early, in the 1940s, but natural language is so complex that, to this day, there has been no breakthrough anywhere in the world."
Yang Jingchen was obviously very familiar with the field. She described the current state of international research to Xiao Yuan and then turned to the domestic situation: "Compared with research abroad, our domestic research is still at a lower level, the stage of accumulating basic data. One reason is that China started later than the West. The other is that Chinese is very different from the Western languages written in the Latin alphabet. Those languages use phonetic scripts, and their sentences are regularly structured, so they are relatively easy for computers to process. Chinese, by contrast, is an ancient pictographic script, flexible and free in form and full of function words and particles, and turning such a language into a form that computers can analyze and process is a big problem."
"Well, that's true. So what is the current state of research in our country?" After hearing his mother's words, Xiao Yuan was a little disappointed. In his previous life he had paid only a little attention to the theory in this area and did not know much about the specifics, so he wanted a deeper understanding.
"Several universities in China are doing research in this area now. Jinghua University's work is the most advanced, and most of the other universities cooperate with Jinghua University by compiling the basic lexicon. People in our laboratory are doing this too; they are mainly responsible for compiling and building the entries from H to P, and that work is now 80% complete," Yang Jingchen said.
After listening to Yang Jingchen's introduction, Xiao Yuan asked, "When will the basic lexicon be finished?"
"It's hard to estimate," Yang Jingchen said. Seeing Xiao Yuan's brows furrow, she added, "If you really want to use natural language processing algorithms in a search engine, Mom can give you some suggestions."
Xiao Yuan looked at his mother inquiringly, waiting for her to continue.
"When people use the search engine you describe to find information on the Internet, I think what they type most often will not be complete sentences but keywords or short phrases. If you only analyze keywords and short phrases, the difficulty is not so great. And you don't have to make the product perfect all at once. People have never seen a product like this before, so a little simple intelligence will be enough to attract plenty of users," Yang Jingchen said.
Xiao Yuan nodded and said, "I know that, of course; I'm just a little disappointed in the state of research in our country. The first version certainly shouldn't pack in too much at once. It only needs to do the basics, and then we expand it step by step based on user feedback and the maturity of new technology, so that it keeps improving. That seems to be the famous XP programming in software engineering."
"You know XP programming?"
Yang Jingchen was surprised that Xiao Yuan had mentioned XP programming. (XP here is short for Extreme Programming; the XP in Windows XP is short for Experience, and Windows XP did not yet exist in 1999.) She probably had not expected Xiao Yuan to read books on software engineering, let alone know of XP, a very new development method that had been proposed only in recent years.
"I know a little." Xiao Yuan had not expected his casual remark to surprise his mother, so he steered the conversation back and said, "Mom, I have an idea."
"What idea?" Yang Jingchen asked.
"I'd like Xuanni Search to cooperate closely with your artificial intelligence laboratory. On the one hand, you would get some financial support from Xuanni and could use the massive amount of web data that Xuanni Search collects. On the other hand, Xuanni could be the first to apply your research results in its own products, raising the company's technical level. I think it would be a win-win," Xiao Yuan said.
"Cooperation between university research and enterprises, so that research is turned into productivity as quickly as possible, is something our country has long advocated, and the artificial intelligence laboratory has been looking for companies to do joint research and development with over the years. So if Xuanni wants to cooperate, the laboratory will of course welcome it," Yang Jingchen said.
"Okay. When the time is ripe, I'll have Tang Xinyu talk to you; I won't be involved in the details," Xiao Yuan said.
…………
After chatting with his mother a while longer, Xiao Yuan returned to his room, finished reading the proposal, thought for a while, then took out pen and paper and began to sketch the technical architecture of the Xuanni search engine.
PS: These chapters have to cover some technical knowledge. To make sure there are no mistakes, Green Tea has to think through a lot and look up a lot of material, so the writing is slow and mentally draining.
There will still be three updates today; this is the second.