CN109726392B

CN109726392B - Intelligent language cognition information processing system and method based on big data

Info

Publication number: CN109726392B
Application number: CN201811521939.7A
Authority: CN
Inventors: 尹观海; 方燕红; 王文烨; 李小东; 陈佳; 张明宝; 廖玲萍
Original assignee: Jinggangshan University
Current assignee: Jinggangshan University
Priority date: 2018-12-13
Filing date: 2018-12-13
Publication date: 2023-10-10
Anticipated expiration: 2038-12-13
Also published as: CN109726392A

Abstract

The invention belongs to the field of big data and discloses an intelligent language cognitive information processing system and method based on big data. Language is input in voice or text form. Words are extracted from the language using words, idioms, sayings and sentence patterns by the best uniform approximation method, and the extracted words are then converted. The converted content is checked against the sentence patterns existing in the system using the Chauvenet algorithm, and the converted content is verified. If verification passes, the result is input to a microprocessor; if verification fails, extraction and conversion are repeated, and the result is input to the microprocessor once it passes. Finally, the information is stored after wavelet-domain denoising with PURE-LET and is output through a loudspeaker. The invention can greatly reduce the error rate of an intelligent language cognitive system, can convert between multiple languages, and can improve conversion efficiency through its memory function.

Description

Intelligent language cognition information processing system and method based on big data

Technical Field

The invention belongs to the field of big data, and particularly relates to an intelligent language cognitive information processing system and method based on big data.

Background

Broadly speaking, language is a set of communication instructions expressed according to common processing rules and conveyed visually, audibly, or by touch. Strictly speaking, language refers to natural language, the instructions humans use to communicate with one another. All people acquire language ability through learning, and the purpose of language is to exchange opinions, thoughts and the like. Linguistics arose from the study of the classification and rules of language. Language is the means by which people communicate, and people cannot interact with each other without it. Although ideas can also be conveyed by pictures, actions, expressions and so on, language is the most important and most convenient medium. When humans found that certain animals can communicate in some way, the concept of animal language was created. With the advent of the computer, humans needed to give instructions to the computer, and this one-way communication became computer language. However, a computer cannot reliably understand language as spoken by humans: at present the error rate of computers in intelligent language cognition is high and many words cannot be recognized, so computers can only perform simple, single-purpose recognition.

In summary, the problems of the prior art are:

At present, computers have a high error rate in intelligent language cognition and cannot recognize many words, so only simple, single-purpose recognition is possible.

In the prior art, words cannot be extracted accurately. Errors and redundant information in the converted content cannot be removed effectively, which prolongs verification time, lowers verification efficiency, and prevents efficient verification of the converted content. The information is also easily disturbed by external factors, which degrades its quality, introduces errors, and hinders accurate output through the loudspeaker.

Disclosure of Invention

Aiming at the problems existing in the prior art, the invention provides an intelligent language cognitive information processing system and method based on big data.

The invention is realized in such a way that the intelligent language cognition information processing method based on big data comprises the following steps:

firstly, inputting a language through a voice and text input mode;

secondly, extracting words from the language using words, idioms, sayings and sentence patterns by the best uniform approximation method, and converting the extracted words;

thirdly, checking the converted content against the sentence patterns existing in the system using the Chauvenet algorithm, and verifying the converted content;

fourth, inputting the verification result to the microprocessor if verification passes; if verification fails, re-extracting and re-converting, and inputting the result to the microprocessor once it passes verification; and finally, storing the information after wavelet-domain denoising with PURE-LET and outputting it through a loudspeaker.

Further, in the second step, words are extracted from the language using words, idioms, sayings and sentence patterns by the best uniform approximation method, and the specific algorithm is as follows: let $f(x) \in C[a, b]$ and let $P_n$ be the set of all polynomials of degree not exceeding $n$; if there exists $p^*(x) \in P_n$ such that

$$\max_{a \le x \le b} |f(x) - p^*(x)| = \min_{p \in P_n} \max_{a \le x \le b} |f(x) - p(x)|,$$

then $p^*(x)$ is the best uniform approximation polynomial of $f(x)$ on $[a, b]$, also called the minimax polynomial;

the optimal polynomial is solved with the Remez algorithm; according to Chebyshev's theorem, it satisfies

$$\sum_{k=0}^{n} a_k x_i^k + (-1)^i \rho = f(x_i), \quad i = 0, 1, \dots, n + 1,$$

wherein: $a_k$ ($k = 0, 1, \dots, n$) are the polynomial coefficients to be solved; $\rho$ is the best approximation error; and the points $x_i$ are obtained by an iterative correction method.
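The following is a minimal sketch of the Remez exchange algorithm, the standard way to compute a minimax (best uniform approximation) polynomial, using the Chebyshev system in the unknowns a_k and ρ and a single-point exchange as the iterative correction of the points x_i. The function names `solve_linear` and `remez`, the search grid, and the tolerance are assumptions made for illustration. The source does not say how language data is mapped to the function f(x), so the example approximates an ordinary scalar function only.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def remez(f, a, b, n, max_iter=100, grid_size=2000, tol=1e-9):
    """Degree-n minimax polynomial of f on [a, b] (single-point exchange).

    Returns (coeffs, rho): coeffs[k] multiplies x**k and |rho| is the
    levelled error at the reference points.
    """
    # Initial reference: n + 2 Chebyshev extrema mapped onto [a, b].
    xs = [0.5 * (a + b) - 0.5 * (b - a) * math.cos(math.pi * i / (n + 1))
          for i in range(n + 2)]
    grid = [a + (b - a) * j / grid_size for j in range(grid_size + 1)]
    for _ in range(max_iter):
        # Chebyshev system: sum_k a_k x_i^k + (-1)^i rho = f(x_i).
        A = [[x ** k for k in range(n + 1)] + [(-1) ** i]
             for i, x in enumerate(xs)]
        *coeffs, rho = solve_linear(A, [f(x) for x in xs])

        def err(x):
            return f(x) - sum(c * x ** k for k, c in enumerate(coeffs))

        # Grid point where the current polynomial is worst.
        x_new = max(grid, key=lambda x: abs(err(x)))
        e_new = err(x_new)
        if abs(e_new) - abs(rho) < tol:
            break  # the error is levelled: minimax polynomial reached

        def same(i):
            # True if the error at reference i has the same sign as e_new.
            return (((-1) ** i) * rho > 0) == (e_new > 0)

        # Swap x_new into the reference while keeping sign alternation.
        if x_new < xs[0]:
            xs = [x_new] + (xs[1:] if same(0) else xs[:-1])
        elif x_new > xs[-1]:
            xs = (xs[:-1] if same(n + 1) else xs[1:]) + [x_new]
        else:
            i = max(j for j, x in enumerate(xs) if x <= x_new)
            xs[i if same(i) else i + 1] = x_new
    return coeffs, rho

# Best straight-line approximation of exp(x) on [0, 1].
coeffs, rho = remez(math.exp, 0.0, 1.0, 1)
# coeffs is approximately [0.894, 1.718] and rho approximately 0.106
```

A single-point exchange was chosen because it is short; multi-point exchange variants usually converge in fewer iterations.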

Further, in the third step, the converted content is checked using the Chauvenet algorithm, so that efficient verification of the converted content is realized; the algorithm comprises the following steps:

a set of data samples $S_0 = \{x_0, x_1, \dots, x_n\}$ is used, in which the n sample data contain m erroneous sample points, and $f_0(x)$ is a function reflecting the basic characteristics of this set of data samples, as follows:

wherein: n is the number of individuals in the set of data;

$D_i = |x_i - f(x_i)|$;

$D_i$ measures the degree to which the sample point $x_i$ deviates from the functional relationship: the larger $D_i$ is, the more likely the sample point is erroneous; the maximum of $D_i$ over the n data is taken;

the Chauvenet algorithm rejects the sample point $x_j$ with the largest $D_i$ and establishes a new sample set $S_1 = \{S_0 - x_j\}$; the operation is repeated on the remaining data, and when the data meet the termination condition, the m removed sample points are the erroneous data.
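The iterative rejection described above can be sketched as follows. Two points are assumptions, because the source does not preserve them: the function f₀ is taken to be the sample mean, and the termination condition is taken to be Chauvenet's criterion (stop once the most deviant point is no longer improbable under a normal model, that is, once n times its two-sided tail probability is at least 0.5). The function name `chauvenet_reject` is hypothetical.

```python
import math
import statistics

def chauvenet_reject(samples):
    """Repeatedly drop the sample with the largest deviation D_i.

    Assumptions (not from the source): f0 is the sample mean, and the
    loop stops when Chauvenet's criterion no longer flags the worst point.
    Returns (kept, removed).
    """
    kept = list(samples)
    removed = []
    while len(kept) > 2:
        mean = statistics.fmean(kept)
        std = statistics.stdev(kept)
        if std == 0:
            break  # all remaining points are identical
        # D_i = |x_i - f(x_i)| with f taken as the mean of the current set.
        deviations = [abs(x - mean) for x in kept]
        j = max(range(len(kept)), key=deviations.__getitem__)
        # Two-sided normal tail probability of a deviation this large.
        tail = math.erfc(deviations[j] / (std * math.sqrt(2)))
        if len(kept) * tail >= 0.5:
            break  # termination condition: worst point is plausible
        removed.append(kept.pop(j))  # S_1 = S_0 minus x_j
    return kept, removed

kept, removed = chauvenet_reject([10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 25.0])
# removed == [25.0]; the six values near 10 are kept
```

Recomputing the mean and standard deviation after each removal matters here, because a single gross error inflates both and can otherwise mask further outliers.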

Another object of the present invention is to provide a big data based intelligent language cognitive information processing system implementing the big data based intelligent language cognitive information processing method, the system comprising: a language receiving module, a text input module, a word extraction module, a conversion module, a verification module, a microprocessor, a storage module, a loudspeaker module and big data;

the big data provides knowledge support for the word extraction module and the verification module; input from the language receiving module and the text input module is passed to the word extraction module for extraction, the extracted words are converted by the conversion module, and the conversion module inputs the converted content to the verification module;

the verification module inputs the verification result to the microprocessor if verification passes, and returns to the word extraction module for re-extraction and re-conversion if verification fails;

the microprocessor stores the converted information in the storage module; the microprocessor outputs the information through the loudspeaker module.

A further object of the invention is to provide a language cognition platform applying the intelligent language cognition information processing method based on big data.

The advantages and positive effects of the invention are as follows. The verification module checks the information output by the conversion module against the information in the big data; if the conversion is found to be wrong, extraction and conversion are carried out again, so that the system arrives at a correct cognition and errors are avoided. The storage module records the converted language, so that the conversion system builds up a memory and the next conversion is faster. The big data gives the system a wider vocabulary source, so that multiple languages can be recognized, colloquial expressions, idioms and the like can be looked up, and the error rate is low. The error rate of the intelligent language cognition system can thus be greatly reduced, multiple languages can be converted, and conversion efficiency can be improved through the memory function.

The invention extracts words from the language using words, idioms, sayings, sentence patterns and the like by the best uniform approximation method, which improves the accuracy of word extraction. The invention checks the converted content with the Chauvenet algorithm, which effectively removes erroneous or redundant information, improves checking efficiency, and realizes efficient verification of the converted content. The invention stores the information after wavelet-domain denoising with PURE-LET, which effectively suppresses interference from external factors, preserves information quality, and facilitates accurate output through the loudspeaker.

Drawings

Fig. 1 is a flowchart of an intelligent language cognition information processing method based on big data provided by an embodiment of the invention.

Fig. 2 is a schematic diagram of an intelligent language cognitive information processing system based on big data provided by an embodiment of the invention.

In the figures: 1. language receiving module; 2. text input module; 3. word extraction module; 4. conversion module; 5. verification module; 6. microprocessor; 7. storage module; 8. loudspeaker module; 9. big data.

Detailed Description

For a further understanding of the content, features and effects of the invention, the following embodiments are described in detail with reference to the accompanying drawings.

The structure of the present invention will be described in detail with reference to the accompanying drawings.

As shown in fig. 1, the intelligent language cognitive information processing method based on big data provided by the embodiment of the invention specifically includes the following steps:

S101: inputting the language in voice or text form;

S102: extracting words from the language using words, idioms, sayings, sentence patterns and the like by the best uniform approximation method, and converting the extracted words;

S103: checking the converted content against the sentence patterns existing in the system using the Chauvenet algorithm, and verifying the converted content;

S104: if verification passes, inputting the verification result to the microprocessor; if verification fails, re-extracting and re-converting, and inputting the result to the microprocessor once it passes verification; and finally, storing the information after wavelet-domain denoising with PURE-LET and outputting it through a loudspeaker.

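In step S102, the embodiment of the invention extracts words from the language using words, idioms, sayings, sentence patterns and the like by the best uniform approximation method, which improves the accuracy of word extraction; the specific algorithm is as follows:

let $f(x) \in C[a, b]$ and let $P_n$ be the set of all polynomials of degree not exceeding $n$; if there exists $p^*(x) \in P_n$ such that

$$\max_{a \le x \le b} |f(x) - p^*(x)| = \min_{p \in P_n} \max_{a \le x \le b} |f(x) - p(x)|,$$

then $p^*(x)$ is the best uniform approximation polynomial of $f(x)$ on $[a, b]$, also called the minimax polynomial;

the optimal polynomial is solved with the Remez algorithm; according to Chebyshev's theorem, it satisfies

$$\sum_{k=0}^{n} a_k x_i^k + (-1)^i \rho = f(x_i), \quad i = 0, 1, \dots, n + 1,$$

where $a_k$ ($k = 0, 1, \dots, n$) are the polynomial coefficients to be solved, $\rho$ is the best approximation error, and the points $x_i$ are obtained by an iterative correction method.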

In step S103, the embodiment of the invention checks the converted content with the Chauvenet algorithm, which effectively removes erroneous or redundant information, improves checking efficiency, and realizes efficient verification of the converted content; the algorithm comprises the following steps:

a set of data samples $S_0 = \{x_0, x_1, \dots, x_n\}$ is used, in which the n sample data contain m erroneous sample points, and $f_0(x)$ is a function reflecting the basic characteristics of this set of data samples, as follows:

wherein: n is the number of individuals in the set of data;

$D_i = |x_i - f(x_i)|$

the Chauvenet algorithm rejects the sample point $x_j$ with the largest $D_i$ and establishes a new sample set $S_1 = \{S_0 - x_j\}$; the operation is repeated on the remaining data, and when the data meet the termination condition, the m removed sample points are the erroneous data.

In step S104, the information is stored after wavelet-domain denoising with PURE-LET, which effectively suppresses interference from external factors, preserves information quality, and facilitates accurate output through the loudspeaker; the specific algorithm is as follows:

at each scale, the estimate of the wavelet coefficients is written as a linear combination of a set of basic threshold functions, and the coefficient vector $a = [a_1, \dots, a_M]^T$ is determined by minimizing PURE;

let $\theta(d, s) = \theta^j(d^j, s^j)$ be the estimate of the noiseless wavelet coefficients $\delta = \delta^j$; the functions $\theta^+(d, s)$ and $\theta^-(d, s)$ are defined using the vectors $e_k$, where $e_k$ is the standard basis vector with $e_k(k) = 1$ and all other elements equal to 0; the random variable $\mathrm{PURE}_j$ is an unbiased estimate of the MSE in subband $j$, i.e., $E\{\mathrm{PURE}_j\} = E\{\mathrm{MSE}_j\}$;

the linear combination parameters of the wavelet estimate in formula (2) are calculated by minimizing PURE: formula (2) is substituted into formula (3) and the arguments $(d, s)$ are omitted.
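Because the estimate is linear in the parameters, minimizing PURE reduces to solving a linear system. The derivation below is a sketch under stated assumptions: $\theta_k$ denotes the k-th basic threshold function evaluated over the subband, $N$ is the number of coefficients in the subband, and $c$ stands for the data-dependent vector that formula (3) supplies in place of the unknown inner products $\theta_k^T \delta$. The symbols $\theta_k$, $N$, $G$ and $c$ are introduced here for illustration, and the exact form of $c$ is not reproduced.

```latex
% Linear expansion of thresholds with parameters a = [a_1, ..., a_M]^T
\hat{\delta} = \sum_{k=1}^{M} a_k \theta_k

% The MSE is quadratic in a, with G_{kl} = \theta_k^T \theta_l and b_k = \theta_k^T \delta
\mathrm{MSE} = \frac{1}{N} \Bigl\| \sum_{k=1}^{M} a_k \theta_k - \delta \Bigr\|^2
             = \frac{1}{N} \bigl( a^T G a - 2 a^T b + \|\delta\|^2 \bigr)

% PURE replaces the unknown b by a data-only vector c with E\{c\} = b
\mathrm{PURE} = \frac{1}{N} \bigl( a^T G a - 2 a^T c \bigr) + \text{terms independent of } a

% Setting the gradient with respect to a to zero gives the optimal parameters
\nabla_a \mathrm{PURE} = \frac{2}{N} ( G a - c ) = 0
\quad \Longrightarrow \quad G a = c
```

The practical consequence is that no iterative search over thresholds is needed: one small M-by-M linear solve per subband yields the coefficient vector $a$.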

As shown in fig. 2, the intelligent language cognitive information processing system based on big data provided in the embodiment of the present invention specifically includes:

a language receiving module 1, a text input module 2, a word extraction module 3, a conversion module 4, a verification module 5, a microprocessor 6, a storage module 7, a loudspeaker module 8 and big data 9.

The big data 9 provides knowledge support for the word extraction module 3 and the verification module 5; input from the language receiving module 1 and the text input module 2 is passed to the word extraction module 3 for extraction, the extracted words are converted by the conversion module 4, and the conversion module 4 inputs the converted content to the verification module 5.

The verification module 5 provided by the embodiment of the invention inputs the verification result to the microprocessor 6 if verification passes, and returns to the word extraction module 3 for re-extraction and re-conversion if verification fails.

The microprocessor 6 provided by the embodiment of the invention stores the converted information in the storage module 7.

The microprocessor 6 provided by the embodiment of the invention outputs the information through the loudspeaker module 8.

The working principle of the invention is as follows: language is input through the language receiving module 1 and the text input module 2; the word extraction module 3 extracts words using the words, idioms, sayings, sentence patterns and the like in the big data 9; the extracted words are converted by the conversion module 4, and the converted content is input to the verification module 5, which verifies it against the sentence patterns existing in the big data 9. If verification passes, the verification module 5 inputs the result to the microprocessor 6; if verification fails, the process returns to the word extraction module 3 for re-extraction and re-conversion. The microprocessor 6 stores the information in the storage module 7 and outputs it through the loudspeaker module 8.
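The control flow of this working principle (extract, convert, verify, retry on failure, then store) can be sketched as below. Every name is hypothetical: `process_utterance` and the toy `extract`, `convert` and `verify` callables merely stand in for modules 3, 4 and 5, a dictionary stands in for the storage module 7, and the retry limit `max_attempts` is an assumption, since the source does not bound the number of re-extractions.

```python
def process_utterance(text, extract, convert, verify, memory, max_attempts=3):
    """Extract -> convert -> verify, re-extracting when verification fails.

    All callables are hypothetical stand-ins for the patent's modules.
    Returns the verified conversion, or None if every attempt fails.
    """
    if text in memory:
        # Storage module: reuse an earlier verified conversion.
        return memory[text]
    for attempt in range(max_attempts):
        words = extract(text, attempt)   # word extraction module
        converted = convert(words)       # conversion module
        if verify(converted):            # verification module
            memory[text] = converted     # microprocessor stores the result
            return converted
    return None

# Toy stand-ins, for illustration only.
def toy_extract(text, attempt):
    # First attempt splits on spaces; retries also strip punctuation.
    tokens = text.split()
    return [t.strip(".,!?") for t in tokens] if attempt > 0 else tokens

def toy_convert(words):
    return " ".join(w.upper() for w in words)

KNOWN_SENTENCES = {"HELLO WORLD"}  # stands in for sentence patterns in the big data

def toy_verify(converted):
    return converted in KNOWN_SENTENCES

memory = {}
result = process_utterance("hello world!", toy_extract, toy_convert,
                           toy_verify, memory)
# result == "HELLO WORLD" after one failed attempt and one retry
```

Passing the attempt number to the extractor is one simple way to make a retry differ from the failed attempt; the source does not say how re-extraction changes between attempts.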

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention in any way; any simple modification, equivalent variation or adaptation of the above embodiments made according to the technical principles of the present invention falls within the scope of the technical solutions of the present invention.

Claims

1. The intelligent language cognition information processing method based on big data is characterized by comprising the following steps of:

firstly, inputting a language through a voice and text input mode;

fourth, inputting the verification result to the microprocessor; if verification fails, re-extracting and re-converting, and inputting the qualified result to the microprocessor; finally, storing the information after wavelet-domain denoising with PURE-LET and outputting it through a loudspeaker;

in the second step, words are extracted from the language using words, idioms and sentence patterns by the best uniform approximation method, and the specific algorithm is as follows: $f(x) \in C[a, b]$, and $P_n$ is the set of all polynomials of degree not exceeding $n$; if there exists $p^*(x) \in P_n$ such that:

$$\max_{a \le x \le b} |f(x) - p^*(x)| = \min_{p \in P_n} \max_{a \le x \le b} |f(x) - p(x)|,$$

then $p^*(x)$ is called the best uniform approximation polynomial of $f(x)$ on $[a, b]$, also called the minimax polynomial;

wherein: $a_k$ ($k = 0, 1, \dots, n$) are the polynomial coefficients to be solved; $\rho$ is the best approximation; $x_i$ is obtained by an iterative correction method;

in the third step, the converted content is checked using the Chauvenet algorithm, so that efficient verification of the converted content is realized; the algorithm comprises the following steps:

a set of data samples $S_0 = \{x_0, x_1, \dots, x_n\}$ is used, in which the n sample data contain m erroneous sample points, and $f_0(x)$ is a function reflecting the basic characteristics of this set of data samples, as follows:

wherein: n is the number of individuals in the set of data;

$D_i = |x_i - f(x_i)|$;

2. A big data-based intelligent language cognitive information processing system that implements the big data-based intelligent language cognitive information processing method of claim 1, characterized in that the system comprises: a language receiving module, a text input module, a word extraction module, a conversion module, a verification module, a microprocessor, a storage module, a loudspeaker module and big data;

3. A language cognition platform applying the intelligent language cognition information processing method based on big data according to claim 1.