CN109726392A

CN109726392A - Intelligent language cognitive information processing system and method based on big data

Info

Publication number: CN109726392A
Application number: CN201811521939.7A
Authority: CN
Inventors: 尹观海; 方燕红; 王文烨; 李小东; 陈佳; 张明宝; 廖玲萍
Original assignee: Jinggangshan University
Current assignee: Jinggangshan University
Priority date: 2018-12-13
Filing date: 2018-12-13
Publication date: 2019-05-07
Anticipated expiration: 2038-12-13
Also published as: CN109726392B

Abstract

The invention belongs to the field of big data and discloses an intelligent language cognitive information processing system and method based on big data. Language is input in voice and text form; words are extracted from the language using the best uniform approximation method with reference to words, idioms, proverbs, and sentence patterns, and are converted after extraction; the converted content is checked against the sentence meanings stored in the system using the Chauvenet algorithm in order to verify it; content that passes verification is input to a microprocessor, while content that fails verification is extracted and converted again and is passed to the microprocessor once it qualifies; finally, the information is saved using PURE-LET wavelet-domain denoising and output through a loudspeaker. The invention can substantially reduce the error rate of an intelligent language cognition system, can perform multilingual conversion, and can improve conversion efficiency through its memory function.

Description

Intelligent language cognitive information processing system and method based on big data

Technical field

The invention belongs to the field of big data, and in particular relates to an intelligent language cognitive information processing system and method based on big data.

Background art

In a broad sense, language is a set of communication instructions expressed according to commonly shared rules, and these instructions are transmitted visually, audibly, or by touch. Strictly speaking, language refers to the instructions used in human communication, that is, natural language. Everyone acquires language ability through learning, and the purpose of language is to exchange ideas, opinions, thoughts, and so on. Linguistics developed from the human study of the classification and rules of language. Language is a means of communication between people, and human interaction cannot do without it. Although thoughts can also be conveyed through pictures, actions, expressions, and the like, language is the most important and most convenient medium. When humans discovered that certain animals can communicate in some way, the concept of animal language was born. With the advent of computers, humans needed to give instructions to computers, and this "one-way communication" became computer language. However, computers cannot recognize human speech well when they try to understand it directly. At present, computers have a high error rate in intelligent language cognition, many words cannot be recognized, and only simple, single recognition can be performed.

In summary, the problems of the prior art are:

At present, computers have a high error rate in intelligent language cognition, many words cannot be recognized, and only simple, single recognition can be performed.

In the prior art, words cannot be extracted accurately; erroneous or unnecessary information cannot be effectively removed from the converted content, which prolongs verification time, reduces verification efficiency, and prevents efficient verification of the converted content; and information is susceptible to interference from external factors, which degrades information quality, introduces errors, and hinders accurate output through the loudspeaker.

Summary of the invention

In view of the problems of the prior art, the present invention provides an intelligent language cognitive information processing system and method based on big data.

The invention is realized as follows. An intelligent language cognitive information processing method based on big data comprises:

First step: language is input in voice and text form;

Second step: words are extracted from the language using the best uniform approximation method with reference to words, idioms, proverbs, and sentence patterns, and the words are converted after extraction;

Third step: the converted content is checked against the sentence meanings stored in the system using the Chauvenet algorithm in order to verify the converted content;

Fourth step: content that passes verification is input to the microprocessor; content that fails verification is extracted and converted again, and is passed to the microprocessor once it qualifies; finally, the information is saved using PURE-LET wavelet-domain denoising and output through the loudspeaker.

Further, in the second step, word extraction is performed on the language using the best uniform approximation method with reference to words, idioms, proverbs, and sentence patterns. The specific algorithm is as follows: let f(x) ∈ C[a, b], and let P_n be the set of all polynomials of degree at most n. If p*(x) ∈ P_n satisfies

$$\max_{a \le x \le b} \left| f(x) - p^*(x) \right| = \min_{p \in P_n} \max_{a \le x \le b} \left| f(x) - p(x) \right|,$$

then p*(x) is called the best uniform approximation polynomial of f(x) on [a, b], also known as the minimax polynomial;

The best approximation polynomial is found using the Remez algorithm; by Chebyshev's theorem, it is obtained by solving:

$$f(x_i) - \sum_{k=0}^{n} a_k x_i^k = (-1)^i \rho, \qquad i = 0, 1, \ldots, n+1,$$

where a_k (k = 0, 1, ..., n) are the polynomial coefficients to be determined, ρ is the best approximation value, and the points x_i are obtained by iterative correction.
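The exchange procedure described above can be sketched as follows. This is a minimal illustration of a Remez-style iteration for a generic continuous function, not the patent's word extraction step: the function name `remez`, the Chebyshev starting points, the fixed search grid, and the test function `np.exp` are all assumptions introduced for the example.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def remez(f, a, b, n, iterations=10, grid_size=2000):
    """Approximate the degree-n minimax polynomial of f on [a, b].

    Returns (coeffs, rho): coefficients a_0..a_n in increasing order and
    the levelled error rho from the last linear solve.
    """
    k = np.arange(n + 2)
    # Assumed starting reference: n + 2 Chebyshev extrema mapped onto [a, b].
    x = 0.5 * (a + b) - 0.5 * (b - a) * np.cos(np.pi * k / (n + 1))
    grid = np.linspace(a, b, grid_size)
    for _ in range(iterations):
        # Solve sum_k a_k x_i^k + (-1)^i rho = f(x_i) for a_0..a_n and rho.
        system = np.hstack([np.vander(x, n + 1, increasing=True),
                            ((-1.0) ** k)[:, None]])
        solution = np.linalg.solve(system, f(x))
        coeffs, rho = solution[:-1], solution[-1]
        # Correct the reference: split the grid wherever the error changes
        # sign and move each x_i to the largest |error| in its segment.
        err = f(grid) - P.polyval(grid, coeffs)
        cuts = np.where(np.diff(np.sign(err)) != 0)[0] + 1
        segments = np.split(np.arange(grid_size), cuts)
        if len(segments) != n + 2:
            break  # the simple exchange rule does not apply; keep last result
        x = np.array([grid[s[np.argmax(np.abs(err[s]))]] for s in segments])
    return coeffs, rho


# Example: best straight-line approximation of exp(x) on [0, 1].
coeffs, rho = remez(np.exp, 0.0, 1.0, 1)
```

For a convex function such as exp(x) and n = 1, the slope of the resulting line equals the secant slope e - 1, which gives a quick sanity check on the solve.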

Further, in the third step, the converted content is checked using the Chauvenet algorithm, which achieves efficient verification of the converted content. The algorithm is as follows:

Take the data sample set S₀ = {x₀, x₁, ..., x_n}, in which the n sample data contain m erroneous sample points, and let f₀(x) be the function that reflects the essential characteristics of this group of data samples, as follows:

In the formula, n is the number of individuals in the group of data;

D_i = |x_i - f(x_i)|;

D_i measures the degree to which sample point x_i deviates from the functional relationship: the larger D_i is, the more likely the sample point is erroneous data. The maximum value of D_i over the n data is found;

The Chauvenet algorithm rejects the sample point j with the largest D_i and forms a new sample set S₁ = {S₀ – x_j}; the operation is repeated on the remaining data, and when the data satisfy the termination condition, the m rejected sample points are the erroneous data.
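The iterative rejection just described can be sketched as below. The text does not reproduce the formula for f₀(x) or the termination condition, so this sketch makes two labeled assumptions: the characteristic function is taken to be the sample mean, and the loop stops when the classical Chauvenet criterion (expected count of equally extreme points at least 0.5 under a normal model) no longer flags the most deviant point. The name `chauvenet_reject` and the sample values are hypothetical.

```python
import math
import statistics


def chauvenet_reject(samples):
    """Repeatedly drop the sample with the largest deviation D_i.

    Assumptions (not from the source): f0 is the sample mean, and the
    termination condition is the classical Chauvenet criterion.
    Returns (kept, rejected).
    """
    kept = list(samples)
    rejected = []
    while len(kept) > 2:
        mean = statistics.fmean(kept)
        std = statistics.stdev(kept)
        if std == 0:
            break  # all remaining samples are identical
        deviations = [abs(x - mean) for x in kept]  # D_i = |x_i - f(x_i)|
        j = max(range(len(kept)), key=deviations.__getitem__)
        # Two-sided normal tail probability of a deviation this large.
        tail = math.erfc(deviations[j] / (std * math.sqrt(2.0)))
        if len(kept) * tail >= 0.5:
            break  # termination: the worst point is no longer suspicious
        rejected.append(kept.pop(j))
    return kept, rejected


kept, rejected = chauvenet_reject([10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 15.0])
```

Recomputing the mean and standard deviation after every rejection mirrors the construction of the new sample set S₁ from S₀ at each pass.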

Another object of the present invention is to provide an intelligent language cognitive information processing system based on big data that implements the above method. The system comprises: a language receiving module, a text input module, a word extraction module, a conversion module, a verification module, a microprocessor, a storage module, a loudspeaker module, and big data;

The big data provides knowledge support to the word extraction module and the verification module; after input through the language receiving module and the text input module, extraction is performed by the word extraction module, the extracted words are converted by the conversion module, and the conversion module inputs the converted content to the verification module;

The verification module inputs content that passes verification to the microprocessor, and returns content that fails verification to the word extraction module to be converted again;

The microprocessor saves the converted information to the storage module, and outputs the information through the loudspeaker module.

Another object of the present invention is to provide a metacognition platform that applies the intelligent language cognitive information processing method based on big data.

The advantages and positive effects of the present invention are as follows. A verification module is provided, which checks the information output by the conversion module against the information in the big data; if the check finds the conversion to be wrong, extraction and conversion are performed again, so that the system arrives at a correct cognition and avoids errors. A storage module is provided, which records the converted language so that the conversion system builds up a memory and the next conversion is faster. Big data is provided, which broadens the system's vocabulary sources, allows multiple languages to be recognized, and allows common sayings, idioms, and the like to be looked up, giving a low error rate. The invention can substantially reduce the error rate of an intelligent language cognition system, can perform multilingual conversion, and can improve conversion efficiency through its memory function.

The present invention extracts words from the language using the best uniform approximation method with reference to words, idioms, proverbs, sentence patterns, and the like, which improves the accuracy of word extraction. It checks the converted content using the Chauvenet algorithm, which effectively removes erroneous or unnecessary information, improves verification efficiency, and achieves efficient verification of the converted content. It saves the information using PURE-LET wavelet-domain denoising, which effectively avoids interference from external factors, preserves information quality, and facilitates accurate output through the loudspeaker.

Brief description of the drawings

Fig. 1 is a flowchart of the intelligent language cognitive information processing method based on big data provided by an embodiment of the present invention.

Fig. 2 is a structural schematic diagram of the intelligent language cognitive information processing system based on big data provided by an embodiment of the present invention;

In the figures: 1, language receiving module; 2, text input module; 3, word extraction module; 4, conversion module; 5, verification module; 6, microprocessor; 7, storage module; 8, loudspeaker module; 9, big data.

Detailed description of the embodiments

For a further understanding of the content, features, and effects of the present invention, the following embodiments are given and described in detail with reference to the accompanying drawings.

The structure of the invention is explained in detail below with reference to the accompanying drawings.

As shown in Fig. 1, the intelligent language cognitive information processing method based on big data provided by an embodiment of the present invention specifically comprises the following steps:

S101: language is input in voice and text form;

S102: words are extracted from the language using the best uniform approximation method with reference to words, idioms, proverbs, sentence patterns, and the like, and the words are converted after extraction;

S103: the converted content is checked against the sentence meanings stored in the system using the Chauvenet algorithm in order to verify the converted content;

S104: content that passes verification is input to the microprocessor; content that fails verification is extracted and converted again, and is passed to the microprocessor once it qualifies; finally, the information is saved using PURE-LET wavelet-domain denoising and output through the loudspeaker.

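In step S102 of this embodiment, words are extracted from the language using the best uniform approximation method with reference to words, idioms, proverbs, sentence patterns, and the like, which improves the accuracy of word extraction. The specific algorithm is as follows:

Let f(x) ∈ C[a, b], and let P_n be the set of all polynomials of degree at most n. If p*(x) ∈ P_n satisfies

$$\max_{a \le x \le b} \left| f(x) - p^*(x) \right| = \min_{p \in P_n} \max_{a \le x \le b} \left| f(x) - p(x) \right|,$$

then p*(x) is the best uniform approximation polynomial of f(x) on [a, b].

The best approximation polynomial is found using the Remez algorithm; by Chebyshev's theorem, it is obtained by solving

$$f(x_i) - \sum_{k=0}^{n} a_k x_i^k = (-1)^i \rho, \qquad i = 0, 1, \ldots, n+1,$$

where a_k (k = 0, 1, ..., n) are the polynomial coefficients to be determined and ρ is the best approximation value.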

In step S103 of this embodiment, the converted content is checked using the Chauvenet algorithm, which effectively removes erroneous or unnecessary information, improves verification efficiency, and achieves efficient verification of the converted content. The algorithm is as follows:

Take the data sample set S₀ = {x₀, x₁, ..., x_n}, in which the n sample data contain m erroneous sample points, and let f₀(x) be the function that reflects the essential characteristics of this group of data samples, as follows:

In the formula, n is the number of individuals in the group of data;

D_i = |x_i - f(x_i)|

The Chauvenet algorithm rejects the sample point j with the largest D_i and forms a new sample set S₁ = {S₀ - x_j}; the operation is repeated on the remaining data, and when the data satisfy the termination condition, the m rejected sample points are the erroneous data.

In step S104 of this embodiment, the information is saved using PURE-LET wavelet-domain denoising, which effectively avoids interference from external factors, preserves information quality, and facilitates accurate output through the loudspeaker. The specific algorithm is as follows:

For the information at each scale, the wavelet coefficient estimate is written as a linear combination of a set of basic threshold functions:

The coefficient vector a = [a₁, ..., a_M]^T is determined by minimizing PURE;

Let θ(d, s) = θ^j(d^j, s^j) be an estimate of the noise-free wavelet coefficients δ = δ^j; the functions θ⁺(d, s) and θ⁻(d, s) are as follows:

where the e_k form the standard basis, that is, all elements of e_k are 0 except e_k(k) = 1. The random variable PURE_j is then an unbiased estimate of the MSE in subband j, i.e. E{PURE_j} = E{MSE_j};

The linear combination parameters of the wavelet estimate in formula (2) are calculated by minimizing PURE; substituting formula (2) into formula (3) and omitting the arguments (d, s) gives
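The "linear combination of threshold functions" idea can be illustrated with the sketch below. It is not PURE-LET itself: the PURE risk estimate, whose formula is not reproduced in this text, is replaced by an oracle least-squares fit against a known clean signal, purely to show how the coefficient vector a is obtained from a linear system. The one-level Haar transform, the two basis functions, the Poisson test data, and every function name are assumptions made for the example.

```python
import numpy as np


def haar_forward(x):
    """One-level orthonormal Haar transform: (scaling s, detail d)."""
    s = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return s, d


def haar_inverse(s, d):
    """Inverse of haar_forward."""
    x = np.empty(2 * s.size)
    x[0::2] = (s + d) / np.sqrt(2.0)
    x[1::2] = (s - d) / np.sqrt(2.0)
    return x


def let_basis(d, threshold):
    """Two assumed basis functions: identity and soft threshold."""
    soft = np.sign(d) * np.maximum(np.abs(d) - threshold, 0.0)
    return np.stack([d, soft], axis=1)


def let_denoise_oracle(noisy, clean, threshold):
    """Estimate details as basis @ a, with a fitted by oracle least squares.

    A real PURE-LET scheme would minimise the PURE estimate instead of
    using the clean signal, which is unavailable in practice.
    """
    s, d = haar_forward(noisy)
    _, d_clean = haar_forward(clean)
    basis = let_basis(d, threshold)
    a, *_ = np.linalg.lstsq(basis, d_clean, rcond=None)
    return haar_inverse(s, basis @ a), a


rng = np.random.default_rng(0)
clean = 40.0 + 30.0 * np.sin(np.linspace(0.0, 4.0 * np.pi, 256))
noisy = rng.poisson(clean).astype(float)
denoised, a = let_denoise_oracle(noisy, clean, threshold=3.0)
```

Because the identity function is one of the basis elements and the transform is orthonormal, the fitted combination cannot have a larger mean squared error than the noisy input in this oracle setting.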

As shown in Fig. 2, the intelligent language cognitive information processing system based on big data provided by an embodiment of the present invention specifically comprises:

A language receiving module 1, a text input module 2, a word extraction module 3, a conversion module 4, a verification module 5, a microprocessor 6, a storage module 7, a loudspeaker module 8, and big data 9.

The big data 9 provides knowledge support to the word extraction module 3 and the verification module 5; after input through the language receiving module 1 and the text input module 2, extraction is performed by the word extraction module 3, the extracted words are converted by the conversion module 4, and the conversion module 4 inputs the converted content to the verification module 5.

The verification module 5 provided by this embodiment inputs content that passes verification to the microprocessor 6, and returns content that fails verification to the word extraction module 3 to be converted again.

The microprocessor 6 provided by this embodiment saves the converted information to the storage module 7.

The microprocessor 6 provided by this embodiment outputs the information through the loudspeaker module 8.

Working principle of the invention: input is received through the language receiving module 1 and the text input module 2; the word extraction module 3 performs extraction with reference to the words, idioms, proverbs, sentence patterns, and the like in the big data 9; the extracted words are converted by the conversion module 4, and the converted content is input to the verification module 5; the verification module 5 checks it against the existing sentence meanings in the big data 9; content that passes verification is input to the microprocessor 6, while content that fails verification is returned to the word extraction module 3, which changes the extraction mode so that conversion is performed again; the microprocessor 6 saves the information to the storage module 7 and outputs it through the loudspeaker module 8.
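The control flow of the working principle (extract, convert, verify, retry with a different extraction mode, then store and output) can be sketched as follows. The class `LanguagePipeline`, the toy lexicon, and the two extraction modes are hypothetical stand-ins introduced only for illustration; they are not the patent's extraction, conversion, or verification algorithms.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class LanguagePipeline:
    extractors: List[Callable[[str], List[str]]]  # extraction modes, tried in order
    convert: Callable[[List[str]], str]           # conversion module
    verify: Callable[[str], bool]                 # verification module
    memory: Dict[str, str] = field(default_factory=dict)  # storage module

    def process(self, text: str) -> Optional[str]:
        if text in self.memory:          # memory function: reuse an earlier result
            return self.memory[text]
        for extract in self.extractors:  # on failure, change the extraction mode
            converted = self.convert(extract(text))
            if self.verify(converted):
                self.memory[text] = converted
                return converted
        return None                      # no extraction mode passed verification


# Toy knowledge base standing in for the big data (hypothetical entries).
LEXICON = {"good morning": "bonjour", "friend": "ami"}


def by_word(text: str) -> List[str]:
    return text.split()


def by_phrase(text: str) -> List[str]:
    return [text]


def convert(units: List[str]) -> str:
    return " ".join(LEXICON.get(unit, "?") for unit in units)


def verify(converted: str) -> bool:
    return "?" not in converted


pipeline = LanguagePipeline([by_word, by_phrase], convert, verify)
result = pipeline.process("good morning")
print(result)  # stands in for output through the loudspeaker module
```

In this toy run, word-by-word extraction fails verification, so the pipeline falls back to phrase-level extraction, and the verified result is kept in memory for later requests.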

The above are only preferred embodiments of the present invention and do not limit the present invention in any form. Any simple modification, equivalent change, or variation made to the above embodiments in accordance with the technical essence of the present invention falls within the scope of the technical solution of the present invention.

Claims

1. An intelligent language cognitive information processing method based on big data, characterized in that the method comprises:

First step: language is input in voice and text form;

Second step: words are extracted from the language using the best uniform approximation method with reference to words, idioms, proverbs, and sentence patterns, and the words are converted after extraction;

Third step: the converted content is checked against the sentence meanings stored in the system using the Chauvenet algorithm in order to verify the converted content;

Fourth step: content that passes verification is input to the microprocessor; content that fails verification is extracted and converted again, and is passed to the microprocessor once it qualifies; finally, the information is saved using PURE-LET wavelet-domain denoising and output through the loudspeaker.

2. The intelligent language cognitive information processing method based on big data according to claim 1, characterized in that, in the second step, words are extracted from the language using the best uniform approximation method with reference to words, idioms, proverbs, and sentence patterns, and the specific algorithm is as follows: let f(x) ∈ C[a, b], and let P_n be the set of all polynomials of degree at most n; if:

3. The intelligent language cognitive information processing method based on big data according to claim 1, characterized in that, in the third step, the converted content is checked using the Chauvenet algorithm, which achieves efficient verification of the converted content; the algorithm is as follows:

Take the data sample set S₀ = {x₀, x₁, ..., x_n}, in which the n sample data contain m erroneous sample points, and let f₀(x) be the function that reflects the essential characteristics of this group of data samples, as follows:

In the formula, n is the number of individuals in the group of data;

D_i = |x_i - f(x_i)|;

D_i measures the degree to which sample point x_i deviates from the functional relationship: the larger D_i is, the more likely the sample point is erroneous data; the maximum value of D_i over the n data is found;

The Chauvenet algorithm rejects the sample point j with the largest D_i and forms a new sample set S₁ = {S₀ – x_j}; the operation is repeated on the remaining data, and when the data satisfy the termination condition, the m rejected sample points are the erroneous data.

4. An intelligent language cognitive information processing system based on big data that implements the intelligent language cognitive information processing method based on big data according to claim 1, characterized in that the system comprises: a language receiving module, a text input module, a word extraction module, a conversion module, a verification module, a microprocessor, a storage module, a loudspeaker module, and big data;

The big data provides knowledge support to the word extraction module and the verification module; after input through the language receiving module and the text input module, extraction is performed by the word extraction module, the extracted words are converted by the conversion module, and the conversion module inputs the converted content to the verification module;

The verification module inputs content that passes verification to the microprocessor, and returns content that fails verification to the word extraction module to be converted again;

The microprocessor saves the converted information to the storage module; the microprocessor outputs the information through the loudspeaker module.

5. A metacognition platform applying the intelligent language cognitive information processing method based on big data according to any one of claims 1 to 3.