CN109726392A - Intelligent language cognition information processing system and method based on big data - Google Patents

Intelligent language cognition information processing system and method based on big data

Info

Publication number
CN109726392A
CN109726392A CN201811521939.7A CN201811521939A CN109726392A CN 109726392 A CN109726392 A CN 109726392A CN 201811521939 A CN201811521939 A CN 201811521939A CN 109726392 A CN109726392 A CN 109726392A
Authority
CN
China
Prior art keywords
module
language
big data
word
conversion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811521939.7A
Other languages
Chinese (zh)
Other versions
CN109726392B (en)
Inventor
尹观海
方燕红
王文烨
李小东
陈佳
张明宝
廖玲萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinggangshan University
Original Assignee
Jinggangshan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinggangshan University
Priority to CN201811521939.7A (CN109726392B)
Publication of CN109726392A
Application granted
Publication of CN109726392B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Machine Translation (AREA)

Abstract

The invention belongs to the field of big data and discloses an intelligent language cognition information processing system and method based on big data. Language is input as speech or as text. Words are extracted from the input with a best uniform approximation method, drawing on words, idioms, proverbs, and sentence patterns, and the extracted words are then converted. The converted content is checked against the sentence meanings stored in the system using the Chauvenet algorithm, which verifies the conversion. Content that passes verification is input to the microprocessor; if verification fails, extraction and conversion are repeated until the result qualifies and is passed to the microprocessor. Finally, the information is saved after PURE-LET wavelet-domain denoising and output through a loudspeaker. The invention can substantially reduce the error rate of an intelligent language cognition system, supports conversion between multiple languages, and improves conversion efficiency through its memory function.

Description

Intelligent language cognition information processing system and method based on big data
Technical field
The invention belongs to the field of big data, and in particular relates to an intelligent language cognition information processing system and method based on big data.
Background
In the broadest sense, language is a set of communication signals governed by shared processing rules, expressed and transmitted visually, audibly, or by touch. Strictly speaking, language means the signals humans use to communicate, that is, natural language. Everyone acquires language ability through learning, and the purpose of language is to exchange ideas, opinions, and thoughts. Linguistics developed from the study of the categories and rules of human language. Language is a means of communication between people, and human interaction cannot do without it. Although thoughts can also be conveyed through pictures, gestures, and facial expressions, language is the most important and most convenient medium. When humans found that certain animals could communicate in some way, the concept of animal language was born. With the arrival of the computer, humans needed to give computers instructions, and this "one-way communication" became computer language. However, computers still cannot understand human speech well: their error rate in intelligent language cognition is high, many words cannot be recognized, and only simple, single-item recognition is possible.
In summary, the problems of the prior art are as follows.
At present, computers have a high error rate in intelligent language cognition, cannot recognize many words, and can only perform simple, single-item recognition.
The prior art cannot extract words accurately. It cannot effectively remove erroneous or unnecessary information from converted content, which lengthens verification time, lowers verification efficiency, and prevents efficient verification of the converted content. Its information is also easily disturbed by external factors, which degrades information quality, introduces errors, and hinders accurate loudspeaker output.
Summary of the invention
In view of the problems of the prior art, the present invention provides an intelligent language cognition information processing system and method based on big data.
The invention is realized as follows. An intelligent language cognition information processing method based on big data comprises:
First step: language is input as speech or as text.
Second step: words are extracted from the language with a best uniform approximation method, using words, idioms, proverbs, and sentence patterns; the words are converted after extraction.
Third step: the converted content is checked against the sentence meanings stored in the system using the Chauvenet algorithm, which verifies the converted content.
Fourth step: content that passes verification is input to the microprocessor; if verification fails, extraction and conversion are repeated until the result qualifies and is passed to the microprocessor. Finally, the information is saved after PURE-LET wavelet-domain denoising and output through a loudspeaker.
Further, in the second step, word extraction with the best uniform approximation method, using words, idioms, proverbs, and sentence patterns, follows this algorithm. Let $f(x) \in C[a, b]$ and let $P_n$ be the set of all polynomials of degree at most $n$. If $p^*(x) \in P_n$ satisfies
$$\max_{a \le x \le b} |f(x) - p^*(x)| = \min_{p \in P_n} \max_{a \le x \le b} |f(x) - p(x)|,$$
then $p^*(x)$ is called the best uniform approximation polynomial of $f(x)$ on $[a, b]$, also known as the minimax polynomial.
The best polynomial is found with the Remez algorithm. By Chebyshev's theorem, it is obtained by solving
$$\sum_{k=0}^{n} a_k x_i^k + (-1)^i \rho = f(x_i), \qquad i = 0, 1, \dots, n + 1,$$
where $a_k$ ($k = 0, 1, \dots, n$) are the polynomial coefficients to be determined, $\rho$ is the best approximation value, and the points $x_i$ are obtained by repeated correction.
Further, in the third step, the converted content is checked with the Chauvenet algorithm, which achieves efficient verification of the converted content. The algorithm is as follows.
Given the data sample set $S_0 = \{x_0, x_1, \dots, x_n\}$, in which the $n$ sample data contain $m$ erroneous sample points, $f_0(x)$ is the function that reflects the essential characteristics of this group of data samples:
[formula missing from the source text]
where $n$ is the number of individuals in the group of data.
$$D_i = |x_i - f_0(x_i)|$$
measures how far the sample point $x_i$ deviates from the functional relationship. The larger $D_i$ is, the more likely the sample point is erroneous. The maximum of $D_i$ over the $n$ data is found.
The Chauvenet algorithm rejects the sample point $j$ with the largest $D_i$ and forms the new sample set $S_1 = \{S_0 - x_j\}$. The operation is repeated on the remaining data; when the data satisfy the termination condition, the $m$ rejected sample points are the erroneous data.
Another object of the present invention is to provide an intelligent language cognition information processing system based on big data that implements the described method. The system comprises: a language receiving module, a text input module, a word extraction module, a conversion module, a verification module, a microprocessor, a storage module, a loudspeaker module, and big data.
The big data provides knowledge support to the word extraction module and the verification module. Input from the language receiving module and the text input module is extracted by the word extraction module and then converted by the conversion module, which passes the converted content to the verification module.
The verification module passes content that is verified to the microprocessor; if verification fails, it returns the content to the word extraction module to be converted again.
The microprocessor saves the converted information to the storage module and outputs the information through the loudspeaker module.
Another object of the present invention is to provide a remaining metacognition platform that applies the described intelligent language cognition information processing method based on big data.
The advantages and positive effects of the present invention are as follows. The verification module checks the information output by the conversion module against the information in the big data; if the conversion is found to be wrong, extraction and conversion are repeated, so the system arrives at a correct cognition and avoids errors. The storage module records the converted language, which gives the conversion system a memory and makes the next conversion faster. The big data widens the system's vocabulary sources, so the system can recognize multiple languages and look up common sayings and idioms with a low error rate. Together, these features substantially reduce the error rate of the intelligent language cognition system, support conversion between multiple languages, and improve conversion efficiency through the memory function.
The present invention extracts words from the language with the best uniform approximation method, using words, idioms, proverbs, sentence patterns, and the like, which improves the accuracy of word extraction. It checks the converted content with the Chauvenet algorithm, which effectively removes erroneous or unnecessary information, improves verification efficiency, and achieves efficient verification of the converted content. It saves the information after PURE-LET wavelet-domain denoising, which effectively avoids interference from external factors, preserves information quality, and supports accurate loudspeaker output.
Description of the drawings
Fig. 1 is a flow chart of the intelligent language cognition information processing method based on big data provided by an embodiment of the present invention.
Fig. 2 is a structural diagram of the intelligent language cognition information processing system based on big data provided by an embodiment of the present invention.
In the figures: 1, language receiving module; 2, text input module; 3, word extraction module; 4, conversion module; 5, verification module; 6, microprocessor; 7, storage module; 8, loudspeaker module; 9, big data.
Detailed description of the embodiments
To make the content, features, and effects of the present invention easier to understand, the following embodiments are described in detail with reference to the accompanying drawings.
The structure of the present invention is explained in detail below with reference to the drawings.
As shown in Fig. 1, the intelligent language cognition information processing method based on big data provided by an embodiment of the present invention comprises the following steps:
S101: language is input as speech or as text;
S102: words are extracted from the language with the best uniform approximation method, using words, idioms, proverbs, sentence patterns, and the like; the words are converted after extraction;
S103: the converted content is checked against the sentence meanings stored in the system using the Chauvenet algorithm, which verifies the converted content;
S104: content that passes verification is input to the microprocessor; if verification fails, extraction and conversion are repeated until the result qualifies and is passed to the microprocessor. Finally, the information is saved after PURE-LET wavelet-domain denoising and output through a loudspeaker.
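The control flow of steps S101 to S104 can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: `process_utterance`, the toy extractors, and `known_sentences` are hypothetical names, and verification is reduced to a set lookup instead of the Chauvenet check.

```python
from typing import Callable, List, Optional, Sequence


def process_utterance(
    text: str,
    extractors: Sequence[Callable[[str], List[str]]],
    convert: Callable[[List[str]], str],
    verify: Callable[[str], bool],
) -> Optional[str]:
    """Try each extraction strategy in turn until the converted result verifies."""
    for extract in extractors:
        words = extract(text)        # S102: word extraction
        converted = convert(words)   # S102: conversion
        if verify(converted):        # S103: check against stored sentence meanings
            return converted         # S104: hand over to the microprocessor stage
    return None                      # every strategy failed verification


# Toy stand-ins (hypothetical, for illustration only)
known_sentences = {"HELLO WORLD"}


def by_space(s: str) -> List[str]:
    return s.split(" ")


def by_comma(s: str) -> List[str]:
    return [w.strip() for w in s.split(",")]


def to_upper(words: List[str]) -> str:
    return " ".join(w.upper() for w in words)


result = process_utterance(
    "hello,world", [by_space, by_comma], to_upper, lambda s: s in known_sentences
)
```

In this toy run the first extractor yields a result that fails verification, so the loop falls back to the second one, mirroring the "re-extract and re-convert after verification failure" branch of S104.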
In step S102, the embodiment of the present invention extracts words from the language with the best uniform approximation method, using words, idioms, proverbs, sentence patterns, and the like, which improves the accuracy of word extraction. The specific algorithm is as follows.
Let $f(x) \in C[a, b]$ and let $P_n$ be the set of all polynomials of degree at most $n$. If $p^*(x) \in P_n$ satisfies
$$\max_{a \le x \le b} |f(x) - p^*(x)| = \min_{p \in P_n} \max_{a \le x \le b} |f(x) - p(x)|,$$
then $p^*(x)$ is called the best uniform approximation polynomial of $f(x)$ on $[a, b]$, also known as the minimax polynomial.
The best polynomial is found with the Remez algorithm. By Chebyshev's theorem, it is obtained by solving
$$\sum_{k=0}^{n} a_k x_i^k + (-1)^i \rho = f(x_i), \qquad i = 0, 1, \dots, n + 1,$$
where $a_k$ ($k = 0, 1, \dots, n$) are the polynomial coefficients to be determined, $\rho$ is the best approximation value, and the points $x_i$ are obtained by repeated correction.
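The numerical core of this step, a Remez exchange iteration, can be sketched as follows. The text does not say how language data is mapped onto the function $f(x)$, so the sketch covers only the approximation itself. The function name `remez`, the grid-based search for the error maximum, and the single-point exchange rule are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def remez(f, a, b, n, max_iter=100, grid=2001, tol=1e-10):
    """Degree-n minimax polynomial of a vectorised f on [a, b].

    Returns (coef, rho): coefficients a_0..a_n and the levelled error rho.
    """
    k = np.arange(n + 2)
    # Initial reference: n + 2 Chebyshev extrema mapped onto [a, b]
    x = 0.5 * (a + b) - 0.5 * (b - a) * np.cos(np.pi * k / (n + 1))
    t = np.linspace(a, b, grid)
    signs = (-1.0) ** k
    for _ in range(max_iter):
        # Solve sum_k a_k x_i^k + (-1)^i rho = f(x_i) for a_0..a_n and rho
        A = np.column_stack([np.vander(x, n + 1, increasing=True), signs])
        sol = np.linalg.solve(A, f(x))
        coef, rho = sol[:-1], sol[-1]
        err = f(t) - P.polyval(t, coef)
        j = int(np.argmax(np.abs(err)))
        if abs(err[j]) <= abs(rho) + tol:
            break  # error equioscillates on the reference: done
        ref_err = signs * rho  # error values at the reference points

        def same_sign(i):
            return np.sign(ref_err[i]) == np.sign(err[j])

        # "Repeated correction": swap the new extremum into the reference
        # while keeping the alternating sign pattern of the error.
        if t[j] < x[0]:
            x = np.r_[t[j], x[1:]] if same_sign(0) else np.r_[t[j], x[:-1]]
        elif t[j] > x[-1]:
            x = np.r_[x[:-1], t[j]] if same_sign(-1) else np.r_[x[1:], t[j]]
        else:
            i = int(np.searchsorted(x, t[j]))  # x[i-1] < t[j] <= x[i]
            x[i if same_sign(i) else i - 1] = t[j]
    return coef, rho


coef, rho = remez(np.exp, -1.0, 1.0, 1)  # best straight line for e^x on [-1, 1]
```

For a convex function such as $e^x$, the best linear fit has the slope of the chord through the endpoints, here $\sinh(1)$, which gives a quick sanity check on the result.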
In step S103, the embodiment of the present invention checks the converted content with the Chauvenet algorithm, which effectively removes erroneous or unnecessary information, improves verification efficiency, and achieves efficient verification of the converted content. The algorithm is as follows.
Given the data sample set $S_0 = \{x_0, x_1, \dots, x_n\}$, in which the $n$ sample data contain $m$ erroneous sample points, $f_0(x)$ is the function that reflects the essential characteristics of this group of data samples:
[formula missing from the source text]
where $n$ is the number of individuals in the group of data.
$$D_i = |x_i - f_0(x_i)|$$
measures how far the sample point $x_i$ deviates from the functional relationship. The larger $D_i$ is, the more likely the sample point is erroneous. The maximum of $D_i$ over the $n$ data is found.
The Chauvenet algorithm rejects the sample point $j$ with the largest $D_i$ and forms the new sample set $S_1 = \{S_0 - x_j\}$. The operation is repeated on the remaining data; when the data satisfy the termination condition, the $m$ rejected sample points are the erroneous data.
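The iterative rejection described above can be sketched in Python. Two points are assumptions, because the text does not reproduce them: the reference function $f_0$ is taken to be the sample mean, and the termination condition is the classical Chauvenet criterion (stop when the expected number of samples as extreme as the worst one is at least 0.5). The name `chauvenet_reject` is hypothetical.

```python
import math
from statistics import mean, stdev


def chauvenet_reject(samples):
    """Repeatedly drop the worst sample while Chauvenet's criterion flags it.

    Returns (kept, rejected).
    """
    kept = list(samples)
    rejected = []
    while len(kept) > 2:
        mu, sigma = mean(kept), stdev(kept)
        if sigma == 0:
            break  # all remaining samples are identical
        # D_i = |x_i - f_0(x_i)|, with f_0 assumed to be the sample mean
        j = max(range(len(kept)), key=lambda i: abs(kept[i] - mu))
        z = abs(kept[j] - mu) / sigma
        # Expected number of samples at least this far from the mean
        # under a normal model: n * P(|Z| >= z)
        expected = len(kept) * math.erfc(z / math.sqrt(2))
        if expected >= 0.5:
            break  # termination condition: the worst point is plausible
        rejected.append(kept.pop(j))
    return kept, rejected


kept, rejected = chauvenet_reject([9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 15.0])
```

With these inputs the value 15.0 is rejected in the first pass, and the second pass stops because the largest remaining deviation is consistent with the spread of the other samples.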
In step S104, the embodiment of the present invention saves the information after PURE-LET wavelet-domain denoising, which effectively avoids interference from external factors, preserves information quality, and supports accurate loudspeaker output. The specific algorithm is as follows.
At each scale, the estimate of the wavelet coefficients is written as a linear combination of a group of elementary threshold functions:
[formula (2) missing from the source text]
The coefficient vector $a = [a_1, \dots, a_M]^T$ is determined by minimizing PURE.
Let $\theta(d, s) = \theta_j(d_i, s_j)$ be an estimate of the noise-free wavelet coefficient $\delta = \delta_j$. The functions $\theta^+(d, s)$ and $\theta^-(d, s)$ are defined as follows:
[formula missing from the source text]
where $e_k$ is the standard basis vector, all of whose elements are 0 except $e_k(k) = 1$. The random variable $\mathrm{PURE}_j$ is then an unbiased estimate of the MSE in subband $j$, that is, $E\{\mathrm{PURE}_j\} = E\{\mathrm{MSE}_j\}$.
The linear combination parameters of the wavelet estimate in formula (2) are computed by minimizing PURE. Substituting formula (2) into formula (3) and omitting the arguments $(d, s)$ gives
[formula missing from the source text]
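The "linear combination of threshold functions" idea can be illustrated with a simplified sketch. It is not PURE-LET itself: it uses a one-level orthonormal Haar transform, two illustrative basis functions chosen for this example, and it fits the weights $a$ by least squares against a known clean signal (an oracle MSE). PURE-LET would instead minimize PURE, an estimate of the MSE computed from the noisy data alone. All function names here are hypothetical.

```python
import numpy as np


def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length signal."""
    x = np.asarray(x, dtype=float)
    s = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # scaling coefficients
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # detail (wavelet) coefficients
    return s, d


def haar_inverse(s, d):
    """Inverse of haar_forward."""
    x = np.empty(2 * len(s))
    x[0::2] = (s + d) / np.sqrt(2.0)
    x[1::2] = (s - d) / np.sqrt(2.0)
    return x


def let_basis(d, scale):
    """Two elementary threshold functions theta_1, theta_2 as columns."""
    return np.column_stack([d, d * np.exp(-((d / scale) ** 2))])


def fit_weights(d_noisy, d_clean, scale):
    """Weights a minimising ||Theta a - d_clean||^2 (oracle stand-in for PURE)."""
    theta = let_basis(d_noisy, scale)
    a, *_ = np.linalg.lstsq(theta, d_clean, rcond=None)
    return a


def denoise(x, a, scale):
    """Replace the detail coefficients by sum_k a_k * theta_k(d)."""
    s, d = haar_forward(x)
    return haar_inverse(s, let_basis(d, scale) @ a)
```

Because the estimate is linear in $a$, minimizing a quadratic risk reduces to solving a small linear system, which is the property that makes this family of estimators cheap to fit. Since the weights $a = (1, 0)$ reproduce the noisy input exactly, the fitted weights can never do worse than no denoising under the criterion being minimized.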
As shown in Fig. 2, the intelligent language cognition information processing system based on big data provided by an embodiment of the present invention comprises:
a language receiving module 1, a text input module 2, a word extraction module 3, a conversion module 4, a verification module 5, a microprocessor 6, a storage module 7, a loudspeaker module 8, and big data 9.
The big data 9 provides knowledge support to the word extraction module 3 and the verification module 5. Input from the language receiving module 1 and the text input module 2 is extracted by the word extraction module 3 and then converted by the conversion module 4, which passes the converted content to the verification module 5.
The verification module 5 passes content that is verified to the microprocessor 6; if verification fails, it returns the content to the word extraction module 3 to be converted again.
The microprocessor 6 saves the converted information to the storage module 7.
The microprocessor 6 outputs the information through the loudspeaker module 8.
Working principle of the invention: input arrives through the language receiving module 1 and the text input module 2. The word extraction module 3 extracts words using the words, idioms, proverbs, sentence patterns, and the like in the big data 9. The extracted words are converted by the conversion module 4, and the converted content is passed to the verification module 5. The verification module 5 checks the content against the sentence meanings already present in the big data 9. If verification passes, the content is input to the microprocessor 6; if it fails, the content returns to the word extraction module 3, which changes the extraction mode and converts again. The microprocessor 6 saves the information to the storage module 7 and outputs it through the loudspeaker module 8.
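The memory function attributed to the storage module can be sketched as a simple cache. This is an illustrative reading, not the patented design: the class `LanguageSystem` and its fields are hypothetical, extraction, conversion, and verification are collapsed into one `convert` callable, and a list stands in for the loudspeaker.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class LanguageSystem:
    convert: Callable[[str], str]                           # modules 3 to 5, collapsed
    storage: Dict[str, str] = field(default_factory=dict)   # storage module 7
    spoken: List[str] = field(default_factory=list)         # stands in for loudspeaker module 8

    def handle(self, text: str) -> str:
        if text not in self.storage:       # no memory of this input yet
            self.storage[text] = self.convert(text)
        result = self.storage[text]        # reuse the remembered conversion
        self.spoken.append(result)         # "output through the loudspeaker"
        return result


calls: List[str] = []


def slow_upper(text: str) -> str:
    calls.append(text)                     # record each real conversion
    return text.upper()


system = LanguageSystem(convert=slow_upper)
system.handle("ni hao")
system.handle("ni hao")                    # served from storage, no second conversion
```

The second call returns the stored result without invoking the converter again, which is one plausible way the recorded conversions could make later conversions faster.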
The above are only preferred embodiments of the present invention and do not limit the invention in any form. Any simple modification, equivalent change, or adaptation made to the above embodiments according to the technical essence of the invention falls within the scope of the technical solution of the present invention.

Claims (5)

1. An intelligent language cognition information processing method based on big data, characterized in that the method comprises:
a first step of inputting language as speech or as text;
a second step of extracting words from the language with a best uniform approximation method, using words, idioms, proverbs, and sentence patterns, and converting the words after extraction;
a third step of checking the converted content against the sentence meanings stored in the system using the Chauvenet algorithm, thereby verifying the converted content;
a fourth step of inputting the content to the microprocessor once verification passes, and repeating extraction and conversion if verification fails until the result qualifies and is passed to the microprocessor; and finally saving the information after PURE-LET wavelet-domain denoising and outputting it through a loudspeaker.
2. The intelligent language cognition information processing method based on big data according to claim 1, characterized in that, in the second step, word extraction with the best uniform approximation method, using words, idioms, proverbs, and sentence patterns, follows this algorithm: let $f(x) \in C[a, b]$ and let $P_n$ be the set of all polynomials of degree at most $n$; if $p^*(x) \in P_n$ satisfies
$$\max_{a \le x \le b} |f(x) - p^*(x)| = \min_{p \in P_n} \max_{a \le x \le b} |f(x) - p(x)|,$$
then $p^*(x)$ is called the best uniform approximation polynomial of $f(x)$ on $[a, b]$, also known as the minimax polynomial;
the best polynomial is found with the Remez algorithm; by Chebyshev's theorem, it is obtained by solving
$$\sum_{k=0}^{n} a_k x_i^k + (-1)^i \rho = f(x_i), \qquad i = 0, 1, \dots, n + 1,$$
where $a_k$ ($k = 0, 1, \dots, n$) are the polynomial coefficients to be determined, $\rho$ is the best approximation value, and the points $x_i$ are obtained by repeated correction.
3. The intelligent language cognition information processing method based on big data according to claim 1, characterized in that, in the third step, the converted content is checked with the Chauvenet algorithm to achieve efficient verification of the converted content; the algorithm is as follows:
given the data sample set $S_0 = \{x_0, x_1, \dots, x_n\}$, in which the $n$ sample data contain $m$ erroneous sample points, $f_0(x)$ is the function that reflects the essential characteristics of this group of data samples:
[formula missing from the source text]
where $n$ is the number of individuals in the group of data;
$$D_i = |x_i - f_0(x_i)|$$
measures how far the sample point $x_i$ deviates from the functional relationship; the larger $D_i$ is, the more likely the sample point is erroneous; the maximum of $D_i$ over the $n$ data is found;
the Chauvenet algorithm rejects the sample point $j$ with the largest $D_i$ and forms the new sample set $S_1 = \{S_0 - x_j\}$; the operation is repeated on the remaining data, and when the data satisfy the termination condition, the $m$ rejected sample points are the erroneous data.
4. An intelligent language cognition information processing system based on big data that implements the method according to claim 1, characterized in that the system comprises: a language receiving module, a text input module, a word extraction module, a conversion module, a verification module, a microprocessor, a storage module, a loudspeaker module, and big data;
the big data provides knowledge support to the word extraction module and the verification module; input from the language receiving module and the text input module is extracted by the word extraction module and then converted by the conversion module, which passes the converted content to the verification module;
the verification module passes content that is verified to the microprocessor, and returns it to the word extraction module to be converted again if verification fails;
the microprocessor saves the converted information to the storage module and outputs the information through the loudspeaker module.
5. A remaining metacognition platform applying the intelligent language cognition information processing method based on big data according to any one of claims 1 to 3.
CN201811521939.7A 2018-12-13 2018-12-13 Intelligent language cognition information processing system and method based on big data Active CN109726392B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811521939.7A CN109726392B (en) 2018-12-13 2018-12-13 Intelligent language cognition information processing system and method based on big data

Publications (2)

Publication Number Publication Date
CN109726392A (en) 2019-05-07
CN109726392B (en) 2023-10-10

Family

ID=66294925

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811521939.7A Active CN109726392B (en) 2018-12-13 2018-12-13 Intelligent language cognition information processing system and method based on big data

Country Status (1)

Country Link
CN (1) CN109726392B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101221704A (en) * 2007-01-12 2008-07-16 戴献东 Electric language learning policy
CN101604204A (en) * 2009-07-09 2009-12-16 北京科技大学 Distributed cognitive technology for intelligent emotional robot
US20130297290A1 (en) * 2012-05-03 2013-11-07 International Business Machines Corporation Automatic accuracy estimation for audio transcriptions
CN104778254A (en) * 2015-04-20 2015-07-15 北京蓝色光标品牌管理顾问股份有限公司 Distributing type system for non-parameter topic automatic identifying and identifying method
CN105494230A (en) * 2015-09-30 2016-04-20 常州大学怀德学院 Intelligent orientating oxygenation method and apparatus for aquatic culture
CN107123068A (en) * 2017-04-26 2017-09-01 北京航空航天大学 A kind of programming-oriented language course individualized learning effect analysis system and method
CN107273361A (en) * 2017-06-21 2017-10-20 河南工业大学 The word computational methods and its device closed based on the general type-2 fuzzy sets of broad sense
CN107741295A (en) * 2017-09-15 2018-02-27 江苏大学 A kind of MENS capacitive baroceptors test calibration device and method
CN207541938U (en) * 2017-11-08 2018-06-26 延边大学 A kind of natural language intelligent interaction machine
CN108537332A (en) * 2018-04-12 2018-09-14 合肥工业大学 A kind of Sigmoid function hardware-efficient rate implementation methods based on Remez algorithms
CN111597790A (en) * 2020-05-25 2020-08-28 郑州轻工业大学 Natural language processing system based on artificial intelligence

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
AUSTIN F. FRANK et al.: "Speaking Rationally: Uniform Information Density as an Optimal Strategy for Language Production" *
吴晶 et al.: "计算机辅助模式下外语自主学习者的认知" *

Also Published As

Publication number Publication date
CN109726392B (en) 2023-10-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant