CN110377889A - Text editing method and system based on a feedforward sequential memory neural network - Google Patents

Text editing method and system based on a feedforward sequential memory neural network Download PDF

Info

Publication number
CN110377889A
CN110377889A (application CN201910487145.1A)
Authority
CN
China
Prior art keywords
feedforward sequential memory neural network
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910487145.1A
Other languages
Chinese (zh)
Other versions
CN110377889B (en)
Inventor
吴立刚
刘迪
邱镇
黄晓光
浦正国
梁翀
韩涛
张天奇
余江斌
宋杰
何东
郭庆
吴小华
胡心颖
周伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Zhejiang Electric Power Co Ltd
Anhui Jiyuan Software Co Ltd
National Network Information and Communication Industry Group Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Zhejiang Electric Power Co Ltd
Anhui Jiyuan Software Co Ltd
National Network Information and Communication Industry Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Zhejiang Electric Power Co Ltd, Anhui Jiyuan Software Co Ltd, National Network Information and Communication Industry Group Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN201910487145.1A priority Critical patent/CN110377889B/en
Publication of CN110377889A publication Critical patent/CN110377889A/en
Application granted granted Critical
Publication of CN110377889B publication Critical patent/CN110377889B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/166Editing, e.g. inserting or deleting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/16Speech classification or search using artificial neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a text editing method based on a feedforward sequential memory neural network, belonging to the technical field of speech signal processing, comprising: acquiring the original text to be edited; receiving editing voice data; performing speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an edit command; and performing semantic understanding on the edit command and executing the edit command. By performing speech recognition with the improved feedforward sequential memory neural network, the exemplary technical solution of the present invention makes text editing more accurate and efficient.

Description

Text editing method and system based on a feedforward sequential memory neural network
Technical field
The invention belongs to the technical field of speech signal processing, and specifically relates to a text editing method and system based on a feedforward sequential memory neural network.
Background technique
With the popularization of mobile phones, people receive a large amount of text information every day on portable devices such as mobile phones and tablet computers: for example, short messages, messages pushed by instant-messaging or other software, web page content, and text news. When people want to edit content of interest in such text, they must first position the cursor at that content and then perform subsequent operations on the selected text, such as inserting new text at the cursor position or replacing the selected text; this editing process is complicated and inconvenient. An existing technique receives voice data entered by the user and then performs the corresponding edit operation on the edit object according to the voice data. In this way, when editing text, the user can not only select the edit object in the text directly and quickly, without complicated text-selection operations, but can also edit the edit object directly through voice input, which simplifies the text editing process. However, current approaches execute the operation directly after receiving the voice data, without any processing of the voice; in some far-field and strongly noise-interfered situations, the performance of the speech recognition system is not ideal, making the text editing inaccurate.
Summary of the invention
In order to overcome the above deficiencies in the prior art, the purpose of the present invention is to provide a text editing method based on a feedforward sequential memory neural network that performs speech recognition with an improved feedforward sequential memory neural network, making text editing more accurate and efficient.
In order to solve the above-mentioned technical problem, the present invention adopts the following technical scheme:
In one aspect, the present invention provides a text editing method based on a feedforward sequential memory neural network, with the following specific steps:
S1: acquiring the original text to be edited;
S2: receiving editing voice data;
S3: performing speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an edit command;
S4: performing semantic understanding on the edit command and executing the edit command.
Further preferably, the improved feedforward sequential memory neural network inserts low-dimensional linear projection layers between the hidden layers of a feedforward fully connected neural network, places the memory modules on the linear projection layers, and adds skip connections between adjacent memory modules, so that the output of a lower memory module is added directly to that of a higher memory module.
Further preferably, the memory module is a tapped-delay structure that encodes the hidden-layer outputs of the current moment and of preceding moments into a fixed representation through a set of coefficients.
Further preferably, the operation of the memory module uses scalar-based or vector-based encoding.
Further preferably, the encoding of the memory module introduces stride factors.
In another aspect, the present invention also provides a text editing system based on a feedforward sequential memory neural network, comprising:
an acquisition unit, configured to acquire the original text to be edited;
a receiving unit, configured to receive editing voice data;
a recognition unit, configured to perform speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an edit command; and
an output unit, configured to perform semantic understanding on the edit command, execute the edit command, and output the edited text.
In another aspect, the present invention also provides a device, comprising:
one or more processors; and
a memory for storing one or more programs,
such that, when the one or more programs are executed by the one or more processors, the one or more processors carry out any of the exemplary text editing methods of the present invention based on a feedforward sequential memory neural network.
In another aspect, the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements any of the exemplary text editing methods of the present invention based on a feedforward sequential memory neural network.
Compared with the prior art, the invention has the following beneficial effects:
In the exemplary text editing method based on a feedforward sequential memory neural network, the original text to be edited is acquired, the voice data entered by the user is received, and the corresponding edit operation is then performed on the edit object according to the voice data. In this way, when editing text, the user can not only select the edit object in the text directly and quickly, without complicated text-selection operations, but can also edit the edit object directly through voice input, simplifying the text editing process. In addition, because speech recognition is performed on the editing voice data with an improved feedforward sequential memory neural network, text editing is more accurate and efficient.
Detailed description of the invention
Other features, objects, and advantages of the present application will become more apparent upon reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is a flow diagram of one embodiment of the invention;
Fig. 2 is a structural block diagram of the improved feedforward sequential memory neural network.
Specific embodiment
The application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are used only to explain the related invention, not to restrict it. It should also be noted that, for convenience of description, only the parts relevant to the invention are illustrated in the drawings.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments may be combined with each other. The application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
As shown in Fig. 1, an embodiment of the present invention provides a text editing method based on a feedforward sequential memory neural network, with the following specific steps:
S1: acquiring the original text to be edited;
S2: receiving editing voice data;
S3: performing speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an edit command;
S4: performing semantic understanding on the edit command and executing the edit command.
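As a concrete illustration of steps S3-S4, the sketch below shows how a recognized and semantically parsed edit command might be applied to the original text. The command schema (`op`/`target`/`value`) and the function name are illustrative assumptions; the patent does not specify a command format:

```python
def execute_edit(text: str, command: dict) -> str:
    """Apply a parsed edit command to the text (step S4).

    The command schema (op/target/value) is an illustrative
    assumption, not part of the patent.
    """
    op = command["op"]
    if op == "replace":
        # Replace the first occurrence of the target phrase.
        return text.replace(command["target"], command["value"], 1)
    if op == "insert_after":
        # Insert new text immediately after the target phrase.
        i = text.find(command["target"])
        if i < 0:
            raise ValueError(f"target not found: {command['target']!r}")
        i += len(command["target"])
        return text[:i] + command["value"] + text[i:]
    if op == "delete":
        # Delete the first occurrence of the target phrase.
        return text.replace(command["target"], "", 1)
    raise ValueError(f"unknown edit op: {op}")

# A spoken command such as "replace cat with dog", once recognized (S3)
# and semantically understood (S4), could be executed as:
print(execute_edit("the cat sat", {"op": "replace", "target": "cat", "value": "dog"}))  # prints "the dog sat"
```

In this sketch the semantic-understanding step is assumed to have already mapped the recognized command text to the structured command dictionary.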
The improved feedforward sequential memory neural network inserts low-dimensional linear projection layers between the hidden layers of a feedforward fully connected neural network, places the memory modules on the linear projection layers, and adds skip connections between adjacent memory modules, so that the output of a lower memory module is added directly to that of a higher memory module.
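The layer structure just described — a low-rank linear projection after the hidden layer, a memory module on the projection, and a skip connection that adds the output of the adjacent lower memory module — can be sketched in NumPy as follows. All function and parameter names, and the identity skip mapping, are assumptions for illustration, not taken from the patent:

```python
import numpy as np

def relu(x):
    # ReLU nonlinearity for the hidden layer (activation choice is an assumption).
    return np.maximum(x, 0.0)

def cfsmn_layer(x, W_h, b_h, W_p, memory_fn, prev_memory=None):
    """One improved cFSMN layer (illustrative sketch).

    x: (T, D_in) input sequence; W_h, b_h: hidden-layer weights;
    W_p: (D_hidden, D_proj) low-rank linear projection;
    memory_fn: memory-module encoding applied to the projection;
    prev_memory: memory output of the adjacent lower layer, added
    via a skip connection (identity mapping assumed).
    """
    h = relu(x @ W_h + b_h)      # standard hidden layer
    p = h @ W_p                  # low-dimensional linear projection
    m = memory_fn(p)             # tapped-delay memory module on the projection
    if prev_memory is not None:
        m = m + prev_memory      # skip connection between adjacent memory modules
    return m
```

When stacking layers, each layer's memory output would serve both as the next layer's input and as its `prev_memory` skip input (assuming matching projection dimensions), so that lower memory outputs accumulate directly into higher ones.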
The memory module is a tapped-delay structure that encodes the hidden-layer outputs of the current moment and of preceding moments into a fixed representation through a set of coefficients.
The operation of the memory module uses scalar-based or vector-based encoding.
The encoding of the memory module introduces stride factors; the specific calculation formula is as follows:

$$\tilde{p}_t^{\,\ell} = H\!\left(\tilde{p}_t^{\,\ell-1}\right) + p_t^{\,\ell} + \sum_{i=0}^{N_1} a_i^{\ell} \odot p_{t-s_1\cdot i}^{\,\ell} + \sum_{j=1}^{N_2} c_j^{\ell} \odot p_{t+s_2\cdot j}^{\,\ell} \qquad (1)$$

where $\tilde{p}_t^{\,\ell-1}$ represents the output of the memory module of the previous cFSMN layer, $H(\cdot)$ is the skip-connection mapping, $p_t^{\,\ell}$ is the projection-layer output at moment $t$, $a_i^{\ell}$ and $c_j^{\ell}$ are the encoding coefficients, and $s_1$ and $s_2$ respectively represent the strides for looking back into the history and ahead into the future. If $s_1 = 2$, every other moment's output is taken as input when encoding the history; in this way, with the same filter order, the module can see a longer history and thus model long-term dependency much more effectively.
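A minimal NumPy sketch of the strided tapped-delay encoding of formula (1) follows, with the skip-connection term omitted and zero context assumed beyond the utterance boundary; the function and parameter names are illustrative assumptions:

```python
import numpy as np

def cfsmn_memory(p, a, c, s1=1, s2=1):
    """Strided memory-module encoding (sketch of formula (1),
    without the skip-connection term).

    p: (T, D) projection-layer outputs; a: (N1+1, D) lookback
    coefficients (a[0] weights the current frame); c: (N2, D)
    lookahead coefficients; s1, s2: lookback/lookahead strides.
    Frames outside [0, T) are treated as zeros.
    """
    T, D = p.shape
    n1 = a.shape[0] - 1            # lookback filter order N1
    n2 = c.shape[0]                # lookahead filter order N2
    out = np.zeros_like(p, dtype=float)
    for t in range(T):
        m = p[t].astype(float).copy()      # identity term p_t
        for i in range(n1 + 1):            # lookback sum, stride s1
            idx = t - s1 * i
            if idx >= 0:
                m += a[i] * p[idx]
        for j in range(1, n2 + 1):         # lookahead sum, stride s2
            idx = t + s2 * j
            if idx < T:
                m += c[j - 1] * p[idx]
        out[t] = m
    return out
```

With `s1=2` the lookback sum samples every other frame, so the same filter order N1 spans a history twice as long, which is exactly the effect the stride factor is introduced for.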
The performance, model parameter count, and per-iteration training time on the SWB database of speech recognition systems using the improved feedforward sequential memory neural network (cFSMN) of this embodiment and the existing Sigmoid-DNN, LSTM, BLSTM, sFSMN, and vFSMN models are compared in Table 1:
Table 1: performance, model parameter count, and per-iteration training time of the speech recognition systems on the SWB database
The experimental results show that models which can effectively model long-term dependency, such as LSTM and FSMN, obtain a significant performance boost over DNN. One training iteration of the LSTM takes 9.5 hours, while the BLSTM needs 23.2 hours. This is because the NVIDIA Tesla K20 GPU has only 3 GB of memory, so the BPTT-trained BLSTM can process only 16 utterances in parallel, whereas the LSTM can process 64 utterances in parallel. The proposed vFSMN obtains a modest performance boost over the BLSTM. The model structure of the vFSMN is simpler and trains faster: one iteration takes roughly 6.9 hours, about a 3x training speedup over the BLSTM. However, the vFSMN has more model parameters than the BLSTM. Furthermore, the proposed cFSMN reduces the total model parameters to 74 MB, a reduction of about 60% compared with the BLSTM. More importantly, each iteration takes only 3.0 hours, roughly a 7x training speedup over the BLSTM. Moreover, the cFSMN-based model obtains a word error rate of 12.5%, an absolute performance improvement of 0.9 points over the BLSTM.
The improved feedforward sequential memory neural network is denoted 216-N×[2048-P(N1,N2)]-M×2048-P-8911, where N and M respectively represent the numbers of cFSMN layers and standard fully connected layers, P is the number of nodes in the low-rank linear projection layer, and N1 and N2 respectively represent the lookback and lookahead filter orders. The performance on the FSH task of acoustic models using differently configured improved feedforward sequential memory neural networks (cFSMN) is shown in Table 2:
Table 2: performance on the FSH task of deep cFSMN acoustic models in different configurations trained with skip connections
Experimental results: the results of exp1 and exp2 show that, with the memory-module encoding formula of equation (1), setting a large stride lets the model see more distant contextual information and thereby achieve better performance. From exp2 to exp6, as the number of cFSMN layers is gradually increased, model performance gradually improves. Finally, by adding skip connections, a deep cFSMN containing 12 cFSMN layers and 2 fully connected layers, denoted Deep-cFSMN, can be trained successfully, obtaining a word error rate of 9.3% on the Hub5e00 test set.
In another aspect, the present invention also provides a text editing system based on a feedforward sequential memory neural network, comprising:
an acquisition unit, configured to acquire the original text to be edited;
a receiving unit, configured to receive editing voice data;
a recognition unit, configured to perform speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an edit command; and
an output unit, configured to perform semantic understanding on the edit command, execute the edit command, and output the edited text.
In another aspect, the present invention also provides a device, comprising:
one or more processors; and
a memory for storing one or more programs,
such that, when the one or more programs are executed by the one or more processors, the one or more processors carry out any of the exemplary text editing methods of the present invention based on a feedforward sequential memory neural network.
In another aspect, the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements any of the exemplary text editing methods of the present invention based on a feedforward sequential memory neural network.
The above description is only a preferred embodiment of the application and an explanation of the applied technical principles. Those skilled in the art should appreciate that the scope of the invention involved in the application is not limited to technical solutions formed by the specific combination of the above technical features, but should also cover, without departing from the inventive concept, other technical solutions formed by any combination of the above technical features or their equivalents — for example, technical solutions formed by mutually replacing the above features with (but not limited to) technical features with similar functions disclosed herein.
Apart from the technical features described in the specification, the remaining technical features are known to those skilled in the art; in order to highlight the innovative characteristics of the invention, those remaining technical features are not described in detail here.

Claims (8)

1. A text editing method based on a feedforward sequential memory neural network, characterized by the following specific steps:
S1: acquiring the original text to be edited;
S2: receiving editing voice data;
S3: performing speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an edit command;
S4: performing semantic understanding on the edit command and executing the edit command.
2. The text editing method based on a feedforward sequential memory neural network according to claim 1, characterized in that: the improved feedforward sequential memory neural network inserts low-dimensional linear projection layers between the hidden layers of a feedforward fully connected neural network, places the memory modules on the linear projection layers, and adds skip connections between adjacent memory modules, so that the output of a lower memory module is added directly to that of a higher memory module.
3. The text editing method based on a feedforward sequential memory neural network according to claim 2, characterized in that: the memory module is a tapped-delay structure that encodes the hidden-layer outputs of the current moment and of preceding moments into a fixed representation through a set of coefficients.
4. The text editing method based on a feedforward sequential memory neural network according to claim 2, characterized in that: the operation of the memory module uses scalar-based or vector-based encoding.
5. The text editing method based on a feedforward sequential memory neural network according to claim 2, characterized in that: the encoding of the memory module introduces stride factors.
6. A text editing system based on a feedforward sequential memory neural network, comprising:
an acquisition unit, configured to acquire the original text to be edited;
a receiving unit, configured to receive editing voice data;
a recognition unit, configured to perform speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an edit command; and
an output unit, configured to perform semantic understanding on the edit command, execute the edit command, and output the edited text.
7. A device, comprising:
one or more processors; and
a memory for storing one or more programs,
such that, when the one or more programs are executed by the one or more processors, the one or more processors carry out the text editing method based on a feedforward sequential memory neural network according to any one of claims 1-5.
8. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the text editing method based on a feedforward sequential memory neural network according to any one of claims 1-5.
CN201910487145.1A 2019-06-05 2019-06-05 Text editing method and system based on feedforward sequence memory neural network Active CN110377889B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910487145.1A CN110377889B (en) 2019-06-05 2019-06-05 Text editing method and system based on feedforward sequence memory neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910487145.1A CN110377889B (en) 2019-06-05 2019-06-05 Text editing method and system based on feedforward sequence memory neural network

Publications (2)

Publication Number Publication Date
CN110377889A true CN110377889A (en) 2019-10-25
CN110377889B CN110377889B (en) 2023-06-20

Family

ID=68249843

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910487145.1A Active CN110377889B (en) 2019-06-05 2019-06-05 Text editing method and system based on feedforward sequence memory neural network

Country Status (1)

Country Link
CN (1) CN110377889B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016101688A1 (en) * 2014-12-25 2016-06-30 清华大学 Continuous voice recognition method based on deep long-and-short-term memory recurrent neural network
CN106919977A (en) * 2015-12-25 2017-07-04 科大讯飞股份有限公司 A kind of feedforward sequence Memory Neural Networks and its construction method and system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016101688A1 (en) * 2014-12-25 2016-06-30 清华大学 Continuous voice recognition method based on deep long-and-short-term memory recurrent neural network
CN106919977A (en) * 2015-12-25 2017-07-04 科大讯飞股份有限公司 A kind of feedforward sequence Memory Neural Networks and its construction method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王海坤 (Wang Haikun) et al.: "基于时域建模的自动语音识别" (Automatic speech recognition based on time-domain modeling), 《计算机工程与应用》 (Computer Engineering and Applications) *

Also Published As

Publication number Publication date
CN110377889B (en) 2023-06-20

Similar Documents

Publication Publication Date Title
CN109918680B (en) Entity identification method and device and computer equipment
Lu et al. Less is more: Pretrain a strong Siamese encoder for dense text retrieval using a weak decoder
WO2018157700A1 (en) Method and device for generating dialogue, and storage medium
CN109086303A (en) The Intelligent dialogue method, apparatus understood, terminal are read based on machine
WO2019076286A1 (en) User intent recognition method and device for a statement
CN110377908B (en) Semantic understanding method, semantic understanding device, semantic understanding equipment and readable storage medium
CN113239169B (en) Answer generation method, device, equipment and storage medium based on artificial intelligence
Chi et al. Speaker role contextual modeling for language understanding and dialogue policy learning
CN104199825A (en) Information inquiry method and system
CN113935337A (en) Dialogue management method, system, terminal and storage medium
JP7436077B2 (en) Skill voice wake-up method and device
Tran et al. WaveTransformer: A novel architecture for audio captioning based on learning temporal and time-frequency information
CN108959421A (en) Candidate replys evaluating apparatus and inquiry reverting equipment and its method, storage medium
Lu et al. Less is more: Pre-train a strong text encoder for dense retrieval using a weak decoder
CN109933773A (en) A kind of multiple semantic sentence analysis system and method
CN110795547B (en) Text recognition method and related product
CN116127328B (en) Training method, training device, training medium and training equipment for dialogue state recognition model
CN116644168A (en) Interactive data construction method, device, equipment and storage medium
CN110377889A (en) A kind of method for editing text and system based on feedforward sequence Memory Neural Networks
CN112397053B (en) Voice recognition method and device, electronic equipment and readable storage medium
CN114297352A (en) Conversation state tracking method and device, man-machine conversation system and working machine
CN111508481A (en) Training method and device of voice awakening model, electronic equipment and storage medium
CN115169367B (en) Dialogue generating method and device, and storage medium
CN117474084B (en) Bidirectional iteration method, equipment and medium for pre-training model and downstream sequence task
CN115064173B (en) Voice recognition method and device, electronic equipment and computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant