CN110377889A - Text editing method and system based on a feedforward sequential memory neural network - Google Patents
Text editing method and system based on a feedforward sequential memory neural network
- Publication number
- CN110377889A CN201910487145.1A
- Authority
- CN
- China
- Prior art keywords
- neural networks
- sequence memory
- memory neural
- feedforward sequence
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a text editing method based on a feedforward sequential memory neural network, belonging to the field of speech signal processing. The method comprises: obtaining an original text to be edited; receiving editing voice data; performing speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an editing command; and performing semantic understanding on the editing command and executing it. Because the exemplary technical solution of the invention performs speech recognition with an improved feedforward sequential memory neural network, text editing is more accurate and efficient.
Description
Technical field
The invention belongs to the field of speech signal processing, and specifically relates to a text editing method and system based on a feedforward sequential memory neural network.
Background technique
With the popularity of mobile phones, people receive large amounts of text information every day on portable devices such as mobile phones and tablet computers: for example, short messages, messages pushed by instant-messaging or other software, web page contents, and text news. When a user wants to edit a piece of text of interest, the cursor must first be positioned at that text, and subsequent operations are then performed on the selected text, such as inserting new text at the cursor position or replacing the selected text; this editing process is complicated and inconvenient. An existing technique receives voice data entered by the user and then performs the corresponding editing operation on the edit object according to the voice data. In this way, when editing text, the user can not only directly and quickly select the edit object without complicated text-selection operations, but can also edit the edit object directly through voice input, which simplifies the editing process. However, current approaches execute the operation immediately after the voice data is received, without any processing of the voice; in far-field conditions or under strong noise interference, the performance of the speech recognition system is unsatisfactory, making text editing inaccurate.
Summary of the invention
To remedy the above deficiencies of the prior art, the object of the present invention is to provide a text editing method based on a feedforward sequential memory neural network, which performs speech recognition using an improved feedforward sequential memory neural network and thereby makes text editing more accurate and efficient.
In order to solve the above-mentioned technical problem, the present invention adopts the following technical scheme:
In one aspect, the present invention provides a text editing method based on a feedforward sequential memory neural network, comprising the following steps:
S1: obtaining an original text to be edited;
S2: receiving editing voice data;
S3: performing speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an editing command;
S4: performing semantic understanding on the editing command and executing the editing command.
Further preferably, the improved feedforward sequential memory neural network inserts a low-dimensional linear projection layer between the hidden layers of a feedforward fully connected neural network, places a memory module on the linear projection layer, and adds skip connections between adjacent memory modules, so that the output of a lower memory module can be added directly to a higher memory module.
Further preferably, the memory module is a tapped-delay structure that encodes the hidden-layer outputs of the current and preceding moments into a fixed-size representation through a set of coefficients.
Further preferably, the memory module uses scalar-based or vector-based encoding.
Further preferably, the encoding of the memory module introduces a stride factor.
In another aspect, the present invention provides a text editing system based on a feedforward sequential memory neural network, comprising:
an acquisition unit, configured to obtain an original text to be edited;
a receiving unit, configured to receive editing voice data;
a recognition unit, configured to perform speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an editing command; and
an output unit, configured to perform semantic understanding on the editing command, execute the editing command, and output the edited text.
In another aspect, the present invention provides a device, comprising:
one or more processors; and
a memory for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors carry out any text editing method based on a feedforward sequential memory neural network exemplified by the present invention.
In another aspect, the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements any text editing method based on a feedforward sequential memory neural network exemplified by the present invention.
Compared with the prior art, the invention has the following benefits. The exemplary text editing method obtains the original text to be edited and receives voice data entered by the user, and the corresponding editing operation is then performed on the edit object according to the voice data. In this way, when editing text, the user can not only directly and quickly select the edit object without complicated text-selection operations, but can also edit the edit object directly through voice input, simplifying the editing process. In addition, because speech recognition is performed on the editing voice data with an improved feedforward sequential memory neural network, text editing is more accurate and efficient.
Detailed description of the invention
By reading the detailed description of non-restrictive embodiments with reference to the drawings below, other features, objects and advantages of the application will become more apparent:
Fig. 1 is a flow diagram of one embodiment of the invention;
Fig. 2 is a structural block diagram of the improved feedforward sequential memory neural network.
Specific embodiment
The application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the related invention and do not limit it. It should also be noted that, for convenience of description, only the parts relevant to the invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments of the application and the features in the embodiments may be combined with each other. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
As shown in Fig. 1, an embodiment of the present invention provides a text editing method based on a feedforward sequential memory neural network, comprising the following steps:
S1: obtaining an original text to be edited;
S2: receiving editing voice data;
S3: performing speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an editing command;
S4: performing semantic understanding on the editing command and executing the editing command.
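Steps S1-S4 can be sketched as a minimal pipeline. Every function below is a hypothetical stub introduced for illustration (the patent does not define an API), and the recognizer is replaced by a canned editing command rather than a real cFSMN decoder:

```python
import numpy as np

def obtain_original_text():
    # S1: obtain the original text to be edited (stubbed)
    return "The quick brown fox jumps over the lazy dog"

def receive_editing_voice():
    # S2: receive the editing voice data (stubbed as 1 s of silent 16 kHz audio)
    return np.zeros(16000, dtype=np.float32)

def recognize(voice):
    # S3: speech recognition; a real system would decode the waveform with
    # the improved cFSMN acoustic model. Here we return a canned command.
    return ("replace", "quick", "slow")

def execute_command(text, command):
    # S4: semantic understanding and execution of the editing command
    op, target, new = command
    if op == "replace":
        return text.replace(target, new)
    if op == "delete":
        return text.replace(target, "")
    return text

text = obtain_original_text()
command = recognize(receive_editing_voice())
edited = execute_command(text, command)
print(edited)  # -> "The slow brown fox jumps over the lazy dog"
```

The point of the sketch is only the data flow: text in, voice in, command out of the recognizer, command applied to the text.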
The improved feedforward sequential memory neural network inserts a low-dimensional linear projection layer between the hidden layers of a feedforward fully connected neural network, places a memory module on the linear projection layer, and adds skip connections between adjacent memory modules, so that the output of a lower memory module can be added directly to a higher memory module.
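The layer structure just described — a fully connected hidden layer, a low-dimensional linear projection, and a memory module whose output receives a skip connection from the memory module of the layer below — can be sketched in NumPy. All dimensions, the ReLU non-linearity, and the uniform coefficient values are illustrative assumptions, not taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
T, H, P = 20, 64, 16          # frames, hidden width, projection width (illustrative)

W_h = rng.standard_normal((H, H)) * 0.1   # fully connected hidden layer
W_p = rng.standard_normal((H, P)) * 0.1   # low-dimensional linear projection

def cfsmn_layer(x, prev_memory=None, n1=4, n2=4):
    """One cFSMN layer: hidden layer -> linear projection -> memory module.

    prev_memory is the memory output of the layer below; adding it in
    realizes the skip connection between adjacent memory modules."""
    h = np.maximum(x @ W_h, 0.0)           # hidden layer with ReLU
    p = h @ W_p                            # low-rank linear projection
    a = np.full(n1 + 1, 1.0 / (n1 + 1))    # look-back coefficients (scalar coding)
    c = np.full(n2, 1.0 / n2)              # look-ahead coefficients
    m = np.zeros_like(p)
    for t in range(len(p)):
        for i in range(n1 + 1):            # current and past frames
            if t - i >= 0:
                m[t] += a[i] * p[t - i]
        for j in range(1, n2 + 1):         # future frames
            if t + j < len(p):
                m[t] += c[j - 1] * p[t + j]
    if prev_memory is not None:
        m = m + prev_memory                # skip connection from the lower module
    return m

x = rng.standard_normal((T, H))
m1 = cfsmn_layer(x)
print(m1.shape)  # (20, 16)
```

Because the skip connection is a plain addition, the higher module's output is exactly the lower module's output accumulated onto its own encoding, which is what lets gradients reach deep stacks of these layers.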
The memory module is a tapped-delay structure that encodes the hidden-layer outputs of the current and preceding moments into a fixed-size representation through a set of coefficients.
The memory module uses scalar-based or vector-based encoding.
The encoding of the memory module introduces a stride factor, calculated as follows:

$$\tilde{p}_t^{\ell} = \tilde{p}_t^{\ell-1} + p_t^{\ell} + \sum_{i=0}^{N_1} a_i^{\ell} \odot p_{t - s_1 \cdot i}^{\ell} + \sum_{j=1}^{N_2} c_j^{\ell} \odot p_{t + s_2 \cdot j}^{\ell} \qquad (1)$$

where $\tilde{p}_t^{\ell-1}$ represents the output of the memory module of the previous cFSMN layer, $p_t^{\ell}$ is the linear projection output of the current layer, $a_i^{\ell}$ and $c_j^{\ell}$ are the encoding coefficients, and $s_1$ and $s_2$ are the strides for looking back into the past and ahead into the future, respectively. For example, with $s_1 = 2$, every other moment is taken as input when encoding the history. In this way, for the same filter order, the model can see a farther history and thus model long-term dependencies much more effectively.
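A minimal NumPy sketch of this strided memory encoding follows. The dimensions, coefficient values, and the helper name `strided_memory` are illustrative assumptions, not from the patent:

```python
import numpy as np

def strided_memory(p, p_prev_mem, a, c, s1=2, s2=2):
    """Memory encoding with stride factors, in the style of formula (1).

    p          : (T, D) projection outputs of the current layer
    p_prev_mem : (T, D) memory output of the previous cFSMN layer
                 (the skip-connection term), or None
    a          : (N1+1, D) look-back coefficient vectors
    c          : (N2, D) look-ahead coefficient vectors
    With stride s1=2, every other past frame is used, so the same
    filter order covers twice as much history."""
    T, D = p.shape
    m = p.copy()                               # the p_t term
    if p_prev_mem is not None:
        m += p_prev_mem                        # skip connection
    for t in range(T):
        for i in range(a.shape[0]):            # history, strided by s1
            k = t - s1 * i
            if k >= 0:
                m[t] += a[i] * p[k]
        for j in range(1, c.shape[0] + 1):     # future, strided by s2
            k = t + s2 * j
            if k < T:
                m[t] += c[j - 1] * p[k]
    return m

T, D, N1, N2 = 10, 8, 3, 2
rng = np.random.default_rng(1)
p = rng.standard_normal((T, D))
m = strided_memory(p, None, np.ones((N1 + 1, D)) * 0.1, np.ones((N2, D)) * 0.1)
print(m.shape)  # (10, 8)
```

With `s1=2` and `N1=3` the look-back taps land on frames t, t-2, t-4 and t-6, i.e. six frames of history from a third-order filter, which is the effect the stride factor is meant to achieve.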
The improved feedforward sequential memory neural network (cFSMN) of this embodiment is compared with existing Sigmoid-DNN, LSTM, BLSTM, sFSMN and vFSMN speech recognition systems on the SWB database, in terms of recognition performance, model parameter count and per-iteration training time, as shown in Table 1:
Table 1: recognition performance, model parameter count and per-iteration training time of the speech recognition systems on the SWB database
The experimental results show that models that can effectively model long-term dependencies, such as LSTM and FSMN, obtain significant performance gains over DNN. One training iteration of LSTM takes 9.5 hours, while BLSTM needs 23.2 hours. This is because the NVIDIA Tesla K20 GPU has only 3 GB of memory, so BLSTM trained with BPTT can only process 16 sequences in parallel, whereas LSTM can process 64. The proposed vFSMN obtains a small performance gain over BLSTM. The model structure of vFSMN is simpler and trains faster: one iteration takes roughly 6.9 hours, about a 3x training speedup over BLSTM. However, vFSMN has more parameters than BLSTM. Furthermore, the proposed cFSMN reduces the total model parameters to 74 MB, about 60% fewer than BLSTM. More importantly, each iteration takes only 3.0 hours, roughly a 7x training speedup over BLSTM. The cFSMN-based model obtains a word error rate of 12.5%, an absolute improvement of 0.9 points over BLSTM.
The improved feedforward sequential memory neural network is denoted 216-N×[2048-P(N1,N2)]-M×2048-P-8911, where N and M respectively represent the numbers of cFSMN layers and standard fully connected layers, P is the number of nodes of the low-rank linear projection layer, and N1 and N2 respectively represent the orders of the look-back and look-ahead filters. The performance on the FSH task of acoustic models using improved feedforward sequential memory neural networks (cFSMN) of different configurations is shown in Table 2:
Table 2: performance on the FSH task of deep cFSMN acoustic models of different configurations trained with skip connections
Experimental results: the results of exp1 and exp2 show that, with the memory module encoded as in formula (1), setting a larger stride lets the model see farther contextual information and thereby achieve better performance. From exp2 to exp6, as the number of cFSMN layers is gradually increased, model performance improves steadily. Finally, by adding skip connections, a deep cFSMN containing 12 cFSMN layers and 2 fully connected layers, labeled Deep-cFSMN, can be trained successfully, achieving a word error rate of 9.3% on the Hub5e00 test set.
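As a rough illustration of how the notation 216-N×[2048-P(N1,N2)]-M×2048-P-8911 maps to a parameter count, the sketch below assumes vector memory coefficients and omits biases; the exact layer wiring is an assumption for illustration, not specified by the patent:

```python
def cfsmn_param_count(n_cfsmn, n_fc, p, n1, n2,
                      d_in=216, d_hidden=2048, d_out=8911):
    """Rough parameter count for 216-N x [2048-P(N1,N2)] - M x 2048 - P - 8911.

    Assumed wiring (illustrative only): each cFSMN layer has weights into its
    2048-unit hidden layer, a 2048 x P low-rank projection, and
    (N1 + N2 + 1) memory coefficient vectors of size P; biases are omitted."""
    total, prev = 0, d_in
    for _ in range(n_cfsmn):
        total += prev * d_hidden          # into the hidden layer
        total += d_hidden * p             # low-rank linear projection
        total += (n1 + n2 + 1) * p        # memory coefficient vectors
        prev = p                          # next layer reads the memory output
    for _ in range(n_fc):
        total += prev * d_hidden          # standard fully connected layers
        prev = d_hidden
    total += prev * p                     # final projection layer
    total += p * d_out                    # output layer over 8911 targets
    return total

# e.g. a Deep-cFSMN-like configuration with 12 cFSMN and 2 FC layers
print(cfsmn_param_count(n_cfsmn=12, n_fc=2, p=512, n1=20, n2=20))
```

The count grows roughly linearly in N because each added cFSMN layer contributes only a P×2048 input block, a 2048×P projection, and a handful of coefficient vectors, which is why deepening the cFSMN stack stays cheap relative to widening a BLSTM.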
In another aspect, the present invention provides a text editing system based on a feedforward sequential memory neural network, comprising:
an acquisition unit, configured to obtain an original text to be edited;
a receiving unit, configured to receive editing voice data;
a recognition unit, configured to perform speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an editing command; and
an output unit, configured to perform semantic understanding on the editing command, execute the editing command, and output the edited text.
In another aspect, the present invention provides a device, comprising:
one or more processors; and
a memory for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors carry out any text editing method based on a feedforward sequential memory neural network exemplified by the present invention.
In another aspect, the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements any text editing method based on a feedforward sequential memory neural network exemplified by the present invention.
The above description is only a preferred embodiment of the application and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the invention involved in this application is not limited to technical solutions formed by the specific combination of the above technical features; without departing from the inventive concept, it also covers other technical solutions formed by any combination of the above technical features or their equivalents, for example solutions in which the above features are replaced by technical features with similar functions disclosed in this application.
Apart from the technical features described in the specification, the remaining technical features are known to those skilled in the art; to highlight the innovative features of the invention, those remaining features are not described in detail here.
Claims (8)
1. A text editing method based on a feedforward sequential memory neural network, characterized by the following steps:
S1: obtaining an original text to be edited;
S2: receiving editing voice data;
S3: performing speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an editing command;
S4: performing semantic understanding on the editing command and executing the editing command.
2. The text editing method based on a feedforward sequential memory neural network according to claim 1, characterized in that: the improved feedforward sequential memory neural network inserts a low-dimensional linear projection layer between the hidden layers of a feedforward fully connected neural network, places a memory module on the linear projection layer, and adds skip connections between adjacent memory modules, so that the output of a lower memory module can be added directly to a higher memory module.
3. The text editing method based on a feedforward sequential memory neural network according to claim 2, characterized in that: the memory module is a tapped-delay structure that encodes the hidden-layer outputs of the current and preceding moments into a fixed-size representation through a set of coefficients.
4. The text editing method based on a feedforward sequential memory neural network according to claim 2, characterized in that: the memory module uses scalar-based or vector-based encoding.
5. The text editing method based on a feedforward sequential memory neural network according to claim 2, characterized in that: the encoding of the memory module introduces a stride factor.
6. A text editing system based on a feedforward sequential memory neural network, comprising:
an acquisition unit, configured to obtain an original text to be edited;
a receiving unit, configured to receive editing voice data;
a recognition unit, configured to perform speech recognition on the editing voice data using an improved feedforward sequential memory neural network to obtain an editing command; and
an output unit, configured to perform semantic understanding on the editing command, execute the editing command, and output the edited text.
7. A device, comprising:
one or more processors; and
a memory for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors perform the text editing method based on a feedforward sequential memory neural network according to any one of claims 1-5.
8. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the text editing method based on a feedforward sequential memory neural network according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910487145.1A CN110377889B (en) | 2019-06-05 | 2019-06-05 | Text editing method and system based on feedforward sequence memory neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110377889A true CN110377889A (en) | 2019-10-25 |
CN110377889B CN110377889B (en) | 2023-06-20 |
Family
ID=68249843
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910487145.1A Active CN110377889B (en) | 2019-06-05 | 2019-06-05 | Text editing method and system based on feedforward sequence memory neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110377889B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016101688A1 (en) * | 2014-12-25 | 2016-06-30 | 清华大学 | Continuous voice recognition method based on deep long-and-short-term memory recurrent neural network |
CN106919977A (en) * | 2015-12-25 | 2017-07-04 | 科大讯飞股份有限公司 | A kind of feedforward sequence Memory Neural Networks and its construction method and system |
Non-Patent Citations (1)
Title |
---|
王海坤等: "基于时域建模的自动语音识别", 《计算机工程与应用》 * |
Also Published As
Publication number | Publication date |
---|---|
CN110377889B (en) | 2023-06-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109918680B (en) | Entity identification method and device and computer equipment | |
Lu et al. | Less is more: Pretrain a strong Siamese encoder for dense text retrieval using a weak decoder | |
WO2018157700A1 (en) | Method and device for generating dialogue, and storage medium | |
CN109086303A (en) | The Intelligent dialogue method, apparatus understood, terminal are read based on machine | |
WO2019076286A1 (en) | User intent recognition method and device for a statement | |
CN110377908B (en) | Semantic understanding method, semantic understanding device, semantic understanding equipment and readable storage medium | |
CN113239169B (en) | Answer generation method, device, equipment and storage medium based on artificial intelligence | |
Chi et al. | Speaker role contextual modeling for language understanding and dialogue policy learning | |
CN104199825A (en) | Information inquiry method and system | |
CN113935337A (en) | Dialogue management method, system, terminal and storage medium | |
JP7436077B2 (en) | Skill voice wake-up method and device | |
Tran et al. | WaveTransformer: A novel architecture for audio captioning based on learning temporal and time-frequency information | |
CN108959421A (en) | Candidate replys evaluating apparatus and inquiry reverting equipment and its method, storage medium | |
Lu et al. | Less is more: Pre-train a strong text encoder for dense retrieval using a weak decoder | |
CN109933773A (en) | A kind of multiple semantic sentence analysis system and method | |
CN110795547B (en) | Text recognition method and related product | |
CN116127328B (en) | Training method, training device, training medium and training equipment for dialogue state recognition model | |
CN116644168A (en) | Interactive data construction method, device, equipment and storage medium | |
CN110377889A (en) | A kind of method for editing text and system based on feedforward sequence Memory Neural Networks | |
CN112397053B (en) | Voice recognition method and device, electronic equipment and readable storage medium | |
CN114297352A (en) | Conversation state tracking method and device, man-machine conversation system and working machine | |
CN111508481A (en) | Training method and device of voice awakening model, electronic equipment and storage medium | |
CN115169367B (en) | Dialogue generating method and device, and storage medium | |
CN117474084B (en) | Bidirectional iteration method, equipment and medium for pre-training model and downstream sequence task | |
CN115064173B (en) | Voice recognition method and device, electronic equipment and computer readable medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||