CN110047477A - Optimization method, device, and system for a weighted finite-state transducer - Google Patents
Optimization method, device, and system for a weighted finite-state transducer
- Publication number
- CN110047477A (application number CN201910271141.XA)
- Authority
- CN
- China
- Prior art keywords
- token
- finite state
- weighted finite
- data structure
- transducer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/253—Grammatical analysis; Style critique
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
Abstract
The present invention provides an optimization method, system, computer device, and computer-readable storage medium for a weighted finite-state transducer (WFST), in the technical field of speech recognition. The system includes a data structure optimization module, for optimizing the Token data structure; a transducer construction module, for performing transition linking on the acoustic output and the language model with the optimized Token data structure to obtain a weighted finite-state transducer; a transducer pruning module, for pruning the weighted finite-state transducer; and an optimal path search module, for searching, by traversing nodes with the optimized Token data structure, the search space corresponding to the weighted finite-state transducer to obtain the optimal path. The present invention reduces conversions between different structures, lowers memory and computation overhead, reduces the number of memory accesses, and optimizes the optimal path search algorithm, thereby improving algorithm efficiency.
Description
Technical field
The present invention relates to the field of speech signal processing, and in particular to speech recognition technology; more specifically, it concerns an optimization method, system, computer device, and computer-readable storage medium for a weighted finite-state transducer.
Background art
This section is intended to provide background or context for the embodiments of the present invention set forth in the claims. The description herein is not admitted to be prior art merely by its inclusion in this section.
Speech recognition converts human speech into computer-readable text. Speech recognition technology plays an increasingly prominent role in many fields: applied in government offices, commercial departments, and family life, it brings great convenience to people's work and daily lives. With the development of the Internet, digital devices, and multimedia technology, speech recognition has received growing attention, and speech-related products keep emerging, such as Apple's Siri, Amazon's Echo, and Google's Home. These products were quickly brought to market in recent years and have been well received by the public. In the future, speech recognition will be applied to even more fields, such as healthcare, intelligent vehicles, smart homes, and education.
A speech recognition system generally comprises feature extraction, an acoustic model, a language model, and a decoder. As shown in Fig. 1, O and W denote the observed feature vectors of an utterance and the corresponding word sequence, P(O|W) is the acoustic model probability, and P(W) is the prior probability of the word sequence W. When P(W)P(O|W) reaches its maximum, the word sequence W* is output as the recognition result. The decoder is thus one of the cores of a speech recognition system, and its quality directly affects the final recognition result. In recent years, many decoding strategies and decoding functions have been applied in decoders, for example the HVite decoding tool of HTK (Hidden Markov Model Toolkit), the Sphinx decoder, and the TODE decoder. A common drawback of these decoders is that the acoustic, phonetic, and lexical knowledge sources are embedded in the decoder in a very rigid form, making later modification cumbersome. To obtain a more flexible decoder architecture, the concept of the "weighted finite-state transducer" (WFST) was proposed; its theory is to model the structure and characteristics of language with the WFST model.
The core idea of WFST is to represent the acoustic output and the language model each as a weighted finite-state transducer, then integrate them into one complete weighted finite-state transducer model through a composition algorithm, thereby obtaining the search space for the sample features; finally, the optimal path is searched for in this search space. WFST therefore mainly comprises three parts: finite-state machine construction, finite-state machine pruning, and optimal path search.
The finite-state machine is a simple and effective mathematical model. Because finite-state machines are efficient in both time and space, they are widely used in speech recognition and natural language processing. The core idea is to convert each mathematical model in a traditional speech recognition system into a finite-state machine model, then integrate and optimize the converted models to obtain the search space. The advantage of this approach is that it unifies the representation of the models, making it easier to integrate different resources and greatly simplifying the complexity of the speech recognition system. The weighted finite-state transducer (WFST) is a special form of finite-state machine.
In a finite-state machine, a state is represented by a point, a transition by a directed line segment with an arrowhead, and the character on a transition is its input character. The initial state is drawn as a bold circle and a final state as a double circle; a state that is both initial and final is drawn as a double bold circle, as shown in Fig. 2.

A finite-state machine contains a finite number of states, exactly one of which is the initial state, and zero or more of which are final states. The machine starts in the initial state and moves to the next state by consuming an input character; if the state reached after the last transition is a final state, the machine outputs accept, otherwise it outputs reject. A transition is represented by a directed arc. A sequence of states and transitions in the finite-state machine constitutes a path.
To better describe the characteristics of a finite-state machine, a weight parameter is introduced that assigns different weights to different transitions, yielding the weighted finite-state machine. A weighted finite-state machine is shown in Fig. 3: to the left of the colon is the input character, to the right of the colon the output character, and to the right of the slash the weight of the transition. Let a path be P = p1 ... pi ... pn, where pi denotes the i-th transition of the path, i = 1, 2, ..., n; w[pi] denotes the weight of transition pi; λ is the set of initial-state weights; and η is the set of final-state weights. The weight of the initial state of path P is λ[p1] and the weight of its final state is η[pn]. The total weight over all transitions of path P is then (in the standard tropical-semiring formulation, where weights accumulate by addition):

w[P] = λ[p1] + w[p1] + w[p2] + ... + w[pn] + η[pn]

The optimal path of the weighted finite-state machine is the path of minimum total weight:

P* = argmin over all paths P of w[P]
After the weighted finite-state machine is generated, in order to reduce the size of the speech recognition search space and improve recognition efficiency, a pruning-and-merging step follows: paths with similar trajectories and paths of very small weight are pruned and merged to produce a simplified finite-state machine. However, because the language model is very large, the process of generating the search space from the finite-state machines is complicated and the resulting search space is also large, so the current WFST algorithm consumes considerable time and space resources; optimization of the WFST algorithm is therefore necessary.
The data structures in WFST include StdLattice, Token, Lattice, and CompactLattice. StdLattice is mainly used to store the language model data; Token stores the weighted finite-state machine generated after combining the acoustic model and the language model; Lattice and CompactLattice are used in the GetBestPath (optimal path search) module. Token is a two-dimensional network topology, Lattice is a table structure converted from the Token structure, and CompactLattice is a table structure obtained by converting Lattice. Fig. 4 shows the Token structure and Fig. 5 the CompactLattice structure.
As described above, WFST uses four data structures to store data, each playing a different role. To guarantee recognition accuracy, WFST needs to save a large number of intermediate states and paths, so the conversion, traversal, and storage of these four data structures consume considerable computation and memory resources. Second, the GetBestPath (optimal path search) module is a very time-consuming module in WFST: GetBestPath needs to compute the cost of every possible path, and WFST implements this with a depth-first search algorithm of very high complexity, making the algorithm very slow. Finally, WFST contains some redundant operations that can be merged and optimized.

Therefore, how to provide a new weighted finite-state transducer that solves the above problems of existing weighted finite-state transducers is an urgent technical problem to be solved in this field.
Summary of the invention
In view of this, embodiments of the present invention provide an optimization method, system, computer device, and computer-readable storage medium for a weighted finite-state transducer. By compressing and merging storage structures, conversions between different structures are reduced and the overhead of memory and computation is greatly lowered; by optimizing the implementation of WFST, the number of memory accesses is greatly reduced, improving algorithm efficiency; and by optimizing the optimal path search algorithm, the number of traversed nodes is reduced severalfold, achieving a substantial efficiency improvement.
An object of the present invention is to provide an optimization method for a weighted finite-state transducer, comprising:

optimizing the Token data structure;

performing transition linking on the acoustic output and the language model with the optimized Token data structure, to obtain a weighted finite-state transducer;

pruning the weighted finite-state transducer;

searching the search space corresponding to the weighted finite-state transducer, by traversing nodes with the optimized Token data structure, to obtain the optimal path.
An object of the present invention is to provide an optimization system for a weighted finite-state transducer, comprising:

a data structure optimization module, for optimizing the Token data structure;

a transducer construction module, for performing transition linking on the acoustic output and the language model with the optimized Token data structure to obtain a weighted finite-state transducer;

a transducer pruning module, for pruning the weighted finite-state transducer;

an optimal path search module, for searching, by traversing nodes with the optimized Token data structure, the search space corresponding to the weighted finite-state transducer to obtain the optimal path.

An object of the present invention is to provide a computer device, comprising a processor adapted to implement instructions and a storage device storing a plurality of instructions, the instructions being adapted to be loaded by the processor to execute an optimization method for a weighted finite-state transducer.

An object of the present invention is to provide a computer-readable storage medium storing a computer program, the computer program being used to execute an optimization method for a weighted finite-state transducer.
The beneficial effects of the present invention are to provide an optimization method, system, computer device, and computer-readable storage medium for a weighted finite-state transducer. By compressing and merging storage structures, conversions between different structures are reduced and the overhead of memory and computation is greatly lowered; by optimizing the implementation of WFST, the number of memory accesses is greatly reduced, improving algorithm efficiency; and by optimizing the optimal path search algorithm, the number of traversed nodes is reduced severalfold, achieving a substantial efficiency improvement. This solves the problem that the prior-art weighted finite-state machine (WFST) occupies very large time and space resources.

To make the above and other objects, features, and advantages of the present invention clearer and more comprehensible, preferred embodiments are described in detail below in conjunction with the accompanying drawings.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is an architecture diagram of speech recognition;

Fig. 2 is a structural diagram of a finite-state machine;

Fig. 3 is a structural diagram of a weighted finite-state machine;

Fig. 4 is a structural diagram of the prior-art WFST storage structure Token;

Fig. 5 is a structural diagram of the prior-art WFST storage structure CompactLattice;

Fig. 6 is a flowchart of an optimization method for a weighted finite-state transducer provided by an embodiment of the present invention;

Fig. 7 is a detailed flowchart of step S101 in Fig. 6;

Fig. 8 is a detailed flowchart of step S102 in Fig. 6;

Fig. 9 is a detailed flowchart of step S303 in Fig. 8;

Fig. 10 is a detailed flowchart of step S304 in Fig. 8;

Fig. 11 is a detailed flowchart of step S104 in Fig. 6;

Fig. 12 is a structural diagram of an optimization system for a weighted finite-state transducer provided by an embodiment of the present invention;

Fig. 13 is a structural diagram of the data structure optimization module in an optimization system for a weighted finite-state transducer provided by an embodiment of the present invention;

Fig. 14 is a structural diagram of the transducer construction module in an optimization system for a weighted finite-state transducer provided by an embodiment of the present invention;

Fig. 15 is a structural diagram of the first transition linking module in an optimization system for a weighted finite-state transducer provided by an embodiment of the present invention;

Fig. 16 is a structural diagram of the second transition linking module in an optimization system for a weighted finite-state transducer provided by an embodiment of the present invention;

Fig. 17 is a structural diagram of the optimal path search module in an optimization system for a weighted finite-state transducer provided by an embodiment of the present invention.
Specific embodiments
The technical solutions in the embodiments of the present invention will now be described clearly and completely with reference to the accompanying drawings of the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.

Those skilled in the art will appreciate that embodiments of the present invention may be implemented as a system, device, method, or computer program product. Therefore, the present disclosure may be embodied in the following forms: entirely hardware, entirely software (including firmware, resident software, microcode, etc.), or a combination of hardware and software.
The principles and spirit of the present invention are explained in detail below with reference to several representative embodiments of the invention.

Fig. 12 is a structural diagram of an optimization system for a weighted finite-state transducer provided by an embodiment of the present invention. Referring to Fig. 12, the optimization system of the weighted finite-state transducer includes:

a data structure optimization module 100, for optimizing the Token data structure;

a transducer construction module 200, for performing transition linking on the acoustic output and the language model with the optimized Token data structure to obtain a weighted finite-state transducer;

a transducer pruning module 300, for pruning the weighted finite-state transducer;

an optimal path search module 400, for searching, by traversing nodes with the optimized Token data structure, the search space corresponding to the weighted finite-state transducer to obtain the optimal path.
That is, in the present invention: first, by optimizing the implementation of WFST, the number of memory accesses is greatly reduced, achieving the goal of improving algorithm efficiency; second, compressing and merging the storage structures reduces the conversions between different structures and substantially lowers the overhead of memory and computation; finally, optimizing the GetBestPath search algorithm reduces the number of traversed nodes severalfold, achieving a substantial efficiency improvement.

In an embodiment of the present invention, the acoustic output may be obtained by an acoustic output determining module; specifically, the acoustic output determining module inputs speech into an acoustic model to obtain the acoustic output.
The technical solutions and innovations of the present invention are described in detail below with reference to the specific drawings.
In the prior art, WFST has as many as four data storage structures, and conversion between these data structures brings great resource consumption. The present invention integrates and optimizes these excessive data structures by modifying and upgrading the prior-art Token structure so that it supports the GetBestPath (optimal path search) function previously provided by Lattice and CompactLattice. The data structures can thereby be reduced to two, StdLattice and Token, which reduces the number of data conversions, substantially lowers the amount of computation, and reduces peak memory consumption by about 50%.
Specifically, Fig. 13 is a structural diagram of the data structure optimization module in an optimization system for a weighted finite-state transducer provided by an embodiment of the present invention. Referring to Fig. 13, the data structure optimization module 100 includes:

a data structure adding module 101, for adding to the Token data structure a Token *last field that stores the address of the previous-frame node;

a data variable adding module 102, for adding to the Token data structure a tot_cost variable that stores the cost value of each path.
That is, in the prior art the optimal path search module first computes the cost of each path in the forward direction, saves it in the nodes of the last frame, finds the minimum-cost node, and then backtracks to obtain the optimal path. However, each path in the prior-art Token structure is a singly linked list, so backtracking cannot be implemented. The invention therefore adds to the Token structure a Token *last field that stores the address of the previous-frame node, so that when a subsequent Token is generated the address of the previous-frame node can be recorded, solving the problem that Token cannot backtrack. In addition, the original Token structure has no means of storing the cost value of each path, so the invention adds to the Token structure a tot_cost variable that stores the cost value of each path.
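The augmented Token described above can be sketched as follows. Only the `last` pointer and `tot_cost` are taken from the text; the `state` field and the example values are illustrative assumptions:

```python
# Sketch of the augmented Token: a `last` pointer to the previous
# frame's token (enabling backtracking) plus a `tot_cost` accumulator.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Token:
    state: int                        # WFST state this token sits on
    tot_cost: float                   # accumulated cost of the path so far
    last: Optional["Token"] = None    # previous-frame token (added field)

def backtrace(tok):
    """Follow the added `last` pointers back from a final-frame token."""
    states = []
    while tok is not None:
        states.append(tok.state)
        tok = tok.last
    return list(reversed(states))

t0 = Token(0, 0.0)
t1 = Token(3, 1.5, last=t0)
t2 = Token(7, 2.9, last=t1)
print(backtrace(t2))   # [0, 3, 7]
```

Without `last`, only the forward singly linked chain exists and the path cannot be recovered from the best final token, which is exactly the prior-art limitation described above.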
In the prior art, Token is the most important storage structure in WFST. It is a two-dimensional topological structure that can conveniently express direction information both between frames and within a frame, so the data produced by combining the language model and the acoustic model are stored with the Token structure; however, because the data volume is very large, creating Tokens is also time-consuming.

In the prior art, Token creation has two parts: adding non-epsilon (emitting) transitions, ProcessEmitting, and adding epsilon (non-emitting) transitions, NoneProcessEmitting. These two steps are called separately in WFST, so each node is traversed at least twice when creating the Token structure. The present invention merges ProcessEmitting and NoneProcessEmitting, halving the number of node traversals and greatly reducing the number of memory accesses; since accessing memory data accounts for a large proportion of the time spent creating Tokens, this optimizes the efficiency of the Token creation part.
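The merging of the two passes can be sketched as a single visit per state that handles both arc kinds. This is a simplified illustration, not the patent's code: the arc format, label convention, and costs are invented, and accumulation from the previous frame is omitted for brevity:

```python
# One pass per frame: each state's arcs are handled in a single visit,
# instead of one pass for emitting arcs and a second for epsilon arcs.

def process_frame_merged(arcs, frame_costs):
    """arcs: state -> list of (next_state, input_label, weight).
    input_label 0 stands for an epsilon (non-emitting) arc.
    frame_costs: acoustic cost per input label for this frame."""
    new_costs = {}
    for state, out_arcs in arcs.items():
        for nxt, label, w in out_arcs:       # one visit covers both kinds
            if label == 0:
                cost = w                     # epsilon arc: graph cost only
            else:
                cost = w + frame_costs.get(label, 1e9)  # add acoustic cost
            if cost < new_costs.get(nxt, float("inf")):
                new_costs[nxt] = cost
    return new_costs

arcs = {0: [(1, 5, 0.5), (2, 0, 0.2)]}       # one emitting, one epsilon arc
print(process_frame_merged(arcs, {5: 1.0}))  # {1: 1.5, 2: 0.2}
```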
Specifically, Fig. 14 is a structural diagram of the transducer construction module in an optimization system for a weighted finite-state transducer provided by an embodiment of the present invention. Referring to Fig. 14, the transducer construction module 200 includes:

a first-frame obtaining module 201, for obtaining the first frame of the acoustic output as the current frame, the acoustic output consisting of multiple frames;

a one-pass transition obtaining module 202, for obtaining, in a single pass over the language model, the epsilon transitions and non-epsilon transitions corresponding to the current frame;

a first transition linking module 203, for performing transition linking according to the epsilon transitions corresponding to the current frame;

a second transition linking module 204, for performing transition linking according to the non-epsilon transitions corresponding to the current frame;

a frame traversal module 205, for traversing the multiple frames of the acoustic output and storing the previous-frame node address in the Token *last field.
Fig. 15 is a structural diagram of the first transition linking module in an optimization system for a weighted finite-state transducer provided by an embodiment of the present invention. Referring to Fig. 15, the first transition linking module 203 includes:

a first judgment module 2031, for judging whether the epsilon transitions corresponding to the current frame satisfy a first threshold, the first threshold including a fixed threshold and a dynamic threshold;

a first linking module 2032, for linking the current frame when the first judgment module judges yes;

a first discarding module 2033, for discarding the current frame when the first judgment module judges no.

Fig. 16 is a structural diagram of the second transition linking module in an optimization system for a weighted finite-state transducer provided by an embodiment of the present invention. Referring to Fig. 16, the second transition linking module 204 includes:

a second judgment module 2041, for judging whether the non-epsilon transitions corresponding to the current frame satisfy a second threshold, the second threshold including a fixed threshold and a dynamic threshold;

a second linking module 2042, for linking the current frame when the second judgment module judges yes;

a second discarding module 2043, for discarding the current frame when the second judgment module judges no.
That is, in the present invention, merging ProcessEmitting with NoneProcessEmitting halves the number of data memory accesses. Specifically, when a Token is added, the corresponding epsilon and non-epsilon transitions are obtained from the language model in a single pass, and each is then checked in turn against its threshold: if the threshold is satisfied, the transition is linked; otherwise it is discarded. In the prior art, the linking thresholds are computed separately; in the present invention, after the epsilon and non-epsilon transitions are merged, the first and second thresholds are computed in a unified manner.
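The fixed-plus-dynamic threshold check described above can be sketched as a beam-pruning decision. The threshold values, the beam formulation, and the function name are assumptions for illustration, not the patent's exact criterion:

```python
# A token is linked only if its cost passes both an absolute cutoff
# (fixed threshold) and a cutoff relative to the current best cost
# (dynamic threshold, which tightens as better paths are found).

def should_link(cost, best_cost, fixed_threshold=100.0, beam=10.0):
    dynamic_threshold = best_cost + beam
    return cost <= fixed_threshold and cost <= dynamic_threshold

print(should_link(12.0, best_cost=5.0))   # True  (12 <= 100 and 12 <= 15)
print(should_link(20.0, best_cost=5.0))   # False (20 > 5 + 10)
```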
In the prior art, the most time-consuming module of WFST is the GetBestPath (optimal path search) part. The goal of GetBestPath is to compute the cost of each path and then select the path with the smallest cost. Computing the cost of each path requires a path search algorithm, and WFST uses depth-first search; but the complexity of depth-first search is very high, making it very time-consuming.
Fig. 17 is a structural diagram of the optimal path search module in an optimization system for a weighted finite-state transducer provided by an embodiment of the present invention. Referring to Fig. 17, the optimal path search module 400 includes:

a cost value determining module 401, for ordering the intra-frame Token linked lists so that the lists are laid out in order within each frame, traversing the intra-frame and inter-frame Token lists in turn, accumulating the weights along each Token chain according to the pointer relationships, obtaining the cost value of each path and storing it in the tot_cost variable, while recording the pointer to the parent node in Token *last;

a cost value comparison module 402, for taking the Token linked list of the last frame of the pruned weighted finite-state transducer, adding the finalcost value to the cost of each node, and comparing the costs of the nodes;

an optimal path determining module 403, for taking the node with the smallest cost and backtracking along the links until the first node, thereby obtaining the optimal path.
That is, with the Token data structure optimized by the present invention, the inter-frame direction of Token is created unidirectionally, so there is no reverse-order direction between frames; and because the intra-frame lists are ordered before the optimal path is searched, node pointers can point in only one direction, so there is no reverse-order pointing within a frame either. Accumulating costs in order along the two dimensions, inter-frame and intra-frame, therefore cannot go wrong. The Token structure completely preserves the ordering between and within frames, so Token can be used to implement optimized node traversal.
As shown in Figure 4, Token is a two-dimensional topological structure: the first frame is node 1, the second frame is nodes 2 and 3, and the third frame is nodes 4, 5 and 6, so the traversal order is simply the node sequence 1, 2, 3, 4, 5, 6. The structure completely preserves both the intra-frame and inter-frame information. Because the Token structure contains no reverse-order inter-frame links (Tokens are built in frame order), depth-first search is unnecessary when searching for paths; a node traversal of much lower complexity can be used instead, proceeding frame by frame and, within each frame, in list order. The complexity of depth-first search is O(n²) while that of the traversal is O(n), so the efficiency improves by a factor of roughly O(n²)/O(n) = n; since the number of nodes n is very large, the improvement is substantial.
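The difference between the two traversal strategies can be illustrated with a small Python sketch over the Figure 4 topology (the fully-connected edges between adjacent frames are an illustrative assumption, not stated in the patent): a naive depth-first path enumeration re-visits shared nodes once per path through them, while the frame-ordered traversal touches each of the six nodes exactly once.

```python
# Nodes from Fig. 4: frame 1 -> {1}, frame 2 -> {2, 3}, frame 3 -> {4, 5, 6}.
# Hypothetical edges: every node in frame k connects to every node in frame k+1.
frames = [[1], [2, 3], [4, 5, 6]]
edges = {1: [2, 3], 2: [4, 5, 6], 3: [4, 5, 6], 4: [], 5: [], 6: []}

def dfs_visits(node, counter):
    """Naive depth-first path enumeration: shared nodes are visited once
    per path through them, which grows super-linearly with the lattice."""
    counter[0] += 1
    for nxt in edges[node]:
        dfs_visits(nxt, counter)

def ordered_visits():
    """Frame-ordered traversal: each node is visited exactly once, O(n)."""
    return sum(len(f) for f in frames)

c = [0]
dfs_visits(1, c)
print(c[0], ordered_visits())  # DFS makes 9 visits, the ordered pass only 6
```

Even on this three-frame toy lattice the depth-first count exceeds the node count; on real lattices with very large n the gap is far wider, which is the point of the O(n²) versus O(n) comparison above.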
The above is an optimization system for a weighted finite state transducer provided by the present invention. By compressing and merging storage structures it reduces conversions between different structures, greatly lowering memory and computation overhead; by optimizing the implementation of the WFST it greatly reduces the number of memory accesses, improving algorithm efficiency; and by optimizing the optimal-path search algorithm it drastically reduces the number of traversed nodes, achieving a substantial efficiency gain.
In addition, although several unit modules of the system are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more of the units described above may be embodied in a single unit; conversely, the features and functions of one unit described above may be further divided among multiple units. The terms "module" and "unit" used above may refer to software and/or hardware that implements a predetermined function. Although the modules described in the following embodiments are preferably implemented in software, implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.
Having described the optimization system for a weighted finite state transducer of an exemplary embodiment of the present invention, the method of an exemplary embodiment of the present invention is introduced next with reference to the drawings. The implementation of this method may refer to the overall implementation above, and overlapping details will not be repeated.
Fig. 6 is a flow diagram of an optimization method for a weighted finite state transducer provided by an embodiment of the present invention. Referring to Fig. 6, the optimization method for a weighted finite state transducer includes:
S101: optimizing the Token data structure;
S102: performing transition and chaining on the acoustic output and the language model with the optimized Token data structure, to obtain a weighted finite state transducer;
S103: pruning the weighted finite state transducer;
S104: searching the search space corresponding to the weighted finite state transducer with the optimized Token data structure by traversing nodes, to obtain the optimal path.
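The four steps above can be summarized as a pipeline. The following Python sketch is purely illustrative: all function names and the toy return values are assumptions made for this example, not the patent's implementation (which operates on real acoustic frames and a language model).

```python
def optimize_token_structure():
    # S101: in the real system this augments Token with `last` and `tot_cost`.
    return {"fields": ["last", "tot_cost"]}

def transition_and_chain(acoustic_output, language_model, token_cfg):
    # S102: combine each acoustic frame with language-model transitions
    # into one chained structure (here a simple list of dicts).
    return [{"frame": f, "lm": language_model, "cfg": token_cfg}
            for f in acoustic_output]

def prune(wfst):
    # S103: cut unlikely hypotheses (a no-op on this toy data).
    return wfst

def search_best_path(wfst):
    # S104: traverse the chained frames in order instead of depth-first search.
    return [entry["frame"] for entry in wfst]

result = search_best_path(
    prune(transition_and_chain([0, 1, 2], "lm", optimize_token_structure())))
print(result)  # [0, 1, 2]
```

The point of the sketch is the ordering of the stages: the Token structure is fixed first, the transducer is built and pruned, and only then is the linear best-path search run over it.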
That is, in the present invention, first, optimizing the implementation of the WFST greatly reduces the number of memory accesses, improving algorithm efficiency. Second, compressing and merging storage structures reduces conversions between different structures, greatly lowering memory and computation overhead. Finally, optimizing the GetBestPath search algorithm drastically reduces the number of traversed nodes, achieving a substantial efficiency gain.
In one embodiment of the present invention, the acoustic output can be obtained by an acoustic output determining module; specifically, speech is input into an acoustic model to obtain the acoustic output.
The technical solution and innovations of the present invention are discussed in detail below with reference to the drawings.
In the prior art there are as many as four WFST data storage structures, and converting between them incurs a large resource cost. The present invention integrates and optimizes these excessive data structures by upgrading the prior-art Token structure, so that Lattice and CompactLattice gain support for GetBestPath (optimal path search). The data structures can thus be reduced to two, StdLattice and Token, cutting the number of data conversions, so that the amount of computation drops substantially and the peak memory consumption is reduced by about 50%.
Specifically, Fig. 7 is a flow diagram of optimizing the Token data structure in an optimization method for a weighted finite state transducer provided by an embodiment of the present invention. Referring to Fig. 7, optimizing the Token data structure includes:
S201: adding to the Token data structure a Token*last field that stores the address of the previous-frame node;
S202: adding to the Token data structure a tot_cost variable that stores the cost value of each path.
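A minimal sketch of the augmented Token node may help. The field names `last` and `tot_cost` follow the patent text (Token*last, tot_cost); the other fields and the Python form are illustrative assumptions — the real structure would be a C/C++ struct inside the decoder.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Token:
    """Hypothetical sketch of the optimized Token node of S201/S202."""
    state_id: int
    arc_weight: float
    last: Optional["Token"] = None  # S201: previous-frame node address
    tot_cost: float = 0.0           # S202: accumulated cost of this path

# Forward pass: each new Token records its predecessor and running cost.
t1 = Token(state_id=1, arc_weight=0.0)
t2 = Token(state_id=2, arc_weight=1.5, last=t1, tot_cost=t1.tot_cost + 1.5)

# Backtracking, impossible with the singly linked prior-art structure,
# now just follows the stored `last` pointers back to the first frame.
path, tok = [], t2
while tok is not None:
    path.append(tok.state_id)
    tok = tok.last
ordered = path[::-1]
print(ordered)  # [1, 2]
```

This is exactly the capability the next paragraph relies on: the stored previous-frame address makes the backward walk possible, and `tot_cost` makes the forward cost available at every node.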
That is, in the prior art the optimal-path search module first computes the cost of each path in the forward direction and saves it in the nodes of the last frame, finds the node with the minimum cost, and then backtracks to obtain the optimal path. However, each path in the prior-art Token structure is a singly linked list, so backtracking cannot be implemented. The present invention therefore adds to the Token structure a Token*last field storing the address of the previous-frame node, so that when a subsequent Token is created the address of the node one frame earlier can be recorded, solving the problem that Tokens cannot be backtracked. In addition, the original Token structure has no facility for storing the cost value of each path, so the present invention adds to the Token structure a tot_cost variable that stores the cost value of each path.
In the prior art, Token is the most important storage structure in the WFST. It is a two-dimensional topological structure that conveniently expresses both inter-frame and intra-frame directional information, so the data produced by combining the language model and the acoustic model is stored with the Token structure. However, since the data volume is very large, creating Tokens is also rather time-consuming.
In the prior art, Token creation has two parts: adding non-empty transitions (ProcessEmitting) and adding empty transitions (NoneProcessEmitting). These two steps are called separately in the WFST, so each node is traversed at least twice when the Token structure is created. The present invention merges the non-empty transition step ProcessEmitting with the empty transition step NoneProcessEmitting, halving the number of node traversals and greatly reducing the number of memory accesses. Since accessing memory accounts for a large proportion of the time spent creating Tokens, this optimizes the efficiency of the Token creation part.
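The effect of the merge can be sketched with a lookup counter. The function names and arc representation here are assumptions for illustration; the point is that fetching a token's empty and non-empty transitions in one lookup means each token is touched once per frame instead of twice.

```python
def expand_frame_merged(frame_tokens, get_arcs):
    """Hypothetical merged expansion: for each token, fetch its empty
    and non-empty arcs together in one lookup and process both, instead
    of two separate passes over the token list."""
    new_tokens = []
    for tok in frame_tokens:
        eps_arcs, emit_arcs = get_arcs(tok)  # one memory lookup per token
        for arc in eps_arcs + emit_arcs:
            new_tokens.append((tok, arc))
    return new_tokens

lookups = {"n": 0}
def get_arcs(tok):
    lookups["n"] += 1
    return [f"eps{tok}"], [f"emit{tok}"]

out = expand_frame_merged([1, 2, 3], get_arcs)
print(lookups["n"])  # 3 lookups for 3 tokens; two separate passes would need 6
```

Halving the lookups is exactly the "node traversal count halves" claim above: the arc data is the same, but it is read from memory once instead of twice.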
Specifically, Fig. 8 is a flow diagram of step S102 in an optimization method for a weighted finite state transducer provided by an embodiment of the present invention. Referring to Fig. 8, step S102 includes:
S301: obtaining the first frame of the acoustic output as the current frame, the acoustic output being composed of multiple frames;
S302: obtaining, in a single lookup in the language model, the empty transitions and non-empty transitions corresponding to the current frame;
S303: performing transition and chaining according to the empty transitions corresponding to the current frame;
S304: performing transition and chaining according to the non-empty transitions corresponding to the current frame;
S305: traversing the multiple frames of the acoustic output, and storing the previous-frame node address in the Token*last field.
Fig. 9 is a flow diagram of step S303. Referring to Fig. 9, step S303 includes:
S401: judging whether the empty transition corresponding to the current frame satisfies a first threshold, the first threshold including a fixed threshold and a dynamic threshold;
S402: when the judgment of step S401 is yes, chaining the current frame;
S403: when the judgment of step S401 is no, discarding the current frame.
Figure 10 is a flow diagram of step S304. Referring to Figure 10, step S304 includes:
S501: judging whether the non-empty transition corresponding to the current frame satisfies a second threshold, the second threshold including a fixed threshold and a dynamic threshold;
S502: when the judgment of step S501 is yes, chaining the current frame;
S503: when the judgment of step S501 is no, discarding the current frame.
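A hedged sketch of the threshold test in S401/S501: the patent states each threshold combines a fixed part and a dynamic part but does not define them, so the particular form below (dynamic threshold as best cost so far plus a beam width, a common pruning scheme) and all parameter values are assumptions for illustration.

```python
def passes_threshold(cost, best_cost, fixed_threshold=100.0, beam=16.0):
    """Hypothetical pruning test: a candidate is chained only if it is
    within both the fixed threshold and the dynamic one (best cost so
    far widened by a beam)."""
    dynamic_threshold = best_cost + beam
    return cost <= fixed_threshold and cost <= dynamic_threshold

print(passes_threshold(20.0, best_cost=10.0))  # True: within both bounds
print(passes_threshold(30.0, best_cost=10.0))  # False: outside the dynamic beam
```

With the empty and non-empty transitions fetched together, both thresholds can be evaluated in the same pass, which is what lets the invention compute them in a unified way.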
That is, in the present invention, merging the non-empty transition step ProcessEmitting with the empty transition step NoneProcessEmitting halves the number of data memory accesses. Specifically, when adding a Token, the corresponding empty and non-empty transitions are obtained from the language model in a single lookup; it is then judged in turn whether the empty and non-empty transitions satisfy their thresholds, and the Token is chained if so and discarded if not. In the prior art the chaining thresholds are computed separately; in the present invention, after the empty and non-empty transitions are merged, the first threshold and the second threshold are computed in a unified way.
In the prior art, the most time-consuming module of the WFST is the GetBestPath (optimal path search) part. The goal of GetBestPath is to compute the cost of each path and then select the path with the smallest cost. Computing the cost of each path requires a path-search algorithm; the WFST uses depth-first search, but the complexity of depth-first search is very high, making it very time-consuming.
Figure 11 is a detailed flow chart of step S104 in Fig. 6. Referring to Figure 11, step S104 includes:
S601: sorting the Token linked lists within each frame so that the lists are laid out in order, traversing the intra-frame and inter-frame Token lists in sequence, accumulating the weights along each Token chain according to the pointer relationships to obtain the cost value of each path, storing it in the tot_cost variable, and simultaneously recording the pointer to the father node in Token*last;
S602: taking the Token linked list of the last frame of the pruned weighted finite state transducer, adding the finalcost value to the cost of each node, and comparing the costs of the nodes;
S603: taking the node with the smallest cost and backtracking along the links to the first node, thereby obtaining the optimal path.
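Steps S601–S603 can be sketched end to end in Python. The dict-based tokens, the explicit `parents` lists, and the toy weights below are assumptions made for the example; only the three-stage logic (forward accumulation in frame order, finalcost at the last frame, backtrack through the stored father pointers) follows the patent text.

```python
def find_best_path(frames, final_costs):
    """Hypothetical sketch of S601-S603 over tokens stored as dicts."""
    # S601: forward accumulation in frame order (no depth-first search).
    for frame in frames[1:]:
        for tok in frame:
            parent = min(tok["parents"], key=lambda p: p["tot_cost"])
            tok["tot_cost"] = parent["tot_cost"] + tok["weight"]
            tok["last"] = parent               # record the father pointer
    # S602: add finalcost to last-frame tokens and take the smallest.
    best = min(frames[-1], key=lambda t: t["tot_cost"] + final_costs[t["id"]])
    # S603: backtrack through the stored `last` pointers to the first node.
    path = []
    while best is not None:
        path.append(best["id"])
        best = best["last"]
    return path[::-1]

# Toy lattice shaped like Fig. 4: frames of 1, 2 and 3 nodes.
n1 = {"id": 1, "weight": 0.0, "tot_cost": 0.0, "parents": [], "last": None}
n2 = {"id": 2, "weight": 1.0, "parents": [n1]}
n3 = {"id": 3, "weight": 2.0, "parents": [n1]}
n4 = {"id": 4, "weight": 5.0, "parents": [n2, n3]}
n5 = {"id": 5, "weight": 1.0, "parents": [n2, n3]}
n6 = {"id": 6, "weight": 3.0, "parents": [n2, n3]}
frames = [[n1], [n2, n3], [n4, n5, n6]]
print(find_best_path(frames, {4: 0.0, 5: 0.0, 6: 0.0}))  # [1, 2, 5]
```

Note that the forward pass visits each token exactly once because the frames are processed in order, which is the O(n) traversal claimed below.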
That is, with the Token data structure optimized by the present invention, the inter-frame links between Tokens are created in a single direction, so there are no reverse-order links between frames; and because the linked lists within each frame are sorted before the optimal path is searched, node pointers within the same frame can only point in one direction, so there are no reverse-order links within a frame either. Accumulating costs in the order of these two dimensions, inter-frame and intra-frame, therefore cannot go wrong. Since the Token structure completely preserves the ordering both within and between frames, optimized node traversal can be implemented using Tokens.
As shown in Figure 4, Token is a two-dimensional topological structure: the first frame is node 1, the second frame is nodes 2 and 3, and the third frame is nodes 4, 5 and 6, so the traversal order is simply the node sequence 1, 2, 3, 4, 5, 6. The structure completely preserves both the intra-frame and inter-frame information. Because the Token structure contains no reverse-order inter-frame links (Tokens are built in frame order), depth-first search is unnecessary when searching for paths; a node traversal of much lower complexity can be used instead, proceeding frame by frame and, within each frame, in list order. The complexity of depth-first search is O(n²) while that of the traversal is O(n), so the efficiency improves by a factor of roughly O(n²)/O(n) = n; since the number of nodes n is very large, the improvement is substantial.
The above is an optimization method for a weighted finite state transducer provided by the present invention. By compressing and merging storage structures it reduces conversions between different structures, greatly lowering memory and computation overhead; by optimizing the implementation of the WFST it greatly reduces the number of memory accesses, improving algorithm efficiency; and by optimizing the optimal-path search algorithm it drastically reduces the number of traversed nodes, achieving a substantial efficiency gain.
From comparative experiments on a PC it can be concluded that GetBestPath takes 11.376 s before optimization and 1.479 s after. The total WFST time is 13.136 s before optimization and 2.95 s after; the time consumed by the optimized WFST is thus reduced by 77.54%, and the peak memory consumption is reduced by 47.3%. The outputs before and after optimization are completely identical, which shows by example that the present invention has no effect on decoding accuracy.
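The reported figures are internally consistent, as a quick check of the arithmetic shows (the values below are the ones quoted above; the speedup ratio is derived here and is not stated in the patent):

```python
# Figures reported in the patent's PC experiment:
before, after = 13.136, 2.95            # total WFST time, seconds
gbp_before, gbp_after = 11.376, 1.479   # GetBestPath time, seconds

reduction = (before - after) / before
print(f"{reduction:.2%}")                              # 77.54%, as stated
print(f"GetBestPath speedup: {gbp_before / gbp_after:.1f}x")  # about 7.7x
```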
The present invention also provides a computer device, comprising: a processor adapted to execute instructions, and a storage device storing a plurality of instructions, the instructions being adapted to be loaded by the processor to execute an optimization method for a weighted finite state transducer.
The present invention also provides a computer-readable storage medium storing a computer program, the computer program being used to execute an optimization method for a weighted finite state transducer.
An improvement to a technology can be clearly distinguished as a hardware improvement (for example, an improvement to circuit structures such as diodes, transistors and switches) or a software improvement (an improvement to a method flow). With the development of technology, however, many of today's method-flow improvements can be regarded as direct improvements to hardware circuit structures: designers almost always obtain the corresponding hardware circuit by programming the improved method flow into a hardware circuit. Therefore it cannot be said that an improvement to a method flow cannot be realized by a hardware entity module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is exactly such an integrated circuit, whose logic function is determined by the user's programming of the device. A designer can, by programming, "integrate" a digital system onto a piece of PLD, without asking a chip maker to design and manufacture a dedicated integrated circuit chip. Moreover, nowadays, instead of manually making integrated circuit chips, this programming is mostly done with "logic compiler" software, which is similar to the software compilers used in program development; the source code before compilation is likewise written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware Description Language); at present the most widely used are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art will appreciate that a hardware circuit implementing a logical method flow can be readily obtained merely by lightly programming the method flow in logic using one of the above hardware description languages and programming it into an integrated circuit.
A controller can be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20 and Silicone Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory.
Those skilled in the art also know that, besides implementing a controller purely as computer-readable program code, it is entirely possible, by logically programming the method steps, to make the controller realize the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be regarded as a hardware component, and the devices included in it for realizing various functions may also be regarded as structures within the hardware component; or even, the devices for realizing various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function.
For convenience of description, the above device is described by dividing its functions into various units. Of course, when implementing the present application, the functions of the units may be realized in one or more pieces of software and/or hardware.
From the description of the embodiments above, those skilled in the art can clearly understand that the present application can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence, or the part contributing over the prior art, can be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as ROM/RAM, magnetic disk or optical disc, and include several instructions that cause a computer system (which may be a personal computer, a server, a network system, etc.) to execute the methods described in the embodiments, or in certain parts of the embodiments, of the present application.
The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the others. In particular, since the system embodiment is substantially similar to the method embodiment, its description is relatively simple, and relevant details can be found in the description of the method embodiment.
The present application can be used in numerous general-purpose or special-purpose computing system environments or configurations, such as: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices.
The present application can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform specific tasks or implement specific abstract data types. The present application can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
Although the present application has been depicted through embodiments, those of ordinary skill in the art will appreciate that the present application has many variations and changes that do not depart from its spirit, and it is intended that the appended claims cover these variations and changes without departing from the spirit of the present application.
Claims (10)
1. An optimization method for a weighted finite state transducer, characterized in that the method includes:
optimizing a Token data structure;
performing transition and chaining on an acoustic output and a language model with the optimized Token data structure, to obtain a weighted finite state transducer;
pruning the weighted finite state transducer;
searching the search space corresponding to the weighted finite state transducer with the optimized Token data structure by traversing nodes, to obtain an optimal path.
2. The method according to claim 1, characterized in that optimizing the Token data structure includes:
adding to the Token data structure a Token*last field that stores the address of the previous-frame node;
adding to the Token data structure a tot_cost variable that stores the cost value of each path.
3. The method according to claim 1, characterized in that the method further includes:
inputting speech into an acoustic model to obtain the acoustic output.
4. The method according to claim 2, characterized in that performing transition and chaining on the acoustic output and the language model with the optimized Token data structure includes:
obtaining the first frame of the acoustic output as the current frame, the acoustic output being composed of multiple frames;
obtaining, in a single lookup in the language model, the empty transitions and non-empty transitions corresponding to the current frame;
performing transition and chaining according to the empty transitions corresponding to the current frame;
performing transition and chaining according to the non-empty transitions corresponding to the current frame;
traversing the multiple frames of the acoustic output, and storing the previous-frame node address in the Token*last field.
5. The method according to claim 4, characterized in that performing transition and chaining according to the empty transitions corresponding to the current frame includes:
judging whether the empty transition corresponding to the current frame satisfies a first threshold, the first threshold including a fixed threshold and a dynamic threshold;
when the judgment is yes, chaining the current frame;
when the judgment is no, discarding the current frame.
6. The method according to claim 4, characterized in that performing transition and chaining according to the non-empty transitions corresponding to the current frame includes:
judging whether the non-empty transition corresponding to the current frame satisfies a second threshold, the second threshold including a fixed threshold and a dynamic threshold;
when the judgment is yes, chaining the current frame;
when the judgment is no, discarding the current frame.
7. The method according to claim 4, characterized in that searching the search space corresponding to the weighted finite state transducer with the optimized Token data structure by traversing nodes, to obtain the optimal path, includes:
sorting the Token linked lists within each frame so that the lists are laid out in order, traversing the intra-frame and inter-frame Token lists in sequence, accumulating the weights along each Token chain according to the pointer relationships to obtain the cost value of each path, storing it in the tot_cost variable, and simultaneously recording the pointer to the father node in Token*last;
taking the Token linked list of the last frame of the pruned weighted finite state transducer, adding the finalcost value to the cost of each node, and comparing the costs of the nodes;
taking the node with the smallest cost and backtracking along the links to the first node, thereby obtaining the optimal path.
8. An optimization system for a weighted finite state transducer, characterized in that the system comprises:
a data structure optimization module, configured to optimize a Token data structure;
a transducer modeling module, configured to perform transition and chaining on an acoustic output and a language model with the optimized Token data structure, to obtain a weighted finite state transducer;
a transducer pruning module, configured to prune the weighted finite state transducer;
an optimal path search module, configured to search the search space corresponding to the weighted finite state transducer with the optimized Token data structure by traversing nodes, to obtain an optimal path.
9. A computer device, characterized by comprising: a processor adapted to execute instructions, and a storage device storing a plurality of instructions, the instructions being adapted to be loaded by the processor to execute an optimization method for a weighted finite state transducer according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that it stores a computer program, the computer program being used to execute an optimization method for a weighted finite state transducer according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910271141.XA CN110047477B (en) | 2019-04-04 | 2019-04-04 | Optimization method, equipment and system of weighted finite state converter |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110047477A true CN110047477A (en) | 2019-07-23 |
CN110047477B CN110047477B (en) | 2021-04-09 |
Family
ID=67276250
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910271141.XA Active CN110047477B (en) | 2019-04-04 | 2019-04-04 | Optimization method, equipment and system of weighted finite state converter |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110047477B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111933119A (en) * | 2020-08-18 | 2020-11-13 | 北京字节跳动网络技术有限公司 | Method, apparatus, electronic device, and medium for generating voice recognition network |
CN111968648A (en) * | 2020-08-27 | 2020-11-20 | 北京字节跳动网络技术有限公司 | Voice recognition method and device, readable medium and electronic equipment |
CN112259082A (en) * | 2020-11-03 | 2021-01-22 | 苏州思必驰信息科技有限公司 | Real-time voice recognition method and system |
CN112989136A (en) * | 2021-04-19 | 2021-06-18 | 河南科技大学 | Simplification method and system of finite state automatic machine |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102968989A (en) * | 2012-12-10 | 2013-03-13 | 中国科学院自动化研究所 | Improvement method of Ngram model for voice recognition |
US10008200B2 (en) * | 2013-12-24 | 2018-06-26 | Kabushiki Kaisha Toshiba | Decoder for searching a path according to a signal sequence, decoding method, and computer program product |
CN108694939A (en) * | 2018-05-23 | 2018-10-23 | 广州视源电子科技股份有限公司 | voice search optimization method, device and system |
CN108962271A (en) * | 2018-06-29 | 2018-12-07 | 广州视源电子科技股份有限公司 | Multi-weighted finite state transducer merging method, device, equipment and storage medium |
WO2018232591A1 (en) * | 2017-06-20 | 2018-12-27 | Microsoft Technology Licensing, Llc. | Sequence recognition processing |
2019-04-04: application CN201910271141.XA filed (CN); granted as patent CN110047477B (en), status Active
Non-Patent Citations (3)
Title |
---|
ZHEHUAI CHEN, JUSTIN LUITJENS, HAINAN XU: "A GPU-based WFST Decoder with Exact Lattice Generation", Interspeech *
Ding Jiawei, Liu Jia, Zhang Weiqiang: "Inactive node detection and memory optimization in the lattice generation algorithm of a WFST decoder", Journal of University of Chinese Academy of Sciences *
Yao Yu, Ryad Chellali: "End-to-end Chinese speech recognition system based on bidirectional LSTM-CTC and weighted finite state transducers", Journal of Computer Applications *
Also Published As
Publication number | Publication date |
---|---|
CN110047477B (en) | 2021-04-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110047477A (en) | A kind of optimization method, equipment and the system of weighted finite state interpreter | |
CN105893349B (en) | Classification tag match mapping method and device | |
US20210089874A1 (en) | Ultra-low power keyword spotting neural network circuit | |
CN107832844A (en) | A kind of information processing method and Related product | |
CN109033303B (en) | Large-scale knowledge graph fusion method based on reduction anchor points | |
CN107832432A (en) | A kind of search result ordering method, device, server and storage medium | |
CN108735201A (en) | continuous speech recognition method, device, equipment and storage medium | |
EP3970012A1 (en) | Scheduling operations on a computation graph | |
US20170323638A1 (en) | System and method of automatic speech recognition using parallel processing for weighted finite state transducer-based speech decoding | |
CN111341299B (en) | Voice processing method and device | |
US20220004547A1 (en) | Method, apparatus, system, device, and storage medium for answering knowledge questions | |
JP2010044637A (en) | Data processing apparatus, method, and program | |
CN109119067A (en) | Phoneme synthesizing method and device | |
WO2017177484A1 (en) | Voice recognition-based decoding method and device | |
JP2022031863A (en) | Word slot recognition method, device and electronic apparatus | |
CN115455171A (en) | Method, device, equipment and medium for mutual retrieval and model training of text videos | |
CN113360683B (en) | Method for training cross-modal retrieval model and cross-modal retrieval method and device | |
CN116401502A (en) | Method and device for optimizing Winograd convolution based on NUMA system characteristics | |
CN114490922B (en) | Natural language understanding model training method and device | |
CN113689868B (en) | Training method and device of voice conversion model, electronic equipment and medium | |
CN113486659B (en) | Text matching method, device, computer equipment and storage medium | |
CN114330717A (en) | Data processing method and device | |
WO2023093909A1 (en) | Workflow node recommendation method and apparatus | |
US20210034987A1 (en) | Auxiliary handling of metadata and annotations for a question answering system | |
CN115757735A (en) | Intelligent retrieval method and system for power grid digital construction result resources |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| CB03 | Change of inventor or designer information | |
Inventors after: Sun Hao; OuYang Peng; Li Xiudong; Wang Bo. Inventors before: Sun Hao; OuYang Peng; Yin Shouyi; Li Xiudong; Wang Bo.