CN112002310B - Domain language model construction method, device, computer equipment and storage medium

Domain language model construction method, device, computer equipment and storage medium

Info

Publication number
CN112002310B
CN112002310B (application CN202010669031.1A)
Authority
CN
China
Prior art keywords
wfsa
network
domain
language model
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010669031.1A
Other languages
Chinese (zh)
Other versions
CN112002310A (en)
Inventor
张旭华
齐欣
孙泽明
朱林林
王宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suning Cloud Computing Co Ltd
Original Assignee
Suning Cloud Computing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suning Cloud Computing Co Ltd filed Critical Suning Cloud Computing Co Ltd
Priority to CN202010669031.1A priority Critical patent/CN112002310B/en
Publication of CN112002310A publication Critical patent/CN112002310A/en
Priority to PCT/CN2021/099661 priority patent/WO2022012238A1/en
Application granted granted Critical
Publication of CN112002310B publication Critical patent/CN112002310B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0631 Creating reference templates; Clustering

Abstract

The invention discloses a method, an apparatus, a computer device and a storage medium for constructing a domain language model, belonging to the technical field of speech recognition. The method comprises the following steps: converting a general language model into an equivalent first WFSA network; screening optimal paths that satisfy preset conditions from the first WFSA network according to a preset number of domain corpora to construct a second WFSA network; and normalizing the second WFSA network and converting the normalized second WFSA network into a domain language model. With this method, a domain language model that fits a specific scenario and retains general generalization capability can be constructed quickly even when domain training corpora are insufficient.

Description

Domain language model construction method, device, computer equipment and storage medium
Technical Field
The present invention relates to the field of speech recognition technology, and in particular, to a method and apparatus for constructing a domain language model, a computer device, and a storage medium.
Background
Most speech recognition schemes are based on language models. The most commonly used model for training a language model is the N-Gram model, a statistical language model whose quality generally improves as the training corpus grows. As application scenarios become more specialized, language models that fit a specific scenario while retaining generalization capability are increasingly needed, which places higher demands on corpus selection.
At present, there are two common methods for constructing a language model that fits a specific scenario: one is to collect relevant domain corpora and train on them directly; the other is to fuse a language model trained on domain corpora with a general language model according to a certain weight to increase generalization capability. Both methods require a large amount of domain training corpora, yet domain corpora that fit the scenario are not easy to obtain.
Disclosure of Invention
In order to solve the problems in the prior art, the embodiments of the invention provide a method, an apparatus, a computer device and a storage medium for constructing a domain language model, which can quickly construct a domain language model that fits a specific scenario and retains general generalization capability even when domain training corpora are insufficient.
In a first aspect, a method for constructing a domain language model is provided, where the method includes:
converting the universal language model into an equivalent first WFSA network;
screening optimal paths meeting preset conditions from the first WFSA network according to a preset number of domain corpora to construct a second WFSA network;
normalizing the second WFSA network, and converting the normalized second WFSA network into a domain language model.
Further, the screening of optimal paths satisfying preset conditions from the first WFSA network according to the preset number of domain corpora to construct the second WFSA network includes:
searching a preset number of candidate optimal paths in the first WFSA network for each domain corpus; and
screening out, from the preset number of candidate optimal paths, the optimal paths corresponding to the domain corpus, wherein the probability on the outgoing arcs of every state node of each optimal path exceeds a preset threshold;
and constructing the second WFSA network according to the optimal path corresponding to each domain corpus.
Further, the searching of a preset number of candidate optimal paths in the first WFSA network for each domain corpus includes:
for each domain corpus, inputting the domain corpus into the first WFSA network for searching to obtain a plurality of candidate paths corresponding to the domain corpus and the path probabilities of the candidate paths;
and sorting the plurality of candidate paths corresponding to the domain corpus in descending order of path probability, and taking the top preset number of candidate paths as the candidate optimal paths of the domain corpus.
Further, the normalizing the second WFSA network includes:
and normalizing the probabilities on all the outgoing arcs of each state node in the second WFSA network according to the number of outgoing arcs of each state node in the second WFSA network and the probability on each outgoing arc.
Further, the general language model and the domain language model are both N-Gram language models.
In a second aspect, there is provided a domain language model construction apparatus, the apparatus comprising:
the first conversion module is used for converting the universal language model into an equivalent first WFSA network;
the construction module is used for screening out an optimal path meeting preset conditions from the first WFSA network according to the preset number of domain corpora so as to construct a second WFSA network;
the normalization module is used for normalizing the second WFSA network;
and the second conversion module is used for converting the normalized second WFSA network into a domain language model.
Further, the construction module includes:
the searching sub-module is used for searching a preset number of candidate optimal paths in the first WFSA network for each domain corpus;
the screening sub-module is used for screening out, from the preset number of candidate optimal paths, the optimal paths corresponding to the domain corpus, wherein the probability on the outgoing arc of every state node of each optimal path exceeds a preset threshold;
and the construction submodule is used for constructing the second WFSA network according to the optimal path corresponding to each domain corpus.
Further, the searching submodule is specifically configured to:
for each domain corpus, inputting the domain corpus into the first WFSA network for searching to obtain a plurality of candidate paths corresponding to the domain corpus and the path probabilities of the candidate paths;
and sorting the plurality of candidate paths corresponding to the domain corpus in descending order of path probability, and taking the top preset number of candidate paths as the candidate optimal paths of the domain corpus.
Further, the normalization module is specifically configured to:
and normalizing the probabilities on all the outgoing arcs of each state node in the second WFSA network according to the number of outgoing arcs of each state node in the second WFSA network and the probability on each outgoing arc.
Further, the general language model and the domain language model are both N-Gram language models.
In a third aspect, there is provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps:
converting the universal language model into an equivalent first WFSA network;
screening optimal paths meeting preset conditions from the first WFSA network according to a preset number of domain corpora to construct a second WFSA network;
normalizing the second WFSA network, and converting the normalized second WFSA network into a domain language model.
In a fourth aspect, there is provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the following steps:
converting the universal language model into an equivalent first WFSA network;
screening optimal paths meeting preset conditions from the first WFSA network according to a preset number of domain corpora to construct a second WFSA network;
normalizing the second WFSA network, and converting the normalized second WFSA network into a domain language model.
The invention provides a method, an apparatus, a computer device and a storage medium for constructing a domain language model. A general language model is first converted into an equivalent first WFSA network; optimal paths satisfying preset conditions are then screened from the first WFSA network according to a preset number of domain corpora to construct a second WFSA network; finally, the second WFSA network is normalized and the normalized second WFSA network is converted into a domain language model. Because the paths used to construct the second WFSA network are screened, for the preset number of domain corpora, from the first WFSA network converted from the general language model, a domain language model that fits a specific scenario and retains general generalization capability can be constructed quickly even when domain training corpora are insufficient.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for describing the embodiments are briefly introduced below. It is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained from these drawings by a person skilled in the art without inventive effort.
FIG. 1 shows a flowchart of a method for building a domain language model according to an embodiment of the present invention;
FIG. 2 is a specific flowchart of step S2 shown in FIG. 1;
FIG. 3 is a diagram showing a construction apparatus for a domain language model according to an embodiment of the present invention;
fig. 4 shows an internal structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It is noted that, unless the context clearly requires otherwise, the words "comprise," "comprising," and the like throughout the specification and the claims should be construed in an inclusive sense rather than an exclusive or exhaustive sense, that is, in the sense of "including but not limited to". Furthermore, in the description of the present invention, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, unless otherwise indicated, "a plurality" means two or more.
As described in the background above, there are two common methods for constructing a language model that fits a specific scenario: one is to collect relevant domain corpora and train on them directly, and the other is to fuse a language model trained on domain corpora with a general language model according to a certain weight to increase generalization capability. Both methods require a large amount of training corpora, yet corpora that fit the scenario are not easy to obtain. The domain language model in the embodiments of the present invention may be applied to a specific domain, where the specific domain may be the financial domain, the medical domain, the commodity domain, the logistics domain, or another specific domain, which is not specifically limited by the present invention.
Fig. 1 shows a flowchart of a domain language model construction method according to an embodiment of the present invention. In this embodiment, the execution body is a domain language model construction apparatus, which may be configured in any computer device; the computer device may be an independent server or a server cluster.
Referring to fig. 1, the method for constructing a domain language model provided by the present invention includes steps S1 to S4:
s1: the generic language model is converted to an equivalent first WFSA network.
The general language model may be a statistical language model, i.e., a probability distribution over sequences of words: for a given sequence of length m it produces a probability P(w1, w2, ..., wm). In essence, it seeks a probability distribution that can represent the probability of any sentence or sequence occurring, typically using conditional probabilities under the assumption that the current word depends only on the n words that precede it. N-Gram is an algorithm based on such a statistical language model and on the Markov assumption, namely: in a piece of text, the occurrence of the N-th word depends only on the preceding N-1 words and on no other words. Under this assumption, the probability of each word occurring in the text can be evaluated, and the probability of an entire sentence is the product of the probabilities of the individual words. These probabilities can be obtained by counting the number of co-occurrences of N words directly in the corpus; commonly used N-Gram models include the binary Bi-Gram and the ternary Tri-Gram.
The general language model can be generated in advance by training on a general corpus; the general corpus can be obtained by crawling Chinese text from the Internet with a web crawler or by directly downloading publicly available free Chinese corpora, and the storage format of the general language model can be the arpa format. It should be noted that training and updating the general language model is time-consuming, so it is generally done only once, with the aim of covering language phenomena as comprehensively as possible. The reason for using such a broad-coverage general language model rather than another domain model is that the general language model does not focus on any particular domain: it is a relatively smooth set of probabilities computed on a large amount of historical text, so it migrates more easily to the target domain and can reflect word-connection probabilities that are close to reality.
The first WFSA network is a directed graph structure with a plurality of state nodes; the state nodes are connected by arcs, each arc represents a transition between states, the arcs are directed, and each arc carries an input label and the probability of the corresponding state transition. The input label is a word; the probability on an arc characterizes the probability that the arc appears in a path. The first WFSA network may contain many paths, and the probability of each path can be computed as the product of the probabilities of all arcs on the path. When the probabilities are represented as weights on the arcs between state nodes, the weight value may be obtained by taking the logarithm of the probability.
Specifically, when converting the general language model in the arpa format into the first WFSA (Weighted Finite-State Automaton) network, the execution body may call the arpa2fst tool to convert the general language model into an equivalent first WFSA network. Of course, in practical applications, besides calling the arpa2fst tool, the equivalent first WFSA network may be obtained through other conversion methods, which is not limited in this embodiment.
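As a rough illustration of the bigram (Bi-Gram) case described above, the following Python sketch estimates conditional word probabilities from co-occurrence counts and scores a sentence as the product of those probabilities; the toy corpus, names and the absence of smoothing are the editor's simplifications, not part of the patent.

```python
from collections import Counter

# Toy corpus (hypothetical example); in practice the counts come from a large general corpus.
corpus = [
    ["<s>", "the", "weather", "is", "good", "</s>"],
    ["<s>", "the", "weather", "is", "bad", "</s>"],
    ["<s>", "today", "the", "weather", "is", "good", "</s>"],
]

# Count contexts (every token except the sentence-final one) and adjacent word pairs.
context_counts = Counter(w for sent in corpus for w in sent[:-1])
bigram_counts = Counter(pair for sent in corpus for pair in zip(sent, sent[1:]))

def bigram_prob(prev, word):
    """Maximum-likelihood estimate of P(word | prev); smoothing omitted for brevity."""
    return bigram_counts[(prev, word)] / context_counts[prev] if context_counts[prev] else 0.0

def sentence_prob(words):
    """Sentence probability as the product of its bigram probabilities."""
    prob = 1.0
    for prev, word in zip(words, words[1:]):
        prob *= bigram_prob(prev, word)
    return prob

print(sentence_prob(["<s>", "the", "weather", "is", "good", "</s>"]))  # ~0.44 on this toy corpus
```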
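For concreteness, here is a minimal sketch of such a network as a directed graph whose arcs carry a word label and a transition probability, with the path probability computed as the product of arc probabilities; the class and field names are illustrative assumptions, not a data format defined by the patent.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Arc:
    label: str       # input label: a word
    prob: float      # probability of this state transition
    next_state: int  # destination state node

@dataclass
class WFSA:
    start: int = 0
    finals: set = field(default_factory=set)   # end state nodes
    arcs: dict = field(default_factory=dict)   # state node -> list of outgoing Arcs

    def add_arc(self, src, label, prob, dst):
        self.arcs.setdefault(src, []).append(Arc(label, prob, dst))

def path_prob(path_arcs):
    """Path probability = product of arc probabilities (equivalently, a sum of log weights)."""
    return math.exp(sum(math.log(arc.prob) for arc in path_arcs))

# Tiny example path: state 0 --"today"--> 1 --"weather"--> 2 --"good"--> 3 (final)
g = WFSA(finals={3})
g.add_arc(0, "today", 0.3, 1)
g.add_arc(1, "weather", 0.5, 2)
g.add_arc(2, "good", 0.4, 3)
```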
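As one possible way to carry out this step, assuming Kaldi's arpa2fst tool is installed and on PATH (the patent does not fix a particular toolchain, and additional flags may be needed depending on the setup), the conversion could be invoked from Python as follows:

```python
import subprocess

def arpa_to_wfsa(arpa_path: str, fst_path: str) -> None:
    """Convert an ARPA-format general language model into an equivalent FST/WFSA file."""
    # Basic positional usage of arpa2fst: input ARPA file, output FST file.
    subprocess.run(["arpa2fst", arpa_path, fst_path], check=True)

arpa_to_wfsa("general_lm.arpa", "general_lm.fst")
```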
S2: and screening the optimal paths meeting the preset conditions from the first WFSA network according to the preset number of domain corpora to construct a second WFSA network.
The domain corpus can be common words and sentences, professional words and sentences and the like in a specific domain.
The preset number of domain corpora may be set relatively small; it can be understood that the number of samples in the preset number of domain corpora is much smaller than that of the general corpus.
In this embodiment, multiple paths may be searched in the first WFSA network for each domain corpus, and one or more optimal paths satisfying the preset conditions may be obtained for each domain corpus through screening. For example, the optimal path may be the path with the highest path probability, and the word sequence corresponding to each domain corpus may then be obtained from its optimal path.
The preset condition is a condition set in advance for determining the optimal path. In a specific application, the preset condition may be set as follows: when the probability on the outgoing arc of every state node on a path exceeds a preset threshold, the path is an optimal path. Alternatively, the preset condition may be set as: when the sum of the probabilities on all the outgoing arcs of a path exceeds a preset threshold, the path is an optimal path.
Specifically, as shown in fig. 2, the implementation procedure of step S2 may include the steps of:
s21: aiming at each domain corpus, searching a preset number of candidate optimal paths corresponding to the domain corpus in the first WFSA network.
The preset number may be set to an integer value according to actual needs, and the specific preset number is not limited in this embodiment.
Specifically, the process may include:
for each domain corpus, inputting the domain corpus into the first WFSA network for searching to obtain a plurality of candidate paths corresponding to the domain corpus and the path probabilities of the candidate paths; and sorting the plurality of candidate paths corresponding to the domain corpus in descending order of path probability, and taking the top preset number of candidate paths as the candidate optimal paths of the domain corpus.
By way of example, assuming that the domain corpus "today the weather is good" is input into the first WFSA network, the following two candidate optimal paths may be found:
PATH1: <s> today the weather is good
PATH2: <s> today the weather is good
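A simplified sketch of this top-k selection, reusing the toy WFSA class and path_prob function from the earlier sketch and ignoring epsilon/backoff arcs (a real first WFSA network derived from an N-Gram model would contain them), might look like this:

```python
def accepting_paths(wfsa, words):
    """Enumerate all paths through the WFSA whose arc labels spell out the given word sequence."""
    paths = []

    def walk(state, idx, arcs_so_far):
        if idx == len(words):
            if state in wfsa.finals:
                paths.append(arcs_so_far)
            return
        for arc in wfsa.arcs.get(state, []):
            if arc.label == words[idx]:
                walk(arc.next_state, idx + 1, arcs_so_far + [arc])

    walk(wfsa.start, 0, [])
    return paths

def top_k_candidate_paths(wfsa, words, k):
    """Sort candidate paths by path probability in descending order and keep the top k."""
    scored = [(path_prob(p), p) for p in accepting_paths(wfsa, words)]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]
```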
S22: screen out, from the preset number of candidate optimal paths, the optimal path corresponding to the domain corpus, where the probability on the outgoing arc of every state node of the optimal path exceeds a preset threshold.
In this embodiment, after the one or more candidate optimal paths corresponding to a given domain corpus are found, a candidate optimal path is taken as the optimal path corresponding to the domain corpus when the probability on the outgoing arc of every state node on that candidate path exceeds the preset threshold.
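Continuing the same sketch, the per-arc check described above could be expressed as follows (the threshold value is purely illustrative):

```python
def passes_arc_threshold(path_arcs, threshold=0.01):
    """A candidate path qualifies only if the probability on every one of its arcs exceeds the threshold."""
    return all(arc.prob > threshold for arc in path_arcs)

def select_optimal_paths(scored_candidates, threshold=0.01):
    """Keep the candidate optimal paths on which every outgoing-arc probability clears the threshold."""
    return [path for prob, path in scored_candidates if passes_arc_threshold(path, threshold)]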
S23: construct the second WFSA network according to the optimal path corresponding to each domain corpus.
Specifically, a second WFSA network containing only an initial state node and an end state node may be constructed in advance. Each time the optimal path corresponding to a domain corpus is obtained, that path is added into the second WFSA network, until the optimal path corresponding to the last domain corpus has been added, at which point the construction of the second WFSA network is complete.
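One way to picture this incremental construction, again with the toy WFSA class (state sharing between paths is omitted for brevity; the patent does not prescribe a particular data structure):

```python
def build_second_wfsa(optimal_paths):
    """Start from a network with only an initial state (0) and an end state (1),
    then add the arcs of each selected optimal path, one path at a time."""
    second = WFSA(start=0, finals={1})
    next_state = 2
    for path_arcs in optimal_paths:
        state = second.start
        for i, arc in enumerate(path_arcs):
            last = (i == len(path_arcs) - 1)
            dst = 1 if last else next_state
            second.add_arc(state, arc.label, arc.prob, dst)
            if not last:
                state = dst
                next_state += 1
    return second
```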
S3: the second WFSA network is normalized.
Specifically, according to the number of outgoing arcs of each state node in the second WFSA network and the probability on each outgoing arc, the probabilities on all outgoing arcs of each state node in the second WFSA network are normalized so that, for every state node, the probabilities on all of its outgoing arcs sum to 1.
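A minimal sketch of this normalization over the same toy structure (the only requirement stated here is that each state node's outgoing probabilities sum to 1):

```python
def normalize(wfsa):
    """Rescale the probabilities on every state node's outgoing arcs so that they sum to 1."""
    for state, out_arcs in wfsa.arcs.items():
        total = sum(arc.prob for arc in out_arcs)
        if total > 0:
            for arc in out_arcs:
                arc.prob /= total
    return wfsa
```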
S4: and converting the normalized second WFSA network into a domain language model.
When the general language model is an N-Gram model, the domain language model is an N-Gram model of the same order as the general language model.
Specifically, the execution body may call the fsts-to-transgressions tool to convert the normalized second WFSA network into an N-Gram model in the arpa format, thereby obtaining the domain language model. In addition, besides calling this tool, the domain language model may be obtained through other conversion methods, which is not limited in this embodiment.
The invention provides a domain language model construction method in which a general language model is converted into an equivalent first WFSA network; optimal paths satisfying preset conditions are then screened from the first WFSA network according to a preset number of domain corpora to construct a second WFSA network; finally, the second WFSA network is normalized and the normalized second WFSA network is converted into a domain language model. Because the paths used to construct the second WFSA network are screened, for the preset number of domain corpora, from the first WFSA network converted from the general language model, a language model that fits a specific scenario and retains general generalization capability can be constructed quickly even when training corpora are insufficient.
Fig. 3 shows a block diagram of a domain language model construction device according to an embodiment of the present invention, and referring to fig. 3, the device includes:
a first conversion module 31, configured to convert the generic language model into an equivalent first WFSA network;
a construction module 32, configured to screen out an optimal path satisfying a preset condition from the first WFSA network according to a preset number of domain corpora, so as to construct a second WFSA network;
a normalization module 33, configured to normalize the second WFSA network;
and the second conversion module 34 is configured to convert the normalized second WFSA network into a domain language model.
In one embodiment, the construction module 32 includes:
a searching sub-module 321, configured to search, for each domain corpus, a preset number of candidate optimal paths corresponding to the domain corpus in the first WFSA network;
the screening sub-module 322 is configured to screen out, from the preset number of candidate optimal paths, the optimal path corresponding to the domain corpus, where the probability on the outgoing arc of every state node of the optimal path exceeds a preset threshold;
and a construction sub-module 323, configured to construct the second WFSA network according to the optimal path corresponding to each domain corpus.
In one embodiment, the search sub-module 321 is specifically configured to:
for each domain corpus, inputting the domain corpus into the first WFSA network for searching to obtain a plurality of candidate paths corresponding to the domain corpus and the path probabilities of the candidate paths;
and sorting the plurality of candidate paths corresponding to the domain corpus in descending order of path probability, and taking the top preset number of candidate paths as the candidate optimal paths of the domain corpus.
In one embodiment, the normalization module 33 is specifically configured to:
and normalizing the probabilities on all the outgoing arcs of each state node in the second WFSA network according to the number of outgoing arcs of each state node in the second WFSA network and the probability on each outgoing arc, so that for every state node in the second WFSA network the probabilities on all of its outgoing arcs sum to 1.
In one embodiment, the generic language model and the domain language model are both N-Gram language models.
The domain language model construction apparatus provided by this embodiment belongs to the same inventive concept as the domain language model construction method provided by the embodiments of the present invention; it can execute the domain language model construction method provided by any embodiment of the present invention and has the functional modules and beneficial effects corresponding to executing that method. For technical details not described in detail in this embodiment, reference may be made to the domain language model construction method provided by the embodiments of the present invention, which is not repeated here.
Fig. 4 shows an internal structural diagram of a computer device according to an embodiment of the present invention. The computer device may be a server, the internal structure of which may be as shown in fig. 4. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements a domain language model building method.
In one embodiment, there is provided a computer device comprising:
one or more processors;
a storage means for storing one or more programs;
the following steps are implemented when one or more programs are executed by one or more processors, causing the one or more processors to execute the computer program:
converting the universal language model into an equivalent first WFSA network;
screening optimal paths meeting preset conditions from the first WFSA network according to the preset number of domain corpora to construct a second WFSA network;
normalizing the second WFSA network, and converting the normalized second WFSA network into a domain language model.
In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon which, when executed by a processor, implements the following steps:
converting the universal language model into an equivalent first WFSA network;
screening optimal paths meeting preset conditions from the first WFSA network according to the preset number of domain corpora to construct a second WFSA network;
normalizing the second WFSA network, and converting the normalized second WFSA network into a domain language model.
Those skilled in the art will appreciate that all or part of the processes of the methods in the above embodiments may be implemented by a computer program instructing the relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The foregoing embodiments represent only a few implementations of the present application and are described in relative detail, but they are not to be construed as limiting the scope of the invention. It should be noted that various modifications and improvements can be made by those of ordinary skill in the art without departing from the concept of the present application, and these all fall within the protection scope of the present application. Accordingly, the protection scope of the present application shall be determined by the appended claims.

Claims (8)

1. A method for building a domain language model, the method comprising:
converting the universal language model into an equivalent first WFSA network;
screening optimal paths meeting preset conditions from the first WFSA network according to a preset number of domain corpora to construct a second WFSA network;
normalizing the second WFSA network, and converting the normalized second WFSA network into a domain language model;
wherein the screening of optimal paths satisfying preset conditions from the first WFSA network according to the preset number of domain corpora to construct the second WFSA network includes:
searching a preset number of candidate optimal paths in the first WFSA network for each domain corpus; and
screening out, from the preset number of candidate optimal paths, the optimal paths corresponding to the domain corpus, wherein the probability on the outgoing arcs of every state node of each optimal path exceeds a preset threshold;
and constructing the second WFSA network according to the optimal path corresponding to each domain corpus.
2. The method of claim 1, wherein the searching a preset number of candidate optimal paths in the first WFSA network for each of the domain corpora comprises:
inputting, for each domain corpus, the domain corpus into the first WFSA network for searching to obtain a plurality of candidate paths corresponding to the domain corpus and the path probabilities of the candidate paths;
and sorting the plurality of candidate paths corresponding to the domain corpus in descending order of path probability, and taking the top preset number of candidate paths as the candidate optimal paths of the domain corpus.
3. The method of claim 1 or 2, wherein normalizing the second WFSA network comprises:
and normalizing the probabilities on all the outgoing arcs of each state node in the second WFSA network according to the number of outgoing arcs of each state node in the second WFSA network and the probability on each outgoing arc.
4. The method of claim 1, wherein the generic language model and the domain language model are both N-Gram language models.
5. A domain language model construction apparatus, the apparatus comprising:
the first conversion module is used for converting the universal language model into an equivalent first WFSA network;
the construction module is used for screening out an optimal path meeting preset conditions from the first WFSA network according to the preset number of domain corpora so as to construct a second WFSA network;
the normalization module is used for normalizing the second WFSA network;
the second conversion module is used for converting the normalized second WFSA network into a domain language model;
wherein the construction module comprises:
the searching sub-module is used for searching a preset number of candidate optimal paths in the first WFSA network for each domain corpus;
the screening sub-module is used for screening out, from the preset number of candidate optimal paths, the optimal paths corresponding to the domain corpus, wherein the probability on the outgoing arc of every state node of each optimal path exceeds a preset threshold;
and the construction submodule is used for constructing the second WFSA network according to the optimal path corresponding to each domain corpus.
6. The apparatus of claim 5, wherein the search submodule is specifically configured to:
inputting, for each domain corpus, the domain corpus into the first WFSA network for searching to obtain a plurality of candidate paths corresponding to the domain corpus and the path probabilities of the candidate paths;
and sorting the plurality of candidate paths corresponding to the domain corpus in descending order of path probability, and taking the top preset number of candidate paths as the candidate optimal paths of the domain corpus.
7. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the domain language model construction method of any one of claims 1 to 4 when the computer program is executed.
8. A computer-readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the domain language model construction method of any one of claims 1 to 4.
CN202010669031.1A 2020-07-13 2020-07-13 Domain language model construction method, device, computer equipment and storage medium Active CN112002310B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010669031.1A CN112002310B (en) 2020-07-13 2020-07-13 Domain language model construction method, device, computer equipment and storage medium
PCT/CN2021/099661 WO2022012238A1 (en) 2020-07-13 2021-06-11 Method and apparatus for constructing domain language model, and computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010669031.1A CN112002310B (en) 2020-07-13 2020-07-13 Domain language model construction method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112002310A CN112002310A (en) 2020-11-27
CN112002310B true CN112002310B (en) 2024-03-26

Family

ID=73466859

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010669031.1A Active CN112002310B (en) 2020-07-13 2020-07-13 Domain language model construction method, device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112002310B (en)
WO (1) WO2022012238A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112002310B (en) * 2020-07-13 2024-03-26 苏宁云计算有限公司 Domain language model construction method, device, computer equipment and storage medium
CN112614023A (en) * 2020-12-25 2021-04-06 东北大学 Formalized security verification method for electronic contract
CN113782001B (en) * 2021-11-12 2022-03-08 深圳市北科瑞声科技股份有限公司 Specific field voice recognition method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968989A (en) * 2012-12-10 2013-03-13 中国科学院自动化研究所 Improvement method of Ngram model for voice recognition
JP2015152661A (en) * 2014-02-12 2015-08-24 日本電信電話株式会社 Weighted finite state automaton creation device, symbol string conversion device, voice recognition device, methods thereof and programs
JP2017097451A (en) * 2015-11-18 2017-06-01 富士通株式会社 Information processing method, information processing program, and information processing device
CN110472223A (en) * 2018-05-10 2019-11-19 北京搜狗科技发展有限公司 A kind of input configuration method, device and electronic equipment
CN111243599A (en) * 2020-01-13 2020-06-05 网易有道信息技术(北京)有限公司 Speech recognition model construction method, device, medium and electronic equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101394253B1 (en) * 2012-05-16 2014-05-13 광주과학기술원 Apparatus for correcting error of speech recognition
US20150254233A1 (en) * 2014-03-06 2015-09-10 Nice-Systems Ltd Text-based unsupervised learning of language models
US9972311B2 (en) * 2014-05-07 2018-05-15 Microsoft Technology Licensing, Llc Language model optimization for in-domain application
US9672810B2 (en) * 2014-09-26 2017-06-06 Intel Corporation Optimizations to decoding of WFST models for automatic speech recognition
CN112002310B (en) * 2020-07-13 2024-03-26 苏宁云计算有限公司 Domain language model construction method, device, computer equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968989A (en) * 2012-12-10 2013-03-13 中国科学院自动化研究所 Improvement method of Ngram model for voice recognition
JP2015152661A (en) * 2014-02-12 2015-08-24 日本電信電話株式会社 Weighted finite state automaton creation device, symbol string conversion device, voice recognition device, methods thereof and programs
JP2017097451A (en) * 2015-11-18 2017-06-01 富士通株式会社 Information processing method, information processing program, and information processing device
CN110472223A (en) * 2018-05-10 2019-11-19 北京搜狗科技发展有限公司 A kind of input configuration method, device and electronic equipment
CN111243599A (en) * 2020-01-13 2020-06-05 网易有道信息技术(北京)有限公司 Speech recognition model construction method, device, medium and electronic equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Language model adaptation using WFST-based speaking-style translation; T. Hori et al.; IEEE; full text *
A language model adaptation framework for broadcast speech recognition; 王晓瑞, 丁鹏, 梁家恩, 徐波; Journal of Chinese Information Processing (Issue 04); full text *
Research on a WFST-based Chinese speech recognition decoder; 范书平; China Masters' Theses Full-text Database, Information Science and Technology (Issue 03); full text *

Also Published As

Publication number Publication date
WO2022012238A1 (en) 2022-01-20
CN112002310A (en) 2020-11-27

Similar Documents

Publication Publication Date Title
CN112002310B (en) Domain language model construction method, device, computer equipment and storage medium
CN111444311A (en) Semantic understanding model training method and device, computer equipment and storage medium
CN110704588A (en) Multi-round dialogue semantic analysis method and system based on long-term and short-term memory network
CN109902301B (en) Deep neural network-based relationship reasoning method, device and equipment
CN110689881B (en) Speech recognition method, speech recognition device, computer equipment and storage medium
CN111583911B (en) Speech recognition method, device, terminal and medium based on label smoothing
CN110175273B (en) Text processing method and device, computer readable storage medium and computer equipment
CN112199473A (en) Multi-turn dialogue method and device in knowledge question-answering system
CN106843523B (en) Character input method and device based on artificial intelligence
CN109086348B (en) Hyperlink processing method and device and storage medium
CN111462751A (en) Method, apparatus, computer device and storage medium for decoding voice data
CN112733911A (en) Entity recognition model training method, device, equipment and storage medium
CN112687266A (en) Speech recognition method, speech recognition device, computer equipment and storage medium
JP2021524095A (en) Text-level text translation methods and equipment
CN113779214A (en) Automatic generation method and device of jump condition, computer equipment and storage medium
CN113343711A (en) Work order generation method, device, equipment and storage medium
CN115062619B (en) Chinese entity linking method, device, equipment and storage medium
CN115497484B (en) Voice decoding result processing method, device, equipment and storage medium
CN115391512A (en) Training method, device, equipment and storage medium of dialogue language model
CN113704466B (en) Text multi-label classification method and device based on iterative network and electronic equipment
CN111783435A (en) Shared vocabulary selection method and device and storage medium
CN112735392B (en) Voice processing method, device, equipment and storage medium
CN115862616A (en) Speech recognition method
CN113571052A (en) Noise extraction and instruction identification method and electronic equipment
CN112487811B (en) Cascading information extraction system and method based on reinforcement learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant