CN112002310A - Domain language model construction method and device, computer equipment and storage medium - Google Patents

Domain language model construction method and device, computer equipment and storage medium Download PDF

Info

Publication number
CN112002310A
Authority
CN
China
Prior art keywords
wfsa
network
domain
language model
optimal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010669031.1A
Other languages
Chinese (zh)
Other versions
CN112002310B (en)
Inventor
张旭华
齐欣
孙泽明
朱林林
王宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suning Cloud Computing Co Ltd
Original Assignee
Suning Cloud Computing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suning Cloud Computing Co Ltd filed Critical Suning Cloud Computing Co Ltd
Priority to CN202010669031.1A
Publication of CN112002310A
Priority to PCT/CN2021/099661 (WO2022012238A1)
Application granted
Publication of CN112002310B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0631 Creating reference templates; Clustering

Abstract

The invention discloses a domain language model construction method and apparatus, a computer device, and a storage medium, belonging to the technical field of speech recognition. The method comprises the following steps: converting a generic language model into an equivalent first WFSA network; screening optimal paths that meet a preset condition from the first WFSA network according to a preset number of domain corpora, so as to construct a second WFSA network; and normalizing the second WFSA network and converting the normalized second WFSA network into a domain language model. When domain training corpora are insufficient, the method can quickly construct a domain language model that fits a specific scenario while retaining general generalization capability.

Description

Domain language model construction method and device, computer equipment and storage medium
Technical Field
The invention relates to the technical field of speech recognition, and in particular to a method and an apparatus for constructing a domain language model, a computer device, and a storage medium.
Background
Most speech recognition schemes are based on language models. The most commonly used model for training is the N-Gram model, a statistical language model whose quality generally improves as the training corpus grows. As application scenarios become more specialized, language models that both meet the requirements of a specific scenario and retain generalization capability are increasingly needed, which places higher demands on corpus selection.
At present, there are two common methods for constructing a language model for a specific scenario: one is to collect related domain corpora and train on them directly; the other is to fuse a trained domain language model with a general language model according to certain weights to increase generalization capability. Both methods require a large amount of domain corpora, yet finding domain corpora that fit the scenario is not easy.
Disclosure of Invention
In order to solve the problems in the prior art, embodiments of the present invention provide a method and an apparatus for constructing a domain language model, a computer device, and a storage medium, which can quickly construct a domain language model that fits a specific scenario and retains general generalization capability even when domain training corpora are insufficient.
In a first aspect, a method for constructing a domain language model is provided, where the method includes:
converting the generic language model to an equivalent first WFSA network;
screening optimal paths that meet a preset condition from the first WFSA network according to a preset number of domain corpora, so as to construct a second WFSA network;
and normalizing the second WFSA network, and converting the normalized second WFSA network into a domain language model.
Further, the screening out of optimal paths meeting a preset condition from the first WFSA network according to a preset number of domain corpora to construct a second WFSA network includes:
for each domain corpus, searching a preset number of candidate optimal paths in the first WFSA network;
screening out the optimal paths corresponding to the domain corpora from the preset number of candidate optimal paths, wherein the probability on the outgoing arc of each state node of the optimal paths exceeds a preset threshold;
and constructing the second WFSA network according to the optimal path corresponding to each domain corpus.
Further, the searching a preset number of candidate optimal paths in the first WFSA network for each of the domain corpora includes:
for each domain corpus, inputting the domain corpus into the first WFSA network for searching, so as to obtain a plurality of candidate paths corresponding to the domain corpus and the path probability of each candidate path;
and sorting the plurality of candidate paths corresponding to the domain corpus in descending order of path probability, and taking the top preset number of candidate paths as the candidate optimal paths of the domain corpus.
Further, the normalizing the second WFSA network includes:
normalizing the probabilities on all the outgoing arcs of each state node in the second WFSA network according to the number of outgoing arcs of each state node in the second WFSA network and the probabilities on the respective outgoing arcs.
Further, the general language model and the domain language model are both N-Gram language models.
In a second aspect, an apparatus for constructing a domain language model is provided, the apparatus comprising:
the first conversion module is used for converting the generic language model into an equivalent first WFSA network;
the construction module is used for screening out optimal paths that meet a preset condition from the first WFSA network according to a preset number of domain corpora, so as to construct a second WFSA network;
a normalization module, configured to normalize the second WFSA network;
and the second conversion module is used for converting the normalized second WFSA network into a domain language model.
Further, the construction module includes:
the search submodule is used for searching, for each domain corpus, a preset number of candidate optimal paths in the first WFSA network;
the screening submodule is used for screening out the optimal paths corresponding to the domain corpora from the preset number of candidate optimal paths, wherein the probability on the outgoing arc of each state node of the optimal paths exceeds a preset threshold;
and the construction submodule is used for constructing the second WFSA network according to the optimal path corresponding to each domain corpus.
Further, the search sub-module is specifically configured to:
for each domain corpus, inputting the domain corpus into the first WFSA network for searching, so as to obtain a plurality of candidate paths corresponding to the domain corpus and the path probability of each candidate path;
and sorting the plurality of candidate paths corresponding to the domain corpus in descending order of path probability, and taking the top preset number of candidate paths as the candidate optimal paths of the domain corpus.
Further, the normalization module is specifically configured to:
normalizing the probabilities on all the outgoing arcs of each state node in the second WFSA network according to the number of outgoing arcs of each state node in the second WFSA network and the probabilities on the respective outgoing arcs.
Further, the general language model and the domain language model are both N-Gram language models.
In a third aspect, a computer device is provided, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the following steps are implemented:
converting the generic language model to an equivalent first WFSA network;
screening optimal paths that meet a preset condition from the first WFSA network according to a preset number of domain corpora, so as to construct a second WFSA network;
and normalizing the second WFSA network, and converting the normalized second WFSA network into a domain language model.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, which, when executed by a processor, performs the steps of:
converting the generic language model to an equivalent first WFSA network;
screening optimal paths that meet a preset condition from the first WFSA network according to a preset number of domain corpora, so as to construct a second WFSA network;
and normalizing the second WFSA network, and converting the normalized second WFSA network into a domain language model.
The invention provides a method and an apparatus for constructing a domain language model, a computer device, and a storage medium. A general language model is first converted into an equivalent first WFSA network; optimal paths that meet a preset condition are then screened out of the first WFSA network according to a preset number of domain corpora to construct a second WFSA network; finally, the second WFSA network is normalized and converted into a domain language model. Because the paths used to construct the second WFSA network are screened from the first WFSA network converted from the general language model, and are screened according to the preset number of domain corpora, the domain language model obtained by converting the normalized second WFSA network both meets the requirements of the specific scenario and retains general generalization capability. This achieves the goal of quickly constructing such a domain language model even when domain corpora are insufficient.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below illustrate only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a flowchart illustrating a domain language model building method according to an embodiment of the present invention;
FIG. 2 is a detailed flowchart of step S2 shown in FIG. 1;
FIG. 3 is a block diagram illustrating a domain language model building apparatus according to an embodiment of the present invention;
FIG. 4 shows an internal structure diagram of a computer device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is to be understood that, unless the context clearly requires otherwise, throughout the description and the claims, the words "comprise", "comprising", and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, what is meant is "including, but not limited to". Furthermore, in the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified.
As described in the background, there are generally two methods for constructing a language model for a specific scenario: one is to collect related domain corpora and train on them directly, and the other is to fuse a trained domain language model with a general language model according to certain weights to increase generalization capability. Both methods require a large amount of training corpora, yet finding corpora that fit the scenario is not easy. The domain language model in the embodiments of the present invention may be applied to a scenario in a specific domain, where the specific domain may be the financial domain, the medical domain, the commodity domain, the logistics domain, or another specific domain; the present invention is not limited in this respect.
Fig. 1 shows a flowchart of a domain language model building method according to an embodiment of the present invention, which is illustrated by taking a domain language model building apparatus as an execution subject, where the apparatus may be configured in any computer device, and the computer device may be an independent server or a server cluster.
Referring to fig. 1, the method for constructing a domain language model provided by the present invention includes steps S1 to S4:
s1: the generic language model is converted into an equivalent first WFSA network.
The generic language model may be a statistical language model, i.e., a probability distribution over a sequence of words: for a sequence of length m, it yields a probability P(w1, w2, ..., wm) for the entire sequence. The essence is to find a probability distribution over sentences or sequences that can represent the probability of any sentence or sequence occurring, where the probability of the current word is usually characterized by a conditional probability on the n words that occurred before it. N-Gram is an algorithm based on a statistical language model and on the Markov assumption, namely: the N-th word in a piece of text is assumed to depend only on the preceding N-1 words and on no other words. Under this assumption, the probability of each word in the text can be evaluated, and the probability of the whole sentence is the product of the probabilities of the individual words. These probabilities can be obtained by directly counting how often N words occur together in the corpus; commonly used N-Gram models include the binary Bi-Gram and the ternary Tri-Gram.
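To make the Markov assumption concrete, the following is a minimal sketch, not taken from the patent: the Bi-Gram order, the toy corpus, and the absence of smoothing are illustrative assumptions. It estimates bigram probabilities from counts and scores a sentence as the product of per-word conditional probabilities.

```python
from collections import defaultdict

def train_bigram(sentences):
    # Count unigrams and bigrams over the training corpus.
    unigram, bigram = defaultdict(int), defaultdict(int)
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        for i, w in enumerate(tokens):
            unigram[w] += 1
            if i > 0:
                bigram[(tokens[i - 1], w)] += 1
    # P(w | prev) = count(prev, w) / count(prev); no smoothing in this sketch.
    return lambda prev, w: bigram[(prev, w)] / unigram[prev] if unigram[prev] else 0.0

def sentence_probability(prob, words):
    # The probability of the whole sentence is the product of the per-word
    # conditional probabilities under the Markov assumption.
    p = 1.0
    tokens = ["<s>"] + words + ["</s>"]
    for prev, w in zip(tokens, tokens[1:]):
        p *= prob(prev, w)
    return p

corpus = [["weather", "today", "is", "really", "good"],
          ["weather", "today", "is", "bad"]]
prob = train_bigram(corpus)
print(sentence_probability(prob, ["weather", "today", "is", "really", "good"]))
```

A production N-Gram model would additionally apply smoothing and back-off, which is the information the arpa format stores.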
The universal language model may be generated in advance by training on a general corpus. The general corpus may be obtained by crawling Chinese text from the internet with a web crawler tool or by directly downloading a publicly available free Chinese corpus, and the universal language model may be stored in the arpa format. It should be noted that a universal language model is used rather than another domain model because the universal language model does not emphasize any particular domain: it is a relatively smooth set of probabilities computed over a large amount of historical text, so it is easier to migrate to a target domain and reflects word-to-word transition probabilities that are close to reality.
The first WFSA network is a directed graph structure with a plurality of state nodes connected by arcs that represent transitions between states. The arcs are directional, and each arc carries an input label and the probability of the corresponding state transition: the input label is a word, and the probability characterizes how likely the arc is to appear in a path. The first WFSA network may contain a plurality of paths, and the probability of each path can be calculated as the product of the probabilities on all arcs in the path; when the probability is expressed as a weight on an arc between state nodes, the weight value may be obtained by taking the logarithm of the probability.
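As an illustration of this structure, the following is a minimal sketch in which the class and field names are assumptions rather than definitions from the patent. It stores a WFSA as state nodes with outgoing arcs, where each arc carries a word label and a probability, and a path probability is the product of its arc probabilities.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Arc:
    label: str        # input label on the arc: a word
    prob: float       # probability of this state transition
    next_state: int   # destination state node

@dataclass
class WFSA:
    # Outgoing arcs of each state node; state 0 is taken as the start state here.
    arcs: Dict[int, List[Arc]] = field(default_factory=dict)

    def add_arc(self, state: int, label: str, prob: float, next_state: int) -> None:
        self.arcs.setdefault(state, []).append(Arc(label, prob, next_state))

    @staticmethod
    def path_probability(path: List[Arc]) -> float:
        # Product of the probabilities on all arcs in the path; expressed as
        # weights this is the sum of log-probabilities.
        return math.exp(sum(math.log(a.prob) for a in path))
```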
Specifically, when converting the generic language model in the arpa format into the first WFSA (Weighted Finite-State Automata) network, the execution body may call the arpa2fst tool to obtain an equivalent first WFSA network. Of course, in practical applications, besides calling the arpa2fst tool, an equivalent first WFSA network may also be obtained in other ways; this embodiment does not specifically limit the conversion method.
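As a concrete illustration of this step, the conversion could be driven from a script as sketched below. The flags follow common usage of the Kaldi arpa2fst tool and the file names are placeholders, so treat the invocation as an assumption rather than the patent's prescribed command.

```python
import subprocess

def arpa_to_wfsa(arpa_path: str, words_path: str, fst_path: str) -> None:
    # Convert an arpa-format language model into an equivalent WFSA/FST by
    # calling the arpa2fst tool (assumed to be installed and on PATH).
    subprocess.run(
        ["arpa2fst",
         "--disambig-symbol=#0",
         f"--read-symbol-table={words_path}",
         arpa_path,
         fst_path],
        check=True,
    )

# Example with hypothetical file names:
# arpa_to_wfsa("general_lm.arpa", "words.txt", "G_general.fst")
```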
S2: and screening the optimal path meeting the preset conditions from the first WFSA network according to the preset number of field linguistic data to construct a second WFSA network.
The domain corpora may be common words and sentences, professional terms and sentences, and the like in a specific domain.
The preset number may be set in advance to a value below a preset limit; it is understood that the number of domain corpus samples is much smaller than the number of general corpus samples.
In this embodiment, for each domain corpus, a plurality of paths may be searched in the first WFSA network, and the one or more optimal paths that satisfy the preset condition for that corpus may be obtained through screening. For example, the optimal path may be the path with the highest path probability, and the word sequence corresponding to each domain corpus may be obtained from its optimal path.
Here, the preset condition is a condition set in advance for determining the optimal path. In a specific application, the preset condition may be set as: when the probability on the outgoing arc of every state node along a path exceeds a preset threshold, the path is an optimal path. Alternatively, the preset condition may be set as: when the sum of the probabilities on all the arcs a path passes through exceeds a preset threshold, the path is an optimal path.
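Both variants of the preset condition can be written as simple predicates over a candidate path. The sketch below reuses the Arc structure assumed earlier; the threshold values and path representation are illustrative assumptions.

```python
from typing import List

def every_arc_exceeds(path: List["Arc"], threshold: float) -> bool:
    # Variant 1: the probability on the outgoing arc of every state node
    # along the path exceeds the preset threshold.
    return all(arc.prob > threshold for arc in path)

def path_prob_sum_exceeds(path: List["Arc"], threshold: float) -> bool:
    # Variant 2: the sum of the probabilities on all arcs the path passes
    # through exceeds the preset threshold.
    return sum(arc.prob for arc in path) > threshold
```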
Specifically, as shown in fig. 2, the implementation process of step S2 may include the steps of:
s21: and searching a preset number of candidate optimal paths corresponding to the domain linguistic data in the first WFSA network aiming at each domain linguistic data.
The preset number may be set as an integer value according to actual needs, and the specific preset number is not limited in this embodiment.
Specifically, the process may include:
for each domain corpus, inputting the domain corpus into the first WFSA network for searching, so as to obtain a plurality of candidate paths corresponding to the domain corpus and the path probability of each candidate path; then sorting the plurality of candidate paths corresponding to the domain corpus in descending order of path probability, and taking the top preset number of candidate paths as the candidate optimal paths of the domain corpus.
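The sorting and truncation step can be sketched as follows. The `search_paths` helper that enumerates candidate paths and their probabilities in the first WFSA is hypothetical and only stands in for the search described above.

```python
from typing import Callable, List, Tuple

def candidate_optimal_paths(
    search_paths: Callable[[List[str]], List[Tuple[list, float]]],
    corpus_tokens: List[str],
    top_n: int,
) -> List[Tuple[list, float]]:
    # Enumerate candidate paths for one domain corpus together with their
    # path probabilities, then keep the top_n by descending probability.
    candidates = search_paths(corpus_tokens)
    candidates.sort(key=lambda item: item[1], reverse=True)
    return candidates[:top_n]
```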
For example, assuming that the domain corpus "weather today is really good" is input into the first WFSA network, the following two candidate optimal paths may be searched out:
PATH1: <s> weather today is really good </s>
PATH2: <s> weather today is really good
S22: and screening out the optimal paths corresponding to the field linguistic data from the preset number of candidate optimal paths, wherein the probability of the transmitting arc of each state node of the optimal paths exceeds a preset threshold value.
In this embodiment, after one or more candidate optimal paths corresponding to a given domain corpus are searched, when the probability of the emission arc of each state node on one candidate optimal path exceeds a preset threshold, the candidate optimal path is the optimal path corresponding to the domain corpus.
S23: and constructing a second WFSA network according to the optimal path corresponding to each field corpus.
Specifically, a second WFSA network that only includes the initial state node and the end state node may be pre-constructed, and after each optimal path corresponding to one field corpus is obtained, the optimal path is updated to the second WFSA network until the optimal path corresponding to the last field corpus is updated to the second WFSA network, that is, the second WFSA network is constructed.
S3: the second WFSA network is normalized.
Specifically, the probabilities on all the outgoing arcs of each state node in the second WFSA network are normalized according to the number of outgoing arcs of each state node and the probabilities on the respective arcs, so that the probabilities on all the outgoing arcs of each state node in the second WFSA network sum to 1.
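A minimal sketch of this normalization over the WFSA structure assumed earlier:

```python
def normalize(second: "WFSA") -> None:
    # For each state node, divide the probability on every outgoing arc by the
    # sum over that node's outgoing arcs, so that the outgoing probabilities
    # of every state node sum to 1.
    for arcs in second.arcs.values():
        total = sum(a.prob for a in arcs)
        if total > 0:
            for a in arcs:
                a.prob /= total
```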
S4: and converting the normalized second WFSA network into a domain language model.
And when the general language model is the N-Gram model, the domain language model is the N-Gram model with the same order as the general language model.
Specifically, the execution body may convert the second WFSA network into an N-Gram model in the arpa format by calling the fsts-to-bridges tool, thereby obtaining the domain language model. Besides calling the fsts-to-bridges tool, the domain language model may also be obtained through conversion in other ways, which is not limited in this embodiment.
The invention provides a domain language model construction method: a general language model is first converted into an equivalent first WFSA network; optimal paths that meet a preset condition are then screened out of the first WFSA network according to a preset number of domain corpora to construct a second WFSA network; finally, the second WFSA network is normalized and converted into a domain language model. Because the paths used to construct the second WFSA network are screened from the first WFSA network converted from the general language model, and are screened according to the preset number of domain corpora, the domain language model converted from the normalized second WFSA network both meets the requirements of the specific scenario and retains general generalization capability, achieving the goal of quickly constructing such a language model even when training corpora are insufficient.
Fig. 3 is a block diagram illustrating a domain language model building apparatus according to an embodiment of the present invention, and referring to fig. 3, the apparatus includes:
a first conversion module 31 for converting the generic language model into an equivalent first WFSA network;
a constructing module 32, configured to screen an optimal path meeting a preset condition from the first WFSA network according to a preset number of domain corpora, so as to construct a second WFSA network;
a normalization module 33, configured to normalize the second WFSA network;
and a second conversion module 34, configured to convert the normalized second WFSA network into a domain language model.
In one embodiment, the construction module 32 includes:
the searching submodule 321 is configured to search, for each domain corpus, a preset number of candidate optimal paths corresponding to the domain corpus in the first WFSA network;
the screening submodule 322 is configured to screen out the optimal path corresponding to the domain corpus from the preset number of candidate optimal paths, where the probability on the outgoing arc of each state node of the optimal path exceeds a preset threshold;
and the constructing submodule 323 is used for constructing the second WFSA network according to the optimal path corresponding to each domain corpus.
In one embodiment, the search submodule 321 is specifically configured to:
inputting, for each domain corpus, the domain corpus into the first WFSA network for searching, so as to obtain a plurality of candidate paths corresponding to the domain corpus and the path probability of each candidate path;
and sorting the plurality of candidate paths corresponding to the domain corpus in descending order of path probability, and taking the top preset number of candidate paths as the candidate optimal paths of the domain corpus.
In one embodiment, the normalization module 33 is specifically configured to:
normalizing the probabilities on all the outgoing arcs of each state node in the second WFSA network according to the number of outgoing arcs of each state node and the probabilities on the respective arcs, so that the probabilities on all the outgoing arcs of each state node in the second WFSA network sum to 1.
In one embodiment, the generic language model and the domain language model are both N-Gram language models.
The device for constructing the domain language model provided by the embodiment of the invention belongs to the same inventive concept as the method for constructing the domain language model provided by the embodiment of the invention, can execute the method for constructing the domain language model provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects for executing the method for constructing the domain language model. For technical details that are not described in detail in this embodiment, reference may be made to the domain language model construction method provided in this embodiment of the present invention, and details are not described here again.
Fig. 4 shows an internal structure diagram of a computer device according to an embodiment of the present invention. The computer device may be a server, and its internal structure diagram may be as shown in fig. 4. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a domain language model building method.
In one embodiment, there is provided a computer device comprising:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the following steps:
converting the generic language model to an equivalent first WFSA network;
screening optimal paths that meet a preset condition from the first WFSA network according to a preset number of domain corpora, so as to construct a second WFSA network;
and normalizing the second WFSA network, and converting the normalized second WFSA network into a domain language model.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, performs the steps of:
converting the generic language model to an equivalent first WFSA network;
screening optimal paths that meet a preset condition from the first WFSA network according to a preset number of domain corpora, so as to construct a second WFSA network;
and normalizing the second WFSA network, and converting the normalized second WFSA network into a domain language model.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above examples only express several embodiments of the present application, and although their description is specific and detailed, they should not be construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A method for constructing a domain language model, the method comprising:
converting the generic language model to an equivalent first WFSA network;
screening optimal paths that meet a preset condition from the first WFSA network according to a preset number of domain corpora, so as to construct a second WFSA network;
and normalizing the second WFSA network, and converting the normalized second WFSA network into a domain language model.
2. The method of claim 1, wherein the screening out of optimal paths meeting a preset condition from the first WFSA network according to a preset number of domain corpora to construct a second WFSA network comprises:
for each domain corpus, searching a preset number of candidate optimal paths in the first WFSA network;
screening out the optimal paths corresponding to the domain corpora from the preset number of candidate optimal paths, wherein the probability on the outgoing arc of each state node of the optimal paths exceeds a preset threshold;
and constructing the second WFSA network according to the optimal path corresponding to each domain corpus.
3. The method according to claim 2, wherein the searching out a preset number of candidate optimal paths in the first WFSA network for each of the domain corpora comprises:
for each domain corpus, inputting the domain corpus into the first WFSA network for searching, so as to obtain a plurality of candidate paths corresponding to the domain corpus and the path probability of each candidate path;
and sorting the plurality of candidate paths corresponding to the domain corpus in descending order of path probability, and taking the top preset number of candidate paths as the candidate optimal paths of the domain corpus.
4. The method of any of claims 1 to 3, wherein the normalizing the second WFSA network comprises:
normalizing the probabilities on all the outgoing arcs of each state node in the second WFSA network according to the number of outgoing arcs of each state node in the second WFSA network and the probabilities on the respective outgoing arcs.
5. The method of claim 1, wherein the generic language model and the domain language model are both N-Gram language models.
6. A domain language model building apparatus, the apparatus comprising:
the first conversion module is used for converting the universal language model into an equivalent first WFSA network;
the construction module is used for screening out optimal paths that meet a preset condition from the first WFSA network according to a preset number of domain corpora, so as to construct a second WFSA network;
a normalization module, configured to normalize the second WFSA network;
and the second conversion module is used for converting the normalized second WFSA network into a domain language model.
7. The apparatus of claim 6, wherein the configuration module comprises:
the search submodule is used for searching, for each domain corpus, a preset number of candidate optimal paths in the first WFSA network;
the screening submodule is used for screening out the optimal paths corresponding to the domain corpora from the preset number of candidate optimal paths, wherein the probability on the outgoing arc of each state node of the optimal paths exceeds a preset threshold value;
and the construction submodule is used for constructing the second WFSA network according to the optimal path corresponding to each domain corpus.
8. The apparatus of claim 7, wherein the search submodule is specifically configured to:
inputting, for each domain corpus, the domain corpus into the first WFSA network for searching, so as to obtain a plurality of candidate paths corresponding to the domain corpus and the path probability of each candidate path;
and sorting the plurality of candidate paths corresponding to the domain corpus in descending order of path probability, and taking the top preset number of candidate paths as the candidate optimal paths of the domain corpus.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method for domain language model construction according to any one of claims 1 to 5 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the domain language model construction method according to any one of claims 1 to 5.
CN202010669031.1A 2020-07-13 2020-07-13 Domain language model construction method, device, computer equipment and storage medium Active CN112002310B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010669031.1A CN112002310B (en) 2020-07-13 2020-07-13 Domain language model construction method, device, computer equipment and storage medium
PCT/CN2021/099661 WO2022012238A1 (en) 2020-07-13 2021-06-11 Method and apparatus for constructing domain language model, and computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010669031.1A CN112002310B (en) 2020-07-13 2020-07-13 Domain language model construction method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112002310A (en) 2020-11-27
CN112002310B (en) 2024-03-26

Family

ID=73466859

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010669031.1A Active CN112002310B (en) 2020-07-13 2020-07-13 Domain language model construction method, device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112002310B (en)
WO (1) WO2022012238A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112614023A (en) * 2020-12-25 2021-04-06 东北大学 Formalized security verification method for electronic contract
CN113782001A (en) * 2021-11-12 2021-12-10 深圳市北科瑞声科技股份有限公司 Specific field voice recognition method and device, electronic equipment and storage medium
WO2022012238A1 (en) * 2020-07-13 2022-01-20 苏宁易购集团股份有限公司 Method and apparatus for constructing domain language model, and computer device, and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968989A (en) * 2012-12-10 2013-03-13 中国科学院自动化研究所 Improvement method of Ngram model for voice recognition
US20130311182A1 (en) * 2012-05-16 2013-11-21 Gwangju Institute Of Science And Technology Apparatus for correcting error in speech recognition
JP2015152661A (en) * 2014-02-12 2015-08-24 日本電信電話株式会社 Weighted finite state automaton creation device, symbol string conversion device, voice recognition device, methods thereof and programs
US20150254233A1 (en) * 2014-03-06 2015-09-10 Nice-Systems Ltd Text-based unsupervised learning of language models
US20150325235A1 (en) * 2014-05-07 2015-11-12 Microsoft Corporation Language Model Optimization For In-Domain Application
US20160093292A1 (en) * 2014-09-26 2016-03-31 Intel Corporation Optimizations to decoding of wfst models for automatic speech recognition
JP2017097451A (en) * 2015-11-18 2017-06-01 富士通株式会社 Information processing method, information processing program, and information processing device
CN110472223A (en) * 2018-05-10 2019-11-19 北京搜狗科技发展有限公司 A kind of input configuration method, device and electronic equipment
CN111243599A (en) * 2020-01-13 2020-06-05 网易有道信息技术(北京)有限公司 Speech recognition model construction method, device, medium and electronic equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112002310B (en) * 2020-07-13 2024-03-26 苏宁云计算有限公司 Domain language model construction method, device, computer equipment and storage medium

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130311182A1 (en) * 2012-05-16 2013-11-21 Gwangju Institute Of Science And Technology Apparatus for correcting error in speech recognition
CN102968989A (en) * 2012-12-10 2013-03-13 中国科学院自动化研究所 Improvement method of Ngram model for voice recognition
JP2015152661A (en) * 2014-02-12 2015-08-24 日本電信電話株式会社 Weighted finite state automaton creation device, symbol string conversion device, voice recognition device, methods thereof and programs
US20150254233A1 (en) * 2014-03-06 2015-09-10 Nice-Systems Ltd Text-based unsupervised learning of language models
US20150325235A1 (en) * 2014-05-07 2015-11-12 Microsoft Corporation Language Model Optimization For In-Domain Application
US20160093292A1 (en) * 2014-09-26 2016-03-31 Intel Corporation Optimizations to decoding of wfst models for automatic speech recognition
JP2017097451A (en) * 2015-11-18 2017-06-01 富士通株式会社 Information processing method, information processing program, and information processing device
CN110472223A (en) * 2018-05-10 2019-11-19 北京搜狗科技发展有限公司 A kind of input configuration method, device and electronic equipment
CN111243599A (en) * 2020-01-13 2020-06-05 网易有道信息技术(北京)有限公司 Speech recognition model construction method, device, medium and electronic equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
T. Hori et al.: "Language model adaptation using WFST-based speaking-style translation", IEEE
王晓瑞; 丁鹏; 梁家恩; 徐波: "A language model adaptation framework for broadcast speech recognition" (in Chinese), Journal of Chinese Information Processing, no. 04
范书平: "Research on a WFST-based Chinese speech recognition decoder" (in Chinese), China Master's Theses Full-text Database, Information Science and Technology Series, no. 03

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022012238A1 (en) * 2020-07-13 2022-01-20 苏宁易购集团股份有限公司 Method and apparatus for constructing domain language model, and computer device, and storage medium
CN112614023A (en) * 2020-12-25 2021-04-06 东北大学 Formalized security verification method for electronic contract
CN113782001A (en) * 2021-11-12 2021-12-10 深圳市北科瑞声科技股份有限公司 Specific field voice recognition method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
WO2022012238A1 (en) 2022-01-20
CN112002310B (en) 2024-03-26

Similar Documents

Publication Publication Date Title
CN112002310B (en) Domain language model construction method, device, computer equipment and storage medium
CN112102815B (en) Speech recognition method, speech recognition device, computer equipment and storage medium
CN111444311A (en) Semantic understanding model training method and device, computer equipment and storage medium
CN110704588A (en) Multi-round dialogue semantic analysis method and system based on long-term and short-term memory network
CN111563208A (en) Intention identification method and device and computer readable storage medium
US9934452B2 (en) Pruning and label selection in hidden Markov model-based OCR
CN110175273B (en) Text processing method and device, computer readable storage medium and computer equipment
CN111428474A (en) Language model-based error correction method, device, equipment and storage medium
CN111583911B (en) Speech recognition method, device, terminal and medium based on label smoothing
CN113506574A (en) Method and device for recognizing user-defined command words and computer equipment
CN106843523B (en) Character input method and device based on artificial intelligence
CN114120978A (en) Emotion recognition model training and voice interaction method, device, equipment and medium
CN109086348B (en) Hyperlink processing method and device and storage medium
CN112766319A (en) Dialogue intention recognition model training method and device, computer equipment and medium
CN112417878A (en) Entity relationship extraction method, system, electronic equipment and storage medium
CN112733911A (en) Entity recognition model training method, device, equipment and storage medium
CN112836506A (en) Information source coding and decoding method and device based on context semantics
CN114297361A (en) Human-computer interaction method based on scene conversation understanding and related components
JP2021524095A (en) Text-level text translation methods and equipment
CN113343711A (en) Work order generation method, device, equipment and storage medium
CN113449081A (en) Text feature extraction method and device, computer equipment and storage medium
CN115062619B (en) Chinese entity linking method, device, equipment and storage medium
CN115497484B (en) Voice decoding result processing method, device, equipment and storage medium
CN115862616A (en) Speech recognition method
CN115270789A (en) Abnormal voice data detection method and device and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant