CN116151391A - Language model construction method, electronic device and computer readable storage medium - Google Patents


Info

Publication number
CN116151391A
Authority
CN
China
Prior art keywords
language model
text
processed
scene
initial language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310160741.5A
Other languages
Chinese (zh)
Inventor
李承翰
蒋宁
陈逢文
吴海英
夏粉
刘敏
全宗峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Xiaofei Finance Co Ltd
Original Assignee
Mashang Xiaofei Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mashang Xiaofei Finance Co Ltd filed Critical Mashang Xiaofei Finance Co Ltd
Priority to CN202310160741.5A priority Critical patent/CN116151391A/en
Publication of CN116151391A publication Critical patent/CN116151391A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data

Abstract

The present disclosure provides a language model construction method and apparatus, an electronic device, and a computer-readable storage medium, where the language model construction method includes: acquiring a text to be processed; determining the degree of correlation between the text to be processed and an initial language model of at least one pre-trained scene respectively; if the correlation degree between the text to be processed and the first initial language model meets a preset condition, fusing the language model corresponding to the text to be processed with the first initial language model, and replacing the first initial language model with the fused model; and if the correlation degree of the text to be processed and the initial language model of the at least one scene does not meet the preset condition, taking the language model corresponding to the text to be processed as the initial language model of the second scene. The embodiment of the disclosure can reduce training cost and can obtain the multi-scene language model through continuous expansion and optimization.

Description

Language model construction method, electronic device and computer readable storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence (Artificial Intelligence, AI) technology, and in particular, to a language model construction method, an electronic device, and a computer-readable storage medium.
Background
Currently, a language model is commonly applied to a particular scene, such as a sports news scene. Based on this, a large amount of corpus of the relevant scene usually needs to be collected before the language model is trained, so the training cost is high; moreover, the trained language model is only suitable for the specific scene and generalizes poorly.
Disclosure of Invention
The disclosure provides a language model construction method and device, electronic equipment and a computer readable storage medium.
In a first aspect, the present disclosure provides a language model construction method, including:
acquiring a text to be processed;
determining the correlation degree of the text to be processed and an initial language model of at least one pre-trained scene respectively, wherein the initial language model of the at least one scene is obtained according to basic text training;
if the correlation degree between the text to be processed and the first initial language model meets a preset condition, fusing the language model corresponding to the text to be processed with the first initial language model to obtain a fused model, wherein the first initial language model is one of the initial language models of the at least one scene; replacing the first initial language model with the fusion model, to serve as a new initial language model of a first scene corresponding to the first initial language model;
If the correlation degree of the text to be processed and the initial language model of the at least one scene does not meet the preset condition, taking the language model corresponding to the text to be processed as an initial language model of a second scene, wherein the second scene is a scene associated with the text to be processed, and the second scene and the at least one scene are different from each other.
In a second aspect, the present disclosure provides a language model construction apparatus including:
the acquisition module is used for acquiring the text to be processed;
the determining module is used for determining the correlation degree of the text to be processed and an initial language model of at least one pre-trained scene respectively, wherein the initial language model of the at least one scene is obtained according to basic text training;
the fusion module is used for fusing the language model corresponding to the text to be processed with the first initial language model to obtain a fusion model if the degree of correlation between the text to be processed and the first initial language model meets a preset condition, wherein the first initial language model is one of the initial language models of the at least one scene; and for replacing the first initial language model with the fusion model, to serve as a new initial language model of a first scene corresponding to the first initial language model;
And the updating module is used for taking the language model corresponding to the text to be processed as an initial language model of a second scene if the correlation degree of the text to be processed and the initial language model of the at least one scene does not meet the preset condition, wherein the second scene is a scene associated with the text to be processed, and the second scene and the at least one scene are different from each other.
In a third aspect, the present disclosure provides an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores one or more computer programs executable by the at least one processor, the one or more computer programs being executable by the at least one processor to enable the at least one processor to perform the language model construction method described above.
In a fourth aspect, the present disclosure provides a computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the language model construction method described above.
In a fifth aspect, the present disclosure provides a computer program or a computer program product comprising a computer program stored in a computer readable storage medium, which computer program, when executed by a processor, implements the language model construction method described above.
In the embodiments provided by the disclosure, initial language models corresponding to at least one scene are first obtained by training on existing basic text. After a text to be processed is obtained, the degree of correlation between the text to be processed and the initial language model of each scene is determined. If the degree of correlation between the text to be processed and a first initial language model meets a preset condition, the language model corresponding to the text to be processed is fused with the first initial language model to obtain a fusion model, the fusion model replaces the first initial language model as the new initial language model of the first scene corresponding to the first initial language model, and the text to be processed is added to the corpus of the corresponding scene, where the first initial language model is one of the initial language models of the at least one scene. If the degree of correlation between the text to be processed and the initial language model of the at least one scene does not meet the preset condition, the language model corresponding to the text to be processed is taken as the initial language model of a second scene, where the second scene is the scene associated with the text to be processed, and the second scene and the at least one scene are different from each other. It can be seen that, in the embodiments of the present disclosure, the corpus used for training is not collected for a specific scene; instead, a basic language model corresponding to at least one scene is obtained by pre-training on the existing corpus and used as the initial language model, that is, for whichever scenes the existing corpus contains text, the initial language model of the corresponding scene can be obtained by pre-training. After the text to be processed is obtained, if its degree of correlation with one of the at least one initial language model is high, the language model corresponding to the text to be processed is fused with that initial language model so as to optimize the language model of the scene corresponding to it; if the text to be processed is not related to the at least one scene, the scene represented by the text to be processed is taken as a new scene independent of the at least one scene, and the language model of the text to be processed is taken as the initial language model of the new scene. Since a large amount of corpus does not need to be collected in advance for each scene, the amount of computation for training the initial language model of each scene is small. Further, after a text to be processed matching an existing scene is received, the language model of that scene is optimized by model fusion; compared with the existing approach of retraining the language model on the updated text, this implementation can reduce the training cost, and the language models of multiple scenes can be obtained through continuous expansion and optimization.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. The above and other features and advantages will become more readily apparent to those skilled in the art by describing in detail exemplary embodiments with reference to the attached drawings, in which:
FIG. 1 is a flow chart of a language model construction method provided in an embodiment of the present disclosure;
FIG. 2 is a block diagram of a language model building scenario provided by an embodiment of the present disclosure;
FIG. 3 is a block diagram of a language model usage scenario provided by an embodiment of the present disclosure;
FIG. 4 is a block diagram of a language model construction device according to an embodiment of the present disclosure;
fig. 5 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
For a better understanding of the technical solutions of the present disclosure, exemplary embodiments of the present disclosure will be described below with reference to the accompanying drawings, in which various details of the embodiments of the present disclosure are included to facilitate understanding, and they should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Embodiments of the disclosure and features of embodiments may be combined with each other without conflict.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Embodiments of the present disclosure relate to the field of natural language processing (Natural Language Processing, NLP), and in particular to language model training techniques. A language model may be used to process language and text, for example, to perform at least one of the following processing operations: language recognition, machine translation, question answering, information retrieval, text extraction, etc.
Language models are usually trained, and perform recognition, based on the regularities of language text, and the terms contained in the corpus, their frequencies and the like differ across implementation scenes. For example, in a financial scene the corpus contains frequently occurring words such as "bond" and "fund", while in a sports scene the corpus may not contain terms such as "bond" or "fund" at all. Based on this, a conventional method of training a language model includes: collecting a large amount of text corpus of the target scene, and training a language model applied to the target scene using the collected corpus. However, with this implementation, collecting a large amount of text corpus is costly, training a language model on such a large corpus is computationally heavy and complex, and the trained language model is only suitable for the target scene and generalizes poorly.
In view of this, an embodiment of the present disclosure provides a language model construction method, which trains according to a basic text to obtain initial language models corresponding to at least one scene respectively, where the basic text is an existing text. After the text to be processed is obtained, determining the matching degree of the text to be processed and each scene in at least one scene, if the text to be processed is matched with a certain scene in at least one scene, fusing a language model corresponding to the text to be processed and an initial language model corresponding to the corresponding scene, and replacing the initial language model corresponding to the corresponding scene by using the fused language model so as to optimize the language model corresponding to the scene; and if the text to be processed is not matched with at least one scene, taking the scene associated with the text to be processed as a new scene independent of the at least one scene, and taking the language model of the text to be processed as an initial language model of the new scene. Therefore, a large amount of text corpus is not required to be collected, the training cost can be reduced, the calculation amount of the training process of the language model of each scene is small, the language model of a new scene can be added in a continuous expansion mode, and the language model of the existing scene is optimized.
The language model construction method according to the embodiments of the present disclosure may be performed by an electronic device, which may be a vehicle-mounted device, a user equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a wearable device, or the like, and the method may be implemented by a processor invoking computer-readable program instructions stored in a memory. The electronic device may serve as a terminal device or a server.
Fig. 1 is a flowchart of a language model construction method according to an embodiment of the present disclosure. Referring to fig. 1, the method includes:
in step S11, a text to be processed is acquired.
Here, the text to be processed refers to new text that is different from the existing basic text.
It should be noted that, before step S11, the electronic device may obtain, according to the basic text training, an initial language model corresponding to at least one scene respectively. The basic text according to the embodiment of the present disclosure may be obtained through various public legal channels, for example, the basic text may include technical documents, teaching material materials, news articles and the like obtained from related websites, and the basic text may be a text corpus of at least one scene, for example, a text of a financial scene.
It should be noted that in the embodiment of the present disclosure, an electronic device may maintain a network to be trained in advance. Further, after the basic text is obtained, the electronic device may classify the basic text according to the scenes, and then, for each classified scene, perform operations such as preprocessing, word segmentation and the like on the text of the corresponding scene, and further, train the network to be trained based on the word segmentation result, and use the language model after training as the initial language model corresponding to the corresponding scene.
For example, the basic text includes a text of a financial scene, a text of a sports scene and a text of an entertainment scene, and the electronic device may divide the basic text into the text of the financial scene, the text of the sports scene and the text of the entertainment scene according to the scenes, and further perform operations such as preprocessing, word segmentation and training to obtain an initial language model of the financial scene corresponding to the text of the financial scene; preprocessing, word segmentation and other operations are carried out on texts corresponding to the sports scene, and an initial language model of the sports scene is obtained through training; and performing operations such as preprocessing, word segmentation and the like on the text corresponding to the entertainment scene, and training to obtain an initial language model of the entertainment scene.
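Illustratively, the following Python sketch shows how such per-scene initial models could be built under one simplifying assumption: each initial language model is a count-based character n-gram model. The helper names (ngram_segments, train_ngram_counts), the placeholder corpora and the scene labels are hypothetical and are not taken from the disclosure; the actual preprocessing, word segmentation and training may differ.

```python
from collections import Counter

def ngram_segments(text, n):
    # Slide a window of length n over the characters, one step at a time,
    # producing one word segment (n-gram) per position.
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def train_ngram_counts(texts, n=3):
    # Count each n-gram and its (n-1)-character prefix over a scene's corpus;
    # these counts serve as that scene's initial language model.
    ngrams, prefixes = Counter(), Counter()
    for text in texts:
        ngrams.update(ngram_segments(text, n))
        prefixes.update(ngram_segments(text, n - 1))
    return {"n": n, "ngrams": ngrams, "prefixes": prefixes}

# Hypothetical classified sub-texts of three scenes (placeholders, not real corpora).
finance_texts = ["债券收益率上升", "基金净值公布"]
sports_texts = ["球队赢得比赛", "运动员打破纪录"]
entertainment_texts = ["电影票房创新高", "演唱会门票售罄"]

scene_models = {
    "A_finance": train_ngram_counts(finance_texts),
    "B_sports": train_ngram_counts(sports_texts),
    "C_entertainment": train_ngram_counts(entertainment_texts),
}
```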
In step S12, a degree of relevance of the text to be processed to the initial language model of the pre-trained at least one scene, respectively, is determined.
The degree of correlation between the text to be processed and the initial language model of scene i can characterize whether the text to be processed belongs to text describing scene i, where scene i is any one of the at least one scene.
In some implementations, after the text to be processed is obtained, the electronic device may segment the text to be processed to obtain a plurality of word segments, where each word segment corresponds to a phrase length, and the phrase length of the word segment j is used to indicate the number of words that make up the word segment j, where the word segment j is any one of the plurality of word segments. And further, determining a relevance parameter of each initial language model and the text to be processed according to the word segments, wherein each relevance parameter is used for representing the relevance degree of the corresponding initial language model and the text to be processed.
Illustratively, the third initial language model is any one of the initial language models of the at least one scene; a specific implementation of determining the relevance parameter between the third initial language model and the text to be processed according to the word segments is as follows: calling the third initial language model to calculate a relevance parameter between the third initial language model and the text to be processed according to the word segments, where the relevance parameter is used to characterize the degree of correlation between the text to be processed and the scene corresponding to the third initial language model.
For example, the electronic device may segment the text to be processed into a plurality of word segments based on an n-gram mechanism: starting from the first character of the text to be processed, a sliding window of length n is applied, so that the 1st to nth characters are grouped to obtain one word segment, then the 2nd to (n+1)th characters are grouped to obtain the next word segment, and the window slides backward one character at a time according to this rule until the last character is reached. The window length n is the phrase length.
In some implementations, the electronic device may flexibly set the phrase length n according to an actual implementation scenario, and may set a plurality of n-grams, for example, in one implementation scenario, the value of the phrase length n may be set to any one of values 2 to 8.
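As a non-limiting illustration, the word segments for several phrase lengths could be produced as in the following sketch, where the window slides by one character as described above; the sample text and the variable names are hypothetical.

```python
pending_text = "股票市场今日大幅波动"   # hypothetical text to be processed
# For each phrase length n (here 2 to 8), slide a window of length n, one character at a time.
segments_by_n = {
    n: [pending_text[i:i + n] for i in range(len(pending_text) - n + 1)]
    for n in range(2, 9)
}
```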
It should be noted that, as described above, the word combinations, word frequencies and word distributions in texts describing different scenes are different. Based on this, when the language model of one scene is used to perform confidence prediction on text of another scene, the resulting confidence parameter is generally poor, whereas when the language model of a scene is used to perform confidence prediction on text of that same scene, a higher confidence parameter can be obtained. The confidence parameter of a text characterizes the confidence of the text composed of the word segments and is generally expressed as a probability: the greater the probability, the higher the confidence of the text composed of the word segments; the smaller the probability, the lower the confidence.
Based on this, in step S12, taking as an example calling the third initial language model to calculate the relevance parameter between the third initial language model and the text to be processed according to the plurality of word segments, the electronic device may call the third initial language model to calculate a confidence parameter for each of the plurality of word segments, where the confidence parameter is used to characterize the confidence of the words formed by the corresponding word segment under the scene corresponding to the third initial language model, and then calculate the relevance parameter between the third initial language model and the text to be processed according to the confidence parameter of each word segment.
For example, the text S to be processed is segmented, using the n-gram mechanism, into word segments W_1, W_2, ..., W_k, i.e. S = W_1, W_2, ..., W_k. Corresponding to the third initial language model, the relevance parameter P(S) between the third initial language model and the text S to be processed may be implemented, for example, as:

P(S) = P(W_1, W_2, ..., W_k) = P(W_1) P(W_2 | W_1) ... P(W_k | W_1, W_2, ..., W_{k-1})

where P(W_i) refers to the confidence parameter, calculated by using the third initial language model, of the words composed of word segment W_i. Taking a 3-gram as an example:

P(W_i | W_{i-1}, W_{i-2}) ≈ count(W_{i-2} W_{i-1} W_i) / count(W_{i-2} W_{i-1})

where count(W_{i-2} W_{i-1} W_i) is the frequency of the word sequence W_{i-2} W_{i-1} W_i among all the words composed of the word segments.
In some implementations, the relevance parameters between the initial language model and the text to be processed include: a confusion (ppl) parameter between the initial language model and the text to be processed. Taking the above text to be processed S as an example, the ppl parameters of the third initial language model and the text to be processed S may be implemented as follows:
ppl(S) = P(W_1, W_2, ..., W_k)^(-1/k)
it can be seen that, the larger the correlation parameter P (S) between the third initial language model and the text S to be processed is, the higher the probability of obtaining the text S to be processed by using the third initial language model to organize is, the smaller the ppl parameter between the third initial language model and the text S to be processed is, i.e. the more "not confused" the third initial language model is used to characterize the text S to be processed, and then the higher the correlation degree between the third initial language model and the text S to be processed is. On the contrary, the smaller the correlation parameter P (S) between the third initial language model and the text S to be processed, the lower the probability of obtaining the text S to be processed by using the third initial language model organization, the larger the ppl parameter between the third initial language model and the text S to be processed, that is, the more "confusion" of the third initial language model to be used for representing the text S to be processed, and the lower the correlation degree between the third initial language model and the text S to be processed.
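As a non-limiting sketch, the relevance parameter P(S) and the ppl parameter could be computed as below, reusing the count-based model structure from the earlier training sketch. The add-one smoothing is an added assumption, introduced so that unseen word segments do not yield a zero probability, and is not specified by the disclosure.

```python
import math

def log_prob(model, text):
    # Chain-rule score of the text under a count-based n-gram model:
    # P(W_i | prefix) ≈ count(prefix + W_i) / count(prefix),
    # with add-one smoothing so unseen word segments do not give zero probability.
    n = model["n"]
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    logp = 0.0
    for gram in grams:
        num = model["ngrams"][gram] + 1
        den = model["prefixes"][gram[:-1]] + len(model["ngrams"]) + 1
        logp += math.log(num / den)
    return logp, max(len(grams), 1)

def perplexity(model, text):
    # ppl(S) = P(W_1, ..., W_k) ** (-1/k); a smaller ppl means the model is
    # less "confused" by the text, i.e. the text is more relevant to the scene.
    logp, k = log_prob(model, text)
    return math.exp(-logp / k)
```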
In step S13, if the degree of correlation between the text to be processed and the first initial language model meets the preset condition, fusing the language model corresponding to the text to be processed with the first initial language model to obtain a fused model, and replacing the first initial language model with the fused model to serve as a new initial language model of the first scene corresponding to the first initial language model.
Wherein the first initial language model is one of the initial language models of the at least one scene. Taking the implementation of the correlation degree as a ppl parameter as an example, the correlation degree meeting the preset condition may be that the ppl parameter is smaller than a preset threshold, for example. The degree of correlation between the text to be processed and the first initial language model meets the preset condition, namely, the ppl parameter of the text to be processed and the first initial language model is smaller than the preset threshold.
In some implementations, after the text to be processed is obtained, the electronic device may further train to obtain a language model corresponding to the text to be processed according to the text to be processed. For example, after the text to be processed is segmented to obtain a plurality of word segments in step S12, the electronic device may train the network to be trained according to the plurality of word segments, and use the trained model as the language model corresponding to the text to be processed.
According to the foregoing description, if the degree of correlation between the text to be processed and the first initial language model meets the preset condition, it is indicated that the text to be processed is the text of the scene corresponding to the first initial language model, and then the electronic device may optimize the first initial language model according to the text to be processed and the language model corresponding to the text to be processed.
In some implementations, the electronic device may determine an interpolation coefficient according to a degree of correlation between the language model corresponding to the text to be processed and the first initial language model, and further interpolate the language model of the text to be processed and the first initial language model according to the interpolation coefficient to obtain the fusion model.
For example, when the correlation parameter between the first initial language model and the text to be processed is a ppl parameter, a difference between a preset threshold and the ppl parameter corresponding to the first initial language model may be calculated, and a ratio of the difference to the preset threshold is used as the interpolation coefficient, that is, the interpolation coefficient= (preset threshold-ppl)/preset threshold.
Further, the electronic device may take the first initial language model as the model to be interpolated, and interpolate the language model corresponding to the text to be processed into the model to be interpolated to obtain an interpolated model. For example, for each phrase length, the electronic device obtains the probability distribution of the language model of the text to be processed over the word segments of the corresponding phrase length as a first probability distribution, and obtains the probability distribution of the first initial language model over the word segments of that phrase length in the corresponding scene text as a second probability distribution, where the scene text corresponding to the first initial language model comes from the classification of the basic text described in step S11, which is not repeated here. Then, for each phrase length, the electronic device uses the interpolation coefficient as the weight of a weighting algorithm to fuse the corresponding first probability distribution with the corresponding second probability distribution, replaces the model parameters of the first initial language model with the fused data, takes the language model after parameter replacement as the fusion model, and may then replace the first initial language model with the fusion model.
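Illustratively, the interpolation coefficient and a simplified fusion could look as follows. Blending the raw n-gram counts is an assumption made here for brevity and stands in for the per-phrase-length fusion of probability distributions described above; the function names are hypothetical.

```python
from collections import Counter

def interpolation_coefficient(ppl_value, ppl_threshold):
    # interpolation coefficient = (preset threshold - ppl) / preset threshold;
    # a smaller ppl (a stronger match) gives the new text's model a larger weight.
    return (ppl_threshold - ppl_value) / ppl_threshold

def fuse_models(base_model, new_model, coef):
    # Weighted blend of the two models' n-gram statistics: a simplified stand-in
    # for the probability-distribution interpolation described above.
    fused = {"n": base_model["n"], "ngrams": Counter(), "prefixes": Counter()}
    for key in ("ngrams", "prefixes"):
        for gram, c in base_model[key].items():
            fused[key][gram] += (1.0 - coef) * c
        for gram, c in new_model[key].items():
            fused[key][gram] += coef * c
    return fused
```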
In other implementations, the text to be processed may also be added to the corpus of the scene corresponding to the first initial language model.
Therefore, according to the implementation mode, language models of relevant scenes are continuously optimized by inputting language materials in small batches aiming at the language models of each scene and adopting an interpolation mode, so that the training cost can be reduced without repeated training of the models based on historical language materials, and the quantity of text language materials can be continuously expanded.
In step S14, if the degree of correlation between the text to be processed and the initial language model of at least one scene does not satisfy the preset condition, the language model corresponding to the text to be processed is used as the initial language model of the second scene.
The second scene refers to a scene associated with the text to be processed, and the second scene and the at least one scene are different from each other.
If the correlation degree between the text to be processed and the initial language model of at least one scene does not meet the preset condition, it may be understood that the ppl parameters of the text to be processed and the initial language model of at least one scene are both greater than a preset threshold, that is, the scene associated with the text to be processed and the at least one scene are not matched, then the electronic device may use the scene associated with the text to be processed as a new scene other than the at least one scene, use the language model of the text to be processed as the initial language model of the new scene, and use the text to be processed as the corpus of the new scene.
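Putting steps S12 to S14 together, a minimal routing sketch, building on the perplexity, interpolation_coefficient and fuse_models helpers sketched earlier, might look as follows. The preset threshold value and the naming of the new scene are assumptions for illustration only.

```python
PPL_THRESHOLD = 100.0   # assumed preset threshold; the disclosure does not fix a value

def update_scene_models(scene_models, pending_text, new_model):
    # Step S12: score the text to be processed against every scene's initial model.
    ppl_by_scene = {scene: perplexity(model, pending_text)
                    for scene, model in scene_models.items()}
    best_scene = min(ppl_by_scene, key=ppl_by_scene.get)

    if ppl_by_scene[best_scene] < PPL_THRESHOLD:
        # Step S13: the text matches an existing scene; fuse and replace that scene's model.
        coef = interpolation_coefficient(ppl_by_scene[best_scene], PPL_THRESHOLD)
        scene_models[best_scene] = fuse_models(scene_models[best_scene], new_model, coef)
    else:
        # Step S14: no existing scene matches; register the text's own model as a new scene.
        scene_models["scene_%d" % (len(scene_models) + 1)] = new_model
    return scene_models
```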
By adopting the implementation mode, the number of the language models of the adapted scenes can be continuously expanded by inputting the text corpus of the new scenes, so that the language models of the plurality of the scenes can be obtained through training in a low-cost state.
The language model construction method provided by the embodiments of the present disclosure is described below with reference to examples.
Referring to fig. 2, fig. 2 is a block diagram of a language model construction scenario provided by an embodiment of the present disclosure, in this example, a basic text is an existing corpus, and a text to be processed is a new text.
In the preprocessing stage, the electronic device classifies the existing corpus according to scenes, for example, sub-texts of A, B, C scenes are obtained, further, operations such as preprocessing and word segmentation are performed on the sub-texts of each scene in A, B, C, and then a language model corresponding to each scene is obtained according to word segmentation training of the sub-texts of each scene: and taking the A language model, the B language model and the C language model as frame models.
And in the language model construction stage, receiving the new text, and then, the electronic equipment performs operations such as preprocessing, word segmentation and the like on the new text. Then, on one hand, according to word segmentation training of the new text, a language model corresponding to the new text is obtained: a new language model. On the other hand, the new text word segment after word segmentation is respectively input into an A language model, a B language model and a C language model, so that the A language model, the B language model and the C language model respectively calculate the ppl value, the A language model calculates the ppl.A value, the B language model calculates the ppl.B value, and the C language model calculates the ppl.C value. And further, respectively judging the magnitude relation between the ppl.A value, the ppl.B value and the ppl.C value and a preset ppl threshold value, if the ppl.A value, the ppl.B value and the ppl.C value are all larger than the ppl threshold value, indicating that the new text is not matched with the A, B, C three scenes, adding the new text as corpus of the D scene into the scene which can be adapted by the disclosure, and adding the new language model as the D language model into the frame model.
In another implementation, by way of example, if the ppl.A value is less than the ppl threshold, it is indicated that the new text matches the A scene, i.e., the new text is text in the A scene. Then, calculating an interpolation coefficient= (ppl threshold-ppl.a)/ppl threshold, taking the interpolation coefficient as the weight of interpolation calculation, interpolating the a language model according to the new language model to obtain an interpolated a 'language model, and then using the a' language model to replace the a language model as the language model of the a scene.
It will be appreciated that the above is only a schematic description taking the A language model as an example, and the embodiments of the present disclosure are not limited thereto: if the ppl of the B language model or the C language model is smaller than the ppl threshold, the new language model is interpolated with the corresponding language model to optimize the language model of the corresponding scene.
Therefore, by adopting the language model construction method provided by the embodiment of the disclosure, the corpus used for training is not collected for a specific scene, but the basic language model corresponding to at least one scene is obtained by pre-training based on the existing corpus (basic text) to serve as an initial language model. After the text to be processed is obtained, if the correlation degree of the text to be processed and a first initial language model in at least one initial language model meets a preset condition, describing that the text to be processed is matched with a scene corresponding to the first initial language model, and optimizing the language model corresponding to the scene by fusing the language model corresponding to the text to be processed and the first initial language model; if the text to be processed is not matched with at least one scene, taking the scene corresponding to the text to be processed as a new scene independent of the at least one scene, and taking the language model of the text to be processed as an initial language model of the new scene. Thus, the training cost can be reduced, and the calculation amount in each model training process is smaller. Furthermore, by inputting the linguistic data in small batches, if the initial language model serving as the basis comprises the language model corresponding to the input linguistic data, continuously optimizing the language model of the related scene in an interpolation mode, and repeatedly training the model based on the historical linguistic data is not needed; if the initial language model serving as the basis does not comprise the language model corresponding to the input corpus, the number of the language models of the adapted scenes can be expanded, so that the language models adapted to a plurality of scenes can be obtained through training.
It should be noted that, in the embodiment of the present disclosure, the electronic device may deploy at least one language model related to the embodiment of the method in one multi-scenario language model training system, and may call the language model according to the scenario in the process of continuously optimizing each scenario language model in the multi-scenario language model training system.
By way of example, the language model may be a language recognition model, a machine translation model, a text recognition model, an information retrieval model, a text extraction model, and the like.
For example, the obtained language model may be used for text processing: a text to be processed is obtained and input into the model, and the model performs the corresponding processing operation on the text to obtain the processing result of the text.
For example, the language recognition model may be used for a speech recognition task, which may be applied in scenarios including a dialogue robot capable of interacting with a user in conversation to meet the user's needs; in this regard, dialogue robots may be classified into chat-type robots, business-type robots, and the like. To realize dialogue communication with a user, the dialogue robot first needs to accurately understand the intention expressed by the user's speech. Generally, the electronic device can convert the user's speech into text, then perform text recognition to obtain the intention or keywords of the user's dialogue, further generate an answer in speech or text according to the intention of the user's dialogue, and output the answer to the user through the dialogue robot.
As shown in fig. 3, fig. 3 is a block diagram of a language model usage scenario provided by an embodiment of the present disclosure. After receiving the text corresponding to any scene in the multi-scene language model training system, the multi-scene language model training system can calculate the ppl value of the existing language model and the corresponding text, and then select the language model with the smallest ppl value as the called model.
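A minimal sketch of this model selection, assuming the perplexity helper and the scene_models dictionary from the earlier sketches, is given below; the function name is hypothetical.

```python
def select_model(scene_models, text):
    # Compute the ppl of the incoming text under every scene model and pick the
    # model with the smallest ppl as the one to be called for the downstream task.
    best_scene = min(scene_models, key=lambda s: perplexity(scene_models[s], text))
    return best_scene, scene_models[best_scene]
```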
It will be appreciated that the above-mentioned method embodiments of the present disclosure may be combined with each other to form combined embodiments without departing from the principle and logic thereof; due to space limitations, such combinations are not described in detail in the present disclosure. It will also be appreciated by those skilled in the art that, in the above-described methods of the embodiments, the specific order of execution of the steps should be determined by their functions and possible inherent logic.
In addition, the disclosure further provides a language model construction apparatus, an electronic device, and a computer-readable storage medium, all of which may be used to implement any of the language model construction methods provided in the disclosure; for the corresponding technical solutions and descriptions, refer to the method sections above, which are not repeated here.
Fig. 4 is a block diagram of a language model construction apparatus according to an embodiment of the present disclosure.
Referring to fig. 4, an embodiment of the present disclosure provides a language model construction apparatus including: an acquisition module 41, a determination module 42, a fusion module 43 and an update module 44. Wherein, each module can realize part or all of the functions in the implementation manner of the method when running.
Detailed implementation manner is shown in the above method implementation manners illustrated in fig. 1 and fig. 2, and will not be repeated here.
It will be appreciated that the above division of the respective modules is merely a division of logic functions, and in actual implementation, each of the above modules may be integrated into a hardware implementation, for example, the functions of the obtaining module 41 in the above implementation may be integrated into an I/O interface implementation, and the functions of the determining module 42, the fusing module 43 and the updating module 44 may be integrated into a processor implementation.
Referring to fig. 5, fig. 5 is a schematic view of an electronic device according to an embodiment of the present disclosure, the electronic device including: at least one processor 501; at least one memory 502, and one or more I/O interfaces 503, coupled between the processor 501 and the memory 502; wherein the memory 502 stores one or more computer programs executable by the at least one processor 501, the one or more computer programs being executable by the at least one processor 501 to enable the at least one processor 501 to perform the language model construction method described above.
The disclosed embodiments also provide a computer-readable storage medium, which may be a volatile or non-volatile computer-readable storage medium, having a computer program stored thereon, where the computer program, when executed by the processor 501, performs the following:
acquiring a text to be processed; determining the correlation degree of the text to be processed and an initial language model of at least one pre-trained scene respectively, wherein the initial language model of the at least one scene is obtained according to basic text training; if the correlation degree between the text to be processed and the first initial language model meets a preset condition, fusing the language model corresponding to the text to be processed with the first initial language model to obtain a fused model, wherein the first initial language model is one of the initial language models of the at least one scene; replacing the first initial language model with the fusion model, to serve as a new initial language model of a first scene corresponding to the first initial language model; if the correlation degree of the text to be processed and the initial language model of the at least one scene does not meet the preset condition, taking the language model corresponding to the text to be processed as an initial language model of a second scene, wherein the second scene is a scene associated with the text to be processed, and the second scene and the at least one scene are different from each other.
In some embodiments, the processor 501 is further configured to segment the text to be processed to obtain a plurality of word segments, each word segment corresponding to a phrase length, the phrase length of each word segment being used to indicate the number of words that compose the corresponding word segment; determining a relevance parameter of each initial language model and the text to be processed according to the word segments, wherein each relevance parameter is used for representing the relevance degree of the corresponding initial language model and the text to be processed; the specific implementation manner of determining the relevance parameter between the third initial language model and the text to be processed according to the word segments is as follows: invoking the third initial language model to calculate a relevance parameter of the third initial language model and the text to be processed according to the word segments, wherein the third initial language model is any one of the initial language models of the at least one scene; and the relevance parameter is used for representing the relevance degree of the text to be processed and the scene corresponding to the third initial language model.
In some embodiments, the processor 501 is further configured to invoke the third initial language model to calculate a confidence parameter of each word segment in the plurality of word segments, where the confidence parameter is used to characterize a confidence level of a word segment formed by the respective word segment under a scenario corresponding to the third initial language model; and calculating the relevance parameter of the third initial language model and the text to be processed according to the reliability parameter of each word segment.
In some embodiments, the relevance parameters between the initial language model and the text to be processed include: the confusion degree parameter between the initial language model and the text to be processed, wherein the degree of correlation between the text to be processed and the first initial language model meets the preset condition, and the confusion degree parameter comprises the following steps: and the confusion degree parameter of the text to be processed and the first initial language model is smaller than a preset threshold value.
In some embodiments, the processor 501 is further configured to determine an interpolation coefficient according to a degree of correlation between the language model corresponding to the text to be processed and the first initial language model; and interpolating the language model of the text to be processed and the first initial language model according to the interpolation coefficient to obtain the fusion model.
In some embodiments, the processor 501 is further configured to calculate a difference between a preset threshold and a confusion parameter corresponding to the first initial language model; and taking the ratio of the difference value to the preset threshold value as the interpolation coefficient.
In some embodiments, the phrase length corresponding to each word segment is an integer greater than or equal to 2, and the processor 501 is further configured to obtain, for each phrase length, a probability distribution of the language model of the text to be processed on each word segment of the corresponding phrase length, as a first probability distribution; obtaining probability distribution of the first initial language model on word segments of phrase lengths in corresponding scene texts as second probability distribution, wherein the scene texts corresponding to the first initial language model come from the basic texts; corresponding to the length of each phrase, taking the interpolation coefficient as algorithm weight, and fusing corresponding first probability distribution with corresponding second probability distribution through a weighting algorithm; and replacing the model parameters of the first initial language model by using the fused data, and taking the language model with the replaced model parameters as the fused model.
Embodiments of the present disclosure also provide a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which when executed in a processor of an electronic device, performs the language model building method described above.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer-readable storage media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable program instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, random Access Memory (RAM), read Only Memory (ROM), erasable Programmable Read Only Memory (EPROM), static Random Access Memory (SRAM), flash memory or other memory technology, portable compact disc read only memory (CD-ROM), digital Versatile Discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable program instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and may include any information delivery media.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present disclosure can be assembly instructions, instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, c++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present disclosure are implemented by personalizing electronic circuitry, such as programmable logic circuitry, field Programmable Gate Arrays (FPGAs), or Programmable Logic Arrays (PLAs), with state information of computer readable program instructions, which can execute the computer readable program instructions.
The computer program product described herein may be embodied in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium, and in another alternative embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), or the like.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, as will be apparent to one skilled in the art, features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics, and/or elements described in connection with other embodiments, unless explicitly stated otherwise. It will be understood by those skilled in the art that various changes in form and details may be made without departing from the scope of the disclosure as set forth in the appended claims.

Claims (10)

1. A method for constructing a language model, comprising:
acquiring a text to be processed;
determining a correlation degree between the text to be processed and an initial language model of each of at least one pre-trained scene, wherein the initial language model of the at least one scene is obtained by training on a basic text;
if the correlation degree between the text to be processed and a first initial language model meets a preset condition, fusing a language model corresponding to the text to be processed with the first initial language model to obtain a fusion model, wherein the first initial language model is one of the initial language models of the at least one scene; and replacing the first initial language model with the fusion model, so that the fusion model serves as a new initial language model of a first scene corresponding to the first initial language model; and
if the correlation degree between the text to be processed and the initial language model of the at least one scene does not meet the preset condition, taking the language model corresponding to the text to be processed as an initial language model of a second scene, wherein the second scene is a scene associated with the text to be processed, and the second scene is different from the at least one scene.
2. The language model construction method according to claim 1, wherein the determining the correlation degree between the text to be processed and the initial language model of each of the at least one pre-trained scene comprises:
dividing the text to be processed into a plurality of word segments, wherein each word segment corresponds to a phrase length, and the phrase length of each word segment is used for indicating the number of words forming the corresponding word segment;
determining a relevance parameter between each initial language model and the text to be processed according to the plurality of word segments, wherein each relevance parameter is used for representing the correlation degree between the corresponding initial language model and the text to be processed; and determining the relevance parameter between a third initial language model and the text to be processed according to the plurality of word segments comprises: invoking the third initial language model to calculate the relevance parameter between the third initial language model and the text to be processed according to the plurality of word segments, wherein the third initial language model is any one of the initial language models of the at least one scene, and the relevance parameter is used for representing the correlation degree between the text to be processed and the scene corresponding to the third initial language model.
3. The language model construction method according to claim 2, wherein the invoking the third initial language model to calculate the relevance parameter between the third initial language model and the text to be processed according to the plurality of word segments comprises:
invoking the third initial language model to calculate a credibility parameter of each word segment in the plurality of word segments, wherein the credibility parameter is used for representing the credibility, in the scene corresponding to the third initial language model, of the phrase formed by the words of the corresponding word segment; and
calculating the relevance parameter between the third initial language model and the text to be processed according to the credibility parameter of each word segment.
4. The language model construction method according to claim 3, wherein the relevance parameter between the initial language model and the text to be processed comprises: a confusion degree (perplexity) parameter between the initial language model and the text to be processed; and the correlation degree between the text to be processed and the first initial language model meeting the preset condition comprises: the confusion degree parameter between the text to be processed and the first initial language model being smaller than a preset threshold value.
5. The language model construction method according to claim 1, wherein the fusing the language model corresponding to the text to be processed with the first initial language model to obtain the fusion model comprises:
determining an interpolation coefficient according to the correlation degree between the language model corresponding to the text to be processed and the first initial language model;
and interpolating the language model of the text to be processed and the first initial language model according to the interpolation coefficient to obtain the fusion model.
6. The language model construction method according to claim 4 or 5, wherein the determining the interpolation coefficient according to the correlation degree between the language model corresponding to the text to be processed and the first initial language model comprises:
calculating a difference value between the preset threshold value and the confusion degree parameter corresponding to the first initial language model;
and taking the ratio of the difference value to the preset threshold value as the interpolation coefficient.
7. The language model construction method according to any one of claims 2 to 5, wherein the phrase length corresponding to each word segment is an integer greater than or equal to 2, and the interpolating the language model of the text to be processed and the first initial language model according to the interpolation coefficient to obtain the fusion model comprises:
for each phrase length, acquiring a probability distribution of the language model of the text to be processed over the word segments of the corresponding phrase length as a first probability distribution, and acquiring a probability distribution of the first initial language model over word segments of the corresponding phrase length in a corresponding scene text as a second probability distribution, wherein the scene text corresponding to the first initial language model comes from the basic text;
for each phrase length, taking the interpolation coefficient as a weight, and fusing the corresponding first probability distribution with the corresponding second probability distribution through a weighting algorithm; and
replacing the model parameters of the first initial language model with the fused data, and taking the language model with the replaced model parameters as the fusion model.
8. A language model construction apparatus, comprising:
an acquisition module, used for acquiring a text to be processed;
a determining module, used for determining a correlation degree between the text to be processed and an initial language model of each of at least one pre-trained scene, wherein the initial language model of the at least one scene is obtained by training on a basic text;
a fusion module, used for fusing a language model corresponding to the text to be processed with a first initial language model to obtain a fusion model if the correlation degree between the text to be processed and the first initial language model meets a preset condition, wherein the first initial language model is one of the initial language models of the at least one scene, and replacing the first initial language model with the fusion model so that the fusion model serves as a new initial language model of a first scene corresponding to the first initial language model; and
an updating module, used for taking the language model corresponding to the text to be processed as an initial language model of a second scene if the correlation degree between the text to be processed and the initial language model of the at least one scene does not meet the preset condition, wherein the second scene is a scene associated with the text to be processed, and the second scene is different from the at least one scene.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores one or more computer programs executable by the at least one processor to enable the at least one processor to perform the language model construction method of any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the language model construction method according to any one of claims 1-7.
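
The claims above describe the method in functional terms. To make the n-gram reading of claims 2-4 concrete, the following is a minimal Python sketch; it is an illustration only, not the patented implementation, and the function names, whitespace tokenisation, and add-one smoothing are assumptions introduced here.

```python
import math
from collections import Counter
from typing import List, Tuple

NGram = Tuple[str, ...]

def split_into_segments(text: str, n: int) -> List[NGram]:
    # Claim 2: divide the text to be processed into word segments of phrase length n.
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def segment_credibility(segment: NGram, ngram_counts: Counter,
                        context_counts: Counter, vocab_size: int) -> float:
    # Claim 3: credibility parameter of one word segment under a scene's n-gram model,
    # here an add-one smoothed conditional probability (the smoothing is an assumption).
    context = segment[:-1]
    return (ngram_counts[segment] + 1) / (context_counts[context] + vocab_size)

def confusion_degree(text: str, n: int, ngram_counts: Counter,
                     context_counts: Counter, vocab_size: int) -> float:
    # Claim 4: confusion degree (perplexity) of the text under the scene model;
    # a lower value means the text is more relevant to that scene.
    segments = split_into_segments(text, n)
    if not segments:
        return float("inf")
    log_prob = sum(
        math.log(segment_credibility(s, ngram_counts, context_counts, vocab_size))
        for s in segments
    )
    return math.exp(-log_prob / len(segments))
```

Under this reading, the preset condition of claim 4 is simply that confusion_degree(...) for the first initial language model is smaller than the preset threshold.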
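Continuing the same illustrative assumptions, claims 5-7 derive an interpolation coefficient from the confusion degree and use it as the weight when fusing the two models' probability distributions over word segments. A hedged sketch, with a simple maximum-likelihood distribution standing in for the model parameters:

```python
from collections import Counter
from typing import Dict, Tuple

NGram = Tuple[str, ...]

def interpolation_coefficient(confusion: float, threshold: float) -> float:
    # Claim 6: (preset threshold - confusion degree) / preset threshold,
    # clipped to [0, 1] so it can be used directly as a weight.
    return max(0.0, min(1.0, (threshold - confusion) / threshold))

def ngram_distribution(counts: Counter) -> Dict[NGram, float]:
    # Probability distribution of a model over word segments of one phrase length.
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}

def fuse_distributions(first: Dict[NGram, float], second: Dict[NGram, float],
                       coeff: float) -> Dict[NGram, float]:
    # Claim 7: weighted fusion of the first probability distribution (text's model)
    # with the second (first initial language model). Which distribution receives
    # the coefficient as its weight is an assumption here; the claim only requires
    # a weighting algorithm that uses the interpolation coefficient as the weight.
    return {
        gram: coeff * first.get(gram, 0.0) + (1.0 - coeff) * second.get(gram, 0.0)
        for gram in set(first) | set(second)
    }
```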
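Finally, the routing behaviour of claim 1 — fuse with an existing scene model when the preset condition is met, otherwise register the text's model as the initial model of a new scene — can be tied together roughly as below. This sketch reuses the hypothetical helpers from the two blocks above; the scene registry, phrase length N, and threshold value are assumptions, not values taken from the disclosure.

```python
from collections import Counter
from typing import Dict

# Assumes split_into_segments, confusion_degree, interpolation_coefficient,
# ngram_distribution and fuse_distributions from the sketches above are in scope.

N = 2               # assumed phrase length (bigrams only, for brevity)
THRESHOLD = 200.0   # assumed preset perplexity threshold

def build_counts(text: str, n: int) -> Counter:
    # Toy "language model corresponding to the text to be processed":
    # raw n-gram counts for a single phrase length.
    return Counter(split_into_segments(text, n))

def context_counts_of(ngram_counts: Counter) -> Counter:
    # Counts of the (n-1)-word contexts needed by segment_credibility.
    ctx = Counter()
    for gram, c in ngram_counts.items():
        ctx[gram[:-1]] += c
    return ctx

def construct(scene_models: Dict[str, Counter], text: str) -> Dict[str, Counter]:
    # Score the text against every pre-trained scene model (claim 1).
    scores = {}
    for scene, counts in scene_models.items():
        vocab = len({word for gram in counts for word in gram}) or 1
        scores[scene] = confusion_degree(text, N, counts, context_counts_of(counts), vocab)

    text_counts = build_counts(text, N)
    best_scene = min(scores, key=scores.get) if scores else None

    if best_scene is not None and scores[best_scene] < THRESHOLD:
        # Preset condition met: fuse the text's model with the first initial language
        # model and let the fusion model replace it for that scene (claim 7's parameter
        # replacement, simplified here to storing the fused probabilities).
        coeff = interpolation_coefficient(scores[best_scene], THRESHOLD)
        fused = fuse_distributions(ngram_distribution(text_counts),
                                   ngram_distribution(scene_models[best_scene]),
                                   coeff)
        scene_models[best_scene] = Counter(fused)
    else:
        # Otherwise the text's model becomes the initial model of a new scene.
        scene_models[f"scene_{len(scene_models) + 1}"] = text_counts
    return scene_models
```

Repeatedly calling construct() with new texts either refines the closest existing scene model or adds a model for a previously unseen scene, without retraining the other scene models.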
CN202310160741.5A 2023-02-23 2023-02-23 Language model construction method, electronic device and computer readable storage medium Pending CN116151391A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310160741.5A CN116151391A (en) 2023-02-23 2023-02-23 Language model construction method, electronic device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310160741.5A CN116151391A (en) 2023-02-23 2023-02-23 Language model construction method, electronic device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN116151391A true CN116151391A (en) 2023-05-23

Family

ID=86373323

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310160741.5A Pending CN116151391A (en) 2023-02-23 2023-02-23 Language model construction method, electronic device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN116151391A (en)

Similar Documents

Publication Publication Date Title
US20230410796A1 (en) Encoder-decoder models for sequence to sequence mapping
US11562148B2 (en) Method, system and computer program product for sentiment analysis
US11151324B2 (en) Generating completed responses via primal networks trained with dual networks
US11605304B2 (en) Learning of policy for selection of associative topic in dialog system
US20100281091A1 (en) Similar Text Search Method, Similar Text Search System, and Similar Text Search Program
CN113053367B (en) Speech recognition method, speech recognition model training method and device
CN113240115B (en) Training method for generating face change image model and related device
CN110598869A (en) Sequence model based classification method and device and electronic equipment
WO2021238337A1 (en) Method and device for entity tagging
CN113591490A (en) Information processing method and device and electronic equipment
CN111209746B (en) Natural language processing method and device, storage medium and electronic equipment
CN116151267A (en) Text generation method and device, electronic equipment and computer readable storage medium
CN111797220A (en) Dialog generation method and device, computer equipment and storage medium
CN116151391A (en) Language model construction method, electronic device and computer readable storage medium
CN116150333A (en) Text matching method, device, electronic equipment and readable storage medium
CN116894431B (en) Text processing model training method, text rewriting method and device and storage medium
CA3120977A1 (en) Method and system for sentiment analysis
CN113536006B (en) Method, apparatus, device, storage medium and computer product for generating picture
WO2024051516A1 (en) Method and apparatus for eliminating dialogue intent ambiguity, and electronic device and non-transitory computer-readable storage medium
CN113220841B (en) Method, apparatus, electronic device and storage medium for determining authentication information
CN111931057B (en) Self-adaptive output sequence recommendation method and system
US20230097150A1 (en) Generating Unique Word Embeddings for Jargon-Specific Tabular Data for Neural Network Training and Usage
US20220351732A1 (en) Profiles for enhanced speech recognition training
CN117951561A (en) Text classification model training method, text classification method and corresponding devices
CN110362808B (en) Text analysis method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination