CN111008702A - Idiom recommendation model training method and device - Google Patents

Idiom recommendation model training method and device

Info

Publication number
CN111008702A
Authority
CN
China
Prior art keywords
idiom
training
common
recommendation
missing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911245163.5A
Other languages
Chinese (zh)
Inventor
郭昱
汪美玲
李长亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Kingsoft Interactive Entertainment Technology Co ltd
Beijing Kingsoft Software Co Ltd
Kingsoft Corp Ltd
Beijing Kingsoft Digital Entertainment Co Ltd
Original Assignee
Chengdu Kingsoft Interactive Entertainment Technology Co ltd
Beijing Kingsoft Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Kingsoft Interactive Entertainment Technology Co ltd and Beijing Kingsoft Software Co Ltd
Priority to CN201911245163.5A
Publication of CN111008702A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri

Abstract

The application provides a method and a device for training an idiom recommendation model. The method comprises the following steps: acquiring a training sample and a corresponding training label, wherein the training sample comprises an original corpus containing missing blanks and a plurality of common idioms, and the training label comprises a recommendation score of each common idiom for each missing blank; and training an idiom recommendation model with the training samples and the corresponding training labels to obtain the idiom recommendation model, wherein the idiom recommendation model associates the training samples with the training labels. Through large-scale corpus training, the application implements an intelligent model that scores common idioms according to context semantic information; with this model, common idioms can be recommended to a user while writing, helping the user quickly produce high-quality text and improving the user experience.

Description

Idiom recommendation model training method and device
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular to a method for training an idiom recommendation model, an idiom recommendation method and apparatus, a computing device, and a computer-readable storage medium.
Background
When writing text, a user often needs commonly used idioms to raise the literary level of the written content. However, when the user's knowledge is limited or it is hard to decide which common idiom suits the current context, the prior art offers no method that actively recommends several idioms according to the context. The user then has to switch to a third-party search platform or to tools such as a dictionary, and subjectively screen the common idioms those tools return, which breaks the continuity and correctness of the user's train of thought and harms the user experience.
Disclosure of Invention
In view of this, embodiments of the present specification provide a method for training an idiom recommendation model, an idiom recommendation method, corresponding apparatuses, a computing device, and a computer-readable storage medium, so as to overcome technical defects in the prior art.
According to a first aspect of embodiments of the present specification, there is provided a method for training an idiom recommendation model, including:
acquiring a training sample and a corresponding training label, wherein the training sample comprises an original corpus containing missing blanks and a plurality of common idioms, and the training label comprises a recommendation score corresponding to each common idiom in each missing blank;
and training an idiom recommendation model with the training samples and the corresponding training labels to obtain the idiom recommendation model, wherein the idiom recommendation model associates the training samples with the training labels.
According to a second aspect of embodiments of the present specification, there is provided an idiom recommendation method including:
acquiring a target sentence, wherein the target sentence contains a missing blank to be filled with a common idiom;
inputting the target sentence into the idiom recommendation model trained by the above method to perform idiom recommendation, and determining at least one recommended idiom corresponding to the missing blank according to the recommendation scores of the common idioms.
According to a third aspect of the embodiments of the present specification, there is provided a training apparatus for an idiom recommendation model, including:
the training data acquisition module is configured to acquire a training sample and a corresponding training label, wherein the training sample comprises an original corpus containing missing blanks and a plurality of common idioms, and the training label comprises a recommendation score corresponding to each common idiom in each missing blank;
the model training module is configured to train an idiom recommendation model with the training samples and the corresponding training labels to obtain the idiom recommendation model, wherein the idiom recommendation model associates the training samples with the training labels.
According to a fourth aspect of embodiments herein, there is provided an idiom recommendation apparatus including:
the prediction acquisition module is configured to acquire a target sentence, wherein the target sentence contains a missing blank to be filled with a common idiom;
and the idiom recommendation module is configured to input the target sentence into the idiom recommendation model trained by the above method to perform idiom recommendation, and to determine at least one recommended idiom corresponding to the missing blank according to the recommendation scores of the common idioms.
According to a fifth aspect of embodiments herein, there is provided a computing device comprising a memory, a processor and computer instructions stored on the memory and executable on the processor, the processor implementing the steps of the method for training the idiom recommendation model or of the idiom recommendation method when executing the instructions.
According to a sixth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method for training the idiom recommendation model or of the idiom recommendation method.
Through large-scale corpus training, the application implements an intelligent model that scores common idioms according to context semantic information; with this model, common idioms can be recommended to a user while writing, helping the user quickly produce high-quality text and improving the user experience.
Drawings
FIG. 1 is a block diagram of a computing device provided by an embodiment of the present application;
FIG. 2 is a flowchart of a training method of an idiom recommendation model provided in an embodiment of the present application;
FIG. 3 is another flowchart of a method for training an idiom recommendation model according to an embodiment of the present application;
FIG. 4 is a flowchart of an idiom recommendation method provided in an embodiment of the present application;
FIG. 5 is another flowchart of an idiom recommendation method provided in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a training apparatus for idiom recommendation models according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an idiom recommendation apparatus according to an embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the application can be implemented in many ways other than those described herein, and those skilled in the art can make similar modifications without departing from the spirit of the application; the application is therefore not limited to the specific implementations disclosed below.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, etc. may be used herein in one or more embodiments to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first can also be referred to as a second and, similarly, a second can also be referred to as a first without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination", depending on the context.
In the present application, a method, an apparatus, a computing device and a computer-readable storage medium for training an idiom recommendation model are provided, which are described in detail in the following embodiments one by one.
FIG. 1 shows a block diagram of a computing device 100, according to an embodiment of the present description. The components of the computing device 100 include, but are not limited to, memory 110 and processor 120. The processor 120 is coupled to the memory 110 via a bus 130 and a database 150 is used to store data.
Computing device 100 also includes access device 140, access device 140 enabling computing device 100 to communicate via one or more networks 160. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. Access device 140 may include one or more of any type of network interface (e.g., a Network Interface Card (NIC)) whether wired or wireless, such as an IEEE802.11 Wireless Local Area Network (WLAN) wireless interface, a worldwide interoperability for microwave access (Wi-MAX) interface, an ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 100 and other components not shown in FIG. 1 may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 1 is for purposes of example only and is not limiting as to the scope of the description. Those skilled in the art may add or replace other components as desired.
Computing device 100 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), a mobile phone (e.g., smartphone), a wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 100 may also be a mobile or stationary server.
The processor 120 may perform the steps of the method shown in fig. 2. Fig. 2 is a schematic flow chart illustrating a method for training an idiom recommendation model according to an embodiment of the present application, including steps 202 to 204.
Step 202: the method comprises the steps of obtaining a training sample and a corresponding training label, wherein the training sample comprises an original corpus containing missing blanks and a plurality of common idioms, and the training label comprises a recommendation score corresponding to each common idiom in each missing blank.
In an embodiment of the present application, the training sample comprises a large amount of original corpora for training and a large number of common idioms. The corpora may be selected from dictionaries, periodicals, magazines, and literary works (for example, well-known Chinese publications such as Southern Weekly, October, and Reader), and each original corpus contains missing blanks to be filled with common idioms and is paired with a plurality of candidate common idioms (typically four-character Chinese idioms). The training label comprises a recommendation score of each common idiom for each missing blank.
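To make the composition of a training sample and its training label concrete, the following is a minimal sketch of how they could be represented in code; the class names, field names and example values are illustrative assumptions and are not structures defined by the present application.

from dataclasses import dataclass
from typing import Dict, List

@dataclass
class TrainingSample:
    corpus_text: str            # original corpus containing missing blanks such as "#blank1#"
    blank_ids: List[str]        # identifiers of the missing blanks in the corpus
    common_idioms: List[str]    # the plurality of common idioms paired with this corpus

@dataclass
class TrainingLabel:
    # recommendation score of every common idiom for every missing blank
    scores: Dict[str, Dict[str, float]]   # blank_id -> {idiom -> score}

# Hypothetical example with one missing blank and two candidate idioms.
sample = TrainingSample(
    corpus_text="夺冠之路上各个方面都需要 #blank1# 才能取得好成绩。",
    blank_ids=["#blank1#"],
    common_idioms=["密切配合", "百尺竿头"],
)
label = TrainingLabel(scores={"#blank1#": {"密切配合": 0.92, "百尺竿头": 0.13}})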
Step 204: training an idiom recommendation model with the training samples and the corresponding training labels to obtain the idiom recommendation model, wherein the idiom recommendation model associates the training samples with the training labels.
In an embodiment of the present application, obtaining training samples and corresponding training labels includes:
s2021: and acquiring a plurality of common idioms and an original corpus from a public corpus database as training samples, wherein the original corpus comprises a plurality of missing blanks to be filled with the common idioms.
S2022: and acquiring at least one recommended idiom corresponding to each missing blank from the public corpus database as a training label.
In the above embodiments, the idiom recommendation model of the present application may obtain the training samples and the training labels directly from a public corpus database, and the public corpus database may be a large-scale Chinese idiom dataset for cloze (gap-filling) tests that is hosted on a software project hosting platform.
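For illustration only, the following sketch shows how such training samples and training labels might be read from a public corpus database; the JSON-lines layout assumed here (a passage whose missing blanks are marked with placeholder tokens, a candidate-idiom list, and a recommended idiom per blank) and the field names are hypothetical, not a format specified by the present application.

import json
from typing import List, Tuple

def load_public_corpus(path: str) -> Tuple[List[dict], List[dict]]:
    # One JSON object per line; the field names below are assumptions for illustration.
    samples, labels = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            samples.append({
                "corpus_text": record["content"],       # passage with "#blank_k#" placeholders
                "common_idioms": record["candidates"],  # common idioms paired with the passage
            })
            labels.append({
                "recommended": record["answers"],       # at least one recommended idiom per blank
            })
    return samples, labels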
Through large-scale corpus training, the application implements an intelligent model that scores common idioms according to context semantic information; with this model, common idioms can be recommended to a user while writing, helping the user quickly produce high-quality text and improving the user experience.
In the embodiment of the present application, as shown in fig. 3, training the idiom recommendation model by the training samples and the corresponding training labels includes steps 302 to 306.
Step 302: randomly dividing the common idioms into a target number of idiom groups, and including the same number of non-repeating common idioms in each of the idiom groups.
In an embodiment of the present application, the idiom recommendation model of the present application may randomly group 3848 collected or recognized common idioms obtained from the public corpus database into a target number of idiom groups, with each idiom group containing the same number of non-repeating common idioms; preferably, the 3848 common idioms may be randomly divided into 385 idiom groups of 10 common idioms each.
Step 304: and constructing a target number of test sets, wherein each test set comprises the original corpus and one idiom group.
In an embodiment of the present application, the idiom recommendation model of the present application may duplicate the original corpus so that each idiom group is paired with a copy of the original corpus to form one test set; preferably, when the 3848 common idioms are randomly divided into 385 idiom groups, 385 test sets are constructed.
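A minimal sketch of steps 302 to 304 follows, assuming the idioms and the original corpus are available as plain Python lists; the padding behaviour for a total that does not divide evenly into groups is an illustrative choice, and it also shows how a single idiom can later receive two recommendation scores.

import random
from typing import Dict, List

def build_test_sets(common_idioms: List[str], original_corpus: List[str],
                    group_size: int = 10, seed: int = 0) -> List[Dict]:
    # Steps 302-304: randomly split the idioms into equal-size groups and pair each
    # group with the original corpus to form one test set.
    rng = random.Random(seed)
    idioms = common_idioms[:]
    rng.shuffle(idioms)

    remainder = len(idioms) % group_size
    if remainder:
        # Reuse a few idioms from the front of the shuffled list so every group has
        # exactly group_size non-repeating idioms (with 3848 idioms and groups of 10,
        # two idioms end up in two groups each).
        idioms.extend(idioms[:group_size - remainder])

    groups = [idioms[i:i + group_size] for i in range(0, len(idioms), group_size)]
    # One test set per idiom group, each sharing the same original corpus.
    return [{"corpus": original_corpus, "idiom_group": group} for group in groups]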
Step 306: according to the attribute information of the missing blank, scoring each common idiom in the idiom group in each test set, and determining a recommendation score corresponding to each common idiom.
In an embodiment of the present application, according to the attribute information of the missing blank, the idiom recommendation model of the present application may score each common idiom in the idiom group of each test set according to how well the semantics of the common idiom fit the context semantics of the missing blank; for example, the 385 idiom groups are scored respectively, so that a recommendation score is obtained for each of the 3848 common idioms.
In one or more embodiments of the present application, scoring each common idiom in the idiom group in each test set according to the attribute information of the missing blank includes:
s3061: and determining the position of the missing blank in the original corpus and the number of the placeholders, and determining the context semantic relationship of the missing blank according to the position of the missing blank in the original corpus.
S3062: and obtaining the probability that the common idiom is the recommended idiom through a loss function according to the context semantic relation of the missing blank and the number of the placeholders of the missing blank in the original corpus.
In the embodiment of the application, according to the context semantic relationship of the missing blank and the number of placeholders of the missing blank in the original corpus, the idiom recommendation model of the application can obtain, through a preset scoring function, a probability between 0 and 1 that the common idiom is a recommended idiom, and this probability can be used as the recommendation score of the common idiom.
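The present application does not fix a particular model architecture or scoring function, so the following sketch only illustrates the idea of S3061 to S3062 under stated assumptions: the context semantic relationship of the missing blank is summarised as a context embedding, the fit with each idiom embedding is a dot product, idioms whose length does not match the number of placeholders are masked out, and a softmax over the idiom group turns the fits into probabilities between 0 and 1.

import math
from typing import List, Sequence

def score_idiom_group(context_vec: Sequence[float],
                      idiom_vecs: List[Sequence[float]],
                      idioms: List[str],
                      placeholder_count: int) -> List[float]:
    # S3061-S3062 (illustrative): semantic fit of each idiom with the blank's context,
    # masked by the blank's placeholder count, turned into probabilities in [0, 1].
    logits = []
    for idiom, vec in zip(idioms, idiom_vecs):
        fit = sum(c * v for c, v in zip(context_vec, vec))   # dot-product semantic fit
        if len(idiom) != placeholder_count:                  # idiom cannot fill the blank
            fit = float("-inf")
        logits.append(fit)

    max_logit = max(logits)
    if max_logit == float("-inf"):                           # no idiom fits the blank at all
        return [0.0] * len(logits)
    exps = [math.exp(l - max_logit) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]                         # recommendation scores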
In one or more embodiments of the present application, determining a recommendation score corresponding to each of the common idioms comprises:
and under the condition that one common idiom corresponds to two recommendation scores, taking the recommendation score with the higher score of the recommendation scores as the final recommendation score of the common idiom.
In an embodiment of the present application, when the idiom recommendation model scores the common idioms, the same common idiom may receive multiple recommendation scores; in that case, the highest of those scores is taken as its final recommendation score. For example, with 3848 common idioms divided into 385 idiom groups of 10 non-repeating common idioms each, the groups provide 3850 slots, so two common idioms each appear in two groups and therefore each receive two recommendation scores; the higher of the two scores is then selected as the final recommendation score of that common idiom.
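A short sketch of this score-merging rule, assuming the per-group scoring results are collected as (idiom, score) pairs; the example values are hypothetical.

from typing import Dict, List, Tuple

def merge_duplicate_scores(scored: List[Tuple[str, float]]) -> Dict[str, float]:
    # An idiom that appears in two idiom groups is scored twice; keep the higher
    # score as its final recommendation score.
    final: Dict[str, float] = {}
    for idiom, score in scored:
        if idiom not in final or score > final[idiom]:
            final[idiom] = score
    return final

# Hypothetical usage: "百尺竿头" was placed into two groups and scored twice.
print(merge_duplicate_scores([("百尺竿头", 0.41), ("密切配合", 0.88), ("百尺竿头", 0.67)]))
# -> {'百尺竿头': 0.67, '密切配合': 0.88}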
The present application trains the model with the full set of 3848 existing common idioms and scores all 3848 common idioms in groups, so that the idiom recommendation model of the present application can score idioms across a wide variety of contexts, is applicable to large-scale writing scenarios, meets the needs of a large number of users, and maintains prediction accuracy.
The processor 120 may also perform the steps of the method shown in fig. 4. Fig. 4 is a schematic flow chart illustrating an idiom recommendation method according to an embodiment of the present application, including steps 402 to 404.
Step 402: obtaining a target sentence, wherein the target sentence contains a missing blank to be filled with a common idiom.
Step 404: inputting the target sentence into the idiom recommendation model trained by the above method to perform idiom recommendation, and determining at least one recommended idiom corresponding to the missing blank according to the recommendation scores of the common idioms.
In an embodiment of the application, a system or terminal of the application can obtain a text paragraph input by a user, determine from it a target sentence containing a missing blank to be filled with a common idiom, input the target sentence into the idiom recommendation model trained by the method of the application to obtain a recommendation score for each test sample formed by filling each of the 3848 common idioms into the missing blank of the target sentence, and then determine at least one recommended idiom corresponding to the missing blank according to the recommendation scores of the common idioms.
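As a sketch of this inference step, the trained idiom recommendation model is represented below by a stand-in callable `model_score`; its signature and the "#blank#" placeholder convention are assumptions for illustration only.

from typing import Callable, Dict, List

def score_all_idioms(target_sentence: str,
                     common_idioms: List[str],
                     model_score: Callable[[str, str], float]) -> Dict[str, float]:
    # Fill the missing blank (marked "#blank#") with every common idiom in turn and
    # let the trained model assign each filled sentence a recommendation score.
    scores: Dict[str, float] = {}
    for idiom in common_idioms:
        filled = target_sentence.replace("#blank#", idiom, 1)
        scores[idiom] = model_score(filled, idiom)
    return scores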
In one or more embodiments of the present application, as shown in fig. 5, determining at least one recommended idiom corresponding to the missing blank according to the recommendation score of the common idiom includes steps 502 to 504.
Step 502: ranking the common idioms by recommendation score from high to low.
Step 504: acquiring at least one common idiom, in that order, as a recommended idiom.
In the embodiment of the application, the system or terminal of the application can rank the 3848 common idioms by their recommendation scores from high to low, with idioms whose scores are identical placed side by side, and then take the first n common idioms in that order as recommended idioms, where n is preferably a positive integer greater than or equal to 1 and less than or equal to 10. For example, a text paragraph entered by a user states that the overall level of a world championship is far higher than that of the Asian Cup, that winning requires close coordination in every respect, followed by a missing blank, and that the coaching and commanding staff must break with conservative thinking, use its players skillfully, followed by another missing blank, arrange its tactics flexibly, command properly, and promote new players through competition so that they perform well and form new fighting strength. The passage therefore contains two missing blanks for which idiom recommendation is needed; the idiom recommendation model of the application predicts the two missing blanks separately, scores a plurality of candidate common idioms for each blank, and selects at least one recommended idiom for each blank according to the recommendation score of each candidate.
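A sketch of steps 502 to 504 follows, assuming the recommendation scores are available as an idiom-to-score mapping; keeping idioms with identical scores side by side and limiting n to the range 1 to 10 mirrors the behaviour described above, and the example values are hypothetical.

from itertools import groupby
from typing import Dict, List

def top_n_recommendations(scores: Dict[str, float], n: int = 5) -> List[List[str]]:
    # Steps 502-504: rank idioms by recommendation score from high to low, keep
    # idioms with identical scores side by side, and take the leading idioms.
    n = max(1, min(n, 10))                                    # n assumed to be 1..10
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    tiers = [[idiom for idiom, _ in tier]
             for _, tier in groupby(ranked, key=lambda kv: kv[1])]
    result, count = [], 0
    for tier in tiers:
        if count >= n:
            break
        result.append(tier)
        count += len(tier)
    return result

# Hypothetical usage: two idioms share the top score and are returned together.
print(top_n_recommendations({"密切配合": 0.9, "同舟共济": 0.9, "百尺竿头": 0.3}, n=2))
# -> [['密切配合', '同舟共济']]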
By applying the trained idiom recommendation model to recommend common idioms for the missing blank in a target sentence, the application uses artificial intelligence to help the user select at least one relatively suitable recommended idiom from the 3848 existing common idioms, so that the user has multiple choices and each choice has relatively high accuracy.
Corresponding to the above method embodiment, the present specification further provides an embodiment of a training apparatus for idiom recommendation models, and fig. 6 shows a schematic structural diagram of the training apparatus for idiom recommendation models according to an embodiment of the present specification. As shown in fig. 6, the apparatus includes:
a training data obtaining module 601, configured to obtain a training sample and a corresponding training label, where the training sample includes an original corpus including missing blanks and a plurality of common idioms, and the training label includes a recommendation score corresponding to each common idiom in each missing blank;
a model training module 602 configured to train an idiom recommendation model with the training samples and the corresponding training labels to obtain the idiom recommendation model, wherein the idiom recommendation model associates the training samples with the training labels.
Optionally, the training data obtaining module 601 includes:
the system comprises a sample acquisition unit, a data processing unit and a data processing unit, wherein the sample acquisition unit is configured to acquire a plurality of common idioms and an original corpus from a public corpus database as training samples, and the original corpus comprises a plurality of missing blanks to be filled with the common idioms;
and the label acquisition unit is configured to acquire at least one recommended idiom corresponding to each missing blank from the public corpus database as a training label.
Optionally, the model training module 602 includes:
a idiom grouping unit configured to randomly divide the plurality of common idioms into a target number of idiom groups and include the same number of non-repeating common idioms in each of the idiom groups;
a set construction unit configured to construct a target number of test sets, each of the test sets including the original corpus and one of the idiomatic groups;
and the idiom scoring unit is configured to score each common idiom in the idiom group in each test set according to the attribute information of the missing blank, and determine a recommendation score corresponding to each common idiom.
Optionally, the idiom scoring unit includes:
an attribute information subunit, configured to determine the positions of the missing blanks in the original corpus and the number of placeholders, and determine the context semantic relationship of the missing blanks according to the positions of the missing blanks in the original corpus;
and the probability determining subunit is configured to obtain the probability that the common idiom is the recommended idiom through a loss function according to the context semantic relation of the missing blank and the number of the placeholders of the missing blank in the original corpus.
Optionally, the idiom scoring unit includes:
and the score selection subunit is configured to take the recommendation score with the higher score of the recommendation scores as the final recommendation score of the common idioms under the condition that one common idiom corresponds to two recommendation scores.
Through large-scale corpus training, the application implements an intelligent model that scores common idioms according to context semantic information; with this model, common idioms can be recommended to a user while writing, helping the user quickly produce high-quality text and improving the user experience.
Corresponding to the above method embodiment, the present specification further provides an idiom recommendation apparatus embodiment, and fig. 7 shows a schematic structural diagram of an idiom recommendation apparatus according to an embodiment of the present specification. As shown in fig. 7, the apparatus includes:
a prediction acquisition module 701 configured to acquire a target sentence, wherein the target sentence contains a missing blank to be filled with a common idiom;
and an idiom recommendation module 702 configured to input the target sentence into the idiom recommendation model trained by the method described above to perform idiom recommendation, and to determine at least one recommended idiom corresponding to the missing blank according to the recommendation scores of the common idioms.
Optionally, the idiom recommendation module 702 includes:
a score sorting unit configured to rank the common idioms by recommendation score from high to low;
and a sequence acquisition unit configured to acquire at least one common idiom, in that order, as a recommended idiom.
By applying the trained idiom recommendation model to recommend common idioms for the missing blank in a target sentence, the application uses artificial intelligence to help the user select at least one relatively suitable recommended idiom from the 3848 existing common idioms, so that the user has multiple choices and each choice has relatively high accuracy.
An embodiment of the present application further provides a computing device, including a memory, a processor, and computer instructions stored on the memory and executable on the processor, where the processor executes the instructions to implement the following steps:
acquiring a training sample and a corresponding training label, wherein the training sample comprises an original corpus containing missing blanks and a plurality of common idioms, and the training label comprises a recommendation score corresponding to each common idiom in each missing blank;
and training an idiom recommendation model with the training samples and the corresponding training labels to obtain the idiom recommendation model, wherein the idiom recommendation model associates the training samples with the training labels.
An embodiment of the present application further provides a computing device, including a memory, a processor, and computer instructions stored on the memory and executable on the processor, where the processor executes the instructions to implement the following steps:
acquiring a target sentence, wherein the target sentence contains a missing blank to be filled with a common idiom;
inputting the target sentence into the idiom recommendation model trained by the above method to perform idiom recommendation, and determining at least one recommended idiom corresponding to the missing blank according to the recommendation scores of the common idioms.
An embodiment of the present application further provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method for training the idiom recommendation model or of the idiom recommendation method described above.
The above is an illustrative scheme of the computer-readable storage medium of this embodiment. It should be noted that the technical solution of the computer-readable storage medium belongs to the same concept as the above-mentioned method for training the idiom recommendation model and the idiom recommendation method; for details not described in the technical solution of the computer-readable storage medium, reference can be made to the description of those methods.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code which may be in the form of source code, object code, an executable file or some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, etc. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
It should be noted that, for the sake of simplicity, the above-mentioned method embodiments are described as a series of acts or combinations, but those skilled in the art should understand that the present application is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present application disclosed above are intended only to aid in the explanation of the application. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the application and the practical application, to thereby enable others skilled in the art to best understand and utilize the application. The application is limited only by the claims and their full scope and equivalents.

Claims (11)

1. A method for training an idiom recommendation model, characterized by comprising the following steps:
acquiring a training sample and a corresponding training label, wherein the training sample comprises an original corpus containing missing blanks and a plurality of common idioms, and the training label comprises a recommendation score corresponding to each common idiom in each missing blank;
and training an idiom recommendation model with the training samples and the corresponding training labels to obtain the idiom recommendation model, wherein the idiom recommendation model associates the training samples with the training labels.
2. The method of claim 1, wherein obtaining training samples and corresponding training labels comprises:
acquiring a plurality of common idioms and an original corpus from a public corpus database as training samples, wherein the original corpus comprises a plurality of missing blanks to be filled with the common idioms;
and acquiring at least one recommended idiom corresponding to each missing blank from the public corpus database as a training label.
3. The method of claim 1, wherein training the idiom recommendation model with the training samples and the corresponding training labels comprises:
randomly dividing the common idioms into a target number of idiom groups, and including the same number of non-repeated common idioms in each of the idiom groups;
constructing a target number of test sets, wherein each test set comprises the original corpus and one idiom group;
according to the attribute information of the missing blank, scoring each common idiom in the idiom group in each test set, and determining a recommendation score corresponding to each common idiom.
4. The method of claim 3, wherein scoring each common idiom in the idiom group in each of the test sets according to the attribute information of the missing blank comprises:
determining the position of the missing blank in the original corpus and the number of placeholders, and determining the context semantic relationship of the missing blank according to the position of the missing blank in the original corpus;
and obtaining the probability that the common idiom is the recommended idiom through a loss function according to the context semantic relation of the missing blank and the number of the placeholders of the missing blank in the original corpus.
5. The method of claim 3, wherein determining a recommendation score for each of the common idioms comprises:
and under the condition that one common idiom corresponds to two recommendation scores, taking the recommendation score with the higher score of the recommendation scores as the final recommendation score of the common idiom.
6. An idiom recommendation method, comprising:
acquiring a target sentence, wherein the target sentence contains a missing blank to be filled with a common idiom;
inputting the target sentence into an idiom recommendation model trained according to the method of any one of claims 1 to 5 to perform idiom recommendation, and determining at least one recommended idiom corresponding to the missing blank according to the recommendation scores of the common idioms.
7. The method of claim 6, wherein determining at least one recommended idiom corresponding to the missing blank according to the recommendation score of the common idiom comprises:
ranking the common idioms by recommendation score from high to low;
and acquiring at least one common idiom, in that order, as a recommended idiom.
8. A training apparatus for an idiom recommendation model, characterized by comprising:
the training data acquisition module is configured to acquire a training sample and a corresponding training label, wherein the training sample comprises an original corpus containing missing blanks and a plurality of common idioms, and the training label comprises a recommendation score corresponding to each common idiom in each missing blank;
the model training module is configured to train an idiom recommendation model with the training samples and the corresponding training labels to obtain the idiom recommendation model, wherein the idiom recommendation model associates the training samples with the training labels.
9. An idiom recommendation apparatus, comprising:
a prediction acquisition module configured to acquire a target sentence, wherein the target sentence contains a missing blank to be filled with a common idiom;
and an idiom recommendation module configured to input the target sentence into an idiom recommendation model trained by the method of any one of claims 1 to 5 to perform idiom recommendation, and determine at least one recommended idiom corresponding to the missing blank according to the recommendation scores of the common idioms.
10. A computing device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, wherein the processor implements the steps of the method of any of claims 1-5 or 6-7 when executing the instructions.
11. A computer-readable storage medium storing computer instructions, which when executed by a processor, perform the steps of the method of any one of claims 1-5 or 6-7.
CN201911245163.5A 2019-12-06 2019-12-06 Idiom recommendation model training method and device Pending CN111008702A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911245163.5A CN111008702A (en) 2019-12-06 2019-12-06 Idiom recommendation model training method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911245163.5A CN111008702A (en) 2019-12-06 2019-12-06 Idiom recommendation model training method and device

Publications (1)

Publication Number Publication Date
CN111008702A true CN111008702A (en) 2020-04-14

Family

ID=70114087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911245163.5A Pending CN111008702A (en) 2019-12-06 2019-12-06 Idiom recommendation model training method and device

Country Status (1)

Country Link
CN (1) CN111008702A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013050268A1 (en) * 2011-10-06 2013-04-11 Thomson Licensing Method and apparatus for generating an explanation for a recommendation
CN103685711A (en) * 2012-09-21 2014-03-26 崔玉珩 Call control and processing method based on automatic connection of mobile phone
CN109800414A (en) * 2018-12-13 2019-05-24 科大讯飞股份有限公司 Faulty wording corrects recommended method and system
CN110442735A (en) * 2019-08-13 2019-11-12 北京金山数字娱乐科技有限公司 Idiom near-meaning word recommendation method and device
CN110532356A (en) * 2019-08-30 2019-12-03 联想(北京)有限公司 Information processing method, device and storage medium
CN110532562A (en) * 2019-08-30 2019-12-03 联想(北京)有限公司 Neural network training method, Chinese idiom misuse detection method, device and electronic equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113538079A (en) * 2020-04-17 2021-10-22 北京金山数字娱乐科技有限公司 Recommendation model training method and device, and recommendation method and device
WO2021159816A1 (en) * 2020-09-04 2021-08-19 平安科技(深圳)有限公司 Idiom blank-filling question answer selection method and apparatus, and computer device


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination