CN112015885A - Deep learning-based Chinese model essay generation method

Info

Publication number: CN112015885A
Application number: CN202010891348.XA
Authority: CN
Inventors: 樊星
Original assignee: Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd
Current assignee: Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd
Priority date: 2020-08-30
Filing date: 2020-08-30
Publication date: 2020-12-01

Abstract

The invention discloses a deep learning-based Chinese model essay generation method, which addresses the problem that no existing Chinese model essay generation method can quickly and accurately generate the Chinese model essay a user expects. The method comprises the following steps: classifying a plurality of model essays according to model essay type to obtain model essay sets of different types; constructing a model essay model for each model essay in each model essay set; for each model essay set, fusing the model essay models corresponding to all model essays in the set into a class fusion model corresponding to that set; receiving Chinese writing information input by a user; calculating the matching value between the Chinese writing information and each class fusion model according to a preset matching value algorithm; and outputting the model essays in the model essay set corresponding to the class fusion model with the maximum calculated matching value. By displaying to the user the model essays in the model essay set whose class fusion model has the largest matching value, the method ensures that the displayed model essays are those the user expects.

Description

Deep learning-based Chinese model essay generation method

Technical Field

The invention relates to the technical field of model essay generation, and in particular to a deep learning-based Chinese model essay generation method.

Background

With the rapid development of the internet, the number of internet users keeps growing, and information sharing ties users ever more closely together. Information sharing not only makes resource allocation more reasonable, but also saves social cost and creates more wealth. Many kinds of resources are shared in this way, including model essays, papers, software programs, movies, music, and so on. For model essay resources, a user can obtain model essays from certain target websites over the internet. However, when searching for model essays, a keyword query often returns a very large number of model essays in differing formats, and the user has to spend a great deal of time weeding out the irrelevant ones. A model essay generation method that can quickly and accurately generate the model essay the user needs is therefore urgently required.

Disclosure of Invention

The invention provides a deep learning-based Chinese model essay generation method, which addresses the problem that no existing Chinese model essay generation method can quickly and accurately generate the Chinese model essay a user expects. The method calculates the matching value between the Chinese writing information input by the user and each class fusion model, and then displays to the user the model essays in the model essay set corresponding to the class fusion model with the largest matching value, so that the displayed model essays are those the user expects.

The invention provides a deep learning-based Chinese model essay generation method, which comprises the following steps:

classifying a plurality of model essays crawled from a target website according to model essay type to obtain model essay sets of different types;

constructing a model essay model for each model essay in each model essay set;

for each model essay set, fusing the model essay models corresponding to all model essays in the set into a class fusion model corresponding to that set;

receiving Chinese writing information input by a user; the Chinese writing information at least comprises a writing theme and a writing type;

calculating the matching value between the Chinese writing information and each class fusion model according to a preset matching value algorithm;

and outputting the model essays in the model essay set corresponding to the class fusion model with the maximum calculated matching value.

In one embodiment, constructing the model essay model of each model essay in each model essay set includes:

extracting the model essay key entries of each model essay in each model essay set and the entry attributes corresponding to those key entries;

and constructing the model essay model of each model essay according to all the entry attributes of that model essay.

In one embodiment, for each model essay set, fusing the model essay models corresponding to all model essays in the set into a class fusion model corresponding to that set includes:

for each model essay set, calculating the fusion value of the model essay models of all model essays in the current model essay set according to the model essay model of each model essay in the current set, the entry attributes of each model essay, and a preset model essay fusion value algorithm;

judging whether the fusion value of the model essay models corresponding to the current model essay set is smaller than a preset fusion value;

if the fusion value of the model essay models corresponding to the current model essay set is smaller than the preset fusion value, generating a corresponding class fusion model according to all model essays in the current model essay set and a preset model essay generation model;

and if the fusion value of the model essay models corresponding to the current model essay set is not smaller than the preset fusion value, crawling a plurality of model essays from the target website, and re-executing the step of classifying the plurality of model essays crawled from the target website according to model essay type to obtain model essay sets of different types.

In one embodiment, the preset model essay fusion value algorithm is:

wherein R is the fusion value of the model essay models of all model essays in the current model essay set, and M is the total number of model essays in the current model essay set;

a first symbol denotes the model fusion factor of the j-th model essay model in the current model essay set, with value range [0.5, 0.8];

a second symbol denotes the model fusion factor of the (j+1)-th model essay model in the current model essay set, with value range [0.5, 0.8];

a third symbol denotes the standard fusion factor of all model essay models in the current model essay set and is a preset value; exp() denotes the exponential function; h is the total number of entry attributes corresponding to the j-th model essay model in the current model essay set; η_h is the attribute value of the h-th entry attribute of the j-th model essay model in the current model essay set; and η denotes the average of the attribute values of all entry attributes of the j-th model essay model in the current model essay set.
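Of the quantities defined here, only η has a definition that is complete in this text: it is the average of the attribute values of one model essay model. The sketch below computes that average. The function name and the sample attribute values are illustrative assumptions, and the fusion value R is deliberately not computed, because its formula is not reproduced in this text.

```python
from statistics import fmean


def mean_attribute_value(attribute_values):
    """Return eta: the average attribute value of one model essay model.

    `attribute_values` maps an entry attribute name to its numeric value.
    """
    if not attribute_values:
        raise ValueError("a model essay model needs at least one entry attribute")
    return fmean(attribute_values.values())


# Illustrative attribute values for one model essay model (not from the patent).
sample_model = {"theme": 0.9, "keyword": 0.7, "technical": 0.8}
eta = mean_attribute_value(sample_model)
```

Each η_h can then be compared with η when the fusion value is evaluated; how the two are combined depends on the formula that is not available here.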

In one embodiment, the preset matching value algorithm is as follows:

wherein P_i denotes the matching value between the Chinese writing information input by the user and the i-th class fusion model; n denotes the total number of class fusion models obtained in advance; β_i is the association value between the writing theme and the i-th class fusion model, with value range [0, 1]; a second association value, also indexed by i (its symbol is missing from this text), relates the writing type to the i-th class fusion model and has value range [0.8, 1]; and F_i is the model confidence value of the i-th class fusion model, with value range [3, 5].
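The value ranges stated for these inputs can be checked before any matching value is computed. The following is a minimal sketch, assuming a hypothetical `MatchInputs` record whose field names are not from the patent. The matching value formula itself is not reproduced in this text, so it is not implemented here.

```python
from dataclasses import dataclass


def _check_range(name, value, low, high):
    """Raise ValueError if value lies outside the closed interval [low, high]."""
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside [{low}, {high}]")


@dataclass(frozen=True)
class MatchInputs:
    """Per-class inputs to the matching value for the i-th class fusion model."""
    theme_association: float  # beta_i, stated range [0, 1]
    type_association: float   # writing-type association value, stated range [0.8, 1]
    model_confidence: float   # F_i, stated range [3, 5]

    def __post_init__(self):
        _check_range("theme_association", self.theme_association, 0.0, 1.0)
        _check_range("type_association", self.type_association, 0.8, 1.0)
        _check_range("model_confidence", self.model_confidence, 3.0, 5.0)


# Illustrative values inside the stated ranges.
inputs = MatchInputs(theme_association=0.6, type_association=0.9, model_confidence=4.0)
```

Constructing a record with an out-of-range value, for example a writing-type association of 0.5, raises `ValueError` instead of silently producing a matching value.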

In one embodiment, the target website is a website related to articles.

In one embodiment, the model essay key entries may include: a model essay theme, model essay key vocabulary, model essay technical content, model essay language modification, and model essay description fluency; the corresponding entry attributes may include: a theme attribute, a keyword attribute, a technical attribute, a language modification attribute, and a description fluency attribute.

In the deep learning-based Chinese model essay generation method provided by the invention, model essays are crawled from a target website and classified to obtain model essay sets of different types; the model essay key entries of each model essay and their corresponding entry attributes are determined, and a model essay model is constructed for each model essay; the fusion value of the model essay models of all model essays in a model essay set is then calculated according to a preset model essay fusion value algorithm, and when that fusion value is smaller than the preset fusion value, a corresponding class fusion model is generated from all model essays in the set and a preset model essay generation model, which ensures the reliability and accuracy of the constructed model essay models. Finally, after the Chinese writing information input by the user, such as the writing theme and writing type, is obtained, the matching value between the Chinese writing information and each class fusion model is calculated, and the model essays in the model essay set corresponding to the class fusion model with the highest matching value are displayed to the user, so that the model essays displayed are those the user expects.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:

Fig. 1 is a flowchart of a first embodiment of the deep learning-based Chinese model essay generation method according to the present invention;

Fig. 2 is a flowchart of step S102 in Fig. 1;

Fig. 3 is a flowchart of a second embodiment of the deep learning-based Chinese model essay generation method according to the present invention.

Detailed Description

The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.

Fig. 1 is a flowchart of a first embodiment of the deep learning-based Chinese model essay generation method according to the present invention. As shown in fig. 1, the method comprises the steps of:

S101: classifying a plurality of model essays crawled from a target website according to model essay type to obtain model essay sets of different types;

in this embodiment, the model essays are classified by model essay type, and the target websites are websites related to articles, such as the Hopkins, the Wanfang, the Uppur, the Baidu and the like.
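Step S101 can be sketched as a grouping operation. The text does not say how a model essay's type is determined, so this sketch assumes each crawled record already carries a `type` label; the record layout, the function name, and the sample titles are illustrative assumptions.

```python
from collections import defaultdict


def group_by_type(essays):
    """Group crawled model essays into one model essay set per type."""
    sets = defaultdict(list)
    for essay in essays:
        sets[essay["type"]].append(essay)
    return dict(sets)


# Illustrative records; in practice these would come from the crawler.
crawled = [
    {"title": "Leave request letter", "type": "application"},
    {"title": "Ode to my hometown", "type": "emotion"},
    {"title": "Notice of meeting", "type": "application"},
]
essay_sets = group_by_type(crawled)
```

Here `essay_sets` holds two model essay sets, one per type, and each later step operates on one set at a time.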

S102: constructing a model essay model of each model essay in each model essay set;

in this embodiment, step S102 may include:

S201: extracting the model essay key entries of each model essay in each model essay set and the entry attributes corresponding to those key entries;

in this step, the model essay key entries may include: a model essay theme, model essay key vocabulary, model essay technical content, model essay language modification, model essay description fluency, and the like; the corresponding entry attributes may include: a theme attribute, a keyword attribute, a technical attribute, a language modification attribute, a description fluency attribute, and the like. Once all the model essay key entries of each model essay in a model essay set of a given type have been extracted, a model essay attribute calibration model can be used to assign the corresponding entry attribute to each key entry of each model essay.

S202: constructing the model essay model of each model essay according to all the entry attributes of that model essay.

In this step, the model essay model includes the entry attributes and their attribute values.
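Steps S201 and S202 can be sketched as building a small record per model essay. The attribute calibration model is not specified in the text, so the sketch takes it as a caller-supplied function; `ModelEssayModel`, `build_model`, and the toy length-based calibration are all illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ModelEssayModel:
    """Entry attributes of one model essay, mapped to their attribute values."""
    essay_id: str
    attributes: dict = field(default_factory=dict)


def build_model(essay_id, key_entries, calibrate):
    """Build a model essay model from extracted key entries (S201, S202).

    `calibrate` stands in for the attribute calibration model: it maps one
    (entry name, entry content) pair to a numeric attribute value.
    """
    values = {name: calibrate(name, content) for name, content in key_entries.items()}
    return ModelEssayModel(essay_id=essay_id, attributes=values)


def toy_calibrate(name, content):
    """Toy calibration: score by content length, capped at 1.0 (illustrative only)."""
    return min(len(content) / 20.0, 1.0)


model = build_model(
    "essay-001",
    {"theme": "hometown", "key_vocabulary": "river, bridge, harvest"},
    toy_calibrate,
)
```

Passing the calibration step in as a function keeps the record structure independent of whichever calibration model is actually used.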

S103: for each model essay set, fusing the model essay models corresponding to all model essays in the set into a class fusion model corresponding to that set;

in this embodiment, a class fusion model may be a single-class fusion model (an application class, an emotion class, or a scientific research class) or a combined fusion model. For example, a combined model may fuse the application class and the emotion class: model A generates application-class model essays, model B generates emotion-class model essays, and model C, obtained by fusing model A and model B, can generate both application-class and emotion-class model essays.
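The model A, B, and C example can be illustrated by tracking which model essay types each class fusion model covers. This sketch models only the coverage relationship, not the generation itself; the class and function names are hypothetical.

```python
class ClassFusionModel:
    """A class fusion model tagged with the model essay types it covers."""

    def __init__(self, name, essay_types):
        self.name = name
        self.essay_types = frozenset(essay_types)

    def can_generate(self, essay_type):
        """Return True if this model covers the given model essay type."""
        return essay_type in self.essay_types


def fuse(name, *models):
    """Combine several class fusion models into one covering all their types."""
    covered = frozenset().union(*(m.essay_types for m in models))
    return ClassFusionModel(name, covered)


model_a = ClassFusionModel("A", {"application"})
model_b = ClassFusionModel("B", {"emotion"})
model_c = fuse("C", model_a, model_b)
```

After fusion, `model_c` covers both the application class and the emotion class, while `model_a` and `model_b` remain single-class models.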

S104: receiving Chinese writing information input by a user; the Chinese writing information at least comprises a writing theme and a writing type;

S105: calculating the matching value between the Chinese writing information and each class fusion model according to a preset matching value algorithm;

S106: outputting the model essays in the model essay set corresponding to the class fusion model with the maximum calculated matching value.

In this embodiment, the matching value between the Chinese writing information and each class fusion model is calculated, and the model essays in the model essay set corresponding to the class fusion model with the largest matching value are then obtained; these are the model essays the user expects.
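The selection in step S106 amounts to taking the class with the maximum matching value. In the sketch below the matching values are supplied as given numbers, since their formula is not reproduced in this text; the function name, class labels, and sample values are illustrative assumptions.

```python
def select_model_essays(matching_values, essay_sets):
    """Return the best-matching class and its model essay set (S106).

    `matching_values` maps a class fusion model label to its matching value;
    `essay_sets` maps the same labels to the corresponding model essay sets.
    """
    if not matching_values:
        raise ValueError("no class fusion models to match against")
    best_class = max(matching_values, key=matching_values.get)
    return best_class, essay_sets[best_class]


# Illustrative matching values, one per class fusion model.
matching_values = {"application": 2.1, "emotion": 3.4, "scientific research": 1.7}
essay_sets = {
    "application": ["Leave request letter"],
    "emotion": ["Ode to my hometown", "A letter to my mother"],
    "scientific research": ["Survey of crop yields"],
}
best_class, essays = select_model_essays(matching_values, essay_sets)
```

If two classes tie for the maximum, `max` returns the first one encountered; the text does not say how ties should be broken.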

In the deep learning-based Chinese model essay generation method provided by this embodiment of the invention, model essays are crawled from a target website and classified to obtain model essay sets of different types; the model essay key entries of each model essay and their corresponding entry attributes are determined, a model essay model is constructed for each model essay, and the model essay models in each model essay set are then fused into the corresponding class fusion model. Finally, after the Chinese writing information input by the user, such as the writing theme and writing type, is obtained, the matching value between the Chinese writing information and each class fusion model is calculated, and the model essays in the model essay set corresponding to the class fusion model with the highest matching value are displayed to the user, so that the model essays displayed are those the user expects.

Fig. 3 is a flowchart of a second embodiment of the deep learning-based Chinese model essay generation method according to the present invention. As shown in fig. 3, the method comprises the steps of:

S301: classifying a plurality of model essays crawled from a target website according to model essay type to obtain model essay sets of different types;

S302: extracting the model essay key entries of each model essay in each model essay set and the entry attributes corresponding to those key entries;

S303: constructing a model essay model of each model essay according to all the entry attributes of that model essay;

S304: for each model essay set, calculating the fusion value of the model essay models of all model essays in the current model essay set according to the model essay model of each model essay in the current set, the entry attributes of each model essay, and a preset model essay fusion value algorithm;

in this embodiment, the preset model essay fusion value algorithm is as follows:

a first symbol denotes the model fusion factor of the j-th model essay model in the current model essay set, with value range [0.5, 0.8]; the larger the value, the better the fusion, and the smaller the value, the worse the fusion;

a second symbol denotes the model fusion factor of the (j+1)-th model essay model in the current model essay set, with value range [0.5, 0.8]; the larger the value, the better the fusion, and the smaller the value, the worse the fusion;

S305: judging whether the fusion value of the model essay models corresponding to the current model essay set is smaller than a preset fusion value; if so, executing step S307; otherwise, executing step S306;

S306: crawling a plurality of model essays from the target website, and returning to step S301;

S307: generating a corresponding class fusion model according to all the model essays in the current model essay set and a preset model essay generation model;

S308: receiving Chinese writing information input by a user; the Chinese writing information at least comprises a writing theme and a writing type;

S309: calculating the matching value between the Chinese writing information and each class fusion model according to a preset matching value algorithm;

in this embodiment, the preset matching value algorithm is as follows:

wherein P_i denotes the matching value between the Chinese writing information input by the user and the i-th class fusion model; n denotes the total number of class fusion models obtained in advance; β_i is the association value between the writing theme and the i-th class fusion model, with value range [0, 1], and the larger this association value, the more closely the writing theme is associated with the i-th class fusion model; a second association value, also indexed by i (its symbol is missing from this text), relates the writing type to the i-th class fusion model and has value range [0.8, 1], and the larger this association value, the more closely the writing type is associated with the i-th class fusion model; F_i is the model confidence value of the i-th class fusion model, with value range [3, 5], and the larger the model confidence value, the higher the model confidence of the i-th class fusion model.

S310: outputting the model essays in the model essay set corresponding to the class fusion model with the maximum calculated matching value.

In the deep learning-based Chinese model essay generation method provided by this embodiment of the invention, model essays are crawled from a target website and classified to obtain model essay sets of different types; the model essay key entries of each model essay and their corresponding entry attributes are determined, and a model essay model is constructed for each model essay; the fusion value of the model essay models of all model essays in a model essay set is then calculated according to a preset model essay fusion value algorithm. When that fusion value is smaller than the preset fusion value, a corresponding class fusion model is generated from all model essays in the set and a preset model essay generation model; when it is not smaller than the preset fusion value, model essays are crawled from the target website again and the model essay sets are rebuilt before the class fusion model is obtained, which ensures the reliability and accuracy of the constructed model essay models. Finally, after the Chinese writing information input by the user, such as the writing theme and writing type, is obtained, the matching value between the Chinese writing information and each class fusion model is calculated, and the model essays in the model essay set corresponding to the class fusion model with the highest matching value are displayed to the user, so that the model essays displayed are those the user expects.
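The branch between generating a class fusion model and crawling again (steps S301 and S304 to S307) can be sketched as a loop. Several points here are assumptions rather than statements from the patent: the fusion value function is supplied by the caller because its formula is not reproduced in this text, newly crawled model essays are added to the earlier ones, every set must pass the check before any model is generated, and `max_rounds` bounds a loop that the text leaves unbounded.

```python
def build_fusion_models(crawl, classify, fusion_value, generate,
                        preset_fusion_value, max_rounds=3):
    """Return one class fusion model per model essay set.

    crawl() yields a batch of model essays, classify() groups them by type,
    fusion_value() scores one set, and generate() builds a class fusion model.
    """
    essays = list(crawl())
    for _ in range(max_rounds):
        essay_sets = classify(essays)                                    # S301
        values = {t: fusion_value(s) for t, s in essay_sets.items()}    # S304
        if all(v < preset_fusion_value for v in values.values()):       # S305
            return {t: generate(s) for t, s in essay_sets.items()}      # S307
        essays.extend(crawl())                                           # S306
    raise RuntimeError("fusion value check did not pass within max_rounds")


# Toy stand-ins, purely illustrative.
batches = iter([
    [("application", "a1")],
    [("application", "a2"), ("emotion", "e1"), ("emotion", "e2")],
])


def toy_crawl():
    return next(batches)


def toy_classify(essays):
    sets = {}
    for essay_type, text in essays:
        sets.setdefault(essay_type, []).append(text)
    return sets


def toy_fusion_value(essay_set):
    return 1.0 / len(essay_set)  # shrinks as the set grows


models = build_fusion_models(toy_crawl, toy_classify, toy_fusion_value,
                             tuple, preset_fusion_value=0.75)
```

In this toy run the first round fails the check, because the single-essay set scores 1.0, so a second batch is crawled; both sets then score 0.5 and the models are generated.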

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A deep learning-based Chinese model essay generation method, characterized by comprising the following steps:

constructing a model essay model of each model essay in each model essay set;

2. The deep learning-based Chinese model essay generation method according to claim 1, wherein constructing the model essay model of each model essay in each model essay set comprises:

3. The deep learning-based Chinese model essay generation method according to claim 2, wherein, for each model essay set, fusing the model essay models corresponding to all model essays in the set into a class fusion model corresponding to that set comprises:

and if the fusion value of the model essay model corresponding to the current model essay set is smaller than a preset fusion value, generating a corresponding class fusion model according to all the model essays in the current model essay set and a preset model essay generation model.

4. The deep learning-based Chinese model essay generation method according to claim 3, wherein the preset model essay fusion value algorithm is:

is a preset value; exp() denotes the exponential function; h is the total number of entry attributes corresponding to the j-th model essay model in the current model essay set; η_h is the attribute value of the h-th entry attribute of the j-th model essay model in the current model essay set; and η denotes the average of the attribute values of all entry attributes of the j-th model essay model in the current model essay set.

5. The deep learning-based Chinese model essay generation method according to claim 1, wherein the preset matching value algorithm is:

6. The deep learning-based Chinese model essay generation method according to claim 1, wherein the target website is a website related to articles.

7. The deep learning-based Chinese model essay generation method according to claim 2, wherein the model essay key entries include: a model essay theme, model essay key vocabulary, model essay technical content, model essay language modification, and model essay description fluency; and the corresponding entry attributes include: a theme attribute, a keyword attribute, a technical attribute, a language modification attribute, and a description fluency attribute.