CN110321426A - Abstract abstracting method, device and computer equipment - Google Patents

Abstract abstracting method, device and computer equipment Download PDF

Info

Publication number
CN110321426A
CN110321426A CN201910591171.9A CN201910591171A CN110321426A CN 110321426 A CN110321426 A CN 110321426A CN 201910591171 A CN201910591171 A CN 201910591171A CN 110321426 A CN110321426 A CN 110321426A
Authority
CN
China
Prior art keywords
sentence
text
value
sample
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910591171.9A
Other languages
Chinese (zh)
Other versions
CN110321426B (en
Inventor
缪畅宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910591171.9A priority Critical patent/CN110321426B/en
Publication of CN110321426A publication Critical patent/CN110321426A/en
Application granted granted Critical
Publication of CN110321426B publication Critical patent/CN110321426B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34Browsing; Visualisation therefor
    • G06F16/345Summarisation for human users
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/103Formatting, i.e. changing of presentation of documents
    • G06F40/117Tagging; Marking up; Designating a block; Setting of attributes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application involves a kind of abstract abstracting method, device and computer equipments, obtain text to be extracted;Sentence encoder based on neural network model determines that each sentence in text to be extracted belongs to the prediction probability of text snippet;By the sentence extraction device of neural network model, the target sentences collection for belonging to the text snippet of text to be extracted is determined based on prediction probability;Wherein, the determination process of neural network model includes: acquisition sample record, and every sample record includes sample text and crowdsourcing mark, and crowdsourcing mark includes the annotation results that at least two mark personnel are labeled sample text;The learning outcome obtained according to sample text is directed in annotation results and learning process, determines Reward Program value;Based on Reward Program value and learning outcome, sentence extraction device and sentence encoder are determined.In this way, the mode diversification that abstract is extracted, thus, improve the generalization that abstract extracts.

Description

Abstract abstracting method, device and computer equipment
Technical field
This application involves technical field of computer information processing, more particularly to a kind of abstract abstracting method, device and meter Calculate machine equipment.
Background technique
With the rapid development of information technology, the application of the information processing technology has been deep into the every aspect of life.Than Such as, abstract extraction technique is widely used under the scene for extracting text core content automatically, such as news in brief, article abstract Deng.The text snippet being drawn into can indicate precise and to the pointly article content, improve the reading efficiency of user, it can also be used to Yong Huhua Picture.It makes a summary compared to production, extraction-type abstract does not do any rewriting to original text sentence, more meets original text context, and real from algorithm It now sees, extraction-type abstract only needs to extract sentences in article, does not need to rewrite again, so being more suitable for large-scale application.
Traditional abstract extracts mode, and the acquisition modes of the sample label employed in the training process of model generally have Two kinds: (1) by being manually directly labeled to sentence;(2) it is made a summary by the production that people writes, calculates marking to each sentence, Then a threshold value is divided, the high sentence that will give a mark is as target sentences.Both modes all have a problem that: abstract extracts Mode uniquely changed.In fact, the abstract of many articles is not unique, for example for sports news, it can both extract match Troop and score can also extract excellent comment of the article review person to match, some that can also be mentioned in title is huge Star extracts associated clip.From the point of view of user, these extraction modes are all acceptable.Therefore, traditional abstract extraction side Formula has poor generalization.
So our model will have good generalization, it can generate richer clip Text, and be more than It is confined in some model answer.
Summary of the invention
Based on this, it is necessary in view of the above technical problems, provide a kind of abstract extraction side improved abstract and extract accuracy Method, device, computer equipment and storage medium.
A kind of abstract abstracting method, which comprises
Obtain text to be extracted;
Sentence encoder based on neural network model determines that each sentence belongs to text snippet in the text to be extracted Prediction probability;
By the sentence extraction device of neural network model, the text to be extracted is belonged to based on prediction probability determination The target sentences collection of text snippet;
Wherein, the determination process of the neural network model includes:
Sample record is obtained, every sample record includes sample text and crowdsourcing mark, the crowdsourcing mark packet Include the annotation results that at least two mark personnel are labeled the sample text;
The learning outcome obtained according to the sample text is directed in the annotation results and learning process, determines return Functional value;
Based on the Reward Program value and the learning outcome, the sentence extraction device and the sentence encoder are determined.
In one of the embodiments, it is described according in the annotation results and learning process be directed to the sample text Obtained learning outcome determines Reward Program value, comprising:
The learning outcome obtained according to the sample text is directed in learning process, determines in the sample text and belongs to text The target of this abstract learns sentence collection;
The quantity for learning sentence in sentence collection and the annotation results intersection according to the target, with sentence in the annotation results The quantity of son, determines hit probability;
According to the hit probability, Reward Program value is determined.
It is described in one of the embodiments, that sentence in sentence collection and the annotation results intersection is learnt according to the target The quantity of sentence, determines hit probability in quantity, with the annotation results, comprising:
Determine target study sentence collection respectively with the intersection sentence quantity of each annotation results;
According to the quantity of sentence in the intersection sentence quantity and the annotation results, determine for the mark personnel's Hit probability.
It is described according to the hit probability in one of the embodiments, determine Reward Program value, comprising:
When the intersection sentence quantity has the case where being greater than 0, using each hit probability of non-zero the bottom of as Number carries out exponent arithmetic, obtains operation result;The index value range of the exponent arithmetic is (- 1,0);
It averages to each operation result, be recompensed functional value.
It is described according to the hit probability in one of the embodiments, determine Reward Program value, comprising:
When each intersection sentence is equal in number in 0 when, determine that Reward Program value is equal to preset negative value.
It is described in one of the embodiments, to be based on the Reward Program value and the learning outcome, determine the sentence Withdrawal device and the sentence encoder, comprising:
According to the learning outcome, the model probabilities value of various decimation patterns is determined;
Loss is determined according to the product of the model probabilities value and the Reward Program value for the various decimation patterns Functional value;
According to the loss function value, the sentence extraction device is determined.
It is described in one of the embodiments, that the model probabilities value of various decimation patterns is determined according to the learning outcome, Include:
According to the learning outcome, determination belongs to the first probability that the sentence extracted under the decimation pattern is drawn into Value;
According to the learning outcome, the second probability that the sentence being not belonging under the decimation pattern is not drawn into is determined Value;
Based on each first probability value and each second probability value, model probabilities value is determined.
In one of the embodiments, it is described according in the annotation results and learning process be directed to the sample text Obtained learning outcome determines Reward Program value, comprising:
It exercises supervision training to the sample record, obtains supervised training result;
Based on the supervised training as a result, obtaining learning outcome for the sample text in learning process;
According to the annotation results and the learning outcome, Reward Program value is determined.
It is described in one of the embodiments, to exercise supervision training to the sample record, supervised training is obtained as a result, packet It includes:
It is marked according to the crowdsourcing, determines that standard marks;
It is marked according to the learning outcome of the supervised training and the standard, determines loss function value;
According to the loss function value, supervised training result is determined.
It is described in one of the embodiments, to be marked according to the crowdsourcing, determine that standard marks, comprising:
Determine each sentence in sample text, the labeled times being marked in crowdsourcing mark;
The labeled times are greater than to the sentence of preset times, are determined as what the standard of the sample text was marked Sentence.
A kind of abstract draw-out device, described device include:
Text obtains module, for obtaining text to be extracted;
Probabilistic forecasting module determines each in the text to be extracted for the sentence encoder based on neural network model Sentence belongs to the prediction probability of text snippet;
Determining module of making a summary is determined based on the prediction probability and is belonged to for passing through the sentence extraction device of neural network model In the target sentences collection of the text snippet of the text to be extracted;
Model training module, including sample record acquiring unit, return value determination unit and parameter updating unit;
The sample record acquiring unit, for obtaining sample record, every sample record include sample text with And crowdsourcing mark, the crowdsourcing mark include the annotation results that at least two mark personnel are labeled the sample text;
The return value determination unit, for according in the annotation results and learning process be directed to the sample text Obtained learning outcome determines Reward Program value;
The parameter updating unit, for determining that the sentence is taken out based on the Reward Program value and the learning outcome Take device and the sentence encoder.
A kind of computer equipment, including memory and processor, the memory are stored with computer program, the processing Device realizes following step when executing the computer program:
Obtain text to be extracted;
Sentence encoder based on neural network model determines that each sentence belongs to text snippet in the text to be extracted Prediction probability;
By the sentence extraction device of neural network model, the text to be extracted is belonged to based on prediction probability determination The target sentences collection of text snippet;
Wherein, the determination process of the neural network model includes:
Sample record is obtained, every sample record includes sample text and crowdsourcing mark, the crowdsourcing mark packet Include the annotation results that at least two mark personnel are labeled the sample text;
The learning outcome obtained according to the sample text is directed in the annotation results and learning process, determines return Functional value;
Based on the Reward Program value and the learning outcome, the sentence extraction device and the sentence encoder are determined.
A kind of computer readable storage medium, is stored thereon with computer program, and the computer program is held by processor Following step is realized when row:
Obtain text to be extracted;
Sentence encoder based on neural network model determines that each sentence belongs to text snippet in the text to be extracted Prediction probability;
By the sentence extraction device of neural network model, the text to be extracted is belonged to based on prediction probability determination The target sentences collection of text snippet;
Wherein, the determination process of the neural network model includes:
Sample record is obtained, every sample record includes sample text and crowdsourcing mark, the crowdsourcing mark packet Include the annotation results that at least two mark personnel are labeled the sample text;
The learning outcome obtained according to the sample text is directed in the annotation results and learning process, determines return Functional value;
Based on the Reward Program value and the learning outcome, the sentence extraction device and the sentence encoder are determined.
Above-mentioned abstract abstracting method, device, computer equipment and storage medium, obtains text to be extracted;Based on nerve The sentence encoder of network model determines that each sentence in text to be extracted belongs to the prediction probability of text snippet;Pass through nerve net The sentence extraction device of network model determines the target sentences collection for belonging to the text snippet of text to be extracted based on prediction probability;Wherein, The determination process of neural network model includes: acquisition sample record, and every sample record includes that sample text and crowdsourcing mark, Crowdsourcing mark includes the annotation results that at least two mark personnel are labeled sample text;According to annotation results and study The learning outcome obtained in the process for sample text, determines Reward Program value;Based on Reward Program value and learning outcome, determine Sentence extraction device and sentence encoder.
During the determination of neural network model, every sample record includes sample text and crowdsourcing mark, the crowd Packet mark includes the annotation results that at least two mark personnel are labeled sample text.Moreover, the determination of the neural network In the process, sample is directed in the annotation results and learning process being labeled based at least two mark personnel to sample text The learning outcome that text obtains determines Reward Program value;Again be based on Reward Program value and learning outcome, determine sentence extraction device and Sentence encoder.Accordingly, it is determined that sentence extraction device and sentence encoder, it is contemplated that more mark personnel mark knot Fruit, in this way, the mode diversification that abstract is extracted, thus, improve the generalization that abstract extracts.
Detailed description of the invention
Fig. 1 is the applied environment figure schematic diagram of abstract abstracting method in one embodiment;
Fig. 2 is the flow chart of the abstract abstracting method of one embodiment;
Fig. 3 is sentence encoder operation principle schematic diagram in the abstract abstracting method in a specific embodiment;
Fig. 4 is sentence encoder operation principle schematic diagram in the abstract abstracting method in another specific embodiment;
Fig. 5 be a specific embodiment in abstract abstracting method in supervised training process flow diagram;
Fig. 6 be a specific embodiment in abstract abstracting method in intensified learning process flow diagram;
Fig. 7 is the operational process schematic diagram of the abstract abstracting method in a specific embodiment;
Fig. 8 is the structural block diagram of the abstract draw-out device in one embodiment;
Fig. 9 is the structural schematic diagram of computer equipment in one embodiment.
Specific embodiment
It is with reference to the accompanying drawings and embodiments, right in order to which the objects, technical solutions and advantages of the application are more clearly understood The application is further elaborated.It should be appreciated that specific embodiment described herein is only used to explain the application, not For limiting the application.
Fig. 1 is the applied environment figure schematic diagram of abstract abstracting method in one embodiment.Abstract provided by the present application extracts Method can be applied in application environment as shown in Figure 1.Wherein, user terminal 102 is led to by network with server 108 Letter.Wherein, user terminal 102 can be bench device or mobile terminal, such as desktop computer, tablet computer, smart phone. Server 108 can be independent physical server, physical server cluster or virtual server.
The abstract abstracting method of the application one embodiment may operate on server 108.User terminal 102 receives clothes The target sentences collection for the text snippet for belonging to text to be extracted that business device 108 is sent.Server 108 obtains text to be extracted;Base In the sentence encoder of neural network model, determine that each sentence in text to be extracted belongs to the prediction probability of text snippet;Pass through The sentence extraction device of neural network model determines the target sentences for belonging to the text snippet of text to be extracted based on prediction probability Collection;Wherein, the determination process of neural network model includes: acquisition sample record, every sample record include sample text and Crowdsourcing mark, crowdsourcing mark include the annotation results that at least two mark personnel are labeled sample text;It is tied according to mark The learning outcome obtained in fruit and learning process for sample text, determines Reward Program value;Based on Reward Program value and learn It practises as a result, determining sentence extraction device and sentence encoder.
As shown in Fig. 2, in one embodiment, providing a kind of abstract abstracting method.This method may operate in Fig. 1 Server 108.In other embodiments, the abstract abstracting method, also may operate on user terminal 102.The abstract is taken out Take method, comprising the following steps:
S202 obtains text to be extracted.
Text to be extracted is to need to be extracted the text object of abstract in primary abstract extraction process.The text to be extracted It can be news, article, paper etc..
S204, the sentence encoder based on neural network model determine that each sentence belongs to text snippet in text to be extracted Prediction probability.
First each sentence in text to be extracted can be encoded, obtain sentence coding;It is encoded again based on each sentence, It determines in text to be extracted, each sentence belongs to the prediction probability of text snippet.Wherein, sentence coding be will be in text to be extracted In short it is expressed as a vector.Each sentence in text to be extracted is encoded, obtains sentence coding, and be based on each sentence Son coding, determines in text to be extracted, the mode that each sentence belongs to the prediction probability of text snippet has very much.For example, using CNN (convolutional neural networks) encode each sentence in text to be extracted, obtain sentence coding, and encode based on each sentence, Determine that each sentence in text to be extracted belongs to the prediction probability of text snippet, which speed of service is fast.For another example, using LSTM (Long Short-Term Memory, shot and long term memory network) encodes each sentence in text to be extracted, obtains Sentence coding, and encoded based on each sentence, determine that each sentence in text to be extracted belongs to the prediction probability of text snippet.Due to LSTM has combined contextual information when encoding to sentence, therefore the relevance of which context is strong.
In a wherein specific embodiment, using CNN (convolutional neural networks) to each sentence in text to be extracted It is encoded, obtains the example of sentence coding, as shown in Figure 3.To sentence " Police are still hunting for the When driver " is encoded, respectively to each word vectorization, the convolutional layer that the result of vectorization is passed through into convolutional neural networks Process of convolution is carried out, convolution results are obtained.Pond is carried out to convolution results by the pond layer of neural network model again, obtains pond Change result.Redization layer is acted on finally by output layer, obtains the sentence coding of the sentence.
In a wherein specific embodiment, using LSTM, (Long Short-Term Memory, shot and long term remember net Network) each sentence in text to be extracted is encoded, sentence coding is obtained, and encode based on each sentence, determined to be extracted Each sentence belongs to the frame construction drawing of the prediction probability of text snippet as shown in figure 4, carrying out by bottom each in sentence in text The vectorization of word indicates, carries out tissue by each word of middle layer opposite direction quantization means, obtains the expression of sentence level;So Afterwards by the sentence encoder of neural network model, the probability that each sentence belongs to text snippet is obtained.
Objective function in the optimization process of neural network model (e.g., above-mentioned CNN, LSTM) can indicate are as follows:
Wherein, i-th of sentence after si presentation code, D and θ are the parameter of sentence encoder in neural network model, yi It is the label of i-th of sentence.Wherein, P (yi | si, D, θ) is indicated, under parameter role of the current mind by network model, i-th The label of sentence is the probability of yi.Two classification cross entropy loss function summations of n sentences all in sample text can be made For final objective function.
S206 determines the text for belonging to text to be extracted based on prediction probability by the sentence extraction device of neural network model The target sentences collection of this abstract.
The target sentence of text to be extracted can be determined based on prediction probability by the sentence extraction device in neural network model Subset, the target sentences collection of the text to be extracted include the sentence of the text snippet of text to be extracted.
Wherein, the determination process of neural network model is comprising steps of obtain sample record, every sample record includes sample Text and crowdsourcing mark, crowdsourcing mark include the annotation results that at least two mark personnel are labeled sample text;Root The learning outcome obtained according to sample text is directed in annotation results and learning process, determines Reward Program value;Based on return letter Numerical value and learning outcome determine sentence extraction device and sentence encoder.
Crowdsourcing mark includes the annotation results that all mark personnel provide, such as i-th of mark personnel, the mark provided Infusing result may be fi={ s1, s3, s5 }, and representative selects the 1st, 3,5 from text, as text snippet.A namely mark Infusing result includes belonging to the sentence set for the text snippet that mark personnel are marked.For each of sample record sample Text, corresponding crowdsourcing mark.
Reward Program value is the value of Reward Program during intensified learning.The Reward Program is for describing in intensified learning mistake Cheng Zhong, by the award for interacting acquisition with environment.Intensified learning is that server is learnt in a manner of " trial and error ", is passed through Behavior is instructed in the award for interacting acquisition with environment, and target is that intelligent body is made to obtain maximum award.Intensified learning is different from Supervised learning in connectionism study, is mainly manifested in Reward Program value, the return letter provided in intensified learning by environment Numerical value is to make a kind of evaluation to the quality of generation movement, rather than tell reinforcement learning system RLS (reinforcement Learning system) how to go to generate correct movement.
The learning outcome obtained in learning process for sample text may include in sample text, and each sentence belongs to text The probability of this abstract.The target study sentence collection of text snippet according to the probability, can be belonged in sample text.The target learns sentence Collection is the set for belonging to the sentence of text snippet in sample text obtained in learning process.It then can be according to the target It practises sentence collection and annotation results determines Reward Program value.It can be based on Reward Program value and learning outcome, to sentence withdrawal device and sentence Sub-encoders are updated, until determining the ginseng in sentence extraction device and sentence encoder when neural network model is optimal Number.
Abstract abstracting method based on the present embodiment, obtains text to be extracted;Sentence coding based on neural network model Device determines that each sentence in text to be extracted belongs to the prediction probability of text snippet;By the sentence extraction device of neural network model, The target sentences collection for belonging to the text snippet of text to be extracted is determined based on prediction probability;Wherein, the determination of neural network model Process includes: acquisition sample record, and every sample record includes sample text and crowdsourcing mark, and crowdsourcing mark includes at least two The annotation results that a mark personnel are labeled sample text;According in annotation results and learning process be directed to sample text Obtained learning outcome determines Reward Program value;Based on Reward Program value and learning outcome, determine that sentence extraction device and sentence are compiled Code device.
During the determination of neural network model, every sample record includes sample text and crowdsourcing mark, the crowd Packet mark includes the annotation results that at least two mark personnel are labeled sample text.Moreover, the determination of the neural network In the process, sample is directed in the annotation results and learning process being labeled based at least two mark personnel to sample text The learning outcome that text obtains determines Reward Program value;Again be based on Reward Program value and learning outcome, determine sentence extraction device and Sentence encoder.Accordingly, it is determined that sentence extraction device and sentence encoder, it is contemplated that more mark personnel mark knot Fruit, in this way, the mode diversification that abstract is extracted, thus, improve the generalization of abstract abstracting method.
In a wherein concrete application scene, user passes through the text snippet of user terminal requests text to be extracted.Tool Body such as, can be and issue request when mouse skips over headline.It can be when request shows the list of file to be extracted Issue request.The request is used to request server and sends the text snippet of text to be extracted to user terminal.Server can be Before receiving request, or after receiving request, above-mentioned abstract abstracting method, and the text that text to be extracted will be belonged to are run The target sentences collection of abstract, is sent to the user terminal displaying.
The study knot obtained in one of the embodiments, according to sample text is directed in annotation results and learning process Fruit determines Reward Program value, comprising: according to the learning outcome obtained in learning process for sample text, determines sample text In belong to text snippet target study sentence collection;The quantity for learning sentence in sentence collection and annotation results intersection according to target, with mark The quantity for infusing sentence in result, determines hit probability;According to hit probability, Reward Program value is determined.
Due to be directed in learning process the obtained learning outcome of sample text include in the sample text each sentence belong to The probability of summary texts.It can determine that the sentence set for belonging to text snippet in sample text, i.e. target learn according to the probability Sentence collection.After target study sentence collection has been determined, sentence in the intersection of sentence collection and annotation results can be learnt according to the target The quantity of sentence, determines hit probability in quantity and annotation results.Such as, target can be learnt to sentence collection and crowdsourcing mark In intersection in the quantity and annotation results of sentence sentence quantity, determine hit probability.The annotation results of i.e. each mark personnel Union, with target study sentence collection seek common ground, according to the quantity of sentence in the quantity and annotation results of sentence in the intersection, really Determine hit probability.The value of the hit probability can be determined as Reward Program value.The hit probability value can also be carried out into one After step processing, obtained value is determined as Reward Program value.
Abstract based on the present embodiment extracts mode, provides a kind of mode that Reward Program value is determined based on hit probability, In this way, abstract can be made to extract, obtained target sentences set is more accurate, so as to improve the accuracy that abstract extracts.
The quantity for learning sentence in sentence collection and annotation results intersection according to target in one of the embodiments, with mark As a result the quantity of middle sentence, determines hit probability, comprising: determine target study sentence collection respectively with the intersection sentence of each annotation results Quantity;According to the quantity of sentence in intersection sentence quantity and annotation results, the hit probability for being directed to mark personnel is determined.
Such as, the sentence sequence that target study sentence is concentrated can be indicated with ys;Fi indicates the mark knot of i-th of mark personnel Fruit.The sentence quantity of △ (ys, fi)=# (ys, fi)/# (fi), i.e. ys and fi intersection divided by fi sentence quantity.Wherein, △ (ys, fi) indicates the hit probability for i-th of mark personnel.
Abstract abstracting method based on the present embodiment provides a kind of mode for more specifically determining hit probability, should The hit probability that mode determines is more targeted, so as on this basis, further increase the accuracy that abstract extracts.
Further, Reward Program value is determined according to hit probability in one of the embodiments, comprising: when each intersection Sentence is equal in number when 0, determines that Reward Program value is equal to preset negative value.
It is to be appreciated that being directed to the value of the hit probability of mark personnel between [0,1].For target study sentence collection with The hit probability that crowdsourcing marks no any intersection namely all mark personnel be equal to 0 namely each intersection sentence it is equal in number In 0.Illustrate this time study something is rotten cake, at this point it is possible to determine that Reward Program value is equal to preset negative value.Such as, take one it is larger Negative, can be -2, -1, -3 etc..
In one of the embodiments, according to hit probability, Reward Program value is determined, comprising: when intersection sentence quantity is deposited When the case where being greater than 0, exponent arithmetic is carried out using each hit probability of non-zero as the truth of a matter, obtains operation result;Index The index value range of operation is (- 1,0);It averages to each operation result, be recompensed functional value.
In this embodiment, when intersection sentence quantity has the case where being greater than 0, each hit probability of non-zero is extracted Come, to the probability that each is extracted, carry out exponent arithmetic with y=x^ α effect, x here is the truth of a matter, α take for index (- 1,0), in this way, can will obtain for probability, biggish operation result.For example α=- 1/2 is taken, if probability 0.09, Operation result after then acting on is 0.3;Originally probability is 0.81, and operation result is 0.9 after effect.In this way, can be to avoid All probability values are all less than normal, it is difficult to pull open gap, it is thus possible to make Reward Program value, more reliably, further increase and pluck The accuracy to be extracted.
It is based on Reward Program value and learning outcome in one of the embodiments, determines sentence extraction device and sentence coding Device, comprising: according to learning outcome, determine the model probabilities value of various decimation patterns;It is general according to mode for various decimation patterns The product of rate value and Reward Program value determines loss function value;According to loss function value, sentence extraction device is determined.
Sentence extraction device is the methods of sampling of a Weight, and the probability of summary texts is belonged to according to sentence, and extraction most may be used Several set of energy, such as { s1, s2 } and { s1, s4 }, have respectively represented two kinds of decimation patterns.And Sampling weights are then that basis is returned Function and learning outcome unification is reported to determine.It is to be appreciated that under a kind of decimation pattern, corresponding Reward Program value can be with The return of the decimation pattern is described.It may include sample text due to being directed to the learning outcome that sample text obtains in learning process In, each sentence belongs to the probability of text snippet.Therefore, the mode of various decimation patterns can be determined according to the learning outcome Probability value.Such as, it can be belonged in the probability and sample text of text snippet according to each sentence in a decimation pattern and not belonged to It is obtained in the determine the probability that the sentence of the decimation pattern is not belonging to text snippet.The sentence of the decimation pattern is not belonging in sample text Son is not belonging to the probability of text snippet, can be subtracted with 1 and be not belonging to the sentence of the decimation pattern in sample text and belong to text and pluck The determine the probability wanted obtains.
In the present embodiment, loss is determined according to the product of model probabilities value and Reward Program value for various decimation patterns Functional value, in this way, allowing loss function related to decimation pattern and Reward Program.It is thus possible to according to loss function Value, determines sentence extraction device.In this way, in each iterative learning, Reward Program is accounted for factor by loss function, on the one hand Original sentence encoder can be optimized, on the other hand, according to Reward Program value and model probabilities value, Lai Youhua sentence extraction Device, so that sentence extraction device extracts, probability is bigger, the higher sentence set of Reward Program value.
In one of the embodiments, according to learning outcome, the model probabilities value of various decimation patterns is determined, comprising: root According to learning outcome, determination belongs to the first probability value that the sentence extracted under decimation pattern is drawn into;According to learning outcome, determine The second probability value that the sentence being not belonging under decimation pattern is not drawn into;Based on each first probability value and each second probability value, Determine model probabilities value.
In the present embodiment, a kind of mode more specifically determining model probabilities value is provided.Such as, in a specific example In, decimation pattern here is exactly the maximum several abstract forms of a possibility that sentence extraction device is sampled out, takes out sentence in other words Mode, be formed by set.For each decimation pattern, such as { s1, s2 }, this decimation pattern lower probability is represented with P Value, i.e. model probabilities value.Assuming that sample text is there are three sentence { s1, s2, s3 }, then the model probabilities value under the decimation pattern P=p (y1=1 | s1) * p (y2=1 | s2) * p (y3=0 | s3).
In this way, the accuracy of model probabilities value can be improved, to improve the accuracy of loss function value, improves abstract and mention The accuracy taken.
The study knot obtained in one of the embodiments, according to sample text is directed in annotation results and learning process Fruit determines Reward Program value, comprising: exercises supervision training to sample record, obtains supervised training result;Based on supervised training knot Fruit obtains learning outcome for sample text in learning process;According to annotation results and learning outcome, Reward Program is determined Value.
In the present embodiment, the process for determining sentence encoder and sentence extraction device can be known as intensified learning process, In learning process during intensified learning, it is trained based on supervised training result.In this way, the instruction that can first exercise supervision Practice, then give intensive training, it can thus be avoided the sentence space of extraction is excessively sparse, and causes in intensified learning process Model do not restrain.
It should be noted that in the present embodiment, sample record used in the process of supervised training can be identical Sample record, or different sample records.In a preferred embodiment, in order to further increase the accurate of model Property, so that the accuracy that abstract extracts is improved, it can be during the process of supervised training and intensified learning, using different Sample record.
Further, it exercises supervision training to sample record, obtains supervised training result, comprising: mark according to crowdsourcing, really Calibrate fiducial mark note;It is marked according to the learning outcome of supervised training and standard, determines loss function value;According to loss function value, really Determine supervised training result.
It during supervised training, can be marked according to the crowdsourcing in sample record, determine the sample text of the sample record This standard mark.Such as, the union for each annotation results that crowdsourcing can be marked, the standard as the sample text mark.Again Such as, labeled times in each annotation results can be greater than to the set of the sentence of preset times, marked as standard.In supervised training Each round iterative learning procedure in, this can be discussed into learning outcome and standard and marked and compare, determine loss function value;When Perhaps iteration round is greater than preset value when loss function value restrains or the time of supervised training reaches preset time, determines prison Superintend and direct training result.
Further, it is marked according to crowdsourcing, determines that standard marks, comprising: each sentence in sample text is determined, in crowdsourcing The labeled times being marked in mark;Labeled times are greater than to the sentence of preset times, are determined as the standard mark of sample text The sentence marked.
Such as, the mark of certain words s is denumerable in article can be expressed as k=Σi(fi(s));Here fi (s) represents i-th Whether a mark personnel are using s as text snippet, so k is exactly using s as the quantity of the mark personnel of label, Ye Jibiao in fact Infuse number.Assuming that there is n sentence in text, we can successively obtain the labeled times of this n sentence in this way, then can mistake The sentence (i.e. all mark personnel do not select it) of k=0 is filtered, remaining m sentence score is successively denoted as { k1 ..., km }, can be with One preset times th is set and retains the sentence of k > th as threshold value, and by sequence arrangement in the text, as standard mark Note, can mark that be that other sentences can mark be in the sentence label of standard mark.In this way, standard can be made to mark The annotation results of all mark personnel can be ideally represented as far as possible.It is thus possible to improve the accuracy of supervised training result, from And the accuracy that abstract extracts can be improved.
In a wherein specific embodiment, the flow diagram of supervised training process as shown in figure 5, obtain sample first Then record carries out sentence coding to the sample text in sample record, determined based on sentence coding result each in sample text Sentence belongs to the prediction probability of text snippet, namely obtains the learning outcome of supervised training.It is marked according to crowdsourcing, determines standard mark Then note marks according to the learning outcome of supervised training and standard, determines loss function value.The loss function value, which can be, to be based on The penalty values that cross entropy loss function determines.During supervised training, neural network can be updated according to the loss function value The parameter of the sentence encoder of model, so that neural network model is optimal.
In a wherein specific embodiment, the flow diagram of intensified learning process as shown in fig. 6, obtain sample first Then record carries out sentence coding to the sample text in sample record, determined based on sentence coding result each in sample text Sentence belongs to the prediction probability of text snippet.Then by sentence extraction device, it is based on the prediction probability, determination belongs to sample text Text snippet target study sentence collection namely learning process in the learning outcome that obtains for sample text.It is tied according to mark The learning outcome obtained in fruit and learning process for sample text, determines Reward Program value, finally, being based on Reward Program value With learning outcome, the parameter of sentence extraction device and sentence encoder is updated, so that neural network model is optimal.
In a wherein specific example, the operational process schematic diagram for abstracting method of making a summary, as shown in Figure 7.It obtains wait take out Text is taken, sentence coding is carried out to sentence each in text to be extracted by the sentence encoder of neural network model, is then based on Sentence coding determines that each sentence in text to be extracted belongs to the prediction probability of text snippet;Pass through the sentence of neural network model Withdrawal device determines the target sentences collection for belonging to the text snippet of text to be extracted based on prediction probability.Finally, being also based on this Target sentences collection can also generate the text snippet of prediction.It should be noted that being taken out in trained sentence encoder and sentence On the basis of taking device, the text snippet in text to be extracted is extracted.Due to neural network model crowdsourcing mark answer on Tuning mistake, so can choose the highest several extraction situations of probability, and export in sentence extraction device part.Due to returning letter Number is to mark determination according to crowdsourcing,, can be by sentence encoder and sentence by the feedback of Reward Program value in the training stage Sub- withdrawal device is adjusted to optimal, so the probability of abstract extraction stage output, has incorporated the knot of multiple standards answer training Fruit.
The sentence extraction device for passing through neural network model in one of the embodiments, is belonged to based on prediction probability determination The target sentences collection of the text snippet of text to be extracted, comprising: the portrait of the user of acquisition request text to be extracted;Pass through nerve The sentence extraction device of network model, portrait and prediction probability based on user determine the text snippet for belonging to text to be extracted Target sentences collection.It is to be appreciated that crowdsourcing mark further includes mark personnel during the determination of corresponding neural network model User's portrait.In this way, can to provide different target sentences collection for different user, thus, improve the needle that abstract extracts To property.In other embodiments, user's portrait can not also be distinguished, different target sentences collection is combined, as wait take out Take the most complete text snippet of text.
It should be understood that although each step in the flow chart of Fig. 2 is successively shown according to the instruction of arrow, this A little steps are not that the inevitable sequence according to arrow instruction successively executes.Unless expressly state otherwise herein, these steps It executes there is no the limitation of stringent sequence, these steps can execute in other order.Moreover, at least part in Fig. 2 Step may include that perhaps these sub-steps of multiple stages or stage are executed in synchronization to multiple sub-steps It completes, but can execute at different times, the execution sequence in these sub-steps or stage, which is also not necessarily, successively to be carried out, But it can be executed in turn or alternately at least part of the sub-step or stage of other steps or other steps.
In one embodiment, as shown in figure 8, providing a kind of abstract extraction corresponding with above-mentioned abstract abstracting method Device, comprising:
Text obtains module 802, for obtaining text to be extracted;
Probabilistic forecasting module 804 determines in the text to be extracted for the sentence encoder based on neural network model Each sentence belongs to the prediction probability of text snippet;
Determining module 806 of making a summary is determined for passing through the sentence extraction device of neural network model based on the prediction probability Belong to the target sentences collection of the text snippet of the text to be extracted;
Model training module 808, including sample record acquiring unit 808a, return value determination unit 808b and parameter are more New unit 808c;
The sample record acquiring unit 808a, for obtaining sample record, every sample record includes sample text This and crowdsourcing mark, the crowdsourcing mark the mark knot being labeled including at least two mark personnel to the sample text Fruit;
Return value determination unit 808b, for according in the annotation results and learning process be directed to the sample text Obtained learning outcome determines Reward Program value;
Parameter updating unit 808c, for determining that the sentence is taken out based on the Reward Program value and the learning outcome Take device and the sentence encoder.
Abstract draw-out device based on the present embodiment, text obtain module 802 and obtain text to be extracted;Probabilistic forecasting module The 804 sentence encoders based on neural network model determine that each sentence in text to be extracted belongs to the prediction probability of text snippet; Determining module 806 of making a summary passes through the sentence extraction device of neural network model, belongs to text to be extracted based on prediction probability determination The target sentences collection of text snippet;Wherein, the sample record acquiring unit 808a acquisition sample record of model training module, every Sample record includes sample text and crowdsourcing mark, and crowdsourcing mark includes that at least two mark personnel mark sample text The annotation results of note;Return value determination unit 808b is obtained according to sample text is directed in annotation results and learning process It practises as a result, determining Reward Program value;Parameter updating unit 808c is based on Reward Program value and learning outcome, determines sentence extraction device And sentence encoder.
Due to model training module, during the determination of neural network model, every sample record includes sample text And crowdsourcing mark, crowdsourcing mark include the annotation results that at least two mark personnel are labeled sample text.Moreover, should During the determination of neural network, sample text is labeled based at least two mark personnel annotation results and study The learning outcome obtained in the process for sample text, determines Reward Program value;It is based on Reward Program value and learning outcome again, really Determine sentence extraction device and sentence encoder.Accordingly, it is determined that sentence extraction device and sentence encoder, it is contemplated that more mark The annotation results of personnel, in this way, the mode diversification that abstract is extracted, thus, improve the generalization of abstract draw-out device.
The return value determination unit in one of the embodiments, for according in learning process be directed to the sample The learning outcome that text obtains determines the target study sentence collection for belonging to text snippet in the sample text;According to the target Learn the quantity of sentence in sentence collection and the annotation results intersection, the quantity with sentence in the annotation results determines that hit is general Rate;According to the hit probability, Reward Program value is determined.
The return value determination unit in one of the embodiments, is also used to determine the target study sentence collection difference With the intersection sentence quantity of each annotation results;According to the number of sentence in the intersection sentence quantity and the annotation results Amount determines the hit probability for being directed to the mark personnel.
The return value determination unit in one of the embodiments, is also used to exist greatly when the intersection sentence quantity When in 0 the case where, exponent arithmetic is carried out using each hit probability of non-zero as the truth of a matter, obtains operation result;It is described The index value range of exponent arithmetic is (- 1,0);It averages to each operation result, be recompensed functional value.
The return value determination unit in one of the embodiments, is also used to when each intersection sentence is equal in number When 0, determine that Reward Program value is equal to preset negative value.
The parameter updating unit in one of the embodiments, for determining various extractions according to the learning outcome The model probabilities value of mode;For the various decimation patterns, according to the product of the model probabilities value and the Reward Program value, Determine loss function value;According to the loss function value, the sentence extraction device is determined.
The parameter updating unit in one of the embodiments, is also used to according to the learning outcome, and determination belongs to institute State the first probability value that the sentence extracted under decimation pattern is drawn into;According to the learning outcome, determination is not belonging to the pumping The second probability value that sentence under modulus formula is not drawn into;Based on each first probability value and each second probability value, Determine model probabilities value.
In one of the embodiments, the reward value determination unit is further configured to perform supervised training on the sample records to obtain a supervised training result; to obtain, based on the supervised training result, the learning result for the sample text during learning; and to determine the reward function value according to the labeling results and the learning result.
In one of the embodiments, the reward value determination unit is further configured to determine a standard annotation according to the crowdsourced annotation; to determine a loss function value according to the learning result of the supervised training and the standard annotation; and to determine the supervised training result according to the loss function value.
In one of the embodiments, the reward value determination unit is further configured to determine, for each sentence in the sample text, the number of times it is marked in the crowdsourced annotation, and to take the sentences marked more than a preset number of times as the sentences marked by the standard annotation of the sample text.
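A sketch of the standard-annotation vote and a supervised loss, again in Python; min_votes stands in for the unspecified preset number of times, and binary cross-entropy is an assumed choice for the loss function value:

```python
import torch
import torch.nn.functional as F

def standard_annotation(num_sentences, annotations, min_votes=1):
    # A sentence enters the standard annotation when it is marked more than
    # min_votes times across all annotators' labeling results.
    counts = [0] * num_sentences
    for ann in annotations:          # each annotator's set of sentence indices
        for idx in ann:
            counts[idx] += 1
    return [i for i, c in enumerate(counts) if c > min_votes]

def supervised_loss(pred_probs, standard_ids):
    # Binary cross-entropy between the encoder's per-sentence probabilities
    # and the 0/1 standard annotation.
    targets = torch.zeros_like(pred_probs)
    if standard_ids:
        targets[torch.tensor(standard_ids)] = 1.0
    return F.binary_cross_entropy(pred_probs, targets)
```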
As shown in FIG. 9, in one embodiment a computer device is provided, which may be a server or a terminal. The computer device includes a processor, a memory and a network interface connected by a system bus. The processor of the computer device provides computing and control capability. The memory of the computer device includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system and a computer program, and the internal memory provides a running environment for the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device communicates with external computer devices over a network connection. The computer program, when executed by the processor, implements an abstract extraction method.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program, and the processor implementing the steps of the above abstract extraction method when executing the computer program.
The present application provides a computer device, comprising a memory and a processor, the memory storing a computer program, the processor performing the following steps when executing the computer program:
obtaining a text to be extracted;
determining, by a sentence encoder of a neural network model, a prediction probability that each sentence in the text to be extracted belongs to a text abstract;
determining, by a sentence extractor of the neural network model and based on the prediction probabilities, a target sentence set belonging to the text abstract of the text to be extracted;
wherein the determination process of the neural network model comprises:
obtaining sample records, each sample record comprising a sample text and a crowdsourced annotation, the crowdsourced annotation comprising labeling results of at least two annotators for the sample text;
determining a reward function value according to the labeling results and a learning result obtained for the sample text during learning;
determining the sentence extractor and the sentence encoder based on the reward function value and the learning result.
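The inference path in the steps above can be sketched as follows; the bidirectional GRU encoder and the fixed 0.5 threshold in the extractor are placeholder choices, since the disclosure does not fix the network architecture or the extractor's selection rule:

```python
import torch
import torch.nn as nn

class SentenceEncoder(nn.Module):
    # Scores each sentence of a document with the probability that it
    # belongs to the text abstract.
    def __init__(self, emb_dim=128, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * hidden, 1)

    def forward(self, sent_embs):  # sent_embs: (batch, num_sentences, emb_dim)
        h, _ = self.rnn(sent_embs)
        return torch.sigmoid(self.score(h)).squeeze(-1)  # prediction probabilities

def extract_summary(pred_probs, threshold=0.5):
    # Sentence extractor: keep the sentences whose predicted probability
    # exceeds the threshold; their indices form the target sentence set.
    return torch.nonzero(pred_probs > threshold, as_tuple=False).squeeze(-1).tolist()

# Hypothetical usage for one document:
#   probs = SentenceEncoder()(sent_embs)[0]
#   summary_ids = extract_summary(probs)
```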
In one of the embodiments, determining the reward function value according to the labeling results and the learning result obtained for the sample text during learning comprises:
determining, according to the learning result obtained for the sample text during learning, a target learned sentence set of the sample text that belongs to the text abstract;
determining a hit probability according to the number of sentences in the intersection of the target learned sentence set and a labeling result, together with the number of sentences in the labeling result;
determining the reward function value according to the hit probability.
In one of the embodiments, determining the hit probability according to the number of sentences in the intersection of the target learned sentence set and the labeling result, together with the number of sentences in the labeling result, comprises:
determining the intersection sentence count between the target learned sentence set and each labeling result;
determining the hit probability for each annotator according to the intersection sentence count and the number of sentences in the corresponding labeling result.
In one of the embodiments, determining the reward function value according to the hit probability comprises:
when at least one intersection sentence count is greater than 0, performing an exponentiation with each non-zero hit probability as the base to obtain operation results, the exponent of the exponentiation taking a value in (-1, 0);
averaging the operation results to obtain the reward function value.
In one of the embodiments, determining the reward function value according to the hit probability comprises:
when every intersection sentence count is 0, determining that the reward function value equals a preset negative value.
In one of the embodiments, determining the sentence extractor and the sentence encoder based on the reward function value and the learning result comprises:
determining, according to the learning result, a mode probability value of each extraction mode;
determining, for each extraction mode, a loss function value according to the product of the mode probability value and the reward function value;
determining the sentence extractor according to the loss function value.
In one of the embodiments, determining the mode probability value of each extraction mode according to the learning result comprises:
determining, according to the learning result, first probability values that the sentences belonging to the extraction mode are extracted;
determining, according to the learning result, second probability values that the sentences not belonging to the extraction mode are not extracted;
determining the mode probability value based on the first probability values and the second probability values.
In one of the embodiments, determining the reward function value according to the labeling results and the learning result obtained for the sample text during learning comprises:
performing supervised training on the sample records to obtain a supervised training result;
obtaining, based on the supervised training result, the learning result for the sample text during learning;
determining the reward function value according to the labeling results and the learning result.
In one of the embodiments, performing supervised training on the sample records to obtain the supervised training result comprises:
determining a standard annotation according to the crowdsourced annotation;
determining a loss function value according to the learning result of the supervised training and the standard annotation;
determining the supervised training result according to the loss function value.
In one of the embodiments, determining the standard annotation according to the crowdsourced annotation comprises:
determining, for each sentence in the sample text, the number of times it is marked in the crowdsourced annotation;
taking the sentences marked more than a preset number of times as the sentences marked by the standard annotation of the sample text.
In one embodiment, a computer-readable storage medium is provided, storing a computer program that, when executed by a processor, implements the following steps:
obtaining a text to be extracted;
determining, by a sentence encoder of a neural network model, a prediction probability that each sentence in the text to be extracted belongs to a text abstract;
determining, by a sentence extractor of the neural network model and based on the prediction probabilities, a target sentence set belonging to the text abstract of the text to be extracted;
wherein the determination process of the neural network model comprises:
obtaining sample records, each sample record comprising a sample text and a crowdsourced annotation, the crowdsourced annotation comprising labeling results of at least two annotators for the sample text;
determining a reward function value according to the labeling results and a learning result obtained for the sample text during learning;
determining the sentence extractor and the sentence encoder based on the reward function value and the learning result.
In one of the embodiments, determining the reward function value according to the labeling results and the learning result obtained for the sample text during learning comprises:
determining, according to the learning result obtained for the sample text during learning, a target learned sentence set of the sample text that belongs to the text abstract;
determining a hit probability according to the number of sentences in the intersection of the target learned sentence set and a labeling result, together with the number of sentences in the labeling result;
determining the reward function value according to the hit probability.
In one of the embodiments, determining the hit probability according to the number of sentences in the intersection of the target learned sentence set and the labeling result, together with the number of sentences in the labeling result, comprises:
determining the intersection sentence count between the target learned sentence set and each labeling result;
determining the hit probability for each annotator according to the intersection sentence count and the number of sentences in the corresponding labeling result.
In one of the embodiments, determining the reward function value according to the hit probability comprises:
when at least one intersection sentence count is greater than 0, performing an exponentiation with each non-zero hit probability as the base to obtain operation results, the exponent of the exponentiation taking a value in (-1, 0);
averaging the operation results to obtain the reward function value.
In one of the embodiments, determining the reward function value according to the hit probability comprises:
when every intersection sentence count is 0, determining that the reward function value equals a preset negative value.
In one of the embodiments, determining the sentence extractor and the sentence encoder based on the reward function value and the learning result comprises:
determining, according to the learning result, a mode probability value of each extraction mode;
determining, for each extraction mode, a loss function value according to the product of the mode probability value and the reward function value;
determining the sentence extractor according to the loss function value.
In one of the embodiments, determining the mode probability value of each extraction mode according to the learning result comprises:
determining, according to the learning result, first probability values that the sentences belonging to the extraction mode are extracted;
determining, according to the learning result, second probability values that the sentences not belonging to the extraction mode are not extracted;
determining the mode probability value based on the first probability values and the second probability values.
In one of the embodiments, determining the reward function value according to the labeling results and the learning result obtained for the sample text during learning comprises:
performing supervised training on the sample records to obtain a supervised training result;
obtaining, based on the supervised training result, the learning result for the sample text during learning;
determining the reward function value according to the labeling results and the learning result.
In one of the embodiments, performing supervised training on the sample records to obtain the supervised training result comprises:
determining a standard annotation according to the crowdsourced annotation;
determining a loss function value according to the learning result of the supervised training and the standard annotation;
determining the supervised training result according to the loss function value.
In one of the embodiments, determining the standard annotation according to the crowdsourced annotation comprises:
determining, for each sentence in the sample text, the number of times it is marked in the crowdsourced annotation;
taking the sentences marked more than a preset number of times as the sentences marked by the standard annotation of the sample text.
Those of ordinary skill in the art will appreciate that all or part of the processes in the above embodiment methods can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or an external cache. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity of description, not all possible combinations of the technical features of the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their descriptions are specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present application, and these fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. An abstract extraction method, the method comprising:
obtaining a text to be extracted;
determining, by a sentence encoder of a neural network model, a prediction probability that each sentence in the text to be extracted belongs to a text abstract;
determining, by a sentence extractor of the neural network model and based on the prediction probabilities, a target sentence set belonging to the text abstract of the text to be extracted;
wherein the determination process of the neural network model comprises:
obtaining sample records, each sample record comprising a sample text and a crowdsourced annotation, the crowdsourced annotation comprising labeling results of at least two annotators for the sample text;
determining a reward function value according to the labeling results and a learning result obtained for the sample text during learning;
determining the sentence extractor and the sentence encoder based on the reward function value and the learning result.
2. The method according to claim 1, wherein determining the reward function value according to the labeling results and the learning result obtained for the sample text during learning comprises:
determining, according to the learning result obtained for the sample text during learning, a target learned sentence set of the sample text that belongs to the text abstract;
determining a hit probability according to the number of sentences in the intersection of the target learned sentence set and a labeling result, together with the number of sentences in the labeling result;
determining the reward function value according to the hit probability.
3. The method according to claim 2, wherein determining the hit probability according to the number of sentences in the intersection of the target learned sentence set and the labeling result, together with the number of sentences in the labeling result, comprises:
determining the intersection sentence count between the target learned sentence set and each labeling result;
determining the hit probability for each annotator according to the intersection sentence count and the number of sentences in the corresponding labeling result.
4. The method according to claim 3, wherein determining the reward function value according to the hit probability comprises:
when at least one intersection sentence count is greater than 0, performing an exponentiation with each non-zero hit probability as the base to obtain operation results, the exponent of the exponentiation taking a value in (-1, 0);
averaging the operation results to obtain the reward function value.
5. The method according to claim 1, wherein determining the sentence extractor and the sentence encoder based on the reward function value and the learning result comprises:
determining, according to the learning result, a mode probability value of each extraction mode;
determining, for each extraction mode, a loss function value according to the product of the mode probability value and the reward function value;
determining the sentence extractor according to the loss function value.
6. The method according to claim 5, wherein determining the mode probability value of each extraction mode according to the learning result comprises:
determining, according to the learning result, first probability values that the sentences belonging to the extraction mode are extracted;
determining, according to the learning result, second probability values that the sentences not belonging to the extraction mode are not extracted;
determining the mode probability value based on the first probability values and the second probability values.
7. The method according to claim 1, wherein determining the reward function value according to the labeling results and the learning result obtained for the sample text during learning comprises:
performing supervised training on the sample records to obtain a supervised training result;
obtaining, based on the supervised training result, the learning result for the sample text during learning;
determining the reward function value according to the labeling results and the learning result.
8. The method according to claim 7, wherein performing supervised training on the sample records to obtain the supervised training result comprises:
determining a standard annotation according to the crowdsourced annotation;
determining a loss function value according to the learning result of the supervised training and the standard annotation;
determining the supervised training result according to the loss function value.
9. An abstract extraction apparatus, the apparatus comprising:
a text obtaining module, configured to obtain a text to be extracted;
a probability prediction module, configured to determine, by a sentence encoder of a neural network model, a prediction probability that each sentence in the text to be extracted belongs to a text abstract;
an abstract determining module, configured to determine, by a sentence extractor of the neural network model and based on the prediction probabilities, a target sentence set belonging to the text abstract of the text to be extracted;
a model training module, comprising a sample record acquiring unit, a reward value determination unit and a parameter updating unit;
the sample record acquiring unit being configured to obtain sample records, each sample record comprising a sample text and a crowdsourced annotation, the crowdsourced annotation comprising labeling results of at least two annotators for the sample text;
the reward value determination unit being configured to determine a reward function value according to the labeling results and a learning result obtained for the sample text during learning;
the parameter updating unit being configured to determine the sentence extractor and the sentence encoder based on the reward function value and the learning result.
10. A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method according to any one of claims 1-8 when executing the computer program.
CN201910591171.9A 2019-07-02 2019-07-02 Digest extraction method and device and computer equipment Active CN110321426B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910591171.9A CN110321426B (en) 2019-07-02 2019-07-02 Digest extraction method and device and computer equipment

Publications (2)

Publication Number Publication Date
CN110321426A true CN110321426A (en) 2019-10-11
CN110321426B CN110321426B (en) 2023-10-27

Family

ID=68122377

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247972A (en) * 2017-06-29 2017-10-13 哈尔滨工程大学 One kind is based on mass-rent technology classification model training method
CN109213896A (en) * 2018-08-06 2019-01-15 杭州电子科技大学 Underwater video abstraction generating method based on shot and long term memory network intensified learning
CN109657054A (en) * 2018-12-13 2019-04-19 北京百度网讯科技有限公司 Abstraction generating method, device, server and storage medium

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111339763A (en) * 2020-02-26 2020-06-26 四川大学 English mail subject generation method based on multi-level neural network
CN111339763B (en) * 2020-02-26 2022-06-28 四川大学 English mail subject generation method based on multi-level neural network
CN111666759A (en) * 2020-04-17 2020-09-15 北京百度网讯科技有限公司 Method and device for extracting key information of text, electronic equipment and storage medium
CN111666759B (en) * 2020-04-17 2024-03-26 北京百度网讯科技有限公司 Extraction method and device of text key information, electronic equipment and storage medium
CN111708878A (en) * 2020-08-20 2020-09-25 科大讯飞(苏州)科技有限公司 Method, device, storage medium and equipment for extracting sports text abstract
CN111708878B (en) * 2020-08-20 2020-11-24 科大讯飞(苏州)科技有限公司 Method, device, storage medium and equipment for extracting sports text abstract
WO2023246719A1 (en) * 2022-06-20 2023-12-28 阿里巴巴达摩院(杭州)科技有限公司 Method and apparatus for processing meeting record, and device and storage medium

Also Published As

Publication number Publication date
CN110321426B (en) 2023-10-27

Similar Documents

Publication Publication Date Title
CN110321426A (en) Abstract abstracting method, device and computer equipment
CN108897989B (en) Biological event extraction method based on candidate event element attention mechanism
CN110188272B (en) Community question-answering website label recommendation method based on user background
CN110032739B (en) Method and system for extracting named entities of Chinese electronic medical record
CN109902145A Entity relation joint extraction method and system based on attention mechanism
CN111985239A (en) Entity identification method and device, electronic equipment and storage medium
CN108717433A Knowledge base construction method and device for a programming-field question answering system
CN112052684A (en) Named entity identification method, device, equipment and storage medium for power metering
CN112016313B (en) Spoken language element recognition method and device and warning analysis system
CN107832458A Text classification method based on a character-level nested deep network
CN110265098A Case management method, apparatus, computer equipment and readable storage medium
CN114722805B Few-shot emotion classification method based on large-small teacher knowledge distillation
CN113128233B (en) Construction method and system of mental disease knowledge map
CN110175235A Neural-network-based intelligent commodity tax classification coding method and system
CN115687687A (en) Video segment searching method and system for open domain query
CN115048539B (en) Social media data online retrieval method and system based on dynamic memory
CN114708903A (en) Method for predicting distance between protein residues based on self-attention mechanism
CN115587594A (en) Network security unstructured text data extraction model training method and system
CN111145914A (en) Method and device for determining lung cancer clinical disease library text entity
CN111339777A (en) Medical related intention identification method and system based on neural network
CN113705188A (en) Intelligent evaluation method for customs import and export commodity specification declaration
CN116720519B Named entity recognition method for Miao medicine
CN116561314B (en) Text classification method for selecting self-attention based on self-adaptive threshold
CN115861902B (en) Unsupervised action migration and discovery method, system, device and medium
CN117131383A (en) Method for improving search precision drainage performance of double-tower model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant