CN116186243A - Text abstract generation method, device, equipment and storage medium - Google Patents

Text abstract generation method, device, equipment and storage medium

Info

Publication number
CN116186243A
CN116186243A
Authority
CN
China
Prior art keywords
sentence
target
text
value
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310001249.3A
Other languages
Chinese (zh)
Inventor
王伟
张黔
陈焕坤
曾志贤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Resources Digital Technology Co Ltd
Original Assignee
China Resources Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Resources Digital Technology Co Ltd
Priority to CN202310001249.3A
Publication of CN116186243A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/34 Browsing; Visualisation therefor
    • G06F 16/345 Summarisation for human users
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/205 Parsing
    • G06F 40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the invention provides a text abstract generation method, a device, equipment and a storage medium, relating to the technical field of artificial intelligence. In the method, the sentence contribution degree of each target sentence is obtained by using a sentence weight model, and the sentence set used for generating the text abstract is then selected according to the sentence contribution degrees, so that the text abstract model is more inclined to generate the text abstract from sentences with a high contribution degree. This improves the accuracy of the generated text abstract and solves the problem in the related art that, because prior information of sentences is not considered when generating a text abstract, long sentences or sentences with a low information content are selected for the abstract and its accuracy is low.

Description

Text abstract generation method, device, equipment and storage medium
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a storage medium for generating a text abstract.
Background
Today, people are flooded with information in daily work and life, and considerable effort is needed to screen the genuinely useful content from sources such as news and meeting minutes. A text abstract can compress lengthy text into short text while preserving the core theme of the original, greatly reducing this extra human effort.
Text abstracts in the related art include extractive abstracts, in which a neural network extracts certain sentences from the text to be compressed to form the abstract. Specifically, a network model scores each sentence in the text to be compressed, the score is treated as the sentence's weight, and a specified number of the highest-weighted sentences are selected to form the final abstract. However, the text abstract models in the related art do not extract prior information of sentences, so long sentences with a low information content are easily selected as the abstract; the accuracy of the final abstract is low and cannot meet people's information acquisition needs.
Disclosure of Invention
The embodiment of the application mainly aims to provide a text abstract generating method, a device, equipment and a storage medium, and the generating accuracy of the text abstract is improved.
To achieve the above object, a first aspect of an embodiment of the present application provides a text abstract generating method, including:
acquiring a target text, wherein the target text comprises a plurality of target sentences;
calculating the contribution degree of each target sentence by using a sentence weight model to obtain the sentence contribution degree of each target sentence;
sorting the plurality of target sentences according to the sentence contribution degree;
selecting N target sentences according to the sorting result to obtain a sentence set, wherein N is an integer greater than 1;
extracting a text abstract from the sentence set by using a text abstract model, and selecting the target sentence from the sentence set to form an abstract set;
and generating a text abstract of the target text according to the target sentences in the abstract set.
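The claimed steps can be sketched in code as follows. The patent publishes no implementation, so the sentence-splitting rule and both scoring callables below are illustrative assumptions, not the patented models:

```python
import re

def split_into_sentences(text):
    # Split on sentence-ending punctuation (period/question/exclamation
    # marks, Chinese or Western); the exact rule is an assumption.
    parts = re.split(r"(?<=[.!?。？！])\s*", text)
    return [p for p in parts if p]

def generate_summary(text, contribution_fn, extract_fn, n=3):
    # Score, sort, and keep the top-N target sentences as the sentence set.
    sentences = split_into_sentences(text)
    ranked = sorted(sentences, key=contribution_fn, reverse=True)[:n]
    # The text abstract model then selects the final abstract sentences.
    return extract_fn(ranked)
```

A toy run with sentence length standing in for the sentence weight model, and a model that keeps the single best candidate:

```python
text = "Short. This sentence carries most of the information. Noise!"
summary = generate_summary(text, contribution_fn=len,
                           extract_fn=lambda s: s[:1], n=2)
```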
In some embodiments, the calculating the contribution degree of each target sentence by using a sentence weight model to obtain the sentence contribution degree of each target sentence includes:
calculating grid parameters of the target sentence, wherein the grid parameters comprise: sentence length ratio and similarity value;
performing parameter mapping on the grid parameters of each target sentence to obtain a Voronoi vector diagram, wherein the horizontal axis of the Voronoi vector diagram is used for representing the sentence length ratio, and the vertical axis of the Voronoi vector diagram is used for representing the similarity value;
calculating a first similarity value of each target sentence according to the Voronoi vector diagram;
calculating a second similarity value of the target sentence;
and calculating statement contribution degree of the target statement according to the first similarity value and the second similarity value.
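The patent does not disclose how the first and second similarity values are merged into a single contribution degree; a weighted sum with a hypothetical weight alpha is one plausible shape:

```python
def sentence_contribution(first_similarity, second_similarity, alpha=0.5):
    # Weighted combination of the two similarity values; alpha is an
    # illustrative assumption, not a value from the patent.
    return alpha * first_similarity + (1.0 - alpha) * second_similarity
```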
In some embodiments, the calculating the statement length ratio of the target statement includes:
acquiring a word segmentation sequence of the target sentence;
acquiring the effective word segmentation quantity and the total word segmentation quantity of the word segmentation sequence;
and calculating the statement length ratio of the target statement according to the ratio of the effective word segmentation quantity to the total word segmentation quantity.
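The length-ratio step is a simple quotient. The patent only says "effective" word segments; treating non-stopword tokens as effective is an assumption made for this sketch:

```python
def sentence_length_ratio(tokens, stopwords):
    # Effective segments assumed to be non-stopword tokens; the stopword
    # list itself is a hypothetical input.
    effective = [t for t in tokens if t not in stopwords]
    return len(effective) / len(tokens)
```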
In some embodiments, calculating the similarity value for the target sentence comprises:
obtaining a single sentence vector of each target sentence;
calculating an average sentence vector of the single sentence vectors;
and calculating the similarity value of the target sentence according to the single sentence vector and the average sentence vector.
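A minimal sketch of the average-vector comparison, assuming cosine similarity (the patent says only "similarity value") and assuming the single sentence vectors are already available:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_values(sentence_vectors):
    # Average sentence vector over all single sentence vectors.
    n, dim = len(sentence_vectors), len(sentence_vectors[0])
    avg = [sum(vec[i] for vec in sentence_vectors) / n for i in range(dim)]
    # Each sentence's similarity value is its cosine against the average.
    return [cosine(vec, avg) for vec in sentence_vectors]
```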
In some embodiments, the performing parameter mapping on the grid parameters of each target sentence to obtain a Voronoi vector diagram includes:
generating a center point of each target sentence according to the grid parameters of each target sentence;
constructing an adjacent triangular surface according to the central point based on the nearest neighbor principle;
and generating the Voronoi vector diagram according to the adjacent triangular surfaces, wherein the Voronoi vector diagram comprises a word segmentation grid of each target sentence.
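The patent builds the diagram via adjacent triangular surfaces (a Delaunay triangulation); a stdlib-only sketch instead approximates the same nearest-neighbour partition on a discrete grid, which is enough to obtain per-sentence "grid areas". The axes match the mapping above: x is the sentence length ratio, y the similarity value:

```python
def voronoi_cell_areas(centers, xmax=1.0, ymax=1.0, steps=100):
    # Discrete Voronoi approximation: each grid cell is assigned to its
    # nearest center point (nearest-neighbour principle); a sentence's
    # grid area is the fraction of cells its center owns.
    counts = [0] * len(centers)
    for i in range(steps):
        for j in range(steps):
            x = (i + 0.5) * xmax / steps   # sentence length ratio axis
            y = (j + 0.5) * ymax / steps   # similarity value axis
            nearest = min(range(len(centers)),
                          key=lambda k: (centers[k][0] - x) ** 2
                                        + (centers[k][1] - y) ** 2)
            counts[nearest] += 1
    total = steps * steps
    return [c / total for c in counts]
```

An exact construction would use a computational-geometry library (e.g. a Delaunay/Voronoi routine); the grid version trades precision for self-containment.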
In some embodiments, the calculating the first similarity value of each of the target sentences according to the Voronoi vector diagram includes:
calculating the inter-sentence aggregation degree between each target sentence and the other target sentences according to the grid area of the word segmentation grid;
calculating a first number of the inter-sentence aggregation degree of each target sentence within a first preset threshold range;
and calculating a first similarity value of each target sentence and other target sentences according to the first quantity and the sentence number of the target sentences.
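The three steps above reduce to counting, for each sentence, how many other sentences fall within the aggregation-degree threshold, then dividing by the sentence count. The patent does not define the aggregation degree; taking it as the absolute difference of two sentences' grid areas is an assumption of this sketch:

```python
def first_similarity_values(cell_areas, threshold=0.2):
    # Inter-sentence aggregation degree assumed to be |area_i - area_j|;
    # the first similarity value is the within-threshold count divided
    # by the number of target sentences.
    n = len(cell_areas)
    values = []
    for i in range(n):
        close = sum(1 for j in range(n)
                    if j != i and abs(cell_areas[i] - cell_areas[j]) <= threshold)
        values.append(close / n)
    return values
```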
In some embodiments, the computing a second similarity value for the target sentence comprises:
calculating an inter-sentence distance value between each target sentence and other target sentences;
calculating a second number of inter-sentence distance values of each target sentence within a second preset threshold range;
and calculating a second similarity value of each target sentence and other target sentences according to the second quantity and the sentence number of the target sentences.
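The second similarity value has the same count-and-divide shape but works on inter-sentence distances. Euclidean distance between single sentence vectors is an assumption; the patent says only "inter-sentence distance value":

```python
import math

def second_similarity_values(sentence_vectors, threshold=1.0):
    # Fraction of other sentences whose (assumed Euclidean) distance to
    # this sentence falls within the second preset threshold.
    n = len(sentence_vectors)

    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    return [sum(1 for j in range(n)
                if j != i and dist(sentence_vectors[i], sentence_vectors[j]) <= threshold) / n
            for i in range(n)]
```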
In some embodiments, the text abstract model is a DQN model, and before extracting the text abstract from the sentence set by using the text abstract model and selecting the prediction sentence from the sentence set to form the abstract set, the method further includes: training the text abstract model in a reinforcement learning mode. The training process includes the following steps:
constructing an auxiliary network model corresponding to the text abstract model;
acquiring a training sentence set, wherein the sentence contribution degree of training sentences in the training sentence set is larger than a third preset threshold value, and the sentence contribution degree is calculated by utilizing the sentence weight model;
selecting a prediction statement at the current moment from the training statement set, storing the prediction statement at the current moment into a summary prediction set, and obtaining a current state value at the current moment;
calculating to obtain a reward function value at the current moment according to the current state value;
inputting the prediction statement at the current moment into the text abstract model to obtain a first estimated value function value;
inputting the prediction statement at the current moment into the auxiliary network model to obtain a second estimated value function value;
calculating an objective function value according to the reward function value, the first estimated value function value and the second estimated value function value;
selecting a prediction statement of the next moment from the training statement set according to a preset optimization strategy and the objective function value;
and iteratively executing the process, and updating the model weights of the text abstract model and the auxiliary network model in the iteration process until the objective function value reaches a preset iteration condition.
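The training loop above is the standard DQN pattern with a target (auxiliary) network. The sketch below replaces both neural networks with a tiny tabular Q stand-in so the control flow is visible; the epsilon-greedy rule, learning rate, and reward shape are all assumptions standing in for the patent's undisclosed "preset optimization strategy" and formulas:

```python
import random

class TabularQ:
    """Stand-in for the DQN / auxiliary network: one Q value per sentence."""
    def __init__(self, n_actions):
        self.q = [0.0] * n_actions

    def value(self, action):
        return self.q[action]

    def update(self, action, target, lr=0.5):
        # Move the estimate toward the objective (TD target) value.
        self.q[action] += lr * (target - self.q[action])

    def load_from(self, other):
        # Transplant the main model's weights into the auxiliary model.
        self.q = list(other.q)

def train(sentences, reward_fn, gamma=0.9, steps=50, sync_period=10,
          epsilon=0.1, seed=0):
    random.seed(seed)
    main_q = TabularQ(len(sentences))   # text abstract model
    aux_q = TabularQ(len(sentences))    # auxiliary network model
    abstract_set = []
    for step in range(steps):
        # Epsilon-greedy selection of the prediction sentence (assumed).
        if random.random() < epsilon:
            action = random.randrange(len(sentences))
        else:
            action = max(range(len(sentences)), key=main_q.value)
        abstract_set.append(sentences[action])
        reward = reward_fn(sentences[action])
        # Objective value: reward plus discounted auxiliary estimate.
        main_q.update(action, reward + gamma * aux_q.value(action))
        # Periodically transplant weights into the auxiliary network.
        if (step + 1) % sync_period == 0:
            aux_q.load_from(main_q)
    return main_q.q
```

After training, the informative sentence ends up with the higher value estimate, which is the behaviour the claimed loop relies on.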
In some embodiments, the calculating the prize function value at the current time according to the current state value includes:
acquiring a current moment evaluation index value and a previous moment evaluation index value;
if the evaluation index value at the current moment is smaller than the evaluation index value at the previous moment, calculating to obtain the reward function value at the current moment according to a first formula;
and if the evaluation index value at the current moment is greater than or equal to the evaluation index value at the previous moment, calculating the reward function value at the current moment according to a second formula.
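The patent discloses only the branch structure, not the two formulas. One plausible shape, shown purely as an illustration, penalizes a drop in the evaluation index (e.g. a ROUGE score) and rewards an improvement:

```python
def reward(current_score, previous_score, penalty=-1.0, scale=1.0):
    # Hypothetical first formula: penalize a decreased evaluation index.
    if current_score < previous_score:
        return penalty * (previous_score - current_score)
    # Hypothetical second formula: reward a non-decreasing index.
    return scale * (current_score - previous_score)
```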
In some embodiments, updating model weights of the text excerpt model and the auxiliary network model in the iterative process includes:
updating the model weight of the text abstract model in each iteration process;
and transplanting the model weight of the text abstract model into the auxiliary network model based on a preset updating period so as to update the model weight of the auxiliary network model.
In some embodiments, the calculating the objective function value according to the reward function value, the first estimated value function value and the second estimated value function value includes:
obtaining an attenuation coefficient;
calculating an intermediate value according to the attenuation coefficient and the second estimated value function value;
and calculating the objective function value according to the reward function value, the first estimated value function value and the intermediate value.
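This intermediate-value construction matches the usual DQN temporal-difference target (reward plus attenuation coefficient times the auxiliary network's estimate). How the target is compared with the first estimated value is not disclosed; the squared TD error below is an assumption:

```python
def objective_value(reward, first_estimate, second_estimate, gamma=0.9):
    # Intermediate value: attenuation coefficient times the auxiliary
    # network's estimated value.
    intermediate = gamma * second_estimate
    td_target = reward + intermediate
    # Assumed objective: squared error between target and main estimate.
    return (td_target - first_estimate) ** 2
```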
In some embodiments, the text abstract model includes a first hidden layer, a second hidden layer, and a softmax layer. The first model parameters of the first hidden layer include a first weight matrix and a first bias amount, and the second model parameters of the second hidden layer include a second weight matrix and a second bias amount; the first model parameters and the second model parameters are updated during each iteration. The step of inputting the prediction sentence at the current moment into the text abstract model to obtain the first estimated value function value includes:
inputting the prediction statement into the first hidden layer to obtain a first output;
inputting the first output into the second hidden layer to obtain a second output;
and inputting the second output into the softmax layer to obtain the first estimated value function value.
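The two-hidden-layer forward pass can be sketched directly. The activation function is not specified in the text, so ReLU on both hidden layers is an assumption; weight shapes and inputs below are illustrative:

```python
import math

def softmax(z):
    # Numerically stable softmax over the second hidden layer's output.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def forward(x, w1, b1, w2, b2):
    # First hidden layer: first weight matrix and first bias amount.
    h1 = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
          for row, b in zip(w1, b1)]
    # Second hidden layer: second weight matrix and second bias amount.
    h2 = [max(0.0, sum(w * hi for w, hi in zip(row, h1)) + b)
          for row, b in zip(w2, b2)]
    # Softmax layer yields the estimated value distribution.
    return softmax(h2)
```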
To achieve the above object, a second aspect of an embodiment of the present application proposes a text digest generating device, including:
a target text acquisition unit: configured to acquire a target text, where the target text includes a plurality of target sentences;
a sentence contribution degree calculation unit: configured to calculate the contribution degree of each target sentence by using a sentence weight model to obtain the sentence contribution degree of each target sentence;
a sorting unit: configured to sort the plurality of target sentences according to the sentence contribution degree;
a target sentence selection unit: configured to select N target sentences according to the sorting result to obtain a sentence set, where N is an integer greater than 1;
an abstract extraction unit: configured to extract a text abstract from the sentence set by using a text abstract model, and to select target sentences from the sentence set to form an abstract set;
an abstract generation unit: configured to generate a text abstract of the target text according to the target sentences in the abstract set.
To achieve the above object, a third aspect of the embodiments of the present application proposes an electronic device, which includes a memory and a processor, the memory storing a computer program, the processor implementing the method according to the first aspect when executing the computer program.
To achieve the above object, a fourth aspect of the embodiments of the present application proposes a storage medium, which is a computer-readable storage medium, storing a computer program, which when executed by a processor implements the method described in the first aspect.
According to the text abstract generation method, device, equipment and storage medium of the embodiments, a target text containing a plurality of target sentences is acquired; the sentence contribution degree of each target sentence is calculated by using the sentence weight model; the target sentences are sorted and selected according to their contribution degrees to obtain a sentence set; a text abstract is extracted from the sentence set by using the text abstract model, and target sentences are selected from the sentence set to form an abstract set; and a text abstract of the target text is generated from the target sentences in the abstract set. Because the sentence set used for generating the abstract is selected according to sentence contribution degrees obtained from the sentence weight model, the text abstract model is more inclined to generate the abstract from high-contribution sentences. This improves the accuracy of the generated text abstract and solves the problem in the related art that prior information of sentences is not considered when generating a text abstract, so that long sentences or sentences with a low information content are selected as the abstract and its accuracy is low.
Drawings
Fig. 1 is a flowchart of a text abstract generation method according to an embodiment of the present invention.
Fig. 2 is a flowchart of step S120 in fig. 1.
Fig. 3 is a flowchart of step S121 in fig. 2.
Fig. 4 is a flowchart of step S121 in fig. 2.
Fig. 5 is a flowchart of step S122 in fig. 2.
Fig. 6a to fig. 6c are schematic diagrams of a Voronoi vector diagram generating process of the text abstract generating method according to the embodiment of the invention.
Fig. 7 is a flowchart of step S123 in fig. 2.
Fig. 8 is a flowchart of step S124 in fig. 2.
Fig. 9 is a flowchart of a text summary generating method according to an embodiment of the present invention.
Fig. 10 is a flowchart of step 940 in fig. 9.
Fig. 11 is a block diagram of a text summarization apparatus according to still another embodiment of the present invention.
Fig. 12 is a schematic hardware structure of an electronic device according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
It should be noted that although functional block division is performed in a device diagram and a logic sequence is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the block division in the device, or in the flowchart.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing embodiments of the invention only and is not intended to be limiting of the invention.
First, several nouns involved in the present invention are parsed:
Artificial intelligence (AI): a technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. As a branch of computer science, it attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence; research in this field includes robotics, speech recognition, image recognition, natural language processing and expert systems. Artificial intelligence can simulate the information processes of human consciousness and thinking. It is also the theory, method, technique and application system of using a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results.
Natural language processing (Natural Language Processing, NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between people and computers in natural language, and is a science integrating linguistics, computer science and mathematics. Research in this field therefore involves natural language, i.e. the language people use daily, so it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graph techniques, and the like.
Reinforcement learning (Reinforcement Learning, RL), also known as evaluative learning, is one of the paradigms and methodologies of machine learning; it describes and solves the problem of an agent learning a strategy that maximizes return, or achieves a specific goal, during its interaction with an environment. A common model for reinforcement learning is the standard Markov decision process. According to the given conditions, reinforcement learning can be divided into model-based and model-free reinforcement learning, and into active and passive reinforcement learning. Algorithms for solving reinforcement learning problems can be divided into policy search algorithms and value function algorithms. Deep learning models can be used in reinforcement learning to form deep reinforcement learning.
Voronoi diagram: also called Thiessen polygons or a Dirichlet tessellation, it consists of a set of contiguous polygons formed by the perpendicular bisectors of the line segments connecting adjacent points. N distinct points on a plane partition the plane according to the nearest-neighbour principle, with each point associated with its nearest-neighbour region.
Today, people are flooded with information in daily work and life, and considerable effort is needed to screen the genuinely useful content from sources such as news and meeting minutes. A text abstract can compress lengthy text into short text while preserving the core theme of the original, greatly reducing this extra human effort.
Text abstracts in the related art include extractive abstracts, such as the TextRank method, which uses a neural network to extract certain key sentences from the text to be compressed to form the abstract. Specifically, a network model scores each sentence in the text to be compressed, the score is treated as the sentence's weight, and a specified number of the highest-weighted sentences are selected to form the final abstract. However, the text abstract models in the related art do not extract prior information of sentences, so long sentences with a low information content are easily selected as the abstract; the accuracy of the final abstract is low and cannot meet people's information acquisition needs.
Based on the above, the embodiments of the invention provide a text abstract generation method, device, equipment and storage medium that obtain the sentence contribution degree of each target sentence by using a sentence weight model and then select the sentence set used for generating the text abstract according to those contribution degrees, so that the text abstract model is more inclined to generate the text abstract from high-contribution sentences. This improves the accuracy of the generated text abstract and solves the problem in the related art that prior information of sentences is not considered when generating the abstract, so that long or low-information sentences are selected as the abstract and its accuracy is low.
The embodiments of the invention provide a text abstract generation method, a device, equipment and a storage medium; the text abstract generation method is described first through the following embodiments.
The embodiments of the invention can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system of using a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence research concerns the design principles and implementation methods of various intelligent machines, enabling them to perceive, reason and make decisions.
The artificial intelligence technology is a comprehensive subject, and relates to the technology with wide fields, namely the technology with a hardware level and the technology with a software level. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
The embodiment of the invention provides a text abstract generation method, which relates to the technical field of artificial intelligence, in particular to the technical field of data mining. The text abstract generating method provided by the embodiment of the invention can be applied to a terminal, a server and a computer program running in the terminal or the server. For example, the computer program may be a native program or a software module in an operating system; the method can be a local (Native) Application program (APP), namely a program which needs to be installed in an operating system to run, such as a client supporting text abstract generation, or an applet, namely a program which only needs to be downloaded into a browser environment to run; but also an applet that can be embedded in any APP. In general, the computer programs described above may be any form of application, module or plug-in. Wherein the terminal communicates with the server through a network. The text digest generation method may be performed by a terminal or a server, or by a terminal and a server in cooperation.
In some embodiments, the terminal may be a smart phone, tablet, notebook, desktop computer, smart watch, or the like. The terminal may also be an intelligent vehicle-mounted device, which applies the text abstract generation method of this embodiment to provide related services and improve the driving experience. The server may be an independent server, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms; it may also be a service node in a blockchain system, where the service nodes form a peer-to-peer (P2P) network, the P2P protocol being an application-layer protocol running on top of the Transmission Control Protocol (TCP). The server may run the text abstract generation system and interact with the terminal through it; for example, the server may be provided with corresponding software, which may be an application implementing the text abstract generation method, but is not limited to the above forms. The terminal and the server may be connected through Bluetooth, USB (Universal Serial Bus), a network, or other communication connections, which is not limited here.
The invention is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The text abstract generating method in the embodiment of the invention is first described below.
Fig. 1 is an optional flowchart of a text summarization method according to an embodiment of the present invention, where the method in fig. 1 may include, but is not limited to, steps S110 to S160. It should be understood that the order of steps S110 to S160 in fig. 1 is not particularly limited, and the order of steps may be adjusted, or some steps may be reduced or increased according to actual requirements.
Step S110: and obtaining the target text segment.
In an embodiment, the target text, i.e. the text to be summarized, may be any text content, such as news reports, articles, web page text, etc. It can be understood that the target text in this embodiment includes a plurality of target sentences. For example, the target text segment may be divided according to preset punctuation marks to obtain the plurality of target sentences corresponding to the target text segment, where a preset punctuation mark is one that indicates the end of a sentence, such as a period, a question mark, or an exclamation mark. In an embodiment, a network model for segmentation may also be built in advance to detect the positions at which the target text segment should be split.
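As a rough, non-authoritative sketch of the punctuation-based splitting described above (the punctuation set and the regular expression are assumptions, not taken from the embodiment):

```python
import re

# Sentence-ending marks assumed for illustration: period, question mark,
# exclamation mark, plus their full-width (Chinese) counterparts.
END_MARKS = r'(?<=[.!?\u3002\uff1f\uff01])\s*'

def split_sentences(text):
    """Divide a target text segment into target sentences at preset
    sentence-ending punctuation marks."""
    parts = re.split(END_MARKS, text)
    return [p.strip() for p in parts if p.strip()]

sentences = split_sentences("First sentence. Second one? Third!")
```

A learned segmentation model, as also mentioned above, could replace this rule-based splitter without changing the rest of the pipeline.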
Step S120: and calculating the contribution degree of each target sentence by using the sentence weight model to obtain the sentence contribution degree of each target sentence.
In order to utilize prior information of sentences in the text abstract generation process to improve the accuracy of text abstract generation, in an embodiment, a sentence weight model is utilized to calculate the contribution degree of each target sentence, so as to obtain the sentence contribution degree of each target sentence, wherein the sentence contribution degree of the target sentence is used for representing prior information contained in the target sentence.
In an embodiment, referring to fig. 2, which is a flowchart showing a specific implementation of step S120 in an embodiment, in this embodiment, the step S120 of calculating a contribution degree of each target sentence by using a sentence weight model to obtain a sentence contribution degree of each target sentence includes:
step S121: and calculating grid parameters of the target sentence.
In one embodiment, the lattice parameters of the target statement include: statement length ratio used for representing the proportion of effective information in the target statement, and similarity value used for representing the correlation degree between the target statement and other target statements.
In an embodiment, referring to fig. 3, which is a flowchart showing a specific implementation of step S121, the step of calculating the statement length ratio of the target statement in this embodiment includes:
step S1211: and obtaining the word segmentation sequence of the target sentence.
In an embodiment, a dictionary-based word segmentation method may be used to segment a target sentence to obtain a word segmentation sequence; this method matches character strings in the text of the target sentence against words in a pre-established dictionary according to a preset strategy. Preset strategies include: the forward maximum matching method, the reverse maximum matching method, the bidirectional matching word segmentation method, and the like. Alternatively, a statistics-based machine learning algorithm may be adopted to segment the target sentence, using deep learning algorithms to label and train on the different words in the text. The word segmentation process also removes stop words, i.e. words that carry little or no meaning in the target sentence; common stop words can be obtained from a preset stop-word lexicon. The word segmentation method is not particularly limited in this embodiment.
Step S1212: and obtaining the effective word segmentation quantity and the total word segmentation quantity of the word segmentation sequence.
In an embodiment, when the target sentence contains stop words, the length of the word segmentation sequence changes after the stop words are removed, so the effective word segmentation number (after stop-word removal) differs from the total word segmentation number (before stop-word removal); if the target sentence contains no stop words, the effective word segmentation number equals the total word segmentation number.
Step S1213: and calculating according to the ratio of the effective word segmentation quantity to the total word segmentation quantity to obtain the statement length ratio of the target statement.
In an embodiment, the sentence length ratio of each target sentence is calculated according to the effective word segmentation number and the total word segmentation number obtained in the steps, and the sentence length ratio can represent the degree of information content in the target sentence to a certain extent.
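A minimal sketch of steps S1211 to S1213, assuming a tiny illustrative stop-word list (the real stop-word lexicon and tokenizer are not specified by the embodiment):

```python
# Hypothetical stop-word list; a real system would load a full lexicon.
STOP_WORDS = {"the", "a", "of", "is", "and", "to"}

def statement_length_ratio(tokens):
    """Effective word-segmentation count (stop words removed) divided by
    the total word-segmentation count; equals 1.0 for sentences that
    contain no stop words."""
    total = len(tokens)
    if total == 0:
        return 0.0
    effective = sum(1 for t in tokens if t.lower() not in STOP_WORDS)
    return effective / total

ratio = statement_length_ratio(["the", "model", "generates", "a", "summary"])
# 3 effective tokens out of 5 total
```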
From the above, the statement length ratio of each target sentence reflects the proportion of effective word segmentations it contains. To further reflect the information association of the target sentences, the similarity relationship between target sentences is used to characterize information association in another dimension.
In an embodiment, referring to fig. 4, which is a flowchart showing a specific implementation of step S121, the step of calculating the similarity value of the target sentence in this embodiment includes:
Step S1214: and obtaining a single sentence vector of each target sentence.
In one embodiment, an embedded vector of each target sentence is obtained and denoted as a single sentence vector, and the single sentence vectors are numbered in order. Assuming there are K target sentences, the single sentence vectors are denoted as S_D = {D1, D2, …, DK}, where Di is the single sentence vector of the i-th target sentence.
Step S1215: and calculating the average sentence vector of the single sentence vectors.
In an embodiment, the single sentence vectors are summed in order of sentence number to obtain a sum vector, i.e. the vector sum of the single sentence vectors of all target sentences in the target text, expressed as: D = D1 + D2 + … + DK. The sum vector is then averaged to obtain the average sentence vector, expressed as: D/K. It can be understood that the sum vector represents, to a certain extent, the associated information of all target sentences, and the average sentence vector likewise contains information from all target sentences.
Step S1216: and calculating according to the single sentence vector and the average sentence vector to obtain the similarity value of the target sentence.
In an embodiment, the vector distance between the single sentence vector of each target sentence and the average sentence vector is calculated using cosine similarity, which measures the similarity of two vectors by the cosine of the angle between them, i.e. whether the two vectors point in approximately the same direction. If the two vectors have the same direction, the cosine similarity is 1; when the two vectors are at a right angle, it is 0; when they point in diametrically opposite directions, it is -1. Cosine similarity is therefore independent of vector length and depends only on direction. In this embodiment, cosine similarity represents the closeness between the single sentence vector of each target sentence and the average sentence vector, quantifying their similarity under a unified standard.
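The average-vector and cosine-similarity computations of steps S1215 and S1216 can be sketched as follows (plain-list vectors are used for illustration; any embedding model could supply the single sentence vectors):

```python
import math

def average_vector(vectors):
    """Average sentence vector D/K: component-wise mean of the
    single sentence vectors."""
    k = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / k for i in range(dim)]

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 1 for same direction,
    0 for orthogonal, -1 for opposite directions."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

single = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy single sentence vectors
avg = average_vector(single)
sims = [cosine_similarity(d, avg) for d in single]
```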
From the above, the statement length ratio of each target sentence represents the information-amount attribute of the target sentence in one dimension, while the similarity value between each single sentence vector and the average sentence vector represents the target sentence's degree of association with the whole in another dimension.
Step S122: and carrying out parameter mapping on the grid parameters of each target sentence to obtain a Voronoi vector diagram.
In an embodiment, the statement length ratio and the similarity value of each target statement are associated by using a Voronoi vector diagram, specifically, the grid parameters of each target statement are subjected to parameter mapping to construct the Voronoi vector diagram, wherein the horizontal axis of the Voronoi vector diagram is used for representing the statement length ratio, and the vertical axis of the Voronoi vector diagram is used for representing the similarity value.
In an embodiment, referring to fig. 5, which is a flowchart of a specific implementation of step S122 shown in an embodiment, in this embodiment, the step of performing parameter mapping on the grid parameters of each target sentence to obtain a Voronoi vector diagram includes:
step S1221: and generating a center point of each target sentence according to the grid parameters of each target sentence.
In one embodiment, a Voronoi coordinate system with a horizontal axis and a vertical axis is first constructed, and the center point of each target sentence is marked in it, where the abscissa of a center point represents the statement length ratio of the target sentence and the ordinate represents its similarity value. Let the K target sentences be denoted S1, S2, …, SK, with corresponding center points S1', S2', …, SK', where Si' denotes the center point of the i-th target sentence, Si' = (statement length ratio, similarity value). In one embodiment, referring to FIG. 6a, all center points are marked in the Voronoi coordinate system; six center points are shown: S1', S2', S3', S4', S5', and S6'.
Step S1222: the contiguous triangular faces are constructed from the center points based on nearest neighbor principles.
In an embodiment, the adjoining triangular surfaces are triangles whose vertices are the center points of the above steps. Based on the nearest-neighbor principle, the closest center points are selected to form triangles, different triangles do not intersect, and traversal yields a plurality of adjoining triangular surfaces.
In one embodiment, the adjoining triangular faces have the following constraint properties: 1) Among the adjoining triangular faces formed from the point set of center points, the circumcircle of each adjoining triangular face contains no other center point of the set, i.e. the circumcircle of any adjoining triangular face is empty of other center points. 2) For the convex quadrilateral formed by any 2 adjacent triangular faces, swapping its diagonal does not increase the minimum interior angle of the two triangles.
In one embodiment, the generation algorithm for the adjoining triangular faces is as follows: first, take any center point as a starting point, connect it to its nearest center point, and use this edge of an adjoining triangular face as a base line. Then find the third center point that forms an adjoining triangular face with the base line under the two constraint properties above. Finally, connect the two endpoints of the base line with the third center point to form new base lines, and repeat until the triangulation is complete.
Referring to FIG. 6b, the center points S1', S2', and S3' form an adjoining triangular surface; the center points S2', S3', and S4' form an adjoining triangular surface; the center points S1', S3', and S5' form an adjoining triangular surface; the center points S3', S4', and S6' form an adjoining triangular surface; and the center points S3', S5', and S6' form an adjoining triangular surface. This embodiment thus obtains a plurality of adjoining triangular surfaces from the center points.
Step S1223: a Voronoi vector diagram is generated from the contiguous triangular faces, the Voronoi vector diagram including a word segmentation grid for each target sentence.
In an embodiment, for each side of an adjoining triangular surface, take the midpoint of the side and draw its perpendicular bisector. The three perpendicular bisectors of each adjoining triangular surface meet at one intersection point, which is the center of the circumcircle of that adjoining triangular surface. The intersection points are connected according to the nearest-neighbor principle and the positional relationships of the adjoining triangular surfaces, and the polygon faces of the edge region are closed off by preset boundary parameters, such as a horizontal-axis boundary value and a vertical-axis boundary value, thereby forming the Voronoi vector diagram. In this embodiment, the polygon face where each center point is located serves as the word segmentation grid of that center point, i.e. of the target sentence. Referring to FIG. 6c, six center points S1', S2', S3', S4', S5', and S6' are illustrated as solid circles for clarity, dashed lines illustrate the adjoining triangular faces, and the positions of the center points follow FIGS. 6a and 6b. A plurality of polygon faces (shown by solid lines) is obtained from the intersection points and the preset horizontal-axis and vertical-axis boundary values; each polygon face contains one center point, with the circumcircle shown in the figure, and together the polygon faces form the Voronoi vector diagram.
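The claim that the perpendicular bisectors of a triangle's sides meet at the circumcircle center (a vertex of the Voronoi diagram) can be checked numerically. This standalone sketch uses the standard circumcenter formula rather than the embodiment's full construction:

```python
def circumcenter(p1, p2, p3):
    """Circumcenter of a triangle: the intersection of the perpendicular
    bisectors of its sides, i.e. the Voronoi vertex shared by the cells
    of p1, p2 and p3 in a Delaunay triangulation."""
    ax, ay = p1; bx, by = p2; cx, cy = p3
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy)

# For a right triangle, the circumcenter is the midpoint of the hypotenuse.
center = circumcenter((0.0, 0.0), (2.0, 0.0), (0.0, 2.0))
```

Production code would more likely delegate the whole construction to a computational-geometry library such as SciPy's `scipy.spatial.Voronoi`, which builds the cells directly from the center points.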
As can be seen from the above, the statement length ratio and the similarity value of each target statement are associated by using the Voronoi vector diagram, wherein the horizontal axis of the Voronoi vector diagram is used for representing the statement length ratio and the vertical axis of the Voronoi vector diagram is used for representing the similarity value.
Step S123: and calculating a first similarity value of each target sentence according to the Voronoi vector diagram.
In an embodiment, referring to fig. 7, which is a flowchart showing a specific implementation of step S123, the step of calculating the first similarity value of each target sentence by using the Voronoi vector diagram in this embodiment includes:
step S1231: and calculating the inter-sentence aggregation degree between each target sentence and other target sentences according to the grid area of the word segmentation grid.
In an embodiment, since each word segmentation grid contains one target sentence and is converted from the statement length ratio and similarity value of that sentence, the grid area of a word segmentation grid can represent information such as the information content of the target sentence. In this embodiment, the area of each word segmentation grid is first calculated; then, for each target sentence, the area ratio between its grid area and the grid areas of the other target sentences is calculated. This area ratio reflects, to a certain extent, the strength of the association between two target sentences, and is defined here as the inter-sentence aggregation degree. For example, for target sentences Si and Sj with grid areas S(Si) and S(Sj), the inter-sentence aggregation degree between them is expressed as: S(Si)/S(Sj).
Step S1232: and calculating a first quantity of the inter-sentence aggregation degree of each target sentence within a first preset threshold range.
In an embodiment, a first preset threshold range is set, and the inter-sentence aggregation degrees of each target sentence are screened to select those within the range, i.e. for each target sentence, the aggregation degrees corresponding to the other target sentences with higher similarity are kept. The first preset threshold range may be a value interval, or a single value such that any aggregation degree greater than it falls within the range; it may, for example, be a value between 0 and 1. This embodiment does not limit the range, which may be selected according to actual requirements or empirical values.
In the above embodiment, for each target sentence, the first number M1 of the inter-sentence aggregation degree of each target sentence within the first preset threshold range is calculated.
Step S1233: and calculating a first similarity value of each target sentence and other target sentences according to the first quantity and the sentence number of the target sentences.
In an embodiment, the total sentence number K of target sentences is obtained, and the ratio of the first number M1 of each target sentence to K serves as the first similarity value, representing the similarity between that target sentence and the others. The higher the first similarity value, the tighter the association between the target sentence and the other target sentences, and the more the sentence should be considered when generating the text abstract. Specifically, the first similarity value of the i-th target sentence in this embodiment is expressed as: SSi1 = M1/K.
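Steps S1231 to S1233 can be sketched as below; the grid areas and the threshold interval [low, high] are illustrative assumptions (the embodiment leaves the exact range to empirical choice):

```python
def first_similarity_values(areas, low, high):
    """For each target sentence, count the pairwise grid-area ratios
    S(Si)/S(Sj) that fall inside the preset threshold range [low, high]
    (an assumed interval form of the 'first preset threshold range'),
    then divide that count M1 by the total sentence number K."""
    k = len(areas)
    values = []
    for i in range(k):
        m1 = sum(1 for j in range(k)
                 if j != i and low <= areas[i] / areas[j] <= high)
        values.append(m1 / k)   # SSi1 = M1 / K
    return values

vals = first_similarity_values([2.0, 2.2, 1.0, 4.0], low=0.8, high=1.25)
```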
In order to further characterize similarity relationships between different target sentences, the embodiment of the application also calculates second similarity values of the target sentences and other target sentences.
Step S124: a second similarity value of the target sentence is calculated.
In an embodiment, referring to fig. 8, which is a flowchart showing a specific implementation of step S124, the step of calculating the second similarity value of the target sentence in this embodiment includes:
step S1241: an inter-sentence distance value between each target sentence and the other target sentences is calculated.
In an embodiment, the inter-sentence distance value between each target sentence and the other target sentences is calculated using conventional measures such as the Hamming distance, the edit distance (also called the Levenshtein distance), or the Jaccard similarity coefficient. The calculation method of the inter-sentence distance value is not specifically limited here and may be selected or replaced according to actual requirements.
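As one concrete instance of the distance measures named above, a minimal Levenshtein (edit) distance over strings or token sequences:

```python
def levenshtein(a, b):
    """Edit distance between two sequences: the minimum number of
    insertions, deletions and substitutions needed to turn a into b,
    computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

d = levenshtein("kitten", "sitting")   # classic example: distance 3
```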
Step S1242: and calculating a second quantity of the inter-sentence distance value of each target sentence in a second preset threshold range.
In an embodiment, a second preset threshold range is set, and the inter-sentence distance values of each target sentence are screened to select those within the range, i.e. for each target sentence, the distance values corresponding to the other target sentences with higher similarity are kept. The second preset threshold range may be a value interval, or a single value such that any distance value greater than it falls within the range; it may, for example, be a value between 0 and 1. This embodiment does not limit the range, which may be selected according to actual requirements or empirical values.
In the above embodiment, for each target sentence, the second number M2 of the inter-sentence distance values of each target sentence within the second preset threshold range is calculated.
Step S1243: and calculating a second similarity value of each target sentence and other target sentences according to the second quantity and the sentence quantity of the target sentences.
In an embodiment, the total sentence number K of target sentences is obtained, and the ratio of the second number M2 of each target sentence to K serves as the second similarity value, representing the similarity between that target sentence and the others. The higher the second similarity value, the tighter the association between the target sentence and the other target sentences, and the more the sentence should be considered when generating the text abstract. Specifically, the second similarity value of the i-th target sentence in this embodiment is expressed as: SSi2 = M2/K.
Step S125: and calculating according to the first similarity value and the second similarity value to obtain the statement contribution degree of the target statement.
In order to improve the accuracy of the similarity calculation between each target sentence and the others, this embodiment combines the first similarity value and the second similarity value with a weighting coefficient to obtain the sentence contribution degree. The sentence contribution degree of the i-th target sentence is expressed as: CSi = α·SSi1 + (1 − α)·SSi2, where the weighting coefficient α is a number between 0 and 1 and may be selected according to empirical values; this embodiment does not specifically limit it.
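The weighted combination CSi = α·SSi1 + (1 − α)·SSi2 is a one-liner; the α used here is an assumed empirical value:

```python
def sentence_contribution(ss1, ss2, alpha=0.5):
    """CSi = alpha * SSi1 + (1 - alpha) * SSi2, with the weighting
    coefficient alpha chosen empirically between 0 and 1."""
    return alpha * ss1 + (1 - alpha) * ss2

c = sentence_contribution(0.6, 0.4, alpha=0.7)
```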
After obtaining the sentence contribution degree of each target sentence, the embodiment of the application performs the text abstract generation process of the following steps in combination with the sentence contribution degree.
Step S130: and sorting the plurality of target sentences according to the sentence contribution degree.
Step S140: and selecting N target sentences according to the sorting result to obtain a sentence set.
In an embodiment, the target sentences are ordered according to the sentence contribution degree, the first N target sentences are selected as basic sentences for text abstract generation, and N is an integer greater than 1. And forming a statement set by the selected N target statements.
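Steps S130 and S140 amount to a sort-and-truncate; a sketch (the sentence labels are placeholders):

```python
def select_top_sentences(sentences, contributions, n):
    """Sort target sentences by contribution degree (descending) and
    keep the first n as the candidate sentence set."""
    ranked = sorted(zip(sentences, contributions),
                    key=lambda pair: pair[1], reverse=True)
    return [s for s, _ in ranked[:n]]

subset = select_top_sentences(["s1", "s2", "s3"], [0.2, 0.9, 0.5], n=2)
```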
Step S150: and extracting the text abstract from the sentence set by using the text abstract model, and selecting a target sentence from the sentence set to form the abstract set.
In one embodiment, the text summarization model is a DQN model, which is a reinforcement learning model, so that the text summarization model is first trained by reinforcement learning prior to this step.
In one embodiment, referring to FIG. 9, the steps for training a text summarization model using reinforcement learning are shown as one embodiment and include:
step S910: and constructing an auxiliary network model corresponding to the text abstract model.
In an embodiment, the text summarization model includes a first hidden layer, a second hidden layer, and a softmax layer connected in sequence. Wherein the first model parameters of the first hidden layer include: the first weight matrix and the first offset, the second model parameters of the second hidden layer include: a second weight matrix and a second bias amount. The method comprises the steps of constructing an auxiliary network model corresponding to the text abstract model, wherein the model structure of the auxiliary network model is the same as that of the text abstract model, and initial model parameters are the same before training. It will be appreciated that the text summarization model may also contain more hidden layers to achieve better generation.
Step S920: a training sentence set is obtained.
In an embodiment, for the training sentences in a training sentence set, initial sentences are first obtained from training paragraphs, and the sentence contribution degree of each initial sentence is calculated with the sentence weight model using the steps above. The number of training sentences is selected according to the training depth: a third preset threshold is set, and initial sentences whose contribution degree exceeds it are selected as training sentences; alternatively, a preset number of top-ranked initial sentences is selected directly according to empirical values. The aim is to use the prior information of the training text to select strongly associated sentences as training sentences, further improving the generation accuracy of the text abstract model. The training sentence set A is the action set of the reinforcement learning training process in this embodiment.
Step S930: and selecting a prediction statement at the current moment from the training statement set, storing the prediction statement at the current moment into the abstract prediction set, and obtaining a current state value at the current moment.
In one embodiment, the action a of reinforcement learning refers to selecting a prediction statement to store in the abstract prediction set after each iteration training, and the initial abstract prediction set is an empty set. It can be understood that the abstract prediction set is to select a training sentence from the training sentence set to generate a final text abstract, and when one training sentence in the training sentence set is selected by a certain iteration process and stored in the abstract prediction set, the training sentence does not participate in the next action process. In this embodiment, the current state value s at the current time point refers to the summary prediction set at the current time point.
Step S940: and calculating the rewarding function value at the current moment according to the current state value.
In one embodiment, a reward function r (s, a) is utilized to evaluate the accuracy of text summaries that can be generated from a collection of summary predictions.
In one embodiment, referring to fig. 10, the step of calculating the prize function value at the current time according to the current state value includes:
step S941: and acquiring the evaluation index value at the current moment and the evaluation index value at the previous moment.
In an embodiment, the evaluation index value is characterized by a BLEU value. Given a reference abstract sentence and the sentence corresponding to the current state value, where the candidate sentence has length n and m of its words appear in the reference abstract sentence, the BLEU value is expressed as m/n. It can be understood that multiple n-gram evaluation indexes may be selected to obtain corresponding BLEU values, where n-gram refers to n consecutive words; for example, the four indexes BLEU-1, BLEU-2, BLEU-3, and BLEU-4 may be used.
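A simplified n-gram precision in the spirit of the m/n BLEU value described above; the clipping and brevity penalty of full BLEU are deliberately omitted, which is an assumption on top of the text:

```python
def bleu_precision(candidate, reference, n=1):
    """Of the candidate's n-grams, count how many (m) also occur in the
    reference, and return m divided by the number of candidate n-grams.
    No clipping or brevity penalty is applied in this sketch."""
    def ngrams(tokens, size):
        return [tuple(tokens[i:i + size])
                for i in range(len(tokens) - size + 1)]
    cand = ngrams(candidate, n)
    ref = set(ngrams(reference, n))
    if not cand:
        return 0.0
    m = sum(1 for g in cand if g in ref)
    return m / len(cand)

score = bleu_precision(["the", "cat", "sat"], ["the", "cat", "sat", "down"])
```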
In an embodiment, the current-time evaluation index value and the previous-time evaluation index value are acquired according to the iteration time. If the current time is time i, the current-time evaluation index value is denoted BLEU_i and the previous-time evaluation index value is denoted BLEU_{i-1}.
Step S942: and if the evaluation index value at the current moment is smaller than the evaluation index value at the previous moment, calculating to obtain the rewarding function value at the current moment according to a first formula.
In one embodiment, the first formula is used to calculate the reward function value when the effect at the current time is weaker than at the previous time, and the reward function value r(s, a) is expressed as:

[formula image not reproduced in the extracted text]
step S943: and if the evaluation index value at the current moment is greater than or equal to the evaluation index value at the previous moment, calculating the reward function value at the current moment according to a second formula.
In one embodiment, the second formula is used to calculate the reward function value when the effect at the current time improves over the previous time, and the reward function value r(s, a) is expressed as:

r(s, a) = BLEU_i − BLEU_{i-1}
in the above embodiment, different reward functions are set according to the change of the learning effect by using step S942 and step S943, so that the reward and punishment mechanism of the reward functions can be further improved, and the learning effect of the text abstract model can be improved.
The reinforcement learning in the embodiment of the application aims to maximize the reward function value brought about by actions: an objective function to be optimized is set, and the weights of the text abstract generation model are optimized through the objective function so as to maximize the reward function value.
Step S950: and inputting the prediction statement at the current moment into the text abstract model to obtain a first valuation function value.
Step S960: and inputting the prediction statement at the current moment into the auxiliary network model to obtain a second estimated value function value.
In one embodiment, assume the weight parameters of the text abstract model are denoted θ and the weight parameters of the auxiliary network model are denoted θ⁻. The first evaluation function value, generated by the text abstract model for the current state value s under the current action a, is expressed as: Q_θ(s, a). The second evaluation function value, generated by the auxiliary network model for the state value s' under the action a', is expressed as: Q_{θ⁻}(s', a').
step S970: and calculating the objective function value according to the reward function value, the first estimated function value and the second estimated function value.
In one embodiment, step S970 is specifically: and obtaining an attenuation coefficient, calculating an intermediate value according to the attenuation coefficient and the second estimated value function value, and calculating an objective function value according to the reward function value, the first estimated value function value and the intermediate value.
In one embodiment, the objective function value is expressed as:

L(θ) = ( r(s, a) + γ·max_{a'} Q_{θ⁻}(s', a') − Q_θ(s, a) )²

where γ·max_{a'} Q_{θ⁻}(s', a') represents the intermediate value and γ represents the attenuation coefficient, a value between 0 and 1.
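As a numeric sketch of the objective value in step S970, assuming the standard DQN squared temporal-difference error (the specific reward, discount and Q-values below are placeholders):

```python
def dqn_objective(reward, gamma, next_q_values, current_q):
    """Squared temporal-difference error assumed as the objective:
    (r + gamma * max_a' Q_target(s', a') - Q(s, a))**2, where
    gamma * max(...) plays the role of the intermediate value."""
    intermediate = gamma * max(next_q_values)
    return (reward + intermediate - current_q) ** 2

loss = dqn_objective(reward=0.1, gamma=0.9,
                     next_q_values=[0.5, 0.8], current_q=0.6)
```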
Step S980: and selecting a prediction statement at the next moment from the training statement set according to a preset optimization strategy and an objective function value.
In an embodiment, the optimization strategy is used to decide how to select the next state, i.e., how to select the next training sentence to store in the abstract prediction set. In this embodiment, the preset optimization strategy is an ε-greedy strategy: a probability ε is introduced; with probability 1 − ε the action that maximizes the estimated value function, argmax_{a′} Q_θ⁻(s′, a′), is selected according to the current state value and the reward function value at the current moment, and with probability ε an action is selected at random, i.e., the next training sentence is randomly chosen and stored in the abstract prediction set.
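The ε-greedy selection described above can be sketched as follows; the function name and data layout are illustrative assumptions, not the patent's own implementation:

```python
import random

def epsilon_greedy_select(candidates, q_values, epsilon=0.1):
    """Hypothetical sketch of the preset optimization strategy.

    With probability epsilon, pick the next training sentence uniformly at
    random (exploration); otherwise pick the sentence whose estimated value
    function value is largest (exploitation).
    """
    if random.random() < epsilon:
        return random.choice(candidates)          # explore: random sentence
    best = max(range(len(candidates)), key=lambda i: q_values[i])
    return candidates[best]                       # exploit: argmax over Q values
```

With ε = 0 the choice is fully greedy, so `epsilon_greedy_select(["s1", "s2", "s3"], [0.1, 0.9, 0.3], epsilon=0.0)` returns `"s2"`.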
Step S990: iteratively executing the above process, and updating the model weights of the text abstract model and the auxiliary network model during the iterations, until the objective function value satisfies a preset iteration condition.
In an embodiment, the model parameters of the auxiliary network model are not updated at every iteration. The model update process comprises: updating the model weight of the text abstract model in each iteration, and transplanting the model weight of the text abstract model into the auxiliary network model based on a preset update period so as to update the model weight of the auxiliary network model. That is, the model weight of the text abstract model is updated after each iteration, and after every L iterations the model weight of the text abstract model is copied as a whole into the auxiliary network model, replacing its model weight, which improves the training stability of the reinforcement learning process. It will be appreciated that L may be selected according to actual needs. In one embodiment, the model weights are optimized using gradient descent.
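The periodic weight transplant can be sketched as follows; models are represented here as plain dicts of parameters, and `update_fn` stands in for whatever gradient-descent step is used (both are illustrative assumptions, not the patent's implementation):

```python
def sync_weights(model, target_model):
    """Copy every weight of the text abstract model (theta) into the
    auxiliary network model (theta_minus)."""
    for name, value in model.items():
        target_model[name] = value

def training_loop(model, target_model, num_iters, sync_period, update_fn):
    """theta is updated every iteration; theta_minus is refreshed only
    every sync_period (L) iterations, stabilizing training."""
    for step in range(1, num_iters + 1):
        update_fn(model)                    # gradient step on theta
        if step % sync_period == 0:
            sync_weights(model, target_model)
```

For instance, with 10 iterations and L = 4, the auxiliary network is refreshed after iterations 4 and 8, so it lags the main model by up to L − 1 updates.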
In an embodiment, the text abstract model includes a first hidden layer, a second hidden layer, and a softmax layer connected in sequence. The process of inputting the prediction statement at the current moment into the text abstract model to obtain the first estimated value function value comprises: inputting the prediction statement into the first hidden layer to obtain a first output, inputting the first output into the second hidden layer to obtain a second output, and inputting the second output into the softmax layer to obtain the first estimated value function value. In an embodiment, the first model parameters of the first hidden layer include a first weight matrix w1 and a first bias b1; using the relu activation function and assuming the input is v, the output of the first hidden layer is o1 = relu(w1·v + b1). The second model parameters of the second hidden layer include a second weight matrix w2 and a second bias b2; using the relu activation function, the output of the second hidden layer is o2 = relu(w2·o1 + b2). The output o3 is then obtained through the softmax layer. w1, w2, b1 and b2 are randomly initialized and change with the model weights during training. It will be appreciated that o3 is the first estimated value function value for the text abstract model and the second estimated value function value for the auxiliary network model.
After the text abstract model is trained by the above process, text abstract extraction is performed on the sentence set formed by the N prediction sentences selected according to the sorting result. Because a DQN model from the reinforcement learning field is used to select prediction sentences from the sentence set to form the abstract set, sentences with a higher contribution degree tend to be selected, which improves the quality of the text abstract.
Step S160: generating a text abstract of the target text segment according to the prediction statements in the abstract set.
In an embodiment, the predicted sentences in the abstract set may be arranged in order and output to obtain the text abstract of the target text segment, or the predicted sentences in the abstract set may be semantically adjusted by a network model to generate a more natural text abstract; this embodiment is not particularly limited in this respect.
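The simpler of the two options, arranging the selected sentences in order and joining them, can be sketched as follows; the record layout (a `position` field recording each sentence's place in the source text) is an illustrative assumption:

```python
def assemble_summary(summary_set, separator=" "):
    """Minimal sketch of step S160 without semantic adjustment: restore the
    selected sentences to their original document order and join them."""
    ordered = sorted(summary_set, key=lambda s: s["position"])
    return separator.join(s["text"] for s in ordered)
```

For example, sentences selected out of order are emitted in document order: `assemble_summary([{"position": 2, "text": "world."}, {"position": 1, "text": "Hello"}])` yields `"Hello world."`.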
According to the technical scheme provided by the embodiment of the invention, a target text segment containing a plurality of target sentences is acquired; the contribution degree of each target sentence is calculated using the sentence weight model to obtain each target sentence's contribution degree; the target sentences are sorted by sentence contribution degree and N of them are selected to obtain a sentence set; the text abstract model performs text abstract extraction on the sentence set, selecting target sentences to form the abstract set; and a text abstract of the target text segment is generated from the target sentences in the abstract set.
Because the sentence contribution degree of each target sentence is obtained through the sentence weight model and the sentence set used for abstract generation is then selected according to that contribution degree, the text abstract model is more inclined to generate the text abstract from sentences with a high contribution degree. This improves the accuracy of the generated abstract and solves the problem in the related art that, because prior information about sentences is not considered when generating a text abstract, long sentences or sentences with little information are selected as the abstract, resulting in low abstract accuracy.
The embodiment of the invention also provides a text abstract generating device, which can realize the text abstract generating method, and referring to fig. 11, the device comprises:
target segment acquisition unit 1110: the method is used for acquiring a target text segment, and the target text segment comprises a plurality of target sentences.
Statement contribution degree calculation unit 1120: used for calculating the contribution degree of each target sentence by utilizing the sentence weight model to obtain the sentence contribution degree of each target sentence.
Sorting unit 1130: for ordering the plurality of target sentences according to sentence contribution.
The target sentence selection unit 1140: and the method is used for selecting N target sentences according to the sorting result to obtain a sentence set, wherein N is an integer greater than 1.
The digest extraction unit 1150: the method is used for extracting the text abstract from the sentence set by using the text abstract model, and selecting a target sentence from the sentence set to form the abstract set.
Digest generation unit 1160: for generating a text summary of the target text segment from the target sentence in the summary set.
The specific implementation manner of the text abstract generating device in this embodiment is basically the same as the specific implementation manner of the text abstract generating method, and will not be described herein.
The embodiment of the invention also provides electronic equipment, which comprises:
at least one memory;
at least one processor;
at least one program;
the program is stored in the memory, and the processor executes the at least one program to implement the text digest generation method of the present invention as described above. The electronic equipment can be any intelligent terminal including a mobile phone, a tablet personal computer, a personal digital assistant (Personal Digital Assistant, PDA for short), a vehicle-mounted computer and the like.
Referring to fig. 12, fig. 12 illustrates a hardware structure of an electronic device according to another embodiment, the electronic device includes:
the processor 1201 may be implemented by a general purpose CPU (central processing unit), a microprocessor, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solution provided by the embodiments of the present invention;
the memory 1202 may be implemented in the form of a ROM (read only memory), a static storage device, a dynamic storage device, or a RAM (random access memory). The memory 1202 may store an operating system and other application programs; when the technical solutions provided in the embodiments of the present disclosure are implemented in software or firmware, the relevant program codes are stored in the memory 1202, and the processor 1201 invokes and executes the text abstract generation method of the embodiments of the present disclosure;
an input/output interface 1203 for implementing information input and output;
the communication interface 1204 is configured to implement communication interaction between the device and other devices, and may implement communication in a wired manner (e.g., USB, network cable, etc.), or may implement communication in a wireless manner (e.g., mobile network, WIFI, bluetooth, etc.); and
a bus 1205 for transferring information between various components of the device such as the processor 1201, memory 1202, input/output interface 1203, and communication interface 1204;
wherein the processor 1201, the memory 1202, the input/output interface 1203 and the communication interface 1204 enable communication connection between each other inside the device via a bus 1205.
The embodiment of the application also provides a storage medium, which is a computer readable storage medium, and the storage medium stores a computer program, and the computer program realizes the text abstract generation method when being executed by a processor.
The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
According to the text abstract generation method and device, the electronic equipment and the storage medium provided by the embodiments of the invention, a target text segment containing a plurality of target sentences is acquired; the contribution degree of each target sentence is calculated using the sentence weight model to obtain each target sentence's contribution degree; N target sentences are selected in order of sentence contribution degree to obtain a sentence set; the text abstract model performs text abstract extraction on the sentence set, selecting target sentences to form the abstract set; and a text abstract of the target text segment is generated from the target sentences in the abstract set. Because the sentence contribution degree of each target sentence is obtained through the sentence weight model and the sentence set used for abstract generation is then selected according to that contribution degree, the text abstract model is more inclined to generate the text abstract from sentences with a high contribution degree. This improves the accuracy of the generated abstract and solves the problem in the related art that, because prior information about sentences is not considered when generating a text abstract, long sentences or sentences with little information are selected as the abstract, resulting in low accuracy.
The embodiments described in the embodiments of the present application are for more clearly describing the technical solutions of the embodiments of the present application, and do not constitute a limitation on the technical solutions provided by the embodiments of the present application, and as those skilled in the art can know that, with the evolution of technology and the appearance of new application scenarios, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
It will be appreciated by those skilled in the art that the technical solutions shown in the figures do not constitute limitations of the embodiments of the present application, and may include more or fewer steps than shown, or may combine certain steps, or different steps.
The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the present application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in this application, "at least one" means one or more, and "a plurality" means two or more. "and/or" for describing the association relationship of the association object, the representation may have three relationships, for example, "a and/or B" may represent: only a, only B and both a and B are present, wherein a, B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one (one) of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the above-described division of units is merely a logical function division, and there may be another division manner in actual implementation, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in part or all of the technical solution or in part in the form of a software product stored in a storage medium, including multiple instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing a program.
Preferred embodiments of the present application are described above with reference to the accompanying drawings, and thus do not limit the scope of the claims of the embodiments of the present application. Any modifications, equivalent substitutions and improvements made by those skilled in the art without departing from the scope and spirit of the embodiments of the present application shall fall within the scope of the claims of the embodiments of the present application.

Claims (15)

1. A text summary generation method, comprising:
acquiring a target text, wherein the target text comprises a plurality of target sentences;
calculating the contribution degree of each target sentence by using a sentence weight model to obtain the sentence contribution degree of each target sentence;
sorting the plurality of target sentences according to the sentence contribution degree;
selecting N target sentences according to the sorting result to obtain a sentence set, wherein N is an integer greater than 1;
extracting a text abstract from the sentence set by using a text abstract model, and selecting the target sentence from the sentence set to form an abstract set;
and generating a text abstract of the target text according to the target sentence in the abstract set.
2. The text abstract generating method according to claim 1, wherein the calculating the contribution of each target sentence by using a sentence weight model to obtain the sentence contribution of each target sentence comprises:
calculating grid parameters of the target sentence, wherein the grid parameters comprise: sentence length ratio and similarity value;
performing parameter mapping on the grid parameters of each target sentence to obtain a Voronoi vector diagram, wherein the horizontal axis of the Voronoi vector diagram is used for representing the sentence length ratio, and the vertical axis of the Voronoi vector diagram is used for representing the similarity value;
Calculating a first similarity value of each target sentence according to the Voronoi vector diagram;
calculating a second similarity value of the target sentence;
and calculating statement contribution degree of the target statement according to the first similarity value and the second similarity value.
3. The text excerpt generation method of claim 2, wherein the calculating the statement length ratio of the target statement includes:
acquiring a word segmentation sequence of the target sentence;
acquiring the effective word segmentation quantity and the total word segmentation quantity of the word segmentation sequence;
and calculating the statement length ratio of the target statement according to the ratio of the effective word segmentation quantity to the total word segmentation quantity.
4. The text excerpt generation method of claim 2, wherein calculating the similarity value of the target sentence comprises:
obtaining a single sentence vector of each target sentence;
calculating an average sentence vector of the single sentence vectors;
and calculating the similarity value of the target sentence according to the single sentence vector and the average sentence vector.
5. The text summarization method according to claim 2, wherein the performing parameter mapping on the grid parameters of each target sentence to obtain a Voronoi vector diagram includes:
Generating a center point of each target sentence according to the grid parameters of each target sentence;
constructing an adjacent triangular surface according to the central point based on the nearest neighbor principle;
and generating the Voronoi vector diagram according to the adjacent triangular surfaces, wherein the Voronoi vector diagram comprises a word segmentation grid of each target sentence.
6. The text summarization generation method of claim 5 wherein the computing a first similarity value for each of the target sentences from the Voronoi vector diagram comprises:
calculating the inter-sentence aggregation degree between each target sentence and other target sentences according to the grid area of the word segmentation grid;
calculating a first number of the inter-sentence aggregation degree of each target sentence within a first preset threshold range;
and calculating a first similarity value of each target sentence and other target sentences according to the first quantity and the sentence number of the target sentences.
7. The text excerpt generation method of claim 6, wherein the calculating a second similarity value of the target sentence comprises:
calculating an inter-sentence distance value between each target sentence and other target sentences;
Calculating a second number of inter-sentence distance values of each target sentence within a second preset threshold range;
and calculating a second similarity value of each target sentence and other target sentences according to the second quantity and the sentence number of the target sentences.
8. The method for generating a text abstract according to claim 1, wherein the text abstract model is a DQN model, and the method further comprises, before extracting the text abstract from the sentence set and selecting the target sentence from the sentence set to form an abstract set: training the text abstract model in a reinforcement learning mode; the training process comprises the following steps:
constructing an auxiliary network model corresponding to the text abstract model;
acquiring a training sentence set, wherein the sentence contribution degree of training sentences in the training sentence set is larger than a third preset threshold value, and the sentence contribution degree is calculated by utilizing the sentence weight model;
selecting a prediction statement at the current moment from the training statement set, storing the prediction statement at the current moment into a summary prediction set, and obtaining a current state value at the current moment;
calculating to obtain a reward function value at the current moment according to the current state value;
Inputting the prediction statement at the current moment into the text abstract model to obtain a first estimated value function value;
inputting the prediction statement at the current moment into the auxiliary network model to obtain a second estimated value function value;
calculating an objective function value according to the reward function value, the first estimated function value and the second estimated function value;
selecting a prediction statement of the next moment from the training statement set according to a preset optimization strategy and the objective function value;
and iteratively executing the process, and updating the model weights of the text abstract model and the auxiliary network model in the iteration process until the objective function value reaches a preset iteration condition.
9. The text summary generation method of claim 8, wherein the calculating the bonus function value at the current time according to the current state value comprises:
acquiring a current moment evaluation index value and a previous moment evaluation index value;
if the evaluation index value at the current moment is smaller than the evaluation index value at the previous moment, calculating to obtain the reward function value at the current moment according to a first formula;
and if the evaluation index value at the current moment is greater than or equal to the evaluation index value at the previous moment, calculating the reward function value at the current moment according to a second formula.
10. The text excerpt generation method of claim 8, wherein updating model weights of the text excerpt model and the auxiliary network model in the iterative process comprises:
updating the model weight of the text abstract model in each iteration process;
and transplanting the model weight of the text abstract model into the auxiliary network model based on a preset updating period so as to update the model weight of the auxiliary network model.
11. The text summary generation method of claim 8, wherein said calculating an objective function value from said bonus function value, said first valuation function value, and said second valuation function value comprises:
obtaining an attenuation coefficient;
calculating an intermediate value from the attenuation coefficient and the second estimated value function;
and calculating the objective function value according to the reward function value, the first estimated value function value and the intermediate value.
12. The text summarization generation method of claim 10 wherein the text summarization model comprises a first hidden layer, a second hidden layer, and a softmax layer; the first model parameters of the first hidden layer include: a first weight matrix and a first bias amount, wherein the second model parameters of the second hidden layer comprise: a second weight matrix and a second bias amount; updating the first model parameter and the second model parameter during each iteration; the step of inputting the prediction statement at the current moment into the text abstract model to obtain a first estimated function value comprises the following steps:
Inputting the prediction statement into the first hidden layer to obtain a first output;
inputting the first output into the second hidden layer to obtain a second output;
and inputting the second output into the softmax layer to obtain the first estimated function value.
13. A text digest generating apparatus, comprising:
target text acquisition unit: the method comprises the steps of acquiring a target text, wherein the target text comprises a plurality of target sentences;
statement contribution degree calculation unit: used for calculating the contribution degree of each target sentence by utilizing a sentence weight model to obtain the sentence contribution degree of each target sentence;
a sequencing unit: the method comprises the steps of sorting a plurality of target sentences according to the sentence contribution degree;
a target sentence selection unit: the method comprises the steps of selecting N target sentences according to a sequencing result to obtain a sentence set, wherein N is an integer greater than 1;
a digest extraction unit: the method comprises the steps of extracting text abstracts from a sentence set by using a text abstracting model, and selecting the target sentence from the sentence set to form a abstracted set;
a digest generation unit: used for generating a text abstract of the target text according to the target sentence in the abstract set.
14. An electronic device comprising a memory storing a computer program and a processor that when executing the computer program implements the text excerpt generation method of any of claims 1 to 12.
15. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the text digest generation method of any one of claims 1 to 12.
CN202310001249.3A 2023-01-03 2023-01-03 Text abstract generation method, device, equipment and storage medium Pending CN116186243A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310001249.3A CN116186243A (en) 2023-01-03 2023-01-03 Text abstract generation method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310001249.3A CN116186243A (en) 2023-01-03 2023-01-03 Text abstract generation method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116186243A true CN116186243A (en) 2023-05-30

Family

ID=86445476

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310001249.3A Pending CN116186243A (en) 2023-01-03 2023-01-03 Text abstract generation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116186243A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117312513A (en) * 2023-09-27 2023-12-29 数字广东网络建设有限公司 Document search model training method, document search method and related device
CN117744753A (en) * 2024-02-19 2024-03-22 浙江同花顺智能科技有限公司 Method, device, equipment and medium for determining prompt word of large language model
CN117744753B (en) * 2024-02-19 2024-05-03 浙江同花顺智能科技有限公司 Method, device, equipment and medium for determining prompt word of large language model

Similar Documents

Publication Publication Date Title
CN110532571B (en) Text processing method and related device
CN111898374B (en) Text recognition method, device, storage medium and electronic equipment
CN108875074B (en) Answer selection method and device based on cross attention neural network and electronic equipment
JP6848091B2 (en) Information processing equipment, information processing methods, and programs
CN113392209B (en) Text clustering method based on artificial intelligence, related equipment and storage medium
CN116186243A (en) Text abstract generation method, device, equipment and storage medium
CN113392651B (en) Method, device, equipment and medium for training word weight model and extracting core words
KR102695381B1 (en) Identifying entity-attribute relationships
CN112084307B (en) Data processing method, device, server and computer readable storage medium
JP7417679B2 (en) Information extraction methods, devices, electronic devices and storage media
CN113158554B (en) Model optimization method and device, computer equipment and storage medium
CN112232086A (en) Semantic recognition method and device, computer equipment and storage medium
CN111125348A (en) Text abstract extraction method and device
CN114841146B (en) Text abstract generation method and device, electronic equipment and storage medium
CN113392179A (en) Text labeling method and device, electronic equipment and storage medium
CN113761220A (en) Information acquisition method, device, equipment and storage medium
CN116719999A (en) Text similarity detection method and device, electronic equipment and storage medium
CN116304005A (en) Text abstract generation method, device, equipment and storage medium
CN110347916B (en) Cross-scene item recommendation method and device, electronic equipment and storage medium
CN117131273A (en) Resource searching method, device, computer equipment, medium and product
CN116975434A (en) Content recommendation method and related equipment
CN117151093A (en) Text paragraph recall method, device, equipment and storage medium
CN117093688A (en) Question answering method, question answering device, electronic equipment and storage medium
CN114090778A (en) Retrieval method and device based on knowledge anchor point, electronic equipment and storage medium
CN114818980A (en) Company similarity calculation method based on graph vectors

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination