CN110889292B - Text data viewpoint abstract generating method and system based on sentence meaning structure model - Google Patents

Publication number
CN110889292B
CN110889292B CN201911205403.9A CN201911205403A CN110889292B CN 110889292 B CN110889292 B CN 110889292B CN 201911205403 A CN201911205403 A CN 201911205403A CN 110889292 B CN110889292 B CN 110889292B
Authority
CN
China
Prior art keywords
sentence
topic
sentences
semantic
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911205403.9A
Other languages
Chinese (zh)
Other versions
CN110889292A (en)
Inventor
廖祥文
李晓滨
陈志豪
陈癸旭
吴运兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN201911205403.9A priority Critical patent/CN110889292B/en
Publication of CN110889292A publication Critical patent/CN110889292A/en
Application granted granted Critical
Publication of CN110889292B publication Critical patent/CN110889292B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G06F16/345 Summarisation for human users

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a method and a system for generating a viewpoint abstract from text data based on a sentence meaning structure model. The method first extracts a data set to be processed from a website and preprocesses it; then constructs a topic corpus set and a background corpus set and extracts topic attributes; then performs semantic weight calculation to obtain the semantic weight value of each sentence; then performs association weight calculation to obtain the association weight value of each sentence; and finally extracts the viewpoint abstract of the topic using the topic attributes, the semantic weight values and the association weight values. By building on topic attributes and their emotion information, the method addresses the problems of existing research methods, obtains the viewpoint abstract of a topic text efficiently and accurately, and can be applied to larger-scale data sets.

Description

Text data viewpoint abstract generating method and system based on sentence meaning structure model
Technical Field
The invention relates to the technical field of internet big data analysis, in particular to a method and a system for generating a viewpoint abstract of text data based on a sentence meaning structural model.
Background
With the development of the internet, people obtain more and more information online, and data such as microblogs, website news and commodity comments take up a growing share of people's online life. To make reading and screening more efficient, a summary of a web text is often extracted for users to preview. This work was initially done manually; as data volumes grew, automatic machine extraction came to be used to generate summaries.
Conventional methods for automatically generating viewpoint abstracts include graph models and ranking models. Representative graph models include TextRank, PageRank and LexRank: sentences are taken as nodes, a relation between sentences is taken as the edge weight, sentence scores are updated iteratively through a random walk model, and a certain number of high-scoring sentences are combined into the viewpoint abstract. Ranking models either construct a sentence scoring function from factors such as the diversity and redundancy of the viewpoint abstract, or rank sentences by relative score using KL divergence or the MMR method, and obtain the viewpoint abstract from the ranking. Both kinds of method ignore the finer-grained topic attributes of the text: they approximate the diversity of the text's subject matter by the diversity of all words in the text and do not consider the influence of subject keywords on the viewpoint abstract, which limits follow-up research on these models to some extent.
At present, researchers at home and abroad have also studied viewpoint abstract models, and both a generative viewpoint abstract model and a viewpoint abstract model based on a submodular function have been proposed. The generative method works well, but the time complexity of solving it is too high: even on a small data set it takes several times as long as other methods, so it cannot be applied in practical big-data scenarios. The method based on a submodular function uses the submodularity property to guarantee that the solution found by a greedy algorithm is no worse than about 63% of the optimal solution, and the greedy algorithm weighs several factors when selecting sentences. Although its experimental results are relatively good, it relies on a manually constructed corpus tree and is therefore unsuitable for wider application scenarios.
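For reference, the 63% figure corresponds to the classical approximation guarantee for greedy maximization of a monotone, non-negative submodular function under a cardinality constraint. The notation below ($f$ for the objective, $S_{\text{greedy}}$ for the greedy solution, $S^{*}$ for the optimal solution) is introduced here for illustration and does not appear in the original text.

```latex
f(S_{\text{greedy}}) \;\ge\; \left(1 - \frac{1}{e}\right) f(S^{*}) \;\approx\; 0.632\, f(S^{*})
```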
Most existing models rely on the diversity of all words in the text's sentences to ensure that the viewpoint abstract covers the subject matter of the text. Word diversity, however, cannot guarantee that the viewpoint abstract covers the subject matter of the source text, and words irrelevant to the subject matter can distort the generated viewpoint abstract.
Disclosure of Invention
In view of the above, the present invention provides a method and a system for generating a viewpoint abstract from text data based on a sentence meaning structure model. Syntactically related words are extracted from the source text by an entity extraction method and used as subject keywords of the text; emotion information about the effective words that serve as evaluation objects in each sentence is analyzed with an emotion analysis method; and sentences are selected and combined into a viewpoint abstract by a selection method based on sentence importance, so that the emotion of the whole viewpoint abstract is as clear as possible and the extracted abstract fits the subject of the text as closely as possible.
The invention is realized by adopting the following scheme: a method for generating a viewpoint abstract based on text data of a sentence meaning structure model specifically comprises the following steps:
extracting a data set to be processed on a website, and preprocessing the data set;
constructing a topic corpus set and a background corpus set, and extracting topic attributes;
semantic weight calculation is carried out to obtain semantic weight values of sentences;
performing association weight calculation to obtain an association weight value of the sentence;
and extracting the viewpoint abstract in the topic by using the topic attribute, the semantic weight value and the associated weight value.
Further, the data set to be processed on the website includes, but is not limited to, microblog data, website news data, and commodity comment data.
Further, the preprocessing specifically comprises:
removing web page links from the comment sentences;
removing comment sentences whose character length is less than 3;
removing common irrelevant words from the comment sentences;
converting all English text to lowercase.
Further, constructing the topic corpus set and the background corpus set and extracting the topic attributes specifically comprises: for the preprocessed text, taking the current topic text as the topic corpus and the texts of other topics as the background corpus; calculating the log-likelihood ratio of each word in the topic corpus; filtering the words with a preset threshold, keeping only words whose part of speech is noun, adjective, verb or numeral; and thereby extracting the topic attributes of the topic corpus.
Further, the semantic weight calculation comprises the following steps:
step S11: calculating the emotion score of each sentence as the emotion characteristics by using an emotion analysis method based on an emotion dictionary;
step S12: extracting lexical features by using a semantic word extraction method based on a semantic dictionary;
step S13: analyzing sentences by using BFS-CSA to obtain sentence meaning structural characteristics;
step S14: semantic weights are calculated for the sentences.
Further, step S14 is specifically: the semantic weight of a sentence is calculated from the sentence meaning structure feature $F_6$ and the lexical features, where the lexical features are of 5 types: the average TF-IDF and part-of-speech (POS) word weight of the effective words of the sentence, $F_1$; the coverage rate of topic words in the sentence, $F_2$; the number of topic words contained in the predicate of the sentence, $F_3$; the number of topic words contained in the general case of the sentence, $F_4$; and the number of effective words in the sentence that are emotion words, $F_5$. The semantic weight value is calculated as follows:
[Formula image: Figure GDA0003593883760000041]
where $P_{con}(S)$ is the semantic weight of sentence $S$, and $F_i$ and $\mu_i$ respectively denote the $i$-th semantic feature value of the sentence and the weighting coefficient of that feature.
Further, the association weight calculation comprises the steps of:
step S21: dividing words by using a sentence meaning structure to generate word vector representation so as to obtain a representation vector of a sentence;
step S22: calculating cosine similarity of expression vectors of the sentences to obtain similarity of the two sentences;
step S23: and constructing a sentence graph model by taking sentences of the document set as nodes, relations among the sentences as edges and similarity among the sentences as weight values, and obtaining the associated weight values of the sentences through semantic overlap ratios of other sentences to the sentences.
Further, the association weight value $R(S_k, S_j)$ of the sentence in step S23 is calculated as follows:
$R(S_k, S_j) = P_{con}(S_j) \cdot s(S_k, S_j)$;
where $s(S_k, S_j)$ denotes the similarity of sentence $S_j$ to sentence $S_k$, and $P_{con}(S_j)$ denotes the semantic weight value of sentence $S_j$.
Further, extracting the viewpoint abstract of the topic by using the topic attributes, the semantic weight value and the association weight value specifically comprises: obtaining the average sentence weight for each topic by weighting the semantic weight value and the association weight value, and finally selecting the 20 sentences with the highest scores as the viewpoint abstract.
The invention also provides a system for generating a viewpoint abstract from text data based on a sentence meaning structure model, comprising a memory, a processor and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the method steps described above.
Compared with the prior art, the invention has the following beneficial effects: syntactically related words are extracted from the source text by an entity extraction method and used as subject keywords of the text; emotion information about the effective words that serve as evaluation objects in each sentence is analyzed with an emotion analysis method; and sentences are selected and combined into a viewpoint abstract by a selection method based on sentence importance, so that the emotion of the whole viewpoint abstract is as clear as possible and the abstract fits the subject of the text as closely as possible. By building on topic attributes and their emotion information, the method addresses the problems of existing research methods, obtains the viewpoint abstract of a topic text efficiently and accurately, and can be applied to larger-scale data sets.
Drawings
FIG. 1 is a schematic diagram of the method of the embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments of the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should further be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
As shown in fig. 1, the present embodiment provides a method for generating a viewpoint summary based on text data of a sentence meaning structure model, which specifically includes the following steps:
extracting a data set to be processed on a website, and preprocessing the data set;
constructing a topic corpus and a background corpus, and extracting topic attributes;
performing semantic weight calculation to obtain a semantic weight value of the sentence;
performing association weight calculation to obtain an association weight value of the sentence;
and extracting the viewpoint abstract in the topic by using the topic attribute, the semantic weight value and the associated weight value.
In this embodiment, the data set to be processed on the website includes, but is not limited to, microblog data, website news data, and commodity comment data.
In this embodiment, the preprocessing specifically includes:
removing web page links from the comment sentences, such as "http://t.cn/rcwwWYQZ";
removing comment sentences whose character length is less than 3, since such sentences carry too little information (most consist of emoticons and contain nothing else useful);
removing common irrelevant words from the comment sentences, such as "group pictures" and "original text forwarding";
converting all English text to lowercase.
To make the method more widely applicable, the raw data are cleaned and irrelevant text is filtered out, so that the topic attributes extracted by the topic attribute extraction method are more accurate. The method is therefore applicable not only to Chinese microblogs but also to website news and commodity comments.
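The four cleaning rules can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `preprocess`, the URL pattern and the `IRRELEVANT_PHRASES` list are assumptions, and the length filter is applied last so that it sees the text remaining after links and irrelevant words are removed.

```python
import re

# Hypothetical phrase list: the text only gives these two examples.
IRRELEVANT_PHRASES = ["group pictures", "original text forwarding"]

# Assumed pattern for web links such as short URLs in microblog posts.
URL_PATTERN = re.compile(r"https?://\S+")


def preprocess(sentences, irrelevant=IRRELEVANT_PHRASES, min_len=3):
    """Clean comment sentences and drop those that end up too short."""
    cleaned = []
    for sentence in sentences:
        text = URL_PATTERN.sub("", sentence)   # remove web page links
        text = text.lower()                    # lowercase all English text
        for phrase in irrelevant:              # remove irrelevant phrases
            text = text.replace(phrase, "")
        text = " ".join(text.split())          # collapse leftover whitespace
        if len(text) >= min_len:               # drop sentences shorter than 3 characters
            cleaned.append(text)
    return cleaned
```

Applying the length filter after the other steps means a sentence consisting only of a link or a forwarding marker is also discarded; running it first, in the order the rules are listed, would keep an empty string in that case.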
In this embodiment, constructing the topic corpus set and the background corpus set and extracting the topic attributes specifically includes: for the preprocessed text, taking the current topic text as the topic corpus and the texts of other topics as the background corpus; calculating the log-likelihood ratio of each word in the topic corpus; filtering the words with a preset threshold, keeping only words whose part of speech is noun, adjective, verb or numeral; and thereby extracting the topic attributes of the topic corpus.
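A sketch of log-likelihood-ratio filtering is given below. The text does not state which form of the statistic is used, so this assumes the common two-corpus form that compares observed and expected word frequencies. The POS tag names, the default threshold of 10.83 and the rule of keeping only words over-represented in the topic corpus are all assumptions made for illustration; tokens are expected as pre-tagged `(word, pos)` pairs, and both corpora must be non-empty.

```python
import math
from collections import Counter

# Hypothetical tag names for noun, adjective, verb and numeral.
ALLOWED_POS = {"n", "a", "v", "m"}


def _term(observed, expected):
    # Contribution of one cell; 0 * log(0) is taken as 0.
    return observed * math.log(observed / expected) if observed > 0 else 0.0


def log_likelihood_ratio(a, b, c, d):
    """a, b: word counts in topic / background corpus; c, d: corpus sizes in tokens."""
    e1 = c * (a + b) / (c + d)   # expected count in the topic corpus
    e2 = d * (a + b) / (c + d)   # expected count in the background corpus
    return 2.0 * (_term(a, e1) + _term(b, e2))


def extract_topic_attributes(topic_tokens, background_tokens, threshold=10.83):
    """Return {word: score} for allowed-POS words whose score passes the threshold."""
    topic_counts = Counter(w for w, pos in topic_tokens if pos in ALLOWED_POS)
    background_counts = Counter(w for w, _ in background_tokens)
    c, d = len(topic_tokens), len(background_tokens)
    attributes = {}
    for word, a in topic_counts.items():
        b = background_counts[word]
        if a / c <= b / d:       # keep only words over-represented in the topic corpus
            continue
        score = log_likelihood_ratio(a, b, c, d)
        if score >= threshold:
            attributes[word] = score
    return attributes
```

A word that occurs at the same relative frequency in both corpora scores zero, so common function words are filtered out even before the POS restriction is applied.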
In this embodiment, the semantic weight calculation includes the following steps:
step S11: calculating the emotion score of each sentence as the emotion characteristics by using an emotion analysis method based on an emotion dictionary;
step S12: extracting lexical features by using a semantic word extraction method based on a semantic dictionary;
step S13: analyzing sentences by using BFS-CSA to obtain sentence meaning structural characteristics;
step S14: semantic weights are calculated for the sentences.
Preferably, the current topic text is set as the topic corpus and the texts of other topics as the background corpus. A positive emotion attribute set and a negative emotion attribute set are obtained from emotion dictionaries, such as a positive emotion word dictionary, a negative emotion word dictionary and dictionaries for various parts of speech, and the emotion score of each sentence is calculated by the dictionary-based emotion analysis method and used as the emotion attribute feature. The emotion vocabulary ontology covers 7 part-of-speech types: nouns (noun), verbs (verb), adjectives (adj), adverbs (adv), network words (nw), idioms (idiom) and prepositional phrases (prep). First, a part-of-speech set is obtained by the semantic word extraction method based on the semantic dictionary; second, this set is matched against the semantic words in each sentence to obtain the lexical features. Finally, the sentences are analyzed with BFS-CSA to obtain their sentence meaning structure.
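The text does not give the scoring rule used by the dictionary-based emotion analysis. As one simple possibility, the sketch below counts positive hits minus negative hits; the function name `emotion_score` and the two tiny word sets are placeholders, standing in for the real emotion dictionaries.

```python
# Placeholder dictionaries; real emotion dictionaries would be far larger.
POSITIVE_WORDS = {"good", "great", "clear"}
NEGATIVE_WORDS = {"bad", "slow", "broken"}


def emotion_score(tokens, positive=POSITIVE_WORDS, negative=NEGATIVE_WORDS):
    """Assumed scoring rule: number of positive words minus number of negative words."""
    positive_hits = sum(1 for token in tokens if token in positive)
    negative_hits = sum(1 for token in tokens if token in negative)
    return positive_hits - negative_hits
```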
In this embodiment, step S14 specifically includes: the semantic weight of a sentence is calculated from the sentence meaning structure feature $F_6$ and the lexical features, where the lexical features are of 5 types: the average TF-IDF and part-of-speech (POS) word weight of the effective words of the sentence, $F_1$; the coverage rate of topic words in the sentence, $F_2$; the number of topic words contained in the predicate of the sentence, $F_3$; the number of topic words contained in the general case of the sentence, $F_4$; and the number of effective words in the sentence that are emotion words, $F_5$. The semantic weight value is calculated as follows:
[Formula image: Figure GDA0003593883760000081]
where $P_{con}(S)$ is the semantic weight of sentence $S$, and $F_i$ and $\mu_i$ respectively denote the $i$-th semantic feature value of the sentence and the weighting coefficient of that feature. The semantic weight value is obtained from the feature values and their weighting coefficients.
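The formula itself survives only as an image reference in this text. One reading consistent with the definitions of $F_i$ and $\mu_i$ is a weighted sum over the six features; the sketch below assumes that linear form, which is not confirmed by the source, and the function name `semantic_weight` is hypothetical.

```python
def semantic_weight(features, coefficients):
    """Assumed linear combination: P_con(S) = sum_i mu_i * F_i.

    features:     [F1, ..., F6] for one sentence
    coefficients: [mu1, ..., mu6], one weighting coefficient per feature
    """
    if len(features) != len(coefficients):
        raise ValueError("each feature needs exactly one weighting coefficient")
    return sum(mu * f for mu, f in zip(coefficients, features))
```

For example, `semantic_weight([1, 2, 3, 4, 5, 6], [0.5] * 6)` gives 10.5 under this assumed form.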
Here, $F_1$ and $F_2$ are statistical features of the effective words of a sentence. Nouns (noun) and verbs (verb) are generally considered more important in a sentence than other parts of speech, so they are given a weight of 2 and other parts of speech a weight of 1. Predicates, adverbs and the like are the core content of a sentence; if these components appear in the topic word table, the sentence is more closely related to the topic and better expresses the central meaning of the topic. The topic words contained in the general case are selected as a feature to supplement the predicate feature.
In this embodiment, the association weight calculation comprises the following steps:
step S21: dividing words by using a sentence meaning structure to generate word vector representation so as to obtain a representation vector of a sentence;
step S22: calculating cosine similarity of expression vectors of the sentences to obtain similarity of the two sentences;
step S23: and constructing a sentence graph model by taking sentences of the document set as nodes, relations among the sentences as edges and similarity among the sentences as weight values, and obtaining the associated weight values of the sentences through semantic overlap ratios of other sentences to the sentences.
Preferably, an n-dimensional space vector $V(S_k) = \{\omega_{k,1}, \omega_{k,2}, \ldots, \omega_{k,n}\}$ of each sentence is constructed from the obtained sentence similarities, and a graph model is built in which each node $S$ corresponds to a sentence. The degree $d$ of node $S$ is the number of edges connected to $S$ and reflects the importance of the information contained in $S$: the larger $d$ is, the more sentences are related to the sentence and the more important the information it contains.
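Steps S22 and S23 can be sketched as follows, taking sentence vectors as given. The rule for when two sentences share an edge is not specified in the text, so the `min_similarity` cutoff is an assumption, as are the function names.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_sentence_graph(vectors, min_similarity=0.0):
    """Undirected graph as {node: {neighbour: similarity}}.

    An edge is added only when the similarity exceeds min_similarity
    (an assumed cutoff; the source does not state one).
    """
    graph = {i: {} for i in range(len(vectors))}
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            similarity = cosine_similarity(vectors[i], vectors[j])
            if similarity > min_similarity:
                graph[i][j] = similarity
                graph[j][i] = similarity
    return graph


def degree(graph, node):
    """Number of edges connected to the node."""
    return len(graph[node])
```

With the vectors `[1, 0]`, `[1, 1]` and `[0, 1]`, the middle sentence is connected to both others and has degree 2, while the two orthogonal sentences share no edge.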
In the present embodiment, the association weight value $R(S_k, S_j)$ of the sentence in step S23 is calculated as follows:
$R(S_k, S_j) = P_{con}(S_j) \cdot s(S_k, S_j)$;
where $s(S_k, S_j)$ denotes the similarity of sentence $S_j$ to sentence $S_k$, and $P_{con}(S_j)$ denotes the semantic weight value of sentence $S_j$.
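The pairwise formula translates directly into code. How the pairwise values are combined into a single association weight per sentence is not stated in the text; the sketch below assumes a sum over all other sentences, and both function names are hypothetical.

```python
def pairwise_association(p_con_j, similarity_kj):
    """R(S_k, S_j) = P_con(S_j) * s(S_k, S_j), as given in the text."""
    return p_con_j * similarity_kj


def association_weights(p_con, similarity):
    """Assumed aggregation: for each sentence k, sum R(S_k, S_j) over all j != k.

    p_con:      list of semantic weight values, one per sentence
    similarity: square matrix where similarity[k][j] is s(S_k, S_j)
    """
    count = len(p_con)
    return [
        sum(
            pairwise_association(p_con[j], similarity[k][j])
            for j in range(count)
            if j != k
        )
        for k in range(count)
    ]
```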
In this embodiment, extracting the viewpoint abstract of the topic by using the topic attributes, the semantic weight value and the association weight value specifically includes: obtaining the average sentence weight for each topic by weighting the semantic weight value and the association weight value, and finally selecting the 20 sentences with the highest scores as the viewpoint abstract.
Specifically, sentences are sorted from high to low by importance to determine the extraction order, and are extracted according to their importance and redundancy. The average weight of a sentence is obtained from its semantic weight value and association weight value, and the topic attributes are used to ensure relevance. The importance of a sentence depends on two factors: 1) the number of topic attributes contained in the sentence: the more it contains, the more important the sentence;
2) the average weight of the sentence: the larger it is, the more important the sentence. A scoring function is constructed from these two factors with a weight θ balancing them; the two are generally considered equally important, so θ = 0.5. The 20 sentences with the highest importance scores are taken as the viewpoint abstract, forming the viewpoint abstract sentence set S.
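The exact scoring function is not given in the text. The sketch below assumes a convex combination of the two factors with weight θ, and assumes the average weight is the mean of the semantic and association weight values. It omits the redundancy handling mentioned above, and all names are hypothetical.

```python
def average_weight(semantic, association):
    """Assumed average: mean of the semantic and association weight values."""
    return (semantic + association) / 2.0


def importance(tokens, semantic, association, topic_attributes, theta=0.5):
    """Assumed score: theta * (distinct topic attributes) + (1 - theta) * average weight."""
    attribute_count = sum(1 for token in set(tokens) if token in topic_attributes)
    return theta * attribute_count + (1 - theta) * average_weight(semantic, association)


def select_summary(sentences, topic_attributes, top_n=20, theta=0.5):
    """sentences: list of (text, tokens, semantic_weight, association_weight).

    Returns the texts of the top_n sentences, highest importance first.
    """
    ranked = sorted(
        sentences,
        key=lambda s: importance(s[1], s[2], s[3], topic_attributes, theta),
        reverse=True,
    )
    return [text for text, _, _, _ in ranked[:top_n]]
```

The attribute count and the average weight are on different scales here; in practice both would likely need normalizing before being combined, which this sketch leaves out.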
The present embodiment also provides a system for generating a viewpoint abstract from text data based on a sentence meaning structure model, comprising a memory, a processor and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the method steps described above.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing describes preferred embodiments of the present invention; other and further embodiments may be devised without departing from its basic scope, which is determined by the claims that follow. Any simple modification, equivalent change or adaptation of the above embodiments made according to the technical essence of the present invention falls within the protection scope of the technical solution of the present invention.

Claims (5)

1. A method for generating a viewpoint abstract based on text data of a sentence meaning structure model is characterized by comprising the following steps:
extracting a data set to be processed on a website, and preprocessing the data set;
constructing a topic corpus and a background corpus, and extracting topic attributes;
semantic weight calculation is carried out to obtain semantic weight values of sentences;
performing association weight calculation to obtain an association weight value of the sentence;
extracting viewpoint abstract from the topic by using the topic attribute, the semantic weight value and the associated weight value;
the semantic weight calculation comprises the following steps:
step S11: calculating the emotion score of each sentence as the emotion characteristics by using an emotion analysis method based on an emotion dictionary;
step S12: extracting lexical features by using a semantic word extraction method based on a semantic dictionary;
step S13: analyzing sentences by using BFS-CSA to obtain sentence meaning structural characteristics;
step S14: calculating semantic weight of the sentence;
step S14 specifically includes: calculating the semantic weight of the sentence from the sentence meaning structure feature $F_6$ and the lexical features, wherein the lexical features are of 5 types: the average TF-IDF and part-of-speech (POS) word weight of the effective words of the sentence, $F_1$; the coverage rate of topic words in the sentence, $F_2$; the number of topic words contained in the predicate of the sentence, $F_3$; the number of topic words contained in the general case of the sentence, $F_4$; and the number of effective words in the sentence that are emotion words, $F_5$; the semantic weight value is calculated as follows:
[Formula image: Figure FDA0003593883750000021]
where $P_{con}(S)$ is the semantic weight of sentence $S$, and $F_i$ and $\mu_i$ respectively denote the $i$-th semantic feature value of the sentence and the weighting coefficient of that feature;
the association weight calculation comprises the steps of:
step S21: dividing words by using a sentence meaning structure to generate word vector representation so as to obtain a representation vector of a sentence;
step S22: calculating cosine similarity of expression vectors of the sentences to obtain similarity of the two sentences;
step S23: the method comprises the steps of taking sentences in a document set as nodes, taking the relation among the sentences as an edge, taking the similarity among the sentences as a weight value to construct a sentence graph model, and obtaining the associated weight value of the sentence through the semantic overlap ratio of other sentences to the sentence;
the association weight value $R(S_k, S_j)$ of the sentence in step S23 is calculated as follows:
$R(S_k, S_j) = P_{con}(S_j) \cdot s(S_k, S_j)$;
where $s(S_k, S_j)$ denotes the similarity of sentence $S_j$ to sentence $S_k$, and $P_{con}(S_j)$ denotes the semantic weight value of sentence $S_j$;
the method for extracting the viewpoint abstract in the topic by utilizing the topic attribute, the semantic weight value and the associated weight value specifically comprises the following steps: the average sentence weight of each topic is obtained by weighting the semantic weight value and the association weight value, and finally 20 sentences with the highest score are selected as the viewpoint abstract.
2. The method for generating the opinion summary based on the text data of the sentence meaning structure model as claimed in claim 1, wherein the data sets to be processed on the website include microblog data, website news data and commodity comment data.
3. The method for generating a viewpoint summary of text data based on sentence meaning structure model as claimed in claim 1, wherein the preprocessing is specifically:
removing the webpage links in the comment sentences;
removing the comment sentences with the character length smaller than 3;
removing common irrelevant words in the comment sentences;
all english is uniformly expressed as lower case english.
4. The method for generating a viewpoint summary of text data based on a sentence meaning structure model as claimed in claim 1, wherein the constructing topic corpus set and the background corpus set and extracting topic attributes specifically are: setting the current topic text as a topic corpus and other topic texts as a background corpus aiming at the preprocessed text, calculating the log likelihood ratio of words in the topic corpus by means of a log likelihood ratio method, filtering the words by using a preset threshold, wherein the part of speech of the words is required to be nouns, adjectives, verbs and digital words, and extracting the topic attributes of the topic corpus.
5. A system for generating a point of view summary of text data based on a sentence meaning model, comprising a memory, a processor and a computer program stored on the memory and executable by the processor, the computer program when executed by the processor implementing the method steps of any one of claims 1 to 4.
CN201911205403.9A 2019-11-29 2019-11-29 Text data viewpoint abstract generating method and system based on sentence meaning structure model Active CN110889292B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911205403.9A CN110889292B (en) 2019-11-29 2019-11-29 Text data viewpoint abstract generating method and system based on sentence meaning structure model

Publications (2)

Publication Number Publication Date
CN110889292A (en) 2020-03-17
CN110889292B (en) 2022-06-03

Family

ID=69749579

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN201911205403.9A Active CN110889292B (en) 2019-11-29 2019-11-29 Text data viewpoint abstract generating method and system based on sentence meaning structure model

Country Status (1)

Country Link
CN (1) CN110889292B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111311385B (en) * 2020-05-15 2020-08-04 成都晓多科技有限公司 Commodity recommendation grammar generation method and system based on commodity selling points
CN111797226B (en) * 2020-06-30 2024-04-05 北京百度网讯科技有限公司 Conference summary generation method and device, electronic equipment and readable storage medium
CN113536761B (en) * 2021-07-09 2024-01-30 南京航空航天大学 Method for calculating sentence similarity based on frame importance

Citations (5)

Publication number Priority date Publication date Assignee Title
CN103699525A (en) * 2014-01-03 2014-04-02 江苏金智教育信息技术有限公司 Method and device for automatically generating abstract on basis of multi-dimensional characteristics of text
CN106503064A (en) * 2016-09-29 2017-03-15 中国国防科技信息中心 Adaptive microblog topic summary generation method
CN108268668A (en) * 2018-02-28 2018-07-10 福州大学 Topic diversity-based text data viewpoint abstract mining method
CN108287922A (en) * 2018-02-28 2018-07-17 福州大学 Text data viewpoint abstract mining method fusing topic attributes and emotional information
CN109145105A (en) * 2018-07-26 2019-01-04 福州大学 Text summarization model generation algorithm fusing information selection and semantic association

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US9189470B2 (en) * 2012-05-31 2015-11-17 Hewlett-Packard Development Company, L.P. Generation of explanatory summaries

Non-Patent Citations (1)

Title
Research on key technologies of microblog search; 段亚娟 (Duan Yajuan); Wanfang Data dissertation database (《万方数据学位论库》); 2014-10-28; pp. 1-113 *

Also Published As

Publication number Publication date
CN110889292A (en) 2020-03-17

Similar Documents

Publication Publication Date Title
CN108287922B (en) Text data viewpoint abstract mining method fusing topic attributes and emotional information
CN106844658B (en) Automatic construction method and system of Chinese text knowledge graph
Litvak et al. DegExt—A language-independent graph-based keyphrase extractor
US10496756B2 (en) Sentence creation system
CN103136352B (en) Text retrieval system based on double-deck semantic analysis
JP6676109B2 (en) Utterance sentence generation apparatus, method and program
CN108268668B (en) Topic diversity-based text data viewpoint abstract mining method
CN110889292B (en) Text data viewpoint abstract generating method and system based on sentence meaning structure model
Mizumoto et al. Sentiment analysis of stock market news with semi-supervised learning
CN115048944B (en) Open domain dialogue reply method and system based on theme enhancement
CN114065758A (en) Document keyword extraction method based on hypergraph random walk
WO2022183923A1 (en) Phrase generation method and apparatus, and computer readable storage medium
Tran et al. A hybrid approach for building a Vietnamese sentiment dictionary
Gupta et al. Keyword extraction: a review
Singh et al. Words are not equal: Graded weighting model for building composite document vectors
CN114722176A (en) Intelligent question answering method, device, medium and electronic equipment
Malandrakis et al. Sail: Sentiment analysis using semantic similarity and contrast features
CN113111653B (en) Text feature construction method based on Word2Vec and syntactic dependency tree
Sahmoudi et al. Towards a linguistic patterns for arabic keyphrases extraction
Heidary et al. Automatic Persian text summarization using linguistic features from text structure analysis
Kian et al. An efficient approach for keyword selection; improving accessibility of web contents by general search engines
Mahdipour et al. Automatic Persian text summarizer using simulated annealing and genetic algorithm
Zulkhazhav et al. Kazakh text summarization using fuzzy logic
Wenchao et al. A modified approach to keyword extraction based on word-similarity
Tohalino et al. Using virtual edges to extract keywords from texts modeled as complex networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant