CN113536761A - Method for calculating sentence similarity based on frame importance - Google Patents


Publication number
CN113536761A
Authority
CN
China
Prior art keywords
frame
importance
frames
sentence
information set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110776700.XA
Other languages
Chinese (zh)
Other versions
CN113536761B (en)
Inventor
王铁鑫
史荟
刘文静
严欣华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN202110776700.XA
Publication of CN113536761A
Application granted
Publication of CN113536761B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/194Calculation of difference between files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method for calculating sentence similarity based on frame importance, comprising the following steps. Step 1: all frames in an English sentence S form a frame semantic information set E. Step 2: the core frame elements of each frame in the set E are extracted. Step 3: the importance of each frame is calculated from the number of core frame elements in each frame of the set E. Step 4: all frames in an English sentence S' form a frame semantic information set E', and the importance of each frame in the set E' is calculated. Step 5: each frame that appears in both E and E' forms a frame group; the smaller of the two frame importances in each group is taken as the importance of that group; the importances of all frame groups are accumulated, and the similarity of the English sentences S and S' is calculated from the accumulated value. The method can be applied to natural language processing tasks such as textual entailment recognition and text summarization.

Description

Method for calculating sentence similarity based on frame importance
Technical Field
The invention belongs to the technical field of natural language processing.
Background
FrameNet, the frame semantic library, is a semantic knowledge base built on Frame Semantics and used in research fields such as linguistics, computational linguistics, and natural language processing. Frame semantics makes it possible to uncover the conceptual structures and semantic scenes that lie behind words.
A frame in FrameNet is the semantic structure of a sentence expressing a specific scene; it consists of lexical units (LUs) and the frame elements (FEs) associated with them. The participants, external conditions, and so on involved in a frame are called frame elements. Frame elements are divided by importance into core frame elements (Core FEs) and non-core frame elements (Peripheral, Extra-Thematic). Core frame elements are the components that are conceptually necessary to a frame; their number and type differ from frame to frame and give each frame its distinguishing character. Non-core frame elements express general semantic components such as time and place.
When a sentence contains multiple frames, the frames are not necessarily equally important. To measure the similarity between sentences accurately, the importance of the frames must be considered along with the frames themselves. Measuring the importance of a frame in a sentence is not easy, however, because the result varies with the importance criterion used; choosing that criterion is therefore the key to measuring frame importance. Existing similarity calculation methods based on word-level features do not consider the structural information of sentences, and methods based on sentence structure features fail to take sentence semantics fully into account. Conventional sentence similarity calculation methods mainly address sentence keywords and sentence structure; because they capture sentence semantics incompletely and are poorly interpretable, their similarity results are not accurate enough.
Disclosure of Invention
The purpose of the invention is as follows: the invention provides a method for calculating sentence similarity based on frame importance, which aims to solve the problems in the prior art.
The technical scheme is as follows: the invention provides a method for calculating sentence similarity based on frame importance, which comprises the following steps:
step 1: extracting all frames in the English sentence S, and forming a frame semantic information set E from all the frames;
step 2: constructing GIFN, a visualization tool for the frame semantic library FrameNet, and extracting the core frame elements of each frame in the frame semantic information set E through GIFN;
step 3: calculating a frame influence factor for each frame based on the number of core frame elements in the frame; establishing a frame importance function from the frame influence factors to obtain the importance w(f_{E,i}) of the i-th frame in the frame semantic information set E, where f_{E,i} denotes the i-th frame in E, i = 1, 2, ..., frame_S, and frame_S is the total number of frames in E;
step 4: forming a frame semantic information set E' from all frames in the English sentence S' according to steps 1 to 3, and calculating the importance of each frame in E';
step 5: taking each frame that appears in both E and E' as a frame group, obtaining frame_same frame groups; comparing the importances of the two frames in the j-th frame group and selecting the smaller one as the frame importance min_j of the j-th frame group, j = 1, 2, ..., frame_same; accumulating the frame importances of the frame_same frame groups, and calculating the similarity of the English sentences S and S' based on the accumulated value.
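The accumulation in step 5 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name and the importance values are hypothetical, each frame is assumed to occur at most once per sentence, and the final similarity formula (given only as an image) is not reproduced.

```python
from typing import Dict


def accumulate_shared_importance(importance_s: Dict[str, float],
                                 importance_s_prime: Dict[str, float]) -> float:
    """Sum, over the frames present in both sentences, the smaller importance.

    Each shared frame is one "frame group"; min(...) plays the role of min_j.
    """
    shared_frames = importance_s.keys() & importance_s_prime.keys()
    return sum(min(importance_s[f], importance_s_prime[f]) for f in shared_frames)


# Hypothetical importances for the frames of two sentences S and S'.
importance_S = {"Statement": 0.5, "Possession": 0.3, "Compliance": 0.2}
importance_S_prime = {"Statement": 0.4, "Possession": 0.6}

accumulated = accumulate_shared_importance(importance_S, importance_S_prime)
print(round(accumulated, 6))
```

Taking the minimum means a shared frame contributes only as much as it matters in the sentence where it matters least, so a frame that is central in one sentence but marginal in the other adds little to the accumulated value.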
Further, in step 1, the English sentence S is input into the open-source frame-semantic extraction tool SEMAFOR, which analyzes the input sentence according to the structure of the frame semantic library FrameNet and thereby extracts the frames in the English sentence S.
Further, the specific method for constructing the FrameNet visualization tool GIFN in step 2 is as follows: all frames in FrameNet are taken as nodes, the semantic relations among frames and the semantic relations between lexical units and frames are taken as edges, and the nodes and edges are stored in the graph database Neo4j.
Further, the similarity calculation formula corresponding to the english sentence S and the sentence S' is as follows:
Figure BDA0003155670990000021
wherein Similarity_score is the similarity between the English sentences S and S'; frame_S' is the total number of frames in the frame semantic information set E'; Maximum(.) takes the maximum value; and the expression of Path_score is as follows:
Figure BDA0003155670990000022
wherein frame_rel is the number of shortest path frame pairs, which are obtained as follows: the frames that also appear in the frame semantic information set E' are removed from the frame semantic information set E to obtain a set E1; the frames that also appear in E are removed from E' to obtain a set E'1; the number of edges required for each frame in E1 to reach any frame in E'1 is obtained through the visualization tool GIFN; the two frames requiring the minimum number of edges are taken as a shortest path frame pair. The expression of Path_value_{i'} is as follows:
Figure BDA0003155670990000031
wherein CountPath is the number of edges required for one frame of the i'-th shortest path frame pair to reach the other; Weight_t is the weight of the t-th edge.
Further, the frame influence factor in step 3 is:
Figure BDA0003155670990000032
wherein c_i is the total number of core frame elements in f_{E,i}; n_i is the total number of frame elements in f_{E,i}; and β_i is the frame influence factor of f_{E,i}.
Further, the frame importance function in step 3 is:
Figure BDA0003155670990000033
wherein the term shown in Figure BDA0003155670990000034, that is e^{β_i}, is the exponential score of β_i.
Advantageous effects: the invention considers the importance of each frame as well as the frame itself, and can therefore measure the similarity between sentences more accurately. The method can be applied to natural language processing tasks such as textual entailment recognition and text summarization.
Drawings
FIG. 1 is a schematic flow chart of a method for calculating the importance of a frame according to the present invention;
FIG. 2 is a flow chart of extracting core frame elements according to a frame semantic library FrameNet;
FIG. 3 is a flow chart of computing the frame importance function;
FIG. 4 is a diagram of the semantic relationships between partial frames in the GIFN.
Detailed Description
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an embodiment of the invention and, together with the description, serve to explain the invention and not to limit the invention.
In the method for calculating sentence similarity based on frame importance, the core frame elements of each frame are extracted according to the frame semantic library FrameNet, and the importance of the frames is distinguished by the number of core frame elements each frame contains. The method can readily be applied to natural language processing tasks such as textual entailment recognition and text summarization.
The following detailed description of the embodiments of the present invention will be provided with reference to the drawings and examples, so as to fully understand how to implement the technical solution of the present invention and achieve the technical effects. It should be noted that, as long as there is no conflict, the embodiments and the features of the embodiments of the present invention may be combined with each other, and the technical solutions formed are within the scope of the present invention.
The FrameNet described in this embodiment is a semantic knowledge base based on Frame Semantics constructed at the University of California, Berkeley, USA, and is used in research fields such as linguistics, computational linguistics, and natural language processing. Frame semantics makes it possible to uncover the conceptual structures and semantic scenes that lie behind words. In FrameNet, a frame is the semantic structure of a sentence expressing a particular scene, made up of a lexical unit and its associated frame elements. The participants, external conditions, and so on involved in the frame are called frame elements; in a real corpus they correspond to the words that describe the event or the manner of the event in context. Frame elements are divided by importance into core frame elements and non-core frame elements. Core frame elements are the components that are conceptually necessary to a frame; their number and type differ from frame to frame and give each frame its distinguishing character. Non-core frame elements express general semantic components such as time and place.
SEMAFOR is an open-source frame-semantic parser. It automatically analyzes English sentences according to the FrameNet structure and obtains the frames evoked by the sentence content, their frame elements, the specific content each frame element refers to, and so on. In the key steps of this implementation, frame semantic information is acquired with the open-source SEMAFOR tool according to the semantic knowledge base FrameNet.
Neo4j is a high-performance NoSQL graph database that stores structured data as a graph (network) rather than in tables. It is an embedded, disk-based Java persistence engine with full transactional support. Neo4j is designed to scale to very large graphs, handling billions of nodes, relationships, and properties on one machine, and can be extended to multiple machines running in parallel. Graph databases are well suited to large volumes of complex, interconnected, loosely structured data that changes rapidly and is queried frequently; in a relational database such queries require many table joins and therefore cause performance problems. Neo4j targets the performance degradation that a traditional RDBMS suffers on join-heavy queries. Because the data are modeled as a graph, the cost of traversing nodes and edges depends on the part of the graph visited rather than on the total amount of data in the graph.
As shown in fig. 1, the present embodiment provides a method for calculating sentence similarity based on frame importance, which includes:
Step one: identify all frame semantic information in the English sentence S. The sentence S is analyzed with the open-source frame semantic analysis tool SEMAFOR according to the FrameNet structure to obtain the frames evoked by the content of S; each frame comprises a lexical unit and the frame elements connected with it, and all the frames form the frame semantic information set E. The input to SEMAFOR is the English sentence S, and the output is the result analyzed by the SEMAFOR tool.
Using the development tool Eclipse, FrameNet is mapped into Neo4j to build the FrameNet visualization tool GIFN ("Graphic Interpretation for FrameNet"): all frames, frame elements, and lexical units in the frame semantic library FrameNet are taken as nodes (a frame is taken as a node because it is the sentence semantic structure, formed by lexical units and the frame elements connected with them, that expresses a specific scene); the relations between frames and the relations between lexical units and frames are taken as edges; and the nodes and edges are stored in the graph database Neo4j.
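The node-and-edge layout described for GIFN can be illustrated with a small in-memory graph. This is only a sketch under stated assumptions: the patent stores the graph in Neo4j, whereas this example uses plain Python containers so that it runs on its own; the class name `FrameGraph`, the node kinds, and the `evokes` edge label are hypothetical, and the sample nodes are illustrative rather than an excerpt of GIFN.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class FrameGraph:
    """Toy stand-in for a graph database holding frames and lexical units."""

    def __init__(self) -> None:
        # node name -> kind ("frame", "frame_element", "lexical_unit")
        self.node_kind: Dict[str, str] = {}
        # source node -> list of (relation label, target node)
        self.edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_node(self, name: str, kind: str) -> None:
        self.node_kind[name] = kind

    def add_edge(self, source: str, relation: str, target: str) -> None:
        # Both endpoints must already exist as nodes.
        if source not in self.node_kind or target not in self.node_kind:
            raise KeyError("add both nodes before connecting them")
        self.edges[source].append((relation, target))


graph = FrameGraph()
graph.add_node("Statement", "frame")
graph.add_node("Communication", "frame")
graph.add_node("say.v", "lexical_unit")

# Frame-to-frame relations are stored in both directions, as in Table 4.
graph.add_edge("Statement", "Inherits from", "Communication")
graph.add_edge("Communication", "Is Inherited by", "Statement")
# Lexical-unit-to-frame relation (the label "evokes" is an assumption).
graph.add_edge("say.v", "evokes", "Statement")

print(graph.edges["Statement"])
```

Keeping the relation label on each edge matters later, because the path weights used in the similarity calculation depend on the type of relation traversed.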
Step two: extract the frames of each line from the result analyzed by the frame semantic analysis tool SEMAFOR. A FrameExtraction class is defined for this purpose, with a searchFrame method that extracts the frames of each line; calling searchFrame outputs all the frames contained in each line.
A FrameFEExtraction class is defined to extract the frame elements contained in each frame in the SEMAFOR result, with a searchFE method that extracts the frame elements of each frame; calling searchFE outputs all the frame elements contained under each frame.
Part of the key code of step two is as follows:
Figure BDA0003155670990000051
Figure BDA0003155670990000061
Some of the results obtained in step two are collated in Table 1:
TABLE 1
Frame              FEs (frame elements)
Statement          {Message, Speaker}
Sign_agreement     {Agreement, Signatory}
Ordinal_numbers    {Type}
Possession         {Possession}
Compliance         {Protagonist}
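The extraction of frames and frame elements from parser output can be sketched as follows. The patent's own code is shown only as figures, so this Python example is not that code. It assumes that each output line is a JSON object with a `frames` list whose entries carry `target.name` and `annotationSets[].frameElements[].name`; that layout, the function name, and the sample line are assumptions made for illustration.

```python
import json
from typing import Dict, Set


def frames_and_elements(parser_line: str) -> Dict[str, Set[str]]:
    """Map each frame name in one parsed sentence to its frame element names."""
    parsed = json.loads(parser_line)
    result: Dict[str, Set[str]] = {}
    for frame in parsed.get("frames", []):
        frame_name = frame["target"]["name"]
        elements = result.setdefault(frame_name, set())
        for annotation in frame.get("annotationSets", []):
            for element in annotation.get("frameElements", []):
                elements.add(element["name"])
    return result


# Hypothetical parser output for one sentence (assumed JSON layout).
sample_line = json.dumps({
    "frames": [
        {"target": {"name": "Statement"},
         "annotationSets": [{"frameElements": [{"name": "Speaker"},
                                               {"name": "Message"}]}]},
        {"target": {"name": "Possession"},
         "annotationSets": [{"frameElements": [{"name": "Possession"}]}]},
    ]
})

extracted = frames_and_elements(sample_line)
for name in sorted(extracted):
    print(name, sorted(extracted[name]))
```

The result has the same shape as the rows of Table 1: one frame per entry, each with the set of frame elements found under it.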
Step three: the frame semantic library FrameNet is displayed as a graph topology using the FrameNet visualization tool GIFN. In GIFN, frames and LUs (lexical units) include annoset, and FEs (frame elements) include FEcoreSet (the core frame element set), which represents the core frame elements of the frames in the set E; the specific flow is shown in FIG. 2. A FEcoreExtraction class is defined to extract the core frame elements contained in each frame in the result analyzed by SEMAFOR; within the FEcoreExtraction class, the core frame elements of a frame are found through FEcoreSet, and the output is the core frame elements contained in each frame.
Some of the results obtained in step three are summarized in Table 2:
TABLE 2
Frame              CoreFEs (core frame elements)
Statement          {Message, Speaker}
Sign_agreement     {Agreement, Signatory}
Ordinal_numbers    {Type}
Possession         {Possession}
Compliance         {Protagonist}
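Selecting the core frame elements of a frame amounts to filtering its observed frame elements by core type. The sketch below shows that filtering step only; the lookup table `CORE_TYPE` is a small hand-written, hypothetical excerpt standing in for the FEcoreSet data held in GIFN, and the function name is likewise an assumption.

```python
from typing import Dict, Set

# Hypothetical excerpt of core-type labels, standing in for FEcoreSet in GIFN.
CORE_TYPE: Dict[str, Dict[str, str]] = {
    "Statement": {"Message": "Core", "Speaker": "Core", "Time": "Peripheral"},
    "Possession": {"Possession": "Core", "Owner": "Core"},
}


def core_elements(frame: str, observed_elements: Set[str]) -> Set[str]:
    """Keep only the observed frame elements labelled as core for this frame."""
    labels = CORE_TYPE.get(frame, {})
    return {fe for fe in observed_elements if labels.get(fe) == "Core"}


observed = {"Message", "Speaker", "Time"}
print(sorted(core_elements("Statement", observed)))
```

Frames or frame elements missing from the lookup table yield an empty result here; a real implementation would need to decide how to treat such cases.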
Step four: calculate the frame influence factor of each frame based on the number of core frame elements in the frame. For each frame, the proportion (termed a probability) of the core frame elements it covers among the frame elements covered by the frames of the whole sentence S is calculated and defined as the frame influence factor in the frame semantic information set E. The calculation formula is as follows:
Figure BDA0003155670990000071
wherein c_i is the total number of core frame elements in f_{E,i}; n_i is the total number of frame elements in f_{E,i}; f_{E,i} denotes the i-th frame in the set E, i = 1, 2, ..., frame_S; and frame_S is the total number of frames covered in the sentence S. The more core frame elements a frame covers, the higher its importance and the greater its influence factor.
When two frames cover the same number of core frame elements, their semantic importance is considered to be the same, and their influence factors are equal.
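The influence factor calculation can be sketched under one explicit assumption. The formula itself is available only as an image, so the sketch takes β_i to be c_i divided by the total number of frame elements over all frames of the sentence (the sum of the n_k). That reading is chosen because it matches the description of a proportion over the whole sentence and the statement that equal core counts give equal factors; a per-frame ratio c_i / n_i would be a different reading. The function name is hypothetical.

```python
from typing import List


def influence_factors(core_counts: List[int], element_counts: List[int]) -> List[float]:
    """Assumed reading: beta_i = c_i / sum_k n_k over all frames of the sentence."""
    if len(core_counts) != len(element_counts):
        raise ValueError("one core count and one element count per frame")
    total_elements = sum(element_counts)
    if total_elements == 0:
        raise ValueError("the sentence has no frame elements")
    return [c / total_elements for c in core_counts]


# Counts taken from the five frames listed in Tables 1 and 2:
# Statement, Sign_agreement, Ordinal_numbers, Possession, Compliance.
core_counts = [2, 2, 1, 1, 1]
element_counts = [2, 2, 1, 1, 1]

betas = influence_factors(core_counts, element_counts)
print([round(b, 4) for b in betas])
```

With these counts, Statement and Sign_agreement receive the same factor, and each is larger than the factor of the three frames that cover a single core frame element.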
In this embodiment, fes classification is defined to calculate the total number of frame elements contained in the English sentence S: FrameNum calculates the number of frames covered by each sentence, fes num calculates the number of frame elements covered by each frame, and fes null accumulates these counts, so that the output is the total number of frame elements contained in the whole sentence. A CoreFEsCalculation class is defined to calculate, for each frame, the proportion of the core frame elements it covers among the frame elements covered by the frames of the whole sentence S; its CoreFEsNum method calculates the number of frame elements covered by each frame, and its CoreFEsPeer method calculates the proportion using formula (1).
The partial calculation results obtained in step four are shown in table 3:
TABLE 3
Figure BDA0003155670990000072
Step five: construct the frame influence factor matrix, which is as follows:
M = (β_i)_{frame_S × 1}
Step six: measure the importance of the frames according to the influence factors of the frames in the set E, define an importance function for each frame in the sentence S, and calculate the importance of each frame in the frame information set E. A corresponding weight is assigned according to the number of core frame elements covered by the frame, and the importance of each frame of the sentence S to the sentence is calculated, specifically:
The importance of each frame in the frame information set E is initialized. The initialization formula for the importance of each frame in the frame information set E is:
Figure BDA0003155670990000081
The importance of the frames in the English sentence S is then normalized. The normalized importance of each frame of the English sentence S to the sentence is calculated as follows:
Figure BDA0003155670990000082
wherein the term shown in Figure BDA0003155670990000083, that is e^{β_i}, is the exponential score of β_i, an element of the frame influence factor matrix; 0 < w(f_{E,i}) ≤ 1,
Figure BDA0003155670990000084
According to one embodiment of the invention, a FrameWeight class is defined to calculate frame importance: a CoreFEsPerall method accumulates the frame influence factors, and a FrameWeight method calculates the frame importance using formula (2); the output is the importance, with respect to the sentence, of each frame of that sentence. A flow chart for defining the frame importance function is shown in FIG. 3.
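The properties stated for the importance function (an exponential score of each β_i, followed by normalization to values in (0, 1]) are consistent with a softmax over the influence factors. The sketch below assumes that reading; the formula images are not reproduced in this text, so the exact form of formula (2) remains an assumption, and the function name and input values are hypothetical.

```python
import math
from typing import List


def frame_importance(betas: List[float]) -> List[float]:
    """Assumed softmax reading: w_i = exp(beta_i) / sum_k exp(beta_k)."""
    if not betas:
        return []
    exp_scores = [math.exp(b) for b in betas]  # initialization: exponential score
    total = sum(exp_scores)                    # accumulated exponential scores
    return [score / total for score in exp_scores]  # normalization


# Hypothetical influence factors for five frames of one sentence.
example_betas = [2 / 7, 2 / 7, 1 / 7, 1 / 7, 1 / 7]
weights = frame_importance(example_betas)
print([round(w, 4) for w in weights])
```

Under this reading the importances of the frames of one sentence sum to 1, and frames with equal influence factors receive equal importance.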
Step seven: form a frame semantic information set E' from all frames in the English sentence S' according to steps one to six, and calculate the importance of each frame in the frame semantic information set E'.
Step eight: take each frame that appears in both the frame semantic information set E and the frame semantic information set E' as a frame group, obtaining frame_same frame groups; compare the importances of the two frames in the j-th frame group and select the smaller one as the frame importance min_j of the j-th frame group, j = 1, 2, ..., frame_same; accumulate the frame importances of the frame_same frame groups, and calculate the similarity of the English sentences S and S' based on the following formula:
Figure BDA0003155670990000085
wherein Similarity_score is the similarity between the English sentences S and S'; frame_S' is the total number of frames in the frame semantic information set E'; Maximum(.) takes the maximum value; and the calculation formula of Path_score is as follows:
Figure BDA0003155670990000086
frame_rel is the number of shortest path frame pairs, which are obtained as follows: the frames that also appear in the frame semantic information set E' are removed from the frame semantic information set E to obtain a set E1; the frames that also appear in E are removed from E' to obtain a set E'1; the number of edges required for each frame in E1 to reach any frame in E'1 is obtained through the visualization tool GIFN (the semantic relations among some of the frames are shown in FIG. 4); the two frames requiring the minimum number of edges are taken as a shortest path frame pair. The expression of Path_value_{i'} is as follows:
Figure BDA0003155670990000091
wherein CountPath is the number of edges required for one frame of the i'-th shortest path frame pair to reach the other; Weight_t is the weight of the t-th edge. The weight of each path in FIG. 4 is shown in Table 4:
TABLE 4
Inter-frame semantic relation (each relation is an edge of a path in GIFN)    Path weight
Inherits from           0.55
Is Inherited by         0.55
Perspective on          0.45
Is Perspectivized in    0.45
Uses                    0.3
Is Used by              0.3
Subframe of             0.35
Has Subframe(s)         0.35
Precedes                0.2
Is Preceded by          0.2
Is Inchoative of        0.3
Is Causative of         0.3
See also                0.4
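Finding shortest path frame pairs can be sketched with a breadth-first search over the frame graph. This is an illustration under stated assumptions, not the patented implementation: the graph is a plain dictionary rather than Neo4j, the frames `A` to `D` and `X` are hypothetical, relation labels use FrameNet's relation names with the Table 4 weights, and each frame in E1 is paired with its nearest frame in E'1 (one possible reading of the pairing rule). Because the Path_value formula is given only as an image, the sketch stops at CountPath and the per-edge weights.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

Edges = Dict[str, List[Tuple[str, str]]]  # frame -> [(relation, neighbouring frame)]

# Edge weights from Table 4, keyed by FrameNet relation name.
PATH_WEIGHT = {
    "Inherits from": 0.55, "Is Inherited by": 0.55,
    "Perspective on": 0.45, "Is Perspectivized in": 0.45,
    "Uses": 0.3, "Is Used by": 0.3,
    "Subframe of": 0.35, "Has Subframe(s)": 0.35,
    "Precedes": 0.2, "Is Preceded by": 0.2,
    "Is Inchoative of": 0.3, "Is Causative of": 0.3,
    "See also": 0.4,
}


def fewest_edge_path(edges: Edges, start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search; returns relation labels along a fewest-edges path."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, relations = queue.popleft()
        if node == goal:
            return relations
        for relation, neighbour in edges.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, relations + [relation]))
    return None  # goal not reachable from start


def shortest_path_pairs(edges: Edges, e1: Set[str],
                        e1_prime: Set[str]) -> List[Tuple[str, str, List[str]]]:
    """Pair each frame in E1 with the frame in E'1 it reaches in the fewest edges."""
    pairs = []
    for source in sorted(e1):
        best = None
        for target in sorted(e1_prime):
            path = fewest_edge_path(edges, source, target)
            if path is not None and (best is None or len(path) < len(best[1])):
                best = (target, path)
        if best is not None:
            pairs.append((source, best[0], best[1]))
    return pairs


# Hypothetical frame graph and frame sets.
toy_edges: Edges = {"A": [("Inherits from", "B"), ("See also", "D")],
                    "B": [("Uses", "C")]}
E, E_prime = {"A", "X"}, {"C", "X"}
E1, E1_prime = E - E_prime, E_prime - E  # drop the frames shared by both sets

pairs = shortest_path_pairs(toy_edges, E1, E1_prime)
source, target, relations = pairs[0]
count_path = len(relations)                           # CountPath for this pair
edge_weights = [PATH_WEIGHT[r] for r in relations]    # Weight_t for t = 1..CountPath
print(source, target, count_path, edge_weights)
```

Breadth-first search is used because the pairing rule counts edges rather than weights; the weights enter only afterwards, when the path value of each pair is computed.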
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. To avoid unnecessary repetition, the possible combinations are not described separately.

Claims (6)

1. A method for calculating sentence similarity based on frame importance, characterized by comprising the following steps:
step 1: extracting all frames in the English sentence S, and forming a frame semantic information set E from all the frames;
step 2: constructing GIFN, a visualization tool for the frame semantic library FrameNet, and extracting the core frame elements of each frame in the frame semantic information set E through GIFN;
step 3: calculating a frame influence factor for each frame based on the number of core frame elements in the frame; establishing a frame importance function from the frame influence factors to obtain the importance w(f_{E,i}) of the i-th frame in the frame semantic information set E, where f_{E,i} denotes the i-th frame in E, i = 1, 2, ..., frame_S, and frame_S is the total number of frames in E;
step 4: forming a frame semantic information set E' from all frames in the English sentence S' according to steps 1 to 3, and calculating the importance of each frame in E';
step 5: taking each frame that appears in both E and E' as a frame group, obtaining frame_same frame groups; comparing the importances of the two frames in the j-th frame group and selecting the smaller one as the frame importance min_j of the j-th frame group, j = 1, 2, ..., frame_same; accumulating the frame importances of the frame_same frame groups, and calculating the similarity of the English sentences S and S' based on the accumulated value.
2. The method for calculating sentence similarity based on frame importance according to claim 1, wherein in step 1 the English sentence S is input into the open-source frame-semantic extraction tool SEMAFOR, and SEMAFOR analyzes the input English sentence S according to the structure of the frame semantic library FrameNet, thereby extracting the frames in the English sentence S.
3. The method for calculating sentence similarity based on frame importance according to claim 1, wherein the specific method for constructing the FrameNet visualization tool GIFN in step 2 is as follows: all frames in FrameNet are taken as nodes, the semantic relations among frames and the semantic relations between lexical units and frames are taken as edges, and the nodes and edges are stored in the graph database Neo4j.
4. The method for calculating sentence similarity based on frame importance according to claim 3, wherein the similarity between the English sentence S and the sentence S' is calculated by the following formula:
Figure FDA0003155670980000011
wherein Similarity_score is the similarity between the English sentences S and S'; frame_S' is the total number of frames in the frame semantic information set E'; Maximum(.) takes the maximum value; and the expression of Path_score is as follows:
Figure FDA0003155670980000021
wherein frame_rel is the number of shortest path frame pairs, which are obtained as follows: the frames that also appear in the frame semantic information set E' are removed from the frame semantic information set E to obtain a set E1; the frames that also appear in E are removed from E' to obtain a set E'1; the number of edges required for each frame in E1 to reach any frame in E'1 is obtained through the visualization tool GIFN; the two frames requiring the minimum number of edges are taken as a shortest path frame pair. The expression of Path_value_{i'} is as follows:
Figure FDA0003155670980000022
wherein CountPath is the number of edges required for one frame of the i'-th shortest path frame pair to reach the other; Weight_t is the weight of the t-th edge.
5. The method for calculating sentence similarity based on frame importance according to claim 1, wherein the frame influence factors in the step 3 are:
Figure FDA0003155670980000023
wherein c_i is the total number of core frame elements in f_{E,i}; n_i is the total number of frame elements in f_{E,i}; and β_i is the frame influence factor of f_{E,i}.
6. The method for calculating sentence similarity based on frame importance according to claim 5, wherein the frame importance function in step 3 is:
Figure FDA0003155670980000024
wherein the term shown in Figure FDA0003155670980000025, that is e^{β_i}, is the exponential score of β_i.
CN202110776700.XA 2021-07-09 2021-07-09 Method for calculating sentence similarity based on frame importance Active CN113536761B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110776700.XA CN113536761B (en) 2021-07-09 2021-07-09 Method for calculating sentence similarity based on frame importance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110776700.XA CN113536761B (en) 2021-07-09 2021-07-09 Method for calculating sentence similarity based on frame importance

Publications (2)

Publication Number Publication Date
CN113536761A true CN113536761A (en) 2021-10-22
CN113536761B CN113536761B (en) 2024-01-30

Family

ID=78127260

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110776700.XA Active CN113536761B (en) 2021-07-09 2021-07-09 Method for calculating sentence similarity based on frame importance

Country Status (1)

Country Link
CN (1) CN113536761B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012113422A (en) * 2010-11-22 2012-06-14 Nippon Telegr & Teleph Corp <Ntt> Document processing apparatus, method and program
US20130054612A1 (en) * 2006-10-10 2013-02-28 Abbyy Software Ltd. Universal Document Similarity
CN110889292A (en) * 2019-11-29 2020-03-17 福州大学 Text data viewpoint abstract generating method and system based on sentence meaning structure model
CN111324690A (en) * 2020-03-04 2020-06-23 南京航空航天大学 FrameNet-based graphical semantic database processing method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130054612A1 (en) * 2006-10-10 2013-02-28 Abbyy Software Ltd. Universal Document Similarity
JP2012113422A (en) * 2010-11-22 2012-06-14 Nippon Telegr & Teleph Corp <Ntt> Document processing apparatus, method and program
CN110889292A (en) * 2019-11-29 2020-03-17 福州大学 Text data viewpoint abstract generating method and system based on sentence meaning structure model
CN111324690A (en) * 2020-03-04 2020-06-23 南京航空航天大学 FrameNet-based graphical semantic database processing method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TIEXIN WANG等: "A joint FrameNet and element focusing Sentence-BERT method of sentence similarity computation", EXPERT SYSTEMS WITH APPLICATIONS, vol. 200, no. 117084, pages 1 - 11 *
王铁鑫等: "面向英文句子的框架语义扩展及相似度计算", 小型微型计算机系统, pages 1 - 8 *

Also Published As

Publication number Publication date
CN113536761B (en) 2024-01-30

Similar Documents

Publication Publication Date Title
US11520812B2 (en) Method, apparatus, device and medium for determining text relevance
JP7223785B2 (en) TIME-SERIES KNOWLEDGE GRAPH GENERATION METHOD, APPARATUS, DEVICE AND MEDIUM
CN111104794B (en) Text similarity matching method based on subject term
TWI662425B (en) A method of automatically generating semantic similar sentence samples
CN108197117B (en) Chinese text keyword extraction method based on document theme structure and semantics
CN106649260B (en) Product characteristic structure tree construction method based on comment text mining
US20190005049A1 (en) Corpus search systems and methods
CN110502642B (en) Entity relation extraction method based on dependency syntactic analysis and rules
CN111190900B (en) JSON data visualization optimization method in cloud computing mode
US20110302168A1 (en) Graphical models for representing text documents for computer analysis
US20220277005A1 (en) Semantic parsing of natural language query
US11514034B2 (en) Conversion of natural language query
CN103646112A (en) Dependency parsing field self-adaption method based on web search
CN109522396B (en) Knowledge processing method and system for national defense science and technology field
CN110909126A (en) Information query method and device
CN109840255A (en) Reply document creation method, device, equipment and storage medium
CN112633000A (en) Method and device for associating entities in text, electronic equipment and storage medium
JP2018005690A (en) Information processing apparatus and program
JPH0816620A (en) Data sorting device/method, data sorting tree generation device/method, derivative extraction device/method, thesaurus construction device/method, and data processing system
JP7197542B2 (en) Method, Apparatus, Device and Medium for Text Word Segmentation
CN111444713A (en) Method and device for extracting entity relationship in news event
Korobkin et al. Patent data analysis system for information extraction tasks
CN110929509B (en) Domain event trigger word clustering method based on louvain community discovery algorithm
CN106294689B (en) A kind of method and apparatus for selecting to carry out dimensionality reduction based on text category feature
CN112632272A (en) Microblog emotion classification method and system based on syntactic analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant