CN110532553B - Water conservancy space relation word recognition and extraction method

Publication number
CN110532553B
Authority
CN
China
Prior art keywords
words
word
spatial
relation
syntax
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910771664.0A
Other languages
Chinese (zh)
Other versions
CN110532553A (en)
Inventor
冯钧
相颖
夏佩佩
陆佳民
朱跃龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-08-21
Filing date
2019-08-21
Publication date
2023-08-22
Application filed by Hohai University HHU
Priority to CN201910771664.0A
Publication of CN110532553A
Application granted
Publication of CN110532553B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A10/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A10/40 Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method for identifying and extracting spatial relation words in the water conservancy domain, comprising the following steps: acquiring a spatial relation seed set based on quantitative statistical features; constructing original syntactic patterns; generalizing the syntactic patterns, that is, merging several original patterns that express similar spatial relations into one pattern, which reduces the number of patterns and raises their level of abstraction; and extracting spatial relations based on the generalized patterns. The invention addresses spatial relation extraction in the water conservancy domain and uses a weakly supervised method to automatically identify spatial relations, build the spatial relation word set, acquire spatial relation syntactic patterns, and extract spatial relation tuples, saving a great deal of labor and time. It extracts spatial relations from water conservancy data resources, converts free text in the water conservancy domain into structured data, and supplements the knowledge graph with spatial relations at scale and with domain expertise, thereby providing users with a more accurate query service.

Description

Water conservancy space relation word recognition and extraction method
Technical Field
The invention relates to the technical field of water conservancy business, in particular to a method for identifying and extracting water conservancy space relation words.
Background
With the rapid development of internet technology, the water conservancy sector has accumulated massive amounts of data carrying spatial relations, including a large number of official documents. Because natural language text is an important source of spatial data, extracting spatial relation data from text is an important research direction in the water conservancy domain.
The main purpose of information extraction is to extract specific factual information from text, that is, to convert unstructured natural language text into structured or semi-structured data for storage, so that people can acquire knowledge quickly and conveniently. The extracted data can support detailed mining and analysis, and information extraction plays an important role in other areas of natural language processing, such as knowledge graph construction and intelligent question answering (QA) systems. Within it, relation extraction has become an important component that has attracted growing attention from researchers and is now a research hotspot. Therefore, where spatial relations are concerned, automatic relation extraction should be used to supplement the nodes of the water conservancy data resource knowledge graph with spatial semantic information, and the supplemented spatial semantics must meet the application requirements of water conservancy services.
Traditional entity relation extraction relies mainly on rule matching, which requires the help of many linguistic experts: effective relation features are selected according to the linguistic structure of the corpus, and rules are written by hand to match and extract relations. As the main early approach, it achieved some success in obtaining entity relations and gives good results in specific domains or on small corpora. However, writing rules by hand is time-consuming and labor-intensive, and rewriting them for each new domain is expensive.
Word segmentation: English words are separated by spaces, so English text can be segmented simply by splitting on spaces. Chinese, by contrast, is written character by character with no delimiters between words, yet the word is the smallest meaningful linguistic unit in Chinese text, so word segmentation is the basis and key of Chinese information processing. Chinese word segmentation techniques fall into three main categories: methods based on dictionary matching, methods based on word-frequency statistics, and methods based on knowledge understanding.
Part-of-speech tagging: part-of-speech (POS) tagging labels each word with its part of speech, that is, it determines whether the word is a verb, noun, adjective, or other part of speech. Because Chinese words rarely change their part of speech, tagging is relatively simple: most words have only one part of speech, or their most frequent part of speech is far more frequent than the second. Simply choosing the most frequent part of speech for each word can reach a Chinese tagging accuracy of 80%. More accurate tagging can be achieved with hidden Markov models (HMMs).
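The most-frequent-tag baseline mentioned above can be sketched in a few lines of Python. This is only an illustration: the function names, the tag abbreviations, the fallback tag for unseen words, and the tiny training sample are assumptions and do not come from the patent.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def train_most_frequent_tag(tagged: List[Tuple[str, str]]) -> Dict[str, str]:
    """Learn, for every word, the tag it carries most often in the training data."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for word, tag in tagged:
        counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def tag_words(words: List[str], model: Dict[str, str], default: str = "n") -> List[Tuple[str, str]]:
    """Tag each word with its most frequent tag; unseen words get the default tag."""
    return [(w, model.get(w, default)) for w in words]


# Hypothetical training sample: "flows" is seen twice as a verb and once as a noun.
corpus = [("river", "n"), ("flows", "v"), ("flows", "v"), ("flows", "n"), ("east", "d")]
model = train_most_frequent_tag(corpus)
print(tag_words(["river", "flows", "north"], model))
```

An HMM tagger improves on this baseline by also modelling how likely one tag is to follow another, which the per-word lookup above ignores.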
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method for identifying and extracting spatial relation words in the water conservancy domain. The method uses weak supervision to automatically identify spatial relations, build the spatial relation word set, acquire spatial relation syntactic patterns, and extract spatial relation tuples. It thereby extracts spatial relations from water conservancy data resources, converts free text in the water conservancy domain into structured data, and supplements the knowledge graph with spatial relations at scale and with domain expertise, providing users with a more accurate query service.
In order to solve the technical problems, the invention provides a method for identifying and extracting water conservancy space relation words, which comprises the following steps:
(1) Acquiring a spatial relation seed set based on quantitative statistical features;
(2) Constructing original syntactic patterns $P = \langle e_1, e_2, r, C, Pos_{e_1}, Pos_{e_2}, Pos_r \rangle$, where $P$ denotes a syntactic pattern, $r$ denotes the spatial relation word, $C$ denotes the set $\{w_1, w_2, \ldots, w_n\}$ of words in the sentence other than the entities, and $e_1$ and $e_2$ are the type labels of the two water conservancy entities;
(3) Generalizing the syntactic patterns, that is, merging several original syntactic patterns that express similar spatial relations into one pattern, which reduces the number of patterns and raises their level of abstraction;
(4) Extracting spatial relations based on the generalized syntactic patterns.
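As an illustration, the seven-element pattern of step (2) can be held in a small record type. The following is a minimal Python sketch; the class name `SyntacticPattern`, the field names, the reading of the three Pos elements as token indices, and the sample values are assumptions made for illustration and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SyntacticPattern:
    """One pattern P = <e1, e2, r, C, Pos_e1, Pos_e2, Pos_r> (field names are illustrative)."""
    e1: str                   # type label of the first water conservancy entity
    e2: str                   # type label of the second water conservancy entity
    r: str                    # spatial relation word
    context: Tuple[str, ...]  # C: words of the sentence other than the entities
    pos_e1: int               # assumed: token index of entity 1
    pos_e2: int               # assumed: token index of entity 2
    pos_r: int                # assumed: token index of the relation word

    def order(self) -> Tuple[str, ...]:
        """Left-to-right order of the two entities and the relation word."""
        slots = [(self.pos_e1, "e1"), (self.pos_e2, "e2"), (self.pos_r, "r")]
        return tuple(name for _, name in sorted(slots))


# Hypothetical example: "<reservoir> is located upstream of <river>"
p = SyntacticPattern("reservoir", "river", "upstream", ("located", "upstream"), 0, 4, 2)
print(p.order())
```

Storing the relative order alongside the raw positions is convenient because the later clustering and matching steps compare the order of the entities and the relation word rather than absolute indices.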
Preferably, in step (1), acquiring the spatial relation seed set based on quantitative statistical features specifically comprises the following steps:
(11) Data preprocessing: word segmentation and part-of-speech tagging are performed on the sentences in which the entities co-occur to form a word set, and stop words, such as the copula and other high-frequency function words, are filtered out;
(12) Feature selection and statistics: the distribution of spatial relation words within sentences is obtained by counting 7 features: (a) the part of speech POS; (b) the position LOC of the relation word relative to the water conservancy entities; (c) the position LCCP of the relation word (to the left of both entities, to the right of both, or between them) when a conjunction or preposition stands to its left; (d) the distance DIS1 from the spatial relation word to entity 1; (e) the distance DIS2 from the spatial relation word to the end of the sentence; (f) the length LEN of the spatial relation word (in characters); (g) the distance DIS(e1, e2) between the two entities (in words). These statistics serve as the basis for the subsequent calculation that extracts spatial relation words;
(13) Keyword extraction and instance seed set construction: based on the statistics obtained in step (12), the importance of each word's part of speech, position, and distance is taken into account, and the spatial relation words are obtained by calculating a relation word importance score;
(14) Relation word expansion: using the hierarchical structure of the synonym dictionary, the line containing each spatial relation word of the seed set is located and the 8th character of its semantic code is checked; if it is "=", the relation word from the seed set is taken as the unified descriptor and the synonyms and near-synonyms on that line are taken as candidate words, thereby establishing a water conservancy spatial relation system and expanding the set of spatial relation words.
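The expansion in step (14) can be sketched as follows. The patent does not give the dictionary's file format, so this sketch assumes each line holds an 8-character semantic code followed by space-separated words, with "=" in the 8th position marking a line of synonyms. The function name, the codes, and the English sample words are all hypothetical.

```python
from typing import Dict, List


def expand_relation_words(seeds: List[str], dictionary_lines: List[str]) -> Dict[str, List[str]]:
    """Map each seed relation word to the candidate synonyms found on '=' lines."""
    expansion: Dict[str, List[str]] = {}
    for line in dictionary_lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        code, words = parts[0], parts[1:]
        # Only lines whose 8-character code ends in "=" are treated as synonym lines.
        if len(code) != 8 or code[7] != "=":
            continue
        for seed in seeds:
            if seed in words:
                # The seed stays the unified descriptor; the other words become candidates.
                expansion.setdefault(seed, []).extend(w for w in words if w != seed)
    return expansion


# Hypothetical dictionary lines (codes and words invented for illustration).
lines = [
    "Cb05A01= upstream upper-reaches headwater",
    "Cb05A02# downstream estuary",
    "Cb06A01= adjacent neighbouring bordering",
]
print(expand_relation_words(["upstream", "downstream", "adjacent"], lines))
```

In this sample, "downstream" gains no candidates because its line is not marked "=", which mirrors the check on the 8th character described in step (14).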
Preferably, in step (2), constructing the original syntactic patterns specifically comprises the following steps:
(21) Taking the seed tuples from step (1) as input, obtaining the sentences in the corpus in which they co-occur, and preprocessing those sentences;
(22) Performing lexical analysis with a natural language processing tool and syntactic analysis with the Stanford CoreNLP tool to obtain a syntax tree, and calculating the relative distance between two words from the directed path length and the node depths in the syntax tree;
(23) Retaining the effective words in the word sequence, such as verbs, nouns, and adjectives, and filtering out words that carry little meaning, such as numerals and pronouns;
(24) Calculating weights for the retained word sequence: the weight of each phrase is measured by the node distance in the syntax tree between the phrase structure and the relation word of the sentence, and the semantic code of each word in the synonym dictionary (the "word forest") is identified;
(25) Identifying the positions of the two entities and the relation word in the sentence and storing them as a syntactic pattern.
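Step (22) derives a distance between two words from path length and node depth in the syntax tree, but the exact formula is not given. A common choice, used here purely as an assumption, is the number of edges on the path through the lowest common ancestor: depth(a) + depth(b) - 2 * depth(lca). The tree below is a hand-written toy example stored as child-to-parent links, not CoreNLP output.

```python
from typing import Dict, List, Optional


def path_to_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path


def tree_distance(a: str, b: str, parent: Dict[str, Optional[str]]) -> int:
    """Number of edges on the path between a and b through their lowest common ancestor."""
    path_a = path_to_root(a, parent)
    path_b = path_to_root(b, parent)
    depth_a, depth_b = len(path_a) - 1, len(path_b) - 1
    ancestors_a = set(path_a)
    # The first node on b's upward path that also lies on a's path is the LCA.
    lca = next(n for n in path_b if n in ancestors_a)
    depth_lca = len(path_to_root(lca, parent)) - 1
    return depth_a + depth_b - 2 * depth_lca


# Hypothetical parse tree, stored as child -> parent (None marks the root).
parent = {
    "S": None, "NP": "S", "VP": "S",
    "reservoir": "NP", "located": "VP", "PP": "VP",
    "upstream": "PP", "river": "PP",
}
print(tree_distance("reservoir", "upstream", parent))
print(tree_distance("upstream", "river", parent))
```

A distance of this kind could then feed the phrase weights of step (24), for example by giving larger weights to words that sit closer to the relation word.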
Preferably, in step (3), generalizing the syntactic patterns specifically comprises the following steps:
(31) Syntactic pattern clustering: the similarity between two syntactic patterns is calculated only when the relative positions of the entities and the relation word are the same, the entity types are the same, and the contexts share an effective word; otherwise the two patterns are directly regarded as dissimilar;
(32) Syntactic pattern generalization: clustering yields several clusters, each a set of highly similar patterns; the patterns in each cluster are generalized into one abstract pattern by merging their word sequences into a single sequence and updating the values of $Pos_{e_1}$, $Pos_{e_2}$, and $Pos_r$.
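The two sub-steps above can be sketched as follows. The patent does not specify the similarity measure, the clustering algorithm, or how the Pos values are updated. This sketch therefore assumes Jaccard overlap of context words and a greedy single-pass clustering, and it keeps the shared relative order in place of positions. All names and sample patterns are hypothetical.

```python
from typing import Dict, List, Tuple

# A pattern here is a dict with keys "types" (entity type pair), "order"
# (left-to-right order of e1, e2, r) and "context" (effective words).
Pattern = Dict[str, Tuple[str, ...]]


def similarity(p: Pattern, q: Pattern) -> float:
    """Jaccard overlap of the contexts, or 0.0 when the preconditions of (31) fail."""
    if p["types"] != q["types"] or p["order"] != q["order"]:
        return 0.0
    a, b = set(p["context"]), set(q["context"])
    shared = a & b
    if not shared:
        return 0.0  # no common effective word: directly dissimilar
    return len(shared) / len(a | b)


def cluster(patterns: List[Pattern], threshold: float) -> List[List[Pattern]]:
    """Greedy clustering: join the first cluster whose first member is similar enough."""
    clusters: List[List[Pattern]] = []
    for p in patterns:
        for members in clusters:
            if similarity(p, members[0]) >= threshold:
                members.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def generalize(members: List[Pattern]) -> Pattern:
    """Merge the context word sequences of one cluster, keeping first-seen order."""
    merged: List[str] = []
    for p in members:
        for w in p["context"]:
            if w not in merged:
                merged.append(w)
    return {"types": members[0]["types"], "order": members[0]["order"], "context": tuple(merged)}


patterns = [
    {"types": ("reservoir", "river"), "order": ("e1", "r", "e2"), "context": ("located", "upstream")},
    {"types": ("reservoir", "river"), "order": ("e1", "r", "e2"), "context": ("built", "upstream")},
    {"types": ("station", "river"), "order": ("e1", "r", "e2"), "context": ("located", "upstream")},
]
clusters = cluster(patterns, threshold=0.3)
print([len(c) for c in clusters])
print(generalize(clusters[0])["context"])
```

The third pattern forms its own cluster even though its context matches the first, because the entity types differ and the precondition of step (31) rules out any comparison.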
Preferably, in step (4), extracting spatial relations based on the generalized syntactic patterns specifically comprises the following steps:
(41) Obtaining, by means of the spatial relation word set, the set of co-occurrence sentences that contain spatial relation words, and preprocessing them;
(42) Obtaining the original syntactic patterns of these co-occurrence sentences with the syntactic pattern acquisition method of step (2), generating an original pattern set;
(43) Matching each original pattern against every pattern in the generalized pattern set; when the order of the entities and the spatial relation word is the same, the entity types are the same, and the pattern similarity exceeds a threshold β, the corresponding spatial relation tuple is extracted according to the positions of the entities and the spatial relation word in the original pattern.
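The matching rule of step (43) can be illustrated with a short Python sketch. The similarity used here (the share of the original pattern's context words that also occur in the generalized pattern) is an assumption, as are the function name, the dictionary keys, and the placeholder tokens; the patent states only that the similarity must exceed the threshold β.

```python
from typing import Dict, List, Optional, Tuple


def extract_tuple(tokens: List[str], raw: Dict, generalized: List[Dict],
                  beta: float) -> Optional[Tuple[str, str, str]]:
    """Return (entity 1, relation word, entity 2) if some generalized pattern matches."""
    for g in generalized:
        # Same entity types and same order of entities and relation word are required.
        if raw["types"] != g["types"] or raw["order"] != g["order"]:
            continue
        context = set(raw["context"])
        overlap = len(context & set(g["context"])) / len(context) if context else 0.0
        if overlap > beta:
            # Positions stored in the original pattern locate the tuple elements.
            return (tokens[raw["pos_e1"]], tokens[raw["pos_r"]], tokens[raw["pos_e2"]])
    return None


# Hypothetical sentence and patterns.
tokens = ["ReservoirA", "is", "located", "upstream", "of", "RiverB"]
raw = {
    "types": ("reservoir", "river"), "order": ("e1", "r", "e2"),
    "context": ("located", "upstream"), "pos_e1": 0, "pos_r": 3, "pos_e2": 5,
}
generalized = [
    {"types": ("reservoir", "river"), "order": ("e1", "r", "e2"),
     "context": ("located", "upstream", "built")},
]
print(extract_tuple(tokens, raw, generalized, beta=0.5))
```

Because the comparison is strict, a pattern whose similarity equals β exactly is not extracted, in line with the "greater than" wording of step (43).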
The beneficial effects of the invention are as follows. Building on existing entity relation extraction techniques, the invention addresses spatial relation extraction in the water conservancy domain and uses a weakly supervised method to automatically identify spatial relations, build the spatial relation word set, acquire spatial relation syntactic patterns, and extract spatial relation tuples, saving a great deal of labor and time. It extracts spatial relations from water conservancy data resources, converts free text in the water conservancy domain into structured data, and supplements the knowledge graph with spatial relations at scale and with domain expertise, thereby providing users with a more accurate query service.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention.
Detailed Description
As shown in FIG. 1, a method for identifying and extracting spatial relation words in the water conservancy domain comprises the following steps:
(1) Acquiring a spatial relation seed set based on quantitative statistical features;
(11) Data preprocessing: word segmentation and part-of-speech tagging are performed on the sentences in which the entities co-occur to form a word set, and stop words, such as the copula and other high-frequency function words, are filtered out;
(12) Feature selection and statistics: the distribution of spatial relation words within sentences is obtained by counting 7 features: (a) the part of speech POS; (b) the position LOC of the relation word relative to the water conservancy entities; (c) the position LCCP of the relation word (to the left of both entities, to the right of both, or between them) when a conjunction or preposition stands to its left; (d) the distance DIS1 from the spatial relation word to entity 1; (e) the distance DIS2 from the spatial relation word to the end of the sentence; (f) the length LEN of the spatial relation word (in characters); (g) the distance DIS(e1, e2) between the two entities (in words). These statistics serve as the basis for the subsequent calculation that extracts spatial relation words;
(13) Keyword extraction and instance seed set construction: based on the statistics obtained in step (12), the importance of each word's part of speech, position, and distance is taken into account, and the spatial relation words are obtained by calculating a relation word importance score;
(14) Relation word expansion: using the hierarchical structure of the synonym dictionary, the line containing each spatial relation word of the seed set is located and the 8th character of its semantic code is checked; if it is "=", the relation word from the seed set is taken as the unified descriptor and the synonyms and near-synonyms on that line are taken as candidate words, thereby establishing a water conservancy spatial relation system and expanding the set of spatial relation words.
(2) Constructing original syntactic patterns $P = \langle e_1, e_2, r, C, Pos_{e_1}, Pos_{e_2}, Pos_r \rangle$, where $P$ denotes a syntactic pattern, $r$ denotes the spatial relation word, $C$ denotes the set $\{w_1, w_2, \ldots, w_n\}$ of words in the sentence other than the entities, and $e_1$ and $e_2$ are the type labels of the two water conservancy entities;
(21) Taking the seed tuples from step (1) as input, obtaining the sentences in the corpus in which they co-occur, and preprocessing those sentences;
(22) Performing lexical analysis with a natural language processing tool and syntactic analysis with the Stanford CoreNLP tool to obtain a syntax tree, and calculating the relative distance between two words from the directed path length and the node depths in the syntax tree;
(23) Retaining the effective words in the word sequence, such as verbs, nouns, and adjectives, and filtering out words that carry little meaning, such as numerals and pronouns;
(24) Calculating weights for the retained word sequence: the weight of each phrase is measured by the node distance in the syntax tree between the phrase structure and the relation word of the sentence, and the semantic code of each word in the synonym dictionary (the "word forest") is identified;
(25) Identifying the positions of the two entities and the relation word in the sentence and storing them as a syntactic pattern.
(3) Generalizing the syntactic patterns, that is, merging several original syntactic patterns that express similar spatial relations into one pattern, which reduces the number of patterns and raises their level of abstraction;
(31) Syntactic pattern clustering: the similarity between two syntactic patterns is calculated only when the relative positions of the entities and the relation word are the same, the entity types are the same, and the contexts share an effective word; otherwise the two patterns are directly regarded as dissimilar;
(32) Syntactic pattern generalization: clustering yields several clusters, each a set of highly similar patterns; the patterns in each cluster are generalized into one abstract pattern by merging their word sequences into a single sequence and updating the values of $Pos_{e_1}$, $Pos_{e_2}$, and $Pos_r$.
(4) Extracting spatial relations based on the generalized syntactic patterns;
(41) Obtaining, by means of the spatial relation word set, the set of co-occurrence sentences that contain spatial relation words, and preprocessing them;
(42) Obtaining the original syntactic patterns of these co-occurrence sentences with the syntactic pattern acquisition method of step (2), generating an original pattern set;
(43) Matching each original pattern against every pattern in the generalized pattern set; when the order of the entities and the spatial relation word is the same, the entity types are the same, and the pattern similarity exceeds a threshold β, the corresponding spatial relation tuple is extracted according to the positions of the entities and the spatial relation word in the original pattern.
First, a bootstrapping method is used to count the part-of-speech, position, and distance features of spatial relation words in the samples; the importance of these features is introduced into the calculation that extracts spatial relation words, and the word with the highest importance is taken as the spatial relation word of a water conservancy entity pair, which prepares the seed set for the subsequent spatial relation extraction based on syntactic patterns.
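The idea of scoring candidate words by feature importance can be sketched as follows. The patent does not state the importance formula, so this sketch assumes a deliberately simple one: the relative frequency of each feature value among known relation words is estimated from samples, and a candidate's importance is the sum of the frequencies of its own feature values. Feature values, names, and sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Features = Dict[str, str]  # e.g. {"POS": "v", "LOC": "between"}


def feature_frequencies(samples: List[Features]) -> Dict[Tuple[str, str], float]:
    """Relative frequency of every (feature name, value) pair among known relation words."""
    counts = Counter((name, value) for s in samples for name, value in s.items())
    return {key: n / len(samples) for key, n in counts.items()}


def importance(candidate: Features, freq: Dict[Tuple[str, str], float]) -> float:
    """Sum of the frequencies of the candidate's feature values (unseen values add 0)."""
    return sum(freq.get((name, value), 0.0) for name, value in candidate.items())


def pick_relation_word(candidates: Dict[str, Features], freq: Dict[Tuple[str, str], float]) -> str:
    """Return the candidate word with the highest importance."""
    return max(candidates, key=lambda w: importance(candidates[w], freq))


# Hypothetical feature statistics of four known relation words.
samples = [
    {"POS": "v", "LOC": "between"},
    {"POS": "v", "LOC": "between"},
    {"POS": "n", "LOC": "between"},
    {"POS": "v", "LOC": "right"},
]
freq = feature_frequencies(samples)
candidates = {
    "flows_into": {"POS": "v", "LOC": "between"},
    "large": {"POS": "a", "LOC": "left"},
}
print(pick_relation_word(candidates, freq))
```

The distance features DIS1, DIS2, and DIS(e1, e2) could be added in the same way after bucketing them into ranges, so that they too become countable feature values.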
Second, the spatial relation words in the seed set are expanded: their synonyms and near-synonyms are obtained from the synonym dictionary to build a synonym library, in which the spatial relation words of the seed set serve as unified descriptors and the remaining synonyms serve as candidate words. This facilitates the subsequent spatial relation extraction and offers a way to handle one meaning being expressed by several words in spatial relation queries.
Finally, with the seed set as input, the seed co-occurrence sentences are preprocessed, the original syntactic patterns of the spatial relations are obtained, and the patterns are clustered and generalized into highly abstract soft patterns. The candidate words of the spatial relation words in the synonym library are then used to retrieve co-occurrence sentences that may contain a relation; these are preprocessed in the same way to obtain their syntactic patterns, which are compared with the soft patterns in the pattern library, and the corresponding spatial relation is extracted if the similarity conditions are met.

Claims (4)

1. A method for identifying and extracting water conservancy spatial relation words, characterized by comprising the following steps:
(1) Acquiring a spatial relation seed set based on quantitative statistical features;
(2) Constructing original syntactic patterns $P = \langle e_1, e_2, r, C, Pos_{e_1}, Pos_{e_2}, Pos_r \rangle$, where $P$ denotes a syntactic pattern, $r$ denotes the spatial relation word, $C$ denotes the set $\{w_1, w_2, \ldots, w_n\}$ of words in the sentence other than the entities, and $e_1$ and $e_2$ are the type labels of the two water conservancy entities;
(3) Generalizing the syntactic patterns, that is, merging several original syntactic patterns that express similar spatial relations into one pattern, which reduces the number of patterns and raises their level of abstraction; this specifically comprises the following steps:
(31) Syntactic pattern clustering: the similarity between two syntactic patterns is calculated only when the relative positions of the entities and the relation word are the same, the entity types are the same, and the contexts share an effective word; otherwise the two patterns are directly regarded as dissimilar;
(32) Syntactic pattern generalization: clustering yields several clusters, each a set of highly similar patterns; the patterns in each cluster are generalized into one abstract pattern by merging their word sequences into a single sequence and updating the values of $Pos_{e_1}$, $Pos_{e_2}$, and $Pos_r$;
(4) Extracting spatial relations based on the generalized syntactic patterns.
2. The method for identifying and extracting water conservancy spatial relation words according to claim 1, wherein in step (1), acquiring the spatial relation seed set based on quantitative statistical features specifically comprises the following steps:
(11) Data preprocessing: word segmentation and part-of-speech tagging are performed on the sentences in which the entities co-occur to form a word set, and stop words are filtered out;
(12) Feature selection and statistics: the distribution of spatial relation words within sentences is obtained by counting 7 features: (a) the part of speech POS; (b) the position LOC of the relation word relative to the water conservancy entities; (c) the position LCCP of the relation word when a conjunction or preposition stands to its left; (d) the distance DIS1 from the spatial relation word to entity 1; (e) the distance DIS2 from the spatial relation word to the end of the sentence; (f) the length LEN of the spatial relation word; (g) the distance DIS(e1, e2) between the two entities. These statistics serve as the basis for the subsequent calculation that extracts spatial relation words;
(13) Keyword extraction and instance seed set construction: based on the statistics obtained in step (12), the importance of each word's part of speech, position, and distance is taken into account, and the spatial relation words are obtained by calculating a relation word importance score;
(14) Relation word expansion: using the hierarchical structure of the synonym dictionary, the line containing each spatial relation word of the seed set is located and the 8th character of its semantic code is checked; if it is "=", the relation word from the seed set is taken as the unified descriptor and the synonyms and near-synonyms on that line are taken as candidate words, thereby establishing a water conservancy spatial relation system and expanding the set of spatial relation words.
3. The method for identifying and extracting water conservancy spatial relation words according to claim 1, wherein in step (2), constructing the original syntactic patterns comprises the following steps:
(21) Taking the seed tuples from step (1) as input, obtaining the sentences in the corpus in which they co-occur, and preprocessing those sentences;
(22) Performing lexical analysis with a natural language processing tool and syntactic analysis with the Stanford CoreNLP tool to obtain a syntax tree, and calculating the relative distance between two words from the directed path length and the node depths in the syntax tree;
(23) Retaining the effective words in the word sequence and filtering out words that carry little meaning;
(24) Calculating weights for the retained word sequence: the weight of each phrase is measured by the node distance in the syntax tree between the phrase structure and the relation word of the sentence, and the semantic code of each word in the synonym dictionary (the "word forest") is identified;
(25) Identifying the positions of the two entities and the relation word in the sentence and storing them as a syntactic pattern.
4. The method for identifying and extracting water conservancy spatial relation words according to claim 1, wherein in step (4), extracting spatial relations based on the generalized syntactic patterns specifically comprises the following steps:
(41) Obtaining, by means of the spatial relation word set, the set of co-occurrence sentences that contain spatial relation words, and preprocessing them;
(42) Obtaining the original syntactic patterns of these co-occurrence sentences with the syntactic pattern acquisition method of step (2), generating an original pattern set;
(43) Matching each original pattern against every pattern in the generalized pattern set; when the order of the entities and the spatial relation word is the same, the entity types are the same, and the pattern similarity exceeds a threshold β, the corresponding spatial relation tuple is extracted according to the positions of the entities and the spatial relation word in the original pattern.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910771664.0A CN110532553B (en) 2019-08-21 2019-08-21 Water conservancy space relation word recognition and extraction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910771664.0A CN110532553B (en) 2019-08-21 2019-08-21 Water conservancy space relation word recognition and extraction method

Publications (2)

Publication Number Publication Date
CN110532553A (en) 2019-12-03
CN110532553B (en) 2023-08-22

Family

ID=68662316

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910771664.0A Active CN110532553B (en) 2019-08-21 2019-08-21 Water conservancy space relation word recognition and extraction method

Country Status (1)

Country Link
CN (1) CN110532553B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560490A (en) * 2020-12-08 2021-03-26 吉林大学 Knowledge graph relation extraction method and device, electronic equipment and storage medium
CN117034051B (en) * 2023-07-27 2024-05-03 广东省水利水电科学研究院 Water conservancy information aggregation method, device and medium based on BIRCH algorithm

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109241538A (en) * 2018-09-26 2019-01-18 上海德拓信息技术股份有限公司 Chinese entity relation extraction method based on keyword and verb dependency

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Bootstrapping method for open geographic entity relation extraction; 冯恩信; Acta Geodaetica et Cartographica Sinica (《测绘学报》); 2016-05-30; Vol. 45, No. 5; pp. 616-622 *

Also Published As

Publication number Publication date
CN110532553A (en) 2019-12-03

Similar Documents

Publication Publication Date Title
CN109271626B (en) Text semantic analysis method
CN111209412B (en) Periodical literature knowledge graph construction method for cyclic updating iteration
CN110705296A (en) Chinese natural language processing tool system based on machine learning and deep learning
Brown et al. Analysis, statistical transfer, and synthesis in machine translation
CN109508459B (en) Method for extracting theme and key information from news
CN111061882A (en) Knowledge graph construction method
CN108920447B (en) Chinese event extraction method for specific field
CN111897917B (en) Rail transit industry term extraction method based on multi-modal natural language features
US20110040553A1 (en) Natural language processing
CN110532553B (en) Water conservancy space relation word recognition and extraction method
Patil et al. Issues and challenges in marathi named entity recognition
CN107562907B (en) Intelligent lawyer expert case response device
Hirpassa Information extraction system for Amharic text
CN109960720B (en) Information extraction method for semi-structured text
Seresangtakul et al. Thai-Isarn dialect parallel corpus construction for machine translation
Batarfi et al. Building an Arabic semantic lexicon for Hajj
Eineborg et al. ILP in part-of-speech tagging—an overview
CN116226362B (en) Word segmentation method for improving accuracy of searching hospital names
Loglo A Lexical Dependency Probability Model for Mongolian Based on Integration of Morphological and Syntactic Features
Samir et al. Training and evaluation of TreeTagger on Amazigh corpus
CN111241827B (en) Attribute extraction method based on sentence retrieval mode
KR20020003574A (en) Apparatus And Method For Word Sense Disambiguation In Machine Translation System
Reeve Integrating hidden markov models into semantic web annotation platforms
CN115455039A (en) Dependency analysis method in natural language query field
Liu Researches Advanced in the Development and Application of Information Extraction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant