CN114757147A - BERT-based automatic hierarchical tree expansion method - Google Patents

BERT-based automatic hierarchical tree expansion method

Info

Publication number
CN114757147A
CN114757147A (application CN202210350872.5A)
Authority
CN
China
Prior art keywords
entity
hierarchical tree
space
bert
entities
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210350872.5A
Other languages
Chinese (zh)
Inventor
陶明阳
王星
陈吉
张鑫
刘亚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Liaoning Technical University
Linyi University
Original Assignee
Liaoning Technical University
Linyi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Liaoning Technical University, Linyi University filed Critical Liaoning Technical University
Priority to CN202210350872.5A priority Critical patent/CN114757147A/en
Publication of CN114757147A publication Critical patent/CN114757147A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/14Tree-structured documents
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a BERT-based automatic hierarchical tree expansion method, which comprises: extracting an entity set from a corpus and generating word vectors for it, then performing preliminary completion of each entity space corresponding to the hierarchical tree input by the user; generating an optimal class name for each entity space using the MASK mechanism of BERT, generating a candidate set for each entity space through class-name-guided expansion, and supplementing high-quality entities into the corresponding entity space after calculating each candidate entity's score and its similarity score with the seed set; and performing entity disambiguation to obtain the hierarchical tree expansion result. The method uses a language model to understand the hierarchical tree input by the user, obtains candidate words at each position, fills them in, and finally returns a hierarchical tree that meets the user's requirements.

Description

BERT-based automatic hierarchical tree expansion method
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to an automatic hierarchical tree expansion method based on BERT.
Background
Hierarchical trees are widely used in many downstream natural language processing tasks. Because manual labeling is costly and data quality is uneven, a method for automatically constructing hierarchical trees is urgently needed. Existing hierarchical tree expansion methods mainly handle the hypernym-hyponym ("is-a") relation, which greatly limits their applicability to real tasks. The invention therefore lets the user input a preset hierarchical tree context format for the task, and the system completes the whole hierarchical tree according to that format. Existing expansion methods, however, achieve neither high precision nor high efficiency, and do not meet the needs of downstream tasks well.
Two main tasks of hierarchical tree expansion are optimized. First, for width expansion, a BERT pre-trained model is used: each entity space is assigned a class name, candidate entities are obtained through the class names, and the width expansion result is finally obtained after ANNOY filtering. Second, for depth expansion, the hypernym-hyponym relation score of two nodes is calculated using Word2Vec.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a BERT-based automatic hierarchical tree expansion method that uses a language model to understand the hierarchical tree input by the user, obtains candidate words at each position, fills them in, and finally obtains a hierarchical tree that meets the user's requirements.
In order to solve the technical problems, the invention is realized by the following technical scheme:
the invention provides a BERT-based automatic hierarchical tree expansion method, which comprises the following steps:
s1: extracting an entity set through a corpus, generating word vectors of the entity set, and performing preliminary completion on each entity space corresponding to a hierarchical tree input by a user;
s2: generating an optimal class name for each entity space by using a MASK mechanism of BERT, generating a candidate set for each entity space by using a class name guide expansion mode, and supplementing high-quality entities to the corresponding entity space after calculating the score of each candidate entity and the similarity score with the seed set;
s3: and carrying out entity disambiguation and obtaining a hierarchical tree expansion result.
Further, the specific steps of step S1 are as follows:
step S1.1: extracting entities in the corpus as an extended entity set by using a data mining mode;
step S1.2: obtaining a Word vector corresponding to each entity by using a Word2Vec model;
step S1.3: for each entity space, ANNOY or word vector similarity is used for preliminary expansion, so that the semantic information of the entity space is represented more accurately.
Preferably, the specific steps of step S2 are as follows:
step S2.1: for each entity space, finding out the possible class names and scores thereof of the entity space through the MLM task of the BERT, and generating the optimal class name and negative class name set of the entity space through the scores;
step S2.2: using the optimal class name and the negative class name set to expand the entity in each entity space, using the expanded entity as a candidate set, and calculating the score of each candidate entity;
step S2.3: and calculating the similarity score of each candidate word and the seed entity by using an ANNOY algorithm, and weighting and summing the similarity score and the score of the class name extension to obtain an extension set of each entity space.
Further, the specific steps of step S3 are as follows:
step S3.1: counting entities that appear in 2 or more different entity spaces, namely ambiguous entities;
step S3.2: each entity keeps only its highest-scoring position, generating the final hierarchical tree expansion result.
Further, the specific steps of step S3.2 are:
first, if the entity is among the entities input by the user, the expanded occurrences are discarded directly and the user-specified position is kept;
second, ancestor entities among the ambiguous positions are preferentially retained;
third, the position with the higher similarity score to the seed entities of its entity space is retained.
Therefore, the invention has the following beneficial effects:
1. Firstly, entities are extracted from the corpus through data mining and corresponding word vectors are generated with Word2Vec. Secondly, the efficient ANNOY model and word-vector similarity are used to expand the user-input tree structure on a small scale. Finally, a candidate set is generated for each entity space through BERT-based class-name expansion and filtered by ANNOY, and the final hierarchical tree expansion result is generated after the entity disambiguation module.
2. A certain number of entities are expanded into each entity space in advance to express the semantic information of each entity space more accurately; since the expansion difficulty of this step is low, a more efficient expansion mode is selected to improve overall expansion efficiency.
3. Expansion is carried out on the basis of the pre-trained model BERT, which improves expansion accuracy.
The foregoing description is only an overview of the technical solutions of the present invention, and in order to make the technical means of the present invention more clearly understood, the present invention may be implemented in accordance with the content of the description, and in order to make the above and other objects, features, and advantages of the present invention more clearly understood, the following detailed description is given in conjunction with the preferred embodiments, together with the accompanying drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings of the embodiments will be briefly described below.
FIG. 1 is a flow chart of a BERT-based automatic hierarchical tree expansion method of the present invention;
FIG. 2 is a flow chart of the hierarchical tree expansion algorithm of the present invention.
Detailed Description
Other aspects, features and advantages of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, which form a part of this specification, and which illustrate, by way of example, the principles of the invention. In the referenced drawings, the same or similar components in different drawings are denoted by the same reference numerals.
The invention expands the hierarchical tree structure input by the user and returns a more complete hierarchical tree structure to the user. First, each entity space in the hierarchical tree is preliminarily complemented based on ANNOY and Word2Vec to enhance semantic information of each entity space. Secondly, the class expansion method based on the BERT expands each entity space respectively. And finally, performing entity disambiguation on the expanded hierarchical tree and returning a final expansion result to the user.
As shown in fig. 1 and 2, the BERT-based automatic hierarchical tree expansion method of the present invention includes the following steps:
step 1: and extracting the entity set through the corpus and generating a word vector of the entity set. And performing preliminary completion on each entity space corresponding to the hierarchical tree input by the user. The entity space is all entities under each entity node;
step 2: the best class name is generated for each entity space using the BERT MASK mechanism. And a candidate set is generated for each entity space in a manner of class name guided extension. After the score of each candidate entity and the similarity score of the seed set are calculated, high-quality entities are supplemented to a corresponding entity space;
and step 3: after step 2, an entity may be in 2 or more different entity spaces, so entity disambiguation is required and a hierarchical tree expansion result is obtained.
The specific steps of step 1 are as follows:
step 1.1: extracting entities in the corpus by using a data mining mode to serve as an extended entity set;
step 1.2: obtaining a Word vector corresponding to each entity by using a Word2Vec model;
step 1.3: for each entity space, ANNOY or word vector similarity is used for preliminary expansion, so that the semantic information of the entity space is represented more accurately. The choice between the two is determined by the number of entities already in each entity space.
The specific steps of step 2 are as follows:
step 2.1: for each entity space, finding out possible class names and scores of the entity space through an MLM task of the BERT, and generating an optimal class name and a negative class name set of the entity space through the scores;
step 2.2: using the optimal class name and the negative class name set to expand the entity in each entity space, using the expanded entity as a candidate set, and calculating the score of each candidate entity;
step 2.3: and calculating the similarity score of each candidate word and the seed entity by using an ANNOY algorithm, and weighting and summing the similarity score and the score of the class name extension to obtain an extension set of each entity space.
The specific steps of step 3 are as follows:
step 3.1: counting entities that appear in 2 or more different entity spaces, namely ambiguous entities;
step 3.2: each entity keeps only its highest-scoring position, generating the final hierarchical tree expansion result. First, if the entity is among the entities input by the user, the expanded occurrences are discarded directly and the user-specified position is kept; second, ancestor entities among the ambiguous positions are preferentially retained; third, the position with the higher similarity score to the seed entities of its entity space is retained.
1) Physical space completion
In the present invention, small-batch entity expansion is performed on each entity space corresponding to the user-input hierarchical tree y. Because the expansion scale is small, the more efficient ANNOY-based and Word2Vec-based entity expansion methods are selected, and for each entity space P_i a different expansion method is chosen according to its number of entities.
a. ANNOY-based entity expansion method
Entity spaces whose number of entities reaches n are expanded with the ANNOY-based entity expansion method. First, all entities in the candidate entity set E are encoded; the encoding can be Word2Vec, GloVe, etc. Second, the entity encodings are inserted into the ANNOY index. Finally, for an entity space P, n entities are selected as a seed set S; for each seed entity s_i, the nearest-neighbour entities and their similarity scores are retrieved through ANNOY, and the entities are ranked by similarity score, the ranking being recorded as L_i. The final score of each candidate entity e is calculated from the ranking lists as follows:
score(e) = Σ_{i=1}^{n} 1/r_i    (1)

where r_i is the rank of candidate entity e in list L_i. The main reasons for expanding with ANNOY are: for small-scale entity expansion, the accuracy of most entity expansion methods meets the expected requirement, so the efficiency of entity expansion is prioritized; and ANNOY performs queries with a binary-tree data structure, which greatly improves query efficiency.
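The rank-based aggregation described above can be sketched in Python. In practice the per-seed neighbour lists would come from an ANNOY index; the function name and the exact rank-reciprocal form are illustrative assumptions, since the patent gives the equation only as an image.

```python
def aggregate_rank_scores(ranked_lists):
    """Combine per-seed nearest-neighbour rankings into one candidate score.

    ranked_lists[i] holds the candidate entities ordered by similarity to
    seed s_i (rank 1 first).  Each candidate e receives sum_i 1/r_i, where
    r_i is e's rank in the i-th list; a list that does not contain e
    contributes nothing.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, entity in enumerate(ranking, start=1):
            scores[entity] = scores.get(entity, 0.0) + 1.0 / rank
    # Highest aggregated score first.
    return sorted(scores.items(), key=lambda kv: -kv[1])
```

A candidate near the top of many seed lists accumulates a high score, so the method favours entities close to the whole seed set rather than to a single seed.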
b. Entity completion method based on Word2Vec
Entity spaces whose number of entities does not reach n use the Word2Vec-based entity completion method. As shown in fig. 2, the entity space under entity e_2 cannot be expanded with ANNOY; instead, an abstract word vector e_v of the entity space P (the red node in the figure) is obtained by combining the hypernym-hyponym relations of the sibling entity spaces with the properties of Word2Vec. When there are t eligible neighbouring entity spaces, e_v is computed as

e_v = e_f + (1/t) Σ_{i=1}^{t} (ē_i − e_{f_i})    (2)

where e_{f_i} is the parent word vector of neighbouring entity space i, ē_i is the average word vector of the nodes in neighbouring entity space i, and e_f is the parent node of the current entity space. The abstract word vector e_v serves as the centre word vector of the entity space; a similar-entity ranking list is generated with ANNOY, and the top-k entities are taken as the entity completion result.
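The abstract-vector construction above can be sketched as follows; the additive offset form (parent vector plus the average parent-to-children offset observed in the sibling spaces) is an assumed reading, since the patent shows the equation only as an image.

```python
import numpy as np

def abstract_vector(parent_vec, neighbor_parent_vecs, neighbor_mean_vecs):
    """Assumed form of e_v = e_f + (1/t) * sum_i (mean_i - parent_i).

    parent_vec:           word vector e_f of the current space's parent node
    neighbor_parent_vecs: parent word vectors of the t neighbouring spaces
    neighbor_mean_vecs:   average word vectors of the entities in those spaces
    """
    offsets = [m - p for p, m in zip(neighbor_parent_vecs, neighbor_mean_vecs)]
    return parent_vec + np.mean(offsets, axis=0)
```

The resulting e_v would then be used as the query vector for an ANNOY nearest-neighbour search, and the top-k hits become the completion result.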
2) Extension method based on class name
In the invention, how to expand each entity space based on class names is introduced; the expansion is divided into 3 steps: class name generation, class name selection, and class-name-based entity expansion.
a. Class name generation
The class name generation module takes the entity set of an entity space as input and generates a set of candidate class names for it. First, note that the goal of class name generation is similar to the hypernym detection task, so class probing queries are constructed using six Hearst patterns. More specifically, three entities in the current set and one Hearst pattern are randomly selected to construct a query, for example "[MASK] such as N_{p1}, N_{p2}, and N_{p3}.", where N_{p1}, N_{p2}, N_{p3} are 3 random entities of the entity space and [MASK] is the position predicted by the language model, i.e., the position of the class name. By repeating this random selection process, a set of queries can be constructed and input into a pre-trained language model (BERT) to obtain the entities at the [MASK] position.
The above method can only generate unigram class names, which does not meet actual requirements. The solution is to query the LM a first time and retrieve the first K most likely words, then construct a new query by appending each retrieved word after the [MASK] token, for example "[MASK] class1 such as N_{p1}, N_{p2}, and N_{p3}.". This process is repeated at most three times, and the class names of all generated noun phrases are retained.
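The query construction can be sketched as follows. The patent does not list its six Hearst patterns, so the patterns below are representative examples, and the function name is illustrative.

```python
import random

# Representative Hearst patterns; the patent's exact six are not listed.
HEARST_PATTERNS = [
    "[MASK] such as {0}, {1}, and {2}.",
    "such [MASK] as {0}, {1}, and {2}.",
    "{0}, {1}, {2} or other [MASK].",
    "{0}, {1}, {2} and other [MASK].",
    "[MASK] including {0}, {1}, and {2}.",
    "[MASK], especially {0}, {1}, and {2}.",
]

def build_probe_queries(entities, n_queries=5, seed=0):
    """Build class-probing queries: 3 random entities + 1 random pattern each."""
    rng = random.Random(seed)
    queries = []
    for _ in range(n_queries):
        e1, e2, e3 = rng.sample(entities, 3)
        queries.append(rng.choice(HEARST_PATTERNS).format(e1, e2, e3))
    return queries
```

Each query would then be fed to a masked language model (e.g. a BERT fill-mask head) to read off class-name candidates at the [MASK] position.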
b. Class name selection
In this module, the candidate class names generated above are ranked to select one best class name that represents the entire entity set; in the next module, some negative class names will be used to filter out erroneous entities.
First, a corpus-based similarity measure between a candidate entity e and a class name c is introduced. Given a class name c, 6 entity probing queries are constructed from the 6 Hearst patterns, where [MASK] now stands for an entity; the 6 queries are input to the language model to obtain the retrieved entity set X_c^q of each query q. Furthermore, X_e denotes the set of all occurrences of entity e in the seed corpus. The similarity of e and c is defined as

M_k(e, c) = the k queries q with the largest max_{x_e ∈ X_e, x ∈ X_c^q} cos(x_e, x)    (3)

sim(e, c) = (1/k) Σ_{q ∈ M_k(e,c)} max_{x_e ∈ X_e, x ∈ X_c^q} cos(x_e, x)    (4)

where cos(x, x′) is the cosine similarity between two vectors x and x′. The inner max operator finds the maximum similarity between each occurrence of e and the set of entity probing queries constructed from c. The outer top-k selection identifies the k queries most similar to e and takes their average as the final similarity between entity e and class name c. This is similar to finding the k best occurrences of entity e that match any probing query of class c, so it improves on similarity measures that only use context-free representations of the entity and the class name.
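The inner-max / top-k-average similarity can be sketched as below; the vector inputs stand in for the contextualised occurrence embeddings, and the function name is an illustrative assumption.

```python
import numpy as np

def entity_class_similarity(entity_occurrences, query_entity_sets, k=3):
    """sim(e, c): for each probe query built from class name c, take the max
    cosine similarity between any occurrence vector of e and any vector
    retrieved for that query; then average the k best queries."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    per_query = [
        max(cos(x_e, x) for x_e in entity_occurrences for x in vectors)
        for vectors in query_entity_sets
    ]
    per_query.sort(reverse=True)
    top = per_query[:k]
    return sum(top) / len(top)
```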
After the entity-class similarity score is defined, an entity can be selected from the current set and a ranked list of candidate class names obtained according to their similarity to that entity. Given the entity set, |E| ranked lists L_1, L_2, …, L_{|E|} are thus obtained. Finally, all lists are aggregated by score into a final class-name ranking list, and one of the top-ranked class names is selected as the positive class name, denoted c_p; at the same time, class names ranked lower than the positive class name c_p in each list are selected as the negative class name set C_N.
c. Entity extension based on class name
In this module, the positive class name c_p selected above and the negative class name set C_N are used to assist in selecting new entities to add to the set. The score of each entity e_i combines two scoring functions. The first is the score of entity e_i against the positive class name c_p:

s_local(e_i) = sim(e_i, c_p)    (5)

where M_k is defined in formula (3). This score is referred to as a local score, since it looks only at the top-k best occurrences in the corpus. The second scoring function calculates the similarity between each candidate entity and the existing entities in the current set based on their context-free representations. Given the current entity set E, several entities are first sampled from E, denoted E_s; the score of each candidate entity e_i is then calculated as

s_global(e_i) = (1/|E_s|) Σ_{e ∈ E_s} cos(x_{e_i}, x_e)    (6)
because it uses a context-free representation, better reflects the overall position of the entity in the embedding space, and measures entity-entity similarity in a more global sense, thus becoming a global score. Such global scores supplement the local scores described above, and their geometric mean is used to finally rank all candidate entities:
Figure BDA0003580165540000093
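Assuming the context-free representations are plain word vectors, the global score and the combined ranking score can be sketched as:

```python
import numpy as np

def global_score(candidate_vec, sampled_vecs):
    """Mean cosine similarity of a candidate to entities sampled from the
    current set E_s, using context-free (static) vectors."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return sum(cos(candidate_vec, v) for v in sampled_vecs) / len(sampled_vecs)

def final_score(local, global_):
    """Geometric mean combining the local (class-name) score and the
    global (context-free) score."""
    return (local * global_) ** 0.5
```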
as the expansion process iterates, the wrong entities may be contained in the set, resulting in semantic shifts. Therefore, with the negative class names of the above selections, a new ranking algorithm is developed to improve the quality and robustness of entity selection. First, E is resampled from the current physical space EsAnd T times are carried out to obtain T entity ordered lists. And secondly, obtaining T category ranking lists according to a category name sorting process. Finally, screening out entities meeting the conditions, wherein the entities belonging to the target semantic class can meet two conditions intuitively: (1) it appears in the first few bits of the multiple entity ranking tables; (2) selected positive class name c in its corresponding class ranking listpShould be listed above any negative class name. Combining these two criteria, a rank aggregation score is defined, as follows:
Figure BDA0003580165540000101
wherein
Figure BDA0003580165540000102
Is an index function, riIs entity eiRank list L ofiAnd finally, selecting the set of top entity last entity spaces.
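The rank aggregation over T resampled runs can be sketched as follows; list-based ranks are used for clarity, and the exact indicator form is an assumption.

```python
def rank_aggregation_score(entity, entity_rankings, class_rankings,
                           pos_class, neg_classes):
    """Sum 1/rank over the T runs, counting a run only when the positive
    class name outranks every negative class name in that run's class list."""
    total = 0.0
    for ranking, class_ranking in zip(entity_rankings, class_rankings):
        pos = class_ranking.index(pos_class)
        clean = all(pos < class_ranking.index(c)
                    for c in neg_classes if c in class_ranking)
        if clean and entity in ranking:
            total += 1.0 / (ranking.index(entity) + 1)
    return total
```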
3) Entity disambiguation
In this work, for each task-related entity, the goal is to find its single best position in the output hierarchical tree. Therefore, when an entity is found at multiple positions during the tree expansion process, it needs to be disambiguated, i.e., the best position for the entity must be determined, to resolve such conflicts.
Given a set C of conflicting nodes corresponding to different positions of the same entity, the following three rules are applied to select the best position from the set. First, if the entity is among the entities input by the user, that position is selected directly and the following two steps are skipped. Otherwise, for each pair of nodes in C, check whether one node is an ancestor of the other, and keep only the ancestor node. Finally, the score of each remaining node e ∈ C is computed as

score(e) = sim(e, par(e)) · (1/|sib(e)|) Σ_{e′ ∈ sib(e)} sim(e, e′)    (9)

where sib(e) denotes the set of all siblings of e and par(e) denotes its parent. This equation essentially captures the joint similarity of a node to its siblings and to its parent; the node with the highest score is selected. Finally, for each node in C that is not selected, the whole subtree rooted at it is deleted, and any children added to it afterwards are clipped and placed into the "child list" of its parent.
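A minimal sketch of the disambiguation confidence, assuming cosine similarity over word vectors as the underlying sim(·,·); the multiplicative parent-times-mean-sibling form is an assumed reading of the patent's equation.

```python
import numpy as np

def position_confidence(entity_vec, parent_vec, sibling_vecs):
    """Joint similarity of a candidate position to its parent and siblings:
    parent similarity times mean sibling similarity."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    sib = sum(cos(entity_vec, s) for s in sibling_vecs) / len(sibling_vecs)
    return cos(entity_vec, parent_vec) * sib
```

Among the surviving conflict nodes, the position with the highest confidence would be kept and the others pruned.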
Eventually, a hierarchical tree satisfying the user input is returned.
In the invention, the 2 tasks of hierarchical tree construction, width expansion and depth expansion, are improved. For depth expansion, small-batch expansion of the entities in each entity space is performed based on the ANNOY algorithm, so as to express the semantics of the entity space more accurately. For width expansion, each entity space is assigned a class name based on BERT, and the entity space is then expanded based on that class name.
While the foregoing is directed to the preferred embodiment of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.

Claims (5)

1. An automatic hierarchical tree expansion method based on BERT is characterized by comprising the following steps:
s1: extracting an entity set through a corpus, generating word vectors of the entity set, and performing preliminary completion on each entity space corresponding to a hierarchical tree input by a user;
s2: generating an optimal class name for each entity space by using a MASK mechanism of BERT, generating a candidate set for each entity space by using a class name guide expansion mode, and supplementing high-quality entities to the corresponding entity space after calculating the score of each candidate entity and the similarity score with the seed set;
s3: and carrying out entity disambiguation and obtaining a hierarchical tree expansion result.
2. The BERT-based automatic hierarchical tree expansion method according to claim 1, wherein the step S1 is specifically performed as follows:
step S1.1: extracting entities in the corpus by using a data mining mode to serve as an extended entity set;
step S1.2: obtaining a Word vector corresponding to each entity by using a Word2Vec model;
step S1.3: for each entity space, ANNOY or word vector similarity is used for preliminary expansion, so that the semantic information of the entity space is represented more accurately.
3. The BERT-based automatic hierarchical tree expansion method according to claim 1, wherein the step S2 is specifically performed as follows:
step S2.1: for each entity space, finding out the possible class names and scores thereof of the entity space through the MLM task of the BERT, and generating the optimal class name and negative class name set of the entity space through the scores;
step S2.2: using the optimal class name and the negative class name set to expand the entity of each entity space, using the expanded entity as a candidate set, and calculating the score of each candidate entity;
step S2.3: and calculating the similarity score of each candidate word and the seed entity by using an ANNOY algorithm, and weighting and summing the similarity score and the score of the class name extension to obtain an extension set of each entity space.
4. The BERT-based automatic hierarchical tree expansion method according to claim 1, wherein the step S3 is specifically performed as follows:
step S3.1: counting entities that appear in 2 or more different entity spaces, namely ambiguous entities;
step S3.2: each entity keeps only its highest-scoring position, generating the final hierarchical tree expansion result.
5. The BERT-based automatic hierarchical tree expansion method according to claim 4, wherein the step S3.2 comprises the following specific steps:
first, if the entity is among the entities input by the user, the expanded occurrences are discarded directly and the user-specified position is kept;
second, ancestor entities among the ambiguous positions are preferentially retained;
third, the position with the higher similarity score to the seed entities of its entity space is retained.
CN202210350872.5A 2022-04-02 2022-04-02 BERT-based automatic hierarchical tree expansion method Pending CN114757147A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210350872.5A CN114757147A (en) 2022-04-02 2022-04-02 BERT-based automatic hierarchical tree expansion method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210350872.5A CN114757147A (en) 2022-04-02 2022-04-02 BERT-based automatic hierarchical tree expansion method

Publications (1)

Publication Number Publication Date
CN114757147A true CN114757147A (en) 2022-07-15

Family

ID=82329092

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210350872.5A Pending CN114757147A (en) 2022-04-02 2022-04-02 BERT-based automatic hierarchical tree expansion method

Country Status (1)

Country Link
CN (1) CN114757147A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115982390A (en) * 2023-03-17 2023-04-18 北京邮电大学 Industrial chain construction and iterative expansion development method



Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination