CN108717423A - Code segment recommendation method based on deep semantic mining - Google Patents

Publication number
CN108717423A
Authority
CN
China
Prior art keywords
code segment
vector
natural language
code
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810371788.5A
Other languages
Chinese (zh)
Other versions
CN108717423B (en)
Inventor
陶传奇
包盼盼
黄志球
周宇
王铁鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN201810371788.5A
Publication of CN108717423A
Application granted
Publication of CN108717423B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/31 Programming languages or programming paradigms
    • G06F8/315 Object-oriented languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention discloses a code segment recommendation method based on deep semantic mining. It exploits the effectiveness of deep learning in natural language processing and its strength in mining natural language semantics, combined with the characteristics of query-based code segment recommendation. Based on the input natural language query and on the code segments themselves together with the comments they carry, the method deeply mines the natural language semantics and the concrete function of each code segment and generates sentence vectors and paragraph vectors, so that code segments and natural language queries with consistent semantics are mapped to nearby positions in the vector space. For a given query, it then recommends the N best-matching code segments, sorted by similarity from high to low. The method improves both the accuracy and the recall of recommendation, and it tolerates imprecise natural language queries well.

Description

Code segment recommendation method based on deep semantic mining
Technical Field
The invention belongs to the technical field of query-based code recommendation, and particularly relates to a code segment recommendation method based on deep semantic mining.
Background
When writing code, developers often encounter unfamiliar programming tasks or need to implement specific functions. If they can find existing similar code segments, either to learn how they are used or to copy and then adapt them for reuse, they save a great deal of time, effort and needless repetition. How to recommend high-quality code segments that match developers' actual needs is therefore an important problem in software reuse.
In practice, a developer typically uses a search engine to look for the required code fragments. However, because software code forms an integral whole, the keywords appearing in a code segment cannot accurately describe its function, so the query results are usually unsatisfactory. In addition, existing recommendation methods usually focus only on the code segment itself and ignore its description information, even though that description states the function of the code segment in natural language in the simplest and most intuitive way. In recent years, with the wide application of deep learning, language processing has made breakthrough progress, so deep mining of semantics and information in both natural languages and programming languages can now be effective. Combining language processing technology with code recommendation is therefore a new and effective approach to recommendation.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a code segment recommendation method based on deep semantic mining that uses deep learning to support code segment recommendation for natural language queries. Based on the user's natural language search and on the comments and bodies of the code segments, the invention deeply mines natural language semantics and the specific functions of code segments, so that annotated code segments and natural language queries with consistent semantics are mapped to nearby positions in the vector space, and the best-matching code segments are recommended for the given query.
To achieve this purpose, the invention adopts the following technical scheme:
The code segment recommendation method based on deep semantic mining of the invention comprises the following steps:
Step 1): construct a large-scale code segment set S with method description information;
Step 2): construct a method description information set D1 and a method body set D2, and construct an annotation set D1'; using the constructed data sets, train the Encoder-Decoder natural language sentence vector generator model M1 and the Encoder-Decoder programming language paragraph vector generator model M2;
Step 3): extract the method name of each code segment in the code segment set S, and combine it with the mapped vector representation α' of that code segment into a key-value pair <Name, α'>, which serves as the index file used during recommendation;
Step 4): for a given natural language query, obtain the corresponding natural language sentence vector, and then recommend for the query the N best-matching, ranked code segments from the code segment set S with method description information.
Preferably, step 1) specifically comprises: obtaining specific projects from an open source software platform, and splitting the source code files in those projects by method to obtain the code segment set S with method description information, where each code segment is named in the form package name, class name and method name.
Preferably, step 1) further comprises: the specific projects are Java projects, Android projects and other projects.
Preferably, step 2) specifically comprises:
21) using the method description information set D1 as the training set, train the Encoder-Decoder natural language sentence vector generator model M1 until it converges to a specified state, which completes the training of the natural language sentence vector generator; the first sentence of each code segment's comment is extracted from D1 as input, and a natural language sentence vector α1 is generated, which forms part of the vector representation of the corresponding annotated code segment;
22) using the method body set D2 as the training set, train the Encoder-Decoder programming language paragraph vector generator model M2; model training is complete when the specified convergence state is reached, and the paragraph vector α2 of each code segment body is generated at the same time;
23) the natural language sentence vector α1 and the paragraph vector α2 of the code segment body are combined by weighted addition to obtain a vector α, which serves as the vector that finally characterizes the entire annotated code segment; the set of all vectors α and the natural language sentence vector representations of the annotation set D1' are then used as the training set to train the neural network mapping model M3; after training, the neural network mapping model M3 maps each vector α to the mapped vector representation α' of the annotated code segment.
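The weighted addition in 23) can be sketched as follows. This is a minimal illustration only: the patent does not state the weights, so the parameters `w1` and `w2` (and the function name `combine_vectors`) are assumptions, and both vectors are assumed to have the same dimension.

```python
from typing import List, Sequence


def combine_vectors(alpha1: Sequence[float], alpha2: Sequence[float],
                    w1: float = 0.5, w2: float = 0.5) -> List[float]:
    """Return alpha = w1 * alpha1 + w2 * alpha2, computed element by element.

    alpha1: sentence vector of the comment's first sentence.
    alpha2: paragraph vector of the code segment body.
    w1, w2: hypothetical weights; the source does not specify their values.
    """
    if len(alpha1) != len(alpha2):
        raise ValueError("alpha1 and alpha2 must have the same dimension")
    return [w1 * a + w2 * b for a, b in zip(alpha1, alpha2)]


# Toy vectors, chosen only to make the arithmetic easy to follow.
alpha = combine_vectors([1.0, 2.0], [3.0, 4.0], w1=0.25, w2=0.75)
```

In a real system the weights would presumably be tuned on held-out queries; equal weights are used here only as a neutral default.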
Preferably, step 4) specifically comprises:
41) given a natural language input, the trained Encoder-Decoder natural language sentence vector generator model M1 computes a query sentence vector β of the specified dimension;
42) the similarity between two vectors is expressed by the cosine of the angle between them, cos θ; the similarity between each mapped vector representation α' and the query sentence vector β is calculated, and the N code segments most similar to β are recommended for the given natural language query, ranked by similarity from high to low.
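For reference, the similarity measure named in 42) is the standard cosine of the angle between two vectors. Written out for the query sentence vector β and a mapped vector α', with d denoting the shared vector dimension (a symbol introduced here for illustration, not used in the source):

```latex
\cos\theta(\beta, \alpha')
  = \frac{\beta \cdot \alpha'}{\lVert \beta \rVert \, \lVert \alpha' \rVert}
  = \frac{\sum_{k=1}^{d} \beta_k \, \alpha'_k}
         {\sqrt{\sum_{k=1}^{d} \beta_k^{2}} \; \sqrt{\sum_{k=1}^{d} {\alpha'_k}^{2}}}
```

The value lies in [-1, 1], and a larger value means a smaller angle between the two vectors, that is, a closer semantic match.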
The beneficial effects of the invention are as follows:
The invention uses the effectiveness of deep learning in natural language processing and its strength in mining language semantics to solve the problem of recommending high-quality annotated code segments for a given natural language query. It has the following advantages:
(1) Processing natural language with deep learning genuinely mines natural language semantics in depth rather than merely matching text keywords, so that sentence vectors of sentences with the same meaning lie closer together in the semantic space. The meaning the query intends to express can therefore be captured, matching during recommendation is more precise, and recommendation accuracy improves.
(2) Vectorizing the code segment body as a paragraph with deep learning mines both the structural information of the code segment and its semantic information at the programming language level, rather than only extracting simple feature words, so the information contained in the code segment body is fully exploited and the recommendation quality improves.
(3) Deep semantic matching yields the N code segments whose annotation semantics are most similar as the recommendation result, ranked by semantic similarity from high to low. Even if the input query is not expressed clearly or deviates slightly, a suitable result can still be found at a somewhat lower rank, which improves recall and provides a degree of fault tolerance.
Drawings
FIG. 1 is a diagram of a framework model used in generating sentence vectors and paragraph vectors in the present invention.
FIG. 2 is a schematic diagram of an Encoder-Decoder model used in the present invention.
Fig. 3 is a schematic diagram of basic units in an Encoder-Decoder model used in the present invention.
Fig. 4 is a schematic diagram of the present invention.
Detailed Description
To help those skilled in the art understand the invention, it is further described below with reference to examples and drawings, which are not intended to limit the invention.
The technical solution is described in detail below with reference to FIGS. 1-4, taking Java code segment recommendation as an example:
Step 1: construct a large-scale code segment set S with method description information, wherein:
11) Java projects are obtained from an open source software platform (such as GitHub); the Java files in the projects are split by method to obtain methods with method description information, and each method is written to a file whose name has the form package name, class name and method name.
12) The preliminarily obtained code segment set S with method description information is screened: poor code segments (for example, those without method description information) and useless ones (for example, test methods) are deleted, yielding a reduced, high-quality set S.
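The naming and screening in 11) and 12) might look like the sketch below. It assumes the methods have already been split out of the Java files into records; the `Method` record, the `&` separator (taken from the example result later in the description) and the test-method heuristic in `is_test_method` are illustrative assumptions, since the source does not say how test methods are recognized.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Method:
    package: str
    class_name: str
    method_name: str
    description: str  # method description (comment) text; empty if missing
    body: str         # source text of the method body


def name_key(m: Method) -> str:
    """Name of a code segment in the form package&class&method."""
    return f"{m.package}&{m.class_name}&{m.method_name}"


def is_test_method(m: Method) -> bool:
    # Hypothetical heuristic: the source only says test methods are removed.
    return m.class_name.endswith("Test") or m.method_name.startswith("test")


def screen(methods: List[Method]) -> List[Method]:
    """Drop segments without a description and segments that are test methods."""
    return [m for m in methods
            if m.description.strip() and not is_test_method(m)]


methods = [
    Method("main.java.org.assertj.core", "Delta", "fastUnorderedRemoveInt",
           "Remove the first instance of a value if found in the list.", "..."),
    Method("pkg", "FooTest", "testBar", "Checks bar.", "..."),  # test method
    Method("pkg", "Foo", "bar", "", "..."),                      # no description
]
names = [name_key(m) for m in screen(methods)]
```

Only the first record survives screening here; the `pkg` entries are made-up placeholders that exercise the two deletion rules.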
Step 2: construct the method description information set D1 and the method body set D2 used for training programming language paragraph vectors.
The description information of all methods is extracted to obtain the method description information set D1 used for training natural language sentence vectors; the first sentence of each method's description information is extracted to obtain the annotation set D1'; and the code segment bodies of all methods are extracted to obtain the method body set D2 used for training the paragraph vectors of code segments.
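A minimal sketch of building the three sets, assuming each code segment is available as a (description, body) pair. The sentence splitter `first_sentence` is a deliberately naive assumption (it cuts at the first `.`, `!` or `?` followed by whitespace or end of text) and would mis-split abbreviations such as "e.g."; the source does not describe how the first sentence is detected.

```python
import re
from typing import List, Tuple


def first_sentence(description: str) -> str:
    """Return the first sentence of a description (naive punctuation split)."""
    text = " ".join(description.split())  # collapse newlines and extra spaces
    match = re.search(r"(.+?[.!?])(\s|$)", text)
    return match.group(1) if match else text


def build_sets(snippets: List[Tuple[str, str]]):
    """Build D1 (full descriptions), D1' (first sentences) and D2 (bodies)."""
    d1 = [desc for desc, _ in snippets]
    d1_prime = [first_sentence(desc) for desc, _ in snippets]
    d2 = [body for _, body in snippets]
    return d1, d1_prime, d2


# Made-up snippets for illustration only.
snippets = [
    ("Get file content. Reads the whole file into memory.", "return read(path);"),
    ("get file content", "return read(path);"),
]
d1, d1_prime, d2 = build_sets(snippets)
```

The three lists stay aligned by position, so entry i of D1, D1' and D2 always refers to the same code segment.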
Step 3: construct and train the natural language sentence vector generator and the programming language paragraph vector generator, obtain the vector representation α of each annotated code segment, and then map the vector α through the trained neural network mapping model to obtain the vector α', wherein:
31) for the natural language sentence vector generator, the method description information set D1 is used as the training set to train the Encoder-Decoder natural language sentence vector generator model M1 until it converges to a specified state, which completes the training of the natural language sentence vector generator; the first sentence of each code segment's comment is extracted from D1 as the input of M1 to generate a natural language sentence vector α1, which forms part of the vector representation of the corresponding annotated code segment;
32) for the programming language paragraph vector generator, the method body set D2 is used as input to train the Encoder-Decoder programming language paragraph vector generator model M2; training is complete when the specified convergence state is reached, and the paragraph vector α2 of each code segment body is generated during training;
33) the natural language sentence vector α1 and the paragraph vector α2 of the code segment body are combined by weighted addition to obtain the vector α, which serves as the vector that finally represents the whole annotated code segment; the α of each annotated code segment, paired with the vector representation of its annotation, is then used as the training set to train the neural network mapping model M3; after training, the mapping model maps α to the mapped vector representation α', which represents the semantic vector α of the annotated code segment in the natural language semantic space.
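To make the role of the mapping model M3 in 33) concrete, the sketch below trains a stand-in for it. The patent does not specify the architecture of M3, so a single linear layer fitted by stochastic gradient descent on squared error is used here purely as an assumption; the function names, learning rate and epoch count are likewise illustrative.

```python
import random
from typing import List, Sequence

Vector = List[float]


def apply_map(weights: List[Vector], alpha: Vector) -> Vector:
    """Compute alpha' = W * alpha for a weight matrix stored as rows."""
    return [sum(w * a for w, a in zip(row, alpha)) for row in weights]


def train_linear_map(inputs: Sequence[Vector], targets: Sequence[Vector],
                     epochs: int = 500, lr: float = 0.1,
                     seed: int = 0) -> List[Vector]:
    """Fit W so that W * alpha approximates the annotation vector.

    inputs:  combined vectors alpha of the annotated code segments.
    targets: sentence vectors of the corresponding annotations.
    """
    rng = random.Random(seed)
    d_in, d_out = len(inputs[0]), len(targets[0])
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(d_in)]
               for _ in range(d_out)]
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            pred = apply_map(weights, x)
            for i in range(d_out):
                err = pred[i] - t[i]  # gradient of 0.5 * err^2 w.r.t. pred[i]
                for j in range(d_in):
                    weights[i][j] -= lr * err * x[j]
    return weights


# Toy training set: the target is the input with its two components swapped.
W = train_linear_map([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
alpha_prime = apply_map(W, [1.0, 0.0])
```

A multi-layer network trained with a deep learning framework would follow the same pattern: pairs of (α, annotation vector) go in, and the trained model is then applied to every α to produce α'.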
Step 4: extract the method name of each code segment from the code segment set S with method description information, in the form package name & class name & method name, and combine it with the mapped vector representation α' of the code segment into a key-value pair <Name, α'>, which serves as the index file used during recommendation.
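One possible shape for such an index file is sketched below. The source only specifies key-value pairs <Name, α'>; storing them as a JSON object is an assumption made here for illustration, and the vector shown contains just the three components that the description prints before its ellipsis, not a complete vector.

```python
import json
from typing import Dict, List


def build_index(names: List[str],
                mapped_vectors: List[List[float]]) -> Dict[str, List[float]]:
    """Pair each name (package&class&method) with its mapped vector alpha'."""
    if len(names) != len(mapped_vectors):
        raise ValueError("each name needs exactly one mapped vector")
    return dict(zip(names, mapped_vectors))


def serialize_index(index: Dict[str, List[float]]) -> str:
    """Serialize the index as JSON text (file format is an assumption)."""
    return json.dumps(index, indent=2)


index = build_index(
    ["main.java.org.assertj.core&Delta&fastUnorderedRemoveInt"],
    [[0.1001695, 0.060278, 0.0700396]],  # truncated, illustrative components
)
text = serialize_index(index)
```

Keeping only names and vectors in the index means the recommender never needs to load the code bodies at query time; the name is enough to locate the source afterwards.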
Step 5: for a given natural language query, obtain the corresponding natural language sentence vector, and then recommend for the query the N best-matching, ranked code segments from the code segment set S with method description information, wherein:
51) for a given natural language query, the query sentence vector β corresponding to the query is computed with the trained Encoder-Decoder natural language sentence vector generator model M1;
52) the similarity between two vectors is expressed by the cosine of the angle between them, cos θ; for the given natural language query, the similarity between the query sentence vector β and the mapped vector representation α' of each code segment in the index file is calculated, and the N code segments most similar to the vector β are recommended according to the index file, ranked by similarity from high to low.
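Steps 51) and 52) amount to a nearest-neighbour lookup over the index. The sketch below assumes the query vector β has already been produced by M1 and that the index is an in-memory mapping from name to α'; the segment names and vectors are made up, and the exhaustive scan is an illustrative choice, not something the source prescribes.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v (0.0 if either vector is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(beta: Sequence[float], index: Dict[str, List[float]],
              n: int) -> List[Tuple[str, float]]:
    """Return the n entries most similar to beta, highest similarity first."""
    scored = ((name, cosine(beta, vec)) for name, vec in index.items())
    return heapq.nlargest(n, scored, key=lambda pair: pair[1])


# Hypothetical two-dimensional index, for illustration only.
index = {
    "pkg&A&removeFirst": [1.0, 0.0],
    "pkg&B&readFile": [0.0, 1.0],
    "pkg&C&mixed": [1.0, 1.0],
}
top = recommend([1.0, 0.1], index, n=2)
top_names = [name for name, _ in top]
```

`heapq.nlargest` already returns its results in descending order of the key, which matches the "ranked from high to low" requirement without a separate sort.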
Example:
First, the Java projects obtained from the open source software platform GitHub are split to obtain annotated code segments, which are written to files. Taking the project assertj-core-master as an example, the cutting yields … … of "main.
In the project assertj-core-master, 35 high-quality code segments with independent functions are obtained.
After the data set has been processed, the annotation set D1' of the code segments is obtained; the explanatory sentences in D1' include "remove the first instance of a value if found in the list and replace it with the last item", "get file content", and so on. The method description information set D1 is also obtained; the descriptions in D1 include:
"Remove the first instance of a value if found in the list and replace it with the last item in the list. this shows a copy down of an update at the exception of the not previous list order", and so on, together with the code segment method body set D2.
After all models have been trained, any natural language query can be converted to its corresponding sentence vector. For each annotated code segment, the natural language sentence vector α1 and the paragraph vector α2 of the code segment body are added with weights to compute the final α as the vector representation of the annotated code segment. For the code segment in the example above:
α = [0.0501139, 0.0799258, 0.0690878, ......]
After conversion by the mapping model:
α' = [0.1001695, 0.060278, 0.0700396, ......]
The input query y = (y1, y2, ..., yt), specifically "remove the first instance of a list", is processed by the model to obtain the sentence vector of the specified dimension corresponding to y ("remove the first instance of a list" = [0.0703125, 0.0869141, 0.0878906, ......]).
The index file of the N annotated code segments in the data set S is <N4S1i1>, <N4S2i2>, ......, <N4SNiN>, specifically <"Remove the first instance of a value if found in the list and replaces it with the last item in the list", M1> ...... <"Input and output of a file", Mk> .... The values cos θ(y, αi') are, for example, 0.0054, 0.062 and 0.785 respectively and are the smallest N in S, with cos θ(y, α'1) < cos θ(y, α'2) < ...... < cos θ(y, α'N); the recommended result is then:
1:<main.java.org.assertj.core&Delta&fastUnorderedRemoveInt,[0.1001695,......]>
2:<……,……>
...
N:<……,……>
The N ranked code segments are presented as indexes when actually recommended, that is, as links to the corresponding code segments in the form package name, class name and method name in S. When a user wants to view a specific code segment, a click shows the actual source code of that segment. This design is based on considerations of user comfort and visual appeal.
While the invention has been described in terms of its preferred embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention.

Claims (5)

1. A code segment recommendation method based on deep semantic mining, characterized by comprising the following steps:
Step 1): construct a large-scale code segment set S with method description information;
Step 2): construct a method description information set D1 and a method body set D2, and construct an annotation set D1'; using the constructed data sets, train the Encoder-Decoder natural language sentence vector generator model M1 and the Encoder-Decoder programming language paragraph vector generator model M2;
Step 3): extract the method name of each code segment in the code segment set S, and combine it with the mapped vector representation α' of that code segment into a key-value pair <Name, α'>, which serves as the index file used during recommendation;
Step 4): for a given natural language query, obtain the corresponding natural language sentence vector, and then recommend for the query the N best-matching, ranked code segments from the code segment set S with method description information.
2. The code segment recommendation method based on deep semantic mining as claimed in claim 1, wherein step 1) specifically comprises: obtaining specific projects from an open source software platform, and splitting the source code files in those projects by method to obtain the code segment set S with method description information, where each code segment is named in the form package name, class name and method name.
3. The code segment recommendation method based on deep semantic mining as claimed in claim 2, wherein step 1) further comprises: the specific project is a Java project or an Android project.
4. The code segment recommendation method based on deep semantic mining as claimed in claim 1, wherein step 2) specifically comprises:
21) using the method description information set D1 as the training set, train the Encoder-Decoder natural language sentence vector generator model M1 until it converges to a specified state, which completes the training of the natural language sentence vector generator; the first sentence of each code segment's comment is extracted from D1 as input, and a natural language sentence vector α1 is generated, which forms part of the vector representation of the corresponding annotated code segment;
22) using the method body set D2 as the training set, train the Encoder-Decoder programming language paragraph vector generator model M2; model training is complete when the specified convergence state is reached, and the paragraph vector α2 of each code segment body is generated at the same time;
23) the natural language sentence vector α1 and the paragraph vector α2 of the code segment body are combined by weighted addition to obtain a vector α, which serves as the vector that finally characterizes the entire annotated code segment; the set of all vectors α and the natural language sentence vector representations of the annotation set D1' are then used as the training set to train the neural network mapping model M3; after training, the neural network mapping model M3 maps each vector α to the mapped vector representation α' of the annotated code segment.
5. The code segment recommendation method based on deep semantic mining as claimed in claim 1, wherein step 4) specifically comprises:
41) given a natural language input, the trained Encoder-Decoder natural language sentence vector generator model M1 computes a query sentence vector β of the specified dimension;
42) the similarity between two vectors is expressed by the cosine of the angle between them, cos θ; the similarity between each mapped vector representation α' and the query sentence vector β is calculated, and the N code segments most similar to β are recommended for the given natural language query, ranked by similarity from high to low.
CN201810371788.5A 2018-04-24 2018-04-24 Code segment recommendation method based on deep semantic mining Active CN108717423B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810371788.5A CN108717423B (en) 2018-04-24 2018-04-24 Code segment recommendation method based on deep semantic mining

Publications (2)

Publication Number Publication Date
CN108717423A (en) 2018-10-30
CN108717423B (en) 2020-07-07

Family

ID=63899075

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810371788.5A Active CN108717423B (en) 2018-04-24 2018-04-24 Code segment recommendation method based on deep semantic mining

Country Status (1)

Country Link
CN (1) CN108717423B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105190597A (en) * 2012-12-13 2015-12-23 微软技术许可有限责任公司 Social-based information recommendation system
US9557972B2 (en) * 2014-03-25 2017-01-31 Electronics And Telecommunications Research Institute System and method for code recommendation and share
CN106462399A (en) * 2014-06-30 2017-02-22 微软技术许可有限责任公司 Code recommendation
CN107506414A (en) * 2017-08-11 2017-12-22 武汉大学 A kind of code based on shot and long term memory network recommends method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
吕飞 (Lü Fei): "Research on search-based code recommendation technology" (基于搜索的代码推荐技术研究), Wanfang Dissertation Database, Master's thesis *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109670022A (en) * 2018-12-13 2019-04-23 南京航空航天大学 A kind of java application interface use pattern recommended method based on semantic similarity
CN109670022B (en) * 2018-12-13 2023-09-29 南京航空航天大学 Java application program interface use mode recommendation method based on semantic similarity
CN110716749A (en) * 2019-09-03 2020-01-21 东南大学 Code searching method based on function similarity matching
CN110716749B (en) * 2019-09-03 2023-08-04 东南大学 Code searching method based on functional similarity matching
CN110806861A (en) * 2019-10-10 2020-02-18 南京航空航天大学 API recommendation method and terminal combining user feedback information
CN111061935B (en) * 2019-12-16 2022-04-12 北京理工大学 Science and technology writing recommendation method based on self-attention mechanism
CN111061935A (en) * 2019-12-16 2020-04-24 北京理工大学 Science and technology writing recommendation method based on self-attention mechanism
CN111142850A (en) * 2019-12-23 2020-05-12 南京航空航天大学 Code segment recommendation method and device based on deep neural network
CN111191002A (en) * 2019-12-26 2020-05-22 武汉大学 Neural code searching method and device based on hierarchical embedding
CN111459491B (en) * 2020-03-17 2021-11-05 南京航空航天大学 Code recommendation method based on tree neural network
CN111459491A (en) * 2020-03-17 2020-07-28 南京航空航天大学 Code recommendation method based on tree neural network
CN111522839A (en) * 2020-04-25 2020-08-11 华中科技大学 Natural language query method based on deep learning
CN111522839B (en) * 2020-04-25 2023-09-01 华中科技大学 Deep learning-based natural language query method
CN111857660A (en) * 2020-07-06 2020-10-30 南京航空航天大学 Context-aware API recommendation method and terminal based on query statement
US11720346B2 (en) 2020-10-02 2023-08-08 International Business Machines Corporation Semantic code retrieval using graph matching
US11645054B2 (en) 2021-06-03 2023-05-09 International Business Machines Corporation Mapping natural language and code segments

Also Published As

Publication number Publication date
CN108717423B (en) 2020-07-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant