CN113220899A - Intellectual property identity identification method based on academic talent information intellectual map - Google Patents

Info

Publication number
CN113220899A
Authority
CN
China
Prior art keywords
information
intellectual property
entity
entities
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110506792.XA
Other languages
Chinese (zh)
Inventor
郑中华
胡淦
王文仲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Boyi Information Technology Co ltd
Original Assignee
Shanghai Boyi Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Boyi Information Technology Co ltd filed Critical Shanghai Boyi Information Technology Co ltd
Priority to CN202110506792.XA priority Critical patent/CN113220899A/en
Publication of CN113220899A publication Critical patent/CN113220899A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services
    • G06Q50/184 Intellectual property management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Technology Law (AREA)
  • Tourism & Hospitality (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an intellectual property identity identification method based on an academic talent information knowledge graph. The method targets intellectual property information from across the web, such as invention patents, papers and software copyrights, identifies the authors of that information, and thereby supports building an academic talent information knowledge base for the big data industry. The invention helps to discover high-quality talent early and to build up an effective reserve of such talent.

Description

Intellectual property identity identification method based on academic talent information intellectual map
Technical Field
The invention relates to an intellectual property identity identification method based on an academic talent information knowledge graph. The method mainly targets intellectual property information from across the web, such as invention patents, papers and software copyrights, and identifies the authors of that information. It thereby supports building an academic talent information knowledge base for the big data industry, and helps to discover high-quality talent early and to build up an effective reserve of such talent.
Background
Traditional entity linking comprises three stages: mention recognition, candidate entity generation, and candidate entity ranking. Mention recognition mostly relies on entity recognition techniques. Candidate entity generation typically extracts information from a knowledge base to build a dictionary of names associated with each entity, and then produces candidates by simple string matching against that dictionary, which can yield a large number of candidate entities. The knowledge base is usually Wikipedia, so the available data is limited. Relying on dictionary matching alone also produces too many candidate entities, which wastes resources, introduces more distractors, and lowers accuracy.
Mainstream candidate entity ranking methods are based on similarity comparison: they compute the contextual similarity between the mention extracted from the text and each candidate entity retrieved from the knowledge base, and select the candidate with the highest similarity as the link target. Most of these similarity calculations use machine learning on hand-defined rules. For example, they add context features (keywords) and page structure features (such as page redirects and anchor text) from the candidate entity's Wikipedia page; or use entity popularity to assist disambiguation; or take into account the category relevance between entities (link relationships, co-occurrence probability, and so on). Methods based on hand-defined rules are quite limited and cannot capture all rule information. Moreover, the context information of candidate entities drawn from Wikipedia or other encyclopedias is incomplete and disorganized, which hinders accurate identification.
Entity linking methods based on deep learning are also becoming more popular. Unlike traditional methods, they do not require relevant features to be defined by hand. Examples include learning document representations of entities with a deep neural network (DNN) and obtaining category representations with a CNN; embedding the mention, the entity and the context, extracting features with a CNN, and then computing the similarity between mention and entity for linking; or using the BERT pre-trained language model to analyze the context of the mention and the related information of the candidate entity, improving the linking result through better semantic analysis. However, deep learning methods need large amounts of data and place high demands on machine performance, especially in big data scenarios. They also depend entirely on the corpus, so relying on them alone may give worse results when the corpus is biased.
Disclosure of Invention
The invention aims to provide an intellectual property identity identification method based on an academic talent information knowledge graph.
To solve the above technical problems, the invention adopts the following technical solution: an intellectual property identity identification method based on an academic talent information knowledge graph, comprising the following steps:
(1) a crawler acquires talent information data, including names, resumes and intellectual property information, and a talent information knowledge graph is built on neo4j from this data; the knowledge graph is stored as triples $E = \langle sub, rela, obj \rangle$ and contains the attribute information of the entities and the relationships among the entities;
(2) mention recognition: for the intellectual property information $M$ to be linked, the structured formal feature information and the text information of the intellectual property are extracted directly from $M$ using regular-expression rules, where $M = (M_1, M_2, \dots, M_n)$ and each $M_i$ is an obtained mention;
(3) candidate entity generation: the intellectual property information corresponding to the entities in the knowledge graph is divided into 4 subject categories, namely liberal arts, science, agriculture and medicine, and a subject classification model based on word2vec and TextCNN is built to determine the category;
the process of candidate entity generation is as follows:
firstly, for the intellectual property information $M$ to be linked, the mention is matched against the knowledge graph by fuzzy query, giving a set of possible entities $D = (D_1, D_2, \dots, D_n)$;
secondly, the category of the intellectual property information $M$ is obtained with the trained subject classification model, $M_{type} = \mathrm{TextCNN}(M)$;
and finally, each entity in the set $D$ is input into the trained subject classification model to obtain its category, and the final candidate entities are $H = (H_1, H_2, \dots, H_k)$ with $\{H_i \in D,\ H_{i,type} = M_{type},\ i = 1, 2, \dots, k\}$, where $k$ is the number of candidate entities remaining after category filtering;
(4) candidate entity ranking: entities are ranked on two aspects, the formal feature $F_{formal}$ and the semantic feature $F_{semantic}$; for each candidate entity $H_i$, all of its information can be obtained from the knowledge graph: in addition to the structured formal feature information $(G_{i1}, G_{i2}, \dots, G_{in})$ taken directly from the graph, it includes a semantic feature $G_{i,n+1}$ based on the intellectual property content, so the full information of the candidate entity $H_i$ is $G_i = (G_{i1}, G_{i2}, \dots, G_{in}, G_{i,n+1})$;
The specific candidate entity ranking process is as follows:
the weight of each feature, $W_i = (W_{i1}, W_{i2}, \dots, W_{in}, W_{i,n+1})$, is determined with the AHP (Analytic Hierarchy Process) method;
Solving the formal feature $F_{formal}$: for each candidate entity $H_i$, all information is obtained from the knowledge graph, the text content of the intellectual property information is removed, and only the structured formal feature information $G_i' = (G_{i1}, G_{i2}, \dots, G_{in})$ is kept; for the intellectual property information $M$ to be linked, the formal feature is determined by computing the degree of matching between $M$ and $G_i'$, which is given by
Figure BDA0003058741150000031
Figure BDA0003058741150000032
where $M_j$ is the $j$-th item of structured formal feature information of the intellectual property information $M$ to be linked; that is, $F_{formal} = S_k$.
Solving the semantic feature $F_{semantic}$: the value of the semantic feature $F_{semantic}$ between the intellectual property information $M$ to be linked and the candidate entity $H_i$ is measured by similarity, which is computed with a character-based CBA (char bi_lstm + attention) network. Specifically, the semantic feature $G_{i,n+1}$ of the candidate entity $H_i$ and the text information $M_n$ of the intellectual property information $M$ to be linked are each passed through a bi_lstm + attention layer, a softmax layer then outputs the similarity probabilities [y1, y2], and the similarity probability y1 is taken as the value of $F_{semantic}$;
ranking: the final similarity between the candidate entity $H_i$ and the intellectual property information $M$ to be linked is $F_i = F_{semantic} + F_{formal}$, and the entity finally linked is the one corresponding to $\max(F_i)$.
Preferably, the intellectual property information includes information on articles and patents; the attribute information of an entity includes the person's basic information, graduating institution, major, employer and intellectual property information; the relationships among the entities include collaboration relationships and alumni relationships.
Preferably, the structured form characteristic information includes author information, organization information, partner information, and periodical information.
Preferably, in step (3), the word2vec-based TextCNN subject classification model is built in the following specific steps:
firstly, intellectual property text data is collected and labeled with one of 4 categories: liberal arts (0), science (1), agriculture (2) and medicine (3);
secondly, word segmentation and stop-word removal are performed;
thirdly, a word2vec model is trained and used for embedding to obtain n × k-dimensional vectors, and the category labels are converted to one-hot encoding;
and finally, the embedded vectors and the labels are input into a TextCNN network for training to obtain the subject classification model, whose structure comprises convolution and pooling layers, a concatenation layer, a flatten layer, a dropout layer, a fully connected layer and a softmax layer.
Preferably, in the step (4), the AHP solving process first constructs an application example through an expert system, obtains an importance judgment matrix of each feature information through example analysis, and then calculates the weight by using SPSSAU software.
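The patent computes the weights in SPSSAU and does not give the calculation itself. As a rough illustration of what an AHP weight calculation typically does, the following sketch derives weights from a pairwise importance judgment matrix by power iteration toward its principal eigenvector and also returns the consistency index. The function name `ahp_weights`, the iteration count and the sample matrix are assumptions made for this example, not values from the patent.

```python
def ahp_weights(matrix, iterations=100):
    """Approximate AHP weights of a pairwise judgment matrix by power iteration.

    Returns (weights, consistency_index). This is a generic AHP sketch,
    not the SPSSAU procedure used in the patent.
    """
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [x / total for x in nxt]  # normalize so the weights sum to 1
    # Estimate the largest eigenvalue from A*w compared with w.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    return w, ci


# Hypothetical, perfectly consistent judgments for three features:
# feature 0 is twice as important as feature 1 and four times feature 2.
judgment = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights, ci = ahp_weights(judgment)
print(weights, ci)
```

In practice the consistency index is divided by a tabulated random index to obtain a consistency ratio, and the judgment matrix is revised if that ratio is too high.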
Preferably, the character-based CBA network similarity solution in step (4) includes the following specific steps:
first, data is collected and labeled in the form of s1, s2, label, which is 0 if s1 is similar to s2, otherwise it is 1;
secondly, character-level vector representations are obtained with word2vec;
thirdly, inputting the data into a CBA network for training to obtain a trained CBA model;
and finally, inputting the candidate entity Hi to be identified and the intellectual property information M to be linked into the trained CBA model, and acquiring an output result [ y1, y2] of the softmax layer, wherein y1 is the similarity.
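The data format and the reading of the output described in these steps can be sketched as follows. The sketch only shows character-level tokenization, the (s1, s2, label) sample form with label 0 meaning "similar", and how y1 is read from a two-way softmax; the network itself is not implemented, and the sample sentences and logits are invented placeholders.

```python
import math


def to_chars(text):
    """Character-level tokens, so no word segmentation step is needed."""
    return [ch for ch in text if not ch.isspace()]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Labeled pairs in the (s1, s2, label) form; label 0 means "similar".
samples = [
    # "knowledge graph entity linking" vs. "entity linking method based on a knowledge graph"
    ("知识图谱实体链接", "基于知识图谱的实体链接方法", 0),
    # "knowledge graph entity linking" vs. "rice pest and disease control"
    ("知识图谱实体链接", "水稻病虫害防治", 1),
]
pairs = [(to_chars(s1), to_chars(s2), label) for s1, s2, label in samples]

# Hypothetical logits standing in for the output of a trained CBA network.
y1, y2 = softmax([2.0, -1.0])
similarity = y1  # class index 0 is "similar", so y1 is used as the similarity
print(similarity)
```

Because label 0 marks a similar pair, the first softmax output is the probability of "similar", which is why y1 rather than y2 is taken as the similarity.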
The invention has the beneficial effects that:
1. Traditional entity linking relies directly on features extracted from entity context text such as Wikipedia, which cannot effectively express the degree of association between a mention and a candidate entity.
2. Traditional candidate entity generation matches directly without any screening, which increases the workload and introduces more distractors. It also fails to take into account that, when semantic information is matched directly, most prediction errors are caused by type mismatches.
3. Weighing the respective strengths and weaknesses of similarity based on conventional hand-defined rules and of similarity based on deep learning, the invention combines knowledge-graph-based formal features with intellectual-property-based semantic features in the candidate entity ranking process, and ranks the candidate entities by similarity using the statistics-based AHP method.
4. In solving the intellectual-property-based semantic features, the traditional word2vec-based cosine similarity has low accuracy. The invention instead adopts an improved Siamese network, CBA, which is character-based and adds an attention mechanism, and takes the similarity probability output by the network as the similarity value, which improves accuracy.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a schematic diagram of a TextCNN network structure according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a CBA network structure according to an embodiment of the present invention.
Detailed Description
The intellectual property identity identification method based on the academic talent information knowledge graph comprises the following steps:
1. A crawler acquires talent information data, including names, resumes and intellectual property information, where the intellectual property information includes articles, patents and the like. A talent information knowledge graph is built on neo4j from this data. The knowledge graph is stored as triples $E = \langle sub, rela, obj \rangle$ and contains the attribute information of the entities (talents) and the relationships among the entities. The attribute information of an entity includes the person's basic information, graduating institution, major, employer and intellectual property information, and the relationships among the entities include collaboration relationships and alumni relationships.
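To make the triple form concrete, here is a minimal sketch that keeps a few triples in an in-memory Python list instead of neo4j (a simplification made only for illustration). The relation names, the sample people and the helper functions are all hypothetical.

```python
from collections import namedtuple

# One edge or attribute of the graph, mirroring the <sub, rela, obj> form.
Triple = namedtuple("Triple", ["sub", "rela", "obj"])

# Invented sample data; none of these values come from the patent.
triples = [
    Triple("Zhang Wei", "graduated_from", "Example University"),
    Triple("Zhang Wei", "works_at", "Example Institute"),
    Triple("Zhang Wei", "authored", "A Study of Entity Linking"),
    Triple("Li Na", "graduated_from", "Example University"),
    Triple("Li Na", "authored", "A Study of Entity Linking"),
]


def objects_of(sub, rela):
    """Return every obj linked to `sub` through relation `rela`."""
    return [t.obj for t in triples if t.sub == sub and t.rela == rela]


def related_by_shared_object(rela):
    """Pairs of distinct subjects that share an object under `rela`.

    A shared paper suggests a collaboration relationship; a shared
    university suggests an alumni relationship.
    """
    pairs = set()
    for a in triples:
        for b in triples:
            if a.rela == b.rela == rela and a.obj == b.obj and a.sub < b.sub:
                pairs.add((a.sub, b.sub))
    return sorted(pairs)


print(objects_of("Zhang Wei", "authored"))
print(related_by_shared_object("graduated_from"))
```

In a graph database the same information would be stored as nodes and relationships, and the two helpers would become queries.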
2. Mention recognition: because intellectual property information follows a standardized format, the structured formal feature information and the intellectual property content information of the intellectual property information $M$ to be linked can be extracted directly with regular-expression rules, where $M = (M_1, M_2, \dots, M_n)$; the structured formal feature information comprises author information, organization information, partner information and periodical information.
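A rule-based extraction of this kind might look like the sketch below. The record layout (one "Field: value" pair per line), the field names and the choice of the first listed author as the mention are assumptions made for the example; the patent does not specify the actual rules.

```python
import re

# Hypothetical record layout: one "Field: value" pair per line.
record = """Title: A Study of Entity Linking
Authors: Zhang Wei; Li Na
Affiliation: Example Institute
Journal: Journal of Examples
Abstract: We study entity linking over a talent knowledge graph."""

FIELD_RE = re.compile(
    r"^(Title|Authors|Affiliation|Journal|Abstract):\s*(.+)$", re.MULTILINE
)


def parse_record(text):
    """Split a record into structured formal features and free text."""
    fields = {name.lower(): value.strip() for name, value in FIELD_RE.findall(text)}
    authors = [a.strip() for a in fields.get("authors", "").split(";") if a.strip()]
    return {
        "author": authors[0] if authors else None,  # assumed to be the mention
        "partners": authors[1:],                    # remaining co-authors
        "organization": fields.get("affiliation"),
        "journal": fields.get("journal"),
        "text": fields.get("abstract"),             # content used for semantics
    }


m = parse_record(record)
print(m)
```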
3. Candidate entity generation: conventional methods usually obtain candidate entities by querying features directly in the graph, or by matching entities with similarity measures such as edit distance. Earlier results, however, showed that in more than half of the samples the model predicted incorrectly, the error was a type mismatch. This embodiment therefore proposes a type-based candidate entity generation method that yields fewer but more precise candidates, which improves accuracy and reduces the computational workload.
Specifically, to avoid omissions caused by overly fine type divisions, this embodiment divides the intellectual property information corresponding to the entities in the knowledge graph into 4 subject categories, namely liberal arts, science, agriculture and medicine, and builds a subject classification model based on word2vec and TextCNN.
Candidate generation first obtains candidate entities from the mention and then screens them further by subject category to reduce the amount of computation. The specific steps are as follows:
firstly, for the intellectual property information $M$ to be linked, the mention is matched against the knowledge graph by fuzzy query, giving a set of possible entities $D = (D_1, D_2, \dots, D_n)$; this query can be executed directly as a neo4j query statement and is therefore fast;
secondly, the category of the intellectual property information $M$ is obtained with the trained subject classification model, $M_{type} = \mathrm{TextCNN}(M)$;
and finally, each entity in the set $D$ is input into the trained subject classification model to obtain its category, and the final candidate entities are $H = (H_1, H_2, \dots, H_k)$ with $\{H_i \in D,\ H_{i,type} = M_{type},\ i = 1, 2, \dots, k\}$, where $k$ is the number of candidate entities remaining after category filtering.
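The three steps above can be sketched as follows. The keyword-based `classify_stub` is only a stand-in for the trained TextCNN classifier (and covers just three categories), the substring test stands in for the fuzzy graph query, and the sample graph content is invented.

```python
def classify_stub(text):
    """Stand-in for the trained subject classifier: keyword lookup only."""
    keywords = {
        "medicine": ("clinical", "patient"),
        "agriculture": ("crop", "soil"),
    }
    lowered = text.lower()
    for category, words in keywords.items():
        if any(w in lowered for w in words):
            return category
    return "science"


# Hypothetical graph content: entity id -> (name, text of a known work).
graph = {
    "p1": ("Zhang Wei", "Clinical outcomes in patient cohorts"),
    "p2": ("Zhang Wei", "Soil moisture and crop yield"),
    "p3": ("Zhang Weiming", "Patient triage with clinical notes"),
    "p4": ("Li Na", "Clinical trial design"),
}


def generate_candidates(mention, m_text):
    # Step 1: fuzzy match on the mention (plain substring match here) gives D.
    d = [eid for eid, (name, _) in graph.items() if mention in name]
    # Step 2: category of the record to be linked, M_type.
    m_type = classify_stub(m_text)
    # Step 3: keep only the entities in D whose category equals M_type.
    return [eid for eid in d if classify_stub(graph[eid][1]) == m_type]


candidates = generate_candidates("Zhang Wei", "A clinical study of patient recovery")
print(candidates)
```

The category filter removes the namesake whose work falls in a different subject before any similarity is computed, which is the reduction in workload the embodiment describes.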
The word2vec-based TextCNN subject classification model is built in the following specific steps:
firstly, intellectual property text data is collected and labeled with one of 4 categories: liberal arts (0), science (1), agriculture (2) and medicine (3);
secondly, word segmentation and stop-word removal are performed;
thirdly, a word2vec model is trained and used for embedding to obtain n × k-dimensional vectors, and the category labels are converted to one-hot encoding;
and finally, the embedded vectors and the labels are input into a TextCNN network for training to obtain the subject classification model, whose structure comprises convolution and pooling layers, a concatenation layer, a flatten layer, a dropout layer, a fully connected layer and a softmax layer.
Fig. 1 shows a specific structure of the TextCNN network.
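As a rough illustration of how data flows through such a network at inference time, the following pure-Python sketch runs a toy forward pass: convolution with ReLU and max-over-time pooling per kernel, concatenation of the pooled values, a fully connected layer and a softmax over the 4 categories. Dropout is omitted because it is inactive at inference, training is not shown, and all weights and inputs are made-up numbers rather than anything from the patent.

```python
import math


def conv_max_pool(embedded, kernel):
    """Slide an h x k kernel over the n x k input, apply ReLU, max-pool over time."""
    h = len(kernel)
    best = 0.0  # ReLU output is never below 0
    for start in range(len(embedded) - h + 1):
        s = sum(
            kernel[r][c] * embedded[start + r][c]
            for r in range(h)
            for c in range(len(kernel[0]))
        )
        best = max(best, s)
    return best


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def textcnn_forward(embedded, kernels, dense_w, dense_b):
    # Convolution + pooling per kernel, then concatenate the pooled values.
    features = [conv_max_pool(embedded, k) for k in kernels]
    # Fully connected layer: one logit per subject category.
    logits = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(dense_w, dense_b)
    ]
    return softmax(logits)


def one_hot(label, num_classes=4):
    """One-hot encoding of a category label, as used for the training targets."""
    return [1 if i == label else 0 for i in range(num_classes)]


# Toy input: n = 4 tokens, k = 2 embedding dimensions.
embedded = [[0.1, 0.2], [0.4, -0.1], [0.3, 0.5], [-0.2, 0.1]]
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],              # window of 2 tokens
    [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],  # window of 3 tokens
]
dense_w = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5], [0.5, -1.0]]
dense_b = [0.0, 0.0, 0.0, 0.0]

probs = textcnn_forward(embedded, kernels, dense_w, dense_b)
print(probs, one_hot(2))
```

The predicted category is the index of the largest probability, matching the numeric labels 0 to 3 used when the training data is annotated.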
4. Candidate entity ranking: most traditional candidate entity ranking methods are similarity-based. For example, they rank by hand-defined features between the mention and the target entity, or they compute the similarity of the context information directly with a deep learning method. Weighing the respective strengths and weaknesses of the two approaches, this embodiment combines them, incorporates the structural information of the knowledge graph, and ranks entities on two aspects, the formal feature $F_{formal}$ and the semantic feature $F_{semantic}$.
For each candidate entity $H_i$, all of its information can be obtained from the knowledge graph: in addition to the structured formal feature information $(G_{i1}, G_{i2}, \dots, G_{in})$ taken directly from the graph, it includes a semantic feature $G_{i,n+1}$ based on the intellectual property content, so the full information of the candidate entity $H_i$ is $G_i = (G_{i1}, G_{i2}, \dots, G_{in}, G_{i,n+1})$.
The specific candidate entity ordering process is as follows:
1) The weight of each feature, $W_i = (W_{i1}, W_{i2}, \dots, W_{in}, W_{i,n+1})$, is determined with the AHP (Analytic Hierarchy Process) method. Specifically, an application example can first be constructed, an importance judgment matrix for the feature information is obtained by analyzing the example, and the weights are then calculated with the SPSSAU software.
2) Solving the formal feature $F_{formal}$: for each candidate entity $H_i$, all information is obtained from the knowledge graph, the text content of the intellectual property information is removed, and only the structured formal feature information $G_i' = (G_{i1}, G_{i2}, \dots, G_{in})$ is kept; for the intellectual property information $M$ to be linked, the formal feature is determined by computing the degree of matching between $M$ and $G_i'$, which is given by
Figure BDA0003058741150000081
Figure BDA0003058741150000082
where $M_j$ is the $j$-th item of structured formal feature information of the intellectual property information $M$ to be linked; that is, $F_{formal} = S_k$.
3) Solving the semantic feature $F_{semantic}$: the value of the semantic feature $F_{semantic}$ between the intellectual property information $M$ to be linked and the candidate entity $H_i$ is measured by similarity, which is computed with a character-based CBA (char bi_lstm + attention) network. Specifically, the semantic feature $G_{i,n+1}$ of the candidate entity $H_i$ and the text information $M_n$ of the intellectual property information $M$ to be linked are input into the CBA network to obtain the similarity probabilities [y1, y2], and the similarity probability y1 is taken as the value of $F_{semantic}$.
As shown in fig. 2, the character-based CBA network construction process includes the following steps:
first, data is collected and labeled in the form of s1, s2, label, which is 0 if s1 is similar to s2, otherwise it is 1;
secondly, character-level vector representations are obtained with word2vec;
and finally, inputting the data and the label into a CBA network for training to obtain a trained CBA model.
4) Ranking: the final similarity between the candidate entity $H_i$ and the intellectual property information $M$ to be linked is $F_i = F_{semantic} + F_{formal}$, and the entity finally linked is the one corresponding to $\max(F_i)$.
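The final selection can be sketched as below. Because the matching-degree formula survives only as an image reference in this text, `formal_score` uses an assumed form (the sum of the AHP weights of the fields on which the record and the candidate agree exactly); it is not the patent's formula. The field names, weights and semantic scores are likewise invented placeholders.

```python
def formal_score(m_fields, g_fields, weights):
    """Assumed matching degree: total weight of the fields that agree exactly."""
    return sum(
        w
        for key, w in weights.items()
        if m_fields.get(key) is not None and m_fields.get(key) == g_fields.get(key)
    )


def link(m_fields, candidates, weights, semantic_scores):
    """Score every candidate as F_formal + F_semantic and return the best one."""
    scored = {
        eid: formal_score(m_fields, g, weights) + semantic_scores[eid]
        for eid, g in candidates.items()
    }
    best = max(scored, key=scored.get)
    return best, scored


# Hypothetical feature weights, e.g. produced by an AHP calculation.
weights = {"organization": 0.5, "journal": 0.2, "partner": 0.3}
m_fields = {
    "organization": "Example Institute",
    "journal": "Journal of Examples",
    "partner": "Li Na",
}
candidates = {
    "p1": {"organization": "Example Institute", "journal": "Other Journal", "partner": "Li Na"},
    "p3": {"organization": "Another Lab", "journal": "Journal of Examples", "partner": "Wang Fang"},
}
# Placeholder y1 values standing in for the output of the semantic model.
semantic_scores = {"p1": 0.6, "p3": 0.9}

best, scored = link(m_fields, candidates, weights, semantic_scores)
print(best, scored)
```

In this toy case the candidate with the lower semantic score still wins because it matches on organization and partner, which shows how the formal features can override a purely semantic ranking.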
The embodiment has the following technical characteristics:
1. The embodiment draws on the data accumulated in the big data industry and provides an entity linking model that incorporates the structural information of the knowledge graph, improving the convenience, richness and effectiveness of the data.
2. The type-based candidate entity generation method reduces the computational workload, reduces distractors, and improves accuracy.
3. In the candidate entity ranking process, knowledge-graph-based formal features and intellectual-property-based semantic features are combined, and the candidate entities are ranked by similarity using the statistics-based AHP method.
4. An improved Siamese network, CBA, is adopted. It is character-based, adds an attention mechanism, and takes the similarity probability output by the network as the similarity value. It replaces the traditional cosine similarity computed directly on word2vec vectors, which avoids interference from word segmentation, strengthens semantic relevance, and improves accuracy.
The above-described embodiments of the present invention do not limit the scope of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims (6)

1. An intellectual property identity identification method based on an academic talent information knowledge graph, comprising the following steps:
(1) a crawler acquires talent information data, including names, resumes and intellectual property information, and a talent information knowledge graph is built on neo4j from this data; the knowledge graph is stored as triples $E = \langle sub, rela, obj \rangle$ and contains the attribute information of the entities and the relationships among the entities;
(2) mention recognition: for the intellectual property information $M$ to be linked, the structured formal feature information and the text information of the intellectual property are extracted directly from $M$ using regular-expression rules, where $M = (M_1, M_2, \dots, M_n)$ and each $M_i$ is an obtained mention;
(3) candidate entity generation: the intellectual property information corresponding to the entities in the knowledge graph is divided into 4 subject categories, namely liberal arts, science, agriculture and medicine, and a subject classification model based on word2vec and TextCNN is built to determine the category;
the process of candidate entity generation is as follows:
firstly, for the intellectual property information $M$ to be linked, the mention is matched against the knowledge graph by fuzzy query, giving a set of possible entities $D = (D_1, D_2, \dots, D_n)$;
secondly, the category of the intellectual property information $M$ is obtained with the trained subject classification model, $M_{type} = \mathrm{TextCNN}(M)$;
and finally, each entity in the set $D$ is input into the trained subject classification model to obtain its category, and the final candidate entities are $H = (H_1, H_2, \dots, H_k)$ with $\{H_i \in D,\ H_{i,type} = M_{type},\ i = 1, 2, \dots, k\}$, where $k$ is the number of candidate entities remaining after category filtering;
(4) candidate entity ranking: entities are ranked on two aspects, the formal feature $F_{formal}$ and the semantic feature $F_{semantic}$; for each candidate entity $H_i$, all of its information can be obtained from the knowledge graph: in addition to the structured formal feature information $(G_{i1}, G_{i2}, \dots, G_{in})$ taken directly from the graph, it includes a semantic feature $G_{i,n+1}$ based on the intellectual property content, so the full information of the candidate entity $H_i$ is $G_i = (G_{i1}, G_{i2}, \dots, G_{in}, G_{i,n+1})$;
the specific candidate entity ranking process is as follows:
the weight of each feature, $W_i = (W_{i1}, W_{i2}, \dots, W_{in}, W_{i,n+1})$, is determined with the AHP (Analytic Hierarchy Process) method;
solving the formal feature $F_{formal}$: for each candidate entity $H_i$, all information is obtained from the knowledge graph, the text content of the intellectual property information is removed, and only the structured formal feature information $G_i' = (G_{i1}, G_{i2}, \dots, G_{in})$ is kept; for the intellectual property information $M$ to be linked, the formal feature is determined by computing the degree of matching between $M$ and $G_i'$, which is given by
Figure FDA0003058741140000021
where $M_j$ is the $j$-th item of structured formal feature information of the intellectual property information $M$ to be linked; that is, $F_{formal} = S_k$;
solving the semantic feature $F_{semantic}$: the value of the semantic feature $F_{semantic}$ between the intellectual property information $M$ to be linked and the candidate entity $H_i$ is measured by similarity, which is computed with a character-based CBA (char bi-lstm + attention) network; specifically, the semantic feature $G_{i,n+1}$ of the candidate entity $H_i$ and the text information $M_n$ of the intellectual property information $M$ to be linked are each passed through a bi-lstm + attention layer, a softmax layer then outputs the similarity probabilities [y1, y2], and the similarity probability y1 is taken as the value of $F_{semantic}$;
ranking: the final similarity between the candidate entity $H_i$ and the intellectual property information $M$ to be linked is $F_i = F_{semantic} + F_{formal}$, and the entity finally linked is the one corresponding to $\max(F_i)$.
2. The intellectual property identity recognition method of claim 1, wherein: the intellectual property information includes information on articles and patents; the attribute information of an entity includes the person's basic information, graduating institution, major, employer and intellectual property information; the relationships among the entities include collaboration relationships and alumni relationships.
3. The intellectual property identity recognition method of claim 1, wherein: the structural form characteristic information comprises author information, organization information, partner information and periodical information.
4. The intellectual property identity recognition method of claim 1, wherein: in the step (3), the construction of the TextCNN subject classification model based on word2vec comprises the following specific steps:
firstly, collecting and labeling intellectual property text data, wherein the labels fall into 4 categories: liberal arts (0), science (1), agriculture (2) and medicine (3);
secondly, performing word segmentation and removing stop words;
thirdly, training a word2vec model, performing embedding to obtain n×k-dimensional vectors, and converting the class labels into one-hot encoded form;
and finally, inputting the embedded vectors and the labels into a TextCNN network for training to obtain the subject classification model, wherein the subject classification model structure comprises a convolution and pooling layer, a concatenation layer, a flatten layer, a dropout layer, a fully connected layer and a softmax layer.
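The layer sequence named in this claim can be sketched as a NumPy forward pass. This is only an illustration under stated assumptions: the kernel widths (2, 3, 4), the number of filters, the ReLU activation and the random weights are all hypothetical, training is omitted, and dropout is left out because it is inactive at inference time.

```python
import numpy as np

NUM_CLASSES = 4  # the four labelled subject categories, ids 0..3


def one_hot(label: int, num_classes: int = NUM_CLASSES) -> np.ndarray:
    """Convert a class id into a one-hot vector."""
    vec = np.zeros(num_classes)
    vec[label] = 1.0
    return vec


def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()


def textcnn_forward(x, kernels, w_out, b_out):
    """x: (n, k) embedded text; each kernel: (num_filters, h, k)."""
    n = x.shape[0]
    pooled = []
    for kernel in kernels:
        h = kernel.shape[1]
        # All sliding windows of h consecutive word vectors: (n-h+1, h, k)
        windows = np.stack([x[i:i + h] for i in range(n - h + 1)])
        # Convolution as a dot product of each window with each filter
        conv = np.tensordot(windows, kernel, axes=([1, 2], [1, 2]))
        conv = np.maximum(conv, 0.0)      # ReLU (assumed activation)
        pooled.append(conv.max(axis=0))   # max-over-time pooling
    features = np.concatenate(pooled).ravel()  # concatenation + flatten
    return softmax(w_out @ features + b_out)   # fully connected + softmax


rng = np.random.default_rng(0)
n, k, num_filters = 10, 8, 6              # hypothetical sizes
x = rng.normal(size=(n, k))               # stands in for word2vec output
kernels = [rng.normal(size=(num_filters, h, k)) for h in (2, 3, 4)]
w_out = rng.normal(size=(NUM_CLASSES, num_filters * len(kernels)))
b_out = np.zeros(NUM_CLASSES)

probs = textcnn_forward(x, kernels, w_out, b_out)
print(probs.shape, one_hot(2))
```

In a trained model the kernels and output weights would be learned from the labelled data; here they are random, so the printed probabilities carry no meaning beyond their shape.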
5. The intellectual property identity recognition method of claim 1, wherein: in the step (4), the AHP (analytic hierarchy process) solving process first constructs application examples through an expert system, obtains an importance judgment matrix of the characteristic information through analysis of the examples, and then calculates the weights using the SPSSAU software.
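The claim delegates the weight calculation to SPSSAU. As a rough illustration of what an AHP weight calculation involves, the sketch below uses the standard principal-eigenvector method with Saaty's consistency ratio in NumPy. The 4×4 judgment matrix is hypothetical (it is not from the patent), and the function name `ahp_weights` is introduced here only for the example.

```python
import numpy as np

# Saaty's random consistency index for matrix orders 1..5
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(judgment: np.ndarray):
    """Return (weights, consistency_ratio) for a pairwise judgment matrix."""
    n = judgment.shape[0]
    eigvals, eigvecs = np.linalg.eig(judgment)
    idx = int(np.argmax(eigvals.real))           # principal eigenvalue
    principal = np.abs(eigvecs[:, idx].real)     # fix the arbitrary sign
    weights = principal / principal.sum()        # normalise to sum to 1
    lambda_max = eigvals[idx].real
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0  # consistency index
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0              # consistency ratio
    return weights, cr


# Hypothetical judgments for four features: entry [i, j] states how much
# more important feature i is than feature j on Saaty's 1-9 scale.
judgment = np.array([
    [1.0, 2.0, 3.0, 5.0],
    [1 / 2, 1.0, 2.0, 3.0],
    [1 / 3, 1 / 2, 1.0, 2.0],
    [1 / 5, 1 / 3, 1 / 2, 1.0],
])
weights, cr = ahp_weights(judgment)
print(np.round(weights, 3), round(cr, 4))
```

A consistency ratio below 0.1 is the usual threshold for accepting a judgment matrix; a larger value suggests the pairwise judgments should be revisited.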
6. The intellectual property identity recognition method of claim 1, wherein: the character-based CBA network similarity solving method in the step (4) comprises the following specific steps:
first, data is collected and labeled in the form (s1, s2, label), where label is 0 if s1 is similar to s2 and 1 otherwise;
secondly, obtaining character-based word2vec vector representations;
thirdly, inputting the data into a CBA network for training to obtain a trained CBA model;
and finally, inputting the candidate entity Hi to be identified and the intellectual property information M to be linked into the trained CBA model, and acquiring an output result [ y1, y2] of the softmax layer, wherein y1 is the similarity.
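The data format and the reading of y1 from the softmax output can be sketched in plain Python. The bi-lstm + attention network itself is not implemented here; the sample pairs, the helper names and the mapping of unseen characters to the padding id are assumptions made only for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical labelled pairs (s1, s2, label): 0 = similar, 1 = not similar.
pairs: List[Tuple[str, str, int]] = [
    ("knowledge graph entity linking", "entity linking with knowledge graphs", 0),
    ("knowledge graph entity linking", "wheat yield under drought stress", 1),
]


def build_char_vocab(samples: Sequence[Tuple[str, str, int]]) -> Dict[str, int]:
    """Assign an integer id to every character; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for s1, s2, _ in samples:
        for ch in s1 + s2:
            vocab.setdefault(ch, len(vocab))
    return vocab


def encode(text: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Character-level encoding, truncated or padded to max_len."""
    ids = [vocab.get(ch, 0) for ch in text[:max_len]]  # unseen chars -> 0
    return ids + [0] * (max_len - len(ids))


def softmax(logits: Sequence[float]) -> List[float]:
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def similarity_from_logits(logits: Sequence[float]) -> float:
    """Read y1 (the probability of class 0, 'similar') from [y1, y2]."""
    y1, _y2 = softmax(logits)
    return y1


vocab = build_char_vocab(pairs)
print(encode("graph", vocab, 8))
print(round(similarity_from_logits([2.0, -1.0]), 4))
```

Because similar pairs are labelled 0, the first softmax output corresponds to the "similar" class, which is why y1 rather than y2 serves as the similarity.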
CN202110506792.XA 2021-05-10 2021-05-10 Intellectual property identity identification method based on academic talent information intellectual map Withdrawn CN113220899A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110506792.XA CN113220899A (en) 2021-05-10 2021-05-10 Intellectual property identity identification method based on academic talent information intellectual map

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110506792.XA CN113220899A (en) 2021-05-10 2021-05-10 Intellectual property identity identification method based on academic talent information intellectual map

Publications (1)

Publication Number Publication Date
CN113220899A true CN113220899A (en) 2021-08-06

Family

ID=77094162

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110506792.XA Withdrawn CN113220899A (en) 2021-05-10 2021-05-10 Intellectual property identity identification method based on academic talent information intellectual map

Country Status (1)

Country Link
CN (1) CN113220899A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114913514A (en) * 2021-12-23 2022-08-16 号百信息服务有限公司 Intelligent abnormal vehicle moving identification system
CN115170353A (en) * 2022-07-12 2022-10-11 朗动信息咨询(上海)有限公司 Intellectual property achievement transformation analysis and evaluation system based on big data processing

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108268643A (en) * 2018-01-22 2018-07-10 北京邮电大学 A kind of Deep Semantics matching entities link method based on more granularity LSTM networks
CN108920556A (en) * 2018-06-20 2018-11-30 华东师范大学 Recommendation expert method based on subject knowledge map
CN110990590A (en) * 2019-12-20 2020-04-10 北京大学 Dynamic financial knowledge map construction method based on reinforcement learning and transfer learning
CN112131404A (en) * 2020-09-19 2020-12-25 哈尔滨工程大学 Entity alignment method in four-risk one-gold domain knowledge graph
CN112131275A (en) * 2020-09-23 2020-12-25 中国科学技术大学智慧城市研究院(芜湖) Enterprise portrait construction method of holographic city big data model and knowledge graph
CN112330183A (en) * 2020-11-18 2021-02-05 布瑞克农业大数据科技集团有限公司 Method and system for constructing big data portrait of agricultural enterprise
CN112380865A (en) * 2020-11-10 2021-02-19 北京小米松果电子有限公司 Method, device and storage medium for identifying entity in text

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108268643A (en) * 2018-01-22 2018-07-10 北京邮电大学 A kind of Deep Semantics matching entities link method based on more granularity LSTM networks
CN108920556A (en) * 2018-06-20 2018-11-30 华东师范大学 Recommendation expert method based on subject knowledge map
CN110990590A (en) * 2019-12-20 2020-04-10 北京大学 Dynamic financial knowledge map construction method based on reinforcement learning and transfer learning
CN112131404A (en) * 2020-09-19 2020-12-25 哈尔滨工程大学 Entity alignment method in four-risk one-gold domain knowledge graph
CN112131275A (en) * 2020-09-23 2020-12-25 中国科学技术大学智慧城市研究院(芜湖) Enterprise portrait construction method of holographic city big data model and knowledge graph
CN112380865A (en) * 2020-11-10 2021-02-19 北京小米松果电子有限公司 Method, device and storage medium for identifying entity in text
CN112330183A (en) * 2020-11-18 2021-02-05 布瑞克农业大数据科技集团有限公司 Method and system for constructing big data portrait of agricultural enterprise

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
罗安根: "《融合知识图谱的实体链接的算法研究》", 《中国优秀博硕士学位论文全文数据库(硕士) 信息科技辑》 *

Similar Documents

Publication Publication Date Title
CN106776711B (en) Chinese medical knowledge map construction method based on deep learning
WO2018196561A1 (en) Label information generating method and device for application and storage medium
CN111813950B (en) Building field knowledge graph construction method based on neural network self-adaptive optimization tuning
CN110990590A (en) Dynamic financial knowledge map construction method based on reinforcement learning and transfer learning
CN110134757A (en) A kind of event argument roles abstracting method based on bull attention mechanism
CN112733866B (en) Network construction method for improving text description correctness of controllable image
CN113515632B (en) Text classification method based on graph path knowledge extraction
CN111858940B (en) Multi-head attention-based legal case similarity calculation method and system
CN111159485B (en) Tail entity linking method, device, server and storage medium
CN112069408A (en) Recommendation system and method for fusion relation extraction
CN111061939B (en) Scientific research academic news keyword matching recommendation method based on deep learning
CN113806563A (en) Architect knowledge graph construction method for multi-source heterogeneous building humanistic historical material
CN111858896B (en) Knowledge base question-answering method based on deep learning
CN115438674B (en) Entity data processing method, entity linking method, entity data processing device, entity linking device and computer equipment
CN113051914A (en) Enterprise hidden label extraction method and device based on multi-feature dynamic portrait
CN113220899A (en) Intellectual property identity identification method based on academic talent information intellectual map
CN117236338B (en) Named entity recognition model of dense entity text and training method thereof
CN116975256B (en) Method and system for processing multisource information in construction process of underground factory building of pumped storage power station
CN112256904A (en) Image retrieval method based on visual description sentences
CN107391565A (en) A kind of across language hierarchy taxonomic hierarchies matching process based on topic model
CN114997288A (en) Design resource association method
CN114238653A (en) Method for establishing, complementing and intelligently asking and answering knowledge graph of programming education
CN116108191A (en) Deep learning model recommendation method based on knowledge graph
CN117371481A (en) Neural network model retrieval method based on meta learning
CN112417322A (en) Type discrimination method and system for interest point name text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210806