CN110362692A - Method for constructing an academic circle based on a knowledge graph - Google Patents
Method for constructing an academic circle based on a knowledge graph
- Publication number
- CN110362692A (application CN201910668329.8A)
- Authority
- CN
- China
- Prior art keywords
- entity
- author
- academic
- paper
- periodical
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Abstract
The invention discloses a method for constructing an academic circle based on a knowledge graph, comprising the following steps. Step 1, obtain all academic paper information and all academic journal information as the original data source. Step 2, extract information on three entity types (author, paper, and journal) from the original data source to form an entity data set. Step 3, for author entities in the entity data set that share the same name, perform name disambiguation based on their mutual similarity. Step 4, store the disambiguated entity data set in a Neo4j graph database to form entity nodes, then establish relationship edges between different entity nodes based on the attribute features they share, finally obtaining the knowledge-graph-based academic circle. The academic circle constructed by the invention carries logical relationships and has high data accuracy, which helps users quickly and effectively obtain the knowledge they need and the logical relationships within it.
Description
Technical field
The present invention relates to the technical field of academic social networks, and in particular to a method for constructing an academic circle based on a knowledge graph.
Background art
With the development of computer network technology, academic social networks have rapidly become platform-based and networked, providing scholars with good platforms for academic exchange. Well-known academic social networks at home and abroad currently include ResearchGate, Academia, ScienceNet, and Xiaomuchong. As academic activity and scientific research develop, new scholars and researchers join every day, so the number of scholars grows sharply and the types of scholar users diversify. A good academic social network therefore becomes an important platform on which scholars in every field discuss scientific results and exchange ideas. On academic social networks, researchers can collaborate, take part in peer review, share their research, and even share data. As a result, these networks are favored by a large number of scholars, especially young ones. It can be said that academic social networks are gradually changing the way research is done. Their research value has attracted close attention from scholars. Researchers have studied academic social networks extensively and found that they play a positive role in promoting scientific exchange and cooperation and in supporting altmetrics.
As early as 2000, academia abroad attempted to set up professional social networks specifically for researchers, such as SciLinks, Scientist Solutions, and Nature Network, which provided basic services for researchers to communicate online. As public social networks continued to develop, well-known social networking sites such as Facebook and Twitter also began trying to build academic exchange platforms for researchers, but some scholars questioned how professional their academic services were. Not until 2008 did online academic exchange platforms represented by ResearchGate and Mendeley appear abroad. They combine the ideas of open-access journals and social networks: they help researchers find scholars in the same field and provide online services for them, and they also give researchers a channel for obtaining large amounts of valuable knowledge resources. Subsequently, a number of websites with similar functions appeared in China, representative examples being Scholar Net, Phegda academic circle, Baidu Scholar, ScienceNet, and CNKI Scholar Circle. These websites, dedicated to promoting academic exchange and cooperation, have driven the rise and development of academic social networks. An academic social network is a service or platform that aims to promote the exchange and diffusion of knowledge, helps researchers establish and maintain their networks of personal relationships, and supports them in the activities they carry out during research.
However, academic social networks currently have the following problem: although existing academic social networks offer their users a good platform for cooperation, very few cooperative relationships are actually established on them. The reason is that existing academic social networks provide scholars with many groups, and scholars join groups on different disciplines and topics according to their own professional background and interests. Most groups therefore consist of members from different disciplinary backgrounds, so the groups overlap noticeably, existing academic information data is stored in a scattered way, and academic circle data built on such scattered data is inaccurate.
Summary of the invention
To address the problem that existing academic information data is stored in a scattered way and that the academic circle data built on it is inaccurate, the present invention proposes a method for constructing an academic circle based on a knowledge graph, which can improve the accuracy of academic circle data.
To achieve the above technical purpose, the present invention adopts the following technical solution:
A method for constructing an academic circle based on a knowledge graph, comprising the following steps:
Step 1, obtain all academic paper information and all academic journal information as the original data source;
Step 2, extract entity information of preselected entity types from the original data source to form an entity data set; the preselected entity types include author, paper, and journal;
Step 3, for author entities in the entity data set that share the same name, perform name disambiguation based on their mutual similarity;
Step 4, store the entity data set obtained after name disambiguation in a Neo4j graph database to form entity nodes; based on the attribute features shared between different entities, establish relationship edges between different entity nodes, finally obtaining the knowledge-graph-based academic circle.
The present invention extracts entities of three types (author, paper, and journal) from the original data source and builds entity nodes in a Neo4j graph database; it then uses the attribute features shared between different entities to establish relationship edges between the entity nodes in the Neo4j graph database, yielding the knowledge-graph-based academic circle. In effect, the three types of entities (paper, author, and journal) and the relationships between them are connected in a single relationship network that forms an interrelated academic circle. Through this academic circle with logical relationships, users can quickly and effectively obtain the knowledge they need and the logical relationships within it, gain a comprehensive understanding of information in related fields, find potential collaborators accurately and effectively, and obtain decision support for tasks such as selecting experts for science and technology evaluation.
Meanwhile, extracting entities amounts to removing invalid information from the original data source and retaining the valid information used to establish entities of each type. This improves the validity of the entity data and hence the accuracy of the constructed academic circle data.
Moreover, disambiguating same-name author entities in the entity data set further improves the accuracy of the entity data, and hence the accuracy of the academic circle data.
Further, the detailed process of step 3 is as follows:
Step 3.1, represent each author entity as a feature vector composed of its attribute values;
Step 3.2, take all author entities that share the same name, calculate the similarity between every pair of them, compare the similarities with a similarity threshold, take the maximum similarity value that exceeds the threshold, and cluster the two same-name author entities corresponding to that maximum value into one cluster, obtaining an author entity set;
The similarity between any two same-name author entities is calculated as follows:
[formula not reproduced in the source text]
where $S_{ij}$ denotes the similarity between two same-name author entities $a_i$ and $a_j$, and $sim_{attr}(\cdot)$ denotes the similarity calculation function;
Step 3.3, take any other author entity that has the same name as the author entity set obtained in the previous step; if its similarity to any author entity in the set is greater than the similarity threshold, add it to that author entity set;
Step 3.4, process the remaining same-name author entities again according to steps 3.2 to 3.3 until every same-name author entity has been matched to a corresponding author entity set;
Step 3.5, merge all author entities in the same author entity set into a single author entity and assign an author id to it; the author ids of author entities from different author entity sets are set to differ from one another.
Further, the author entity is represented as a feature vector composed of the following attribute values: author name, research field, affiliation, and co-authors.
Further, the academic paper information is collected from the Web of Science bibliographic database using crawler technology, the academic journal information is collected from the LetPub website using crawler technology, and the academic paper information and the academic journal information are stored in separate files of the same csv format.
Crawler technology is used to obtain academic paper information and academic journal information that are widely distributed and weakly associated; entities are then constructed and entity relationships established based on shared attributes. This simplifies the data architecture of the academic circle and makes the academic circle more usable.
Further, the detailed process in step 2 of extracting entity information of the preselected entity types from the original data source to form the entity data set is as follows:
Step 2.1, import the original data source into a database;
Step 2.2, extract data from the original data source:
In the database, extract the following data from the academic paper information of the original data source: paper title, paper keywords, research field, author, time, journal name, journal id; and extract the following data from the academic journal information of the original data source: journal name, journal id, impact factor, partition;
Step 2.3, extract all paper entities, author entities, and journal entities from the data extracted in step 2.2 to form the entity data set;
The resulting paper entity has the attributes: paper title, paper id, author, time, journal name, journal id; the resulting author entity has the attributes: author name, co-authors, research field, affiliation; the resulting journal entity has the attributes: journal name, journal id, impact factor, partition. The co-authors are obtained by extracting the corresponding author and the first author of the paper when author entities are extracted from the academic paper information;
Each attribute of each entity is saved as a triple: (entity, attribute name, attribute value).
Further, the detailed process of step 4 is as follows:
Step 4.1, export all entities in the entity data set from the database as csv files and import them into the Neo4j graph database, where the entity corresponding to each id forms one entity node;
Step 4.2, use the attribute features shared between different entities to extract the relationships between them: the relationship between different author entities is a cooperation relationship, the relationship between an author entity and a paper entity is a publication relationship, and the relationship between a journal entity and a paper entity is an inclusion relationship;
Step 4.3, in the Neo4j graph database, connect the related entity nodes with edges of the corresponding relationship type.
Beneficial effect
This solution extracts entities of three types (author, paper, and journal) from the original data source and builds entity nodes in a Neo4j graph database; it then uses the attribute features shared between different entities to establish relationship edges between the entity nodes in the Neo4j graph database, yielding the knowledge-graph-based academic circle. In effect, the three types of entities (paper, author, and journal) and the relationships between them are connected in a single relationship network that forms an interrelated academic circle. Through this academic circle with logical relationships, users can quickly and effectively obtain the knowledge they need and the logical relationships within it, gain a comprehensive understanding of information in related fields, find potential collaborators accurately and effectively, and obtain decision support for tasks such as selecting experts for science and technology evaluation.
Meanwhile, extracting entities amounts to removing invalid information from the original data source and retaining the valid information used to establish entities of each type. This improves the validity of the entity data and hence the accuracy of the constructed academic circle data.
Moreover, disambiguating same-name author entities in the entity data set further improves the accuracy of the entity data, and hence the accuracy of the academic circle data.
Brief description of the drawings
Fig. 1 is the flow chart of the method of the invention.
Detailed description of the embodiments
The embodiments of the present invention are described in detail below. The present embodiment is developed on the basis of the technical solution of the invention, gives a detailed implementation and specific operating procedure, and further explains the technical solution of the invention.
The method for constructing an academic circle based on a knowledge graph provided by the invention extracts data, extracts entities, and establishes entity relationships, connecting the three types of entities (paper, author, and journal) and the relationships between them in a single relationship network that forms an interrelated academic circle. Through this academic circle with logical relationships, users can quickly and effectively obtain the knowledge they need and the logical relationships within it, and gain a comprehensive understanding of information in related fields.
As shown in Fig. 1, the knowledge-graph-based academic circle construction method of the invention comprises the following steps:
Step 1, obtain all academic paper information and all academic journal information as the original data source;
To ensure the authenticity of the data, the present embodiment uses crawler technology to crawl the Web of Science bibliographic database for academic paper information and the LetPub website for academic journal information, and stores the academic paper information and the academic journal information in separate Excel tables.
The academic paper information includes the paper title, authors, journal name, research field, and so on. When the paper txt file is read while crawling academic paper information, processing continues if the file is read without problems, and the paper file is read again if anything was missed. Because the Web of Science bibliographic database only supports downloading 500 records at a time, the download must be repeated once for every 500 records. Each time, clicking export yields a list of academic paper information as a csv file. The crawled data is written to csv files, the key fields are selected using tab delimiters, and the time field is separated out and refreshed. The crawled data is then placed in an Excel table in which each row holds the information for one academic paper. The information in each field is analyzed, and any column whose fields hold multiple values is split, producing the final Excel file for the academic paper information.
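The 500-record export limit described above implies a simple batching loop. The following is a minimal sketch of how the record ranges could be computed; the function name `export_batches` and the example record count are hypothetical and are not part of any Web of Science interface.

```python
def export_batches(total_records, batch_size=500):
    """Yield (first, last) 1-based record ranges, each covering at most batch_size records."""
    for start in range(1, total_records + 1, batch_size):
        yield start, min(start + batch_size - 1, total_records)


# Example: a hypothetical result set of 1,234 records needs three export requests.
ranges = list(export_batches(1234))
```

Each range would correspond to one export request, whose csv output is appended to the paper information file.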
The academic journal information includes the journal name, impact factor, partition, and so on, where the impact factor and partition are indicators of a journal's standing. The method for crawling and storing academic journal information is the same as for academic paper information and is not repeated here.
Step 2, extract entity information of the preselected entity types from the original data source to form an entity data set;
In an original data source with a huge volume of data, much of the information has no practical value. Building the academic circle from it directly would not only make construction laborious, but the cluttered data would also hinder use of the academic circle. The invention therefore preprocesses and cleans the data in a targeted way, discarding unneeded data and keeping the important data. For example, data such as document type, language, and special issue is removed, while useful information such as author name, research field, and paper keywords is kept.
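The cleaning step can be illustrated with a short sketch that keeps only the useful columns of an exported csv file. The column names below are hypothetical (the real export headers may differ), and dropping rows that lack a title or author is an added assumption rather than something the source specifies.

```python
import csv
import io

# Hypothetical column names; the real export headers may differ.
KEEP = ["title", "authors", "research_area", "keywords", "journal", "year"]


def clean_rows(csv_text):
    """Keep only the useful columns; drop rows without a title or an author (assumption)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    cleaned = []
    for row in reader:
        kept = {k: (row.get(k) or "").strip() for k in KEEP}
        if kept["title"] and kept["authors"]:
            cleaned.append(kept)
    return cleaned


sample = (
    "title,authors,research_area,keywords,journal,year,doc_type,language\n"
    "Graph Mining,Zhang San; Li Si,Computer Science,graph; mining,J. Data,2018,Article,English\n"
    ",Wang Wu,Physics,,J. Phys,2017,Article,English\n"
)
rows = clean_rows(sample)
```

Here the `doc_type` and `language` columns are discarded, mirroring the removal of document type and language described above.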
Step 2.1, use database management software to import the academic paper information and academic journal information from the Excel tables into a database;
Step 2.2, extract data from the original data source:
In the database, extract the following data from the academic paper information of the original data source: paper title, paper id, research field, author, author affiliation, time, journal name, journal id; and extract the following data from the academic journal information of the original data source: journal name, journal id, impact factor, partition;
Step 2.3, extract all paper entities, author entities, and journal entities from the data extracted in step 2.2 to form the entity data set;
The resulting paper entity has the attributes: paper title, paper id, author, time, journal name, journal id; the resulting author entity has the attributes: author name, co-authors, research field, affiliation; the resulting journal entity has the attributes: journal name, journal id, impact factor, partition. The co-authors are obtained by extracting the corresponding author and the first author of the paper when author entities are extracted from the academic paper information;
Each attribute of each entity is saved as a triple: entity-attribute name-attribute value. For example, Zhang San-affiliation-Central South University forms one sample (entity, attribute name, attribute value) triple.
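The triple form can be sketched as follows. The `Triple` type, the `to_triples` helper, the entity identifier `author:1`, and the attribute names are all hypothetical and serve only to illustrate the (entity, attribute name, attribute value) layout; splitting list-valued attributes into one triple per value is an assumption.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    entity: str
    attribute: str
    value: str


def to_triples(entity_id, attributes):
    """Flatten one entity's attribute dict into (entity, attribute, value) triples."""
    triples = []
    for name, value in attributes.items():
        # Assumption: a list-valued attribute yields one triple per value.
        values = value if isinstance(value, list) else [value]
        triples.extend(Triple(entity_id, name, str(v)) for v in values)
    return triples


author = {
    "name": "Zhang San",
    "affiliation": "Central South University",
    "co_authors": ["Li Si", "Wang Wu"],
}
triples = to_triples("author:1", author)
```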
Step 3, for author entities in the entity data set that share the same name, perform name disambiguation based on their mutual similarity and assign author ids;
The invention implements author disambiguation by converting it into a clustering problem.
Step 3.1, represent each author entity as a feature vector composed of author name, research field, affiliation, and co-authors;
The Word2Vec tool is used to train each of these four attribute features of the author entity (author name, research field, affiliation, and co-authors) into a word vector, each word vector is normalized to a decimal in the interval (0, 1), and the four normalized decimals together form the feature vector that represents the author entity;
Step 3.2, take all author entities that share the same name, calculate the similarity between every pair of them, compare the similarities with a similarity threshold, take the maximum similarity value that exceeds the threshold, and cluster the two same-name author entities corresponding to that maximum value into one cluster, obtaining an author entity set;
The similarity between any two same-name author entities is calculated as follows:
[formula not reproduced in the source text]
where $S_{ij}$ denotes the similarity between two same-name author entities $a_i$ and $a_j$, and $sim_{attr}(\cdot)$ denotes the similarity calculation function;
Step 3.3, take any other author entity that has the same name as the author entity set; if its similarity to any author entity in the set is greater than the similarity threshold, add it to the author entity set;
Step 3.4, process the remaining same-name author entities according to steps 3.2 to 3.3 until every same-name author entity has been matched to a corresponding author entity set;
Step 3.5, merge all author entities in the same author entity set into a single author entity and assign an author id to it; the author ids of author entities from different author entity sets are set to differ from one another.
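Steps 3.2 to 3.5 can be read as a greedy clustering loop, sketched below under several assumptions: the similarity function is passed in as a parameter because the source formula is not available, the toy `jaccard` function merely stands in for it, records that match nothing are treated as single-member clusters, and step 3.3 is repeated until no further record can be absorbed. None of these choices is confirmed by the source.

```python
from itertools import combinations


def cluster_same_name(entities, sim, threshold):
    """Greedy clustering of same-name author records; returns {record index: author id}."""
    remaining = list(range(len(entities)))
    clusters = []
    while remaining:
        # Step 3.2: find the most similar pair whose similarity exceeds the threshold.
        best = None
        for i, j in combinations(remaining, 2):
            s = sim(entities[i], entities[j])
            if s > threshold and (best is None or s > best[0]):
                best = (s, i, j)
        if best is None:
            # Assumption: records with no match each form their own cluster.
            clusters.extend([i] for i in remaining)
            break
        cluster = [best[1], best[2]]
        remaining = [k for k in remaining if k not in cluster]
        # Step 3.3: absorb any record similar to some member of the cluster.
        changed = True
        while changed:
            changed = False
            for k in list(remaining):
                if any(sim(entities[k], entities[m]) > threshold for m in cluster):
                    cluster.append(k)
                    remaining.remove(k)
                    changed = True
        clusters.append(cluster)  # Step 3.4: loop again on what is left.
    # Step 3.5: one distinct author id per cluster.
    return {idx: author_id for author_id, c in enumerate(clusters) for idx in c}


def jaccard(a, b):
    """Toy similarity on attribute-value sets; a stand-in, not the source's function."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


records = [
    {"CSU", "graphs", "Li Si"},
    {"CSU", "graphs", "Wang Wu"},
    {"PKU", "optics", "Zhao Liu"},
]
ids = cluster_same_name(records, jaccard, threshold=0.4)
```

In this toy run the first two records share an affiliation and a research field, so they receive the same author id, while the third record gets its own.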
In particular, for the similarity between two same-name author entity sets, the invention defines the similarity function as follows: take one author entity from each of the two sets, calculate the similarity of every such pair, and use the maximum similarity value as the similarity between the two same-name author entity sets. Expressed as a formula:
$$S_{pq} = \max_{a_i \in c_p,\ a_j \in c_q} S_{ij}$$
where $S_{pq}$ denotes the similarity between two same-name author entity sets $c_p$ and $c_q$, and $a_i$ and $a_j$ denote author entities in $c_p$ and $c_q$, respectively.
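The set-level similarity described above is the maximum over all cross-set pairs. A minimal sketch follows; the pairwise function `closeness` is a toy stand-in on plain numbers, used only to exercise `set_similarity`, and is not the similarity function of the source.

```python
def set_similarity(cluster_p, cluster_q, sim):
    """Maximum pairwise similarity between members of two author entity sets."""
    return max(sim(a, b) for a in cluster_p for b in cluster_q)


def closeness(a, b):
    """Toy pairwise similarity on numbers (an assumption for illustration)."""
    return 1.0 / (1.0 + abs(a - b))


s_pq = set_similarity([1.0, 4.0], [4.5, 9.0], closeness)
```

The closest cross-set pair here is (4.0, 4.5), so that pair alone determines the similarity of the two sets.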
Step 4, construct the knowledge-graph-based academic circle;
Store the entity data set obtained after name disambiguation in the Neo4j graph database to form entity nodes; based on the attribute features shared between different entities, establish relationship edges between different entity nodes, finally obtaining the knowledge-graph-based academic circle. Specifically:
Step 4.1, export all entities in the entity data set from the database as csv files and import them into the Neo4j graph database, where the entity corresponding to each id forms one entity node.
Neo4j is a high-performance NoSQL graph database that stores structured data in a network (graph) rather than in tables. It is an embedded, disk-based Java persistence engine with full transactional properties. Neo4j can also be regarded as a high-performance graph engine that has all the features of a mature database. Neo4j can be used to visualize the academic circle and thereby build its knowledge graph, and relationships between entities can be established very conveniently with it.
The step of importing the csv files into the Neo4j graph database specifically uses Neo4j's built-in CREATE statement to import the entity data in the csv files into the Neo4j graph database, so that each entity forms an entity node.
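As a rough illustration of the import step, the sketch below builds a Cypher `LOAD CSV ... CREATE` statement as a string. The label `Author`, the file name `authors.csv`, and the field names are hypothetical. The source only states that Neo4j's built-in CREATE statement is used, so pairing it with `LOAD CSV` is an assumption. Executing the statement would require a Neo4j client, which is not shown.

```python
def load_csv_statement(label, filename, fields):
    """Build a Cypher statement that creates one node per csv row (names are hypothetical)."""
    props = ", ".join(f"{f}: row.{f}" for f in fields)
    return (
        f"LOAD CSV WITH HEADERS FROM 'file:///{filename}' AS row "
        f"CREATE (:{label} {{{props}}})"
    )


statement = load_csv_statement("Author", "authors.csv", ["author_id", "name"])
```

One such statement per entity type (author, paper, journal) would produce the entity nodes before any relationship edges are added.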
Step 4.2, use the attribute features shared between different entities to extract the relationships between them: the relationship between different author entities is a cooperation relationship, the relationship between an author entity and a paper entity is a publication relationship, and the relationship between a journal entity and a paper entity is an inclusion relationship.
Several authors of the same paper are in a cooperation relationship; a paper included by a journal is in an inclusion relationship with that journal (the journal includes the paper, and the paper is included by the journal); a paper and its authors are in a publication relationship. For example, the attributes of a paper entity include the journal name and journal id, so this attribute can be used to construct the inclusion relationship between the paper entity and the corresponding journal entity. Specific entity relationships can be created with the WHERE clause in Neo4j.
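The journal-paper example above can be sketched as a Cypher statement that matches two node types on a shared attribute in a WHERE clause and then creates the edge. The helper name, the labels `Journal` and `Paper`, the property `journal_id`, and the relationship type `INCLUDES` are hypothetical choices for illustration, and running the statement would require a Neo4j client that is not shown.

```python
def relate_on_shared_attribute(src_label, dst_label, attribute, rel_type):
    """Build a Cypher statement linking nodes whose shared attribute values are equal."""
    return (
        f"MATCH (a:{src_label}), (b:{dst_label}) "
        f"WHERE a.{attribute} = b.{attribute} "
        f"CREATE (a)-[:{rel_type}]->(b)"
    )


includes = relate_on_shared_attribute("Journal", "Paper", "journal_id", "INCLUDES")
```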
Step 4.3, in the Neo4j graph database, connect the related entity nodes with edges of the corresponding relationship type.
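To make the three relationship types concrete, the following sketch derives the edges in plain Python from paper records before they would be written to the graph. The record layout and the relationship names `PUBLISHED`, `COOPERATES_WITH`, and `INCLUDES` are hypothetical, and treating cooperation as one edge per unordered author pair is an assumption.

```python
from itertools import combinations


def derive_edges(papers):
    """Return (source, relation, target) edges for the three relationship types."""
    edges = set()
    for paper in papers:
        for author in paper["authors"]:
            edges.add((author, "PUBLISHED", paper["paper_id"]))
        # Assumption: one cooperation edge per unordered pair of authors on a paper.
        for a, b in combinations(sorted(paper["authors"]), 2):
            edges.add((a, "COOPERATES_WITH", b))
        edges.add((paper["journal_id"], "INCLUDES", paper["paper_id"]))
    return edges


papers = [{"paper_id": "p1", "authors": ["a2", "a1"], "journal_id": "j1"}]
edges = derive_edges(papers)
```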
The above embodiments are preferred embodiments of the present application. Those skilled in the art may make various modifications or improvements on this basis, and such modifications or improvements fall within the scope of protection claimed by the present application provided they do not depart from its general concept.
Claims (6)
1. a kind of academic circle construction method of knowledge based map, which comprises the following steps:
Step 1, all academic paper information and all academic periodical informations are obtained, and as initial data source;
Step 2, the entity information of pre-selection entity type is extracted from initial data source, constitutes entity data set;The pre-selection is real
Body type includes author, paper and periodical;
Step 3, author's entity of the same name is concentrated to solid data, and disambiguation processing of the same name is carried out based on mutual similarity;
Step 4, the entity data set obtained after disambiguation of the same name processing is stored in Neo4j chart database, forms entity node;
Based on the public attribute feature between different entities, the opening relationships side between different entities node finally obtains knowledge based map
Academic circle.
2. the method according to claim 1, wherein the detailed process of the step 3 are as follows:
Step 3.1, author's entity is expressed as the feature vector being made of its attribute value;
Step 3.2, all author's entities of the same name are taken, by calculating the similarity between any two author's entity of the same name,
And compared with similarity threshold, the maximum similarity value greater than similarity threshold is taken, by two corresponding to maximum similarity value
Author's entity cluster of the same name is cluster, obtains author's entity set;
Wherein, the calculating formula of similarity between any two author of the same name are as follows:
SijIndicate two author's entity a of the same nameiWith author's entity ajBetween similarity, simattr() indicates similarity calculation
Function;
Step 3.3, other any author's entities that the author's entity set obtained with previous step is of the same name are taken, if with author's entity set
Any of similarity between author's entity be greater than similarity threshold, then the work obtained author's entity addition previous step
In person's entity set;
Step 3.4, it by remaining author's entity of the same name, is handled again by step 3.2 to step 3.3, until to all same
Name author's Entities Matching is to corresponding author's entity set;
Step 3.5, all author's entities in same author's entity set are merged into same author's entity, and the work to obtain
Person entity setting up author id;And the author id of author's entity in all different authors entity sets is set as different.
3. The method according to claim 2, wherein the author entity is represented as a feature vector composed of the following attribute values: author name, research field, affiliation, and co-authors.
4. The method according to claim 1, wherein the academic paper information is collected from the Web of Science bibliographic database using crawler technology, the academic journal information is collected from LetPub web pages using crawler technology, and the academic paper information and the academic journal information are stored in separate files of the same csv format.
5. The method according to claim 1, wherein the detailed process in step 2 of extracting the entity information of the preselected entity types from the original data source to constitute the entity data set is:
Step 2.1, the original data source is imported into a database;
Step 2.2, data are extracted from the original data source:
from the academic paper information of the original data source in the database, the following are extracted: paper title, paper keywords, research field, author, time, journal name, journal id; from the academic journal information of the original data source in the database, the following are extracted: journal name, journal id, impact factor, journal partition;
Step 2.3, all paper entities, author entities and journal entities are extracted from the data obtained in step 2.2 to constitute the entity data set;
wherein each obtained paper entity includes the attributes: paper title, paper id, author, time, journal name, journal id; each obtained author entity includes the attributes: author name, co-authors, research field, affiliation; each obtained journal entity includes the attributes: journal name, journal id, impact factor, journal partition; the co-authors are obtained by extracting the corresponding author and the first author of the paper when author entities are extracted from the academic paper information;
each attribute of each entity is stored in triple form: (entity, attribute name, attribute value).
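As a rough illustration of the triple form, the sketch below flattens one entity into (entity, attribute name, attribute value) triples. The function name `to_triples`, the attribute keys, and the choice to emit one triple per value for multi-valued attributes are assumptions made for the example, not details given in the claim.

```python
def to_triples(entity_id, attributes):
    """Flatten one entity into (entity, attribute name, attribute value) triples.

    Assumption: a list- or tuple-valued attribute yields one triple per value.
    """
    triples = []
    for name, value in attributes.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        triples.extend((entity_id, name, v) for v in values)
    return triples

# Hypothetical paper entity and journal entity.
paper_triples = to_triples("P1", {
    "title": "Graph Study",
    "author": ["Li Wei", "Wang Fang"],
    "journal_id": "J1",
})
journal_triples = to_triples("J1", {"name": "J. Graphs", "impact_factor": 2.5})
```

Storing attributes this way keeps every entity type in one uniform three-column layout, at the cost of needing a grouping step to reassemble an entity.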
6. The method according to claim 1, wherein the detailed process of step 4 is:
Step 4.1, all entities in the entity data set are exported from the database as csv files and then imported into the Neo4j graph database, where the entity corresponding to each id forms one entity node;
Step 4.2, using the attribute features shared between different entities, the relationships between different entities are extracted: a cooperation relationship between different author entities, a publication relationship between an author entity and a paper entity, and an inclusion relationship between a journal entity and a paper entity;
Step 4.3, in the Neo4j graph database, related entity nodes are connected by edges of the corresponding relationship type.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910668329.8A CN110362692A (en) | 2019-07-23 | 2019-07-23 | A kind of academic circle construction method of knowledge based map |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110362692A (en) | 2019-10-22 |
Family
ID=68219907
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201910668329.8A | Pending | CN110362692A (en) | 2019-07-23 | 2019-07-23 | A kind of academic circle construction method of knowledge based map |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110362692A (en) |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104346446A (en) * | 2014-10-27 | 2015-02-11 | 百度在线网络技术(北京)有限公司 | Paper associated information recommendation method and device based on mapping knowledge domain |
Non-Patent Citations (3)
Title |
---|
刘晓燕等: "基于本体的学术知识地图构建――以国内动态能力研究为例", 《情报理论与实践》 * |
孙雨生等: "国内基于知识图谱的信息推荐研究进展", 《情报理论与实践》 * |
袁凯琦等: "医学知识图谱构建技术与研究进展", 《计算机应用研究》 * |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112836060B (en) * | 2019-11-25 | 2023-11-24 | 中国科学技术信息研究所 | Atlas construction method and apparatus for technological innovation data |
CN112836060A (en) * | 2019-11-25 | 2021-05-25 | 中国科学技术信息研究所 | Map construction method and device for scientific and technological innovation data |
CN111091006A (en) * | 2019-12-20 | 2020-05-01 | 北京百度网讯科技有限公司 | Entity intention system establishing method, device, equipment and medium |
CN111091006B (en) * | 2019-12-20 | 2023-08-29 | 北京百度网讯科技有限公司 | Method, device, equipment and medium for establishing entity intention system |
CN111143457A (en) * | 2019-12-28 | 2020-05-12 | 北京工业大学 | Student homonymy disambiguation method based on multiple source data sets |
CN111191045B (en) * | 2019-12-30 | 2023-06-16 | 创新奇智(上海)科技有限公司 | Entity alignment method and system applied to knowledge graph |
CN111191045A (en) * | 2019-12-30 | 2020-05-22 | 创新奇智(上海)科技有限公司 | Entity alignment method and system applied to knowledge graph |
CN111078710A (en) * | 2019-12-30 | 2020-04-28 | 凌祺云 | Teaching auxiliary system construction method based on knowledge cross-correlation |
CN111078710B (en) * | 2019-12-30 | 2023-10-20 | 凌祺云 | Knowledge cross-correlation-based teaching auxiliary system construction method |
CN111324609A (en) * | 2020-02-17 | 2020-06-23 | 腾讯云计算(北京)有限责任公司 | Knowledge graph construction method and device, electronic equipment and storage medium |
CN111522911A (en) * | 2020-04-16 | 2020-08-11 | 创新奇智(青岛)科技有限公司 | Entity linking method, device, equipment and storage medium |
CN111680498A (en) * | 2020-05-18 | 2020-09-18 | 国家基础地理信息中心 | Entity disambiguation method, device, storage medium and computer equipment |
CN112417082B (en) * | 2020-10-14 | 2022-06-07 | 西南科技大学 | Scientific research achievement data disambiguation filing storage method |
CN112417082A (en) * | 2020-10-14 | 2021-02-26 | 西南科技大学 | Scientific research achievement data disambiguation filing storage method |
CN112966120B (en) * | 2021-02-26 | 2021-09-17 | 重庆大学 | Relationship strength analysis system and information recommendation system |
CN112966120A (en) * | 2021-02-26 | 2021-06-15 | 重庆大学 | Relationship strength analysis system and information recommendation system |
CN113780001A (en) * | 2021-08-12 | 2021-12-10 | 北京工业大学 | Visual analysis method for homonymous disambiguation of academic papers |
CN113780001B (en) * | 2021-08-12 | 2023-12-15 | 北京工业大学 | Visual analysis method for academic paper homonymy disambiguation |
CN113554175A (en) * | 2021-09-18 | 2021-10-26 | 平安科技(深圳)有限公司 | Knowledge graph construction method and device, readable storage medium and terminal equipment |
CN113554175B (en) * | 2021-09-18 | 2021-11-26 | 平安科技(深圳)有限公司 | Knowledge graph construction method and device, readable storage medium and terminal equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191022 |