WO2021235617A1 - System for recommending scientific and technical knowledge information, and method therefor

Info

Publication number: WO2021235617A1
Application number: PCT/KR2020/014373
Authority: WO
Inventors: 김인수
Original assignee: 위인터랙트 주식회사
Priority date: 2020-05-20
Filing date: 2020-10-21
Publication date: 2021-11-25
Also published as: KR20210143431A; KR102371329B1

Abstract

The present invention relates to a system for recommending scientific and technical knowledge information, and a method therefor, and the present invention provides a system for recommending scientific and technical knowledge information and a method therefor, the system comprising: a word-similarity model construction means for constructing a word-similarity model of scientific and technical information through text data pre-processing and artificial neural network learning from a large number of scientific and technical documents; a top-level science and technology R&D construction means for constructing top-level science and technology R&D classification systems on the basis of international or domestic science and technology classification systems; a reference-similarity network construction means for constructing a reference-similarity network on the basis of the similarity between the constructed top-level science and technology R&D classification systems; a scientific and technical knowledge information similarity network construction means using the constructed reference-similarity network to construct a scientific and technical knowledge information similarity network to which scientific and technical knowledge information, including patent and thesis information, is added; and a scientific and technical knowledge information recommendation means for recommending scientific and technical knowledge information including patents and theses that are adjacent to a member in the constructed scientific and technical knowledge information similarity network.

Description

Science and technology knowledge information recommendation system and method therefor

The present invention relates to a science and technology knowledge information recommendation system and a method therefor. More specifically, it relates to user-customized recommendation of science and technology knowledge information based on user information. For efficient recommendation of science and technology knowledge information in various fields, the invention establishes a top-level science and technology R&D classification system that can connect science and technology knowledge information among patent information, thesis information, and user information; builds a science and technology metadata (science and technology knowledge information) similarity network based on the established classification system; and recommends to a user, in a customized manner, the patents and papers located around that user in the constructed similarity network.

Researchers and companies continuously conduct research and development in science and technology and produce results. A researcher continues research in his or her own field of science and technology, mainly by referring to papers in the related technical field. In a company, developers in each technology development department continue development collaboratively through discussions about the company's follow-up items or new business items.

However, current researchers or companies cannot receive or access patent information or thesis information in related technology fields for R&D in a user-centered, systematic and customized manner. is being secured.

Research to solve these problems has continued. As related art, Republic of Korea Patent Publication No. 10-2019-0115505 (published October 14, 2019) discloses a method for discovering follow-up development items customized for a company.

The disclosed invention is a method of discovering follow-up development items customized for a company, comprising the steps of: (a) constructing a patent database that excludes patents whose applicants are large corporations, universities, or public research institutes; (b) calculating, from the patent database, a preference for each IPC for each applicant using the applicant's patent frequency per IPC; (c) after the database is built, setting a reference applicant in the user system, calculating the similarity between the reference applicant and an arbitrary applicant using their per-IPC preferences, and extracting similar companies; (d) calculating a correlation index for each IPC using the similarity between the reference applicant and each similar company and the similar company's patent frequency for the specific IPC; and (e) extracting IPCs having a high correlation index and recommending the technical fields corresponding to the extracted IPCs as follow-up items.

In that invention, the preference is either the patent frequency or a fuzzy-adjusted value, that is, the patent frequency converted to a fixed scale by applying fuzzy logic. The similarity is calculated using the preference value of the reference applicant for a specific IPC and the preference value of the arbitrary applicant for that IPC. The correlation index is a value obtained by normalizing the sum of the similar companies' patent frequencies for the IPC, each multiplied by that company's similarity to the reference applicant. The invention further comprises a step of presenting an R&D direction for each patent classification code after calculating at least one of the heterogeneity of the patent classification code, its degree of competition, and its growth potential, which indicates the code's own rate of growth.

The heterogeneity is expressed as the reciprocal of the similarity between a specific patent classification code with a high correlation index and the patent classification codes held by the reference applicant; the degree of competition is the total number of patents filed under a specific patent classification code with a high correlation index; and the growth potential is the average rate of increase in patents filed under a specific patent classification code with a high correlation index.

In summary, the disclosed invention constructs a patent database that excludes patents whose applicants are large corporations, universities, or public research institutes; calculates the preference for each IPC from the patent database using each applicant's patent frequency per IPC; sets a reference applicant in the user system once the database has been constructed; calculates the similarity between the reference applicant and an arbitrary applicant using their per-IPC preferences; and recommends an R&D direction for each calculated IPC.
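The prior-art steps (b) through (e) can be illustrated with a minimal sketch. The cited publication's exact formulas are not given here, so several choices below are assumptions: the preference is taken to be the raw patent frequency, the similarity is taken to be cosine similarity, and the correlation index is normalized by the total similarity. All applicant names and counts are hypothetical.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse IPC-preference dicts (assumed measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def correlation_index(reference, others):
    """Score each IPC by the similarity-weighted patent frequency of other applicants.

    Assumption: normalization divides each sum by the total similarity.
    """
    sims = {name: cosine(reference, prefs) for name, prefs in others.items()}
    total = sum(sims.values())
    scores = {}
    for name, prefs in others.items():
        for ipc, freq in prefs.items():
            scores[ipc] = scores.get(ipc, 0.0) + sims[name] * freq
    if total:
        scores = {ipc: s / total for ipc, s in scores.items()}
    return scores

# Hypothetical patent counts per IPC for each applicant.
reference = {"G06F": 4, "G06N": 2}
others = {
    "applicant_a": {"G06F": 3, "G06N": 1, "G06Q": 5},
    "applicant_b": {"A61K": 7},
}
scores = correlation_index(reference, others)
# Candidate follow-up fields: IPCs the reference applicant does not hold yet,
# ordered from the highest correlation index to the lowest.
candidates = sorted(
    (ipc for ipc in scores if ipc not in reference), key=scores.get, reverse=True
)
```

In this sample, `applicant_b` shares no IPC with the reference applicant, so its similarity is zero and its field contributes nothing to the ranking.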

However, the disclosed invention calculates per-IPC preferences of the reference applicant and other applicants based only on the International Patent Classification (IPC) of a patent database, and makes recommendations from the similarity derived from those preferences. Because it cannot incorporate the various elements of science and technology knowledge information related to research and development, it cannot provide systematic, customized science and technology knowledge information drawn from diverse sources.

Therefore, an invention is desired that provides user-customized recommendation of science and technology knowledge information based on user information. For efficient recommendation across various fields, such an invention would establish a top-level science and technology R&D classification system that can connect science and technology knowledge information among patent information, thesis information, and user information; build a science and technology metadata (science and technology knowledge information) similarity network based on the established classification system; and recommend to a user, in a customized manner, the patents and papers around that user within the constructed similarity network.

The present invention is intended to solve the problems of the prior art. An object of the present invention is to provide a science and technology knowledge information recommendation system and method that recommend user-customized science and technology knowledge information based on user information. To recommend such information efficiently in various fields, the system and method establish a top-level science and technology R&D classification system that can connect science and technology knowledge information among patent information, thesis information, and user information, build a science and technology metadata (science and technology knowledge information) similarity network based on that classification system, and recommend, in a customized manner, the patent and thesis information around users within the constructed similarity network.

As a technical means for achieving this object, a first aspect of the present invention provides a science and technology knowledge information recommendation system comprising: an operating computer that collects and manages science and technology knowledge information, builds a science and technology knowledge information similarity network based on the collected information, and provides users with customized science and technology knowledge information; a member information data storage unit that is communicatively connected to the operating computer and stores and manages information on members who have joined through the operating computer, the members' science and technology related information, and the members' usage of science and technology knowledge information; a science and technology knowledge information data storage unit that is communicatively connected to the operating computer and stores and manages the patent information, thesis information, and social network information collected by the operating computer; a science and technology knowledge information utilization data storage unit that is communicatively connected to the operating computer and stores and manages the science and technology word similarity model information, the science and technology R&D classification information, the similarity network information of science and technology knowledge information, and the usage information of science and technology knowledge information built by the operating computer; at least one user terminal that is communicatively connected to the operating computer to provide membership registration, the user's science and technology related information, and the like, and to receive customized science and technology knowledge information from the operating computer; a science and technology information providing computer that is communicatively connected to the operating computer and provides science and technology document information in response to an information request from the operating computer; a patent information providing computer that is communicatively connected to the operating computer and provides patent information in response to an information request from the operating computer; a thesis information providing computer that is communicatively connected to the operating computer and provides thesis information in response to an information request from the operating computer; and a social network medium, such as the Internet and social networks, from which the operating computer collects various science and technology related information through communication access.

In this system, the operating computer builds a word similarity model of science and technology information through text data preprocessing and artificial neural network learning on a large number of science and technology documents; builds and stores a top-level science and technology R&D classification system based on international or domestic science and technology classification systems; builds and stores a reference similarity network based on the similarity between the constructed top-level science and technology R&D classification systems; uses the reference similarity network to build and store a science and technology knowledge information similarity network to which science and technology knowledge information, including patent and thesis information, is added; and recommends to users the patents and papers around members in the constructed science and technology knowledge information similarity network.
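The two network-building operations described above can be sketched as follows. This is only an illustration under stated assumptions: each classification entry and each patent or thesis is assumed to already have a numeric vector (for example, derived from the word similarity model), similarity is assumed to be cosine similarity, and an edge is assumed to exist when similarity reaches a threshold. The function names, vectors, and threshold are hypothetical and do not come from the source.

```python
from itertools import combinations
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = sqrt(sum(x * x for x in u))
    nv = sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0

def build_reference_network(class_vectors, threshold=0.5):
    """Link classification nodes whose similarity reaches the threshold."""
    graph = {name: {} for name in class_vectors}
    for a, b in combinations(class_vectors, 2):
        sim = cosine(class_vectors[a], class_vectors[b])
        if sim >= threshold:
            graph[a][b] = sim
            graph[b][a] = sim
    return graph

def add_item(graph, class_vectors, item_id, item_vector):
    """Attach a patent or thesis node to its most similar classification node."""
    best = max(class_vectors, key=lambda c: cosine(class_vectors[c], item_vector))
    sim = cosine(class_vectors[best], item_vector)
    graph.setdefault(item_id, {})[best] = sim
    graph[best][item_id] = sim
    return best

# Hypothetical 3-dimensional vectors for three classification entries.
class_vectors = {
    "machine learning": (0.9, 0.1, 0.0),
    "data mining": (0.8, 0.3, 0.1),
    "organic chemistry": (0.0, 0.1, 0.9),
}
network = build_reference_network(class_vectors)
attached_to = add_item(network, class_vectors, "patent:example-1", (0.7, 0.4, 0.1))
```

Storing the graph as a dictionary of weighted neighbor dictionaries keeps both steps simple: the reference network is built once from the classification entries, and patents or theses are added later without recomputing the existing edges.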

In addition, a second aspect of the present invention provides a science and technology knowledge information recommendation method comprising the steps of: building, by an operating computer, a word similarity model of science and technology information through text data preprocessing and artificial neural network learning on a large number of science and technology documents; establishing, by the operating computer, a top-level science and technology research and development (R&D) classification system capable of linking heterogeneous science and technology knowledge information; constructing, by the operating computer, a reference science and technology knowledge information similarity network by calculating the similarity between the top-level science and technology R&D classification systems using the word similarity model; constructing, by the operating computer, a science and technology knowledge information similarity network by adding science and technology knowledge information to the constructed reference science and technology knowledge information similarity network; calculating, by the operating computer, the initial similarity of members in the constructed science and technology knowledge information similarity network using member information entered at membership registration, including industry classification, field of interest, field of specialization, and university major (S140); and recommending, by the operating computer, the science and technology knowledge information around members in the science and technology knowledge information similarity network.
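The initial member similarity step (S140) can be sketched as below. The source does not state how the similarity is computed, so this example substitutes a simple Jaccard overlap between the terms a member enters at sign-up and a keyword set per classification entry; the field names, keyword sets, and function names are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two term collections (a stand-in similarity measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def initial_member_similarity(profile, category_terms):
    """Score every classification entry against the terms entered at sign-up."""
    terms = set()
    for field in ("industry", "interests", "specialization", "major"):
        terms.update(t.lower() for t in profile.get(field, []))
    return {cat: jaccard(terms, words) for cat, words in category_terms.items()}

# Hypothetical sign-up data and per-category keyword sets.
profile = {
    "industry": ["software"],
    "interests": ["recommendation", "graph"],
    "specialization": ["machine learning"],
    "major": ["computer science"],
}
category_terms = {
    "information technology": {"software", "graph", "machine learning", "database"},
    "biotechnology": {"protein", "genome"},
}
similarity = initial_member_similarity(profile, category_terms)
```

The resulting scores could serve as the weights of the edges that place a new member node in the similarity network before any usage history exists.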

According to the present invention, for efficient recommendation of science and technology knowledge information in various fields, a top-level science and technology R&D classification system is constructed that can connect science and technology knowledge information among patent information, thesis information, and user information; a similarity network of science and technology metadata (science and technology knowledge information) is built based on that classification system; and science and technology knowledge information, including the patents and papers around users in the constructed similarity network, can be recommended to users in a customized manner.

FIG. 1 is a schematic configuration diagram of an embodiment of the science and technology knowledge information recommendation system of the present invention.

FIG. 2 is a schematic configuration diagram of an embodiment of an operating computer, which is a main part of the science and technology knowledge information recommendation system of the present invention.

FIG. 3 is a flowchart for explaining an embodiment of the science and technology knowledge information recommendation method of the present invention.

FIG. 4 is a flowchart for explaining a main part of the science and technology knowledge information recommendation method of the present invention.

FIG. 5 is a flowchart for explaining a main part of the science and technology knowledge information recommendation method of the present invention.

The present invention presents a science and technology knowledge information recommendation system and method comprising: a word similarity model construction means for constructing a word similarity model of science and technology information through text data preprocessing and artificial neural network learning on a large number of science and technology documents; a top-level science and technology R&D construction means for establishing a top-level science and technology R&D classification system based on international or domestic science and technology classification systems; a reference similarity network construction means for constructing a reference similarity network based on the similarity between the constructed top-level science and technology R&D classification systems; a science and technology knowledge information similarity network construction means for using the reference similarity network to construct a science and technology knowledge information similarity network to which science and technology knowledge information, including patent and thesis information, is added; and a science and technology knowledge information recommendation means for recommending science and technology knowledge information, including patents and theses, located around members in the constructed similarity network.
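The recommendation means, which selects patents and theses located around a member in the similarity network, can be sketched as a short graph traversal. The ranking rule here is an assumption: items within two hops of the member are scored by the product of the edge similarities along the path, and each item keeps its best path. The node naming scheme (`member:`, `patent:`, `thesis:`) and the sample weights are hypothetical.

```python
def recommend(graph, member, item_prefix=("patent:", "thesis:"), top_n=3):
    """Rank patent and thesis nodes within two hops of a member node.

    Assumption: a path's score is the product of its edge similarities,
    and an item keeps the score of its best path.
    """
    scores = {}
    for neighbor, w1 in graph.get(member, {}).items():
        for node, w2 in graph.get(neighbor, {}).items():
            if node.startswith(item_prefix):
                scores[node] = max(scores.get(node, 0.0), w1 * w2)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical similarity network: member -> classification entries -> items.
graph = {
    "member:1": {"ai": 0.9, "chemistry": 0.2},
    "ai": {"member:1": 0.9, "patent:A": 0.8, "thesis:B": 0.6},
    "chemistry": {"member:1": 0.2, "patent:C": 0.9},
}
recommended = recommend(graph, "member:1")
```

With this sample graph, `recommended` lists `patent:A`, then `thesis:B`, then `patent:C`, because the member's strong link to the `ai` entry outweighs the high but distant similarity of `patent:C`.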

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

Terms used in the description of the embodiments of the present invention will be defined. Various computers and terminals used in the present invention may consist of hardware itself, or may be composed of a computer program or a web program utilizing the hardware resources. For example, the operating computer of the present invention may consist of each component of hardware included in the computer, and may consist of a computer program or web program executed by utilizing the hardware resources of the computer.

In addition, the 'user interface' described in the embodiment of the present invention may be a web program or an application program that is output to a user terminal or installed and executed.

In addition, '~ part' described in the embodiment of the present invention may be used instead of '~ means'. Here, '~ part' or '~ means' may be a component of hardware itself, and preferably may be composed of a component of software or a program.

FIG. 1 is a schematic configuration diagram of an embodiment of the science and technology knowledge information recommendation system of the present invention.

As shown in FIG. 1, the science and technology knowledge information recommendation system of the present invention includes: an operating computer 100 that collects and manages science and technology knowledge information, builds a science and technology knowledge information similarity network based on the collected information, and provides users with customized science and technology knowledge information; a member information data storage unit 200 that is communicatively connected to the operating computer 100 and stores and manages member information, the members' science and technology related information, and the members' usage of science and technology knowledge information; a science and technology knowledge information data storage unit 300 that is communicatively connected to the operating computer 100 and stores and manages the patent information, thesis information, and social network information collected by the operating computer 100; a science and technology knowledge information utilization data storage unit 400 that is communicatively connected to the operating computer 100 and stores and manages the similarity model information of science and technology words, the science and technology related R&D classification information, the similarity network information of science and technology knowledge information, and the usage information of science and technology knowledge information built by the operating computer 100; at least one user terminal 500 that is communicatively connected to the operating computer 100 to provide membership registration, the user's science and technology related information, and the like, and to receive customized science and technology knowledge information from the operating computer 100; a science and technology information providing computer 600 that is communicatively connected to the operating computer 100 and provides science and technology document information in response to an information request from the operating computer 100; a patent information providing computer 700 that is communicatively connected to the operating computer 100 and provides patent information in response to an information request from the operating computer 100; a thesis information providing computer 800 that is communicatively connected to the operating computer 100 and provides thesis information in response to an information request from the operating computer 100; and a social networking medium 900, such as Internet websites, blogs, and social networks, from which the operating computer 100 can collect various science and technology related information through a communication connection.

The operating computer 100, the science and technology information providing computer 600, the patent information providing computer 700, and the thesis information providing computer 800 each have their own data storage means or are communicatively connected to external data storage means, and each may be composed of at least one server computer equipped with means for operating and using the science and technology knowledge information recommendation system of the present invention.

The science and technology information providing computer 600 may be configured as a server that provides a national science and technology document database holding a large number of science and technology document files, or an integrated science and technology document database covering multiple countries. The patent information providing computer 700 may be configured as the patent information database server of each country's patent office, or as a server that provides an integrated patent information database covering multiple countries. In the case of the Republic of Korea, for example, the national patent information database may include the database provided by 'www.kipris.or.kr', a website that provides users with intellectual property information including domestic patent information. The integrated patent information database may include the database of 'www.escape.net', a website that provides users with patent information from around the world.

The thesis information providing computer 800 may be configured as the thesis information database server of each country, or as a server that provides an integrated thesis information database covering multiple countries. In the case of Korea, for example, the thesis information database may include the database provided by 'www.ndsl.kr', a website that provides users with thesis information, including domestic thesis information and integrated thesis information from around the world.

The member information data storage unit 200, the science and technology knowledge information data storage unit 300, and the science and technology knowledge information utilization data storage unit 400 may be configured as data storage means provided in the operating computer 100, and are preferably configured as a database management server system (DBMS). They may be configured as one server system or as separate server systems.

The member information data storage unit 200 includes: a member information storage unit 210 that stores and manages basic member information and per-member-node score information for users who have signed up as members to use the science and technology knowledge information recommendation system of the present invention; an industry classification information storage unit 220 that stores and manages the industry classification information selected or entered in the user interface when a user signs up; a field of interest information storage unit 230 that stores and manages the user's field of interest information selected or entered in the user interface when the user signs up; a specialization information storage unit 240 that stores and manages the user's current specialization information selected or entered in the user interface when the user signs up; a college major information storage unit 250 that stores and manages the user's college major information selected or entered in the user interface when the user signs up; and a member use information storage unit 260 that stores and manages information on the member's use of science and technology knowledge information recommendations through the user interface of the science and technology knowledge information recommendation system of the present invention.

The science and technology knowledge information data storage unit 300 includes: a patent information data storage unit 310 that stores and manages the collected patent information (registered patents and published patents) of countries around the world; a thesis information data storage unit 320 that stores and manages the collected thesis information from around the world; and a collection information data storage unit 330 that stores and manages the science and technology related information collected by the operating computer 100 through the social networking medium 900.

The science and technology knowledge information utilization data storage unit 400 includes: a science and technology word similarity model storage unit 410 that stores and manages the science and technology word similarity model built through analysis of and learning on a large number of science and technology documents; a top-level science and technology R&D classification information storage unit 420 that stores and manages the top-level science and technology R&D classification system constructed by organizing the international and domestic science and technology classification systems so that science and technology knowledge information can be linked; a reference science and technology knowledge information network information storage unit 430 that stores and manages the reference science and technology knowledge information similarity network information constructed by calculating the similarity between the top-level science and technology R&D classification systems using the science and technology word similarity model; a science and technology knowledge information similarity network information storage unit 440 that stores the science and technology knowledge information similarity network information constructed by adding science and technology knowledge information to the reference science and technology knowledge information similarity network; and a science and technology knowledge information use information storage unit 450 that stores and manages usage information for the science and technology knowledge information recommended in the constructed similarity network.

Although the member information data storage unit 200, the science and technology knowledge information data storage unit 300, and the science and technology knowledge information utilization data storage unit 400 have been described separately, the configuration is not limited thereto; they may be implemented using an integrated storage and management means. It goes without saying that the individual storage units included in the member information data storage unit 200, the science and technology knowledge information data storage unit 300, and the science and technology knowledge information utilization data storage unit 400 may also be rearranged as needed in terms of use and function.

The user terminal 500 may be a mobile phone, smartphone, tablet computer, notebook computer, or personal computer (PC) provided with means for outputting a user interface composed of a website or web program provided by the operating computer 100, for downloading and executing a user interface provided by the operating computer 100 or by an application program download computer, or for outputting a user interface by accessing a cloud computing system.

As shown in FIG. 2, the operating computer 100 of the present invention includes: a user interface management unit 101 that manages identification information and update information of the user interface to be provided to user terminals; a science and technology information collection and management unit 102 that collects science and technology information, patent information, and thesis information from around the world; a member information management unit 103 that stores and manages basic member information and per-member-node score information for users who have signed up as members to use the science and technology knowledge information recommendation system of the present invention; a member science and technology knowledge information management unit 104 that stores and manages the industry classification, field of interest, field of specialization, and university major information selected or entered by users when signing up; a patent information management unit 105 that stores and extracts the patent information from around the world collected by the science and technology information collection and management unit 102; a thesis information management unit 106 that stores and extracts the science and technology related thesis information from around the world collected by the science and technology information collection and management unit 102; a science and technology information collection management unit 107 that stores and manages the science and technology information of countries around the world collected by the science and technology information collection and management unit 102 through social networks such as the Internet and SNS; a science and technology word similarity model information management unit 108 that builds and manages a word similarity model through analysis of and learning on a large number of science and technology documents; a top-level science and technology R&D classification information management unit 109 that builds and manages the top-level science and technology R&D classification system established by organizing the international and domestic science and technology classification systems so that science and technology knowledge information can be connected; a reference science and technology knowledge information similarity network information management unit 110 that constructs and manages the reference science and technology knowledge information similarity network by calculating the similarity between the top-level science and technology R&D classification systems using the science and technology word similarity model; a science and technology knowledge information similarity network information management unit 111 that builds and manages a science and technology knowledge information similarity network by adding science and technology knowledge information to the reference science and technology knowledge information similarity network; a user science and technology knowledge information similarity calculation information management unit 112 that calculates and manages the similarity of users in the science and technology knowledge information similarity network; a science and technology knowledge information recommendation information management unit 113 that generates and manages a list of science and technology knowledge information to be recommended to a user; a science and technology knowledge information use information management unit 114 that manages usage information for the recommended science and technology knowledge information generated in the constructed similarity network; and a user science and technology knowledge information use information management unit 115 that manages information on each member's usage status of science and technology knowledge information.

The operation of the science and technology knowledge information recommendation system of the present invention will be described in detail with reference to FIGS. 1 and 2.

The operating computer 100 accepts membership registrations from users who want to use the science and technology knowledge information recommendation system of the present invention, and receives and manages the basic member information together with the user science and technology knowledge information, such as industry classification, fields of interest, fields of specialization, and university major, that users provide when they sign up.

In addition, from a large number of science and technology document files that it has collected itself or received from outside, the operating computer 100 extracts the main text of each document, excluding unnecessary paragraphs, and extracts only noun words from the extracted main text using a morphological analysis algorithm. It then performs stopword processing, which deletes words that are unnecessary for expressing the characteristics of a sentence or document, such as frequently occurring prepositions and articles.
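The noun extraction and stopword processing described above can be sketched as follows. This is a minimal illustration, not the implementation of the invention: it assumes that a morphological analyzer has already produced (word, part-of-speech) pairs, and the tag name `NOUN`, the stopword set, and the function name `extract_keywords` are all hypothetical.

```python
# Hypothetical stopword list; the description does not specify one.
STOPWORDS = {"system", "method", "invention"}


def extract_keywords(tagged_tokens, stopwords=STOPWORDS):
    """Keep noun tokens only, then drop any noun that is a stopword."""
    nouns = [word.lower() for word, tag in tagged_tokens if tag == "NOUN"]
    return [word for word in nouns if word not in stopwords]


# Assumed analyzer output: (surface form, part-of-speech tag) pairs.
tagged = [
    ("A", "DET"), ("system", "NOUN"), ("for", "ADP"), ("recommending", "VERB"),
    ("patent", "NOUN"), ("and", "CONJ"), ("thesis", "NOUN"), ("information", "NOUN"),
]
keywords = extract_keywords(tagged)
print(keywords)
```

In practice the tagged tokens would come from a morphological analyzer suited to the language of the documents; only the filtering logic is shown here.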

Here, a morpheme, as used in morphological analysis, is the minimal unit of meaning in a language, where meaning includes both lexical and grammatical meaning. Morphological analysis is the process of segmenting a word or sentence, which is a larger linguistic unit than a morpheme, into morphemes.

The operating computer 100 then expresses the meaning of the stopword-processed noun words extracted from the large number of science and technology document files as vector values, and builds a similarity model between the extracted words by applying an artificial neural network or an unsupervised learning algorithm, which is a type of machine learning.

Here, an unsupervised learning algorithm discovers how data is structured without target values for the input data. Unrefined data is input, and feature summarization and clustering are performed without labeled training data. Because no target values need to be set and no prior learning is required, it is a fast machine learning method.

To summarize the construction of the word similarity model: the main text is extracted from each document in the large science and technology document database, excluding unnecessary paragraphs; noun words are extracted from the main text; stopword processing is performed on the extracted words; and a science and technology word similarity model is built from the resulting words through artificial neural network or machine learning training. In other words, text data from a large number of science and technology document files is preprocessed, and a science and technology word similarity model is built through neural network or machine learning training.

More specifically, the operating computer 100 may be configured to: calculate the similarity of sentences and words in each document, based on the large number of science and technology document files collected by itself or received from outside, weight the sentences and words through similarity comparison, and extract the main text excluding unnecessary paragraphs; extract only noun words from the extracted main text using a morphological analysis technique; perform stopword processing that deletes words unnecessary for expressing the characteristics of a sentence or document, such as words with a low frequency of occurrence and very short words; express the meaning of the stopword-processed words as vector values to build training data for model learning; and train a model based on an artificial neural network or an unsupervised learning algorithm, a type of machine learning, to build a similarity model between the extracted words.
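The idea of representing words as vectors and comparing them can be illustrated with a small sketch. Note the substitution: the description calls for an artificial neural network or unsupervised learning model, whereas the code below uses plain co-occurrence counts as a stand-in so that it runs with the standard library alone. The function names, the window size, and the toy documents are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(documents, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for words in documents:
        for i, word in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][words[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Toy corpus of already preprocessed (noun-only, stopword-free) documents.
docs = [
    ["matrix", "vector", "algebra"],
    ["matrix", "tensor", "algebra"],
    ["protein", "cell", "biology"],
]
vectors = cooccurrence_vectors(docs)
sim_related = cosine(vectors["vector"], vectors["tensor"])
sim_unrelated = cosine(vectors["vector"], vectors["cell"])
print(sim_related, sim_unrelated)
```

Words that appear in the same contexts ("vector" and "tensor") score close to 1, while words that share no context score 0. A trained embedding model would replace `cooccurrence_vectors`, and the cosine comparison would stay the same.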

In addition, to establish an effective recommendation system across heterogeneous science and technology knowledge information, for example science and technology papers, patents, and information on users who are science and technology experts, the operating computer 100 establishes a top-level science and technology R&D classification system that can connect such information. The top-level science and technology R&D classification system can draw on both domestic and international classification systems. For example, it can be built by organizing and integrating the OECD's FORD system and Korea's national science and technology classification system. As an example of its structure, in the field of mathematics the classification system can have mathematics as a large classification, algebra as a medium classification, and linear algebra as a small classification.
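The three-level structure in the mathematics example can be held in a small record type. This is only an illustrative data structure; the class name `Classification` and its field names are assumptions, not part of the described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    """One entry of the classification system: large > medium > small."""
    large: str
    medium: str
    small: str

    def path(self):
        return f"{self.large} > {self.medium} > {self.small}"


linear_algebra = Classification("mathematics", "algebra", "linear algebra")
print(linear_algebra.path())
```

Making the record immutable (`frozen=True`) lets entries be used as dictionary keys or set members, which is convenient when the small classifications later serve as network nodes.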

The operating computer 100 constructs a reference science and technology knowledge information similarity network by calculating the similarity between the classifications of the constructed top-level science and technology R&D classification system using the constructed science and technology word similarity model. Specifically, the similarity between the sub-classifications (small classifications) of the top-level science and technology R&D classification system is first calculated using the science and technology word similarity model, and the similarity is then fine-tuned using the large and medium classifications of the top-level science and technology R&D classification system to build the reference science and technology knowledge information similarity network. Here, each sub-classification of the top-level science and technology R&D classification system can be a node, and the similarity can be a relationship between nodes.

In summary, the reference knowledge information similarity network can be constructed by using the science and technology word similarity model to calculate the similarity of the top-level science and technology R&D classification system level by level, and then recalculating the similarity according to weights.
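One way to picture the weight-based recalculation is to raise a base similarity between two sub-classifications when they share a large or medium classification. The sketch below is an assumption-laden illustration: the description does not give the weighting rule, so the additive bonuses, their values, and the function name `adjusted_similarity` are hypothetical, and the base similarity is taken as given from the word similarity model.

```python
def adjusted_similarity(base, a, b, same_large_bonus=0.1, same_medium_bonus=0.2):
    """Raise a base similarity when two sub-classifications share parents; cap at 1.0.

    `a` and `b` are (large, medium, small) tuples. The bonus values are hypothetical.
    """
    large_a, medium_a, _ = a
    large_b, medium_b, _ = b
    bonus = 0.0
    if large_a == large_b:
        bonus += same_large_bonus
        if medium_a == medium_b:
            bonus += same_medium_bonus
    return min(1.0, base + bonus)


linear = ("mathematics", "algebra", "linear algebra")
group = ("mathematics", "algebra", "group theory")
topology = ("mathematics", "geometry", "topology")
optics = ("physics", "optics", "laser optics")

s_same_medium = adjusted_similarity(0.5, linear, group)
s_same_large = adjusted_similarity(0.5, linear, topology)
s_unrelated = adjusted_similarity(0.5, linear, optics)
print(s_same_medium, s_same_large, s_unrelated)
```

With equal base similarity, pairs under the same medium classification end up closest, pairs under the same large classification next, and unrelated pairs keep the base value.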

The constructed reference science and technology knowledge information similarity network can be managed in the form shown in Table 1 below.

	Class A	Class B	Class C	Class D
Class A	1	0.2	0.7	0.5
Class B	0.2	1	0.4	0.1
Class C	0.7	0.4	1	0.9
Class D	0.5	0.1	0.9	1
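The values of Table 1 form a symmetric matrix with 1 on the diagonal, so the network can be stored as a matrix and queried for the nodes nearest to a given class. The following sketch uses the table values as given; the names `CLASSES`, `SIMILARITY`, and `neighbours` are illustrative assumptions.

```python
CLASSES = ["A", "B", "C", "D"]

# Similarity values copied from Table 1 (row and column order: A, B, C, D).
SIMILARITY = [
    [1.0, 0.2, 0.7, 0.5],
    [0.2, 1.0, 0.4, 0.1],
    [0.7, 0.4, 1.0, 0.9],
    [0.5, 0.1, 0.9, 1.0],
]


def neighbours(name, top_n=2):
    """Return the `top_n` other classes most similar to `name`, best first."""
    row = SIMILARITY[CLASSES.index(name)]
    others = [(c, s) for c, s in zip(CLASSES, row) if c != name]
    return sorted(others, key=lambda pair: pair[1], reverse=True)[:top_n]


is_symmetric = all(
    SIMILARITY[i][j] == SIMILARITY[j][i]
    for i in range(len(CLASSES))
    for j in range(len(CLASSES))
)
print(neighbours("A"), is_symmetric)
```

For Class A this returns Class C (0.7) and Class D (0.5), which matches reading the first row of the table.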

The operating computer 100 builds a science and technology knowledge information similarity network by adding science and technology knowledge information in various science and technology fields to the constructed reference science and technology knowledge information similarity network.

As an embodiment of the construction of the science and technology knowledge information similarity network, examples of the science and technology knowledge information to be used include patent information, thesis information, and members' industry classifications, fields of interest, fields of specialization, and university majors.

Patent information is added to the reference science and technology knowledge information similarity network using the International Patent Classification (IPC) and the keywords of the invention. Using the science and technology word similarity model, the similarity between each node and the set of sentences or words describing the IPC information is calculated to determine which node a specific patented invention belongs to. Naturally, one patented invention can belong to a plurality of nodes.
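Because one patented invention can belong to several nodes, the assignment can be pictured as keeping every node whose similarity clears a cutoff. This is a sketch under stated assumptions: the description does not say how membership is decided, so the threshold rule, its value, the function name `assign_nodes`, and the similarity figures below are all hypothetical illustration values, not results.

```python
def assign_nodes(similarity_by_node, threshold=0.5):
    """Return every node whose similarity meets the threshold, most similar first."""
    hits = [(node, s) for node, s in similarity_by_node.items() if s >= threshold]
    return [node for node, _ in sorted(hits, key=lambda pair: pair[1], reverse=True)]


# Made-up similarities between one patent's IPC description and three nodes.
ipc_similarity = {"linear algebra": 0.82, "machine learning": 0.64, "optics": 0.12}
nodes = assign_nodes(ipc_similarity)
print(nodes)
```

The same rule would apply unchanged to theses, industry classifications, and fields of interest, since each is matched to nodes through the word similarity model.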

Specifically, TF-IDF (Term Frequency - Inverse Document Frequency) is used to filter out overly frequent, meaningless words; stopword processing is performed; keywords are extracted using a self-developed Text-Rank technique; the similarity between each keyword and the node is calculated using the science and technology word similarity model; and the calculated similarity is normalized, so that the keywords determine the depth within the node.

TF-IDF is a weight used in information retrieval and text mining. It is a statistical value indicating how important a word is to a specific document within a collection of documents.
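A common textbook form of this weight multiplies the relative frequency of a term in a document by the logarithm of the inverse fraction of documents containing it. The sketch below uses that form (tf × log(N / df)); the description does not say which TF-IDF variant is used, so the exact formula, the function name `tf_idf`, and the toy documents are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    ["network", "similarity", "network"],
    ["network", "patent"],
    ["network", "thesis"],
]
weights = tf_idf(docs)
print(weights[0])
```

The word "network" occurs in every document, so its weight is 0 and it can be filtered out as too frequent to be meaningful, while "similarity" and "patent" receive positive weights.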

The Text-Rank technique (algorithm) is based on the PageRank algorithm and is known as a technique for summarizing a single document by weighting sentences and words through similarity comparison. It has a strong tendency to extract high-frequency words and sentences.
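The general Text-Rank idea can be sketched as PageRank run over a graph whose nodes are words and whose edges link words that appear near each other. This illustrates the publicly known algorithm only, not the self-developed variant mentioned in this description; the function name, window size, damping factor, and iteration count are assumptions.

```python
from collections import defaultdict


def text_rank(words, window=2, damping=0.85, iterations=50):
    """Rank words by PageRank over an undirected co-occurrence graph."""
    graph = defaultdict(set)
    for i, word in enumerate(words):
        # Link each word to the words that follow it inside the window.
        for other in words[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    scores = {word: 1.0 for word in graph}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(graph[n]) for n in graph[word])
            for word in graph
        }
    return sorted(scores, key=scores.get, reverse=True)


words = ["similarity", "network", "patent", "network", "thesis", "network", "node"]
ranking = text_rank(words)
print(ranking)
```

The word "network" co-occurs with every other word, so it ranks first, which reflects the tendency toward high-frequency words noted above.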

Thesis information is added to the reference science and technology knowledge information similarity network using the thesis topic classification and keywords. By calculating the similarity between the thesis topic classification and each node using the science and technology word similarity model, it is possible to determine which node a specific paper belongs to. Naturally, one paper can belong to multiple nodes.

The user's university major information can be added to the reference science and technology knowledge information similarity network by simplifying and reclassifying the university major into a representative classification using department classification data.

Majors with the same content often have different names because each university expresses them differently. The majors are therefore simplified and reclassified into representative classifications for consistency, and the node to which a major belongs can be determined using the department (major) classification data book provided by the Ministry of Education.

The industry classification information selected by the user is added to the reference science and technology knowledge information similarity network using the industry classification code. Using the science and technology word similarity model, the similarity between each node and the set of sentences or words describing the industry classification code can be calculated. Naturally, one industry classification can belong to a plurality of nodes.

The user's field-of-interest and field-of-specialization information is a large amount of data in noun form. It can be added to the reference science and technology knowledge information similarity network by calculating the similarity between the words describing the fields of interest and specialization and each node using the science and technology word similarity model. Naturally, one field of interest or specialization can belong to a plurality of nodes.

In addition, the operating computer 100 manages the nodes of members in the constructed science and technology knowledge information similarity network. A member has a score for each node and may belong to the node with the highest score. The per-node scores of a member are managed as follows: an initial node is set based on the basic member information that the user enters when signing up for the science and technology knowledge information recommendation system of the present invention; scores are added to the nodes corresponding to the university major, industry classification, field of interest, and field of specialization selected by the user; weights are assigned to member activities in the recommendation system of the present invention, such as inquiry, search, and scrap; and data such as inquiry time is additionally collected using cookies. Thus, when a member uses the service of the science and technology knowledge information recommendation system of the present invention, a score is calculated from the weights and the data collected using cookies, and the score is added to the node corresponding to the usage history.

In addition, when a member's information (university major, industry, field of interest, or field of specialization) is modified, the member's node scores may be recalculated by adding to or subtracting from the score of the corresponding node.
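The per-node score bookkeeping can be sketched as a small class. Everything numeric here is an assumption: the description names the activities (inquiry, search, scrap) and the use of inquiry time, but gives no weights, so the values in `ACTIVITY_WEIGHTS`, the initial score, the dwell-time bonus, and the class name `MemberScores` are hypothetical.

```python
from collections import defaultdict

# Hypothetical activity weights; the description names the activities but not the values.
ACTIVITY_WEIGHTS = {"inquiry": 1.0, "search": 0.5, "scrap": 2.0}


class MemberScores:
    """Per-node scores for one member."""

    def __init__(self, initial_nodes, initial_score=10.0):
        self.scores = defaultdict(float)
        for node in initial_nodes:
            self.scores[node] += initial_score

    def record_activity(self, node, activity, dwell_seconds=0.0):
        # Inquiry time (e.g. collected via cookies) adds a bonus capped at 1 point.
        bonus = min(dwell_seconds / 60.0, 1.0)
        self.scores[node] += ACTIVITY_WEIGHTS[activity] + bonus

    def update_profile(self, removed_nodes, added_nodes, profile_score=10.0):
        # A profile edit subtracts from the old nodes and adds to the new ones.
        for node in removed_nodes:
            self.scores[node] -= profile_score
        for node in added_nodes:
            self.scores[node] += profile_score

    def home_node(self):
        """The node with the highest score, i.e. the node the member belongs to."""
        return max(self.scores, key=self.scores.get)


member = MemberScores(["linear algebra"])
for _ in range(6):
    member.record_activity("optics", "scrap", dwell_seconds=30)
print(member.home_node(), dict(member.scores))
```

After six scrap actions on optics items the optics score (15.0) overtakes the initial linear algebra score (10.0), so the member's home node changes with usage.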

The operating computer 100 recommends science and technology knowledge information based on the constructed science and technology knowledge information similarity network and the managed node scores of members.

In the process of recommending science and technology knowledge information, the difference in scores between the node to which the member belongs and other nodes is calculated, and the number and depth of the other nodes to be used are determined according to that difference. In this case, the greater the score difference, the smaller the number of nodes and the greater the depth may be. The system can be configured to recommend filtered science and technology knowledge information by filtering the science and technology knowledge information extracted according to the science and technology knowledge information similarity network and the member's node scores, using conditions such as year, number of citations, and the member's inquiry status.

In summary, to recommend science and technology knowledge information, the score difference between the member node and other nodes is calculated in the member information storage unit, where the per-node score information of the member is stored; the number of nodes to be extracted and the node depth to be used are determined according to the calculated score difference; and the extracted science and technology knowledge information is filtered using various conditions, so that the filtered science and technology knowledge information is recommended. The science and technology knowledge information recommended here may include patent information and thesis information.
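The recommendation flow summarized above can be sketched as follows. This is a simplified illustration under stated assumptions: the score difference is taken between the member's best node and the runner-up, the tier boundaries in `plan_extraction` are hypothetical, each item is assumed to carry a precomputed `depth` within its node, and the item records are made-up sample data.

```python
def plan_extraction(gap):
    """Map a score gap to (number of nodes, node depth).

    A larger gap gives fewer nodes and more depth; the tier boundaries are hypothetical.
    """
    if gap >= 10:
        return 1, 3
    if gap >= 5:
        return 2, 2
    return 3, 1


def recommend(scores, items, min_year, min_citations, viewed_ids):
    """Select nodes from the score gap, then filter items by depth and conditions."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    gap = scores[ranked[0]] - scores[ranked[1]] if len(ranked) > 1 else float("inf")
    n_nodes, depth = plan_extraction(gap)
    nodes = set(ranked[:n_nodes])
    return [
        item["id"]
        for item in items
        if item["node"] in nodes
        and item["depth"] <= depth
        and item["year"] >= min_year
        and item["citations"] >= min_citations
        and item["id"] not in viewed_ids
    ]


scores = {"optics": 15.0, "linear algebra": 10.0, "topology": 2.0}
items = [
    {"id": "P1", "node": "optics", "depth": 1, "year": 2019, "citations": 12},
    {"id": "P2", "node": "optics", "depth": 3, "year": 2020, "citations": 30},
    {"id": "T1", "node": "linear algebra", "depth": 2, "year": 2018, "citations": 5},
    {"id": "T2", "node": "topology", "depth": 1, "year": 2020, "citations": 9},
    {"id": "P3", "node": "optics", "depth": 2, "year": 2010, "citations": 50},
]
recommended = recommend(scores, items, min_year=2015, min_citations=5, viewed_ids={"P1"})
print(recommended)
```

Here the gap of 5 selects two nodes at depth 2. P1 is dropped because the member already viewed it, P2 because it lies too deep, T2 because its node was not selected, and P3 because of its year, leaving T1.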

FIG. 3 is a flowchart for explaining an embodiment of the method for recommending science and technology knowledge information of the present invention. As shown in FIG. 3, the method comprises: a step (S100) in which the operating computer constructs a word similarity model of science and technology information through text data preprocessing and artificial neural network learning on a large number of science and technology documents; a step (S110) in which the operating computer establishes a top-level science and technology research and development (R&D) classification system that can connect science and technology knowledge information in heterogeneous science and technology fields; a step (S120) in which the operating computer constructs a reference science and technology knowledge information similarity network by calculating the similarity between the classifications of the top-level science and technology R&D classification system using the word similarity model; a step (S130) in which the operating computer constructs a science and technology knowledge information similarity network by adding science and technology knowledge information to the constructed reference science and technology knowledge information similarity network; a step (S140) in which the operating computer calculates the initial similarity of members in the constructed science and technology knowledge information similarity network using member information, including industry classification, field of interest, field of specialization, and university major, input at the time of membership registration; and a step (S150) in which the operating computer recommends science and technology knowledge information around members in the science and technology knowledge information similarity network.

In addition, the method may further include a step in which the operating computer resets the member's similarity in the science and technology knowledge information similarity network using the member's science and technology knowledge information usage information.

FIG. 4 is a flowchart for explaining the main part of the method for recommending science and technology knowledge information of the present invention. As shown in FIG. 4, the step of constructing the word similarity model (S100) includes: a step (S101) of extracting the main text, excluding unnecessary paragraphs, from a large number of science and technology documents; a step (S102) of extracting only noun words from the extracted main text using a morphological analysis technique; a step (S103) of performing stopword processing on the extracted words; and a step (S104) of constructing a science and technology word similarity model through artificial neural network or machine learning training based on the stopword-processed words.

FIG. 5 is a flowchart for explaining the main part of the method for recommending science and technology knowledge information of the present invention. As shown in FIG. 5, the step of recommending science and technology knowledge information (S150) includes: a step (S151) of calculating the score difference between the member node and other nodes in the member information storage unit, in which the per-node score information of the member is stored; a step (S152) of determining the number of nodes to be extracted according to the calculated score difference; a step (S153) of determining the node depth to be used according to the calculated score difference; a step (S154) of filtering the extracted science and technology knowledge information using various conditions; and a step (S155) of recommending the filtered science and technology knowledge information, including patent and thesis information.

The embodiments of the present invention described above are only some of the various possible embodiments. The operating computer of the present invention builds a word similarity model of science and technology information through text data preprocessing and artificial neural network learning from a large number of science and technology documents; builds a top-level science and technology R&D classification system based on international or domestic science and technology classification systems; builds a reference similarity network based on the similarity between the classifications of the constructed top-level science and technology R&D classification system; uses the constructed reference similarity network to build a science and technology knowledge information similarity network to which science and technology knowledge information, including patent and thesis information, is added; and recommends to users science and technology knowledge information, including patents and papers, around members in the constructed science and technology knowledge information similarity network. It goes without saying that the various embodiments falling within this technical idea are included in the scope of protection of the present invention.

The present invention can be applied to the science and technology knowledge information data industry.

Claims

In a system including an operating computer that collects scientific and technological knowledge information and recommends it to users,

The operating computer includes at least one hardware processor and a memory for storing a program, wherein the at least one hardware processor controls execution of the program stored in the memory to perform:

Building a word similarity model of science and technology information through text data preprocessing and artificial neural network learning from a large number of science and technology documents,

Establishing a top-level science and technology R&D classification system based on international or domestic science and technology classification systems,

Constructing a reference similarity network based on the similarity between the classifications of the constructed top-level science and technology R&D classification system, and

Constructing, by using the constructed reference similarity network, a science and technology knowledge information similarity network to which science and technology knowledge information including patent and thesis information is added,

A science and technology knowledge information recommendation system, characterized in that it recommends science and technology knowledge information, including patents and papers, around members in the constructed science and technology knowledge information similarity network.
The system according to claim 1,

The construction of the word similarity model of the science and technology information is,

A science and technology knowledge information recommendation system, characterized in that it calculates the similarity of sentences and words in each science and technology document, based on the large number of science and technology document files collected by itself or received from outside, weights the sentences and words through similarity comparison to extract the main text excluding unnecessary paragraphs, extracts only noun words from the extracted main text using a morphological analysis technique, performs stopword processing by deleting words unnecessary for expressing the characteristics of a sentence or document, such as words with a low frequency of occurrence and very short words, builds training data for model learning by expressing the meaning of the stopword-processed words as vector values, and builds a similarity model.
The system according to claim 1,

The construction of the top-level science and technology R&D classification system is,

A science and technology knowledge information recommendation system, characterized in that the science and technology word similarity model is used to calculate the level-by-level similarity of the top-level science and technology R&D classification system, and the similarity is then recalculated according to weights to construct a knowledge information similarity network.
The system according to claim 1,

The construction of the reference similarity network is,

A science and technology knowledge information recommendation system, characterized in that the science and technology word similarity model is used to calculate the level-by-level similarity of the top-level science and technology R&D classification system, and the similarity is then recalculated according to weights to construct a knowledge information similarity network.
The system according to claim 1,

The construction of the science and technology knowledge information similarity network is,

A science and technology knowledge information recommendation system, characterized in that it uses science and technology knowledge information in science and technology fields, such as patent information, thesis information, and members' industry classifications, fields of interest, fields of specialization, and university majors.
The system according to claim 1,

The recommendation of the science and technology knowledge information is,

A science and technology knowledge information recommendation system, characterized in that the score difference between the member node and other nodes is calculated in the member information storage unit, in which the per-node score information of the member is stored, the number of nodes to be extracted is determined according to the calculated score difference, the node depth to be used is determined according to the calculated score difference, and the extracted science and technology knowledge information is filtered using various conditions to recommend the filtered science and technology knowledge information, including patent and thesis information.
A science and technology knowledge information recommendation method comprising the steps of: constructing, by an operating computer, a word similarity model of science and technology information through text data preprocessing and artificial neural network learning on a large number of science and technology documents; establishing, by the operating computer, a top-level science and technology research and development (R&D) classification system capable of linking science and technology knowledge information in heterogeneous science and technology fields; constructing, by the operating computer, a reference science and technology knowledge information similarity network by calculating the similarity between the classifications of the top-level science and technology R&D classification system using the word similarity model; constructing, by the operating computer, a science and technology knowledge information similarity network by adding science and technology knowledge information to the constructed reference science and technology knowledge information similarity network; calculating, by the operating computer, an initial similarity of members in the constructed science and technology knowledge information similarity network using member information, including industry classification, field of interest, field of specialization, and university major, input by the user at the time of membership registration; and recommending, by the operating computer, science and technology knowledge information around members in the science and technology knowledge information similarity network.
8. The method of claim 7,

A science and technology knowledge information recommendation method, characterized in that it further comprises a step in which the operating computer resets the member's similarity in the science and technology knowledge information similarity network by using the member's science and technology knowledge information usage information.
9. The method of claim 7,

The step of constructing the word similarity model comprises:

a step of extracting the main text, excluding unnecessary paragraphs, from a large number of science and technology documents; a step of performing stopword processing on the extracted words; and a step of constructing a science and technology word similarity model through artificial neural network or machine learning training based on the stopword-processed words.
10. The method of claim 7,

The step of recommending the science and technology knowledge information comprises:

a step of calculating the score difference between the member node and other nodes in a member information storage unit storing the per-node score information of the member; a step of determining the number of nodes to be extracted according to the calculated score difference; a step of determining the node depth to be used according to the calculated score difference; a step of filtering the extracted science and technology knowledge information using various conditions; and a step of recommending the filtered science and technology knowledge information, including patent and thesis information.