CN112632954A - Method and device for acquiring technical similarity of mechanisms - Google Patents

Info

Publication number
CN112632954A
Authority
CN
China
Prior art keywords
similarity
target
technical
acquiring
mechanisms
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011596692.2A
Other languages
Chinese (zh)
Inventor
蔡超
武学敏
杨万征
程国艮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Global Tone Communication Technology Co ltd
Original Assignee
Global Tone Communication Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Global Tone Communication Technology Co ltd
Priority to CN202011596692.2A
Publication of CN112632954A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files

Abstract

The embodiments of the invention provide a method and a device for acquiring the technical similarity of institutions. The method comprises: acquiring one or more of the patent similarity, paper similarity, researcher similarity, product similarity and organizational structure similarity of two target institutions; and acquiring the technical similarity of the two target institutions according to the one or more similarities so acquired. Because the technical similarity is derived from at least one of these five similarities, the similarity of institutions is described through objective data and can be acquired more comprehensively, accurately and objectively, which can improve the accuracy and comprehensiveness of searches for similar institutions.

Description

Method and device for acquiring technical similarity of mechanisms
Technical Field
The embodiments of the invention relate to the technical field of data analysis, and in particular to a method and a device for acquiring the technical similarity of institutions.
Background
Scientific research institutions, including research-oriented companies, schools, research institutes and the like, are the main bodies of technological innovation. For purposes such as scientific research cooperation, the degree of technical similarity between two institutions needs to be evaluated.
The prior art cannot compare the technical similarity of institutions comprehensively, accurately and objectively, so how to acquire the technical similarity of institutions is a technical problem that urgently needs to be solved in the field.
Disclosure of Invention
The embodiments of the invention provide a method and a device for acquiring the technical similarity of institutions, which solve or at least partially solve the defect that the prior art cannot acquire the technical similarity of institutions comprehensively, accurately and objectively.
In a first aspect, an embodiment of the present invention provides a method for acquiring the technical similarity of institutions, including:
acquiring one or more of the patent similarity, paper similarity, researcher similarity, product similarity and organizational structure similarity of two target institutions;
acquiring the technical similarity of the two target institutions according to the one or more of the patent similarity, the paper similarity, the researcher similarity, the product similarity and the organizational structure similarity of the two target institutions;
wherein the two target institutions include a first target institution and a second target institution.
Preferably, acquiring the patent similarity of the two target institutions specifically comprises:
acquiring the patent similarity of the two target institutions according to the classification of each patent of the two target institutions;
or acquiring the patent similarity of the two target institutions according to the content of each patent of the two target institutions.
Preferably, acquiring the paper similarity of the two target institutions specifically comprises:
acquiring the paper similarity of the two target institutions according to the classification of each paper of the two target institutions;
or acquiring the paper similarity of the two target institutions according to the content of each paper of the two target institutions.
Preferably, acquiring the researcher similarity of the two target institutions specifically comprises:
acquiring the researcher similarity of the two target institutions according to the classification of the technical documents of the researchers of the two target institutions;
or acquiring the researcher similarity of the two target institutions according to the content of each technical document of each researcher of the two target institutions.
Preferably, acquiring the product similarity of the two target institutions specifically comprises:
acquiring the product similarity of the two target institutions according to the content of the introduction texts of the products of the two target institutions.
Preferably, acquiring the organizational structure similarity of the two target institutions specifically comprises:
acquiring the organizational structure similarity of the two target institutions according to the classification of the internal departments of the two target institutions.
Preferably, acquiring the technical similarity of the two target institutions according to at least two of the patent similarity, the paper similarity, the researcher similarity, the product similarity and the organizational structure similarity of the two target institutions specifically includes:
acquiring the technical similarity of the two target institutions according to at least two of the patent similarity, the paper similarity, the researcher similarity, the product similarity and the organizational structure similarity of the two target institutions, together with preset weights.
In a second aspect, an embodiment of the present invention provides a device for acquiring the technical similarity of institutions, including:
a dimension similarity acquisition module, used for acquiring at least two of the patent similarity, paper similarity, researcher similarity, product similarity and organizational structure similarity of two target institutions;
a comprehensive similarity acquisition module, used for acquiring the technical similarity of the two target institutions according to at least two of the patent similarity, the paper similarity, the researcher similarity, the product similarity and the organizational structure similarity of the two target institutions;
wherein the two target institutions include a first target institution and a second target institution.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when executing the computer program, the method for obtaining mechanism technical similarity provided in any one of the various possible implementations of the first aspect is implemented.
In a fourth aspect, embodiments of the present invention provide a non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the steps of the method for obtaining technical similarities of mechanisms as provided in any one of the various possible implementations of the first aspect.
According to the method and the device for acquiring the technical similarity of institutions provided by the embodiments of the invention, the technical similarity of the two target institutions is acquired from at least one of their patent similarity, paper similarity, researcher similarity, product similarity and organizational structure similarity. The similarity of institutions is thus described through objective data and can be acquired more comprehensively, accurately and objectively, which can improve the accuracy and comprehensiveness of searches for similar institutions.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a method for obtaining technical similarity of a mechanism according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an apparatus for acquiring technical similarity of mechanisms according to an embodiment of the present invention;
FIG. 3 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the embodiments of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience in describing the embodiments of the present invention and simplifying the description, but do not indicate or imply that the referred devices or elements must have specific orientations, be configured in specific orientations, and operate, and thus, should not be construed as limiting the embodiments of the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the embodiments of the present invention, it should be noted that, unless explicitly stated or limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. Specific meanings of the above terms in the embodiments of the present invention can be understood in specific cases by those of ordinary skill in the art.
Fig. 1 is a schematic flow chart of a method for acquiring the technical similarity of institutions according to an embodiment of the present invention. As shown in fig. 1, the method includes: step S101, acquiring one or more of the patent similarity, paper similarity, researcher similarity, product similarity and organizational structure similarity of two target institutions.
The two target institutions include a first target institution and a second target institution.
Specifically, at least two of the patent similarity, the paper similarity, the researcher similarity, the product similarity and the organizational structure similarity of the two target institutions can be obtained by analyzing dimensions of the institutions such as their scientific research output documents (including patents, papers, research reports and the like), researchers, products and organizational structure, so that a comprehensive technical similarity evaluation can be performed.
Patent similarity, paper similarity, researcher similarity, product similarity and organizational structure similarity can each be regarded as one dimension.
Patent similarity describes the similarity between the patents published by the two target institutions.
Paper similarity describes the similarity between the papers published by the two target institutions.
Researcher similarity describes the similarity between the scientific and technical literature published by the researchers of the two target institutions.
Product similarity describes the similarity between the products produced by the two target institutions.
Organizational structure similarity describes the similarity between the organizational structures of the two target institutions.
Step S102, acquiring technical similarity of the two target institutions according to one or more of patent similarity, thesis similarity, researcher similarity, product similarity and organization structure similarity of the two target institutions.
Specifically, the similarities of at least two dimensions obtained in step S101 may be fused, and the technical similarities of the two target mechanisms may be obtained through combined calculation.
The fusion method may include, but is not limited to, obtaining a maximum value, an average value, a weighted average value, or the like of the similarity of the at least two dimensions as the technical similarity of the two target mechanisms.
It is understood that after the technical similarity of the two target mechanisms is obtained, if it is determined that the technical similarity of the two target mechanisms is greater than a preset similarity threshold, the two target mechanisms may be determined as similar mechanisms. The similarity threshold may be set according to actual conditions, and the value thereof is not specifically limited in the embodiments of the present invention.
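The fusion step and the threshold check described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `fuse_similarities` and `are_similar`, the dimension labels, the example weights and the assumption that every per-dimension similarity lies in [0, 1] are all hypothetical.

```python
from statistics import mean


def fuse_similarities(dim_sims, method="weighted", weights=None):
    """Fuse per-dimension similarities into one technical similarity.

    dim_sims maps a dimension name to its similarity (assumed in [0, 1]).
    method is "max", "mean" or "weighted" (weighted average).
    """
    if not dim_sims:
        raise ValueError("at least one dimension similarity is required")
    values = list(dim_sims.values())
    if method == "max":
        return max(values)
    if method == "mean":
        return mean(values)
    if method == "weighted":
        if weights is None:
            raise ValueError("weights are required for the weighted average")
        total = sum(weights[name] for name in dim_sims)
        if total <= 0:
            raise ValueError("weights must sum to a positive value")
        # Normalise by the weight total so the result stays in [0, 1].
        return sum(weights[name] * sim for name, sim in dim_sims.items()) / total
    raise ValueError(f"unknown fusion method: {method}")


def are_similar(technical_similarity, threshold):
    """Two institutions count as similar when the score exceeds the threshold."""
    return technical_similarity > threshold


# Hypothetical per-dimension similarities and preset weights.
sims = {"patent": 0.8, "paper": 0.6, "organization": 0.5}
weights = {"patent": 0.5, "paper": 0.25, "organization": 0.25}
score = fuse_similarities(sims, "weighted", weights)
```

Dividing by the weight total means the preset weights do not have to sum to one; that normalisation is a design choice of this sketch rather than something the text specifies.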
According to the embodiment of the invention, the technical similarity of the two target mechanisms is obtained through at least one of the patent similarity, the thesis similarity, the researcher similarity, the product similarity and the organization structure similarity of the two target mechanisms, the similarity of the mechanisms can be comprehensively described through objective data, the technical similarity of the mechanisms can be more comprehensively, accurately and objectively obtained, and therefore, the accuracy and the comprehensiveness of searching of the similar mechanisms can be improved.
Based on the content of the above embodiments, acquiring the patent similarity of the two target institutions specifically includes: acquiring the patent similarity of the two target institutions according to the classification of each patent of the two target institutions; or acquiring the patent similarity of the two target institutions according to the content of each patent of the two target institutions.
Specifically, the patent similarity of the two target institutions may be acquired according to a simplified calculation method or a content calculation method.
According to the simplified calculation method, acquiring the patent similarity of two target mechanisms specifically comprises the following steps:
acquiring a patent feature vector of each target mechanism according to the classification of the patent of each target mechanism;
and acquiring the patent similarity of the two target mechanisms according to the patent feature vectors of the two target mechanisms.
Specifically, the patents disclosed by the two target entities can be acquired according to data disclosed by the patent offices of various countries or a patent data search engine and the like.
According to the classification of patents (e.g., IPC classification, FC classification), each patent has its class.
Suppose the patent classification table obtained from the patent classification scheme has k classes, denoted $C_1, \ldots, C_k$.
Among all patents published by the first target institution (i.e., institution A), whose total number is denoted $P_A$, the number of patents belonging to class $C_1$ is denoted $P_{A1}$ (if A has no patent in class $C_1$, then $P_{A1} = 0$), and so on, to obtain $P_{A2}, \ldots, P_{Ak}$.
Among all patents published by the second target institution (i.e., institution B), whose total number is denoted $P_B$, the number of patents belonging to class $C_1$ is denoted $P_{B1}$ (if B has no patent in class $C_1$, then $P_{B1} = 0$), and so on, to obtain $P_{B2}, \ldots, P_{Bk}$.
The vectors $(P_{A1}/P_A, \ldots, P_{Ak}/P_A)$ and $(P_{B1}/P_B, \ldots, P_{Bk}/P_B)$ are taken as the patent feature vectors of the two target institutions, respectively.
The similarity between the patent feature vectors of the two target institutions can be obtained and used as the patent similarity $X_1$ of the two target institutions.
The similarity between the vectors can be obtained by calculating the cosine similarity, Euclidean distance, Hamming distance, Mahalanobis distance, or the like between the vectors.
Preferably, in the embodiment of the present invention, the cosine similarity between the patent feature vectors of the two target institutions is taken as the patent similarity $X_1$ of the two target institutions.
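The simplified calculation method can be illustrated with a short sketch: build each institution's vector of per-class patent shares, then take the cosine similarity of the two vectors. The helper names, the three example classes and the patent lists are hypothetical, and the sketch assumes each patent carries exactly one class label, which the text does not state.

```python
import math
from collections import Counter


def class_share_vector(doc_classes, all_classes):
    """Return the share of documents in each class, in the order of all_classes."""
    counts = Counter(doc_classes)
    total = len(doc_classes)
    if total == 0:
        return [0.0] * len(all_classes)
    # A class with no documents gets a count of 0, hence a share of 0.
    return [counts[c] / total for c in all_classes]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical classification table and one class label per patent.
classes = ["G06F", "H04L", "A61K"]
patents_a = ["G06F", "G06F", "H04L", "G06F"]
patents_b = ["G06F", "H04L", "H04L", "A61K"]

vec_a = class_share_vector(patents_a, classes)
vec_b = class_share_vector(patents_b, classes)
x1 = cosine_similarity(vec_a, vec_b)
```

Because the vectors hold shares rather than raw counts, an institution with many patents and one with few are compared on the distribution of their work across classes, not on volume.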
According to the content calculation method, acquiring the patent similarity of two target mechanisms specifically comprises the following steps:
acquiring content similarity between each patent of a first target institution and each patent of a second target institution;
and acquiring the patent similarity of the two target organizations according to the content similarity between each patent of the first target organization and each patent of the second target organization.
Specifically, all patents published by institution A may be denoted $\{p_{a1}, \ldots, p_{am}\}$, and all patents published by institution B may be denoted $\{p_{b1}, \ldots, p_{bn}\}$, where m and n are the total numbers of patents published by institution A and institution B, respectively.
For $p_{ai}$ and $p_{bj}$ ($1 \le i \le m$, $1 \le j \le n$), a text-content similarity method such as a bag-of-words method or a semantic space method can be used to obtain the similarity between the text contents of $p_{ai}$ and $p_{bj}$, which is taken as the content similarity $s_{ij}$ of $p_{ai}$ and $p_{bj}$. An m by n matrix S is thereby obtained.
By applying linear or nonlinear processing to the elements of the matrix S, the patent similarity $X_1$ of the two target institutions can be obtained.
For example, the patent similarity of the two target institutions can be calculated according to the following formula:
[Formula image BDA0002868121980000081; not reproduced in the text]
According to the embodiment of the invention, the patent similarity of the two target institutions is obtained by the simplified calculation method or the content calculation method, which describes the similarity of the patents published by the institutions more accurately, so that the technical similarity of institutions can be obtained more comprehensively, accurately and objectively on the basis of the patent similarity of the target institutions.
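The content calculation method can be sketched as below: a bag-of-words cosine similarity fills the m by n matrix S, and one aggregation step reduces S to a single score. The aggregation shown (the average of each row's and each column's best match) is only an assumed example of "linear or nonlinear processing"; the patent's own formula is not available in this text. All function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def bow_cosine(a, b):
    """Cosine similarity between two token-count bags."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(docs_a, docs_b):
    """Build the m x n matrix S, where S[i][j] compares docs_a[i] with docs_b[j]."""
    bags_a = [bag_of_words(d) for d in docs_a]
    bags_b = [bag_of_words(d) for d in docs_b]
    return [[bow_cosine(x, y) for y in bags_b] for x in bags_a]


def aggregate_best_match(s):
    """Assumed aggregation: mean of every row maximum and every column maximum."""
    if not s or not s[0]:
        return 0.0
    row_best = [max(row) for row in s]
    col_best = [max(col) for col in zip(*s)]
    return (sum(row_best) + sum(col_best)) / (len(row_best) + len(col_best))


# Hypothetical patent texts for institutions A and B.
docs_a = ["neural machine translation model"]
docs_b = ["neural machine translation model", "steel bridge design"]
S = similarity_matrix(docs_a, docs_b)
x1 = aggregate_best_match(S)
```

Other reductions of S, such as the plain mean of all elements, fit the same description; the best-match average is used here only because it is less diluted by unrelated document pairs.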
Based on the content of the above embodiments, acquiring the paper similarity of the two target institutions specifically includes: acquiring the paper similarity of the two target institutions according to the classification of each paper of the two target institutions; or acquiring the paper similarity of the two target institutions according to the content of each paper of the two target institutions.
Specifically, the paper similarity of the two target institutions can be acquired according to a simplified calculation method or a content calculation method.
According to the simplified calculation method, acquiring the paper similarity of the two target institutions specifically comprises the following steps:
acquiring a paper feature vector of each target institution according to the classification of the papers of each target institution;
and acquiring the paper similarity of the two target institutions according to the paper feature vectors of the two target institutions.
Specifically, the papers published by the two target institutions can be obtained according to the data disclosed by each journal or a paper data search engine and the like.
According to a literature classification scheme (e.g., the Chinese Library Classification), each paper has its own class.
Suppose the paper classification table obtained from the paper classification scheme has k classes, denoted $C_1, \ldots, C_k$.
Among all papers published by the first target institution (i.e., institution A), whose total number is denoted $L_A$, the number of papers belonging to class $C_1$ is denoted $L_{A1}$ (if A has no paper in class $C_1$, then $L_{A1} = 0$), and so on, to obtain $L_{A2}, \ldots, L_{Ak}$.
Among all papers published by the second target institution (i.e., institution B), whose total number is denoted $L_B$, the number of papers belonging to class $C_1$ is denoted $L_{B1}$ (if B has no paper in class $C_1$, then $L_{B1} = 0$), and so on, to obtain $L_{B2}, \ldots, L_{Bk}$.
The vectors $(L_{A1}/L_A, \ldots, L_{Ak}/L_A)$ and $(L_{B1}/L_B, \ldots, L_{Bk}/L_B)$ are taken as the paper feature vectors of the two target institutions, respectively.
The similarity between the paper feature vectors of the two target institutions can be obtained and used as the paper similarity $X_2$ of the two target institutions.
The similarity between the vectors can be obtained by calculating the cosine similarity, Euclidean distance, Hamming distance, Mahalanobis distance, or the like between the vectors.
Preferably, in the embodiment of the present invention, the cosine similarity between the paper feature vectors of the two target institutions is taken as the paper similarity $X_2$ of the two target institutions.
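When one of the distance measures listed above is used instead of cosine similarity, the distance still has to be turned into a similarity, since a larger distance means a lower similarity. The sketch below uses the Euclidean distance and the mapping 1 / (1 + d). That mapping is an assumption chosen for illustration, as the text does not say how a distance is converted, and the paper feature vectors are hypothetical.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_to_similarity(distance):
    """Assumed mapping: distance 0 gives similarity 1, large distances approach 0."""
    return 1.0 / (1.0 + distance)


# Hypothetical paper feature vectors (shares of papers per class).
paper_vec_a = [0.5, 0.5, 0.0]
paper_vec_b = [0.5, 0.0, 0.5]

d = euclidean_distance(paper_vec_a, paper_vec_b)
x2 = distance_to_similarity(d)
```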
According to the content calculation method, acquiring the thesis similarity of two target institutions specifically comprises the following steps:
acquiring content similarity between each paper of a first target institution and each paper of a second target institution;
and acquiring the paper similarity of the two target institutions according to the content similarity between each paper of the first target institution and each paper of the second target institution.
Specifically, all papers published by institution A may be denoted $\{l_{a1}, \ldots, l_{am}\}$, and all papers published by institution B may be denoted $\{l_{b1}, \ldots, l_{bn}\}$, where m and n are the total numbers of papers published by institution A and institution B, respectively.
For $l_{ai}$ and $l_{bj}$ ($1 \le i \le m$, $1 \le j \le n$), a text-content similarity method such as a bag-of-words method or a semantic space method can be used to obtain the similarity between the text contents of $l_{ai}$ and $l_{bj}$, which is taken as the content similarity $s_{ij}$ of $l_{ai}$ and $l_{bj}$. An m by n matrix S is thereby obtained.
By applying linear or nonlinear processing to the elements of the matrix S, the paper similarity $X_2$ of the two target institutions can be obtained.
For example, the paper similarity of the two target institutions can be calculated according to the following formula:
[Formula image BDA0002868121980000101; not reproduced in the text]
According to the embodiment of the invention, the paper similarity of the two target institutions is obtained by the simplified calculation method or the content calculation method, which describes the similarity of the papers published by the institutions more accurately, so that the technical similarity of institutions can be obtained more comprehensively, accurately and objectively on the basis of the paper similarity of the target institutions.
Based on the content of the above embodiments, acquiring the researcher similarity of the two target institutions specifically includes: acquiring the researcher similarity of the two target institutions according to the classification of the technical documents of the researchers of the two target institutions; or acquiring the researcher similarity of the two target institutions according to the contents of the technical documents of the researchers of the two target institutions.
Specifically, similarities between the scientific literature of each scientific researcher of the first target institution and the scientific literature of each scientific researcher of the second target institution may be obtained.
The lists of scientific researchers for two target institutions may be obtained in advance.
Scientific and technical literature of each scientific research personnel of each target institution can be obtained according to network public information and/or data provided by related scientific and technical literature service providers.
The scientific and technical literature of a researcher may include, but is not limited to, papers, patents, standards, reports, books and other science-related works that the researcher has published or made public.
The scientific and technical literature of a researcher need not be limited to works that the researcher published or made public while at the current target institution.
For each researcher U of institution A and each researcher V of institution B, each researcher may be regarded as an institution, and the similarity between the scientific and technical literature of U and that of V is obtained according to the simplified calculation method or the content calculation method described above.
This similarity can be regarded as the technical similarity between researcher U and researcher V.
And acquiring the similarity of the scientific research personnel of the two target institutions according to the similarity between the scientific literature of each scientific research personnel of the first target institution and the scientific literature of each scientific research personnel of the second target institution.
Based on the similarity between the scientific literature of each of the researchers at the first target institution and the scientific literature of each of the researchers at the second target institution, a matrix may be constructed in which the elements represent the similarity between the scientific literature of one of the researchers at the first target institution and the scientific literature of one of the researchers at the second target institution.
By applying linear or nonlinear processing to the elements of the matrix, the researcher similarity $X_3$ of the two target institutions can be obtained.
According to the embodiment of the invention, the similarity of the scientific research personnel of the two target institutions is obtained according to the similarity between the scientific research documents of every two scientific research personnel of the two target institutions, and the similarity of the scientific research personnel of the institutions can be more accurately described, so that the technical similarity of the institutions can be more comprehensively, accurately and objectively obtained based on the similarity of the scientific research personnel of the target institutions.
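The researcher dimension can be sketched by treating each researcher as a small institution: one feature vector per researcher, one similarity per pair (U, V), and a final reduction of the resulting matrix. The sketch uses the simplified (classification-based) method for the pairwise step and a plain mean as the reduction. Both choices, along with the class labels, researcher names and function names, are hypothetical illustrations.

```python
import math
from collections import Counter


def share_vector(doc_classes, all_classes):
    """Share of a researcher's documents falling in each class."""
    counts = Counter(doc_classes)
    total = len(doc_classes)
    if total == 0:
        return [0.0] * len(all_classes)
    return [counts[c] / total for c in all_classes]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def researcher_matrix(staff_a, staff_b, all_classes):
    """Map each pair (U, V) to the similarity of their literature profiles."""
    vecs_a = {u: share_vector(docs, all_classes) for u, docs in staff_a.items()}
    vecs_b = {v: share_vector(docs, all_classes) for v, docs in staff_b.items()}
    return {(u, v): cosine(vecs_a[u], vecs_b[v]) for u in vecs_a for v in vecs_b}


def mean_of_elements(matrix):
    """Assumed linear processing: average of all pairwise similarities."""
    return sum(matrix.values()) / len(matrix) if matrix else 0.0


# Hypothetical class labels and per-researcher document classes.
classes = ["TP", "TN", "R"]
staff_a = {"U1": ["TP", "TP"], "U2": ["R"]}
staff_b = {"V1": ["TP"], "V2": ["TN"]}

matrix = researcher_matrix(staff_a, staff_b, classes)
x3 = mean_of_elements(matrix)
```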
Based on the content of the above embodiments, acquiring the product similarity of the two target institutions specifically includes: acquiring the product similarity of the two target institutions according to the content of the introduction text of each product of the two target institutions.
Specifically, the product similarity of the two target institutions may be acquired according to the content calculation method.
First, the content similarity between the introduction text of each product of the first target institution and the introduction text of each product of the second target institution is acquired.
From publicly available network information, the main products of each target institution and the introduction texts of those products can be obtained.
For each main product Y of institution A and each main product Z of institution B, a text-content similarity method such as a bag-of-words method or a semantic space method can be used to obtain the similarity between the introduction texts of product Y and product Z, which is taken as the similarity of product Y and product Z. A matrix is thereby obtained.
Then, based on the content similarity between the introduction text of each product of the first target institution and the introduction text of each product of the second target institution, the product similarity of the two target institutions can be obtained.
Specifically, by applying linear or nonlinear processing to the elements of the matrix, the product similarity $X_4$ of the two target institutions can be obtained.
According to the embodiment of the invention, the product similarity of two target mechanisms is obtained through a content calculation method, and the similarity of products produced by the mechanisms can be more accurately described, so that the technical similarity of the mechanisms can be more comprehensively, accurately and objectively obtained based on the product similarity of the target mechanisms.
Based on the content of the above embodiments, acquiring the organizational structure similarity of the two target institutions specifically includes: acquiring the organizational structure similarity of the two target institutions according to the classification of the internal departments of the two target institutions.
Specifically, the organizational structure similarity of the two target institutions can be obtained according to the simplified calculation method.
First, the organizational structure feature vector of each target institution is obtained according to the classification of its internal departments.
The information of the departments arranged in the two target institutions can be obtained according to the ways of network public information and the like.
Similar parts can be combined according to the name lists of the departments which are commonly set in the organization to form an organization classification list.
Obtaining k department categories, C respectively, according to the mechanism classification table1、……、Ck
For the first target mechanism (i.e., mechanism A), if there is a mechanism belonging to department class C1The interior department of (1), note QA1If not, Q is notedA1On the other hand, Q is obtainedA2、……、QAk
For the second target mechanism (i.e., mechanism B), if there is a mechanism belonging to department class C1The interior department of (1), note QB1If not, Q is notedB1On the other hand, Q is obtainedB2、……、QBk
Will vector (Q)A1,……,QAk) Sum vector (Q)B1,……,QBk) Respectively as the tissue structure feature vectors of the two target mechanisms.
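As a rough sketch of how such a feature vector might be built, the snippet below maps each internal department to a category through a classification table and marks each of the k categories as present or absent. It assumes a 1/0 indicator; the table contents, department names and the function name `org_feature_vector` are hypothetical examples, not taken from the patent.

```python
def org_feature_vector(departments, category_of, categories):
    """Return (Q_1, ..., Q_k), where Q_i is 1 if any department falls in C_i, else 0."""
    present = {category_of[d] for d in departments if d in category_of}
    return [1 if c in present else 0 for c in categories]


# Hypothetical classification table: similar department names share one category.
CATEGORIES = ["R&D", "Sales", "Manufacturing", "Finance"]
CATEGORY_OF = {
    "Research Institute": "R&D",
    "Technology Center": "R&D",
    "Marketing Department": "Sales",
    "Sales Department": "Sales",
    "Factory": "Manufacturing",
    "Accounting Department": "Finance",
}

if __name__ == "__main__":
    q_a = org_feature_vector(["Research Institute", "Sales Department"], CATEGORY_OF, CATEGORIES)
    q_b = org_feature_vector(["Technology Center", "Factory"], CATEGORY_OF, CATEGORIES)
    print(q_a, q_b)
```

Departments that do not appear in the table are ignored here; a real system would need a policy for unmapped names.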
Then, the organizational structure similarity of the two target institutions can be obtained from their organizational structure feature vectors.
Specifically, the similarity between the organizational structure feature vectors of the two target institutions may be taken as the organizational structure similarity X_5 of the two target institutions.
The similarity between the vectors can be obtained by calculating the cosine similarity, Euclidean distance, Hamming distance, Mahalanobis distance, or the like between the vectors.
Preferably, in the embodiment of the present invention, the cosine similarity between the organizational structure feature vectors of the two target institutions is taken as the organizational structure similarity X_5 of the two target institutions.
According to the embodiment of the invention, the organizational structure similarity of the two target institutions is obtained by a simplified calculation, which describes the similarity of the institutions' organizational structures more accurately, so that the technical similarity of the institutions can be obtained more comprehensively, accurately and objectively on the basis of the organizational structure similarity.
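A minimal sketch of the vector comparison follows, assuming binary feature vectors of equal length. `cosine_similarity` corresponds to the preferred measure named in the text. `hamming_similarity` shows one way a distance could be turned into a similarity (one minus the normalized Hamming distance); that conversion is an assumption of this sketch, since the text lists the distances without saying how they are mapped to a similarity value.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def hamming_similarity(u, v):
    """Assumed conversion: 1 minus the fraction of positions where the vectors differ."""
    if len(u) != len(v) or not u:
        raise ValueError("vectors must be non-empty and of the same length")
    differing = sum(1 for a, b in zip(u, v) if a != b)
    return 1.0 - differing / len(u)


if __name__ == "__main__":
    q_a = [1, 1, 0, 0]  # hypothetical feature vector of institution A
    q_b = [1, 0, 1, 0]  # hypothetical feature vector of institution B
    print(cosine_similarity(q_a, q_b), hamming_similarity(q_a, q_b))
```

Note that cosine similarity ignores categories absent from both institutions, whereas the Hamming-based value counts shared absences as agreement, so the two measures can rank institution pairs differently.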
Based on the content of each of the above embodiments, acquiring the technical similarity of the two target institutions according to at least two of the patent similarity, the thesis similarity, the researcher similarity, the product similarity and the organizational structure similarity of the two target institutions specifically includes: acquiring the technical similarity of the two target institutions according to preset weights and at least two of the patent similarity, the thesis similarity, the researcher similarity, the product similarity and the organizational structure similarity of the two target institutions.
Specifically, a corresponding weight may be set in advance for the similarity of each dimension, and after the similarity of at least two dimensions is obtained through step S101, weighted summation or weighted average may be performed according to the similarity of at least two dimensions and the corresponding weight thereof, so as to obtain the technical similarity of two target mechanisms.
The weight corresponding to the similarity of each dimension can be preset according to the actual situation.
The weights may be set according to human experience, or fitted by a regression method using manually labeled samples of the technical similarity between institutions.
A labeled sample takes the form sim(A, B) = 0.8. The similarity value (i.e., the technical similarity) of a sample may be an empirical value.
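One way the regression could look is an ordinary least-squares fit of the weights to the labeled samples. The sketch below is written under several assumptions not stated in the source: a linear model without intercept, no constraint that the weights be non-negative or sum to one, and a plain normal-equation solver in pure Python. The names `fit_weights` and `solve_linear_system` are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a*w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: the samples do not determine the weights")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def fit_weights(samples):
    """samples: list of (dimension_similarities, labeled_technical_similarity).

    Solves the normal equations (X^T X) w = X^T y for the weight vector w.
    """
    k = len(samples[0][0])
    xtx = [[sum(x[i] * x[j] for x, _ in samples) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in samples) for i in range(k)]
    return solve_linear_system(xtx, xty)


if __name__ == "__main__":
    # Hypothetical labeled samples over two dimensions (e.g. patent and thesis similarity).
    labeled = [([1.0, 0.0], 0.6), ([0.0, 1.0], 0.4), ([0.5, 0.5], 0.5)]
    print(fit_weights(labeled))
```

In practice more labeled pairs than dimensions are needed, and a constrained or regularized fit may be preferable when the labels are noisy empirical values.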
If all five dimensions (patent similarity, thesis similarity, researcher similarity, product similarity and organizational structure similarity) are fused, the technical similarity sim(A, B) of institution A and institution B is calculated as
sim(A,B) = α_1*X_1 + α_2*X_2 + α_3*X_3 + α_4*X_4 + α_5*X_5
where X_1, X_2, X_3, X_4 and X_5 respectively represent the patent similarity, thesis similarity, researcher similarity, product similarity and organizational structure similarity of institution A and institution B, and α_1, α_2, α_3, α_4 and α_5 respectively represent the weights corresponding to X_1, X_2, X_3, X_4 and X_5.
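The five-dimension weighted sum can be written directly in code. The sketch below is only a worked illustration: the dimension keys, the similarity values and the weights are hypothetical numbers chosen for the example.

```python
DIMENSIONS = ("patent", "thesis", "researcher", "product", "org_structure")


def technical_similarity(similarities, weights):
    """Weighted sum of the five per-dimension similarities: sum of alpha_i * X_i."""
    return sum(weights[d] * similarities[d] for d in DIMENSIONS)


if __name__ == "__main__":
    # Hypothetical per-dimension similarities X_1..X_5 for institutions A and B.
    x = {"patent": 0.8, "thesis": 0.6, "researcher": 0.5, "product": 0.7, "org_structure": 0.9}
    # Hypothetical preset weights alpha_1..alpha_5 (chosen here to sum to 1).
    alpha = {"patent": 0.3, "thesis": 0.25, "researcher": 0.2, "product": 0.15, "org_structure": 0.1}
    print(round(technical_similarity(x, alpha), 3))
```

With these example numbers the terms are 0.24, 0.15, 0.10, 0.105 and 0.09, so the result is about 0.685. When the weights sum to 1 and every per-dimension similarity lies in [0, 1], the fused value also stays in [0, 1].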
According to the embodiment of the invention, the technical similarity of the two target mechanisms is obtained according to at least two of the patent similarity, the thesis similarity, the researcher similarity, the product similarity and the organization structure similarity of the two target mechanisms and the preset weight, the similarity of the mechanisms can be comprehensively described through objective data, the technical similarity of the mechanisms can be more comprehensively, accurately and objectively obtained, and therefore, the accuracy and the comprehensiveness of searching of the similar mechanisms can be improved.
Fig. 2 is a schematic structural diagram of an apparatus for acquiring technical similarity of a mechanism according to an embodiment of the present invention. Based on the content of the foregoing embodiments, as shown in fig. 2, the apparatus includes a dimension similarity obtaining module 201 and an integrated similarity obtaining module 202, where:
the dimension similarity acquisition module 201 is used for acquiring one or more of patent similarity, thesis similarity, researcher similarity, product similarity and organizational structure similarity of two target institutions;
the comprehensive similarity obtaining module 202 is configured to obtain technical similarities of the two target institutions according to one or more of patent similarities, thesis similarities, scientific research personnel similarities, product similarities and organizational structure similarities of the two target institutions;
wherein the two target mechanisms include a first target mechanism and a second target mechanism.
Specifically, the dimension similarity acquisition module 201 and the comprehensive similarity acquisition module 202 are electrically connected.
The dimension similarity obtaining module 201 may analyze the dimensions of scientific research output documents (including patents, treatises, research reports, and the like), scientific researchers, products, and organization structures of the organizations, and obtain at least one of the patent similarity, treatise similarity, scientific researcher similarity, product similarity, and organization structure similarity of two target organizations, so as to perform comprehensive technical similarity evaluation.
The comprehensive similarity obtaining module 202 may fuse the similarities of at least two dimensions obtained by the dimension similarity obtaining module 201, and obtain the technical similarity of the two target mechanisms through combined calculation.
The fusion method may include, but is not limited to, obtaining a maximum value, an average value, a weighted average value, or the like of the similarity of the at least two dimensions as the technical similarity of the two target mechanisms.
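The three fusion options named above (maximum, average, weighted average) might be sketched as a single function over whichever dimensions are available. The function name `fuse`, the method labels and the dimension keys are assumptions made for illustration; the weighted average here is normalized by the weights of the dimensions actually supplied.

```python
def fuse(similarities, method="mean", weights=None):
    """Fuse per-dimension similarities (a dict: dimension name -> value) into one score."""
    if not similarities:
        raise ValueError("at least one dimension similarity is required")
    values = list(similarities.values())
    if method == "max":
        return max(values)
    if method == "mean":
        return sum(values) / len(values)
    if method == "weighted_mean":
        if weights is None:
            raise ValueError("weighted_mean requires weights")
        total = sum(weights[d] for d in similarities)
        if total <= 0:
            raise ValueError("weights of the supplied dimensions must sum to a positive value")
        return sum(weights[d] * x for d, x in similarities.items()) / total
    raise ValueError(f"unknown fusion method: {method}")


if __name__ == "__main__":
    sims = {"patent": 0.8, "thesis": 0.4}      # hypothetical values
    print(fuse(sims, "max"))
    print(fuse(sims, "mean"))
    print(fuse(sims, "weighted_mean", {"patent": 3.0, "thesis": 1.0}))
```

Taking the maximum emphasizes the single strongest link between two institutions, while the averages reward agreement across several dimensions.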
It should be noted that the dimension similarity obtaining module 201 may include at least two of a patent sub-module, a thesis sub-module, a researcher sub-module, a product sub-module, and an organization sub-module.
The patent submodule can comprise a first simplified calculation unit, which is used for acquiring the patent feature vector of each target mechanism according to the classification of the patent of each target mechanism; and acquiring the patent similarity of the two target mechanisms according to the patent feature vectors of the two target mechanisms.
The patent sub-module may further include a first content calculation unit for acquiring content similarity between each patent of the first target institution and each patent of the second target institution; and acquiring the patent similarity of the two target organizations according to the content similarity between each patent of the first target organization and each patent of the second target organization.
The thesis sub-module may include a second simplified calculation unit, configured to obtain a thesis feature vector of each target institution according to a classification to which the thesis of each target institution belongs; and acquiring the thesis similarity of the two target institutions according to the thesis feature vectors of the two target institutions.
The thesis sub-module may further include a second content calculation unit configured to acquire content similarity between each thesis of the first target institution and each thesis of the second target institution; and acquiring the paper similarity of the two target institutions according to the content similarity between each paper of the first target institution and each paper of the second target institution.
The scientific research personnel submodule is used for acquiring the similarity between the scientific and technological literature of each scientific research personnel of the first target institution and the scientific and technological literature of each scientific research personnel of the second target institution; and acquiring the similarity of the scientific research personnel of the two target institutions according to the similarity between the scientific literature of each scientific research personnel of the first target institution and the scientific literature of each scientific research personnel of the second target institution.
The product submodule is used for acquiring content similarity between the introduction text of each product of the first target institution and the introduction text of each product of the second target institution; and acquiring the product similarity of the two target organizations according to the content similarity between the introduction text of each product of the first target organization and the introduction text of each product of the second target organization.
The organizational structure sub-module is used for acquiring the organizational structure feature vector of each target institution according to the categories to which the internal departments of each target institution belong, and for acquiring the organizational structure similarity of the two target institutions according to the organizational structure feature vectors of the two target institutions.
The comprehensive similarity obtaining module 202 is specifically configured to obtain technical similarities of the two target institutions according to at least two of patent similarities, thesis similarities, scientific research personnel similarities, product similarities and organizational structure similarities of the two target institutions and preset weights.
The apparatus for acquiring technical similarity of a mechanism according to the embodiments of the present invention is configured to execute the method for acquiring technical similarity of a mechanism according to the embodiments of the present invention, and specific methods and processes for implementing corresponding functions by modules included in the apparatus for acquiring technical similarity of a mechanism are described in the embodiments of the method for acquiring technical similarity of a mechanism, and are not described herein again.
The device for acquiring the technical similarity of the mechanism is used for the method for acquiring the technical similarity of the mechanism in each embodiment. Therefore, the description and definition in the method for acquiring the technical similarity of the mechanism in the foregoing embodiments can be used for understanding the execution modules in the embodiments of the present invention.
According to the embodiment of the invention, the technical similarity of the two target mechanisms is obtained through at least two of the patent similarity, the thesis similarity, the researcher similarity, the product similarity and the organization structure similarity of the two target mechanisms, the similarity of the mechanisms can be comprehensively described through objective data, the technical similarity of the mechanisms can be more comprehensively, accurately and objectively obtained, and therefore, the accuracy and the comprehensiveness of searching of the similar mechanisms can be improved.
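The two-module structure of the apparatus could be mirrored in software roughly as below. This is a structural sketch only: the class names, the dictionary-based institution record and the Jaccard overlap used as a stand-in for each sub-module are all assumptions of this example and are not the similarity calculations described in the method embodiments.

```python
from typing import Callable, Dict

Institution = Dict[str, set]                       # hypothetical record: field -> set of labels
SubModule = Callable[[Institution, Institution], float]


def jaccard(field: str) -> SubModule:
    """Placeholder sub-module: set overlap of one field of the two institutions."""
    def compare(a: Institution, b: Institution) -> float:
        union = a[field] | b[field]
        return len(a[field] & b[field]) / len(union) if union else 0.0
    return compare


class DimensionSimilarityModule:                   # plays the role of module 201
    def __init__(self, sub_modules: Dict[str, SubModule]):
        self.sub_modules = sub_modules

    def acquire(self, a: Institution, b: Institution) -> Dict[str, float]:
        return {name: fn(a, b) for name, fn in self.sub_modules.items()}


class ComprehensiveSimilarityModule:               # plays the role of module 202
    def __init__(self, weights: Dict[str, float]):
        self.weights = weights

    def acquire(self, similarities: Dict[str, float]) -> float:
        return sum(self.weights[n] * x for n, x in similarities.items())


class SimilarityApparatus:
    def __init__(self, dims: DimensionSimilarityModule, fusion: ComprehensiveSimilarityModule):
        self.dims, self.fusion = dims, fusion

    def technical_similarity(self, a: Institution, b: Institution) -> float:
        return self.fusion.acquire(self.dims.acquire(a, b))


if __name__ == "__main__":
    apparatus = SimilarityApparatus(
        DimensionSimilarityModule({"patent": jaccard("ipc"), "org_structure": jaccard("depts")}),
        ComprehensiveSimilarityModule({"patent": 0.6, "org_structure": 0.4}),
    )
    inst_a = {"ipc": {"G06F", "H04L"}, "depts": {"R&D", "Sales"}}
    inst_b = {"ipc": {"G06F"}, "depts": {"R&D", "Sales"}}
    print(apparatus.technical_similarity(inst_a, inst_b))
```

Keeping each dimension behind the same callable signature lets sub-modules be added or removed without touching the fusion step.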
Fig. 3 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention. Based on the content of the above embodiments, as shown in fig. 3, the electronic device may include a processor 301, a memory 302, and a bus 303, where the processor 301 and the memory 302 communicate with each other through the bus 303. The processor 301 is configured to invoke computer program instructions stored in the memory 302 and executable on the processor 301 to perform the method for acquiring the technical similarity of institutions provided by the above method embodiments, including, for example: acquiring one or more of patent similarity, thesis similarity, researcher similarity, product similarity and organizational structure similarity of two target institutions; and acquiring the technical similarity of the two target institutions according to one or more of the patent similarity, thesis similarity, researcher similarity, product similarity and organizational structure similarity of the two target institutions; wherein the two target institutions include a first target institution and a second target institution.
Another embodiment of the present invention discloses a computer program product, the computer program product includes a computer program stored on a non-transitory computer readable storage medium, the computer program includes program instructions, when the program instructions are executed by a computer, the computer can execute the method for acquiring the technical similarity of the mechanism provided by the above-mentioned method embodiments, for example, the method includes: acquiring one or more of patent similarity, thesis similarity, researcher similarity, product similarity and organizational structure similarity of two target institutions; acquiring the technical similarity of the two target institutions according to one or more of patent similarity, thesis similarity, researcher similarity, product similarity and organization structure similarity of the two target institutions; wherein the two target mechanisms include a first target mechanism and a second target mechanism.
Furthermore, the logic instructions in the memory 302 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
Another embodiment of the present invention provides a non-transitory computer-readable storage medium storing computer instructions, which cause a computer to execute the method for obtaining mechanism technical similarity provided by the above method embodiments, for example, the method includes: acquiring one or more of patent similarity, thesis similarity, researcher similarity, product similarity and organizational structure similarity of two target institutions; acquiring the technical similarity of the two target institutions according to one or more of patent similarity, thesis similarity, researcher similarity, product similarity and organization structure similarity of the two target institutions; wherein the two target mechanisms include a first target mechanism and a second target mechanism.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. It is understood that the above-described technical solutions may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method of the above-described embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for acquiring the technical similarity of institutions, comprising:
acquiring one or more of patent similarity, thesis similarity, researcher similarity, product similarity and organizational structure similarity of two target institutions;
acquiring the technical similarity of the two target institutions according to one or more of the patent similarity, the thesis similarity, the researcher similarity, the product similarity and the organization structure similarity of the two target institutions;
wherein the two target mechanisms include a first target mechanism and a second target mechanism.
2. The method for acquiring the technical similarity of the mechanisms according to claim 1, wherein acquiring the patent similarity of the two target mechanisms specifically comprises:
acquiring the technical similarity of the two target mechanisms according to the classification of each patent of the two target mechanisms;
or acquiring the technical similarity of the two target mechanisms according to the content of each patent of the two target mechanisms.
3. The method of claim 1, wherein obtaining the paper similarity of the two target institutions specifically comprises:
acquiring the technical similarity of the two target institutions according to the classification of the papers of the two target institutions;
or acquiring the technical similarity of the two target institutions according to the content of each thesis of the two target institutions.
4. The method for acquiring the technical similarity of the institutions as claimed in claim 1, wherein the acquiring the researcher similarity of the two target institutions specifically comprises:
according to the classification of technical documents of research personnel of the two target mechanisms, the technical similarity of the two target mechanisms is obtained;
or acquiring the technical similarity of the two target mechanisms according to the content of each technical document of each research staff of the two target mechanisms.
5. The method for obtaining the technical similarity of the organizations according to claim 1, wherein the obtaining the product similarity of the two target organizations specifically comprises:
and acquiring the technical similarity of the two target mechanisms according to the content of the introduction texts of the products of the two target mechanisms.
6. The method for acquiring the technical similarity of institutions according to claim 1, wherein acquiring the organizational structure similarity of the two target institutions specifically comprises:
and acquiring the organization structure similarity of the two target mechanisms according to the classification of the built-in departments of the two target mechanisms.
7. The method for acquiring technical similarities of institutions according to any one of claims 1 to 6, wherein the acquiring technical similarities of the two target institutions according to at least two of patent similarities, thesis similarities, researcher similarities, product similarities and organizational structure similarities of the two target institutions specifically comprises:
and acquiring the technical similarity of the two target institutions according to at least two of the patent similarity, the thesis similarity, the researcher similarity, the product similarity and the organization structure similarity of the two target institutions and preset weights.
8. An apparatus for obtaining technical similarity of mechanisms, comprising:
the dimension similarity acquisition module is used for acquiring at least two of patent similarity, thesis similarity, scientific research personnel similarity, product similarity and organizational structure similarity of two target institutions;
the comprehensive similarity acquisition module is used for acquiring the technical similarity of the two target institutions according to at least two of the patent similarity, the thesis similarity, the researcher similarity, the product similarity and the organizational structure similarity of the two target institutions;
wherein the two target mechanisms include a first target mechanism and a second target mechanism.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of acquiring technical similarities of mechanisms according to any one of claims 1 to 7, when the program is executed by the processor.
10. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of acquiring technical similarities of mechanisms according to any one of claims 1 to 7.
CN202011596692.2A 2020-12-29 2020-12-29 Method and device for acquiring technical similarity of mechanisms Pending CN112632954A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011596692.2A CN112632954A (en) 2020-12-29 2020-12-29 Method and device for acquiring technical similarity of mechanisms

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011596692.2A CN112632954A (en) 2020-12-29 2020-12-29 Method and device for acquiring technical similarity of mechanisms

Publications (1)

Publication Number Publication Date
CN112632954A true CN112632954A (en) 2021-04-09

Family

ID=75286328

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011596692.2A Pending CN112632954A (en) 2020-12-29 2020-12-29 Method and device for acquiring technical similarity of mechanisms

Country Status (1)

Country Link
CN (1) CN112632954A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113312895A (en) * 2021-05-20 2021-08-27 北京邮电大学 Organization mapping method and device of autonomous system AS and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060294060A1 (en) * 2003-09-30 2006-12-28 Hiroaki Masuyama Similarity calculation device and similarity calculation program
US20130031018A1 (en) * 2010-03-29 2013-01-31 Harald Jellum Method and arrangement for monitoring companies
CN103823880A (en) * 2014-03-03 2014-05-28 国家认证认可监督管理委员会信息中心 Attribute weight-based method for calculating similarity between detection mechanisms
CN107908626A (en) * 2016-12-30 2018-04-13 上海壹账通金融科技有限公司 The computational methods and device of company's similarity
CN111428152A (en) * 2020-04-26 2020-07-17 中国烟草总公司郑州烟草研究院 Method and device for constructing similar communities of scientific research personnel
CN111597309A (en) * 2020-05-25 2020-08-28 深圳市小满科技有限公司 Similar enterprise recommendation method and device, electronic equipment and medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
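Cai Hong et al.: "Measurement and analysis of technological similarity between China and major innovative countries (regions)", Journal of Xi'an Jiaotong University (Social Sciences) (西安交通大学学报(社会科学版)), vol. 29, no. 5, pages 1 - 6 *
Tan Long et al.: "Construction and application of an innovation similarity model based on similarity indices", 情报探索, no. 8, pages 9 - 15 *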

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination