CN109726401B - Patent combination generation method and system - Google Patents

Publication number
CN109726401B
Authority
CN
China
Prior art keywords
combination
subset
complementarity
index
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910004140.9A
Other languages
Chinese (zh)
Other versions
CN109726401A (en
Inventor
茹丽洁
康飞
李素粉
范云杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd filed Critical China United Network Communications Group Co Ltd
Priority to CN201910004140.9A priority Critical patent/CN109726401B/en
Publication of CN109726401A publication Critical patent/CN109726401A/en
Application granted granted Critical
Publication of CN109726401B publication Critical patent/CN109726401B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention relates to the technical field of computers, and particularly discloses a patent combination generation method and system, wherein the method comprises the following steps: calculating the text similarity between each patent and the other patents in a patent set to be grouped; grouping the patent set to be grouped according to the text similarity to generate a plurality of patent subsets, wherein each patent subset comprises a plurality of patents; calculating a network correlation density and a complementarity index of each patent subset; and if the network correlation density of a patent subset is judged to be greater than a preset density threshold and the complementarity index of the patent subset is within a preset complementarity index range, generating a patent combination according to the patent subset, wherein the patent combination is a technology-related patent combination or a product-related patent combination. The invention can mine deep-level association relationships between patent technologies in a non-manual mode and generate patent combinations, thereby improving the identification efficiency of patent combinations and the transfer and conversion rate of enterprise patents.

Description

Patent combination generation method and system
Technical Field
The invention relates to the technical field of computers, in particular to a patent combination generation method and system.
Background
With the arrival of the knowledge economy era, the number of patent applications filed by enterprises in China has increased year by year, and enterprise patent pools have been continuously enriched. On the other hand, the technology transfer and conversion rate of enterprise patents is extremely low, a great many patents go unused, and a huge gap exists between laboratory results and the market application of technology. To facilitate the transfer and conversion of patent technologies, patent value evaluation must first be performed.
Patent value is reflected not in a single patent but in a patent combination whose members are internally correlated, and the value of a patent combination is far greater than the sum of the values of the individual patents in it. Therefore, rather than addressing individual or randomly selected patents, the patent technology transfer and conversion process often packages a series of internally associated patents into a patent combination and transfers it as a whole for maximum economic benefit. Identifying patent combinations is therefore of great significance for improving the overall value and transfer efficiency of enterprise patents. At present, the identification of patent combinations relies mainly on expert judgment, which is time-consuming and labor-intensive, makes it difficult to mine deep-level association relationships between patent technologies, and does not help improve the transfer and conversion rate of patent technologies.
It should be noted that the above background description is only for the sake of clarity and complete description of the technical solutions of the present invention and for the understanding of those skilled in the art. Such solutions are not considered to be known to the person skilled in the art merely because they have been set forth in the background section of the invention.
Disclosure of Invention
The invention aims to solve the technical problem of the prior art, and provides a patent combination generation method and system, which can be used for mining deep association relations among patent technologies in a non-manual mode and generating patent combinations, so that the identification efficiency of the patent combinations and the patent transfer and conversion rate of enterprise patents are improved.
In order to achieve the above object, the present invention provides a patent combination generating method, including:
calculating the text similarity between each patent and other patents in the patent set to be grouped;
grouping the patent sets to be grouped according to the text similarity and generating a plurality of patent subsets, wherein each patent subset comprises a plurality of patents;
calculating a network correlation density and a complementarity index of each patent subset;
and if the network correlation density of the patent subset is judged to be larger than a preset density threshold value and the complementarity indexes of the patent subset are within a preset complementarity index range, generating a patent combination according to the patent subset, wherein the patent combination is a technology-related patent combination or a product-related patent combination.
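The grouping step above does not say how subsets are formed from the pairwise text similarities. As a minimal sketch, assuming (this is not stated in the source) that two patents fall into the same subset whenever they are linked by a chain of pairs whose similarity reaches a hypothetical cutoff `min_similarity`, the grouping could look like this:

```python
def group_by_similarity(R, min_similarity):
    """Group patent indices into subsets using a symmetric similarity matrix R.

    Assumption: subsets are connected components of the graph whose edges are
    the pairs (i, j) with R[i][j] >= min_similarity. Both the rule and the
    cutoff are illustrative, not taken from the patent text.
    """
    n = len(R)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if R[i][j] >= min_similarity:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Each patent subset must contain a plurality of patents, so drop singletons.
    return [members for members in groups.values() if len(members) > 1]
```

Any clustering method that consumes a similarity matrix could be substituted here; connected components are used only because they need no extra parameters beyond the cutoff.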
Optionally, the calculating the text similarity between each patent in the to-be-grouped patent set and other patents specifically includes:
generating a topic feature vector corresponding to each patent in the patent set to be grouped by adopting an LDA model;
and generating the text similarity between each patent and the other patents in the patent set to be grouped according to the topic feature vectors and a cosine similarity algorithm.
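The cosine similarity step can be illustrated with a short sketch. It assumes each patent has already been reduced to a topic feature vector (a list of topic probabilities); the function names are illustrative only.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length topic feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

def similarity_matrix(topic_vectors):
    """Pairwise text similarity R[i][j] between all patents."""
    n = len(topic_vectors)
    return [
        [cosine_similarity(topic_vectors[i], topic_vectors[j]) for j in range(n)]
        for i in range(n)
    ]
```

Because topic probabilities are non-negative, the resulting similarities lie between 0 and 1.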
Optionally, before the calculating the network relevance density and the complementarity indicator of each patent subset, the method includes:
acquiring a plurality of patent combination samples, wherein the patent combination samples comprise technology associated patent combination samples and product associated patent combination samples;
calculating the network correlation density and the complementarity index of each patent combination sample;
and generating a density threshold value and a complementarity index range according to the network correlation density and the complementarity index of a plurality of patent combination samples, wherein the complementarity index range comprises a complementarity index range of a technology-related patent combination and a complementarity index range of a product-related patent combination.
Optionally, the generating the density threshold and the complementarity indicator range according to the network correlation density and the complementarity indicator of the plurality of patent combination samples specifically includes:
taking the average or the minimum of the network correlation densities of a plurality of the patent combination samples to generate a density threshold;
generating a complementarity index range of the technology-related patent combination according to the complementarity index of the technology-related patent combination sample;
generating a complementarity index range of the product-related patent combination according to the complementarity index of the product-related patent combination sample;
if it is determined that the network correlation density of the patent subset is greater than the preset density threshold and the complementarity indicator of the patent subset is within the preset complementarity indicator range, generating a patent combination according to the patent subset specifically includes:
if the network correlation density of the patent subset is judged to be larger than a preset density threshold value and the complementarity index of the patent subset is within the complementarity index range of a preset technology-related patent combination, generating the technology-related patent combination according to the patent subset;
and if the network correlation density of the patent subset is judged to be larger than a preset density threshold value and the complementarity index of the patent subset is within the complementarity index range of a preset product-associated patent combination, generating the product-associated patent combination according to the patent subset.
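The two judgments above can be sketched as a single decision function. The inclusive range bounds and the order in which the two ranges are checked are assumptions of this sketch; the source only requires the density to exceed the threshold and the complementarity index to fall within one of the two ranges.

```python
def classify_subset(density, complementarity, density_threshold,
                    tech_range, product_range):
    """Return the kind of patent combination a subset yields, or None.

    tech_range and product_range are (low, high) tuples; treating both ends
    as inclusive is an assumption of this sketch.
    """
    if density <= density_threshold:
        return None  # density must be strictly greater than the threshold
    low, high = tech_range
    if low <= complementarity <= high:
        return "technology-related"
    low, high = product_range
    if low <= complementarity <= high:
        return "product-related"
    return None
```

If the two ranges overlapped, the technology-related range would win in this sketch; the source does not address that case.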
Optionally, the network correlation density is calculated by the formula
Figure GDA0003794747600000031
wherein D is the network correlation density of the patent subset or the patent combination sample, $R_{ij}$ is the text similarity between patent i and patent j in the patent subset or the patent combination sample, and N is the total number of patents included in the patent subset or the patent combination sample.
Optionally, the complementarity index is generated from a crossability index, a polymerizability index and a difference index;
the crossability index is calculated by the formula
Figure GDA0003794747600000032
wherein RS is the crossability index of the patent subset or the patent combination sample, $p_i$ is the probability distribution value of patent category i among all patent categories of the patent subset or the patent combination sample, $p_j$ is the probability distribution value of patent category j among all patent categories of the patent subset or the patent combination sample, $d_{ij}$ is the distance value between patent categories i and j in the patent subset or the patent combination sample, and α and β are metering parameters;
the polymerizability index is calculated by the formula
$$CC = \frac{3N_\Delta}{N_3}$$
wherein CC is the polymerizability index of the patent subset or the patent combination sample, 0 ≤ CC ≤ 1, $N_\Delta$ is the number of ternary closures in the patent subset or the patent combination sample, and $N_3$ is the number of connected triples in the patent subset or the patent combination sample;
the difference index is calculated by the formula
Figure GDA0003794747600000042
wherein S is the difference index of the patent subset or the patent combination sample, N is the total number of patents contained in the patent subset or the patent combination sample, and $R_{ij}$ is the co-citation strength between patent i and patent j in the patent subset or the patent combination sample,
Figure GDA0003794747600000043
wherein $C_{ij}$ is the co-citation frequency between patent i and patent j in the patent subset or the patent combination sample, $C_i$ is the total cited frequency of patent i, and $C_j$ is the total cited frequency of patent j.
In order to achieve the above object, the present invention further provides a system for generating a patent combination, including a prediction module, where the prediction module includes:
the first calculation unit is used for calculating the text similarity between each patent in the patent set to be grouped and other patents and calculating the network correlation density and the complementarity index of each patent subset;
the judging unit is used for judging whether the network correlation density of the patent subset is larger than a preset density threshold value and whether the complementarity index of the patent subset is within a preset complementarity index range;
and a first generating unit, configured to group the patent set to be grouped according to the text similarity and generate a plurality of patent subsets, where each patent subset includes a plurality of patents, and when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicator of the patent subset is within a preset complementarity indicator range, generate a patent combination according to the patent subset, where the patent combination is a technology-related patent combination or a product-related patent combination.
Optionally, the first computing unit is specifically configured to generate a topic feature vector corresponding to each patent in the to-be-grouped patent set by using an LDA model, and generate text similarity between each patent in the to-be-grouped patent set and other patents according to the topic feature vector and a cosine similarity algorithm.
Optionally, the system further comprises a training module, the training module comprising:
the acquisition unit is used for acquiring a plurality of patent combination samples, wherein the patent combination samples comprise technology-related patent combination samples and product-related patent combination samples;
the second calculation unit is used for calculating the network correlation density and the complementarity index of each patent combination sample;
and a second generating unit which generates a density threshold value and a complementarity index range according to the network correlation density and the complementarity index of a plurality of the patent combination samples, wherein the complementarity index range comprises a complementarity index range of a technology-related patent combination and a complementarity index range of a product-related patent combination.
Optionally, the second generating unit is specifically configured to take the average or the minimum of the network correlation densities of the plurality of patent combination samples to generate a density threshold, generate a complementarity index range of the technology-related patent combination according to the complementarity index of the technology-related patent combination samples, and generate a complementarity index range of the product-related patent combination according to the complementarity index of the product-related patent combination samples;
the first generating unit is specifically configured to generate a technology-related patent combination according to the patent subset when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicator of the patent subset is within a complementarity indicator range of a preset technology-related patent combination, and generate a product-related patent combination according to the patent subset when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicator of the patent subset is within a complementarity indicator range of a preset product-related patent combination.
The invention has the following beneficial effects:
according to the patent combination generation method provided by the invention, the patent sets to be grouped are grouped according to the text similarity to generate a plurality of patent subsets, and when the network correlation density of the patent subsets is judged to be greater than the preset density threshold value and the complementarity indexes of the patent subsets are within the preset complementarity index range, the patent combinations are generated according to the patent subsets, the deep level association relationship among patent technologies can be mined in a non-manual mode to generate the patent combinations, so that the identification efficiency of the patent combinations and the patent transfer and conversion rate of enterprise patents are improved.
Specific embodiments of the present invention are disclosed in detail with reference to the following description and the accompanying drawings, which indicate the ways in which the principles of the invention may be employed. It should be understood that the embodiments of the invention are not so limited in scope. The embodiments of the invention include many variations, modifications and equivalents within the spirit and scope of the appended claims.
Features that are described and/or illustrated with respect to one embodiment may be used in the same way or in a similar way in one or more other embodiments, in combination with or instead of the features of the other embodiments.
It should be emphasized that the term "comprises/comprising" when used herein, is taken to specify the presence of stated features, integers, steps or components but does not preclude the presence or addition of one or more other features, integers, steps or components.
Drawings
Fig. 1 is a schematic flow chart of a patent combination generation method according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of a patent combination generation method according to a second embodiment of the present invention;
fig. 3 is a schematic flowchart illustrating an exemplary process of a patent combination generating method according to a second embodiment of the present invention;
fig. 4 is a schematic structural diagram of a patent combination generating system provided in the third embodiment of the present invention;
fig. 5 is a schematic structural diagram of a patent combination generation system according to a fourth embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solution of the present invention, the following description of the technical solution of the present invention with reference to the accompanying drawings is made clearly and completely, and it is obvious that the described embodiments are a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
The patent combination described in this embodiment refers to a set of related patents under the control of the same assignee, and the important feature of a patent combination is technical relevance. Patent combinations include: technology-related patent combinations, which are based on technical similarity and can form a patent barrier, and in which the degree of similarity between the patent technologies is higher and the complementarity is lower; and product-related patent combinations, which are based on the combination of different technologies of the same product, and in which the complementarity of the patent technologies is higher. In the transfer and conversion of enterprise patent technologies, both kinds of patent combination have important value: packaging a plurality of technical schemes with similar or complementary technologies into a patent combination can, on the one hand, revitalize the existing patent stock and, on the other hand, improve the overall value and the transfer and conversion rate of the patent technologies.
The technical complementarity of technology-related patent combinations and product-related patent combinations differs greatly. The patents in a technology-related patent combination are developed around similar technologies and show a strong tendency toward concentration and inheritance in their patent classifications or patent citation relationships, so the complementarity between the patents in a technology-related patent combination is small. The complementarity between the patents in a product-related patent combination is relatively large: the patents may be distributed across different patent categories, or the citation relationships (such as citation coupling) between the patent technologies may not be close.
Embodiment one
Fig. 1 is a schematic flow chart of a patent combination generation method according to a first embodiment of the present invention. As shown in fig. 1, the method includes:
Step 101, calculating the text similarity between each patent in the patent set to be grouped and the other patents.
Step 102, grouping the patent set to be grouped according to the text similarity and generating a plurality of patent subsets, wherein each patent subset comprises a plurality of patents.
Step 103, calculating the network correlation density and the complementarity index of each patent subset.
Step 104, judging whether the network correlation density of the patent subset is greater than a preset density threshold and whether the complementarity index of the patent subset is within a preset complementarity index range; if so, executing step 105; if not, ending the process.
Step 105, generating a patent combination according to the patent subset, wherein the patent combination is a technology-related patent combination or a product-related patent combination.
According to the patent combination generation method provided by the embodiment, the patent sets to be grouped are grouped according to the text similarity, a plurality of patent subsets are generated, when the fact that the network correlation density of the patent subsets is larger than the preset density threshold value and the complementarity indexes of the patent subsets are located in the preset complementarity index range is judged, the patent combinations are generated according to the patent subsets, the deep-level association relation among patent technologies can be mined in a non-manual mode, and the patent combinations are generated, so that the recognition efficiency of the patent combinations and the patent transfer and conversion rates of enterprise patents are improved.
Embodiment two
Fig. 2 is a schematic flow chart of a patent combination generation method according to a second embodiment of the present invention. As shown in fig. 2, the method includes the following steps:
Step 201, obtaining a plurality of patent combination samples, wherein the patent combination samples comprise technology-related patent combination samples and product-related patent combination samples.
Preferably, the steps in this embodiment are executed by a patent combination generation system.
A patent combination sample is a patent combination recognized by experts in the field, and includes a plurality of patents. For example, the patent combination sample may be several patents PP1, PP2, … PPn of company A within technical field T.
Step 202, calculating the network correlation density and the complementarity index of each patent combination sample.
The network correlation density of a patent combination sample represents the similarity between the patents in the sample. The patents in a patent combination sample are related to one another, and their patent text content presents a certain similarity. For example, the technical content, efficacy content and use content of several patents in the patent combination sample are similar or identical, or secondary indexing content (such as the USE field of the Derwent database) is included in some of the patents.
Specifically, an LDA (Latent Dirichlet Allocation) topic model, a document topic generation model, is first used to generate the topic feature vector corresponding to each patent in the patent combination sample, and the text similarity between each patent and the other patents in the patent combination sample is then generated from the topic feature vectors of all patents in the sample using a cosine similarity algorithm. Finally, the formula
Figure GDA0003794747600000081
is used to calculate the network correlation density of the patent combination sample, wherein D is the network correlation density of the patent combination sample, $R_{ij}$ is the text similarity between patent i and patent j in the patent combination sample, and N is the total number of patents included in the patent combination sample. The topic feature vectors are used to construct a text correlation network; the text similarity and the network correlation density are generated based on the text correlation network and social network theory, and the network correlation density represents the ratio of the sum of the text similarities between the patents in the patent combination sample to the sum of the maximum text similarities.
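The formula image is not reproduced in this text, so the following is only a sketch built from the verbal definition above (the ratio of the summed pairwise similarities to the summed maximum similarities). It assumes that the maximum similarity of any pair is 1 and that each unordered pair is counted once, which gives D = 2 · Σ_{i<j} R_ij / (N(N − 1)).

```python
def network_correlation_density(R):
    """Density D of a text correlation network given a symmetric matrix R.

    Assumptions of this sketch: similarities lie in [0, 1], so the maximum
    possible sum over the N(N-1)/2 unordered pairs is N(N-1)/2.
    """
    n = len(R)
    if n < 2:
        return 0.0  # a single patent has no pairs
    total = sum(R[i][j] for i in range(n) for j in range(i + 1, n))
    return total / (n * (n - 1) / 2)
```

Under these assumptions D equals 1 when every pair of patents is maximally similar and 0 when no pair shares any topic mass.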
Optionally, the topic feature vector corresponding to each patent in the patent combination sample is generated with the Gensim open-source toolkit for Python: 8 topics are extracted from each patent, parameter estimation is performed on all patents in the patent combination sample with a Gibbs sampling algorithm, and the topic-feature word probability distribution of the whole patent combination sample is solved after 1500 iterations, so as to finally generate the topic feature vector corresponding to each patent. Expressing each patent as a topic feature vector greatly reduces the dimensionality of the patent while preserving the accuracy of patent combination generation.
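To make the Gibbs sampling step concrete, here is a stand-alone sketch of a collapsed Gibbs sampler for LDA in plain Python. It is not the Gensim API and not necessarily what the embodiment runs; the defaults of 8 topics and 1500 iterations follow the text, while the Dirichlet hyperparameters `alpha` and `beta`, the seed, and the tokenised input format are assumptions.

```python
import random

def lda_gibbs(docs, num_topics=8, iterations=1500, alpha=0.1, beta=0.01, seed=0):
    """docs: list of token lists. Returns one topic feature vector per document."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: k for k, w in enumerate(vocab)}
    V, K = len(vocab), num_topics
    doc_topic = [[0] * K for _ in docs]        # tokens of doc d assigned to topic k
    topic_word = [[0] * V for _ in range(K)]   # occurrences of word w in topic k
    topic_total = [0] * K                      # total tokens assigned to topic k
    assignments = []
    for d, doc in enumerate(docs):             # random initial topic assignment
        z_doc = []
        for w in doc:
            z = rng.randrange(K)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                wid, z = word_id[w], assignments[d][n]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Resample its topic from the full conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta) / (topic_total[k] + V * beta)
                    for k in range(K)
                ]
                z = rng.choices(range(K), weights=weights)[0]
                assignments[d][n] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1
    # Smoothed document-topic proportions serve as the topic feature vectors.
    return [
        [(doc_topic[d][k] + alpha) / (len(doc) + K * alpha) for k in range(K)]
        for d, doc in enumerate(docs)
    ]
```

For example, `lda_gibbs([["antenna", "signal"], ["battery", "cell"]], num_topics=2, iterations=50)` returns two vectors of length 2, each summing to 1. A production run would use an optimised library rather than this loop.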
In the embodiment, the network correlation density is selected to represent the technical correlation among the patents in the patent combination sample, and follows the integrity principle, the applicability principle and the simplicity principle, and the network correlation density can more intuitively reflect the technical correlation characteristics among the patents in the patent combination sample.
The complementarity index of a patent combination sample indicates the degree of technical difference between the patents in the sample. It is generated from the crossability index, the polymerizability index and the difference index, which are generated by a common analysis method, a social network analysis method and a citation analysis method, respectively.
The crossability index is calculated by the formula
Figure GDA0003794747600000091
wherein RS is the crossability index of the patent combination sample, $p_i$ is the probability distribution value of patent category i among all patent categories of the patent combination sample, $p_j$ is the probability distribution value of patent category j among all patent categories of the patent combination sample, $d_{ij}$ is the distance value between patent categories i and j in the patent combination sample and is generated with a cosine similarity algorithm, and α and β are metering parameters. Each parameter in the crossability index is generated through the text correlation network constructed from the patent combination samples.
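The formula image is missing here, but the symbols (RS, $p_i$, $p_j$, $d_{ij}$, α, β) match the commonly used Rao-Stirling diversity measure. The sketch below assumes that form, RS = Σ_{i≠j} d_ij^α (p_i p_j)^β; which exponent attaches to which factor is an assumption, and with α = β = 1 the two placements coincide.

```python
def rao_stirling(p, d, alpha=1.0, beta=1.0):
    """Crossability index sketch in the assumed Rao-Stirling form.

    p: list of category proportions p_i.
    d: square matrix of category distances d_ij (for instance one minus a
       cosine similarity, which is itself an assumption).
    """
    n = len(p)
    return sum(
        (d[i][j] ** alpha) * ((p[i] * p[j]) ** beta)
        for i in range(n)
        for j in range(n)
        if i != j
    )
```

A sample concentrated in one category gives 0, while evenly spread, mutually distant categories give larger values, which matches the described role of the index as a measure of distribution and distance between patent categories.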
Specifically, a patent in the patent combination sample may have a plurality of patent classification numbers, different patent classification numbers correspond to different patent categories, and the patent categories of the patent combination sample comprise all the different patent classification numbers of all patents included in the patent combination sample. The crossability index is a comprehensive measure that can be used to measure the distribution characteristics of the patents in the patent combination sample and the distance and difference between the patent categories. In this embodiment, optionally, all patents in the patent combination sample are expressed in the form of category vectors in order to statistically generate $p_i$ and $p_j$; the resulting patent category statistical table may be as shown in Table 1 below:
Table 1
Patent combination sample    $C_1$    $C_2$    …    $C_N$
$Z_1$                        1        0        …    0
$Z_2$                        1        1        …    0
…                            …        …        1    …
$Z_M$                        0        1        …    0
Wherein, $C_i$ is the i-th patent category in the patent combination sample, $Z_j$ is the j-th patent in the patent combination sample, 1 indicates that the patent belongs to the patent category, and 0 indicates that the patent does not belong to the patent category.
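As a small illustration of how the category vectors in the table could yield the probability distribution values, the sketch below assumes that $p_i$ is the share of all category assignments that fall in category i. The source does not spell out the normalisation, so that choice is an assumption.

```python
def category_distribution(membership):
    """membership[j][i] is 1 if patent Z_j belongs to category C_i, else 0.

    Returns the assumed p_i: assignments in category i divided by all
    category assignments in the sample.
    """
    counts = [sum(column) for column in zip(*membership)]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [count / total for count in counts]
```

For a three-patent, three-category excerpt with rows `[1, 0, 0]`, `[1, 1, 0]` and `[0, 1, 0]`, the function returns `[0.5, 0.5, 0.0]`.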
The polymerizability index is calculated by the formula
$$CC = \frac{3N_\Delta}{N_3}$$
wherein CC is the polymerizability index of the patent combination sample, 0 ≤ CC ≤ 1, $N_\Delta$ is the number of ternary closures in the patent combination sample, and $N_3$ is the number of connected triples in the patent combination sample. Specifically, a ternary closure in the patent combination sample refers to a ternary closure in the text correlation network constructed from the patent combination sample, and each ternary closure can be regarded as three different connected triples. The polymerizability index describes the aggregation characteristics of the patents in a patent combination sample by measuring the number of ternary closures in the text correlation network. The factor of 3 applied to $N_\Delta$ in the polymerizability index ensures that 0 ≤ CC ≤ 1.
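The counting behind CC = 3·N_Δ / N_3 can be sketched as follows. The text does not say when two patents count as connected in the text correlation network, so this sketch assumes an edge exists whenever the text similarity reaches a hypothetical `edge_threshold`.

```python
from itertools import combinations

def polymerizability_index(R, edge_threshold):
    """CC = 3 * N_delta / N_3 on a graph thresholded from a symmetric matrix R.

    Assumption: patents i and j are connected when R[i][j] >= edge_threshold.
    """
    n = len(R)
    neighbours = [
        {j for j in range(n) if j != i and R[i][j] >= edge_threshold}
        for i in range(n)
    ]
    # N_delta: sets of three mutually connected patents (ternary closures).
    triangles = sum(
        1
        for a, b, c in combinations(range(n), 3)
        if b in neighbours[a] and c in neighbours[a] and c in neighbours[b]
    )
    # N_3: connected triples, i.e. pairs of neighbours around each centre node.
    triples = sum(len(nb) * (len(nb) - 1) // 2 for nb in neighbours)
    return 3 * triangles / triples if triples else 0.0
```

Each triangle contributes three connected triples (one per centre node), which is why the factor of 3 keeps the ratio at or below 1.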
The difference index is calculated by the formula
Figure GDA0003794747600000102
wherein S is the difference index of the patent combination sample, N is the total number of patents contained in the patent combination sample, and $R_{ij}$ is the co-citation strength between patent i and patent j in the patent combination sample,
Figure GDA0003794747600000103
Figure GDA0003794747600000104
wherein $C_{ij}$ is the co-citation frequency between patent i and patent j in the patent combination sample, $C_i$ is the total cited frequency of patent i, and $C_j$ is the total cited frequency of patent j. The larger the co-citation frequency, the closer the citation relationship between the patents. The difference index is an index whose attribute is opposite to similarity: the larger the difference index, the larger the similarity and the smaller the difference between the patents in the patent combination sample, and the smaller the difference index, the smaller the similarity and the larger the difference between the patents in the patent combination sample.
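The formula images for S and $R_{ij}$ are not reproduced, so only the co-citation strength is sketched here, and only under an explicit assumption: that $R_{ij}$ normalises the co-citation frequency by the geometric mean of the two total cited frequencies, R_ij = C_ij / sqrt(C_i · C_j) (the Salton cosine commonly used in citation analysis). The source may use a different normalisation.

```python
import math

def co_citation_strength(citing_i, citing_j):
    """Assumed co-citation strength between two patents.

    citing_i, citing_j: sets of identifiers of the documents citing patent i
    and patent j. C_ij is the size of their intersection; C_i and C_j are the
    set sizes. The cosine normalisation is an assumption of this sketch.
    """
    c_i, c_j = len(citing_i), len(citing_j)
    if c_i == 0 or c_j == 0:
        return 0.0  # an uncited patent has no co-citation relationship
    c_ij = len(citing_i & citing_j)
    return c_ij / math.sqrt(c_i * c_j)
```

With this normalisation the strength is 1 when two patents are always cited together and 0 when they are never cited by the same document.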
Step 203, generating a density threshold and a complementarity index range according to the network correlation densities and the complementarity indexes of the plurality of patent combination samples, wherein the complementarity index range comprises the complementarity index range of the technology-related patent combination and the complementarity index range of the product-related patent combination.
Specifically, step 203 comprises: taking the average or the minimum of the network correlation densities of the plurality of patent combination samples to generate the density threshold; generating the complementarity index range of the technology-related patent combination according to the complementarity indexes of the technology-related patent combination samples; and generating the complementarity index range of the product-related patent combination according to the complementarity indexes of the product-related patent combination samples.
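Step 203 can be sketched in a few lines. Taking the mean or the minimum of the sample densities follows the text; deriving each complementarity index range as the minimum and maximum observed over the samples of that type is an assumption, since the source does not say how the range is drawn.

```python
from statistics import mean

def density_threshold(sample_densities, use_minimum=False):
    """Density threshold as the mean (default) or minimum of sample densities."""
    return min(sample_densities) if use_minimum else mean(sample_densities)

def complementarity_range(sample_indices):
    """Assumed range: (smallest, largest) complementarity index in the samples."""
    return (min(sample_indices), max(sample_indices))
```

In use, `complementarity_range` would be called once with the technology-related samples and once with the product-related samples to obtain the two ranges.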
Steps 201 to 203 train on the patent combination samples to establish a patent combination generation model. In this embodiment, the patent combination generation model can be understood as predicting patent combinations according to the density threshold and the complementarity index range. Optionally, the patent combination generation model may also be tested, that is, the density threshold and the complementarity index range are tested and then adjusted according to the test result, so as to obtain an optimal density threshold and complementarity index range. For example, the test samples of the patent combination generation model may be a plurality of other patents of enterprise A in technical field T, apart from the patents in the patent combination samples.
The density threshold and the complementarity index range are used for generating patent combinations from the patent set to be grouped, and can be set according to the actual situation of the patents to be grouped. The complementarity index range includes the complementarity index range of the technology-related patent combination and the complementarity index range of the product-related patent combination. The complementarity index range is generated from the weighted crossability index, polymerizability index and difference index of the plurality of patent combination samples. The weight of each index is determined by coefficient-of-variation weighting, and represents the influence of the corresponding index in distinguishing technology-related patent combinations from product-related patent combinations. The crossability index, polymerizability index and difference index of the technology-related patent combination samples are weighted to determine the complementarity index range of the technology-related patent combination, and those of the product-related patent combination samples are weighted to determine the complementarity index range of the product-related patent combination.
Optionally, the weights of the crossability index, the polymerizability index and the difference index are generated as follows. First, the coefficient of variation $v_i$ of each index is calculated by the formula
$$v_i = \frac{S_i}{\bar{x}_i}$$
wherein $S_i$ is the standard deviation of the $i$-th index and $\bar{x}_i$ is the average of the $i$-th index over all patent combination samples. Then the coefficients of variation of the indexes are normalized by the formula
$$w_i = \frac{v_i}{\sum_{j} v_j}$$
to generate the weight $w_i$ of each index.
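The coefficient-of-variation weighting can be sketched in Python as follows. This is a minimal sketch, not the patented implementation: the function names `cv_weights` and `complementarity` are hypothetical, the use of the population standard deviation is an assumption (the text does not say whether the population or sample form is meant), and combining the three weighted indexes as a plain weighted sum is also an assumption.

```python
from statistics import mean, pstdev


def cv_weights(samples):
    """Weight each index by its normalized coefficient of variation.

    samples: list of dicts, each mapping an index name to its value
    for one patent combination sample.
    """
    names = list(samples[0])
    cv = {}
    for name in names:
        values = [s[name] for s in samples]
        avg = mean(values)
        if avg == 0:
            raise ValueError(f"mean of index {name!r} is zero")
        # Assumption: population standard deviation.
        cv[name] = pstdev(values) / abs(avg)
    total = sum(cv.values())
    if total == 0:
        raise ValueError("all indexes are constant; weights are undefined")
    return {name: v / total for name, v in cv.items()}


def complementarity(indexes, weights):
    """Assumed combination rule: weighted sum of the three indexes."""
    return sum(weights[name] * indexes[name] for name in weights)


# Hypothetical index values for two patent combination samples.
samples = [
    {"crossability": 1.0, "polymerizability": 2.0, "difference": 5.0},
    {"crossability": 3.0, "polymerizability": 6.0, "difference": 5.0},
]
weights = cv_weights(samples)
z = complementarity(samples[0], weights)
```

An index that barely varies across the samples receives a weight near zero, which matches the stated purpose of the weights: an index that does not vary cannot help distinguish technology-related from product-related combinations.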
Step 204, calculating the text similarity between each patent in the patent set to be grouped and the other patents.
The patent set to be grouped comprises a plurality of patents, and the method of this embodiment is used for generating one or more patent combinations from the patent set to be grouped. For example, the patent set to be grouped consists of all other patents of enterprise A in technical field T, apart from the patents in the patent combination samples.
Step 204 specifically comprises: generating a topic feature vector for each patent in the patent set to be grouped by using an LDA model; and generating the text similarity between each patent in the patent set to be grouped and the other patents according to the topic feature vectors and a cosine similarity algorithm.
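The cosine similarity part of step 204 can be sketched as follows. The sketch assumes that an LDA model has already produced a topic distribution for each patent; the LDA training itself is not shown. The patent identifiers, the three-topic vectors and the function names `cosine_similarity` and `similarity_matrix` are hypothetical.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two topic feature vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of topics")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_u * norm_v)


def similarity_matrix(topic_vectors):
    """Text similarity for every unordered pair of patents."""
    ids = list(topic_vectors)
    return {
        (a, b): cosine_similarity(topic_vectors[a], topic_vectors[b])
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
    }


# Hypothetical topic distributions, standing in for LDA output.
topic_vectors = {
    "P1": [0.7, 0.2, 0.1],
    "P2": [0.6, 0.3, 0.1],
    "P3": [0.1, 0.1, 0.8],
}
similarities = similarity_matrix(topic_vectors)
```

Because topic distributions are non-negative, the resulting similarity lies between 0 and 1, which makes it directly usable as an edge weight when the network correlation density is computed later.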
For the specific process of step 204, reference may be made to the description in step 202 of how the text similarity between patents in a patent combination sample is generated; details are not repeated here.
Step 205, grouping the patent set to be grouped according to the text similarity to generate a plurality of patent subsets, wherein each patent subset comprises a plurality of patents.
Specifically, patents whose text similarity is close are placed in the same patent subset. The patents within a patent subset belong to the same patent subclass.
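This embodiment does not name a specific grouping algorithm for step 205. The following Python sketch shows one possible approach, offered purely as an assumption: two patents are linked when their text similarity reaches a chosen threshold, and each connected group of linked patents becomes a patent subset. The function name `group_by_similarity`, the threshold value and the similarity data are hypothetical.

```python
def group_by_similarity(patent_ids, similarity, threshold):
    """Group patents into subsets by linking pairs with high similarity.

    similarity: dict mapping (patent_a, patent_b) to a text similarity.
    Returns only groups with more than one patent, since each patent
    subset comprises a plurality of patents.
    """
    parent = {p: p for p in patent_ids}

    def find(p):
        # Follow parent links to the group representative, compressing
        # the path along the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), value in similarity.items():
        if value >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for p in patent_ids:
        groups.setdefault(find(p), []).append(p)
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical pairwise text similarities.
similarity = {
    ("P1", "P2"): 0.90,
    ("P2", "P3"): 0.85,
    ("P1", "P3"): 0.20,
    ("P1", "P4"): 0.10,
    ("P2", "P4"): 0.15,
    ("P3", "P4"): 0.05,
}
subsets = group_by_similarity(["P1", "P2", "P3", "P4"], similarity, 0.8)
```

This linking rule lets P1 and P3 share a subset through P2 even though their direct similarity is low. A stricter clustering method could be substituted without affecting the later steps, since they only require that each subset contain several patents.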
Step 206, calculating the network correlation density and the complementarity index of each patent subset.
For the manner of generating the network correlation density and the complementarity index of a patent subset in step 206, reference may be made to the description in step 202 of how the network correlation density and the complementarity index of a patent combination sample are generated; details are not repeated here.
Step 207, judging whether the network correlation density of the patent subset is greater than a preset density threshold, if so, executing step 208; if not, the process is ended.
A network correlation density greater than the preset density threshold indicates that the patents in the patent subset are highly similar to one another, so the patent subset can serve as a candidate patent combination.
Step 208, judging whether the complementarity indexes of the patent subsets are within the complementarity index range of a preset technology-related patent combination, if so, executing step 209; if not, go to step 210.
Step 209, generating a technology-related patent combination according to the patent subset, and ending the process.
Step 210, judging whether the complementarity index of the patent subset is within the complementarity index range of a preset product-related patent combination, if so, executing step 211; if not, the process is ended.
Step 211, generating a product-related patent combination according to the patent subset, and ending the process.
The complementarity index of the product-related patent combination is much greater than that of the technology-related patent combination.
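The decision flow of steps 207 to 211 can be sketched in Python as follows. The function name `classify_subset` is hypothetical. In the usage lines, the density threshold of 0.6 and the product-related range of 0.07 to 0.15 follow the worked example of this embodiment, while the technology-related range is a hypothetical placeholder chosen only so that the sketch runs.

```python
from typing import Optional, Tuple

Range = Tuple[float, float]


def classify_subset(
    density: float,
    z: float,
    density_threshold: float,
    tech_range: Range,
    product_range: Range,
) -> Optional[str]:
    """Return the kind of patent combination a subset yields, or None."""
    # Step 207: the subset must be dense enough to be a candidate.
    if density <= density_threshold:
        return None
    # Steps 208 and 209: check the technology-related range first.
    if tech_range[0] < z < tech_range[1]:
        return "technology-related"
    # Steps 210 and 211: otherwise check the product-related range.
    if product_range[0] < z < product_range[1]:
        return "product-related"
    return None


TECH_RANGE = (0.01, 0.07)     # hypothetical placeholder
PRODUCT_RANGE = (0.07, 0.15)  # from the worked example

result = classify_subset(0.7, 0.10, 0.6, TECH_RANGE, PRODUCT_RANGE)
```

The range bounds are treated as exclusive here, mirroring the strict inequalities used in the worked example.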
The patent combination generation method of this embodiment is illustrated below with a specific example:
All 5452 patents of enterprise A in technical field T are retrieved from a patent database, and their content is manually interpreted by experts in the technical field to obtain 14 patent combinations of the enterprise, which serve as the patent combination samples in this embodiment. The network correlation density and the complementarity index of each of the 14 patent combination samples are calculated, and the following results are obtained from them: the density threshold is 0.6, the complementarity index range of the technology-related patent combination is 0.15 < z < 0.07, and the complementarity index range of the product-related patent combination is 0.07 < z < 0.15. Fig. 3 is a schematic flowchart of an example of the patent combination generation method according to the second embodiment of the present invention. As shown in Fig. 3, 11 other patent combinations of the enterprise with combination potential are finally predicted according to the density threshold and the complementarity index range. One of these patent combinations includes 167 patents relating to etching, LCD, liquid crystal display, mounting substrate, photoresist layer, substrate processing, evaporator, plasma deposition and the like, and its subject matter can be summarized as an etching process.
According to the patent combination generation method provided by this embodiment, the patent set to be grouped is grouped according to the text similarity to generate a plurality of patent subsets, and a patent combination is generated from a patent subset when the network correlation density of the patent subset is judged to be greater than the preset density threshold and the complementarity index of the patent subset is within the preset complementarity index range. Deep-level association relationships among patent technologies can thus be mined in a non-manual, computer-assisted manner to generate patent combinations, which improves the efficiency of identifying patent combinations among enterprise patents and the transfer and conversion rate of enterprise patents, while effectively saving labor cost.
It should be noted that while the operations of the methods of the present invention are illustrated in the drawings in a particular order, this does not require or imply that the operations must be performed in that particular order, or that all illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
Example three
Fig. 4 is a schematic structural diagram of a patent combination generating system according to a third embodiment of the present invention, and as shown in fig. 4, the system includes a prediction module 1, where the prediction module 1 includes a first calculating unit 11, a judging unit 12, and a first generating unit 13.
The first calculating unit 11 is configured to calculate text similarity between each patent in the patent set to be grouped and other patents, and calculate a network correlation density and a complementarity index of each patent subset. The judging unit 12 is configured to judge whether the network correlation density of the patent subset is greater than a preset density threshold and whether the complementarity indicator of the patent subset is within a preset complementarity indicator range. The first generating unit 13 is configured to group the patent sets to be grouped according to the text similarity and generate a plurality of patent subsets, where each patent subset includes a plurality of patents, and when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicator of the patent subset is within a preset complementarity indicator range, generate a patent combination according to the patent subset, where the patent combination is a technology-related patent combination or a product-related patent combination.
The patent combination generation system provided by the third embodiment can be used for implementing the patent combination generation method provided by the first embodiment.
In the patent combination generation system provided by this embodiment, the first generating unit of the prediction module is configured to group the patent set to be grouped according to the text similarity to generate a plurality of patent subsets, and to generate a patent combination from a patent subset when the network correlation density of the patent subset is judged to be greater than the preset density threshold and the complementarity index of the patent subset is within the preset complementarity index range. Deep-level association relationships between patent technologies can thus be mined in a non-manual manner to generate patent combinations, which improves the efficiency of identifying patent combinations and the transfer and conversion rate of enterprise patents.
Example four
Fig. 5 is a schematic structural diagram of a patent combination generating system according to a fourth embodiment of the present invention, and as shown in fig. 5, the system includes a prediction module 1, where the prediction module 1 includes a first calculating unit 11, a judging unit 12, and a first generating unit 13.
The first calculating unit 11 is configured to calculate text similarity between each patent in the patent set to be grouped and other patents, and calculate a network correlation density and a complementarity index of each patent subset. The judging unit 12 is configured to judge whether the network correlation density of the patent subset is greater than a preset density threshold and whether the complementarity indicator of the patent subset is within a preset complementarity indicator range. The first generating unit 13 is configured to group the patent sets to be grouped according to the text similarity, and generate a plurality of patent subsets, where each patent subset includes a plurality of patents, and when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicators of the patent subset are within a preset complementarity indicator range, generate a patent combination according to the patent subset, where the patent combination is a technology-related patent combination or a product-related patent combination.
Further, the first calculating unit 11 is specifically configured to generate a topic feature vector corresponding to each patent in the to-be-grouped patent set by using an LDA model, and generate text similarity between each patent in the to-be-grouped patent set and other patents according to the topic feature vector and a cosine similarity algorithm.
Further, the system further comprises a training module 2, wherein the training module 2 comprises: an acquisition unit 21, a second calculation unit 22, and a second generation unit 23.
The acquiring unit 21 is configured to acquire a plurality of patent combination samples, which include technology-related patent combination samples and product-related patent combination samples. The second calculating unit 22 is configured to calculate the network correlation density and the complementarity index of each patent combination sample. The second generating unit 23 is configured to generate a density threshold and a complementarity index range according to the network correlation densities and the complementarity indexes of the plurality of patent combination samples, where the complementarity index range includes the complementarity index range of the technology-related patent combination and the complementarity index range of the product-related patent combination.
Further, the second generating unit 23 is specifically configured to average or minimize the network correlation density of the patent combination samples to generate a density threshold, generate a complementarity index range of the technology-related patent combination according to the complementarity index of the technology-related patent combination samples, and generate a complementarity index range of the product-related patent combination according to the complementarity index of the product-related patent combination samples. The first generating unit 13 is specifically configured to generate a technology-related patent combination according to the patent subset when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicator of the patent subset is within a complementarity indicator range of a preset technology-related patent combination, and generate a product-related patent combination according to the patent subset when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicator of the patent subset is within a complementarity indicator range of a preset product-related patent combination.
The patent combination generation system provided by the fourth embodiment can be used for implementing the patent combination generation method provided by the second embodiment.
In the patent combination generation system provided by this embodiment, the first generating unit of the prediction module is configured to group the patent set to be grouped according to the text similarity to generate a plurality of patent subsets, and to generate a patent combination from a patent subset when the network correlation density of the patent subset is judged to be greater than the preset density threshold and the complementarity index of the patent subset is within the preset complementarity index range. Deep-level association relationships between patent technologies can thus be mined in a non-manual manner to generate patent combinations, which improves the efficiency of identifying patent combinations and the transfer and conversion rate of enterprise patents.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principle and implementation of the present invention are explained herein by means of specific embodiments, and the description of the embodiments is only intended to help in understanding the method and core idea of the present invention. Meanwhile, for a person skilled in the art, the specific embodiments and the scope of application may vary according to the idea of the present invention, and the content of this specification should not be construed as limiting the present invention. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.

Claims (9)

1. A patent combination generation method is characterized by comprising the following steps:
calculating the text similarity between each patent and other patents in the patent set to be grouped;
grouping the patent sets to be grouped according to the text similarity and generating a plurality of patent subsets, wherein each patent subset comprises a plurality of patents;
calculating a network correlation density and a complementarity index of each patent subset;
if the network correlation density of the patent subset is judged to be larger than a preset density threshold value and the complementarity indexes of the patent subset are within a preset complementarity index range, generating a patent combination according to the patent subset, wherein the patent combination is a technology-related patent combination or a product-related patent combination;
the complementarity index is generated according to a crossability index, a polymerizability index and a difference index;
the crossability index is calculated by the formula
Figure FDA0003794747590000011
wherein $RS$ is the crossability index of the patent subset or patent combination sample, $p_i$ is the probability distribution value of patent class $i$ among all patent classes of the patent subset or patent combination sample, $p_j$ is the probability distribution value of patent class $j$ among all patent classes of the patent subset or patent combination sample, $d_{ij}$ is the distance value between different patent classes in the patent subset or patent combination sample, and $\alpha$ and $\beta$ are measurement parameters;
the polymerizability index is calculated by the formula
Figure FDA0003794747590000012
wherein $CC$ is the polymerizability index of the patent subset or patent combination sample, $0 \le CC \le 1$, $N_\Delta$ is the number of closed triplets in the patent subset or patent combination sample, and $N_3$ is the number of connected triplets in the patent subset or patent combination sample;
the difference index is calculated by the formula
Figure FDA0003794747590000013
wherein $S$ is the difference index of the patent subset or patent combination sample, $N$ is the total number of patents contained in the patent subset or patent combination sample, and $R_{ij}$ is the co-citation strength between patent $i$ and patent $j$ in the patent subset or patent combination sample,
Figure FDA0003794747590000014
$C_{ij}$ is the co-citation frequency between patent $i$ and patent $j$ in the patent subset or patent combination sample, $C_i$ is the total citation frequency of patent $i$, and $C_j$ is the total citation frequency of patent $j$.
2. The patent combination generation method according to claim 1, wherein the calculating the text similarity between each patent in the patent set to be grouped and other patents specifically comprises:
generating a theme characteristic vector corresponding to each patent in the patent set to be grouped by adopting an LDA model;
and generating text similarity between each patent and other patents in the patent set to be grouped according to the theme feature vector and the cosine similarity algorithm.
3. The patent combination generation method according to claim 1, further comprising, before the calculating the network correlation density and the complementarity index of each patent subset:
acquiring a plurality of patent combination samples, wherein the patent combination samples comprise technology-related patent combination samples and product-related patent combination samples;
calculating the network correlation density and the complementarity index of each patent combination sample;
and generating a density threshold value and a complementarity index range according to the network correlation density and the complementarity index of a plurality of patent combination samples, wherein the complementarity index range comprises a complementarity index range of a technology-related patent combination and a complementarity index range of a product-related patent combination.
4. The method according to claim 3, wherein generating a density threshold and a complementarity indicator range from the network correlation density and the complementarity indicator of the plurality of patent combination samples specifically comprises:
averaging or minimizing the network correlation densities of a plurality of the patent combination samples to generate a density threshold;
generating a complementarity index range of the technology-related patent combination according to the complementarity index of the technology-related patent combination sample;
generating a complementarity index range of the product-related patent combination according to the complementarity index of the product-related patent combination sample;
the generating a patent combination according to the patent subset if the network correlation density of the patent subset is judged to be larger than the preset density threshold value and the complementarity index of the patent subset is within the preset complementarity index range specifically comprises:
if the network correlation density of the patent subset is judged to be larger than a preset density threshold value and the complementarity index of the patent subset is within the complementarity index range of a preset technology-related patent combination, generating the technology-related patent combination according to the patent subset;
and if the network correlation density of the patent subset is judged to be larger than a preset density threshold value and the complementarity index of the patent subset is within the complementarity index range of a preset product-associated patent combination, generating the product-associated patent combination according to the patent subset.
5. The patent combination generation method according to any one of claims 1 to 4, wherein the network correlation density is calculated by the formula
Figure FDA0003794747590000031
wherein $D$ is the network correlation density of the patent subset or patent combination sample, $R_{ij}$ is the text similarity between patent $i$ and patent $j$ in the patent subset or patent combination sample, and $N$ is the total number of patents contained in the patent subset or patent combination sample.
6. A patent portfolio generation system, comprising a prediction module, the prediction module comprising:
the first calculation unit is used for calculating the text similarity between each patent in the patent set to be grouped and other patents, and calculating the network correlation density and the complementarity index of each patent subset; the complementarity index is generated according to a crossability index, a polymerizability index and a difference index;
the crossability index is calculated by the formula
Figure FDA0003794747590000032
wherein $RS$ is the crossability index of the patent subset or patent combination sample, $p_i$ is the probability distribution value of patent class $i$ among all patent classes of the patent subset or patent combination sample, $p_j$ is the probability distribution value of patent class $j$ among all patent classes of the patent subset or patent combination sample, $d_{ij}$ is the distance value between different patent classes in the patent subset or patent combination sample, and $\alpha$ and $\beta$ are measurement parameters;
the polymerizability index is calculated by the formula
Figure FDA0003794747590000041
wherein $CC$ is the polymerizability index of the patent subset or patent combination sample, $0 \le CC \le 1$, $N_\Delta$ is the number of closed triplets in the patent subset or patent combination sample, and $N_3$ is the number of connected triplets in the patent subset or patent combination sample;
the difference index is calculated by the formula
Figure FDA0003794747590000042
wherein $S$ is the difference index of the patent subset or patent combination sample, $N$ is the total number of patents contained in the patent subset or patent combination sample, and $R_{ij}$ is the co-citation strength between patent $i$ and patent $j$ in the patent subset or patent combination sample,
Figure FDA0003794747590000043
$C_{ij}$ is the co-citation frequency between patent $i$ and patent $j$ in the patent subset or patent combination sample, $C_i$ is the total citation frequency of patent $i$, and $C_j$ is the total citation frequency of patent $j$;
the judging unit is used for judging whether the network correlation density of the patent subset is larger than a preset density threshold value and whether the complementarity index of the patent subset is within a preset complementarity index range;
and the first generating unit is used for grouping the patent sets to be grouped according to the text similarity and generating a plurality of patent subsets, each patent subset comprises a plurality of patents, and when the network correlation density of the patent subsets is judged to be greater than a preset density threshold value and the complementarity indexes of the patent subsets are within a preset complementarity index range, a patent combination is generated according to the patent subsets, wherein the patent combination is a technology-related patent combination or a product-related patent combination.
7. The system of claim 6, wherein the first computing unit is specifically configured to generate a topic feature vector corresponding to each patent in the to-be-grouped patent set by using an LDA model, and generate a text similarity between each patent in the to-be-grouped patent set and another patent according to the topic feature vector and a cosine similarity algorithm.
8. The patent portfolio generation system of claim 6, further comprising a training module comprising:
an acquisition unit, configured to acquire a plurality of patent combination samples, wherein the patent combination samples comprise technology-related patent combination samples and product-related patent combination samples;
a second calculating unit, configured to calculate a network correlation density and a complementarity indicator of each patent combination sample;
and the second generating unit is used for generating a density threshold value and a complementarity index range according to the network correlation density and the complementarity index of a plurality of patent combination samples, wherein the complementarity index range comprises a complementarity index range of a technology-related patent combination and a complementarity index range of a product-related patent combination.
9. The patent portfolio generation system of claim 8,
the second generating unit is specifically configured to average or minimize the network correlation densities of the plurality of patent combination samples to generate a density threshold, generate a complementarity indicator range of a technology-related patent combination according to a complementarity indicator of the technology-related patent combination sample, and generate a complementarity indicator range of a product-related patent combination according to a complementarity indicator of the product-related patent combination sample;
the first generating unit is specifically configured to generate a technology-related patent combination according to the patent subset when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicator of the patent subset is within a complementarity indicator range of a preset technology-related patent combination, and generate a product-related patent combination according to the patent subset when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicator of the patent subset is within a complementarity indicator range of a preset product-related patent combination.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910004140.9A CN109726401B (en) 2019-01-03 2019-01-03 Patent combination generation method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910004140.9A CN109726401B (en) 2019-01-03 2019-01-03 Patent combination generation method and system

Publications (2)

Publication Number Publication Date
CN109726401A (en) 2019-05-07
CN109726401B (en) 2022-09-23

Family

ID=66298005

Family Applications (1)

Application Number Status Publication Priority Date Filing Date Title
CN201910004140.9A Active CN109726401B (en) 2019-01-03 2019-01-03 Patent combination generation method and system

Country Status (1)

Country Link
CN (1) CN109726401B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112307009A (en) * 2019-07-26 2021-02-02 傲为信息技术(江苏)有限公司 Method for inquiring technical digital assets

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101625680A (en) * 2008-07-09 2010-01-13 东北大学 Document retrieval method in patent field
CN101894170A (en) * 2010-08-13 2010-11-24 武汉大学 Semantic relationship network-based cross-mode information retrieval method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10417341B2 (en) * 2017-02-15 2019-09-17 Specifio, Inc. Systems and methods for using machine learning and rules-based algorithms to create a patent specification based on human-provided patent claims such that the patent specification is created without human intervention

Also Published As

Publication number Publication date
CN109726401A (en) 2019-05-07

Similar Documents

Publication Publication Date Title
Hegyi et al. Using information theory as a substitute for stepwise regression in ecology and behavior
Gong et al. Evolutionary generation of test data for many paths coverage based on grouping
CN103370722B (en) The system and method that actual volatility is predicted by small echo and nonlinear kinetics
Galeano et al. Multiple break detection in the correlation structure of random variables
CN105335496A (en) Customer service repeated call treatment method based on cosine similarity text mining algorithm
Yao Model based labeling for mixture models
Bellalij et al. Bounding matrix functionals via partial global block Lanczos decomposition
CN110852856A (en) Invoice false invoice identification method based on dynamic network representation
Chukhrova et al. Nonparametric fuzzy hypothesis testing for quantiles applied to clinical characteristics of COVID‐19
CN109726401B (en) Patent combination generation method and system
Coronel-Brizio et al. The Anderson–Darling test of fit for the power-law distribution from left-censored samples
CN110472048A (en) A kind of auxiliary judgement method, apparatus and terminal device
CN116340831A (en) Information classification method and device, electronic equipment and storage medium
CN114139636B (en) Abnormal operation processing method and device
Zhou et al. Numerical investigations of discrete scale invariance in fractals and multifractal measures
CN110968690B (en) Clustering division method and device for words, equipment and storage medium
CN113869904A (en) Suspicious data identification method, device, electronic equipment, medium and computer program
Rahmawati et al. Comparison of behavioral similarity use TARs and Naïve algorithm for calculating similarity in business process model
Ycart et al. Checking False Discovery Rates on Pvplots
CN109685308A (en) A kind of complication system critical path appraisal procedure and system
Qiubo et al. Research on code plagiarism detection model based on Random Forest and Gradient Boosting Decision Tree
Jin et al. Construction and application of knowledge graph of domestic operating system testing
CN111125685A (en) Method and device for predicting network security situation
Wang et al. Self-normalized score-based tests to detect parameter heterogeneity for mixed models
Yadkikar GPU based malware prediction using LightGBM and XGBoost

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant