CN109726401B

CN109726401B - Patent combination generation method and system

Info

Publication number: CN109726401B
Application number: CN201910004140.9A
Authority: CN
Inventors: 茹丽洁; 康飞; 李素粉; 范云杰
Original assignee: China United Network Communications Group Co Ltd
Current assignee: China United Network Communications Group Co Ltd
Priority date: 2019-01-03
Filing date: 2019-01-03
Publication date: 2022-09-23
Anticipated expiration: 2039-01-03
Also published as: CN109726401A

Abstract

The invention relates to the technical field of computers, and in particular discloses a patent combination generation method and system. The method comprises the following steps: calculating the text similarity between each patent in a patent set to be grouped and the other patents in that set; grouping the patent set to be grouped according to the text similarity to generate a plurality of patent subsets, wherein each patent subset comprises a plurality of patents; calculating a network correlation density and a complementarity index of each patent subset; and, if the network correlation density of a patent subset is greater than a preset density threshold and the complementarity index of the patent subset is within a preset complementarity index range, generating a patent combination from the patent subset, wherein the patent combination is a technology-related patent combination or a product-related patent combination. The invention can mine deep-level association relationships between patent technologies without manual effort and generate patent combinations, thereby improving the efficiency of identifying patent combinations and the transfer and conversion rate of enterprise patents.

Description

Patent combination generation method and system

Technical Field

The invention relates to the technical field of computers, in particular to a patent combination generation method and system.

Background

With the arrival of the knowledge economy era, the number of patent applications filed by enterprises in China has increased year by year, and enterprise patent pools continue to grow. However, the transfer and conversion rate of enterprise patents is extremely low: a large number of patents go unused, and a huge gap exists between laboratory results and market application of the technology. To facilitate the transfer and conversion of patent technologies, patent value evaluation must first be performed.

Patent value is reflected not in a single patent but in a patent combination whose members are internally related, and the value of a patent combination is far greater than the sum of the values of its individual patents. Therefore, patent technology transfer and conversion is often directed not at individual or randomly chosen patents but at a series of internally related patents packaged into a patent combination and transferred as a whole, so as to maximize economic benefit. Identifying patent combinations is therefore of great significance for improving the overall value and transfer efficiency of enterprise patents. At present, the identification of patent combinations relies mainly on expert judgment, which is time-consuming and labor-intensive, makes it difficult to mine deep-level association relationships between patent technologies, and does not help improve the transfer and conversion rate of patent technologies.

It should be noted that the above background description is only for the sake of clarity and complete description of the technical solutions of the present invention and for the understanding of those skilled in the art. Such solutions are not considered to be known to the person skilled in the art merely because they have been set forth in the background section of the invention.

Disclosure of Invention

The invention aims to solve the technical problem of the prior art, and provides a patent combination generation method and system, which can be used for mining deep association relations among patent technologies in a non-manual mode and generating patent combinations, so that the identification efficiency of the patent combinations and the patent transfer and conversion rate of enterprise patents are improved.

In order to achieve the above object, the present invention provides a patent combination generating method, including:

calculating the text similarity between each patent and other patents in the patent set to be grouped;

grouping the patent sets to be grouped according to the text similarity and generating a plurality of patent subsets, wherein each patent subset comprises a plurality of patents;

calculating a network correlation density and a complementarity index of each patent subset;

and if the network correlation density of the patent subset is judged to be larger than a preset density threshold value and the complementarity indexes of the patent subset are within a preset complementarity index range, generating a patent combination according to the patent subset, wherein the patent combination is a technology-related patent combination or a product-related patent combination.

Optionally, the calculating the text similarity between each patent in the to-be-grouped patent set and other patents specifically includes:

generating a topic feature vector corresponding to each patent in the patent set to be grouped by adopting an LDA model;

and generating the text similarity between each patent and the other patents in the patent set to be grouped according to the topic feature vectors and a cosine similarity algorithm.

Optionally, before the calculating the network relevance density and the complementarity indicator of each patent subset, the method includes:

acquiring a plurality of patent combination samples, wherein the patent combination samples comprise technology associated patent combination samples and product associated patent combination samples;

calculating the network correlation density and the complementarity index of each patent combination sample;

and generating a density threshold value and a complementarity index range according to the network correlation density and the complementarity index of a plurality of patent combination samples, wherein the complementarity index range comprises a complementarity index range of a technology-related patent combination and a complementarity index range of a product-related patent combination.

Optionally, the generating the density threshold and the complementarity indicator range according to the network correlation density and the complementarity indicator of the plurality of patent combination samples specifically includes:

averaging or minimizing the network correlation densities of a plurality of the patent combination samples to generate a density threshold;

generating a complementarity index range of the technology-related patent combination according to the complementarity index of the technology-related patent combination sample;

generating a complementarity index range of the product-related patent combination according to the complementarity index of the product-related patent combination sample;

if it is determined that the network correlation density of the patent subset is greater than the preset density threshold and the complementarity indicator of the patent subset is within the preset complementarity indicator range, generating a patent combination according to the patent subset specifically includes:

if the network correlation density of the patent subset is judged to be larger than a preset density threshold value and the complementarity index of the patent subset is within the complementarity index range of a preset technology-related patent combination, generating the technology-related patent combination according to the patent subset;

and if the network correlation density of the patent subset is judged to be larger than a preset density threshold value and the complementarity index of the patent subset is within the complementarity index range of a preset product-associated patent combination, generating the product-associated patent combination according to the patent subset.

Optionally, the network correlation density is calculated by the formula

$$D = \frac{2\sum_{i<j} R_{ij}}{N(N-1)}$$

wherein D is the network correlation density of the patent subset or the patent combination sample, $R_{ij}$ is the text similarity between patent i and patent j in the patent subset or the patent combination sample, and N is the total number of patents included in the patent subset or the patent combination sample.

Optionally, the complementarity indicator is generated from a crossability index, an aggregation index, and a difference index;

the crossability index is calculated by a formula in which RS is the crossability index of the patent subset or the patent combination sample, $p_i$ is the probability distribution value of patent class i among all patent classes of the patent subset or the patent combination sample, $p_j$ is the probability distribution value of patent class j among all patent classes of the patent subset or the patent combination sample, $d_{ij}$ is the distance value between patent classes i and j in the patent subset or the patent combination sample, and α and β are metering parameters;

the aggregation index is calculated by the formula

$$CC = \frac{3N_{\Delta}}{N_{3}}$$

wherein CC is the aggregation index of the patent subset or the patent combination sample, 0 ≤ CC ≤ 1, $N_{\Delta}$ is the number of triadic closures in the patent subset or the patent combination sample, and $N_{3}$ is the number of connected triples in the patent subset or the patent combination sample;

the difference index is calculated by a formula in which S is the difference index of the patent subset or the patent combination sample, N is the total number of patents contained in the patent subset or the patent combination sample, and $R_{ij}$ is the co-citation strength between patent i and patent j in the patent subset or the patent combination sample;

the co-citation strength $R_{ij}$ is in turn calculated from $C_{ij}$, $C_i$ and $C_j$, wherein $C_{ij}$ is the co-citation frequency between patent i and patent j in the patent subset or the patent combination sample, $C_i$ is the total citation frequency of patent i, and $C_j$ is the total citation frequency of patent j.

In order to achieve the above object, the present invention further provides a system for generating a patent combination, including a prediction module, where the prediction module includes:

the first calculation unit is used for calculating the text similarity between each patent in the patent set to be grouped and other patents and calculating the network correlation density and the complementarity index of each patent subset;

the judging unit is used for judging whether the network correlation density of the patent subset is larger than a preset density threshold value and whether the complementarity index of the patent subset is within a preset complementarity index range;

and a first generating unit, configured to group the patent set to be grouped according to the text similarity and generate a plurality of patent subsets, where each patent subset includes a plurality of patents, and when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicator of the patent subset is within a preset complementarity indicator range, generate a patent combination according to the patent subset, where the patent combination is a technology-related patent combination or a product-related patent combination.

Optionally, the first computing unit is specifically configured to generate a topic feature vector corresponding to each patent in the to-be-grouped patent set by using an LDA model, and generate text similarity between each patent in the to-be-grouped patent set and other patents according to the topic feature vector and a cosine similarity algorithm.

Optionally, the system further comprises a training module, the training module comprising:

an acquisition unit, configured to acquire a plurality of patent combination samples, wherein the patent combination samples comprise technology-related patent combination samples and product-related patent combination samples;

the second calculation unit is used for calculating the network correlation density and the complementarity index of each patent combination sample;

and a second generating unit which generates a density threshold value and a complementarity index range according to the network correlation density and the complementarity index of a plurality of the patent combination samples, wherein the complementarity index range comprises a complementarity index range of a technology-related patent combination and a complementarity index range of a product-related patent combination.

Optionally, the second generating unit is specifically configured to average or minimize the network correlation density of the patent combination samples to generate a density threshold, generate a complementarity indicator range of the technology-related patent combination according to a complementarity indicator of the technology-related patent combination samples, and generate a complementarity indicator range of the product-related patent combination according to a complementarity indicator of the product-related patent combination samples;

the first generating unit is specifically configured to generate a technology-related patent combination according to the patent subset when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicator of the patent subset is within a complementarity indicator range of a preset technology-related patent combination, and generate a product-related patent combination according to the patent subset when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicator of the patent subset is within a complementarity indicator range of a preset product-related patent combination.

The invention has the following beneficial effects:

according to the patent combination generation method provided by the invention, the patent sets to be grouped are grouped according to the text similarity to generate a plurality of patent subsets, and when the network correlation density of the patent subsets is judged to be greater than the preset density threshold value and the complementarity indexes of the patent subsets are within the preset complementarity index range, the patent combinations are generated according to the patent subsets, the deep level association relationship among patent technologies can be mined in a non-manual mode to generate the patent combinations, so that the identification efficiency of the patent combinations and the patent transfer and conversion rate of enterprise patents are improved.

Specific embodiments of the present invention are disclosed in detail with reference to the following description and the accompanying drawings, which indicate the ways in which the principles of the invention may be employed. It should be understood that the embodiments of the invention are not so limited in scope. The embodiments of the invention include many variations, modifications and equivalents within the spirit and scope of the appended claims.

Features that are described and/or illustrated with respect to one embodiment may be used in the same way or in a similar way in one or more other embodiments, in combination with or instead of the features of the other embodiments.

It should be emphasized that the term "comprises/comprising" when used herein, is taken to specify the presence of stated features, integers, steps or components but does not preclude the presence or addition of one or more other features, integers, steps or components.

Drawings

Fig. 1 is a schematic flow chart of a patent combination generation method according to a first embodiment of the present invention;

Fig. 2 is a schematic flow chart of a patent combination generation method according to a second embodiment of the present invention;

Fig. 3 is a schematic flow chart of an exemplary process of the patent combination generation method according to the second embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a patent combination generation system according to a third embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a patent combination generation system according to a fourth embodiment of the present invention.

Detailed Description

To help those skilled in the art better understand the technical solution of the present invention, the technical solution is described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the present invention.

The patent combination described in this embodiment refers to a set of related patents under the control of the same assignee; its defining feature is technical relevance. Patent combinations are of two kinds. A technology-related patent combination is based on technical similarity and can form a patent barrier; the patent technologies in it are highly similar and their complementarity is low. A product-related patent combination is based on combining different technologies of the same product; the patent technologies in it are highly complementary. Both kinds of patent combination are valuable in the transfer and conversion of enterprise patents: packaging several technically similar or technically complementary solutions into a patent combination can, on the one hand, revitalize the existing patent stock and, on the other hand, improve the overall value and the transfer and conversion rate of the patent technology.

The technology-related patent combination and the product-related patent combination differ greatly in technical complementarity. The patents in a technology-related patent combination are developed around similar technologies and show a strong tendency toward concentration and inheritance in their patent classifications or patent citation relationships, so the complementarity among them is small. The complementarity among the patents in a product-related patent combination is relatively large: these patents may be distributed across different patent categories, and the citation relationships (such as citation coupling) between their patent technologies are not close.

Example one

Fig. 1 is a schematic flow chart of a patent combination generation method according to an embodiment of the present invention, as shown in fig. 1, the method includes:

Step 101, calculating the text similarity between each patent in the patent set to be grouped and the other patents.

Step 102, grouping the patent set to be grouped according to the text similarity and generating a plurality of patent subsets, wherein each patent subset comprises a plurality of patents.

Step 103, calculating the network correlation density and the complementarity index of each patent subset.

Step 104, judging whether the network correlation density of the patent subset is greater than a preset density threshold and whether the complementarity index of the patent subset is within a preset complementarity index range; if so, executing step 105; if not, ending the process.

Step 105, generating a patent combination according to the patent subset, wherein the patent combination is a technology-related patent combination or a product-related patent combination.

According to the patent combination generation method provided by the embodiment, the patent sets to be grouped are grouped according to the text similarity, a plurality of patent subsets are generated, when the fact that the network correlation density of the patent subsets is larger than the preset density threshold value and the complementarity indexes of the patent subsets are located in the preset complementarity index range is judged, the patent combinations are generated according to the patent subsets, the deep-level association relation among patent technologies can be mined in a non-manual mode, and the patent combinations are generated, so that the recognition efficiency of the patent combinations and the patent transfer and conversion rates of enterprise patents are improved.
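The judgment in steps 104 and 105 can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function name `classify_subset`, the treatment of the complementarity index ranges as closed intervals, the order in which the two ranges are checked, and all numeric values are assumptions.

```python
def classify_subset(density, complementarity, density_threshold,
                    tech_range, product_range):
    """Return the kind of patent combination a subset yields, or None.

    tech_range and product_range are (low, high) tuples. Treating them as
    closed intervals and checking the technology range first are assumptions.
    """
    # Step 104, first condition: density must exceed the preset threshold.
    if density <= density_threshold:
        return None
    # Step 104, second condition: complementarity must fall inside a range.
    low, high = tech_range
    if low <= complementarity <= high:
        return "technology-related"
    low, high = product_range
    if low <= complementarity <= high:
        return "product-related"
    return None


# Hypothetical thresholds and subset values, for illustration only.
result_a = classify_subset(0.72, 0.15, 0.60, (0.0, 0.3), (0.5, 0.9))
result_b = classify_subset(0.72, 0.65, 0.60, (0.0, 0.3), (0.5, 0.9))
result_c = classify_subset(0.40, 0.15, 0.60, (0.0, 0.3), (0.5, 0.9))
```

A subset that fails either condition yields no patent combination, which corresponds to the "process is ended" branch of step 104.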

Example two

Fig. 2 is a schematic flow chart of a patent combination generating method according to a second embodiment of the present invention, and as shown in fig. 2, the method includes the following steps:

Step 201, obtaining a plurality of patent combination samples, wherein the patent combination samples comprise technology-related patent combination samples and product-related patent combination samples.

Preferably, the steps in this embodiment are executed by a patent combination generation system.

A patent combination sample is a patent combination recognized by experts in the field and includes a plurality of patents. For example, a patent combination sample may be several patents PP1, PP2, …, PPn of company A within technical field T.

Step 202, calculating the network correlation density and the complementarity index of each patent combination sample.

The network correlation density of a patent combination sample represents the similarity between the patents in the sample. The patents in a patent combination sample are related to one another, and their text content shows a certain similarity. For example, the technical content, efficacy content and use content of several patents in the sample are similar or identical, or some of the patents include secondary indexing content (such as the USE field of the Derwent database).

Specifically, an LDA (Latent Dirichlet Allocation) topic model, a document topic generation model, is first used to generate a topic feature vector for each patent in the patent combination sample. The text similarity between each patent and the other patents in the patent combination sample is then generated from the topic feature vectors of all patents in the sample using a cosine similarity algorithm. Finally, the network correlation density of the patent combination sample is calculated by the formula

$$D = \frac{2\sum_{i<j} R_{ij}}{N(N-1)}$$

wherein D is the network correlation density of the patent combination sample, $R_{ij}$ is the text similarity between patent i and patent j in the patent combination sample, and N is the total number of patents included in the patent combination sample. The topic feature vectors are used to construct a text correlation network; the text similarity and the network correlation density are generated on the basis of this network and social network theory. The network correlation density represents the ratio of the sum of the text similarities between the patents in the patent combination sample to the maximum possible sum of text similarities.
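As a worked illustration of the ratio just described, the sketch below computes the network correlation density from a symmetric similarity matrix. It assumes that the maximum similarity of any pair is 1, so that the maximum possible sum over the N(N-1)/2 unordered pairs is N(N-1)/2; the function name and the sample matrix are hypothetical.

```python
def network_correlation_density(R):
    """Ratio of the summed pairwise similarities to the maximum possible sum.

    R is a symmetric N x N matrix of text similarities in [0, 1].
    Assumes each pair can contribute at most 1 to the sum.
    """
    n = len(R)
    if n < 2:
        raise ValueError("at least two patents are needed")
    # Sum each unordered pair (i < j) once.
    pair_sum = sum(R[i][j] for i in range(n) for j in range(i + 1, n))
    max_sum = n * (n - 1) / 2
    return pair_sum / max_sum


# Hypothetical similarities for three patents.
R = [
    [1.0, 0.9, 0.3],
    [0.9, 1.0, 0.6],
    [0.3, 0.6, 1.0],
]
D = network_correlation_density(R)  # (0.9 + 0.3 + 0.6) / 3
```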

Optionally, the topic feature vector corresponding to each patent in the patent combination sample is generated with Gensim, an open-source Python toolkit: 8 topics are extracted from each patent, parameters are estimated for all patents in the patent combination sample using a Gibbs sampling algorithm, and the topic-feature word probability distribution of the whole patent combination sample is obtained after 1500 iterations, finally yielding the topic feature vector corresponding to each patent. Expressing each patent as a topic feature vector greatly reduces the dimensionality of the patent representation while preserving the accuracy of patent combination generation.
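Once topic feature vectors are available, the pairwise text similarity can be computed as sketched below. The sketch uses only the Python standard library and does not call Gensim; the topic vectors are hypothetical stand-ins for LDA output (shortened to three topics for readability), and the function names are illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def similarity_matrix(topic_vectors):
    """Pairwise cosine similarity R[i][j] between all topic vectors."""
    n = len(topic_vectors)
    return [[cosine_similarity(topic_vectors[i], topic_vectors[j])
             for j in range(n)] for i in range(n)]


# Hypothetical topic distributions for three patents.
topic_vectors = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
]
R = similarity_matrix(topic_vectors)
```

In this example the first two patents share a dominant topic, so `R[0][1]` is much larger than `R[0][2]`.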

In this embodiment, the network correlation density is chosen to represent the technical correlation between the patents in a patent combination sample. This choice follows the principles of completeness, applicability and simplicity, and the network correlation density reflects the technical correlation between the patents in the sample more intuitively.

The complementarity index of a patent combination sample indicates the degree of technical difference between the patents in the sample. It is generated from the crossability index, the aggregation index and the difference index, which are produced by co-classification analysis, social network analysis and citation analysis, respectively.

The crossability index is calculated by a formula in which RS is the crossability index of the patent combination sample, $p_i$ is the probability distribution value of patent class i among all patent classes of the patent combination sample, $p_j$ is the probability distribution value of patent class j among all patent classes of the patent combination sample, $d_{ij}$ is the distance value between patent classes i and j in the patent combination sample, generated using a cosine similarity algorithm, and α and β are metering parameters. Each parameter in the crossability index is generated from the text correlation network constructed from the patent combination sample.
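The crossability formula itself is not reproduced in this text. The symbols listed (RS, class shares, a class distance, and two parameters α and β) match the Rao-Stirling diversity measure, so the sketch below assumes that form, RS = Σ_{i≠j} (p_i p_j)^α (d_ij)^β. Both the form and the assignment of α and β to the two factors are assumptions, and the input values are hypothetical.

```python
def crossability_index(p, d, alpha=1.0, beta=1.0):
    """Rao-Stirling style diversity, assumed form (not taken from the source).

    p: list of class shares summing to 1.
    d: symmetric matrix of distances between classes.
    alpha weights the share product, beta weights the distance (assumption).
    """
    n = len(p)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:  # only pairs of different classes contribute
                total += (p[i] * p[j]) ** alpha * d[i][j] ** beta
    return total


# Hypothetical shares and distances for three patent classes.
p = [0.5, 0.3, 0.2]
d = [
    [0.0, 0.2, 0.8],
    [0.2, 0.0, 0.6],
    [0.8, 0.6, 0.0],
]
rs = crossability_index(p, d)  # 2 * (0.15*0.2 + 0.10*0.8 + 0.06*0.6)
```

Under this assumed form, a sample concentrated in one class, or spread over classes that are close to each other, scores low, while a sample spread over distant classes scores high.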

Specifically, a patent in the patent combination sample may have several patent classification numbers, and different patent classification numbers correspond to different patent categories; the patent categories of the patent combination sample are all the distinct patent classification numbers of all patents included in the sample. The crossability index is a comprehensive measure of both the distribution of the patents in the patent combination sample across patent categories and the distance and difference between those categories. In this embodiment, optionally, all patents in the patent combination sample are expressed as category vectors in order to compute $p_i$ and $p_j$ statistically; the resulting patent category statistics may be as shown in Table 1:

Table 1

Patent combination sample	C₁	C₂	…	C_N
Z₁	1	0	…	0
Z₂	1	1	…	0
…	…	…	…	1
Z_M	0	1	…	0

wherein $C_i$ is the i-th patent category in the patent combination sample, $Z_j$ is the j-th patent in the patent combination sample, 1 indicates that the patent belongs to the patent category, and 0 indicates that it does not.
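A table of this kind can be turned into class probability values as sketched below. The source does not state how the probability distribution values are derived from the category vectors; the sketch assumes that the value for a class is its share of all class assignments in the table. The membership rows are hypothetical.

```python
def class_distribution(membership):
    """Share of each patent class among all class assignments.

    membership[j][i] is 1 if patent Z_j belongs to class C_i, else 0.
    Defining p_i as a share of assignments is an assumption.
    """
    # Column sums: how many patents carry each class.
    counts = [sum(column) for column in zip(*membership)]
    total = sum(counts)
    if total == 0:
        raise ValueError("no class assignments in the table")
    return [count / total for count in counts]


# Hypothetical category vectors for three patents and three classes.
membership = [
    [1, 0, 0],  # Z1
    [1, 1, 0],  # Z2
    [0, 1, 1],  # Z3
]
p = class_distribution(membership)  # counts 2, 2, 1 out of 5 assignments
```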

The aggregation index is calculated by the formula

$$CC = \frac{3N_{\Delta}}{N_{3}}$$

wherein CC is the aggregation index of the patent combination sample, 0 ≤ CC ≤ 1, $N_{\Delta}$ is the number of triadic closures in the patent combination sample, and $N_{3}$ is the number of connected triples in the patent combination sample. Specifically, a triadic closure in the patent combination sample is a triadic closure in the text correlation network constructed from the patent combination sample, and each triadic closure can be regarded as three different connected triples. The aggregation index describes how strongly the patents in a patent combination sample cluster together by measuring the number of triadic closures in the text correlation network. The factor of 3 applied to $N_{\Delta}$ ensures that 0 ≤ CC ≤ 1.
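The counting behind this index can be sketched as follows. The sketch takes an undirected network as an adjacency mapping; how edges of the text correlation network are derived from similarities (for example by a similarity cut-off) is not specified in the source, so the example network is simply given. Function and variable names are illustrative.

```python
from itertools import combinations


def aggregation_index(adjacency):
    """CC = 3 * (number of triangles) / (number of connected triples).

    adjacency maps each node to the set of its neighbours (undirected).
    Returns 0.0 when the network has no connected triple (a convention).
    """
    nodes = list(adjacency)
    # A triangle is a set of three mutually connected nodes.
    triangles = sum(
        1 for a, b, c in combinations(nodes, 3)
        if b in adjacency[a] and c in adjacency[a] and c in adjacency[b]
    )
    # A connected triple is a pair of edges sharing a centre node.
    triples = sum(len(adjacency[v]) * (len(adjacency[v]) - 1) // 2
                  for v in nodes)
    return 3 * triangles / triples if triples else 0.0


# Hypothetical network: a triangle 0-1-2 plus a pendant node 3 attached to 2.
adjacency = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2},
}
cc = aggregation_index(adjacency)  # 1 triangle, 5 connected triples
```

Because every triangle contains three connected triples, the numerator can never exceed the denominator, which is why the value stays between 0 and 1.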

The difference index is calculated by a formula in which S is the difference index of the patent combination sample, N is the total number of patents contained in the patent combination sample, and $R_{ij}$ is the co-citation strength between patent i and patent j in the patent combination sample. The co-citation strength $R_{ij}$ is in turn calculated from $C_{ij}$, $C_i$ and $C_j$, wherein $C_{ij}$ is the co-citation frequency between patent i and patent j in the patent combination sample, $C_i$ is the total citation frequency of patent i, and $C_j$ is the total citation frequency of patent j. The larger the co-citation frequency, the closer the citation relationship between the patents. The difference index is an index whose attribute is opposite to similarity: the larger the difference index, the greater the similarity and the smaller the difference between the patents in the patent combination sample; the smaller the difference index, the smaller the similarity and the greater the difference between them.

Step 203, generating a density threshold and a complementarity index range according to the network correlation density and the complementarity index of the plurality of patent combination samples, wherein the complementarity index range comprises the complementarity index range of the technology-related patent combination and the complementarity index range of the product-related patent combination.

Specifically, step 203 comprises: averaging or minimizing the network correlation densities of a plurality of the patent combination samples to generate a density threshold; generating a complementarity index range of the technology-related patent combination according to the complementarity index of the technology-related patent combination sample; and generating a complementarity index range of the product-related patent combination according to the complementarity index of the product-related patent combination sample.
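Step 203 can be sketched as follows. The source states that the density threshold is the mean or the minimum of the sample densities but does not say how a complementarity index range is derived from the sample values; taking the minimum and maximum observed for each sample type is an assumption of this sketch, as are the function names and the sample data.

```python
from statistics import mean


def density_threshold(densities, mode="mean"):
    """Density threshold as the mean or the minimum of the sample densities."""
    if mode == "mean":
        return mean(densities)
    if mode == "min":
        return min(densities)
    raise ValueError("mode must be 'mean' or 'min'")


def complementarity_range(values):
    """Assumed rule: the range spans the smallest to the largest sample value."""
    return (min(values), max(values))


# Hypothetical expert-recognised samples: (kind, density, complementarity).
samples = [
    ("technology", 0.7, 0.10),
    ("technology", 0.8, 0.20),
    ("product", 0.6, 0.60),
    ("product", 0.5, 0.80),
]
densities = [s[1] for s in samples]
threshold_mean = density_threshold(densities, "mean")
threshold_min = density_threshold(densities, "min")
tech_range = complementarity_range(
    [s[2] for s in samples if s[0] == "technology"])
product_range = complementarity_range(
    [s[2] for s in samples if s[0] == "product"])
```

Choosing the minimum gives a more permissive threshold than the mean, so more candidate subsets pass the density test.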

Steps 201 to 203 train on the patent combination samples to establish a patent combination generation model. In this embodiment, the patent combination generation model can be understood as predicting patent combinations according to the density threshold and the complementarity index range. Optionally, the patent combination generation model may also be tested; that is, the density threshold and the complementarity index range are tested and adjusted according to the test results to obtain an optimal density threshold and complementarity index range. For example, the test samples for the patent combination generation model may be other patents of enterprise A in technical field T besides those in the patent combination samples.

The density threshold and the complementarity index range are used to generate patent combinations from the patent set to be grouped, and can be set according to the actual situation of the patents to be grouped. The complementarity index range includes a complementarity index range for technology-related patent combinations and a complementarity index range for product-related patent combinations. The complementarity index range is generated from the weighted crossability index, aggregation index and difference index of the patent combination samples. The weight of each index is determined by coefficient-of-variation weighting and represents the influence of that index in distinguishing technology-related patent combinations from product-related patent combinations. The crossability index, aggregation index and difference index of the technology-related patent combination samples are weighted to determine the complementarity index range of the technology-related patent combination, and those of the product-related patent combination samples are weighted to determine the complementarity index range of the product-related patent combination.

Optionally, the weights of the crossability index, the polymerizability index and the difference index are generated as follows. First, the variation coefficient $v_i$ of each index is calculated by the formula

$$v_i = \frac{S_i}{\bar{x}_i}$$

where $S_i$ is the standard deviation of the i-th index and $\bar{x}_i$ is the average of the i-th index over all patent combination samples. Then the variation coefficients of the indexes are normalized by the formula

$$w_i = \frac{v_i}{\sum_{j} v_j}$$

to generate the weight $w_i$ of each index.
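By way of a non-limiting illustration, the variation coefficient weighting described above can be sketched in Python. The function name `variation_coefficient_weights`, the sample values, and the use of the population standard deviation are assumptions made for this sketch; the description does not state which form of standard deviation is used.

```python
from statistics import mean, pstdev


def variation_coefficient_weights(samples):
    """Return one weight per index using variation coefficient weighting.

    `samples` is a list of tuples, one tuple per patent combination sample,
    each holding the index values in a fixed order, for example
    (crossability, polymerizability, difference).
    Assumption: population standard deviation; every index mean is non-zero.
    """
    columns = list(zip(*samples))  # one column of values per index
    # Variation coefficient of each index: standard deviation / mean.
    coefficients = [pstdev(col) / mean(col) for col in columns]
    total = sum(coefficients)
    # Normalize so that the weights sum to 1.
    return [v / total for v in coefficients]


if __name__ == "__main__":
    # Made-up index values for three hypothetical patent combination samples.
    demo_samples = [(0.2, 0.5, 0.1), (0.4, 0.6, 0.3), (0.3, 0.4, 0.2)]
    print(variation_coefficient_weights(demo_samples))
```

An index whose values vary more across the samples, relative to its mean, receives a larger weight, which matches the stated purpose of the weight as a measure of how strongly the index discriminates between sample types.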

Step 204, calculating the text similarity between each patent in the patent set to be grouped and the other patents.

The patent set to be grouped comprises a plurality of patents, and the method of this embodiment is used for generating one or more patent combinations from the patent set to be grouped. For example, the patent set to be grouped consists of all other patents in technical field T of enterprise A, apart from the patents in the patent combination samples.

Step 204 specifically comprises the following steps: generating a topic feature vector corresponding to each patent in the patent set to be grouped by using an LDA model; and generating the text similarity between each patent and the other patents in the patent set to be grouped according to the topic feature vectors and a cosine similarity algorithm.

For the specific process of step 204, reference may be made to the description in step 202 of how the text similarity between patents in the patent combination samples is generated, which is not repeated here.
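By way of a non-limiting illustration, the cosine similarity part of step 204 can be sketched in Python as follows. The sketch assumes the topic feature vectors have already been produced by an LDA model (the LDA step itself is not shown); the function names `cosine_similarity` and `similarity_matrix` and the example vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two topic feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def similarity_matrix(topic_vectors):
    """Symmetric matrix of pairwise text similarities between patents."""
    n = len(topic_vectors)
    sim = [[1.0] * n for _ in range(n)]  # diagonal left at 1.0
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine_similarity(topic_vectors[i], topic_vectors[j])
            sim[i][j] = sim[j][i] = s
    return sim


if __name__ == "__main__":
    # Made-up topic distributions for three hypothetical patents.
    vectors = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.1, 0.8]]
    for row in similarity_matrix(vectors):
        print([round(x, 3) for x in row])
```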

Step 205, grouping the patent set to be grouped according to the text similarity and generating a plurality of patent subsets, wherein each patent subset comprises a plurality of patents.

Specifically, a plurality of patents with close text similarity are divided into a patent subset. Multiple patents within a patent subset belong to the same patent subclass.
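The description does not specify a particular grouping algorithm for step 205. Purely as an assumed illustration, the sketch below places two patents in the same subset whenever their text similarity reaches a cut-off, and merges such pairs transitively with a union-find structure. The function name `group_by_similarity`, the cut-off parameter `min_similarity`, and the example matrix are hypothetical and are not taken from the embodiment.

```python
def group_by_similarity(sim, min_similarity):
    """Group patent indices whose pairwise similarity reaches a cut-off.

    `sim` is a symmetric matrix of text similarities. Patents linked directly
    or through a chain of sufficiently similar patents end up in one subset.
    Subsets with a single patent are dropped, since each patent subset
    comprises a plurality of patents.
    """
    n = len(sim)
    parent = list(range(n))

    def find(x):
        # Find the representative of x, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= min_similarity:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return [members for members in groups.values() if len(members) > 1]


if __name__ == "__main__":
    demo = [
        [1.0, 0.9, 0.1],
        [0.9, 1.0, 0.2],
        [0.1, 0.2, 1.0],
    ]
    print(group_by_similarity(demo, 0.8))
```

Other clustering techniques could equally be used here; the only requirement stated in the description is that patents with close text similarity fall into the same patent subset.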

Step 206, calculating the network correlation density and the complementarity index of each patent subset.

For the manner of generating the network correlation density and the complementarity index of the patent subset in step 206, reference may be made to the description in step 202 of how the network correlation density and the complementarity index of the patent combination samples are generated, which is not repeated here.

Step 207, judging whether the network correlation density of the patent subset is greater than a preset density threshold, if so, executing step 208; if not, the process is ended.

When the network correlation density of the patent subset is greater than the preset density threshold, the patents in the patent subset are highly similar to one another, and the patent subset can be used as a candidate for a patent combination.

Step 208, judging whether the complementarity indexes of the patent subsets are within the complementarity index range of a preset technology-related patent combination, if so, executing step 209; if not, go to step 210.

Step 209, generating a technology-related patent combination according to the patent subset, and ending the process.

Step 210, judging whether the complementarity index of the patent subset is within the preset complementarity index range of the product-related patent combination, if so, executing step 211; if not, the process ends.

Step 211, generating a product-related patent combination according to the patent subset, and ending the process.

The complementarity index of the product-related patent combination is much greater than that of the technology-related patent combination.
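By way of a non-limiting illustration, the decision flow of steps 207 to 211 can be sketched in Python as follows. The function name `classify_subset`, the returned labels, and the numeric values in the usage example are placeholders chosen for this sketch; treating the ranges as open intervals is likewise an assumption.

```python
def classify_subset(density, z, density_threshold, tech_range, product_range):
    """Decide whether a patent subset yields a patent combination.

    density           -- network correlation density of the patent subset
    z                 -- complementarity index of the patent subset
    density_threshold -- preset density threshold
    tech_range        -- (low, high) range for technology-related combinations
    product_range     -- (low, high) range for product-related combinations
    Returns a label string, or None when no patent combination is generated.
    """
    # Step 207: the subset must be dense enough to be a candidate.
    if density <= density_threshold:
        return None
    # Steps 208 and 209: check the technology-related range first.
    low, high = tech_range
    if low < z < high:
        return "technology-related"
    # Steps 210 and 211: otherwise check the product-related range.
    low, high = product_range
    if low < z < high:
        return "product-related"
    return None


if __name__ == "__main__":
    # Placeholder thresholds, for illustration only.
    print(classify_subset(0.7, 0.10, 0.6, (0.01, 0.07), (0.07, 0.15)))
    print(classify_subset(0.5, 0.10, 0.6, (0.01, 0.07), (0.07, 0.15)))
```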

The patent combination generation method of this embodiment is illustrated below with a specific example:

All 5452 patents of enterprise A in field T are retrieved from a patent database, and their content is manually interpreted by experts in the technical field to obtain 14 patent combinations of the enterprise, which serve as the patent combination samples in this embodiment. The network correlation density and the complementarity index of each of the 14 patent combination samples are calculated, and the following results are obtained from them: the density threshold is 0.6, the complementarity index range of the technology-related patent combination is 0.15 < z < 0.07, and the complementarity index range of the product-related patent combination is 0.07 < z < 0.15. Fig. 3 is a schematic flowchart of an example of the patent combination generation method according to the second embodiment of the present invention. As shown in Fig. 3, 11 other patent combinations of the enterprise with combination potential are finally predicted according to the density threshold and the complementarity index range. One of these patent combinations includes 167 patents relating to etching, LCD, liquid crystal display, mounting substrate, photoresist layer, substrate processing, evaporator, plasma deposition, etc., and its subject matter can be summarized as an etching process.

According to the patent combination generation method provided by this embodiment, the patent set to be grouped is grouped according to the text similarity to generate a plurality of patent subsets, and a patent combination is generated from a patent subset when the network correlation density of the patent subset is judged to be greater than the preset density threshold and the complementarity index of the patent subset is within the preset complementarity index range. In this way, the deep-level association relations among patent technologies can be mined in a non-manual, computer-assisted manner to generate patent combinations, which improves the identification efficiency of patent combinations and the patent transfer and conversion rate of enterprise patents while effectively saving labor cost.

It should be noted that while the operations of the methods of the present invention are illustrated in the drawings in a particular order, this does not require or imply that the operations must be performed in that particular order, or that all illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.

Example three

Fig. 4 is a schematic structural diagram of a patent combination generating system according to a third embodiment of the present invention, and as shown in fig. 4, the system includes a prediction module 1, where the prediction module 1 includes a first calculating unit 11, a judging unit 12, and a first generating unit 13.

The first calculating unit 11 is configured to calculate text similarity between each patent in the patent set to be grouped and other patents, and calculate a network correlation density and a complementarity index of each patent subset. The judging unit 12 is configured to judge whether the network correlation density of the patent subset is greater than a preset density threshold and whether the complementarity indicator of the patent subset is within a preset complementarity indicator range. The first generating unit 13 is configured to group the patent sets to be grouped according to the text similarity and generate a plurality of patent subsets, where each patent subset includes a plurality of patents, and when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicator of the patent subset is within a preset complementarity indicator range, generate a patent combination according to the patent subset, where the patent combination is a technology-related patent combination or a product-related patent combination.

The patent combination generation system provided by the third embodiment can be used for implementing the patent combination generation method provided by the first embodiment.

In the patent combination generation system provided by this embodiment, the first generation unit of the prediction module is configured to group the to-be-grouped patent sets according to the text similarity and generate a plurality of patent subsets, and when it is determined that the network correlation density of the patent subsets is greater than the preset density threshold and the complementarity indexes of the patent subsets are within the preset complementarity index range, generate the patent combinations according to the patent subsets, and can mine the deep-level association relationship between patent technologies in a non-manual manner and generate the patent combinations, thereby improving the identification efficiency of the patent combinations and the patent transfer and conversion rates of the enterprise patents.

Example four

Fig. 5 is a schematic structural diagram of a patent combination generating system according to a fourth embodiment of the present invention, and as shown in fig. 5, the system includes a prediction module 1, where the prediction module 1 includes a first calculating unit 11, a judging unit 12, and a first generating unit 13.

The first calculating unit 11 is configured to calculate text similarity between each patent in the patent set to be grouped and other patents, and calculate a network correlation density and a complementarity index of each patent subset. The judging unit 12 is configured to judge whether the network correlation density of the patent subset is greater than a preset density threshold and whether the complementarity indicator of the patent subset is within a preset complementarity indicator range. The first generating unit 13 is configured to group the patent sets to be grouped according to the text similarity, and generate a plurality of patent subsets, where each patent subset includes a plurality of patents, and when it is determined that the network correlation density of the patent subset is greater than a preset density threshold and the complementarity indicators of the patent subset are within a preset complementarity indicator range, generate a patent combination according to the patent subset, where the patent combination is a technology-related patent combination or a product-related patent combination.

Further, the first calculating unit 11 is specifically configured to generate a topic feature vector corresponding to each patent in the to-be-grouped patent set by using an LDA model, and generate text similarity between each patent in the to-be-grouped patent set and other patents according to the topic feature vector and a cosine similarity algorithm.

Further, the system further comprises a training module 2, wherein the training module 2 comprises: an acquisition unit 21, a second calculation unit 22, and a second generation unit 23.

The acquiring unit 21 is configured to acquire a plurality of patent combination samples, which include technology-related patent combination samples and product-related patent combination samples. The second calculating unit 22 is configured to calculate the network correlation density and the complementarity index of each patent combination sample. The second generating unit 23 is configured to generate a density threshold and a complementarity index range according to the network correlation densities and the complementarity indexes of the plurality of patent combination samples, wherein the complementarity index range includes a complementarity index range of the technology-related patent combination and a complementarity index range of the product-related patent combination.

Further, the second generating unit 23 is specifically configured to take the average or the minimum of the network correlation densities of the patent combination samples to generate the density threshold, generate the complementarity index range of the technology-related patent combination according to the complementarity indexes of the technology-related patent combination samples, and generate the complementarity index range of the product-related patent combination according to the complementarity indexes of the product-related patent combination samples. The first generating unit 13 is specifically configured to generate a technology-related patent combination according to the patent subset when it is determined that the network correlation density of the patent subset is greater than the preset density threshold and the complementarity index of the patent subset is within the preset complementarity index range of the technology-related patent combination, and to generate a product-related patent combination according to the patent subset when it is determined that the network correlation density of the patent subset is greater than the preset density threshold and the complementarity index of the patent subset is within the preset complementarity index range of the product-related patent combination.
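By way of a non-limiting illustration, the threshold generation performed by the second generating unit 23 can be sketched in Python as follows. Taking the average or the minimum of the sample densities follows the description above. Deriving each complementarity index range as the minimum and maximum of the sample values is only an assumption of this sketch, and the function names and sample values are hypothetical.

```python
from statistics import mean


def density_threshold(sample_densities, mode="min"):
    """Density threshold from the network correlation densities of the samples.

    mode="mean" takes the average, mode="min" takes the minimum.
    """
    if mode == "mean":
        return mean(sample_densities)
    if mode == "min":
        return min(sample_densities)
    raise ValueError("mode must be 'mean' or 'min'")


def index_range(sample_indexes):
    """Complementarity index range of one sample type.

    Assumption: the range spans the smallest and largest sample values.
    """
    return (min(sample_indexes), max(sample_indexes))


if __name__ == "__main__":
    # Made-up values for hypothetical patent combination samples.
    densities = [0.62, 0.80, 0.71]
    tech_indexes = [0.02, 0.05, 0.06]
    product_indexes = [0.08, 0.12, 0.14]
    print(density_threshold(densities, "min"))
    print(density_threshold(densities, "mean"))
    print(index_range(tech_indexes), index_range(product_indexes))
```

Using the minimum yields a more permissive threshold than the average, so more patent subsets pass as candidates; which choice is preferable depends on the samples and on the testing and adjustment mentioned for the model.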

The patent combination generation system provided by the fourth embodiment can be used for implementing the patent combination generation method provided by the second embodiment.

In the patent combination generation system provided by this embodiment, the first generation unit of the prediction module is configured to group the to-be-grouped patent sets according to the text similarity and generate a plurality of patent subsets, and when it is determined that the network correlation density of the patent subsets is greater than the preset density threshold and the complementarity indexes of the patent subsets are within the preset complementarity index range, generate the patent combinations according to the patent subsets, and can mine the deep-level association relationship between patent technologies in a non-manual manner and generate the patent combinations, thereby improving the identification efficiency of the patent combinations and the patent transfer and conversion rates of patents in enterprises.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The principle and the implementation mode of the invention are explained by applying specific embodiments in the invention, and the description of the embodiments is only used for helping to understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and the content of the present specification should not be construed as a limitation to the present invention. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.

Claims

1. A patent combination generation method is characterized by comprising the following steps:

if the network correlation density of the patent subset is judged to be larger than a preset density threshold value and the complementarity indexes of the patent subset are within a preset complementarity index range, generating a patent combination according to the patent subset, wherein the patent combination is a technology-related patent combination or a product-related patent combination;

the complementarity index is generated according to a crossability index, a polymerizability index and a difference index;

the crossability index is calculated by the formula

the polymerizability index is calculated by the formula

wherein CC is the polymerizability index of the patent subset or the patent combination sample, 0 ≤ CC ≤ 1, $N_\Delta$ is the number of closed triplets (triadic closures) in the patent subset or the patent combination sample, and $N_3$ is the number of connected triplets in the patent subset or the patent combination sample;

the difference index is calculated by the formula

wherein S is the difference index of the patent subset or the patent combination sample, N is the total number of patents contained in the patent subset or the patent combination sample, and $R_{ij}$ is the co-citation strength between patent i and patent j in the patent subset or the patent combination sample,

$C_{ij}$ is the co-citation frequency between patent i and patent j in the patent subset or the patent combination sample, $C_i$ is the total citation frequency of patent i, and $C_j$ is the total citation frequency of patent j.

2. The patent combination generation method according to claim 1, wherein the calculating the text similarity between each patent in the patent set to be grouped and other patents specifically comprises:

3. The method of generating patent combinations according to claim 1, wherein prior to said calculating the network correlation density and the complementarity indicator for each of said subsets of patents, comprising:

acquiring a plurality of patent combination samples, wherein the patent combination samples comprise technology-related patent combination samples and product-related patent combination samples;

4. The method according to claim 3, wherein generating a density threshold and a complementarity indicator range from the network correlation density and the complementarity indicator of the plurality of patent combination samples specifically comprises:

5. A patent combination generation method according to any one of claims 1 to 4,

the network correlation density is calculated by the formula

wherein D is the network correlation density of the patent subset or the patent combination sample, $R_{ij}$ is the text similarity between patent i and patent j in the patent subset or the patent combination sample, and N is the total number of patents included in the patent subset or the patent combination sample.

6. A patent portfolio generation system, comprising a prediction module, the prediction module comprising:

the first calculation unit is used for calculating the text similarity between each patent in the patent set to be grouped and other patents, and calculating the network correlation density and the complementarity index of each patent subset; the complementarity index is generated according to a crossability index, a polymerizability index and a difference index;

the crossability index is calculated by the formula

the polymerizability index is calculated by the formula

the difference index is calculated by the formula

$C_{ij}$ is the co-citation frequency between patent i and patent j in the patent subset or the patent combination sample, $C_i$ is the total citation frequency of patent i, and $C_j$ is the total citation frequency of patent j;

and the first generating unit is used for grouping the patent sets to be grouped according to the text similarity and generating a plurality of patent subsets, each patent subset comprises a plurality of patents, and when the network correlation density of the patent subsets is judged to be greater than a preset density threshold value and the complementarity indexes of the patent subsets are within a preset complementarity index range, a patent combination is generated according to the patent subsets, wherein the patent combination is a technology-related patent combination or a product-related patent combination.

7. The system of claim 6, wherein the first computing unit is specifically configured to generate a topic feature vector corresponding to each patent in the to-be-grouped patent set by using an LDA model, and generate a text similarity between each patent in the to-be-grouped patent set and another patent according to the topic feature vector and a cosine similarity algorithm.

8. The patent portfolio generation system of claim 6, further comprising a training module comprising:

a second calculating unit, configured to calculate a network correlation density and a complementarity indicator of each patent combination sample;

and the second generating unit is used for generating a density threshold value and a complementarity index range according to the network correlation density and the complementarity index of a plurality of patent combination samples, wherein the complementarity index range comprises a complementarity index range of a technology-related patent combination and a complementarity index range of a product-related patent combination.

9. The patent portfolio generation system of claim 8,

the second generating unit is specifically configured to average or minimize the network correlation densities of the plurality of patent combination samples to generate a density threshold, generate a complementarity indicator range of a technology-related patent combination according to a complementarity indicator of the technology-related patent combination sample, and generate a complementarity indicator range of a product-related patent combination according to a complementarity indicator of the product-related patent combination sample;