Detailed Description
As can be seen from Fig. 2, when the digital asset entity 22 itself is taken as the object of operation, querying technical-class digital assets is almost infeasible: for example, the data volume of such assets is uncertain, and the assets may be stored in multiple network databases spread across different geographic locations.
A technical system is an organic collection of technical solutions of different levels and different content. These solutions may belong to different fields or disciplines and may or may not be related to one another. For example, the technical solution of an engine system may involve mechanical, material, circuit and software solutions, which, viewed as technical solutions, may have no direct relationship with each other. Furthermore, a technical solution may be applicable in different technical systems; viewed on its own, a technical solution may not reflect the technical system at all. We therefore cannot judge the overall nature of a technical system from a specific individual technical solution; it is common knowledge that a part cannot stand in for the whole. As a result, judging the degree of similarity or competitiveness of two technical systems from the information of individual technical solutions is extremely difficult, both conceptually and operationally.
For many reasons, a technical system theoretically admits countless descriptions, and different descriptions may even be regarded as belonging to different technical systems. However, the degree of similarity or competitiveness of two technical systems can still be reflected by some information. For example, the more similar two technical systems are as a whole, the more this is reflected at higher levels of abstraction; the more likely they are to be locally similar, the more this is reflected at lower levels of abstraction. This gives us the opportunity to judge the degree of similarity or competitiveness of two technical systems from multiple generalized descriptions of each system at different levels of abstraction.
Fig. 3 is a flowchart of a first method for querying technical-class digital assets according to an embodiment of the present application.
According to Fig. 3, first, in step 31, a query request sent by a valid client is obtained. The query request includes a target digital asset technical description file, which contains one or more technical solutions; these technical solutions form a target technical solution set A. Each technical solution in the set corresponds to a target technical point and describes the technical characteristics of the digital asset data packet to be queried. The file may take any form that helps describe the digital asset data packet clearly, such as a WORD document or a PDF document, and may follow the style of a patent document. In addition, the query request may also include query conditions to narrow the query scope.
At step 32, a set of digital asset data packets to be detected may be obtained on the system platform or in the blockchain network according to the query condition. For each digital asset data packet in the set, a technical scheme subset B corresponding to all technical points of the digital asset data packet can be obtained through the technical description part or the technical document of the digital asset data packet.
At step 33, a similarity index is calculated for the target technical solution set A and each technical solution subset B. The similarity index represents the overall degree of similarity between each digital asset data packet in the digital asset data packet set to be detected and the technical solutions given in the target digital asset technical description file of the query request. Finally, in step 34, the data packets in the digital asset data packet set to be detected are reordered according to the similarity index and output, thereby realizing the query of technical-class digital assets.
The calculation of the similarity index between the target technical solution set A and the technical solution subset B in step 33 may adopt the following sub-steps. Refer to Fig. 4, which is an exemplary diagram of a first method, employed in step 33, for calculating the similarity index between the target technical solution set A and the technical solution subset B.
According to Fig. 4, each technical solution 11, 12, 13 in the target technical solution set 41 is determined first. Then each digital asset data packet 421, 422, 423 in the digital asset data packet set 42 to be detected is examined one by one, and the technical solutions in the technical solution subset corresponding to each data packet are determined from the technical description part or document in that data packet. Specifically, the technical solution subset of data packet 421 contains technical solutions 211 and 212; that of data packet 422 contains technical solutions 221, 222, 223 and 224; and that of data packet 423 contains technical solutions 231, 232 and 233. Then, the similarity between each technical solution 11, 12, 13 in the target technical solution set 41 and each technical solution in the technical solution subset of each data packet 421, 422, 423 is calculated.
Specifically, the following calculations are performed:
(I) Calculating the similarities between the technical solutions. The calculation can be performed in various orders, for example, as follows.
1. Calculating the similarities a11-211 and a11-212 between technical solution 11 and technical solutions 211 and 212 of data packet 421; the similarities a11-221, a11-222, a11-223 and a11-224 between technical solution 11 and technical solutions 221, 222, 223 and 224; and the similarities a11-231, a11-232 and a11-233 between technical solution 11 and technical solutions 231, 232 and 233. Refer to data set 43 in Fig. 4.
2. Calculating the similarities a12-211 and a12-212 between technical solution 12 and technical solutions 211 and 212 of data packet 421; the similarities a12-221, a12-222, a12-223 and a12-224 between technical solution 12 and technical solutions 221, 222, 223 and 224; and the similarities a12-231, a12-232 and a12-233 between technical solution 12 and technical solutions 231, 232 and 233. Refer to data set 44 in Fig. 4.
3. Calculating the similarities a13-211 and a13-212 between technical solution 13 and technical solutions 211 and 212 of data packet 421; the similarities a13-221, a13-222, a13-223 and a13-224 between technical solution 13 and technical solutions 221, 222, 223 and 224; and the similarities a13-231, a13-232 and a13-233 between technical solution 13 and technical solutions 231, 232 and 233. Refer to data set 45 in Fig. 4.
(II) Calculating the solution maximum similarities and the similarity indexes between the technical solution sets.
1. Calculating the solution maximum similarity of each technical solution of the target technical solution set 41 to the technical solution subset of data packet 421, and calculating the similarity index of the target technical solution set 41 to the technical solution subset of data packet 421.
(1) Calculating the solution maximum similarities A11, A12 and A13 of technical solutions 11, 12 and 13 to the technical solution subset of data packet 421, wherein:
A11=a11-211+a11-212; A12=a12-211+a12-212; A13=a13-211+a13-212;
(2) Calculating the similarity index X11 between the target technical solution set 41 and the technical solution subset of data packet 421, wherein:
X11=A11+A12+A13.
2. Calculating the solution maximum similarity of each technical solution of the target technical solution set 41 to the technical solution subset of data packet 422, and calculating the similarity index of the target technical solution set 41 to the technical solution subset of data packet 422.
(1) Calculating the solution maximum similarities B11, B12 and B13 of technical solutions 11, 12 and 13 to the technical solution subset of data packet 422, wherein:
B11=a11-221+a11-222+a11-223+a11-224; B12=a12-221+a12-222+a12-223+a12-224; B13=a13-221+a13-222+a13-223+a13-224;
(2) Calculating the similarity index X12 between the target technical solution set 41 and the technical solution subset of data packet 422, wherein:
X12=B11+B12+B13.
3. Calculating the solution maximum similarity of each technical solution of the target technical solution set 41 to the technical solution subset of data packet 423, and calculating the similarity index of the target technical solution set 41 to the technical solution subset of data packet 423.
(1) Calculating the solution maximum similarities C11, C12 and C13 of technical solutions 11, 12 and 13 to the technical solution subset of data packet 423, wherein:
C11=a11-231+a11-232+a11-233; C12=a12-231+a12-232+a12-233; C13=a13-231+a13-232+a13-233;
(2) Calculating the similarity index X13 between the target technical solution set 41 and the technical solution subset of data packet 423, wherein:
X13=C11+C12+C13.
It can be seen that X11, X12 and X13 are the basis for reordering the data packets in step 34.
The similarities between technical solutions in step (I) above may be calculated by a keyword-based method or by a semantic-based method, so as to obtain the similarity between each technical solution in the target technical solution set A and each technical solution in the technical solution subset B. The keyword-based method is described below with reference to Fig. 5, which also shows an example of using these similarities to calculate the solution maximum similarities and the similarity indexes between the technical solution sets.
First, each technical solution 11, 12, 13 in the target technical solution set 51 is determined, and target keyword sets H1, H2 and H3 are generated from all the keywords extracted from technical solutions 11, 12 and 13 respectively, together with their derivative words. The derivative words include synonyms, near-synonyms, hypernyms, hyponyms and the like of the keywords, and H1, H2 and H3 are the keyword sets that remain after repeated keywords are removed. Then each digital asset data packet 521, 522, 523 in the digital asset data packet set 52 to be detected is examined one by one, and the technical solutions in the technical solution subset corresponding to each data packet are determined from the technical description part or document in that data packet. Specifically, the technical solution subset of data packet 521 contains technical solutions 211 and 212; that of data packet 522 contains technical solutions 221, 222, 223 and 224; and that of data packet 523 contains technical solutions 231, 232 and 233.
Then the number of occurrences of the keywords of each target keyword set H1, H2, H3 in each data packet 521, 522, 523 is counted, that is, the number of times the keywords in H1, H2 and H3 appear in each technical solution of the technical solution subsets of data packets 521, 522 and 523.
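The counting step can be sketched as follows. This is a minimal illustration only: the function name `count_keyword_occurrences`, the sample keyword set and the sample text are hypothetical, and case-insensitive whole-word matching is an assumption of the sketch rather than something the description prescribes.

```python
import re

def count_keyword_occurrences(keywords, solution_text):
    """Total number of times any keyword in the set occurs in the text.

    Matching is case-insensitive and on whole words; both choices are
    assumptions of this sketch, not requirements stated in the description.
    """
    text = solution_text.lower()
    total = 0
    # Repeated keywords are removed first, as is done for H1, H2 and H3.
    for keyword in {k.lower() for k in keywords}:
        pattern = r"\b" + re.escape(keyword) + r"\b"
        total += len(re.findall(pattern, text))
    return total

# Hypothetical keyword set and solution text, for illustration only.
H1 = {"engine", "motor", "piston"}
solution_211 = "The engine drives a piston; a second piston balances the engine."
similarity_value = count_keyword_occurrences(H1, solution_211)
```

In this sketch the resulting count plays the role of the similarity value between one target technical solution and one technical solution of a data packet.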
As shown in Fig. 5, the keywords in set H1 appear 10 times in technical solution 211 and 15 times in technical solution 212 of data packet 521; 20, 15, 30 and 5 times in technical solutions 221, 222, 223 and 224 of data packet 522; and 0, 5 and 2 times in technical solutions 231, 232 and 233 of data packet 523. Each count is taken as the corresponding similarity value.
The keywords in set H2 appear 5 times in technical solution 211 and 15 times in technical solution 212 of data packet 521; 5, 10, 20 and 10 times in technical solutions 221, 222, 223 and 224 of data packet 522; and 5, 5 and 5 times in technical solutions 231, 232 and 233 of data packet 523. Each count is again taken as the corresponding similarity value.
The keywords in set H3 appear 10 times in technical solution 211 and 20 times in technical solution 212 of data packet 521; 25, 15, 5 and 5 times in technical solutions 221, 222, 223 and 224 of data packet 522; and 10, 5 and 5 times in technical solutions 231, 232 and 233 of data packet 523. Each count is again taken as the corresponding similarity value.
Adding the counts 10 and 15 of the keywords of the target keyword set H1 in technical solutions 211 and 212 of data packet 521 gives the maximum similarity between technical solution 11 in the target technical solution set 51 and the digital asset data packet 521 in the digital asset data packet set 52 to be detected. In this example, the maximum similarity between technical solution 11 and the digital asset data packet 521 is 25, that is, the value of A11 in Fig. 5 is 25.
Similarly, the maximum similarity between technical solution 12 in the target technical solution set 51 and the digital asset data packet 521 can be obtained. In this example it is 20, that is, the value of A12 in Fig. 5 is 20. The maximum similarity A13 between technical solution 13 in the target technical solution set 51 and the digital asset data packet 521 is 30.
Further, the similarity index of the target technical solution set 51 with the digital asset data packet 521 is X11 = A11 + A12 + A13 = 25 + 20 + 30 = 75. The similarity index of the target technical solution set 51 with the digital asset data packet 522 is X12 = B11 + B12 + B13 = 70 + 45 + 50 = 165. The similarity index of the target technical solution set 51 with the digital asset data packet 523 is X13 = C11 + C12 + C13 = 7 + 15 + 20 = 42.
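The arithmetic of the Fig. 5 example can be reproduced with a short sketch. The counts are those given above; the data layout, the function name `similarity_index` and the choice of descending order for the reordering are assumptions made for illustration.

```python
# Keyword occurrence counts from the Fig. 5 example.
# counts[packet][keyword_set] lists one count per technical solution in the packet.
counts = {
    521: {"H1": [10, 15], "H2": [5, 15], "H3": [10, 20]},
    522: {"H1": [20, 15, 30, 5], "H2": [5, 10, 20, 10], "H3": [25, 15, 5, 5]},
    523: {"H1": [0, 5, 2], "H2": [5, 5, 5], "H3": [10, 5, 5]},
}

def similarity_index(per_set_counts):
    """Return (index, per-solution totals) for one data packet.

    Each per-solution total corresponds to A11/A12/A13 (or B.., C..) in the
    text; the index is their sum (X11, X12 or X13).
    """
    per_solution = {name: sum(values) for name, values in per_set_counts.items()}
    return sum(per_solution.values()), per_solution

indices = {packet: similarity_index(c)[0] for packet, c in counts.items()}

# Reordering for output; descending order is an assumption of this sketch.
ranking = sorted(indices, key=indices.get, reverse=True)
```

Under these assumptions, data packet 522 is output first, followed by 521 and then 523.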
In other embodiments of the present application, the semantic similarity between technical solutions is calculated using a semantic-based method. Assume that the semantic similarity function is LAN(X1, X2), where X1 is the description document of the first technical file and X2 is the description document of the second technical file; the semantic similarity between technical solution 11 and technical solution 211 is then LAN(technical solution 11, technical solution 211). The similarity index between digital asset data packets can evidently be obtained from the semantic similarities in the same way, and details are not repeated here.
Fig. 6 is a flowchart of a second method, employed by the process of Fig. 3, for calculating the similarity index between the target technical solution set A and the technical solution subset B.
The flow illustrated in Fig. 6 shows a general scheme. Its principle is as follows: in order to describe a technical system as a whole, the key technical solutions of the system are expressed by generalized descriptions at four levels of abstraction. More or fewer levels may be used, but not fewer than two; too many levels may reduce the efficiency of the method while improving the judgment accuracy only to a limited degree. The degree of similarity or competitiveness of two technical systems can then be judged quickly by counting and comparing the expressions of the key technical solutions of the two systems at each level. Refer to Fig. 6.
In step 61, a technical classification rule with four progressive levels is determined or selected. The rule may be designed in advance. If it is used to query technical systems in a specific field, such as the chemical field or the semiconductor field, a targeted technical classification rule benefits the accuracy of retrieval and judgment. In most cases, however, one of the commonly used technical classification rules can be selected with little difference in effect; the most commonly used are the international patent classification rule and the European or U.S. patent classification rules. The progressive feature refers to the four levels of abstraction, which the international patent classification rule and similar rules obviously have. If the rule is self-designed, reference can be made to the following table, which gives the meaning of a technical classification rule with four levels of abstraction, where the smaller the level value, the higher the level of abstraction:
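Because the description does not specify LAN(X1, X2), the sketch below only shows how any pairwise similarity function could be plugged into the same summation. The function `jaccard_similarity` is a hypothetical stand-in based on word overlap; it is not a semantic method and is used here purely so that the example runs. All names and sample texts are assumptions.

```python
def jaccard_similarity(x1, x2):
    """Hypothetical stand-in for LAN(X1, X2): word-set overlap in [0, 1].

    This is a lexical measure, NOT a semantic one; a real embodiment would
    substitute its own semantic similarity function here.
    """
    t1, t2 = set(x1.lower().split()), set(x2.lower().split())
    if not t1 and not t2:
        return 0.0
    return len(t1 & t2) / len(t1 | t2)

def similarity_index(target_set, subset, sim):
    """Sum sim(a, b) over every target solution a and every subset solution b."""
    return sum(sim(a, b) for a in target_set for b in subset)

# Hypothetical technical solution descriptions, for illustration only.
target = ["rotor blade cooling channel", "blade coating process"]
packet = ["cooling channel for rotor blade", "fuel pump"]
index = similarity_index(target, packet, jaccard_similarity)
```

The design point is that the aggregation into a similarity index does not depend on how the pairwise similarity is obtained, so a keyword-based or a semantic function can be passed as `sim`.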
Table 1 Technical classification rule design table
Hierarchy level | One | Two | Three | Four
Name | Technical direction | Technical field | Professional direction | Professional field
Expression | A-G | A-Z | A-Z + digits 0-9 | A-Z + digits 0-9
Description | 1 character | 2 characters | 3 characters | 4 characters
For example, in the code BAFA01A105 of a technical point, B represents the technical direction information of the technical point, AF represents the technical field information, A01 represents the professional direction information, and A105 represents the professional field information.
Since the design of technical classification rules and the definition of their content belong to publicly known techniques, they are not described in detail here.
Step 62: technical points are selected from each of the two technical systems. The selection follows the principles of comprehensiveness, generalization and emphasis. Comprehensiveness means that the selected technical points should cover or take into account every branch of the technical system structure, avoiding omissions as far as possible. Generalization means that the selected technical points and their descriptions should span multiple levels, so that the set of technical points can embody the overall characteristics of the system. Emphasis means that key technical solutions or distinctive innovative technical solutions in the system are selected as far as possible, maximizing the identifiability of the system. Then, for the technical point set A extracted from the first technical system and the technical point set B extracted from the second technical system, each technical point is technically classified using the technical classification rule, giving the corresponding classification number sets A and B. The technical point information in a technical point set is the technical description file of the technical point, which includes information such as text or pictures and may, for example, follow the style of a patent application document; a classification number set contains the technical classification code corresponding to each technical point file.
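Splitting such a code into its four level codes can be sketched as follows, assuming the fixed widths of 1, 2, 3 and 4 characters listed in the table. The function name and the dictionary keys are hypothetical, and the example code from the text is written here in upper case.

```python
def split_classification_code(code):
    """Split a 10-character code into its four level codes.

    The widths 1, 2, 3 and 4 follow the rule design table; treating them as
    fixed character counts is an assumption of this sketch.
    """
    widths = (1, 2, 3, 4)
    if len(code) != sum(widths):
        raise ValueError(f"expected {sum(widths)} characters, got {len(code)}")
    parts, start = [], 0
    for width in widths:
        parts.append(code[start:start + width])
        start += width
    names = ("technical_direction", "technical_field",
             "professional_direction", "professional_field")
    return dict(zip(names, parts))

levels = split_classification_code("BAFA01A105")
```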
In the following steps, the classification number set A, B will be the object of operation.
Step 63: according to the number of classification numbers in the classification number set A, 80% of the numbers are selected as operation objects in any manner, such as randomly or sequentially (generally 100% when the number is small; the selection ratio is described in detail later), giving a new classification number set A. Similarly, according to the number of classification numbers in the classification number set B, 100% of the numbers are selected as operation objects, giving a new classification number set B.
For each number in the new classification number set A, the code of each level indicated by the number is obtained and repeated items are removed, giving the level code sets X11, X12, X13 and X14 of all the numbers and the corresponding code counts Y11, Y12, Y13 and Y14. Likewise, for each number in the new classification number set B, the code of each level indicated by the number is obtained and repeated items are removed, giving the level code sets X21, X22, X23 and X24 of all the numbers and the corresponding code counts Y21, Y22, Y23 and Y24. "Removing repeated items" works as follows. Assume that the first-level codes of all the numbers in the new classification number set A, i.e. the code set X11 representing the technical direction, are:
X11 = {B, A, C, B, D, E, F, D, B}, where B repeats 2 times and D repeats 1 time. After the repeated items are removed, X11 = {B, A, C, D, E, F}, and the corresponding code count Y11 is 6.
Step 64: from the code sets X11, X12, X13 and X14 and X21, X22, X23 and X24, the number E1 of codes that X11 and X21 have in common is calculated, and likewise the number E2 for X12 and X22, the number E3 for X13 and X23, and the number E4 for X14 and X24.
For example, assuming that X11 = {B, A, C, D, E, F} and X21 = {B, A, G}, the number E1 of codes that X11 and X21 have in common is 2.
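The removal of repeated items and the overlap count of step 64 can be sketched with the first-level example values given above. The helper name `unique_codes` is hypothetical.

```python
def unique_codes(codes):
    """Remove repeated codes while keeping first-occurrence order."""
    return list(dict.fromkeys(codes))

# First-level codes of the new classification number set A, before deduplication.
first_level_a = ["B", "A", "C", "B", "D", "E", "F", "D", "B"]

X11 = unique_codes(first_level_a)   # level-1 code set of set A
Y11 = len(X11)                      # number of distinct level-1 codes

X21 = ["B", "A", "G"]               # level-1 code set of set B (already unique)
E1 = len(set(X11) & set(X21))       # number of codes the two sets share
```

The same two operations would be repeated for each of the remaining levels to obtain Y12 to Y14, Y21 to Y24 and E2 to E4.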
Step 65: the relative overlap ratios $A_i$ and $B_i$ of each level are calculated for the classification number sets A and B, where
for the classification number set A, $A_i = (E_i / Y_{1i}) \times 100\%$; for the classification number set B, $B_i = (E_i / Y_{2i}) \times 100\%$.
Steps 66 and 67: the technical correlation index $F_A$ of the classification number set A and the technical correlation index $F_B$ of the classification number set B are calculated from the relative overlap ratios $A_i$ and $B_i$, where $F_A = \sum_i C_i A_i$ and $F_B = \sum_i C_i B_i$; in these formulas, $C_i$ is an empirical constant.
From the correlation indexes $F_A$ and $F_B$, the similarity probabilities $G_A$ and $G_B$ of the classification number sets A and B are calculated, where $G_A = F_A / \sum_i C_i$ and $G_B = F_B / \sum_i C_i$.
$G_A$ is used as the similarity index of the target technical solution set A with respect to the technical solution subset B, and $G_B$ as the similarity index of the technical solution subset B with respect to the target technical solution set A.
In the above formulas, $i = 1, \dots, n$, where $n$ is the number of coding levels of the technical classification rule; in this example, $n = 4$.
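Steps 65 to 67 can be sketched as a single function. All numeric inputs below (overlap counts, code counts and the constants Ci) are hypothetical values chosen only to make the arithmetic easy to follow; the function name is likewise an assumption.

```python
def similarity_probabilities(E, Y1, Y2, C):
    """Return (G_A, G_B) in percent for per-level inputs of equal length.

    E  : number of shared codes per level (E1..En)
    Y1 : number of distinct codes per level in set A (Y11..Y1n), all > 0
    Y2 : number of distinct codes per level in set B (Y21..Y2n), all > 0
    C  : empirical constants per level (C1..Cn)
    """
    A = [100.0 * e / y for e, y in zip(E, Y1)]   # relative overlap ratios of set A
    B = [100.0 * e / y for e, y in zip(E, Y2)]   # relative overlap ratios of set B
    F_A = sum(c * a for c, a in zip(C, A))       # technical correlation index of A
    F_B = sum(c * b for c, b in zip(C, B))       # technical correlation index of B
    total = sum(C)
    return F_A / total, F_B / total

# Hypothetical inputs for n = 4 levels.
E = [2, 2, 1, 1]
Y1 = [4, 5, 5, 10]
Y2 = [2, 4, 2, 5]
C = [1, 2, 3, 4]
G_A, G_B = similarity_probabilities(E, Y1, Y2, C)
```

With these inputs the overlap ratios of set A are 50, 40, 20 and 10 percent, giving a correlation index of 230 and a similarity probability of 23 percent; for set B the corresponding values are 430 and 43 percent. The two results differ because each set is compared against its own number of distinct codes.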
In the method described in Fig. 6, the correlation between two technical systems is characterized by a correlation index. The correlation index formula has the form:
$F = C_1 A_1 + C_2 A_2 + C_3 A_3 + C_4 A_4$.
In the formula, $F$ is the correlation index; $A_1$, $A_2$, $A_3$ and $A_4$ are the overlap ratios of the first-, second-, third- and fourth-level codes of the technical classification codes; and $C_1$, $C_2$, $C_3$ and $C_4$ are the correlation coefficients between the first-, second-, third- and fourth-level codes of the technical classification codes and the overall nature of the system. The correlation coefficients are empirical values obtained by methods such as machine learning or statistics, and indicate the degree of influence of the codes of each level on the overall nature of the technical system.
The degree of similarity or the degree of conflict between the two technical systems is characterized by a similarity probability or a conflict probability. The similarity probability or conflict probability formula has the form:
$T = \dfrac{F}{C_1 + C_2 + C_3 + C_4} \times 100\%$.
Fig. 7 is a flowchart of a second method for querying technical-class digital assets according to an embodiment of the present application.
According to Fig. 7, first, in step 71, a query request sent by a valid client is obtained. The query request includes a target digital asset technical description file, which contains one or more technical solutions corresponding to all the technical points of the target digital asset; these technical solutions form a target technical solution set A.
At step 72, a set of digital asset data packets to be detected may be obtained on the system platform or in the blockchain network according to the query condition. For each digital asset data packet in the set, a technical scheme subset B corresponding to all technical points of the digital asset data packet can be obtained through the technical description part or the technical document of the digital asset data packet.
In step 73, the patent classification number set A corresponding to all the technical solutions in the target technical solution set A and the patent classification number set B corresponding to all the technical solutions in each technical solution subset B are determined. Since a technical solution may have multiple patent classification numbers, set A and set B should follow one inclusion standard: either only the main classification number of each technical solution is included in the set, or all the classification numbers of each technical solution are included. The former improves computational efficiency, while the latter can improve accuracy when the digital processor has sufficient computing resources.
At step 74, a similarity index between the target technical solution set A and each technical solution subset B is calculated according to the patent classification number set A and each patent classification number set B. The similarity index represents the overall degree of similarity between each digital asset data packet in the digital asset data packet set to be detected and the technical solutions given in the target digital asset technical description file of the query request. Finally, in step 75, the data packets in the digital asset data packet set to be detected are reordered according to the similarity index and output, thereby realizing the query of technical-class digital assets.
The method for determining the similarity index of two technical systems adopted in the embodiment of Fig. 7 makes use of patent classification rules. For example, from the international patent classification numbers recorded in the patent application information of the two technical systems, the overlap of the technical fields indicated by those numbers can be obtained, and the degree of similarity of the two technical systems can thus be determined as a whole. In other embodiments, any technical classification rule may be used to obtain the technical classifications of the key or main technical points of the two technical systems; patent classification is only one form of technical classification. The method provided by the present application can be used as long as the key or main technical points of the two technical systems are classified according to the same technical classification rule. For example, for two technical systems with patent applications filed in the United States or in Europe, the U.S. or European patent classification numbers may be used to determine the degree of conflict between the two technical systems according to the method provided in the present application. The following describes specific implementations of other embodiments of the present application, taking the International Patent Classification (IPC) as the technical classification rule for the key technical points of a technical system.
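The two inclusion standards can be sketched as follows. The function name, the convention that the first number listed for a technical solution is its main classification number, and the sample IPC numbers are all assumptions made for illustration.

```python
def build_classification_set(solutions, main_only=True):
    """Build a patent classification number set from per-solution number lists.

    solutions : one list of classification numbers per technical solution;
                the first entry is assumed to be the main classification number.
    main_only : True  -> include only the main number of each solution
                False -> include every number of each solution
    """
    numbers = set()
    for classification_numbers in solutions:
        if not classification_numbers:
            continue  # a solution without any classification number is skipped
        if main_only:
            numbers.add(classification_numbers[0])
        else:
            numbers.update(classification_numbers)
    return numbers

# Illustrative values only.
solutions = [["H04L 9/32", "G06F 21/64"], ["G06Q 40/04"]]
main_set = build_classification_set(solutions, main_only=True)
full_set = build_classification_set(solutions, main_only=False)
```

Whichever standard is chosen, the same value of `main_only` would be applied to both set A and every set B so that the sets remain comparable.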
The International Patent Classification, i.e. IPC, classifies by a combination of function and application, with function as the primary principle and application as the secondary principle. Using a hierarchical form, it divides technical content step by step into five levels: section, class, subclass, main group and subgroup, forming a complete classification system. Thus, a complete IPC classification number is a combination of the symbols representing the section, class, subclass, main group and subgroup.
In one embodiment, all five parts of the information are used to determine the degree of similarity or conflict between two technical systems, or between two sets of technical systems. In another embodiment, four of the five parts, i.e. the class, subclass, main group and subgroup information, are used. Similarly, three of the five parts, i.e. the subclass, main group and subgroup information, may be used. Alternatively, two of the five parts, i.e. the main group and subgroup information, are used. Alternatively, only one of the five parts, i.e. the subgroup information, is used.
Obviously, of these five parts of information, the section has the broadest conceptual scope, and it is used so that no relevant information is omitted; the subgroup has the narrowest conceptual scope, and it is used to make the information more precise. Many embodiments that make use of patent classification information are therefore possible, for example using only the section, subclass, main group and subgroup information to determine the degree of similarity or conflict between two technical systems, or between two sets of technical systems, and so on. The fourth embodiment, which determines the degree of similarity or conflict of two technical systems by using three of the five parts, i.e. the subclass, main group and subgroup information, is further described below; the method in this embodiment may be implemented in software.
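Splitting an IPC classification number into its five parts can be sketched with a regular expression. The pattern below is a simplified assumption about the layout of IPC symbols (for example, it does not handle every spacing or padding variant), and the function name and dictionary keys are hypothetical.

```python
import re

# Simplified layout: section letter, two-digit class, subclass letter,
# main group digits, "/", subgroup digits. An assumption of this sketch.
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")

def parse_ipc(symbol):
    """Split an IPC number such as 'H04L 9/32' into its five parts."""
    match = IPC_PATTERN.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised IPC number: {symbol!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    return {
        "section": section,
        "class": cls,
        "subclass": subclass,
        "main_group": main_group,
        "subgroup": subgroup,
    }

parts = parse_ipc("H04L 9/32")
```

An embodiment that uses only some of the five parts would simply read the corresponding keys and ignore the others.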
Specifically, the calculation of the similarity index between the target technical solution set A and the technical solution subset B in step 74 of the flowchart illustrated in Fig. 7 may adopt the following sub-steps. Refer to Fig. 8, which is a flowchart of a first method employed by step 74 for calculating the similarity index between the target technical solution set A and the technical solution subset B.
The process illustrated in Fig. 8 treats each patent application of the two technical systems or technical solution sets as a technical point and uses the international patent classification as the technical classification rule. Specifically, the technical correlation or similarity between the two technical systems is analysed according to the subclass, main group and subgroup classification codes of the IPC numbers of the patent applications in set A and set B.
First, in step 81, the IPC numbers in all the patent application information of the patent classification number set A and the patent classification number set B are obtained to form two IPC number sets, which correspond to sets A and B respectively.
At step 82, the subclass codes, main group codes and subgroup codes indicated by all the IPC numbers of the first number set are obtained, and duplicates within each group of codes are removed. This yields a subclass code set B3 (the first column of Table 2, i.e. the IPC subclasses of set A) containing b3 = 19 subclass codes (the last row of that column); a main group code set B2 (the first column of Table 3, i.e. the IPC main groups of set A) containing b2 = 19 main group codes (the last row of that column); and a subgroup code set B1 (the first column of Table 4, i.e. the IPC subgroups of set A) containing b1 = 13 subgroup codes (the last row of that column).
Then, the subclass codes, main group codes and subgroup codes indicated by all the IPC numbers of the second number set are obtained, and duplicates within each group of codes are removed. This yields a subclass code set D3 (the second column of Table 2, i.e. the IPC subclasses of set B) containing d3 = 10 subclass codes (the last row of that column); a main group code set D2 (the second column of Table 3, i.e. the IPC main groups of set B) containing d2 = 10 main group codes (the last row of that column); and a subgroup code set D1 (the second column of Table 4, i.e. the IPC subgroups of set B) containing d1 = 5 subgroup codes (the last row of that column).
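The code extraction and de-duplication of step 82 can be sketched as follows. This is only an illustration: it assumes IPC numbers are written like "E21D 15/44" (section letter, two-digit class, subclass letter, main group number, slash, subgroup number), and the names `split_ipc` and `code_sets` are hypothetical helpers, not part of the described method.

```python
import re

# Assumed layout of an IPC symbol such as "E21D 15/44":
# section letter, two-digit class, subclass letter, main group number, "/", subgroup number.
_IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def split_ipc(symbol):
    """Return (subclass, main_group, subgroup) codes for one IPC symbol."""
    match = _IPC_PATTERN.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized IPC symbol: {symbol!r}")
    section, cls, subclass_letter, main, sub = match.groups()
    subclass = f"{section}{cls}{subclass_letter}"
    # The main group code is written with "/00", as in Table 3.
    main_group = f"{subclass}{int(main)}/00"
    subgroup = f"{subclass}{int(main)}/{sub}"
    return subclass, main_group, subgroup


def code_sets(ipc_symbols):
    """Collect de-duplicated subclass, main group and subgroup code sets."""
    subclasses, main_groups, subgroups = set(), set(), set()
    for symbol in ipc_symbols:
        subclass, main_group, subgroup = split_ipc(symbol)
        subclasses.add(subclass)
        main_groups.add(main_group)
        subgroups.add(subgroup)
    return subclasses, main_groups, subgroups


if __name__ == "__main__":
    # Illustrative IPC numbers only; they are not taken from the tables.
    sample = ["E21D 15/44", "E21D15/50", "E21D 15/44", "B65G 21/10"]
    s3, s2, s1 = code_sets(sample)
    print(sorted(s3), sorted(s2), sorted(s1))
```

Using sets makes the removal of repeated codes implicit, and the size of each returned set corresponds to the counts b3, b2, b1 (or d3, d2, d1) used in the later steps.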
Table 2: IPC subclass information comparison table of set A and set B

| IPC subclass of set A | IPC subclass of set B | Overlapping IPC subclasses |
|---|---|---|
| A41D | E21C | B65G |
| A62D | E21D | C02F |
| B01D | B23P | E21C |
| B01F | B25B | E21D |
| B03B | E02F | E21F |
| B61G | B65G | |
| B61K | E21F | |
| B61L | G06Q | |
| B65G | C02F | |
| B66D | E01H | |
| C01B | | |
| C01F | | |
| C02F | | |
| C09K | | |
| C25C | | |
| E01B | | |
| E21C | | |
| E21D | | |
| E21F | | |
| Total: 19 items | Total: 10 items | Overlap: 5 items |
Table 3: IPC main group comparison table of set A and set B

| IPC main group of set A | IPC main group of set B | Overlapping IPC main groups |
|---|---|---|
| A41D13/00 | E21C35/00 | E21D15/00 |
| A61F17/00 | E21C41/00 | |
| A61J9/00 | E21D15/00 | |
| A62D1/00 | B23P19/00 | |
| B61K7/00 | B25B27/00 | |
| B61L11/00 | E02F9/00 | |
| B61L23/00 | E21C33/00 | |
| B65G11/00 | E21D20/00 | |
| B65G21/00 | E21D23/00 | |
| B65G65/00 | E21F13/00 | |
| B66B15/00 | | |
| B66C1/00 | | |
| B66D1/00 | | |
| C01B33/00 | | |
| C02F1/00 | | |
| C09K3/00 | | |
| C25C3/00 | | |
| E21D15/00 | | |
| E21D19/00 | | |
| Total: 19 items | Total: 10 items | Overlap: 1 item |
Table 4: IPC subgroup comparison table of set A and set B
It should be noted that, in step 82, 100% of the patent classification numbers of set A and set B are selected as analysis objects; in other embodiments, only a portion of them may be selected. Selecting only a portion introduces some error into the result but does not affect the overall judgment, and it makes the method more practical: a judgment can still be made for any technical system even when its patent classification numbers contain errors. In addition, setting a selection range allows a better balance between effectiveness and efficiency and makes the method flexible in use.
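Selecting only a portion of the classification numbers might look like the sketch below. The text does not say how the portion is chosen, so uniform random sampling is purely an assumption here, and `select_portion` with its `fraction` and `seed` parameters is a hypothetical helper.

```python
import random


def select_portion(ipc_symbols, fraction, seed=None):
    """Return a random subset holding roughly `fraction` of the symbols (at least one)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    symbols = list(ipc_symbols)
    if not symbols:
        return []
    # Assumption: uniform sampling without replacement; other selection rules are possible.
    count = max(1, round(len(symbols) * fraction))
    return random.Random(seed).sample(symbols, count)
```

With `fraction=1.0` every classification number is kept, which corresponds to the 100% selection used in the worked example.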
In step 83, based on the subclass code sets B3 and D3, the main group code sets B2 and D2, and the subgroup code sets B1 and D1 of the two technical systems obtained in step 82, the number of overlapping subclass codes of the two technical systems is calculated as E3 = 5 (the last row of the third column of Table 2, i.e. the overlapping IPC subclasses column), the number of overlapping main group codes as E2 = 1 (the last row of the third column of Table 3, i.e. the overlapping IPC main groups column), and the number of overlapping subgroup codes as E1 = 0 (the last row of the third column of Table 4, i.e. the overlapping IPC subgroups column).
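The overlap counts of step 83 amount to set intersections. The sketch below reproduces them for the IPC subclass codes listed in Table 2 and the main group codes listed in Table 3; the function name `overlap` and the variable names are illustrative choices, and the subgroup level is left out because its codes are not listed here.

```python
# IPC subclass codes of set A and set B, as listed in Table 2.
subclasses_a = {
    "A41D", "A62D", "B01D", "B01F", "B03B", "B61G", "B61K", "B61L", "B65G", "B66D",
    "C01B", "C01F", "C02F", "C09K", "C25C", "E01B", "E21C", "E21D", "E21F",
}
subclasses_b = {
    "E21C", "E21D", "B23P", "B25B", "E02F", "B65G", "E21F", "G06Q", "C02F", "E01H",
}

# IPC main group codes of set A and set B, as listed in Table 3.
main_groups_a = {
    "A41D13/00", "A61F17/00", "A61J9/00", "A62D1/00", "B61K7/00", "B61L11/00",
    "B61L23/00", "B65G11/00", "B65G21/00", "B65G65/00", "B66B15/00", "B66C1/00",
    "B66D1/00", "C01B33/00", "C02F1/00", "C09K3/00", "C25C3/00", "E21D15/00",
    "E21D19/00",
}
main_groups_b = {
    "E21C35/00", "E21C41/00", "E21D15/00", "B23P19/00", "B25B27/00", "E02F9/00",
    "E21C33/00", "E21D20/00", "E21D23/00", "E21F13/00",
}


def overlap(codes_a, codes_b):
    """Return the codes that occur in both code sets."""
    return set(codes_a) & set(codes_b)


if __name__ == "__main__":
    print(len(overlap(subclasses_a, subclasses_b)))    # number of shared subclass codes
    print(len(overlap(main_groups_a, main_groups_b)))  # number of shared main group codes
```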
In step 84, the subclass code overlap ratio, main group code overlap ratio and subgroup code overlap ratio of each technical system are calculated from the numbers of subclass codes $b3 = 19$ and $d3 = 10$, the numbers of main group codes $b2 = 19$ and $d2 = 10$, the numbers of subgroup codes $b1 = 13$ and $d1 = 5$, and the overlap counts of the two technical systems, $E3 = 5$ for subclass codes, $E2 = 1$ for main group codes and $E1 = 0$ for subgroup codes.
For the first technical system, $A3 = E3/b3 = 5/19 \approx 26\%$, $A2 = E2/b2 = 1/19 \approx 5\%$, and $A1 = E1/b1 = 0/13 = 0\%$.
For the second technical system, $B3 = E3/d3 = 5/10 = 50\%$, $B2 = E2/d2 = 1/10 = 10\%$, and $B1 = E1/d1 = 0/5 = 0\%$.
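The overlap ratios of step 84 are simple quotients of the overlap count and each system's own code count. A minimal sketch, using the counts given in the text; `overlap_ratio` and the dictionary layout are illustrative names:

```python
def overlap_ratio(overlap_count, code_count):
    """Share of one system's distinct codes that also occur in the other system."""
    if code_count <= 0:
        raise ValueError("code_count must be positive")
    return overlap_count / code_count


# Counts from the text: E = overlap count, b = set A code count, d = set B code count.
counts = {
    "subclass": {"E": 5, "b": 19, "d": 10},
    "main_group": {"E": 1, "b": 19, "d": 10},
    "subgroup": {"E": 0, "b": 13, "d": 5},
}

ratios_a = {level: overlap_ratio(c["E"], c["b"]) for level, c in counts.items()}
ratios_b = {level: overlap_ratio(c["E"], c["d"]) for level, c in counts.items()}

if __name__ == "__main__":
    for level in counts:
        print(f"{level}: A = {ratios_a[level]:.1%}, B = {ratios_b[level]:.1%}")
```

Because the same overlap count is divided by different totals, the ratio is asymmetric: the smaller system shows the larger overlap ratio.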
In step 85, the patent technology correlation index F of each technical system relative to the other is calculated from the overlap ratios, where $F_A = C3 \cdot A3 + C2 \cdot A2 + C1 \cdot A1$ and $F_B = C3 \cdot B3 + C2 \cdot B2 + C1 \cdot B1$. C3, C2 and C1 are empirical constants; in this example they are the coefficients that relate the IPC subclass, main group and subgroup overlaps, respectively, to the conflict between the two systems, and their empirical values are 1, 2 and 3, respectively.
For the first technical system, $F_A = C3 \cdot A3 + C2 \cdot A2 + C1 \cdot A1 = 1 \times 26\% + 2 \times 5\% + 3 \times 0 = 36\%$.
For the second technical system, $F_B = C3 \cdot B3 + C2 \cdot B2 + C1 \cdot B1 = 1 \times 50\% + 2 \times 10\% + 3 \times 0 = 70\%$.
In step 86, the patent conflict probability G of each technical system relative to the other is calculated from the correlation index F by dividing F by the sum of the coefficients:
$G_A = F_A/(C3 + C2 + C1) = 36\%/(1 + 2 + 3) = 6\%$, which is taken as the similarity of the first technical system to the second technical system.
$G_B = F_B/(C3 + C2 + C1) = 70\%/(1 + 2 + 3) \approx 11.7\%$, which is taken as the similarity of the second technical system to the first technical system.
Here, $G_A$ is the similarity index of the target technical solution set A relative to the technical solution subset B, and $G_B$ is the similarity index of the technical solution subset B relative to the target technical solution set A.
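Steps 85 and 86 together form a weighted sum of the three overlap ratios, normalized by the sum of the weights. The sketch below follows the formulas for F and G given above, with the empirical values 1, 2 and 3 for C3, C2 and C1; the function names are illustrative, and the ratios are kept as exact fractions rather than rounded percentages, so the last digit can differ slightly from a hand calculation with rounded intermediate values.

```python
def correlation_index(ratios, weights):
    """F = C3*r3 + C2*r2 + C1*r1 for ratios ordered (subclass, main group, subgroup)."""
    return sum(w * r for w, r in zip(weights, ratios))


def conflict_probability(ratios, weights):
    """G = F / (C3 + C2 + C1), i.e. the weighted mean of the overlap ratios."""
    return correlation_index(ratios, weights) / sum(weights)


# Empirical constants C3, C2, C1 from the text.
WEIGHTS = (1, 2, 3)

# Overlap ratios (subclass, main group, subgroup) from step 84.
ratios_a = (5 / 19, 1 / 19, 0 / 13)
ratios_b = (5 / 10, 1 / 10, 0 / 5)

if __name__ == "__main__":
    for name, ratios in (("A", ratios_a), ("B", ratios_b)):
        f = correlation_index(ratios, WEIGHTS)
        g = conflict_probability(ratios, WEIGHTS)
        print(f"F_{name} = {f:.1%}, G_{name} = {g:.1%}")
```

Dividing by the sum of the weights keeps G between 0 and 1, so values for different pairs of technical systems remain comparable as long as the same constants are used.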