WO2022208998A1 - Method for creating bird's-eye view using intellectual property information

Info

Publication number: WO2022208998A1
Application number: PCT/JP2021/043529
Authority: WO
Inventors: 厚至國行; 昌俊深町; 駿平鈴木
Original assignee: 本田技研工業株式会社
Priority date: 2021-04-02
Filing date: 2021-11-29
Publication date: 2022-10-06
Also published as: JP2022158621A; JP7317067B2

Abstract

[Problem] To create a technology bird's-eye view that makes it possible to discover and analyze relationships among companies, technologies, and businesses that cannot be found with conventional approaches. [Solution] Based on a data set in which patent classification information is associated with right holders, the present invention associates right holders with one another according to their technological features. Treating each right holder as a node and each association between right holders as an edge, it clusters the nodes into groups with a relatively high degree of association and creates a technology bird's-eye view in graph form.

Description

Method for creating a bird's-eye view using intellectual property information

The present invention relates to a method for creating a bird's-eye view that uses information on intellectual property such as patents to give an overview of the distribution and mutual relationships of companies (e.g., applicants and right holders of intellectual property) and technologies (e.g., IPC, FI). In particular, the present invention relates to a method of creating a bird's-eye view suited to gaining new insights into the distribution and relationships of companies and technologies.

In recent years, the IP landscape has attracted attention as a method for formulating management strategies. According to the 2016 Patent Office Industrial Property Rights System Issues Survey Research Report, the IP landscape is defined as something that presents a bird's-eye view of the current situation and future prospects of a company's position in its market, drawing on competitors, market research and development, management strategy trends, and technical information such as individual patents.

The IP landscape is a method that advanced companies have used for a long time. However, the recent development of IoT, big data, AI, robotics, and similar technologies has lowered the barriers between technologies in different fields and industries, so the industries adjacent to a company have increased in number and become more complex. At the same time, data processing has become faster and more sophisticated, making it possible to adopt unprecedented analysis methods.

Under these circumstances, various IP landscape methods and tools for IP landscape have been developed.

Patent Document 1: Japanese Patent No. 6794584
Patent Document 2: Japanese Patent Application Laid-Open No. 2019-152939
Patent Document 3: Japanese Patent No. 6370434

Patent Document 1 discloses a method for drawing a patent strategy chart. The chart reflects product information related to the embodiments, the claims and embodiments themselves, and the inclusion relationships and/or other relationships of the embodiments to the claims, and thereby graphically represents the nature of the patent strategy.

Patent Document 2 discloses a patent mapping display device that displays analysis target data (applicants, patent classifications, keywords, etc.) for which applications have increased relative to the development scale, without omission regardless of the development scale, in order to facilitate visual discovery.

Patent Document 3 does not mention patent information, but it discloses a bird's-eye view system that uses a co-occurrence matrix of corporate groups and business attributes to determine the similarity between companies by cosine similarity or the like, and thereby presents the distribution of companies and their mutual relationships.

Patent Documents 1 and 2 do not disclose a bird's-eye view that gives an overview of the distribution and mutual relationships of companies (e.g., intellectual property applicants and right holders) and technologies (e.g., IPC, FI). Further, Patent Document 3 does not disclose any bird's-eye view of companies and technologies that uses intellectual property information such as patents.

In addition to the above documents, there are many documents that disclose methods for creating bird's-eye views of companies and technologies from patent information, but these are so-called patent maps, which simply organize patent information. Such patent maps are convenient for grasping the current filing and holding status of patents for each company and technology, but they are insufficient as an analysis method for formulating future management strategies.

In order to solve the above problems, the present invention provides, among other things, a method for creating a technology bird's-eye view, comprising: a data set recording step of recording, in a data set holding unit, a data set in which patent classification information assigned to a specific patent is associated with the right holder of that specific patent; a technology distribution information acquisition step of acquiring technology distribution information indicating the distribution of technologies owned by each right holder, based on the patent classification information associated with that right holder in the data set; an association step of associating right holders with one another based on either a value indicating the degree of commonality or a value indicating the degree of similarity of the technology distribution information of the respective right holders; a clustering step of treating each right holder as a node and each association between right holders as an edge, and dividing the nodes into groups estimated to have a relatively high degree of association from the state of connection of the nodes by the edges; and a display step of displaying the nodes, edges, and clustering results in graph form. Here, "specific patent" includes any one or more of a pending patent application, a patent application that lapsed after filing and before registration, a patent in force, a patent that lapsed after registration, a pending utility model application, a utility model application that lapsed after filing and before registration, a utility model registration in force, and a utility model registration that lapsed after registration, and is not limited to rights in Japan; the same applies hereinafter. "Right holder" includes any one or more of the applicant, the current right holder, and past right holders; the same applies hereinafter.

Specific features of the bird's-eye view created by the present invention are as follows. First, because this bird's-eye view is drawn using patent classification information and the like, which is publicly systematized and maintained, it can show an accurate and objective distribution of companies and the relationships between them (although an original classification may be used if preferred). In addition, this bird's-eye view covers a large number of companies and associates them based on the technology distribution, or the characteristics of the technology distribution, of each company, so the technological relationships among many companies can be grasped comprehensively and at a glance. Furthermore, the company distribution is drawn in graph form with each company as a node and each relationship between companies as an edge, and clusters are created from the state of connection of those edges. Therefore, not only directly related companies but also indirectly related companies (neighbors of neighbors, etc.) can be appropriately grouped to systematically grasp the overall picture of the industrial world. Consequently, this bird's-eye view makes it highly likely that previously unrecognized distributions and relationships between companies will be discovered, and the present invention can be said to be extremely useful for formulating future management strategies.

Flow chart of Embodiment 1
Flow chart of Embodiment 2
Flow chart of Embodiment 3
Flow chart of Embodiment 4
Flow chart of Embodiment 5
Hardware diagram
Formula 1 according to the present invention
Formula 2 according to the present invention
Formula 3 according to the present invention
Formula 4 according to the present invention
Formula 5 according to the present invention
Formula 6 according to the present invention
Formula 7 according to the present invention
Formula 8 according to the present invention
Formula 9 according to the present invention
Formula 10 according to the present invention
Formula 11 according to the present invention
Formula 12 according to the present invention
Formula 13 according to the present invention
Examples of datasets according to the present invention
Example 1 of technology distribution according to the present invention
Example 2 of technology distribution according to the present invention
Example 1 of association according to the present invention
Example 2 of association according to the present invention
Example 1 of technology bird's-eye view according to the present invention
Example 2 of technology bird's-eye view according to the present invention
Example 3 of technology bird's-eye view according to the present invention

Hereinafter, each embodiment of the present invention will be described with reference to the drawings. The correspondence between the embodiments and the claims is as follows. Embodiment 1 is the most basic embodiment and relates to all claims, but corresponds in particular to claim 1. Embodiment 2 mainly corresponds to claim 2. Embodiment 3 mainly corresponds to claims 3 and 4. Embodiment 4 mainly corresponds to claim 5. Embodiment 5 mainly corresponds to claim 6. Embodiment 6 mainly corresponds to claim 7. Embodiment 7 mainly corresponds to claim 8. Embodiment 8 mainly corresponds to claim 9. However, the present invention is by no means limited to these embodiments and can be embodied in various forms without departing from the scope of the invention.
<Embodiment 1>

In Embodiment 1, the most basic embodiment of the present invention is described. An outline of the steps of this embodiment is shown in the flow chart of Embodiment 1. Each step is described below.

The "data set recording step" records, in the data set holding unit, a data set that associates the patent classification information assigned to a specific patent with the right holder of that specific patent.

"Specific patent" means any one or more of a pending patent application, a patent application that lapsed after filing and before registration, a patent in force, a patent that lapsed after registration, a pending utility model application, a utility model application that lapsed after filing and before registration, a utility model registration in force, and a utility model registration that lapsed after registration. Therefore, for example, pending patent applications and patents in force, which are currently live rights, may be targeted, or patents in force and patents that lapsed after registration, which are rights that have been registered, may be targeted. It is also possible to include applications that lapsed after filing and before registration as an indication of trends in technological development. The same applies to utility model registrations. Other combinations may of course be adopted freely depending on the purpose.

"Patent classification information" typically refers to patent classification information such as IPC, FI, F-term, USPC (United States Patent Classification), ECLA (European Classification), and CPC (Cooperative Patent Classification), but is not limited to these. It may be information based on an original classification used within a company or an original classification created by an information vendor. A unique classification may also be created, for example, from statistical information on the words used in claims. In other words, patent classification here refers to any information for classifying patents.

The patent classification information above is also not limited to classification codes used as they are. For example, codes truncated to the first digit of the IPC (section) or the first three digits of the IPC (class) may be used. The truncation criteria are not limited to the digit counts that define levels such as section, class, and subclass. Instead of digit counts, codes at the same level of the hierarchy defined in the IPC may be used. Alternatively, the first three digits of the IPC may be used for one technical field and the first four digits for another technical field.

Naturally, the patent classification information need not retain the appearance of the original code. For example, for data compression, IPC codes may be converted into keys such as numbers. The same applies when information derived from the original patent classification is retained with its characteristics preserved, for example when IPC codes are used as dimensions and the dimensionality is reduced. Those skilled in the art may implement these points as appropriate.

"Right holder" includes any one or more of the applicant, the current right holder, and past right holders of the above specific patent. Therefore, for example, the applicant may be targeted for pending patent applications, and the current right holder may be targeted for patents in force. If necessary, past right holders may also be targeted.

In addition, the right holders mentioned above are not limited to the names listed in gazettes and patent registers. For example, the right holder may be the entity to which names appearing in the gazette or register have been consolidated through name normalization. The name of a subsidiary may be replaced with the name of its parent company or with the name of the corporate group to which it belongs. If rights are jointly owned, the patent may be counted as a patent of each joint owner, or all joint owners may be treated as a single right holder. In other words, the right holder here broadly refers to the formal or substantial right holder of the patent. Note that the right holder information may of course be a right holder code or a company code (EDI code, etc.) of a name-normalized company, instead of the name of the right holder.

A "data set that associates patent classification information assigned to a specific patent with the right holder of the specific patent" is, for example, a CSV file in which column X holds the right holder information and column Y holds the patent classification information. Of course, in addition to the CSV format, the TSV format, the Excel format, a relational database format, or the XML format may also be used. Any format is acceptable as long as the patent classification information is associated with the right holder of the specific patent. For example, FIG. 20 shows part of the contents of an Excel file (the patent numbers and right holder names have been replaced with dummy values).

In addition, FIG. 21 shows part of the contents of an Excel-format file that aggregates patents for each right holder and each IPC (right holder names are converted to dummies). In this step, as shown in FIG. 21, it is possible to read a file that is compiled in advance for each right holder and each IPC. Of course, the patent classification information data and the right holder information data may be held separately, and a data file for linking may be held to read a plurality of files in which the two are associated. As long as patent classification information and right holders are substantially associated, the actual data format is merely a matter of design by those skilled in the art.
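As a minimal sketch of such a data set, assuming a two-column CSV layout with the hypothetical column names `right_holder` and `classification` (the names and sample rows are illustrative and are not taken from the figures), the records could be read into memory as follows:

```python
import csv
import io

# Hypothetical sample data: one row per (right holder, classification) pair.
SAMPLE_CSV = """right_holder,classification
Company A,H01M 10/02
Company A,A01B 1/14
Company B,H01M 10/02
"""


def load_dataset(text):
    """Return a list of (right_holder, classification) tuples from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (row["right_holder"].strip(), row["classification"].strip())
        for row in reader
    ]


dataset = load_dataset(SAMPLE_CSV)
```

The same function could be pointed at a file by passing its contents; any other format that keeps the association between right holder and classification would serve equally well.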

A "data set holding unit" is a device or recording medium that records a data set. The data set holding unit is typically an electric or magnetic recording medium such as memory or a hard disk, but is not limited to this. Note that recording the data set here need not be long-term recording; it may be temporary recording for calculation in later steps.

The "technology distribution information acquisition step" acquires technology distribution information indicating the distribution of technologies owned by the right holder based on the patent classification information associated with the right holder in the data set.

"Technology distribution information" indicates the distribution of technologies owned by a right holder and is statistical information on the patent classifications assigned to each right holder's patents. It may be simple aggregated information, or aggregated information that has been processed by emphasis, standardization, or the like. As a simple example, suppose right holder Company A owns Patent 1 and Patent 2, Patent 1 is assigned the IPC codes "H01M 10/02" and "A01B 1/14", and Patent 2 is assigned the IPC codes "H01M 10/02" and "A01D 29/00". The IPC codes associated with Company A and their aggregate counts are then 2 for "H01M 10/02", 1 for "A01B 1/14", and 1 for "A01D 29/00". This may be used as the technology distribution information. Note that this is a simplified example for explanation; actual technology distribution information looks like the example shown in the drawings.

In addition, the technology distribution information may have noise removed by limiting the aggregated information on the patent classifications assigned to each right holder's patents to patent classifications whose appearance ratio or number of applications is at or above a certain level. Numerical values may also be standardized using the appearance ratio, the average number of appearances, or the standard deviation. Furthermore, classifications with a high appearance ratio or number of appearances may be emphasized, for example by multiplying the appearance ratio or the number of appearances by N. Those skilled in the art may apply such standardization or weighting of statistical features as they see fit. For example, in FIG. 22, the technology distribution information is narrowed down to only the patent classification information that appears 400 times or more.
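The aggregation and noise removal described above can be sketched as follows, using the Company A example. This is an illustration under stated assumptions rather than the method of the invention: the function names and the `min_count` parameter are hypothetical, and the standardization shown is a simple conversion to appearance ratios.

```python
from collections import Counter

# Company A's patents and their IPC codes, as in the simple example above.
company_a_patents = {
    "Patent 1": ["H01M 10/02", "A01B 1/14"],
    "Patent 2": ["H01M 10/02", "A01D 29/00"],
}


def technology_distribution(patents):
    """Count how often each classification code appears across the patents."""
    counts = Counter()
    for codes in patents.values():
        counts.update(codes)
    return counts


def filter_and_normalize(counts, min_count):
    """Drop codes appearing fewer than min_count times, then return ratios."""
    kept = {code: n for code, n in counts.items() if n >= min_count}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {code: n / total for code, n in kept.items()}


distribution = technology_distribution(company_a_patents)
ratios = filter_and_normalize(distribution, min_count=2)
```

Raising `min_count` plays the role of the appearance threshold mentioned for FIG. 22; other standardizations (for example, z-scores) could be substituted without changing the later steps.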

The "associating step" associates the right holders based on either the value indicating the degree of commonality or the value indicating the degree of similarity of the technology distribution information between the respective right holders.

The "value indicating the degree of commonality of technology distribution information" is, for example, in the case of the degree of commonality between Company A and Company B, the number or ratio of items of patent classification information common to the technology distributions of Company A and Company B. Specifically, as the simplest method, if two IPC codes are common to the technology distributions of Company A and Company B, the value indicating the degree of commonality between A and B is set to 2. It is of course possible to use technology distribution information that has been emphasized or standardized. The value indicating the degree of commonality is used as an edge weight in a later step; to perform calculations without weights, the value may be set to 1 if the degree of commonality is at or above a certain level and to 0 if it is below that level.
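A minimal sketch of the simplest commonality measure described above follows. The function names, the `threshold` parameter, and the sample distributions are hypothetical choices for illustration.

```python
def commonality(dist_a, dist_b):
    """Number of classification codes that appear in both distributions."""
    return len(set(dist_a) & set(dist_b))


def unweighted_edge(dist_a, dist_b, threshold):
    """Return 1 if the commonality reaches the threshold, otherwise 0."""
    return 1 if commonality(dist_a, dist_b) >= threshold else 0


# Hypothetical technology distributions (classification -> count).
company_a = {"H01M": 5, "B60L": 2, "A01B": 1}
company_b = {"H01M": 3, "B60L": 7, "G06F": 4}

shared = commonality(company_a, company_b)  # two codes in common
```

Here the two companies share two classifications, so the weighted edge value is 2, while `unweighted_edge` collapses it to 0 or 1 depending on the chosen threshold.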

Also, the "value indicating the similarity of technology distribution information" is, for example, a cosine similarity, a similarity by set operation, a value based on the deviation of a probability distribution, and the like. The details of how these similarities are calculated will be described in another embodiment.
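For orientation, two of the similarity measures named above can be sketched as follows. This is a generic illustration, not the calculation specified in the later embodiment; the Jaccard index is used here as one example of a "similarity by set operation", which is an assumption.

```python
import math


def cosine_similarity(dist_a, dist_b):
    """Cosine similarity between two classification -> value mappings."""
    shared = set(dist_a) & set(dist_b)
    dot = sum(dist_a[c] * dist_b[c] for c in shared)
    norm_a = math.sqrt(sum(v * v for v in dist_a.values()))
    norm_b = math.sqrt(sum(v * v for v in dist_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(dist_a, dist_b):
    """Size of the intersection of the code sets divided by their union."""
    a, b = set(dist_a), set(dist_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

Cosine similarity takes the counts into account, whereas the Jaccard index only looks at which classifications are present.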

The results of associating right holders as described above can be represented, for example, by a square matrix. FIG. 23 is an example of this. In FIG. 23, for example, the element at the intersection of Company A (row) and Company B (column) is 13, and this 13 is the degree of commonality between Company A and Company B.
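A square association matrix of this kind could be assembled as in the following sketch. The company names and distributions are hypothetical, the commonality measure is the simple count of shared classifications, and setting the diagonal to 0 (no self-association) is an assumption of this sketch.

```python
def association_matrix(distributions):
    """Return (names, matrix) where matrix[i][j] is the number of
    classification codes shared by right holders i and j."""
    names = sorted(distributions)
    size = len(names)
    matrix = [[0] * size for _ in range(size)]
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if i != j:  # diagonal left at 0 by assumption
                shared = set(distributions[a]) & set(distributions[b])
                matrix[i][j] = len(shared)
    return names, matrix


# Hypothetical technology distributions (classification -> count).
distributions = {
    "Company A": {"H01M": 5, "B60L": 2},
    "Company B": {"H01M": 3, "G06F": 4},
    "Company C": {"G06F": 1},
}
names, matrix = association_matrix(distributions)
```

Because commonality is symmetric, the resulting matrix is symmetric and can be read directly as a weighted adjacency matrix in the clustering step.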

In the "clustering step", each right holder is treated as a node, and the association between each right holder is treated as an edge, and each node is divided into groups that are estimated to have a relatively high degree of association based on the state of connection of each node by the edge.

A "node" is a node (vertex) in graph theory. An "edge" is likewise an edge in graph theory. Therefore, the "state of connection of each node by the edges" is the way in which the nodes are connected to one another. When edge weights are taken into account, it refers to the state of connection including those weights. However, these terms are not bound by the strict definitions of graph theory; any drawing of points and connecting lines that represents a network structure falls within the scope of the present invention.

One method of estimating that nodes have a relatively high degree of association from the state of connection of the nodes by the edges is, for example, to use the modularity index Q. For a given partition of the network, the modularity index Q is defined as the fraction of links that connect nodes within the same group, minus the expected value of that fraction if the links were placed at random. Modularity Q is an index well known to those skilled in the art; its formula is given for reference among the formulas in the drawings.
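For reference, a commonly used form of the unweighted modularity index (Newman's formulation) is sketched below. The symbols are assumptions chosen for illustration and may differ from the notation of the original drawings: $A_{ij}$ is 1 if nodes $i$ and $j$ are connected and 0 otherwise, $k_i$ is the degree of node $i$, $m$ is the total number of edges, $c_i$ is the group to which node $i$ belongs, and $\delta$ is the Kronecker delta.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta(c_i, c_j)
```

The first term counts the links that actually fall inside groups, and the second term is the number expected if links were placed at random while preserving node degrees.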

Although the above modularity formula does not take edge weights into account, a formula that takes edge weights into account may be used as in the following formula.
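A standard weighted form, given here as a hedged sketch with assumed notation, replaces the adjacency entries with edge weights: $w_{ij}$ is the weight of the edge between nodes $i$ and $j$ (0 if there is no edge), $s_i = \sum_j w_{ij}$ is the total weight attached to node $i$, $W = \tfrac{1}{2}\sum_{i,j} w_{ij}$ is the total edge weight, and $c_i$ and $\delta$ are the group of node $i$ and the Kronecker delta.

```latex
Q_w = \frac{1}{2W} \sum_{i,j} \left( w_{ij} - \frac{s_i s_j}{2W} \right) \delta(c_i, c_j)
```

When every weight is 1, $s_i$ reduces to the degree of node $i$ and $W$ to the number of edges, so this expression reduces to the unweighted modularity.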

Even when using a formula that does not consider edge weights, clustering that reflects weight differences to some extent may be performed by calculating assuming that there are no edges below an arbitrary weight (with zero weight).

The higher the modularity index Q, the denser the links between nodes within each group (that is, the higher the degree of relevance among the nodes). Therefore, in this embodiment, group division (clustering) is performed so that Q becomes high. Note that "dividing into groups" broadly refers to any process that produces a grouped state; it does not exclude, for example, clustering by agglomerative algorithms such as the greedy method described below.

Algorithms for clustering using the modularity index Q are also well known to those skilled in the art, and some examples are outlined here. For example, there is a greedy method that starts from a state in which every node is its own cluster and repeatedly merges one cluster with another. In this method, the modularity index Q is used as the criterion for selecting the clusters to merge, and the pair of clusters that yields the larger Q after merging is merged at each step. Conversely, there is also a method in which all nodes are initially treated as one cluster and the cluster is repeatedly divided using the modularity index Q as the criterion. Since maximizing the modularity index Q is an NP-hard problem, algorithms for finding approximate solutions are proposed from time to time by those skilled in the art. Any such algorithm may be applied to the present invention.
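The greedy merging approach outlined above can be sketched in plain Python as follows. This is a naive illustration for small graphs, not an optimized or authoritative implementation: the function names are hypothetical, modularity is recomputed from scratch for each candidate merge, and merging stops as soon as no merge increases Q.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build a symmetric adjacency map from (node, node, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def modularity(adj, communities):
    """Modularity Q of a partition of an undirected weighted graph."""
    two_m = sum(w for nbrs in adj.values() for w in nbrs.values())
    if two_m == 0:
        return 0.0
    q = 0.0
    for comm in communities:
        # Internal weight (each internal edge is counted in both directions).
        internal = sum(adj[u].get(v, 0.0) for u in comm for v in comm)
        # Total weighted degree of the nodes in this community.
        degree = sum(sum(adj[u].values()) for u in comm)
        q += internal / two_m - (degree / two_m) ** 2
    return q


def greedy_modularity(adj):
    """Start with singleton clusters and merge the pair that raises Q most."""
    communities = [frozenset([n]) for n in adj]
    best_q = modularity(adj, communities)
    while len(communities) > 1:
        best_merge = None
        for a, b in combinations(communities, 2):
            merged = [c for c in communities if c not in (a, b)] + [a | b]
            q = modularity(adj, merged)
            if q > best_q + 1e-12:
                best_q, best_merge = q, merged
        if best_merge is None:  # no merge improves Q any further
            break
        communities = best_merge
    return communities, best_q


# Toy graph: two triangles joined by a single edge (C-D).
edges = [
    ("A", "B", 1), ("A", "C", 1), ("B", "C", 1),
    ("D", "E", 1), ("D", "F", 1), ("E", "F", 1),
    ("C", "D", 1),
]
communities, q = greedy_modularity(build_adjacency(edges))
```

On this toy graph the sketch separates the two triangles into two groups. For real data sets, a library implementation would normally be preferred over this cubic-time loop.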

Note that "estimated" in "estimated to have a relatively high degree of association" means that the solution obtained by the calculation may be an approximate solution, as described above. In other words, the calculation need not continue until the exact solution is obtained and may be terminated partway. This does not exclude cases where the obtained solution is exact, nor does it rule out clustering processes that seek an exact solution. The estimate may be fixed, for example, when the change in the quantity being converged between the n-th and (n+1)-th iterations falls within a predetermined range. For example, this change may be about 1% to 20%.

The feature of the above clustering is that clusters are created not from similarity information between nodes but from the state of connection of the edges after edges have been drawn between nodes. Therefore, the present invention can adopt any method that estimates the degree of relevance from the state of connection of the edges of the nodes rather than from similarity information between nodes. Owing to this feature, the present invention has the advantage that not only directly related adjacent companies but also indirectly related companies (neighbors of neighbors, etc.) are appropriately grouped, so that the overall picture of the industrial world can be grasped systematically.

It should be noted that an index other than the modularity index Q may be used to evaluate the state of connection of the edges between nodes. For example, there is the spin glass method. In this method, a score is computed by combining four factors: (1) a positive contribution if nodes belonging to the same community are connected to each other; (2) a negative contribution if nodes belonging to the same community are not connected to each other; (3) a negative contribution if nodes belonging to different communities are connected to each other; and (4) a positive contribution if nodes belonging to different communities are not connected to each other.

There is also a random walk method. This method assumes a walker that moves between nodes by following randomly chosen edges and, for example, regards a range of nodes in which the walker tends to stay for a long time as one group.

A clustering method that allows one node to belong to multiple groups can also be used. This includes, for example, a method of regarding a clique (a group of nodes in which all nodes are interconnected) as one group.

As a simpler clustering method, there is also a method in which all nodes are initially treated as one cluster and edges are removed in descending order of betweenness centrality to form groups. Betweenness centrality indicates how many of the shortest paths between pairs of nodes pass through a given node (or edge). The formula provided for it is for the betweenness centrality of a node, but the betweenness centrality of an edge can be calculated by replacing "paths passing through node i" with "paths passing through edge i".
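A standard definition of node betweenness centrality, given here as a hedged sketch whose notation is assumed and may differ from the original drawings, is as follows: $g_{st}(i)$ is the number of shortest paths between nodes $s$ and $t$ that pass through node $i$, and $N_{st}$ is the total number of shortest paths between $s$ and $t$.

```latex
b_i = \sum_{\substack{s \neq t \\ s,\, t \neq i}} \frac{g_{st}(i)}{N_{st}}
```

For edge betweenness, $g_{st}(i)$ is replaced by the number of shortest paths between $s$ and $t$ that pass through the edge in question.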

There are many clustering methods, and they can be implemented relatively easily by using libraries such as igraph.

The "display step" displays the nodes, edges, and clustering results in graph form. An example is shown in the drawings. Although FIG. 25 is monochrome, the nodes and edges are actually colored, and each color represents a separate cluster.

Layout algorithms such as Fruchterman-Reingold and ForceAtlas2 are preferably used for drawing the nodes and edges. Note that the present invention does not impose any restrictions on the node and edge layout algorithm, so those skilled in the art may use any algorithm for rendering. For reference, the formulas of the above two algorithms are as follows. The Fruchterman-Reingold formula is as follows.
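The forces in the Fruchterman-Reingold algorithm, as commonly published, are sketched below. The notation is assumed and may differ from the original drawings: $d$ is the distance between two nodes, $k$ is the ideal distance between nodes, $n$ is the number of nodes, $\mathrm{area}$ is the area of the drawing frame, and $C$ is a constant.

```latex
f_a(d) = \frac{d^2}{k}, \qquad
f_r(d) = -\frac{k^2}{d}, \qquad
k = C \sqrt{\frac{\mathrm{area}}{n}}
```

Here $f_a$ is the attractive force and $f_r$ is the repulsive force, with the negative sign indicating that it pushes nodes apart.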

The formula for ForceAtlas2 is as follows. Note that k in the formula is a coefficient set by the user.
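The basic forces of ForceAtlas2, as commonly published, are sketched below with assumed notation that may differ from the original drawings: $d(n_1, n_2)$ is the distance between nodes $n_1$ and $n_2$, $\deg(n)$ is the degree of node $n$, and $k_r$ is the user-set coefficient, taken here to correspond to the coefficient k mentioned above.

```latex
F_a(n_1, n_2) = d(n_1, n_2), \qquad
F_r(n_1, n_2) = k_r \, \frac{(\deg(n_1) + 1)(\deg(n_2) + 1)}{d(n_1, n_2)}
```

The degree-dependent repulsion pushes highly connected nodes further apart, which tends to keep poorly connected nodes close to the hubs they attach to.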

The actual drawing method is also outlined here. The above two algorithms compute an attractive force and a repulsive force. The attractive force acts between nodes that are connected by edges. The repulsive force, on the other hand, acts on nodes that are not connected by edges. The attractive force brings the positions of nodes closer together, and the repulsive force moves them farther apart. The positions are therefore adjusted so that, through these two actions, the positional relationship between the nodes comes as close as possible to the most appropriate arrangement. Specifically, the adjustment of the node positions is repeated an appropriate number of times, and the arrangement with the best evaluation value is adopted. Techniques such as simulated annealing are often used for this. The detailed algorithms are well known to those skilled in the art and are not essential to the present invention, so further description is omitted here.

Algorithms such as ForceAtlas and OpenOrd can also be used for graph visualization (layout). Those skilled in the art may use these algorithms as appropriate.

According to this embodiment having the above configuration, it is possible to create a technology bird's-eye view showing the accurate and objective distribution and relationship between companies. Moreover, according to this bird's-eye view, it is possible to comprehend the technical relationships between many companies comprehensively and from a bird's-eye view. In addition, not only directly related companies but also indirectly related companies (neighbors of neighbors, etc.) can be appropriately grouped to systematically grasp the overall picture of the industrial world. Therefore, according to this bird's-eye view, there is a high possibility that unprecedented distributions and relationships between companies can be discovered, and the present invention can be said to be extremely useful for formulating future management strategies.
<Embodiment 2>

Embodiment 2 is basically the same as Embodiment 1, but is characterized in that each node is one item of the patent classification information assigned to a patent (the main patent classification information), and a technology bird's-eye view is created that gives an overview of the relationships among patent classifications based on the technologies used together with them. An outline of the steps of this embodiment is shown in the flow chart of Embodiment 2. Differences from Embodiment 1 are described below.

In this embodiment, the "technical data set recording step" takes one item of the patent classification information assigned to a specific patent as the main patent classification information, and records in the data set holding unit a data set in which the main patent classification information is associated with the patent classification information of that patent.

"Main patent classification information" is typically the leading IPC, but it is not limited to this. For example, when the first three digits (class) of the IPC are used as the patent classification information, the class that appears most frequently among the IPC codes assigned to the patent may be adopted as the main patent classification information. Specifically, if five IPC codes are assigned to one patent, three of them belonging to class A01 and the remaining two to class B01, then A01 is adopted.

In the technological bird's-eye view created in this way, the nodes are the main patent classifications. Also, the edge is the commonality (or similarity) of the technologies used together between the principal patent classifications. That is, in the first embodiment, the right holder was the node, but in the present embodiment, the main patent classification is the node. FIG. 26 is an example of this.
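The choice of the main patent classification by majority class, as in the A01/B01 example of this embodiment, can be sketched as follows. The function name and the sample IPC codes are hypothetical, and ties are resolved in favor of the class that appears first, which is an assumption of this sketch.

```python
from collections import Counter


def main_class(ipc_codes):
    """Return the most frequent IPC class (first three characters)."""
    if not ipc_codes:
        raise ValueError("at least one IPC code is required")
    classes = [code.strip()[:3] for code in ipc_codes]
    # most_common keeps first-seen order among equal counts.
    return Counter(classes).most_common(1)[0][0]


# Five hypothetical IPC codes: three in class A01 and two in class B01.
codes = ["A01B 1/14", "A01D 29/00", "A01G 9/02", "B01D 53/00", "B01J 19/00"]
main = main_class(codes)
```

The remaining classifications of the same patent then serve as the "technologies used together" with the main classification when edges are drawn.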

According to this embodiment having the configuration described above, it is possible to create a technology bird's-eye view showing the distribution and relevance of the technologies used together with the main patent classification information. In addition, this bird's-eye view makes it possible to grasp the technical relationships among many main patent classifications comprehensively and at a glance. Furthermore, not only directly related main patent classifications but also indirectly related main patent classifications (neighbors of neighbors, etc.) are appropriately grouped, so the overall picture of patented technology in the industrial world can be grasped systematically. Therefore, this bird's-eye view makes it highly likely that previously unrecognized relationships between technologies will be discovered, and the present invention can be said to be extremely useful for formulating future management strategies.
<Embodiment 3>

Embodiment 3 is basically the same as Embodiment 1, but is characterized by having a "node similarity identification information associating step", which associates node similarity identification information, indicating that nodes are similar to one another, with nodes for which the value indicating the degree of commonality or the value indicating the degree of similarity (the basis for drawing edges between nodes) is relatively large, and by displaying the node similarity identification information in addition to the nodes, edges, and clustering results in a "node similarity display substep". It is also characterized in that a plurality of nodes can be aggregated based on the node similarity identification information in an "aggregation step". An outline of the steps of this embodiment is shown in FIGS. 3 and 4. Differences from Embodiment 1 are described below.

In this embodiment, in the "node similarity identification information associating step", right holders whose technologies have a high degree of commonality or similarity are grouped together. An example of a method of creating groups is described here (methods of calculating commonality and similarity are described in other embodiments).

The simplest way to create a group is to group together right holders whose technologies (either all of their technologies or the characteristic technologies that account for a high proportion of their holdings) are completely in common. Although this method is simple, its grouping condition is strict, so it may fail to form groups.

Another way to create groups is to group right holders so that the degree of commonality or the average similarity within each group is as large as possible. Specifically, right holders are grouped when a value obtained from the number of common patent classifications, from the cosine similarity or the similarity by set operation described later, or from the degree of divergence of the probability distribution between right holders is equal to or greater than a certain value. This method is convenient because the number of groups can be adjusted freely by changing that value. Alternatively, instead of designating a fixed value, the nodes may be grouped sequentially, starting from the combinations of nodes whose mutual similarity is relatively high (or is estimated to be high).
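The threshold-based grouping described above can be sketched as follows. This is a minimal illustration rather than the method of the embodiment: the similarity values, the threshold of 0.6, and the helper name `group_by_threshold` are hypothetical, and linking pairs transitively with a union-find structure is only one of several possible ways to form groups.

```python
def group_by_threshold(similarity, threshold):
    """Group nodes whose pairwise similarity is >= threshold (transitively)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), value in similarity.items():
        root_a, root_b = find(a), find(b)  # also registers both nodes
        if value >= threshold and root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical pairwise similarities between right holders.
similarity = {
    ("A", "B"): 0.9,
    ("B", "C"): 0.7,
    ("C", "D"): 0.2,
    ("A", "D"): 0.1,
}
groups = group_by_threshold(similarity, threshold=0.6)
```

Raising the threshold in this sketch yields more and smaller groups, which corresponds to adjusting the number of groups through the chosen value.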

It should be noted that the value indicating the degree of commonality or the value indicating the degree of similarity used for the above grouping does not need to adopt the same calculation method as the value for drawing edges. For example, the above grouping may be performed using a value indicating the degree of commonality, and association between rights holders for drawing edges may be performed using a value indicating the degree of similarity.

In the "node similarity display substep", the node similarity identification information is displayed on the technology bird's-eye view. For example, there is a display method that attaches the same symbol to similar nodes.

Also, in the "aggregation step", the grouped rights holders can be aggregated and displayed. A plurality of rights holders may be collectively displayed in one node, or a plurality of nodes may be surrounded by a circle-like display. By doing so, the visibility of the graph is greatly improved. This "aggregation step" may be implemented after the graph is displayed in the "display step". In this case, the function is such that the multiple nodes that have already been displayed before aggregation are put into an aggregated state and displayed again. Also, the "aggregation step" may be performed before the "display step" so that a plurality of nodes are aggregated and displayed in advance. Of course, the user may arbitrarily set and designate the timing of aggregation. Note that when nodes are aggregated, the edges and edge weights of the aggregated nodes may be totaled, and clustering processing may be executed again.
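Aggregating grouped nodes and totaling their edge weights can be sketched as follows. This is a minimal illustration under assumptions: the function name `aggregate`, the group label "G1", and the edge weights are hypothetical, and edges that fall inside a single aggregated node are simply dropped, which the text does not specify.

```python
from collections import defaultdict


def aggregate(edges, group_of):
    """Merge nodes by group label and total the weights of parallel edges."""
    merged = defaultdict(float)
    for (u, v), weight in edges.items():
        gu, gv = group_of.get(u, u), group_of.get(v, v)
        if gu == gv:
            continue  # assumption: an edge inside one aggregated node is dropped
        merged[tuple(sorted((gu, gv)))] += weight
    return dict(merged)


# Hypothetical weighted edges between right holders.
edges = {("A", "B"): 3.0, ("A", "C"): 2.0, ("B", "C"): 1.0, ("C", "D"): 4.0}
# A and B carry the same node similarity identification information.
group_of = {"A": "G1", "B": "G1"}
aggregated = aggregate(edges, group_of)
```

The aggregated edge list can then be passed to the clustering processing again, as noted above.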

With this embodiment having the above configuration, even when there are a large number of right holders (nodes), aggregating similar nodes makes the technology bird's-eye view easier to read, so that more effective analysis can be performed.
<Embodiment 4>

Embodiment 4 is basically the same as Embodiment 1, but is characterized in that, for associating right holders, it uses the patent classification information whose distribution count and/or distribution ratio in each piece of technology distribution information is equal to or greater than an arbitrary value. The outline of the steps of this embodiment is as shown in FIG. 1 (same as Embodiment 1). Differences from the first embodiment will be described below.

For the technology distribution information in the present embodiment, it is preferable, for example, to convert the aggregated counts of the patent classifications attached to each right holder's patents into ratios, and to use for association the patent classification information whose ratio is equal to or greater than an arbitrary value. It is also preferable to assign a uniform weight (for example, “1”) to each piece of patent classification information that meets the arbitrary value, total the weights, and associate the nodes accordingly. The arbitrary value should be adjustable by the user.

For example, suppose that, in the technology distributions of Company A, Company B, and Company C, "H01M 10/02" accounts for 10% for Company A and Company B, and for 1% for Company C (other technology distributions are omitted).

If the arbitrary value is set to 10%, only the two companies A and B are associated, with equal weight. This amounts to focusing only on the technologies that are important to each company, discarding unimportant ones, and associating the companies on that basis. On the other hand, if the arbitrary value is set to 1%, the three companies A, B, and C are associated with equal weight. In this case, the 1% patent classification and the 10% patent classification are both treated as "1" in the association, so a technology that is not so important to Company C (for example, a peripheral technology) is given the same emphasis in the association. By changing the degree of importance of the technologies used for association in this way, various technology bird's-eye views can be created.
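The Company A, B, and C example above can be reproduced with a short sketch. The function name `associate` and the data layout are hypothetical; the sketch assumes that each classification meeting the threshold contributes a uniform weight of 1 and that the weight between two right holders is the number of such classifications they share.

```python
from itertools import combinations


def associate(ratios, threshold):
    """Weight each pair of right holders by the number of shared classifications
    whose ratio is at or above the threshold (each one counts as a uniform 1)."""
    kept = {
        holder: {c for c, ratio in dist.items() if ratio >= threshold}
        for holder, dist in ratios.items()
    }
    weights = {}
    for a, b in combinations(sorted(kept), 2):
        shared = len(kept[a] & kept[b])
        if shared:
            weights[(a, b)] = shared
    return weights


# Ratios from the example in the text (other classifications omitted).
ratios = {
    "Company A": {"H01M 10/02": 0.10},
    "Company B": {"H01M 10/02": 0.10},
    "Company C": {"H01M 10/02": 0.01},
}
at_10_percent = associate(ratios, 0.10)  # only A and B are associated
at_1_percent = associate(ratios, 0.01)   # A, B, and C are all associated
```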
<Embodiment 5>

Embodiment 5 is basically the same as Embodiment 1, but is characterized in that the values used for associating right holders are obtained by set operations. Specifically, it is characterized by using the degree of similarity between right holders calculated by a set operation, with each right holder as a set and the patent classification information associated with that right holder as the elements of the set. The outline of the steps of this embodiment is as shown in FIG. 1 (same as Embodiment 1). Differences from the first embodiment will be described below.

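Specific methods of set operations are as follows, where A and B denote the sets of patent classification information associated with two right holders. For example, the formula for the Jaccard coefficient is:

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

The formula for the Dice coefficient is as follows.

$$\mathrm{Dice}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|}$$

The formula for the Simpson coefficient is:

$$\mathrm{Simpson}(A, B) = \frac{|A \cap B|}{\min(|A|, |B|)}$$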

According to this embodiment having the above configuration, it is possible to create a technical bird's-eye view from the viewpoint of set operations.
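The three set-based coefficients named in this embodiment can be computed as in the following sketch. The function names and the sample classification sets are illustrative assumptions, and both sets are assumed to be non-empty.

```python
def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    return len(a & b) / len(a | b)


def dice(a, b):
    """Twice the intersection size divided by the sum of the set sizes."""
    return 2 * len(a & b) / (len(a) + len(b))


def simpson(a, b):
    """Intersection size divided by the size of the smaller set."""
    return len(a & b) / min(len(a), len(b))


# Hypothetical sets of patent classifications for two right holders.
holder_x = {"H01M", "B60L", "H02J"}
holder_y = {"H01M", "B60L", "G06Q", "B60W"}

scores = {
    "jaccard": jaccard(holder_x, holder_y),
    "dice": dice(holder_x, holder_y),
    "simpson": simpson(holder_x, holder_y),
}
```

For the same pair of sets, the Simpson coefficient is never smaller than the Dice coefficient, which in turn is never smaller than the Jaccard coefficient, so the choice of coefficient affects how many edges clear a given threshold.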
<Embodiment 6>

Embodiment 6 is basically the same as Embodiment 1, but is characterized in that the values used for associating right holders are similarities using vector feature amounts. Specifically, the value is the degree of similarity between right holders calculated from vector feature amounts based on the patent classification information associated with each right holder. The outline of the steps of this embodiment is as shown in FIG. 1 (same as Embodiment 1). Differences from the first embodiment will be described below.

Cosine similarity will be described as an example of similarity using vector features. Cosine similarity is an index that expresses how close the angle between two vectors is. The vector of each node is obtained by taking each patent classification associated with the right holder as a dimension and the number of appearances of that patent classification as the magnitude along that dimension. For the vectors a and b of two right holders, the cosine similarity is calculated as follows.

$$\cos(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|\,\|\mathbf{b}\|} = \frac{\sum_i a_i b_i}{\sqrt{\sum_i a_i^2}\,\sqrt{\sum_i b_i^2}}$$

FIG. 24 is a matrix of relationships between rights holders created by cosine similarity.

According to the present embodiment having the above configuration, a technical bird's-eye view can be created from the viewpoint of the vector feature amount.
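Cosine similarity between right holders can be sketched as follows, with each patent classification treated as one dimension and its number of appearances as the value along that dimension. The function name `cosine_similarity` and the counts are hypothetical, and each vector is assumed to have at least one non-zero count.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two sparse count vectors given as dicts."""
    dot = sum(x[k] * y[k] for k in x.keys() & y.keys())
    norm_x = math.sqrt(sum(v * v for v in x.values()))
    norm_y = math.sqrt(sum(v * v for v in y.values()))
    return dot / (norm_x * norm_y)


# Hypothetical appearance counts of patent classifications per right holder.
holder_x = {"H01M": 3, "B60L": 4}
holder_y = {"H01M": 4, "B60L": 3}
holder_z = {"G06Q": 2}

similar = cosine_similarity(holder_x, holder_y)    # overlapping technologies
unrelated = cosine_similarity(holder_x, holder_z)  # no classification in common
```

Computing this value for every pair of right holders gives a relationship matrix of the kind shown in FIG. 24.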
<Embodiment 7>

Embodiment 7 is basically the same as Embodiment 1, but is characterized in that the value used for associating right holders is based on the divergence between probability distributions. Specifically, the distribution of the appearance frequency of the patent classification information associated with each right holder is treated as a probability distribution, and the value indicating the degree of similarity in the association step is obtained from the degree of divergence between the probability distributions of the right holders. The outline of the steps of this embodiment is as shown in FIG. 1 (same as Embodiment 1). Differences from the first embodiment will be described below.

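The divergence of probability distributions between right holders is, for example, the KL divergence. For the probability distributions P and Q of two right holders, where i ranges over the patent classifications, the formula for KL divergence is:

$$D_{\mathrm{KL}}(P \,\|\, Q) = \sum_i P(i) \log \frac{P(i)}{Q(i)}$$

For the probability distribution, for example, the appearance ratio of each patent classification for each right holder may be used.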

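In addition, JS divergence is an index that makes KL divergence easier to use: it is symmetric and remains finite even when a patent classification appears for only one of the two right holders. With the mixture M = (P + Q)/2 of the two probability distributions P and Q, and with D_KL denoting the KL divergence, it is given by:

$$D_{\mathrm{JS}}(P \,\|\, Q) = \frac{1}{2} D_{\mathrm{KL}}(P \,\|\, M) + \frac{1}{2} D_{\mathrm{KL}}(Q \,\|\, M)$$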

Since the values obtained by the above equations become smaller as the two are more similar, it is necessary to make adjustments such as taking reciprocals. According to this embodiment having the above configuration, a technology bird's-eye view can be created from the viewpoint of the degree of divergence of the probability distribution.
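The divergence-based values can be sketched as follows. The function names and sample distributions are hypothetical. Note one deliberate deviation, which is an assumption of this sketch and not something stated in the text: instead of a plain reciprocal, the conversion uses 1 / (1 + d) so that identical distributions (divergence 0) do not cause a division by zero.

```python
import math


def kl_divergence(p, q):
    """KL divergence D(P || Q); q must be positive wherever p is positive."""
    return sum(pi * math.log(pi / q[k]) for k, pi in p.items() if pi > 0)


def js_divergence(p, q):
    """JS divergence: average KL divergence of P and Q to their mixture M."""
    keys = p.keys() | q.keys()
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def similarity_from_divergence(d):
    """Assumed adjustment: smaller divergence maps to larger similarity."""
    return 1.0 / (1.0 + d)


# Hypothetical appearance ratios of patent classifications per right holder.
holder_p = {"H01M": 0.5, "B60L": 0.5}
holder_q = {"H01M": 0.5, "B60L": 0.5}
holder_r = {"G06Q": 1.0}

same = js_divergence(holder_p, holder_q)      # identical distributions
disjoint = js_divergence(holder_p, holder_r)  # no classification in common
```

With natural logarithms, the JS divergence in this sketch ranges from 0 for identical distributions to log 2 for distributions with no classification in common.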
<Embodiment 8>

Embodiment 8 is basically the same as Embodiment 1, but is characterized in that it further has a labeling step of adding, to the nodes, the edges, or the clustering results, patent classification contribution information for identifying the patent classification information that contributed to the association between nodes. Furthermore, among the patent classifications that contributed to the association between nodes, it is preferable to identify the patent classification information that has a large influence within each cluster created from the state of edge connections. The outline of the steps of this embodiment is as shown in FIG. Differences from the first embodiment will be described below.

The "patent classification that contributed to the association between each node" in the "labeling step" is the patent classification that influenced the value used for association in the association step, and is the patent classification that is common among the nodes. If there are multiple patent classifications that are common among nodes, a patent classification that is considered to have a greater influence on community formation may be selected. As an example of a value that can be used as the degree of influence here, there is a score (hereinafter referred to as SF-ICF) calculated by replacing words in TF-IDF with subclasses and documents with clusters.

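First, TF-IDF will be outlined. TF-IDF is an index for evaluating the importance of a word in a document, and is obtained as the product of TF (term frequency) and IDF (inverse document frequency). In a commonly used form, for a word t and a document d,

$$\mathrm{TFIDF}(t, d) = \mathrm{TF}(t, d) \times \mathrm{IDF}(t), \quad \mathrm{TF}(t, d) = \frac{n_{t,d}}{\sum_k n_{k,d}}, \quad \mathrm{IDF}(t) = \log \frac{N}{\mathrm{df}(t)}$$

where n_{t,d} is the number of times t appears in d, N is the total number of documents, and df(t) is the number of documents containing t. It can be said that TF indicates the degree of importance of a certain word in a certain document, and IDF indicates how rare that word is in general (the inverse of its general appearance frequency).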

In other words, in TF-IDF, a word is evaluated as more important to a document the more often it appears in that document and the less often it appears in general.

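As described above, SF-ICF is a score calculated by replacing words in TF-IDF with subclasses and documents with clusters. Specifically, applying that replacement to a commonly used form of TF-IDF, it can be expressed for a subclass s and a cluster c as, for example,

$$\mathrm{SFICF}(s, c) = \mathrm{SF}(s, c) \times \mathrm{ICF}(s), \quad \mathrm{SF}(s, c) = \frac{n_{s,c}}{\sum_k n_{k,c}}, \quad \mathrm{ICF}(s) = \log \frac{N}{\mathrm{cf}(s)}$$

where n_{s,c} is the number of times s appears in c, N is the total number of clusters, and cf(s) is the number of clusters in which s appears.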

That is, in SF-ICF, a subclass is evaluated as more important to a cluster the more often it appears in that cluster and the less often it appears in general.

Although SF-ICF uses subclasses, there is no problem in using patent classifications that are higher or lower than subclasses. Also, clusters can be used by replacing them with groups created in the third embodiment.
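A score of the SF-ICF kind can be sketched by taking a common TF-IDF variant and replacing words with subclasses and documents with clusters. The exact weighting used in the embodiment is not reproduced here: the function name `sf_icf`, the relative-frequency SF, the logarithmic ICF, and the sample clusters are all assumptions of this sketch.

```python
import math
from collections import Counter


def sf_icf(cluster_subclasses):
    """Score each subclass per cluster: relative frequency within the cluster
    times log(number of clusters / number of clusters containing the subclass)."""
    n_clusters = len(cluster_subclasses)
    counts = {c: Counter(s) for c, s in cluster_subclasses.items()}
    cluster_freq = Counter()
    for counter in counts.values():
        cluster_freq.update(counter.keys())
    scores = {}
    for cluster, counter in counts.items():
        total = sum(counter.values())
        scores[cluster] = {
            s: (n / total) * math.log(n_clusters / cluster_freq[s])
            for s, n in counter.items()
        }
    return scores


# Hypothetical subclasses observed in two clusters (repeats are appearances).
clusters = {
    "cluster 1": ["A23L", "A23L", "C12G", "G06Q"],
    "cluster 2": ["H01M", "H01M", "B60L", "G06Q"],
}
scores = sf_icf(clusters)
# Label each cluster with its highest-scoring subclass.
labels = {c: max(s, key=s.get) for c, s in scores.items()}
```

A subclass that appears in every cluster scores zero in this sketch, so it is never chosen as a label even if it is frequent.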

FIG. 27 shows an example of actual labeling. Note that the labels such as drinks and seasonings are IPC labels reworded so that they are easy to understand.

With this embodiment having the above configuration, it is possible to know which patent classifications give rise to the edge between nodes and which patent classifications influence the formation of clusters. As a result, for example, it is possible to know what kind of technology is held in common by right holder A and right holder B belonging to the same cluster, so the technology bird's-eye view can be analyzed more effectively. In particular, the clusters created from the state of edge connections in the clustering step can be compared with the groups described in the third embodiment, so that groupings can be compared from different viewpoints, enabling more advanced analysis.

Finally, an example of the hardware and software configuration for executing the method according to the present invention on a computer is shown (FIG. 6). The present invention can be implemented as hardware, software, or both. Specifically, the hardware includes a CPU, main memory, GPU, image memory, graphics board, secondary storage devices (hard disk, non-volatile memory, storage media such as CD and DVD, and read drives for those media), and various input/output devices.

In addition, as software, various OSs, databases such as relational databases, XML databases, and file databases, programs and drawing components implemented in C#, Python, etc., may be combined. Also, some of the steps may be performed by known software. Furthermore, any of these programs may be realized as a plurality of modularized programs, or two or more programs may be combined to be realized as one program.

FIG. 6 is a simplified diagram showing that each program and each piece of data recorded in the non-volatile memory is read into the main memory for arithmetic processing, and that the results of the arithmetic performed in the main memory can be recorded in the non-volatile memory. However, this is only an example, and a specific implementation method can be appropriately selected by those skilled in the art. The present invention is not limited to the configuration of this hardware diagram. The present invention may consist of multiple systems, and some steps may be performed manually without being programmed.

With the present invention, which has the various configurations described above, it is possible to create a technology bird's-eye view that shows the accurate and objective distribution and relationship between companies. Moreover, according to this bird's-eye view, it is possible to comprehend the technical relationships between many companies comprehensively and from a bird's-eye view. Furthermore, not only directly related adjacent companies but also indirectly related adjacent companies (adjacent to adjacent, etc.) can be appropriately grouped, and the overall picture of the industrial world can be grasped systematically. Therefore, according to this bird's-eye view, there is a high possibility that unprecedented distributions and relationships between companies can be discovered, and the present invention can be said to be extremely useful for formulating future management strategies.

Claims

1. A method for creating a technology bird's-eye view, comprising:
a data set recording step of recording, in a data set holding unit, a data set that associates the patent classification information assigned to specific patents (including any one or more of pending patents, patents lapsed after filing, registered patents, patents lapsed after registration, pending utility model registrations, utility model registrations lapsed after filing, registered utility models, and utility model registrations lapsed after registration; not limited to rights in Japan; the same shall apply hereinafter) with the right holders of the specific patents (for example, the applicant or the current right holder);
a technology distribution information acquisition step of acquiring technology distribution information indicating the distribution of technologies owned by each right holder, based on the patent classification information associated with the right holder in the data set;
an associating step of associating right holders with each other based on either a value indicating the degree of commonality or a value indicating the degree of similarity of the technology distribution information between the respective right holders;
a clustering step of, with each right holder as a node and the association between right holders as an edge, dividing the nodes into groups that are estimated to have a relatively high degree of relevance based on the state of connection of the nodes by the edges; and
a display step of displaying the nodes, edges, and clustering results in graph form.
2. A method for creating a technology bird's-eye view, comprising:
a technical data set recording step of regarding one piece of the patent classification information assigned to a specific patent as main patent classification information, and recording, in a data set holding unit, a data set that associates the main patent classification information with the patent classification information of that patent;
a patent technology distribution information acquisition step of acquiring patent technology distribution information indicating the distribution of technologies used simultaneously with the main patent classification information, based on the patent classification information associated with the main patent classification information in the data set;
an associating step of associating pieces of main patent classification information with each other based on either a value indicating the degree of commonality or a value indicating the degree of similarity of the patent technology distribution information between the pieces of main patent classification information;
a clustering step of, with each piece of main patent classification information as a node and the association between pieces of main patent classification information as an edge, dividing the nodes into groups that are estimated to have a relatively high degree of relevance based on the state of connection of the nodes by the edges; and
a display step of displaying the nodes, edges, and clustering results in graph form.
3. The technology bird's-eye view creation method according to claim 1 or claim 2, further comprising a node similarity identification information associating step of associating node similarity identification information, which indicates a node similarity relationship, with nodes for which the value indicating the degree of commonality or the value indicating the degree of similarity between nodes, on the basis of which edges are drawn between nodes, is equal to or greater than an arbitrary value and/or is a relatively large value,
wherein the display step further comprises a node similarity display substep of displaying the associated node similarity identification information in addition to the nodes, edges, and clustering results.
4. The technology bird's-eye view creation method according to claim 3, further comprising an aggregating step of aggregating nodes associated with the same or/and similar node similarity identification information.
5. The technology bird's-eye view creation method according to any one of claims 1 to 4, wherein the value indicating the degree of commonality in the association step is the degree of commonality between right holders (or main patent information) based on the patent classification information whose distribution count and/or distribution ratio, among the patent classification information associated with the right holder (or main patent information), is equal to or greater than an arbitrary value.
6. The technology bird's-eye view creation method according to any one of claims 1 to 4, wherein the value indicating the degree of similarity in the association step is the degree of similarity between right holders (or main patent information) calculated by a set operation, with the right holder (or main patent information) as a set and the patent classification information associated with the right holder (or main patent information) as the elements of the set.
7. The technology bird's-eye view creation method according to any one of claims 1 to 4, wherein the value indicating the degree of similarity in the association step is the degree of similarity between right holders (or main patent information) calculated from vector feature amounts based on the patent classification information associated with the right holder (or main patent information).
8. The technology bird's-eye view creation method according to any one of claims 1 to 4, wherein the value indicating the degree of similarity in the association step is a value obtained from the degree of divergence of probability distributions between right holders (or main patent information), where the distribution of the appearance frequency of the patent classification information associated with the right holder (or main patent information) is treated as a probability distribution.
9. The technology bird's-eye view creation method according to any one of claims 1 to 8, further comprising a labeling step of giving, to the nodes, the edges, and/or the clustering results, patent classification contribution information for identifying the patent classification information that contributed to the association between nodes.
10. A program that causes a computer to execute the technology bird's-eye view creation method according to any one of claims 1 to 9.