WO2021044428A1 - Method and system of publication landscape analysis - Google Patents

Method and system of publication landscape analysis

Info

Publication number
WO2021044428A1
WO2021044428A1 PCT/IL2020/050968 IL2020050968W WO2021044428A1 WO 2021044428 A1 WO2021044428 A1 WO 2021044428A1 IL 2020050968 W IL2020050968 W IL 2020050968W WO 2021044428 A1 WO2021044428 A1 WO 2021044428A1
Authority
WO
WIPO (PCT)
Prior art keywords
publication
landscape
metrics
metric
axis
Prior art date
Application number
PCT/IL2020/050968
Other languages
French (fr)
Inventor
Michael Elliot ADEL
Original Assignee
Adel Michael Elliot
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Adel Michael Elliot
Publication of WO2021044428A1
Priority to US17/686,715 (US20220188322A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services; Handling legal documents
    • G06Q50/184 Intellectual property management

Definitions

  • the present invention relates to a publication landscape analyzing system and method, and particularly to a system and method for characterizing and visualizing patent landscapes.
  • a publication landscape can be defined as a portfolio of publications which meet a specific search criterion.
  • the invention is particularly applicable to patent publication landscapes.
  • One of the objectives of such characterization is to provide actionable insight which can be weighed in decisions related to business activities such as investment, mergers or acquisitions.
  • An additional objective is to enable decision makers operating within a business space to objectively position themselves or their competitors in the publication landscape.
  • the objective of the current invention is to provide analysis and visualization of publication landscapes according to a novel method which provides insight into a combination of any one, two or all three of the metrics of scale, consolidation and top ranked player dominance.
  • the method of analysis of a publication landscape by a computing device includes the input of data in the form of at least one search criterion indicative of a publication landscape; querying of a publication database, retrieval of a publication list meeting said at least one criterion from said publication database; counting publications with common player names from said publication list and sorting by count to generate a discrete distribution; applying power law analysis to said discrete distribution to determine the value of an exponent S of said discrete distribution and using said exponent S as a metric of consolidation of said publication landscape.
  • a Pareto, i.e. a ranked array of publication counts per player, is fitted to a power law distribution of count versus rank.
  • the exponent of the power law can be fixed, for example, to minus unity, or may vary.
  • One approach to estimate both or either of the entitlement of the top ranked player and a metric of consolidation is to allow the exponent to float.
  • the publication count for the player of rank R is denoted $C_R$ and the top ranked player’s publication count entitlement is denoted $C_1$.
  • the system is overdetermined and we may apply, for example, a simple least squares method to determine the estimators for the exponent and top rank player publication count.
  • the exponent of the power law distribution, S, may be used as an index of consolidation.
  • consolidation refers to the concentration of the distribution of assets (i.e. publications) within the landscape. Landscapes with high values of S would be considered consolidated, while landscapes with low values would be considered fragmented. The rationale behind this assertion may be explained through consideration of the boundary cases of perfect fragmentation and perfect consolidation. In the case of a perfectly fragmented landscape, the publications would be equally distributed between the players. It is easily seen that this would produce a distribution of slope S equal to zero. In the converse case of a perfectly consolidated landscape, the whole portfolio would belong to a single dominant player and the slope S of the discrete Pareto distribution would tend to infinity.
  • An additional feature of the present invention is the calculation and use of the expectation value of the top ranked player publication count, $C_1$, as a metric of landscape scale.
  • Alternative metrics of scale may be the total count of publications for the first n ranked players, where n may vary from unity to the total number of players.
  • a further feature of the method is the use of the ratio of the actual top ranked player publication count, $C_A$, to the expectation value of the top ranked player publication count, $C_1$, as a metric of dominance of the top ranked player. More formally, it is stated that for a landscape of a given, finite number of publications, as the exponent rises, so will the expectation value of the top ranked player’s patent count entitlement, $C_1$. If we therefore wish to quantify the dominance of the position of the top ranked player, we should normalize their actual patent count by the expectation value of their patent count. Algebraically, the dominance factor (D) can be defined by the equation $D = C_A / C_1$.
  • this metric of dominance, D, of the top ranked player is considered a metric of the landscape as a whole and not just a metric of the top ranked player’s portfolio. This same metric may be applied to other ranked players.
  • FIG. 1 illustrates discrete Pareto distributions on a log-log plot for the Zipfian case of slope -1, and two extreme cases of slope 0 and -100, for publication landscapes of equal publication count (of 5187).
  • FIG. 2 is a log-log graph of assignee publication count vs rank of assignee for the lithography patent landscape, often termed a “Zipf plot”.
  • FIG. 3 is a log-linear graph of the predicted top assignee publication count ($C_1$) evolution over time for three patent landscapes.
  • FIG. 4 is a linear graph of the Dominance factor (D) evolution over time for three patent landscapes.
  • FIG. 5 is a linear graph of the Zipf plot exponent (S) evolution over time for three patent landscapes.
  • FIG. 6 is a log-linear graph of the predicted top assignee count ($C_1$) vs the Zipf plot exponent (S) for three patent landscapes. This graph is also termed a “meta-landscape”.
  • FIG. 7 is a linear-log graph of the predicted top assignee count ($C_1$) vs the Zipf plot exponent (S) for three patent landscapes. This graph is also termed a “meta-landscape”.
  • FIG. 8 is a log-linear graph of the predicted top assignee count ($C_1$) vs the Zipf plot exponent (S) for three patent landscapes whereby the balloon size is scaled according to the Dominance factor (D).
  • FIG. 9 is a group of Zipf plots for the lithography landscape at different times.
  • FIG. 10 is a diagram of a publication database analysis system.
  • FIG. 11 is a flowchart of a method of analysis of a publication database.
  • An example with actual patent publication data, illustrating graphically the definition and extraction of the above-mentioned metrics, is shown in Fig. 2 for the lithography patent landscape, in which the publication count is plotted versus the player (assignee) rank on a log-log plot, yielding a straight line to a good approximation.
  • Such a plot may be called a discrete Pareto plot, or a Zipf plot, after the American linguist George Kingsley Zipf, who first observed that, given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table. This was subsequently denoted Zipf’s law, as disclosed in G. K. Zipf, Selected Studies of the Principle of Relative Frequency in Language, Harvard University Press, 1932.
  • the metrics may be calculated by the above described methods for a sequence of search criteria, whereby the search criteria include a set of publication date windows.
  • the publication window may have a fixed lower bound defined by the earliest date of publication and an upper bound which is incremented by a preset duration, such as one year.
  • the publication window duration can be fixed and moved in time.
  • Figs. 3, 4 & 5 display the time evolution of the metrics $C_1$, D and S respectively, for three different search criteria.
  • the search criteria were set as
  • TAC: (lithography OR lithographic) AND PBD:[19000101 TO XXXX0101] (4)
  • TAC: (glyphosate) AND PBD:[19000101 TO XXXX0101] (5)
  • TAC indicates that the term in brackets appears in the title, abstract or claims of the patent publication
  • CPC/IPC indicates the patent classification class (in this case G02B 27/01 indicates head-up display)
  • XXXX indicates a year which is incremented to create the points in the respective graphs.
  • Said graphs may be linear, log-linear or log-log.
  • the present invention also includes methods of overcoming meta-landscape trajectory biases resulting from reliance on the current state of a publication database.
  • the current publication database may assign a subset of publications with original player B to player A, resulting from an acquisition of portfolio B by portfolio A. Therefore, by way of example, if the publication count is performed according to one of the examples in equations (4), (5) or (6) above, the trajectory will not accurately reflect the state of the landscape in the past, prior to the above-mentioned acquisition. It will indicate a higher level of consolidation than that which existed at the time, by virtue of the attribution of the count of the merged component B to A’s portfolio.
  • this problem is overcome by maintaining and querying earlier (time-stamped) versions of the database when compiling time-resolved trajectories, rather than relying on restricting the query of the current database by publication date.
  • a merger table is constructed as an adjunct to the current publication database which specifies merger/acquisition date when a publication migrated between players. Using this additional information the accurate historical state of the landscape can be reconstructed.
  • the visualization of the meta-landscape may include information from two or even all three of the indices of scale, dominance and consolidation.
  • a metric of scale, e.g. the estimator of the publication count of the top ranked player, $C_1$
  • an index of consolidation, e.g. the exponent, S
  • An example of such a meta-landscape is shown in Fig. 6 for the three above specified landscapes.
  • An interesting and novel feature of the so-called CS meta-landscape is the non-monotonic trajectory traversed by the different landscapes in the meta-landscape.
  • elements in the CS meta landscape are labeled with time or date labels as shown in Fig. 6.
  • the metric of scale is plotted on the x-axis and the metric of consolidation on the y-axis.
  • the slope of the graph, i.e. $dS/dC_1$, can be calculated, which may be considered an indicator of landscape dynamic, i.e. an indicator of whether the landscape is consolidating or fragmenting.
  • Said landscape dynamic metric may be used or visualized in a way similar to any of the previously specified metrics.
  • the meta-landscape incorporates three metrics, that of scale on the y-axis, consolidation on the x-axis and top player dominance, which is used to scale the balloons. Contrast or color of the marker may also be used to indicate the third metric. This may be termed a CSD meta-landscape.
  • the visualization may be in 3D space by virtue of the use of either virtual or augmented reality (VR or AR) display devices.
  • the 3 spatial dimensions of the AR or VR display may be assigned to the 3 metrics of scale, dominance and consolidation, i.e. a 3D version of the CSD meta-landscape.
  • one of the three spatial dimensions of the AR or VR display may be allocated to time, t (i.e. date), and the other two to any pair of the three metrics of scale, dominance and consolidation. These may be termed CSt, CDt or SDt landscapes, respectively.
  • multiple search criteria can be used to situate the current state (or the state at some other specified time) of multiple publication landscapes within any of the meta-landscapes specified in previous paragraphs.
  • the search criterion may also specify legal or jurisdictional restrictions such as allowed or active patent publications or USPTO only.
  • a composite metric may be calculated which is partially dependent on the publication count but may include additional data which may reflect on the player’s position in the field.
  • a composite metric is the so-called “Patent Asset Index” as described by Ernst et al. in "The Patent Asset Index - A new approach to benchmark patent portfolios," World Patent Information, 2010, in which the index includes additional quality metrics of the publication, which may indicate technology or market relevance, such as citation counts and GDP normalized geographical coverage. The index may then be used in a fashion similar to that of the publication count in the subsequent analysis.
  • some other metric may be counted and ranked such as number of citations.
  • said other metric may be a metric of value or cost which is stored in a database.
  • a further embodiment of the method comprises a dynamic rather than static visualization.
  • the said metrics or any combination of them may be implemented as fields of a database.
  • a database of search terms produced from a database of publications, with fields for the metrics of scale, consolidation and dominance or any combination of them, can be used to characterize the search terms which nominally characterize a landscape or portfolio.
  • Such a database, in conjunction with other business metrics which characterize a technology domain can be used as a training set for a machine learning algorithm, tasked, for example, with finding technology domains that are more likely to have attractive acquisition targets versus unassailable dominant players.
  • the player may be the inventor.
  • the same mathematical metrics may then be reinterpreted to characterize the innovation processes of the specific landscape or for a particular player in a specific landscape or without restriction to a specific landscape.
  • the search criterion is systematically incremented or changed in order to visualize a large ensemble of metric data associated with a meta-landscape.
  • codes which classify publications according to domain of endeavor may be used (such as IPC, UPC, GBC, F-Term, etc.) to span a broad domain.
  • a single point can be plotted on the CS diagram for each sub-category to create a “heat diagram” showing density of points within the CS diagram.
  • Such diagrams may also be displayed dynamically, as described above.
  • a user or machine 1 may instigate an action via a display interface operable to control a graphical display device, sending input to a computing device 2.
  • Said device may execute program instructions stored on a non-transitory computer readable medium and executable by the at least one computing device, which is communicatively coupled to an electronic publication database 3, a data storage unit 4 and a graphical display device 5.
  • the system’s method of operation will further be described with reference to Fig. 11.
  • Said input may be of the form of a database query 6, examples of which are shown in equations (4) - (6).
  • a publication list 7 is then extracted from publication database 3. In one embodiment of the invention, the publication list is then name harmonized 8 by methods known in the art.
  • One example method of name harmonization is to sort the publication list alphabetically according to the player name (e.g. assignee name) and to harmonize names of players with minor typographical or punctuational variations.
  • a distribution of publication counts is then generated 9 from the publication list by counting publications with common player names and the distribution is sorted in descending order by publication count.
  • the name harmonization is performed subsequent to distribution generation, after which entries are combined and the distribution is re-sorted.
  • Zipfian analysis is performed in order to calculate metrics of scale, dominance and consolidation or any combination of the three.
  • One method of performing said Zipfian analysis is according to equations (1) - (3) above.
  • landscape visualization on a graphical display device is performed 11.
  • the visualization may take the form of any of the graphs in Figs. 2 - 9. Said visualization may be either static or dynamic.
  • said metrics may be stored in a data storage unit.
  • said metrics may be retrieved from said storage unit and the above described methods of visualization may be performed at a subsequent time. It is appreciated that, while the non-transitory computer readable medium and/or the database may be locally resident, they may also be remote from the graphical display device.
  • the calculated indices may be used as metrics of quality of the search criterion.
  • discrete Pareto distributions with lower estimated exponents are more likely to result from search criteria which combine publications from players which are not necessarily in a competition with one another. This can be easily demonstrated by taking any two search criteria which result in two separate and distinct ensembles of players and combining them with an “OR” statement.
  • Such “logical fragmentation” always results in a lower exponent than either of the two landscapes analyzed independently. Therefore, when comparing two candidate search criteria for the same landscape, that with the higher exponent, if said difference is significant, should be preferred.
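The name harmonization and distribution generation steps described earlier in this list can be sketched as follows. This is a minimal illustration only, not the method of the disclosure: the function names `harmonize` and `ranked_distribution` and the sample player names are hypothetical, and the normalization shown (case folding, punctuation removal, whitespace collapsing) is a simplified stand-in for the harmonization methods known in the art.

```python
import re
from collections import Counter


def harmonize(name):
    """Crude harmonization: case-fold, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.casefold())
    return " ".join(cleaned.split())


def ranked_distribution(player_names):
    """Count publications per harmonized player name, sorted by descending count."""
    counts = Counter(harmonize(name) for name in player_names)
    return counts.most_common()


# Hypothetical player (assignee) names, one entry per publication
names = ["Acme Corp.", "ACME Corp", "acme  corp", "Beta, Inc.", "Beta Inc"]
dist = ranked_distribution(names)
```

The resulting list of (player, count) pairs, ordered by count, is the discrete distribution on which the power law analysis would then be performed.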

Abstract

A system according to the present invention may include a graphical display device, a computing device communicatively coupled to the display device and to a publication database. The computing device may be configured to calculate metrics of either scale, consolidation or top ranked player dominance of a publication landscape and to generate visualizations of said metrics and cause the display unit to concurrently or sequentially display them. A method may include the input of a search criterion or criteria, querying of a publication database, retrieval of a publication list meeting said criterion or criteria from said publication database followed by any or all of a list of actions, including sorting, name harmonization and frequency ranked distribution generation. In a subsequent step power law analysis is performed on said distribution, to determine all or a subset of metrics of scale, dominance and consolidation. Said metrics are subsequently visualized on a display on either a linear, log-linear, linear-log or log-log scale. Said visualization may be static or dynamic.

Description

APPLICATION FOR PATENT
TITLE OF INVENTION
Method and system of publication landscape analysis
CROSS REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of provisional patent application (PPA) Serial Number 62/895,513 filed 04-September-2019 by the present inventor, which is incorporated by reference in its entirety herein.
FIELD OF THE INVENTION
The present invention relates to a publication landscape analyzing system and method, and particularly to a system and method for characterizing and visualizing patent landscapes.
BACKGROUND OF THE INVENTION
Business analysts have a need to characterize publication landscapes. In this context, a publication landscape can be defined as a portfolio of publications which meet a specific search criterion. The invention is particularly applicable to patent publication landscapes. One of the objectives of such characterization is to provide actionable insight which can be weighed in decisions related to business activities such as investment, mergers or acquisitions. An additional objective is to enable decision makers operating within a business space to objectively position themselves or their competitors in the publication landscape.
SUMMARY OF THE INVENTION
The objective of the current invention is to provide analysis and visualization of publication landscapes according to a novel method which provides insight into a combination of any one, two or all three of the metrics of scale, consolidation and top ranked player dominance. In one embodiment, the method of analysis of a publication landscape by a computing device includes the input of data in the form of at least one search criterion indicative of a publication landscape; querying of a publication database, retrieval of a publication list meeting said at least one criterion from said publication database; counting publications with common player names from said publication list and sorting by count to generate a discrete distribution; applying power law analysis to said discrete distribution to determine the value of an exponent S of said discrete distribution and using said exponent S as a metric of consolidation of said publication landscape.
In one example, a Pareto, i.e. a ranked array of publication counts per player, is fitted to a power law distribution of count versus rank. The exponent of the power law can be fixed, for example, to minus unity, or may vary. One approach to estimating either or both of the entitlement of the top ranked player and a metric of consolidation is to allow the exponent to float. In this approach we denote the publication count for the player of rank R as $C_R$ and the top ranked player’s publication count entitlement as $C_1$. Following Newman (see equation 1 on p. 325, Newman, 2005), in a frequency versus rank distribution these parameters are related by the expression
$$C_R = C_1 R^{-S} \qquad (1)$$
where S is the exponent of the distribution. Applying a natural logarithm yields the following linear expression:
$$\log_e C_R = \log_e C_1 - S \log_e R \qquad (2)$$
For a given publication landscape with observed publication count distribution $C_1, \ldots, C_n$ for ranked players 1 to n, the system is overdetermined and we may apply, for example, a simple least squares method to determine the estimators for the exponent and the top ranked player publication count.
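The least squares estimation of equation (2) can be sketched as below. This is an illustrative sketch under stated assumptions rather than the implementation of the disclosure: the function name `fit_power_law` is hypothetical, all counts are assumed to be strictly positive, and the input data are synthetic.

```python
import math


def fit_power_law(counts):
    """Ordinary least squares fit of log(C_R) = log(C_1) - S * log(R).

    counts: publication counts per player (any order, all > 0).
    Returns (c1_hat, s_hat): estimators of the top ranked count and exponent.
    """
    ranked = sorted(counts, reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two players to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(count) for count in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The intercept estimates log(C_1); the slope estimates -S.
    return math.exp(intercept), -slope


# Synthetic, exactly Zipfian counts (C_R = 1000 / R), for illustration only
counts = [1000 / r for r in range(1, 51)]
c1_hat, s_hat = fit_power_law(counts)
```

With this exactly Zipfian synthetic input, the fit recovers S = 1 and a top ranked count of 1000 up to floating-point error.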
Other mathematical fits such as beta or gamma functions may also be envisaged. In the particular case of the Pareto fit described above, there are two free parameters: the expectation value of the top ranked player publication count, $C_1$, and the exponent expectation value, S. In this context the term “player” may take on various interpretations, including, but not limited to, the assignee, original assignee, inventor, authority, agency, publisher, owner, academic institution, author, country, city, journal, ORCID or other publication or industry identifier, etc. Other statistical estimation methods are also anticipated, such as orthogonal distance regression, chi square minimization or the Levenberg-Marquardt iterative method. As mentioned above, a constant may be introduced representing the logarithm of the expectation value of the top ranked player publication count. Other more generalized methods of parameter estimation may be used, as known in the art.
In a preferred embodiment, the exponent of the power law distribution, S, may be used as an index of consolidation. In this context, the term consolidation refers to the concentration of the distribution of assets (i.e. publications) within the landscape. Landscapes with high values of S would be considered consolidated, while landscapes with low values would be considered fragmented. The rationale behind this assertion may be explained through consideration of the boundary cases of perfect fragmentation and perfect consolidation. In the case of a perfectly fragmented landscape, the publications would be equally distributed between the players. It is easily seen that this would produce a distribution of slope S equal to zero. In the converse case of a perfectly consolidated landscape, the whole portfolio would belong to a single dominant player and the slope S of the discrete Pareto distribution would tend to infinity. Given these two boundary cases, the exponent S is proposed as a metric of landscape consolidation, as is illustrated in Fig. 1. More specifically, for patent publication landscapes, S may vary from 0.2 to 2, or thereabouts, with a typical boundary between consolidated versus fragmented landscapes set to unity, i.e. S = 1.
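The two boundary cases can be checked against equation (1) with a short worked derivation. This is an illustrative sketch, not part of the original disclosure: the symbols N (total publication count) and n (number of players) are introduced here as assumptions, and the power law is assumed to hold exactly for every rank.

```latex
% Assumption: C_R = C_1 R^{-S} holds exactly for R = 1, ..., n, with total count N.
N = \sum_{R=1}^{n} C_R = C_1 \sum_{R=1}^{n} R^{-S}
\quad\Longrightarrow\quad
C_1 = \frac{N}{\sum_{R=1}^{n} R^{-S}}

% Perfect fragmentation: S = 0, so every term R^{0} equals 1.
S = 0: \qquad C_1 = \frac{N}{n}, \qquad C_R = \frac{N}{n} \ \text{for all } R

% Perfect consolidation: S \to \infty, so only the R = 1 term survives.
S \to \infty: \qquad \sum_{R=1}^{n} R^{-S} \to 1, \qquad C_1 \to N
```

Under these assumptions, the sum in the denominator decreases as S increases (for n > 1), so at fixed N the top ranked count rises with S, from an equal share N/n at S = 0 towards the whole portfolio N as S grows without bound.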
An additional feature of the present invention is the calculation and use of the expectation value of the top ranked player publication count, $C_1$, as a metric of landscape scale. Alternative metrics of scale may be the total count of publications for the first n ranked players, where n may vary from unity to the total number of players.
A further feature of the method is the use of the ratio of the actual top ranked player publication count, $C_A$, to the expectation value of the top ranked player publication count, $C_1$, as a metric of dominance of the top ranked player. More formally, it is stated that for a landscape of a given, finite number of publications, as the exponent rises, so will the expectation value of the top ranked player’s patent count entitlement, $C_1$. If we therefore wish to quantify the dominance of the position of the top ranked player, we should normalize their actual patent count by the expectation value of their patent count. Algebraically, the dominance factor (D) can be defined by the equation
$$D = \frac{C_A}{C_1} \qquad (3)$$
It is important to point out that in the preferred embodiment, this metric of dominance, D, of the top ranked player is considered a metric of the landscape as a whole and not just a metric of the top ranked player’s portfolio. This same metric may be applied to other ranked players.
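Equation (3) can be sketched in a few lines. The function name `dominance` and the numbers used are hypothetical. The extension to ranks other than 1, which divides the actual count by the fitted value from equation (1) at that rank, is one possible reading of the statement that the metric may be applied to other ranked players, and is an assumption rather than something the text specifies.

```python
def dominance(actual_count, c1_hat, s_hat, rank=1):
    """Ratio of a player's actual count to the count expected from the fit.

    For rank=1 this is D = C_A / C_1 (equation 3). For other ranks the
    expected count is taken as c1_hat * rank ** (-s_hat), which is an
    assumption made for illustration.
    """
    expected = c1_hat * rank ** (-s_hat)
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return actual_count / expected


# Hypothetical fitted values: C_1 estimate of 1000 and exponent S of 1
d_top = dominance(1500, 1000.0, 1.0)             # top ranked player
d_second = dominance(250, 1000.0, 1.0, rank=2)   # second ranked player
```

A value above unity indicates a player holding more publications than the fitted distribution would predict for that rank, and a value below unity indicates fewer.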
BRIEF DESCRIPTION OF FIGURES
The embodiment is herein described, by way of example only, with reference to the accompanying drawings, wherein:
FIG. 1 illustrates discrete Pareto distributions on a log-log plot for the Zipfian case of slope -1, and two extreme cases of slope 0 and -100, for publication landscapes of equal publication count (of 5187).
FIG. 2 is a log-log graph of assignee publication count vs rank of assignee for the lithography patent landscape, often termed a “Zipf plot”.
FIG. 3 is a log-linear graph of the predicted top assignee publication count ($C_1$) evolution over time for three patent landscapes.
FIG. 4 is a linear graph of the Dominance factor (D) evolution over time for three patent landscapes.
FIG. 5 is a linear graph of the Zipf plot exponent (S) evolution over time for three patent landscapes.
FIG. 6 is a log-linear graph of the predicted top assignee count ($C_1$) vs the Zipf plot exponent (S) for three patent landscapes. This graph is also termed a “meta-landscape”.
FIG. 7 is a linear-log graph of the predicted top assignee count ($C_1$) vs the Zipf plot exponent (S) for three patent landscapes. This graph is also termed a “meta-landscape”.
FIG. 8 is a log-linear graph of the predicted top assignee count ($C_1$) vs the Zipf plot exponent (S) for three patent landscapes whereby the balloon size is scaled according to the Dominance factor (D).
FIG. 9 is a group of Zipf plots for the lithography landscape at different times.
FIG. 10 is a diagram of a publication database analysis system.
FIG. 11 is a flowchart of a method of analysis of a publication database.
DETAILED DESCRIPTION OF THE INVENTION
An example with actual patent publication data, illustrating graphically the definition and extraction of the above-mentioned metrics, is shown in Fig. 2 for the lithography patent landscape, in which the publication count is plotted versus the player (assignee) rank on a log-log plot, yielding a straight line to a good approximation. Such a plot may be called a discrete Pareto plot, or a Zipf plot, after the American linguist George Kingsley Zipf, who first observed that, given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table. This was subsequently denoted Zipf’s law, as disclosed in G. K. Zipf, Selected Studies of the Principle of Relative Frequency in Language, Harvard University Press, 1932.
The three metrics extracted out of the discrete Pareto analysis methodology described above are useful for characterizing, both numerically and graphically, a publication landscape and more specifically a patent landscape. The methods of characterization will now be described in further detail.
In an embodiment, the metrics may be calculated by the above described methods for a sequence of search criteria, whereby the search criteria include a set of publication date windows. By way of example, the publication window may have a fixed lower bound defined by the earliest date of publication and an upper bound which is incremented by a preset duration, such as one year. Alternatively, the publication window duration can be fixed and the window moved in time. Such an approach can result in the visualization of a landscape temporal trajectory. For example, Figs. 3, 4 & 5 display the time evolution of the metrics $C_1$, D and S respectively, for three different search criteria. In these examples the search criteria were set as
TAC: (lithography OR lithographic) AND PBD:[19000101 TO XXXX0101] (4)
TAC: (glyphosate) AND PBD:[19000101 TO XXXX0101] (5)
(CPC:(G02B 27/01) OR IPC:(G02B 27/01)) AND PBD:[19000101 TO XXXX0101] (6)
whereby TAC indicates that the term in brackets appears in the title, abstract or claims of the patent publication, CPC/IPC indicates the patent classification class (in this case G02B 27/01 indicates head-up display) and XXXX indicates a year which is incremented to create the points in the respective graphs. Said graphs may be linear, log-linear or log-log. These time-resolved trajectories tell the business stories behind the respective intellectual property spaces, in terms of scale, top ranked player dominance and consolidation.
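The incrementing of XXXX in criteria (4) to (6) can be sketched as a small query generator. This is only an illustration: the function name `window_queries` and the year range are hypothetical, and the query string simply mirrors the syntax shown above without assuming any particular database interface.

```python
def window_queries(base_criterion, first_year, last_year, lower_bound="19000101"):
    """Build one query per year with a fixed lower bound and a moving upper bound."""
    return {
        year: f"{base_criterion} AND PBD:[{lower_bound} TO {year}0101]"
        for year in range(first_year, last_year + 1)
    }


# Hypothetical year range, for illustration only
queries = window_queries("TAC: (glyphosate)", 1975, 1977)
```

Each generated query would yield one publication list and hence one point on the time-resolved graphs. A window of fixed duration could be produced in the same way by deriving the lower bound from the year as well.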
With respect to date-related information, the present invention also includes methods of overcoming meta-landscape trajectory biases resulting from reliance on the current state of a publication database. By way of example, the current publication database may assign a subset of publications with original player B to player A, resulting from an acquisition of portfolio B by portfolio A. Therefore, if the publication count is performed according to one of the examples in equations (4), (5) or (6) above, the trajectory will not accurately reflect the state of the landscape in the past, prior to the above-mentioned acquisition: it will indicate a higher level of consolidation than that which existed at the time, because the count of the merged component B is attributed to A’s portfolio. In a further embodiment of the present invention, this problem is overcome by maintaining and querying earlier (time-stamped) versions of the database when compiling time-resolved trajectories, rather than relying on restricting the query of the current database by publication date. In an alternative embodiment, a merger table is constructed as an adjunct to the current publication database, which specifies the merger/acquisition date when a publication migrated between players. Using this additional information, the accurate historical state of the landscape can be reconstructed.
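The merger-table embodiment can be illustrated with a small sketch. It assumes, for simplicity, that the table is kept per player rather than per publication and that the original player of each publication is known; the table layout, the player labels and the function name `player_as_of` are all hypothetical.

```python
from datetime import date

# Hypothetical merger table rows: (acquired player, acquiring player, effective date).
MERGERS = [
    ("B", "A", date(2015, 6, 1)),
    ("C", "B", date(2010, 1, 1)),
]

def player_as_of(original_player, as_of, mergers=MERGERS):
    """Return the player that held the portfolio on `as_of`.

    Acquisitions are applied in chronological order, and only those that
    had already taken effect by `as_of` are followed.
    """
    owner = original_player
    for acquired, acquirer, effective in sorted(mergers, key=lambda row: row[2]):
        if acquired == owner and effective <= as_of:
            owner = acquirer
    return owner

# A publication originally held by C is counted under B in 2012 and under A in 2020.
owner_2012 = player_as_of("C", date(2012, 1, 1))
owner_2020 = player_as_of("C", date(2020, 1, 1))
```

Counting publications by `player_as_of(original, window_end)` instead of by the current player would keep a historical window from inheriting later acquisitions.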
In a preferred embodiment of the invention, the visualization of the meta-landscape may include information from two or even all three of the indices of scale, dominance and consolidation. For example, in one specific case, a metric of scale, e.g. the estimator of the publication count of the top ranked player, Ci, is plotted on a log scale on the y-axis, and an index of consolidation, e.g. the exponent, S, is plotted on a linear scale on the x-axis. An example of such a meta-landscape is shown in Fig. 6 for the three above specified landscapes. An interesting and novel feature of the so-called CS meta-landscape is the non-monotonic trajectory traversed by the different landscapes in the meta-landscape. In all three examples, it is noted that the landscape folds back on itself as the landscape grows while undergoing fragmentation, followed by a trajectory reversal after which the landscape consolidates. In one embodiment, elements in the CS meta-landscape are labeled with time or date labels as shown in Fig. 6. In an alternative embodiment shown in Fig. 7, the metric of scale is plotted on the x-axis and the metric of consolidation on the y-axis. In this configuration, the slope of the graph, i.e. dS/dCi, can be calculated, which may be considered an indicator of landscape dynamic, i.e. an indicator of whether the landscape is consolidating or fragmenting. Said landscape dynamic metric may be used or visualized in a way similar to any of the previously specified metrics. In Fig. 8, the meta-landscape incorporates three metrics: scale on the y-axis, consolidation on the x-axis and top player dominance, which is used to scale the balloons. Contrast or color of the marker may also be used to indicate the third metric. This may be termed a CSD meta-landscape.
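The landscape dynamic indicator dS/dCi can be approximated from a time-ordered trajectory by finite differences. The sketch below is one possible reading and assumes the trajectory is given as two equally long lists ordered by date; the function name and the sample values are hypothetical.

```python
def landscape_dynamic(scale, consolidation):
    """Finite-difference estimate of dS/dC between consecutive trajectory points.

    `scale` and `consolidation` are time-ordered lists of the scale metric
    and the exponent S. A step with no change in scale yields NaN.
    """
    points = list(zip(scale, consolidation))
    slopes = []
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        delta_c = c1 - c0
        slopes.append((s1 - s0) / delta_c if delta_c != 0 else float("nan"))
    return slopes

# Hypothetical trajectory: the landscape grows throughout; S first falls, then rises.
slopes = landscape_dynamic([100, 200, 400], [1.0, 0.9, 1.1])
```

While the scale is growing, a negative slope corresponds to a falling exponent (fragmentation) and a positive slope to a rising exponent (consolidation).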
In a further embodiment, the visualization may be in 3D space by virtue of the use of either virtual or augmented reality (VR or AR) display devices. In this case the 3 spatial dimensions of the AR or VR display may be assigned to the 3 metrics of scale, dominance and consolidation, i.e. a 3D version of the CSD meta-landscape. In an alternative embodiment, 1 of the 3 spatial dimensions of the AR or VR display may be allocated to time, t (i.e. date), and the other 2 to any pair of the 3 metrics of scale, dominance and consolidation. These may be termed CSt, CDt or SDt landscapes, respectively.
In an alternate visualization, multiple search criteria can be used to situate the state of multiple publication landscapes, at the current time or at some other specified time, within any of the meta-landscapes specified in previous paragraphs. The search criterion may also specify legal or jurisdictional restrictions, such as allowed or active patent publications or USPTO only.
In a further embodiment of the invention, rather than counting and ranking numbers of publications, a composite metric may be calculated which is partially dependent on the publication count but may include additional data which may reflect on the player’s position in the field. One example of such a composite metric is the so-called “Patent Asset Index” as described by Ernst et al. in "The Patent Asset Index - A new approach to benchmark patent portfolios," World Patent Information, 2010, in which the index includes additional quality metrics of the publication, which may indicate technology or market relevance, such as citation counts and GDP normalized geographical coverage. The index may then be used in a fashion similar to that of the publication count in the subsequent analysis. In an alternative embodiment, rather than counting and ranking the number of publications, some other metric may be counted and ranked, such as the number of citations. Furthermore, said other metric may be a metric of value or cost which is stored in a database.
A further embodiment of the method comprises a dynamic rather than static visualization. So, for example, in a figure which displays a set of frequency ranked Paretos of assignee publication counts at incrementally longer or later publication periods, said Paretos may be introduced sequentially (for example at time steps of t, where t may vary from 0.1 to 10 seconds). Additionally, in the said dynamic visualization, the trajectory of a given player in the Zipf plot over time may be highlighted by color, size, brightness or any other contrast mechanism. Such dynamic visualization may also be applied to data such as that shown in Figs. 3 to 8 or to an AR or VR enabled 3D visualization as specified in previous paragraphs.
In a further embodiment, the said metrics or any combination of them may be implemented as fields of a database. For example, in a database of search terms produced from a database of publications, fields of metrics of scale, consolidation and dominance or any combination of them can be used to characterize the search terms which nominally characterize a landscape or portfolio. Such a database, in conjunction with other business metrics which characterize a technology domain can be used as a training set for a machine learning algorithm, tasked, for example, with finding technology domains that are more likely to have attractive acquisition targets versus unassailable dominant players.
In a further embodiment in which the method is applied to a patent database, the player may be the inventor. The same mathematical metrics may then be reinterpreted to characterize the innovation processes of the specific landscape or for a particular player in a specific landscape or without restriction to a specific landscape.
In a further embodiment, the search criterion is systematically incremented or changed in order to visualize a large ensemble of metric data associated with a meta-landscape. By way of example, and in the case of patent landscapes, codes which classify publications according to domain of endeavor (such as IPC, UPC, GBC, F-Term, etc.) may be used to span a broad domain. A single point can be plotted on the CS diagram for each sub-category to create a “heat diagram” showing the density of points within the CS diagram. Such diagrams may also be displayed dynamically, as described above.
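The density underlying such a heat diagram can be sketched as a simple two-dimensional binning of one (S, scale) point per sub-category. The bin widths, the use of a base-10 logarithm for the scale axis, the function name and the sample points are assumptions made for illustration only.

```python
import math
from collections import Counter

def cs_heat_grid(points, s_bin=0.1, log_c_bin=0.5):
    """Bin (S, scale) points into a grid.

    Returns a Counter mapping (s_index, log_c_index) to the number of
    sub-categories falling in that cell. Scale is binned on a log10 axis.
    """
    grid = Counter()
    for s, scale in points:
        cell = (math.floor(s / s_bin), math.floor(math.log10(scale) / log_c_bin))
        grid[cell] += 1
    return grid

# Hypothetical points, one per classification sub-category: (exponent S, scale).
grid = cs_heat_grid([(0.95, 120), (0.98, 150), (1.42, 3000)])
```

The resulting cell counts can be passed to any plotting tool that renders a matrix as a color map.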
The system architecture of the invention will now be described with reference to Fig. 10. A user or machine 1 may instigate an action via a display interface operable to control a graphical display device, sending input to a computing device 2. Said device may execute program instructions stored on a non-transitory computer readable medium and executable by the at least one computing device, which is communicatively coupled to an electronic publication database 3, a data storage unit 4 and a graphical display device 5.
The system’s method of operation will further be described with reference to Fig. 11. Said input may be of the form of a database query 6, examples of which are shown in equations (4) - (6). A publication list 7 is then extracted from publication database 3. In one embodiment of the invention, the publication list is then name harmonized 8 by methods known in the art. One example method of name harmonization is to sort the publication list alphabetically according to the player name (e.g. assignee name) and to harmonize names of players with minor typographical or punctuational variations. A distribution of publication counts is then generated 9 from the publication list by counting publications with common player names, and the distribution is sorted in descending order by publication count. In an alternative embodiment, the name harmonization is performed subsequent to distribution generation, entries are combined and the distribution is re-sorted. In a next step, Zipfian analysis is performed 10 in order to calculate metrics of scale, dominance and consolidation or any combination of the three. One method of performing said Zipfian analysis is according to equations (1) - (3) above. In a subsequent step, landscape visualization on a graphical display device is performed 11. By way of example, the visualization may take the form of any of the graphs in Figs. 2 - 9. Said visualization may be either static or dynamic.
In an optional step, said metrics may be stored in a data storage unit. In a further optional step, said metrics may be retrieved from said storage unit and the above described methods of visualization may be performed at a subsequent time. It is appreciated that while the non-transitory computer readable medium and/or the database may be locally resident, they may also be remote from the graphical display device.
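The harmonization and counting steps of the method can be sketched as follows. The harmonization here is a deliberately crude normalization (case folding, punctuation removal, whitespace collapsing) substituted for the alphabetical-sort approach described above, and the player names are invented for illustration.

```python
import re
from collections import Counter

def harmonize(name):
    """Crude name normalization: case-fold, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.casefold())
    return " ".join(cleaned.split())

def ranked_distribution(player_names):
    """Count publications per harmonized player name, in descending order of count."""
    counts = Counter(harmonize(name) for name in player_names)
    return counts.most_common()

# One entry per publication in the retrieved list (hypothetical assignee names).
publication_players = [
    "Acme Optics, Inc.", "ACME Optics Inc", "acme  optics inc.",
    "Beta Litho Ltd.", "Beta Litho Ltd",
    "Gamma Photonics",
]
distribution = ranked_distribution(publication_players)
```

The sorted counts are the discrete distribution to which the Zipfian analysis is then applied. Real assignee data typically needs stronger harmonization, for example handling of legal suffixes and subsidiaries.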
In an alternative invention, the calculated indices may be used as metrics of quality of the search criterion. For example, discrete Pareto distributions with lower estimated exponents are more likely to result from search criteria which combine publications from players which are not necessarily in a competition with one another. This can be easily demonstrated by taking any two search criteria which result in two separate and distinct ensembles of players and combining them with an “OR” statement. Such “logical fragmentation” always results in a lower exponent than either of the two landscapes analyzed independently. Therefore, when comparing two candidate search criteria for the same landscape, that with the higher exponent, if said difference is significant, should be preferred.
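The effect of combining two distinct ensembles can be examined numerically. The sketch below pools two synthetic landscapes with disjoint players, each an exact power law, and compares the fitted exponents. It uses an ordinary least-squares fit in log-log space, which is an assumption rather than the estimator of equations (1) to (3), and it shows a single synthetic case rather than a general proof.

```python
import math

def zipf_exponent(counts):
    """Exponent S from a least-squares fit of log(count) against log(rank)."""
    ranked = sorted(counts, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx

# Two synthetic landscapes (hypothetical data), each with count = 1000 / rank.
landscape_a = [1000 / rank for rank in range(1, 51)]
landscape_b = [1000 / rank for rank in range(1, 51)]

s_a = zipf_exponent(landscape_a)
s_b = zipf_exponent(landscape_b)
# The players are disjoint, so an "OR" query simply pools the two count lists.
s_or = zipf_exponent(landscape_a + landscape_b)
fragmented = s_or < min(s_a, s_b)
```

In this synthetic case the pooled distribution contains every count twice, which flattens the head of the rank plot and lowers the fitted exponent relative to either landscape alone.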
It should be appreciated that while all the above examples have been directed towards publications from an intellectual property database, the methods and systems may also be applied to any publication database, including but not limited to trademarks, designs, scientific publications, books, blogs, advertisements, wiki pages, legal actions, website hits, or any other situation where conditions of preferential attachment are likely to prevail. It is pointed out that in the context of the current invention, the use of the term “publication” should not rule out the option whereby the database within which the search is performed is in fact not in the public domain and is strictly speaking not a publication database. An example of such a “non-public” database could be a list of wiki pages or web pages or any other internal data set in which players compete for dominance.

Claims

I CLAIM:
1. A method of analysis of a publication landscape by a computing device, including:
(a) the input of data in the form of at least one search criterion indicative of a publication landscape;
(b) querying of an electronic publication database, retrieval of a publication list meeting said at least one criterion from said publication database;
(c) counting publications with common player names from said publication list and sorting by count to generate a discrete distribution;
(d) applying power law analysis to said discrete distribution to determine the value of an exponent S of said discrete distribution;
(e) using said exponent S as a metric of consolidation of said publication landscape.
2. The method of claim 1 wherein said publication landscape is a patent landscape.
3. The method according to either claim 1 or 2 wherein additional metrics are determined, of either or both
(a) top ranked player dominance or
(b) landscape scale.
4. The method according to any one of claims 1 to 3 wherein said metric or metrics are visualized graphically on a graphical display device.
5. The method according to any one of claims 1 to 3 wherein said metric or metrics are stored in a data storage unit.
6. The method of claim 4 wherein said visualization is a graph in which on one axis said metric of landscape consolidation is plotted and on the other axis said metric of landscape scale is plotted.
7. The method of claim 6 wherein said metric of landscape scale is plotted logarithmically on said axis.
8. The method of claim 4 wherein all three of said metrics are plotted graphically.
9. The method of claim 4 wherein said search criteria include date ranges and said visualization is a graph with one of said metrics on one axis and date, year or time on the other axis.
10. The method of claim 4 wherein said graphical visualization is dynamic.
11. A system for analyzing a publication landscape comprising:
(a) a display interface operable to control a graphical display device,
(b) at least one computing device, and
(c) program instructions stored on a non-transitory computer readable medium and executable by the at least one computing device to:
i) receive input data in the form of at least one search criterion indicative of a publication landscape;
ii) query an electronic publication database and retrieve a publication list meeting said at least one criterion from said publication database;
iii) count publications with common player names from said publication list and sort by count to generate a discrete distribution;
iv) apply power law analysis to said discrete distribution to determine the value of an exponent S of said discrete distribution;
v) use said exponent S as a metric of consolidation of said publication landscape.
12. The system of claim 11 wherein said publication landscape is a patent landscape.
13. The system according to either claim 11 or 12 wherein additional metrics are determined, of either or both
(a) top ranked player dominance or
(b) landscape scale.
14. The system according to any one of claims 11 to 13 wherein said metric or metrics are visualized graphically on a graphical display device.
15. The system according to any one of claims 11 to 13 wherein said metric or metrics are stored in a data storage unit.
16. The system of claim 14 wherein said visualization is a graph in which on one axis said metric of landscape consolidation is plotted and on the other axis said metric of landscape scale is plotted.
17. The system of claim 16 wherein said metric of landscape scale is plotted logarithmically on said axis.
18. The system of claim 14 wherein all three of said metrics are plotted graphically.
19. The system of claim 14 wherein said search criteria include date ranges and said visualization is a graph with one of said metrics on one axis and date or time on the other axis.
20. The system of claim 14 wherein said graphical visualization is dynamic.
21. The system of claim 14 wherein the non-transitory computer readable medium and/or the database are remote from the graphical display device.
PCT/IL2020/050968 2019-09-04 2020-09-03 Method and system of publication landscape analysis WO2021044428A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/686,715 US20220188322A1 (en) 2019-09-04 2022-03-04 Method and system of database analysis and compression

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962895513P 2019-09-04 2019-09-04
US62/895,513 2019-09-04

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/686,715 Continuation-In-Part US20220188322A1 (en) 2019-09-04 2022-03-04 Method and system of database analysis and compression

Publications (1)

Publication Number Publication Date
WO2021044428A1 (en) 2021-03-11

Family

ID=74852568

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL2020/050968 WO2021044428A1 (en) 2019-09-04 2020-09-03 Method and system of publication landscape analysis

Country Status (1)

Country Link
WO (1) WO2021044428A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6748394B2 (en) * 2000-04-27 2004-06-08 Hyperion Solutions Corporation Graphical user interface for relational database
US8176440B2 (en) * 2007-03-30 2012-05-08 Silicon Laboratories, Inc. System and method of presenting search results
US20160350886A1 (en) * 2011-04-15 2016-12-01 Ip Street Holdings, Llc Method and System for Evaluating Intellectual Property

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NEWMAN M E J: "Power laws, Pareto distributions and Zipfs law", CONTEMPORARY PHYSICS, vol. 46, no. 5, 1 September 2005 (2005-09-01), pages 323 - 51, XP080176374 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20859833

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20859833

Country of ref document: EP

Kind code of ref document: A1