KR20090062270A

KR20090062270A - Patent intelligence system providing automatic applicant-unit analysis

Info

Publication number: KR20090062270A
Application number: KR1020070129412A
Authority: KR
Inventors: 강민수
Original assignee: (주)광개토연구소; 강민수
Priority date: 2007-12-12
Filing date: 2007-12-12
Publication date: 2009-06-17

Abstract

The present invention relates to a patent information system that supports automatic analysis on an applicant-unit basis.

When the present invention is used, a patent information service system for each company is built automatically from a patent information database. In addition, information on the inventors belonging to the company is extracted from the patent information, a patent information service system for each inventor is generated automatically, and the two systems are integrated hierarchically. Domestic and foreign patent information related to each inventor's filed patents can then be extracted automatically and provided to that inventor by means of the patent classification codes included in the inventor's patent application documents.

Furthermore, using the information extracted from the patent application documents of the company and its inventors, together with search expressions obtained from the inventors, the present invention provides a monitoring function that automatically updates the related domestic and foreign patent documents, as well as statistics, analysis, competitive information, and reporting functions.

Because a patent information service system is generated automatically for the company and for each of its inventors, they can easily use the domestic and foreign patent information related to them without any effort spent on construction or separate configuration.

Description

Patent intelligence system providing automatic applicant-unit analysis

The present invention relates to an applicant-unit patent information system that supports automatic analysis, and to such a system that generates and provides advanced patent information including analysis information.

Patent information has the characteristics of technical information, rights information, and management information, and its importance is increasing amid global competition. As technical information, it reveals trends in technology development and the technical ideas applied in individual patents; as rights information, it reveals the scope of rights of individual patents and the extent to which rights have been secured at home and abroad.

There are many ways to obtain patent information, but they fall broadly into three groups: 1) using the patent information system provided by each country's patent office, 2) using a patent information system developed by a private company, and 3) using a patent information system that an individual company has built for its own purposes. Representative examples of 1) are the patent information systems provided by the Korean Intellectual Property Office (www.kipo.go.kr) and the Korean Patent Information Service (www.kipris.or.kr). Representative examples of 2) are www.delphion.com in the United States, www.patolis.co.jp in Japan, and www.wips.co.kr and www.wisdomain.com in Korea.

Private companies in each country, such as Thomson Scientific, the operator of www.delphion.com, have built patent information databases and present search results in a variety of ways for search expressions entered through a patent search engine. Various kinds of analysis software have also been developed on top of this patent information; a well-known example is the software distributed under the trade name AUREKA, developed by the operator of www.micropatent.com (since acquired by Thomson Scientific). Large technology companies around the world are known to have built and to operate their own internal patent search and patent management systems, but these systems are difficult to access from outside.

It is common to spend billions of dollars or more to build such a patent search and patent management system, and most domestic and foreign mid-sized companies, small companies, and venture companies do not have one of their own. Many of these companies have a dedicated patent organization that manages their patents and collects patent information for them. However, companies that are small or that have little awareness of patents often have no dedicated patent organization and instead entrust the management of their patents to patent law firms or law firms.

These patent organizations use externally accessible free online patent information services such as www.delphion.com, and collect and manage patent information on the platforms those services provide. In addition, most patent applications in a company's name are made by inventors belonging to the company, and it is a global trend for inventors to collect domestic and foreign patent information related to their inventions. The reasons are that, despite the delay imposed by the patent publication period, more than 70% of the world's new technical information is disclosed in the form of patent information, and that patent information contains many more commercially applicable ideas than papers and similar sources. However, many of these inventors, whether or not they are familiar with patent searching, obtain patent information by accessing external patent information service providers such as www.delphion.com.

Inventors commonly continue to study a specific technical field for several years to several decades, so the technical field of their inventions does not change greatly. Domestic and foreign patent information on fields directly or indirectly related to the technical field of an inventor's own inventions is therefore useful to that inventor. Accordingly, there has been demand for a differentiated patent information service system that operates not only at the level of the individual company but also at the level of each inventor within the company, and that is optimized for the specific inventor. Such a system would greatly improve access to patent information for mid-sized, small, and venture companies that cannot develop and maintain a patent information system of their own, and would serve as a new platform for in-house distribution of patent information for companies that can.

In addition, it would be convenient for a company or inventor if the patents they manage were analyzed automatically, and if analysis information and monitoring information on the competitors and competing technologies related to them were organized automatically and delivered to them directly.

The first technical problem to be solved by the present invention is to provide a patent information system that automatically builds a patent information system for each company on the basis of a patent information database, extracts the inventor information included in the patent information, automatically builds a patent information system for each inventor, and integrates the two systems hierarchically.

The second technical problem to be solved by the present invention is to provide a patent information system that automatically collects domestic and foreign patent information related to the company and/or the inventor in at least one of the generated applicant-unit patent information system and inventor-unit patent information system.

The third technical problem to be solved by the present invention is to provide a patent information system that, in at least one of the applicant-unit patent information system and the inventor-unit patent information system, automatically analyzes domestic and foreign patent information related to the company and/or the inventor according to various patent analysis indexes by means of a multi-dimensional analysis operation result table.

The fourth technical problem to be solved by the present invention is to provide a patent information system that automatically analyzes the patent information of the company and/or the inventor itself in at least one of the applicant-unit patent information system and the inventor-unit patent information system.

The fifth technical problem to be solved by the present invention is to propose various preprocessing methods for patent document information and to provide a patent information system that performs various patent analyses on the basis of the preprocessed patent information.

The sixth technical problem to be solved by the present invention is to provide the patent information system of the first to fifth technical problems either online to a plurality of applicants or as a system operating within the internal system of an individual applicant.

To achieve the above objects, the present invention discloses an applicant-unit patent information system supporting automatic analysis, which includes a patent document mast DB, a preprocessing module unit, a patent information processing basic module, and an analysis module, wherein the analysis module provides automated analysis results on an applicant-unit basis.

The effects of the present invention are as follows.

First, a hierarchically integrated applicant-unit patent information system can be generated from nothing more than the obtained applicant information, and an inventor-unit patent information system can likewise be generated effectively.

Second, the patent information DB can be built effectively through various kinds of preprocessing.

Third, preprocessing the patent classification code information allows lower-level patent classification codes to be handled effectively in patent search, patent analysis, patent monitoring, and the like, which improves the quality of those services.

Fourth, processing fusion information among patent classification codes makes it possible to grasp fusion trends between the codes, and this fusion attribute can be used in the patent analysis of the present invention.

Fifth, because various kinds of analysis, including citation analysis, are processed per patent document set, patent information can be processed regardless of the size and type of the document set.

Premise Information

Format of published patent document and included information

Each country's patent office issues an official publication for a patent application or registered patent that meets certain conditions. These publications are issued in several formats, including text, image, PDF, SGML, and XML files. Current publications in each country are issued as XML files, and older documents in SGML or other formats can be converted into XML files with appropriate processing. XML files whose formats differ by country or by issuing body can in turn be converted to a common XML file format. The World Intellectual Property Organization (WIPO) sets standards for marking up each bibliographic item, and these standards are adopted in many countries. WIPO ST.36 is currently recommended and can be found at www.wipo.int.

The published patent information differs from country to country, but largely includes bibliographic information and patent text.

Bibliographic information essentially includes: the country that issued the patent document; the title of the invention; the applicant (including any assignee; the same applies hereinafter) or patentee (including any assignee; the same applies hereinafter); the inventor; the patent classification codes (one or more selected from IPC, USPC, FI, FT, and ECLA; the same applies hereinafter); various dates such as the filing date; and various numbers such as the application number. It may optionally also include abstract information, information on the agent where there is one, reference information (one or more of the prior art disclosed by the applicant, documents reviewed or searched by the examiner, and information on patent classification symbols), priority information, examiner information, and so on. The bibliographic information may further include the abstract, a representative claim, or, among the claims, the first claim.

The main text contains the technical content of the patent. The body of a typical patent document includes the title of the invention, the configuration of the invention, and the claims. In addition, it includes one or more of the effects of the invention, industrial applicability, the technical problem to be solved by the invention, a description of the drawings, and the prior art. Depending on the inventor's choice or the type of patented technology, drawings are also included, either as mandatory or as optional content.

Beyond the bibliographic information and the main text, further information is available, the most representative being administrative information. Each patent office may also provide additional information that other offices do not, such as the Field of Search provided by the US Patent and Trademark Office. Administrative information refers to information on the various administrative actions taken by the relevant patent office or patent authority for each patent document. Typical examples are information on events that occur within the responsible authority, such as the start of examination, and/or information on events that occur in its interaction with the applicant or other users of the authority, such as an application to change the applicant's name.

Presence of patent classification code information and patent classification code system

In addition, each patent document is assigned at least one patent classification code under at least one patent classification code system. Documents issued by each authority, such as Korea, the United States, Japan, and Europe (EPO), are assigned IPC (International Patent Classification) symbols. Each authority may also have its own classification system (USPC or UPC in the case of the United States; F-Term (FT) and FI (File Index) in the case of Japan; ECLA in the case of the European Patent Office; and so on), whose symbols are assigned only to patent documents issued by that authority.

The IPC was established under the Strasbourg Agreement Concerning the International Patent Classification, and at least one IPC symbol is assigned to every patent document. IPC symbols follow a hierarchical notation made up of sections, classes, subclasses, groups, and subgroups.
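As an illustration of the section / class / subclass / group / subgroup notation, the following minimal sketch splits an IPC symbol such as H01F 1/053 into its levels. The regular expression, the `IpcSymbol` type, and the `parse_ipc` function are assumptions made for this example (a simplified pattern, not the official IPC grammar) and are not part of the disclosed system.

```python
import re
from typing import NamedTuple


class IpcSymbol(NamedTuple):
    section: str     # e.g. "H"
    ipc_class: str   # e.g. "H01"
    subclass: str    # e.g. "H01F"
    main_group: str  # e.g. "H01F 1/00"
    subgroup: str    # e.g. "H01F 1/053" (same as main_group for a main group)


# Simplified pattern (assumption): section letter A-H, two-digit class,
# subclass letter, group number, "/", subgroup number.
_IPC_RE = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_ipc(symbol: str) -> IpcSymbol:
    """Split an IPC symbol into its hierarchical levels."""
    m = _IPC_RE.match(symbol.strip().upper())
    if m is None:
        raise ValueError(f"not a recognisable IPC symbol: {symbol!r}")
    section, cls, sub, group, subgroup = m.groups()
    subclass = f"{section}{cls}{sub}"
    return IpcSymbol(
        section=section,
        ipc_class=f"{section}{cls}",
        subclass=subclass,
        main_group=f"{subclass} {group}/00",
        subgroup=f"{subclass} {group}/{subgroup}",
    )
```

For example, `parse_ipc("H01F 1/053").subclass` gives `"H01F"`, which is the kind of key a search or statistics step could use to roll a detailed symbol up to a coarser level.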

In addition, the patent offices of some countries have their own unique patent classification system, such as the UPC used by the US Patent Office, the ECLA used by the European Patent Office, and the FI and FT used by the Japanese Patent Office.

The UPC uses a class and subclass notation; the Japanese FI has a structure that extends the IPC vertically; and the FT has a notation consisting of a broad subject category called a theme and the F-Terms that belong under that theme.

In a patent classification code system, each classification code corresponds one-to-one to a "title" that describes the content of that code. For codes at or below a certain depth in the system, dots may be prefixed to the title information. The following example shows the hierarchical nature of the patent classification code system, the existence of title information, and the dots placed before the title information to indicate relative position within the classification system.

Section: H Electric

Class: H01 Basic electric element

Subclass: H01F Magnet

Main group: H01F 1/00 Magnet or magnetic body characterized by magnetic material

1-dot subgroup: 1/01 * Inorganic materials

2-dot subgroup: 1/03 ** characterized by coercive force

3-dot subgroup: 1/032 *** of hard magnetic material

4-dot subgroup: 1/04 **** Metal or alloy

5-dot subgroup: 1/047 ***** Alloy characterized by composition

6-dot subgroup: 1/053 ****** Including rare earth metals

Taking the IPC as an example, the patent classification symbol for every subgroup has the structure "subclass" + "number/number", and each symbol is associated with a corresponding title. (Except for number/00 symbols, most titles are combined with dots; number/00 symbols usually correspond to main groups.) The fewer dots a title carries, the higher-level the technical classification it represents relative to its neighbors; the more dots, the lower-level and more detailed the classification.

As the above example shows, the IPC has a multi-level hierarchical (tree) structure. USPC, FT, and ECLA also have multi-level hierarchical structures. The multi-level hierarchical structure of each patent classification code system may be stored in a DB.
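The dot counts in the example above are enough to recover the tree: each entry's parent is the nearest preceding entry with fewer dots. The sketch below is one way this could be done, assuming the entries are supplied in schedule order as (symbol, dot count, title) tuples; the function names `build_parents` and `ancestors` are hypothetical and the data is limited to the H01F entries listed above.

```python
# Entries in schedule order: (symbol, number of dots, title).
ENTRIES = [
    ("H01F 1/00", 0, "Magnet or magnetic body characterized by magnetic material"),
    ("H01F 1/01", 1, "Inorganic materials"),
    ("H01F 1/03", 2, "characterized by coercive force"),
    ("H01F 1/032", 3, "of hard magnetic material"),
    ("H01F 1/04", 4, "Metal or alloy"),
    ("H01F 1/047", 5, "Alloy characterized by composition"),
    ("H01F 1/053", 6, "Including rare earth metals"),
]


def build_parents(entries):
    """Map each symbol to its parent symbol (None for a main group)."""
    parents = {}
    stack = []  # stack[i] is the most recent symbol seen at dot level i
    for symbol, dots, _title in entries:
        del stack[dots:]  # discard entries at this depth or deeper
        parents[symbol] = stack[-1] if stack else None
        stack.append(symbol)
    return parents


def ancestors(symbol, parents):
    """Return the chain of higher-level symbols, nearest first."""
    chain = []
    while parents[symbol] is not None:
        symbol = parents[symbol]
        chain.append(symbol)
    return chain


PARENTS = build_parents(ENTRIES)
```

A parent map of this kind is one possible representation for storing the hierarchy in a DB table (symbol, parent symbol), so that a document classified under a detailed subgroup can also be found under each of its higher-level codes.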

Introduction of configuration of patent information system 1 of the present invention

First, the configuration of the patent information system 1 of the present invention is described briefly.

The description is given with reference to FIG. 1 and FIG.

The patent information system 1 of the present invention is connected to the user computer 300 through the wired/wireless network 200. Here, the user computer 300 may be any computer that connects to the patent information system 1: a desktop computer, a laptop, a wired or wireless communication terminal, a device that can be regarded as a computer such as a game console, or a third-party server, a server of an organization or institution, or a program module outside the patent information system 1. When the patent information system 1 provides a web service, it should be equipped with a web service support module.

The patent information system 1 includes: a DB unit 20 for the DBs; a preprocessing module unit 30 that performs various kinds of preprocessing; a patent information processing basic module 40 that processes and analyzes patent information; a support module 50 (member information processing module 501, multilingual processing module 502, translation module 503, and so on) responsible for various kinds of support such as member management and policy management; a patent intelligence module 60 directly concerned with advanced analysis of patent information; and a hierarchically integrated patent information service system that generates an aggregate of patent information services in units of applicants, inventors, or agents. Because the patent intelligence module generates high-level patent information mainly through patent analysis, it may also be called the patent analysis intelligence module; in this specification the two terms are used interchangeably.

The patent information system 1 is largely comprised of six components.

The first component is the set of preprocessing modules, which perform various purpose-specific preprocessing on the obtained patent documents.

The second component is the patent information processing basic module 40, which includes modules that perform 1) a search function, 2) an analysis function, 3) a monitoring function, 4) an analysis function, 5) a patent document set function, 6) a function for creating various multi-level directories, and a reporting function.

The third component is the patent intelligence module 60, which includes modules that generate various kinds of advanced patent analysis information, including citation analysis and comparative analysis.

The fourth component is a patent information service system generation module which generates the patent information system 1 for each applicant, inventor, agent, and patent classification code.

The fifth component is the support module 50, which is responsible for 1) member information processing, 2) multilingual processing, 3) translation processing, 4) web service processing, and other support that allows the inventive concept to be serviced in various network 200 environments.

The sixth component is the various DBs:

1) the patent document mast DB (202),

2) the patent classification code mast DB (203),

3) various mast DBs such as the subject mast DB (204),

4) various DBs related to analysis,

5) various support DBs (member DB (206-2), menu DB (206-3), policy DBs (representative phrase extraction policy DB (206-1-1), weighting policy DB (206-1-2), patent indicator DB (206-1-3), analysis query DB (206-1-4)), and the like),

6) various secondary processing DBs (text mining DB (207-1), representative phrase DB (207-2), multiple patent classification code relation DB (207-3)).

The six components above are representative elements. Other configurations described in this specification are also included in the system 1 of the present invention, and functions that are obvious to those skilled in the art (modules, firewalls, membership management, etc.) are not mentioned separately.

The DB unit 20 includes a DBMS 201 that manages the DBs; a patent document mast DB 202, a database that integrates and stores patent documents; a patent classification code mast DB 203 that stores information about patent classification codes; a subject mast DB 204 that stores information on applicants, inventors, companies, and so on; a multi-dimensional analysis operation result table DB 205 that stores multi-dimensional analysis results; support DBs that store information on members and on various policies and options; and a secondary processing DB unit 207 that stores the results of processing various data.

The menu DB holds the notation of each menu item for each country or language. The patent information system 1 of the present invention reads the locale information from the browser of the connecting user, extracts the country- or language-specific menu corresponding to that locale from the menu DB, generates a screen using the extracted menu, and sends it to the user. To make this possible, the menus on each screen must be coded, and the HTML that makes up a screen must contain menu codes rather than menu text in a particular language. For example, if there is a "search" menu whose code is 1111, it is stored as 1111: 검색 (Korean): search (English): 檢索 (Japanese). "Menu" here refers broadly to the text and images that make up a screen. If text or an image contains a language expression and is handled in the same way as the "search" example, the system can obtain the locale information of the user's browser, extract the menu items (text, images) for the corresponding language, combine them into HTML, and send the result to the user. In this way a multilingual patent information system 1 can be built. When the user explicitly selects a language, the menu for that language is extracted if it exists in the menu DB; if it does not, English is treated as the default and the English menu is extracted.
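The menu-code lookup with an English fallback can be sketched as follows. This is only an illustration under stated assumptions: the in-memory `MENU_DB` dictionary stands in for the menu DB, the `{{1111}}` placeholder syntax for menu codes in the HTML template is hypothetical, and the Korean label is an assumed value for the "search" example.

```python
import re

# Stand-in for the menu DB: menu code -> label per language (assumed data).
MENU_DB = {
    1111: {"ko": "검색", "en": "search", "ja": "檢索"},
}
DEFAULT_LANG = "en"  # English is the default when a language is missing


def language_from_locale(locale: str) -> str:
    """Reduce a browser locale such as 'ko-KR' or 'ja_JP' to a language code."""
    return locale.replace("_", "-").split("-")[0].lower()


def menu_label(code: int, locale: str) -> str:
    """Look up the label for a menu code, falling back to English."""
    labels = MENU_DB[code]
    return labels.get(language_from_locale(locale), labels[DEFAULT_LANG])


def render(template: str, locale: str) -> str:
    """Replace every {{code}} placeholder (hypothetical syntax) with its label."""
    return re.sub(
        r"\{\{(\d+)\}\}",
        lambda m: menu_label(int(m.group(1)), locale),
        template,
    )
```

With this sketch, `render("<button>{{1111}}</button>", "ko-KR")` produces the Korean button, while a locale with no entry in the menu DB, such as `"fr-FR"`, falls back to the English label.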

Preprocessing module unit 30

The various preprocessing modules of the present invention are described below. The preprocessing module unit 30 may include any one or more of: 1) the original patent document processing module 301-1, 2) the counting preprocessing module 3100, 3) the patent classification code statistical preprocessing module 3200, 4) the weighted preprocessing module 3300, 5) the citation information preprocessing module 3400, 6) the patent classification code preprocessing module (301-3-1 or 3500), 7) the applicant representative preprocessing module (301-4-1-1 or 3600), 8) the representative phrase extraction preprocessing module 3700, 9) the family information preprocessing module 3800, 10) the multiple patent classification symbol relation preprocessing module 3900, 11) the owner change information preprocessing module 302, and 12) the administrative processing information acquisition module 303.

Original patent document processing module (301-1)

The original patent document processing module 301-1 of the present invention is described next. The module processes the patent document data obtained for each country (XML, SGML, or other formats). Internationally filed patent documents are published by WIPO; although WIPO is not a country, it is treated as one here for convenience of description.

The original patent document processing module 301-1 takes the obtained patent document data, performs predetermined processing on it, and generates patent document data in a format that can be processed by the patent information system 1 of the present invention.

The predetermined processing may include 1) error filtering, 2) patterned error correction, 3) manual error correction, 4) conversion of SGML document data to XML document data, and 5) standard format conversion.

The original patent document processing module 301-1 passes the obtained patent document data through at least one error verification filter 301-1-1 to verify that the data conforms to a format that can be processed by the patent information system 1 of the present invention.

Errors checked by the error verification filter 301-1-1 include 1) various XML and SGML tag errors, that is, mismatches between the actual document data and the DTD information included in each patent document; 2) errors in the representation of values within tags, such as errors in the content of various bibliographic fields (for example, the year notation in document numbers or the notation of patent classification symbols); and other errors.
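A very small error verification filter along these lines might look like the sketch below. It is an assumption-laden illustration: the tag names `application-date` and `ipc` and the value patterns are hypothetical, and Python's `xml.etree.ElementTree` only checks well-formedness, so validation against a DTD (which the text calls for) would need a separate validating parser.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical field rules: tag name -> pattern the field value must match.
FIELD_RULES = {
    "application-date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "ipc": re.compile(r"^[A-H]\d{2}[A-Z] \d{1,4}/\d{2,6}$"),
}


def verify(xml_text: str) -> list[str]:
    """Return a list of error messages; an empty list means no error found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Tag-level error: the document is not well-formed XML.
        return [f"tag error: {exc}"]
    errors = []
    for tag, rule in FIELD_RULES.items():
        for elem in root.iter(tag):
            value = (elem.text or "").strip()
            if not rule.match(value):
                errors.append(f"{tag}: bad value {value!r}")
    return errors
```

Documents for which `verify` returns a non-empty list would then be handed to a correction step, automatic where a known pattern applies and manual otherwise.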

In addition, the original patent document processing module 301-1 includes an error pattern correction module 301-1-2, which holds frequently occurring error patterns together with the correction pattern for each, and automatically corrects those errors filtered out by the error verification filter 301-1-1 that match a correctable pattern. Error patterns and correction patterns can be added continuously. Errors that the error pattern correction module 301-1-2 cannot correct are processed manually to generate corrected patent document data. The error patterns and correction patterns are stored in the error/correction pattern DB. Examples include two-digit years in year notation, wrong separators between year, month, and day, a wrong number of digits in the application number (especially the number of zeros), and incorrectly formatted patent classification codes (a missing "/" symbol, upper/lower case errors, extra zeros after the "/" symbol, and so on).
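Pairs of error pattern and correction pattern can be modeled as a list of (regular expression, fix function) entries that grows over time. The sketch below covers only two of the error types mentioned (wrong date separators and case or spacing errors in classification symbols); the patterns, the target formats, and the `correct` function are assumptions for illustration, not the contents of the error/correction pattern DB.

```python
import re

# (error pattern, correction) pairs; further pairs can be appended over time.
CORRECTIONS = [
    # Dates written with ".", "/" or " " as separator -> YYYY-MM-DD.
    (
        re.compile(r"^(\d{4})[./ ](\d{1,2})[./ ](\d{1,2})$"),
        lambda m: f"{m.group(1)}-{int(m.group(2)):02d}-{int(m.group(3)):02d}",
    ),
    # Classification symbols with lower-case letters or missing space.
    (
        re.compile(r"^([a-hA-H])(\d{2})([a-zA-Z])\s*(\d{1,4}/\d{2,6})$"),
        lambda m: f"{m.group(1).upper()}{m.group(2)}{m.group(3).upper()} {m.group(4)}",
    ),
]


def correct(value: str):
    """Return the corrected value, or None if no pattern applies.

    None signals that the value has to be corrected manually.
    """
    text = value.strip()
    for pattern, fix in CORRECTIONS:
        m = pattern.match(text)
        if m:
            return fix(m)
    return None
```

A value such as `"H01F 1053"`, where the "/" is missing, matches neither pattern here and returns `None`, which corresponds to the manual-correction path described above.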

In addition, the original patent document processing module 301-1 preferably further includes an SGML-XML conversion module that converts acquired patent document data in SGML format into corresponding XML patent document data. The SGML-XML conversion module refers to the DTD associated with the SGML document data and generates converted XML document data that conforms to the DTD of the XML document data. Error verification is preferably performed on the XML document data produced by the SGML-XML conversion module. Although the patent information system 1 of the present invention is described on the basis of patent documents converted into XML, the idea of the present invention clearly remains valid for structured patent document data in formats other than XML.

In addition, the original patent document processing module 301-1 may further include a standard format conversion module, which converts the error-verified patent document data into a single standard format to generate standardized patent document data. Patent document data formats have changed over time (changes of format such as SGML to XML, changes to the detailed DTD within the same format, added field names, changes in representation, and so on). Moreover, although WIPO sets the standard for XML, any number of different DTDs can comply with it, as in the XML documents issued by Korea, the US Patent Office, the Japanese Patent Office, and the EPO. Unifying these into a single XML document format under one standardized DTD simplifies later document handling; otherwise, tasks such as styling the XML or converting it to PDF become complicated, because each document format requires its own style or PDF generation. Standard format conversion may also include converting notations that are specific to one country or are not internationally accepted. A representative example is the era-based year notation, such as Showa or Heisei, found in Japanese patent documents, which is converted to a four-digit Western calendar year.

Hereinafter, the patent document data or patent document of the present invention refers to patent document data or a patent document that has passed the error filter and been converted, corrected, or standardized.

The patent document data processed and stored by the original patent document processing module 301-1 is used for each purpose of the patent information system 1 of the present invention. These purposes may include 1) processing for search, 2) generating the patent document mast DB, 3) performing translation, 4) processing into document output formats such as PDF, and 5) storage and backup.
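The conversion of Japanese year notation to a four-digit year can be illustrated as follows. Japanese documents commonly give dates in era notation (for example 昭和 or 平成), where era year 1 of 昭和 corresponds to 1926 and era year 1 of 平成 to 1989. The date layout accepted by the regular expression, the ISO output format, and the function name `era_to_iso` are assumptions for this sketch; first-year dates written with 元年 and other eras are not handled.

```python
import re

# Offset to add to the era year: era year 1 of 昭和 is 1926, of 平成 is 1989.
ERA_OFFSETS = {"昭和": 1925, "平成": 1988}

# Assumed input layout: <era><year>年<month>月<day>日
_ERA_DATE = re.compile(r"^(昭和|平成)(\d{1,2})年(\d{1,2})月(\d{1,2})日$")


def era_to_iso(date_text: str) -> str:
    """Convert an era-notation date to YYYY-MM-DD."""
    m = _ERA_DATE.match(date_text.strip())
    if m is None:
        raise ValueError(f"unsupported date notation: {date_text!r}")
    era = m.group(1)
    year, month, day = int(m.group(2)), int(m.group(3)), int(m.group(4))
    return f"{ERA_OFFSETS[era] + year:04d}-{month:02d}-{day:02d}"
```

For instance, `era_to_iso("平成19年12月12日")` yields `"2007-12-12"`, so that dates from Japanese documents can be compared directly with four-digit-year dates from other offices.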

Patent document mast DB generation module (301-2)

The patent document mast DB generation module 301-2 of the present invention generates the patent document mast DB 202 on the basis of the patent document data generated by the original patent document processing module 301-1 and the information processed by the various preprocessing modules of the present invention. The detailed modules included in the patent document mast DB generation module 301-2 are as follows. 1) A bibliographic DB generation module 301-2-1 that generates the source bibliography DB (202-1-1), processed bibliography DB (202-1-2), translated bibliography DB (202-1-3), and representative drawing DB (202-1-4). 2) A document DB generation module 301-2-2 that generates the patent document DB 202-2-1, translated patent document DB 202-2-2, and processed patent document DB 202-2-3. 3) A rights holder change DB generation module 301-2-3 that generates the US assignee change DB 202-3-1 and the rights setting change DB 202-3-2. 4) A family information DB generation module 301-2-4 that generates the country-specific patent status DB 202-4-1 and the country-specific family DB 202-4-2. 5) A citation information DB generation module 301-2-5 that generates the applicant citation information DB 202-5-1 and the examination citation information DB 202-5-2. 6) An administrative processing information DB generation module 301-2-6 that generates the administrative processing information DB 202-6.

The patent document mast DB 202 includes a bibliographic information DB, a document DB, a rights change information DB, a family information DB, a citation information DB, an administrative processing information DB, and the like.

The bibliographic DB contains source bibliographic information, which can be obtained directly from the patent document data; processed bibliographic information, which is produced by processing the patent document data itself or the entire patent document data set; translation bibliographic information, in which the source and processed bibliographic items are translated into each target language; and other information. The bibliographic DB generation module of the present invention is responsible for generating the bibliographic DB.

The document DB contains the patent document data itself; translation patent document data for each language, in which part or all of the patent document data is translated; and processing patent document data, in which the patent document data or the translation patent document data is converted into file formats such as pdf, doc, tiff, and html. The patent document data, the translation patent document data for each language, and the processing patent document data may exist in the form of a patent document DB, a translation patent document DB, and a processing patent document DB, respectively. The document DB generation module of the present invention is responsible for generating the document DB.

The right holder change information DB includes an assignee change information DB for US patent documents. It may also contain information on changes in the names of patent right holders and patent applicants managed by the patent office of each country, and information on the registration of the establishment and termination of various licenses. The rights holder variation DB generation module of the present invention is responsible for generating the right holder change information DB.

The family information DB may include country-specific family data, which contains information on the family patent documents in each country related to an individual patent document, and country-specific patent status data, which contains information on the status of each family patent document in each country (the current state of the application in that country, whether rights have been obtained, etc.). The family information DB generation module of the present invention is responsible for generating the family information DB.

The citation information DB includes an applicant citation information DB, generated from the citation information cited by the patent applicant, and an examination citation information DB, generated from information on the other patent documents cited by the examiner during examination of a specific patent application (mainly documents cited to deny the novelty and inventive step of the application under examination). The citation information DB generation module of the present invention is responsible for generating the citation information DB.

The administrative processing information DB includes data generated based on administrative processing information generated by patent offices of each country for each patent document. The administrative processing information DB generating module of the present invention is responsible for generating the administrative processing information DB.

The source bibliographic information is, as described above, information that can be obtained directly from the patent document data. The bibliographic DB generation module refers to the table structure of the bibliographic DB, the field names of each table, and the correspondence between those field names and the tag names contained in the patent document data, and uses the information obtained from the patent document data as the content of each field of each table.

For example, as follows.

<KR_ApplicationInformation>
  <KR_ApplicationNumber>10-2006-7012696</KR_ApplicationNumber>
  <KR_ApplicationDate>June 23, 2006</KR_ApplicationDate>
</KR_ApplicationInformation>
<KR_Applicant>
  <KR_ApplicantInformation>
    <KR_ORGName>Sumcom Inc.</KR_ORGName>
    <KR_Country>United States</KR_Country>
    <KR_Postcode>000-000</KR_Postcode>
    <KR_Address>5775 Morehouse Drive, San Diego, California, USA (postal code 92121-1714)</KR_Address>
  </KR_ApplicantInformation>
</KR_Applicant>

When the patent document data includes the above information, 10-2006-7012696 is either entered as it is in the application number field of the table that requires the application number, or converted to the input format and type prescribed for that field, such as 1020067012696. Similarly, June 23, 2006 may be entered as it is in the application date field, or converted to a form such as 20060623. Likewise, if there is an applicant table with an applicant name field, an applicant country field, an applicant zip code field, and an applicant address field, these fields receive Sumcom Inc., United States, 000-000, and the address given in the KR_Address tag, respectively.

Here, if the field for the application number is named AppNum, the information contained in the KR_ApplicationNumber tag can be entered in the AppNum field only if it has been specified that AppNum corresponds to KR_ApplicationNumber. This specification of the correspondence between the field names of each table and the tag names contained in the patent document data is the correspondence information referred to above.
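The tag-to-field correspondence described above can be sketched as follows. This is only an illustration under stated assumptions: the wrapping `KR_Bibliographic` root element, the `AppDate` field name, and the `FIELD_MAP` layout are hypothetical, while `AppNum`, the tag names, and the sample values come from the example above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Sample fragment modelled on the example above.
# The <KR_Bibliographic> root element is an assumption added so the text parses as one document.
SAMPLE = """<KR_Bibliographic>
  <KR_ApplicationInformation>
    <KR_ApplicationNumber>10-2006-7012696</KR_ApplicationNumber>
    <KR_ApplicationDate>June 23, 2006</KR_ApplicationDate>
  </KR_ApplicationInformation>
</KR_Bibliographic>"""


def normalize_number(text):
    """10-2006-7012696 -> 1020067012696"""
    return text.replace("-", "")


def normalize_date(text):
    """June 23, 2006 -> 20060623 (assumes English month names)"""
    return datetime.strptime(text, "%B %d, %Y").strftime("%Y%m%d")


# Hypothetical correspondence specification: DB field name -> (XML tag name, converter)
FIELD_MAP = {
    "AppNum": ("KR_ApplicationNumber", normalize_number),
    "AppDate": ("KR_ApplicationDate", normalize_date),
}


def extract_fields(xml_text, field_map=FIELD_MAP):
    """Return one table row (field name -> value) for the given XML document."""
    root = ET.fromstring(xml_text)
    row = {}
    for field, (tag, convert) in field_map.items():
        node = root.find(f".//{tag}")
        if node is not None and node.text:
            row[field] = convert(node.text.strip())
    return row


row = extract_fields(SAMPLE)
```

A field with no entry in `FIELD_MAP` is simply never filled, which mirrors the point that a tag's content can only reach a field whose correspondence has been specified.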

Meanwhile, given an XML document, a number of solutions for loading it into a DB are available as open source or commercial software, and programming such a solution is easy for those skilled in the art with knowledge of XML document parsing.

Through the above process, the bibliographic details DB generation module can obtain bibliographic details from the patent document data and make them into a DB. The DB in which such bibliographic details information is stored is called a source bibliographic details DB.

Subsequently, the bibliographic DB generation module may generate processed bibliographic data by obtaining the information that the various preprocessing modules and the like have produced for individual patent document data and/or for the entire patent document data set. The DB containing this data is called the processing bibliographic DB. The processed bibliographic information may include 1) various counting information and 2) various calculation and evaluation information.

For example, the counting information includes: 1) the number of applicants and/or patentees, 2) the number of inventors, 3) the number of claims at each stage, such as the filing stage or the registration stage, 4) the number of specification pages, 5) the number of drawings, 6) the number of kinds of patent classification codes, 7) the number of patent classification codes of each kind, 8) the number of references cited / citations received, 9) the number of examination citations made / received, 10) the number of patent classification codes searched by the examiner, 11) the number of priority claims, 12) the number of family patents in each country, including the home country, 13) the total number of family members, 14) the number of independent claims, 15) the number of dependent claims, 16) the number of patent references by country, 17) the total number of patent documents among the references, and 18) the number of non-patent documents among the references. The counting preprocessing module of the present invention is responsible for producing this counting information. Details are given in the description of the counting preprocessing module of the present invention.
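A few of the counting items listed above can be illustrated with a minimal sketch. The record layout (a dictionary with `applicants`, `inventors`, `claims`, and `references` keys, and the `depends_on` and `is_patent` attributes) is a hypothetical structure chosen for illustration, not a format given in this description.

```python
def counting_info(doc):
    """Compute some of the counting items for one patent document record.

    `doc` is a hypothetical dictionary; a claim with depends_on=None is
    treated as an independent claim.
    """
    claims = doc.get("claims", [])
    refs = doc.get("references", [])
    independent = sum(1 for c in claims if c.get("depends_on") is None)
    patent_refs = sum(1 for r in refs if r.get("is_patent"))
    return {
        "applicant_count": len(doc.get("applicants", [])),
        "inventor_count": len(doc.get("inventors", [])),
        "claim_count": len(claims),
        "independent_claim_count": independent,
        "dependent_claim_count": len(claims) - independent,
        "patent_reference_count": patent_refs,
        "non_patent_reference_count": len(refs) - patent_refs,
    }


# Made-up record used only to exercise the function
example_doc = {
    "applicants": ["Applicant A"],
    "inventors": ["Inventor X", "Inventor Y"],
    "claims": [{"depends_on": None}, {"depends_on": 1}, {"depends_on": 1}],
    "references": [{"is_patent": True}, {"is_patent": False}],
}
counts = counting_info(example_doc)
```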

Examples of the calculation and evaluation information include: 1) values for calculating various patent indices (total number of patents, total number of registrations, share, concentration rate, activity index (AI), etc.) for the patent classification codes contained in the patent documents of the applicant named in the patent document (the patent classification code itself, its lower patent classification codes, or its parent patent classification code); 2) values for calculating various patent indices for the patent classification codes contained in the patent documents of the inventor named in the patent document; and 3) various analysis indices. These are described together with the various preprocessing modules and analysis modules of the present invention.

Patent classification code mast DB generation module (301-3)

The preprocessing module of the present invention includes a patent classification code mast DB generation module 301-3, and the DB generated by the patent classification code mast DB generation module 301-3 is as follows.

1) The original patent classification code DB 203-1 which stores raw data of various patent classification codes obtained from patent offices or patent source data sources of each country.

2) The modified patent classification code DB 203-2, which stores patent classification codes modified to suit the uses of the patent information system of the present invention; it is generated by the modified patent classification code generation module 301-3-3.

3) The total high patent classification code set DB 203-3, which collects and stores all the upper patent classification codes for an arbitrary patent classification code; it is generated by the total high patent classification code set generation module 301-3-1-1.

4) The lower patent classification code set DB 203-4, which stores information on the direct lower or all lower patent classification codes for an arbitrary patent classification code; it is generated by the lower patent classification code set generation module 301-3-1-2. The lower patent classification code set generation module 301-3-1-2 includes a direct lower patent classification code set generation module 301-3-1-2-1, which generates only the set of direct lower patent classification codes, and an all lower patent classification code set generation module 301-3-1-2-2, which generates the set of all lower patent classification codes.

5) There is a patent classification code tree table DB 203-5 which stores a patent classification code system in a tree structure, which is generated by a patent classification code tree table generation module (not shown).

6) The total high patent classification code table DB 203-6, which stores, level by level, all the upper patent classification codes for an arbitrary patent classification code; it is generated by the total high patent classification code set generation module 301-3-1-1. Meanwhile, when the patent classification codes are updated, the update is processed by the patent classification update module 301-3-5 of the present invention, and each module related to the updated patent classification codes performs its predetermined processing.

Hereinafter, the patent classification code will be collectively described.

Multi-level structure of patent classification codes

Each patent classification code in the multi-level hierarchical structure corresponds one-to-one with the title information of that patent classification code. Many such examples are presented throughout this specification. To obtain information about all of the lower patent classification codes of a given patent classification code in the hierarchy, a truncation operator/wildcard can be used with the search engine (the search module 401 and the search engine are used as synonyms in the present invention) or with the DBMS 201. For example, to obtain all lower patent classification codes of H01F, a search expression or query that appends a wildcard (e.g., ?) to H01F can be used. Similarly, to obtain all lower patent classification codes of H01F1/00, "H01F1?" can be entered. However, entering "H01F1/01?" does not return all lower patent classification codes of H01F1/01. The reason is that, in patent classification systems whose title information contains dots, such as the IPC, codes at different depths share the same notation pattern (for example, main group + / + number for the IPC, class number + number for the USPC, and one digit + one letter + three digits + two letters + number for the FT), and the relative parent-child relationship is expressed only by the number of dots. In other words, where the title information contains no dot, the parent-child relationship can be determined from the patent classification code alone, but otherwise it cannot.
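The limitation of wildcard (prefix) matching can be seen in a small sketch. It uses part of the H04B 7/14 branch, whose parent-child relations are taken from the H04B 7/00 example given later in this description; the function names are hypothetical.

```python
# child -> parent, for part of the H04B 7/14 branch (relations as listed later in this description)
PARENT = {
    "H04B7/14": "H04B7/00",
    "H04B7/145": "H04B7/14",
    "H04B7/15": "H04B7/14",
    "H04B7/155": "H04B7/15",
    "H04B7/165": "H04B7/155",
    "H04B7/17": "H04B7/155",
    "H04B7/185": "H04B7/15",
    "H04B7/22": "H04B7/00",
}


def prefix_matches(prefix):
    """What a trailing-wildcard query such as 'H04B7/14?' would return."""
    return sorted(code for code in PARENT if code.startswith(prefix))


def true_descendants(code):
    """Codes that really lie below `code` in the classification tree."""
    found = []
    for candidate in PARENT:
        ancestor = PARENT.get(candidate)
        while ancestor is not None:
            if ancestor == code:
                found.append(candidate)
                break
            ancestor = PARENT.get(ancestor)
    return sorted(found)


by_wildcard = prefix_matches("H04B7/14")    # ['H04B7/14', 'H04B7/145']
by_tree = true_descendants("H04B7/14")      # also contains H04B7/15, H04B7/155, ...
```

The wildcard result misses H04B7/15 and everything under it, because those codes do not share the textual prefix of their parent.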

Because the patent classification codes form a hierarchy, a patent document that corresponds to a lower patent classification code should also correspond to its upper patent classification codes. That is, if the patent classification code H01F1/04 is assigned to a specific patent document, the patent document also corresponds to H01F1/032, H01F1/03, H01F1/01, and of course H01F1/00. Conversely, the patent information related to H01F1/03 should include not only the information for H01F1/03 itself but also the patent information related to every patent classification code below H01F1/03 in the hierarchy, including H01F1/032 and H01F1/04.
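A minimal sketch of this upward propagation follows. The parent chain is the one stated in the paragraph above; the document identifiers and the function names are made up for illustration.

```python
# child -> parent, following the chain stated above
PARENT = {
    "H01F1/04": "H01F1/032",
    "H01F1/032": "H01F1/03",
    "H01F1/03": "H01F1/01",
    "H01F1/01": "H01F1/00",
}


def with_ancestors(code, parent=PARENT):
    """Return the code followed by all of its upper codes, nearest first."""
    chain = [code]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def index_documents(assignments, parent=PARENT):
    """Map every code to the documents assigned to it or to any code below it.

    `assignments` maps a document id to the code directly assigned to it.
    """
    index = {}
    for doc_id, code in assignments.items():
        for c in with_ancestors(code, parent):
            index.setdefault(c, set()).add(doc_id)
    return index


# Hypothetical documents D1 and D2
index = index_documents({"D1": "H01F1/04", "D2": "H01F1/03"})
```

Here `index["H01F1/03"]` contains both documents, while `index["H01F1/032"]` contains only D1.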

FIG. 64 shows an example in which the inventive concept of the present invention is not applied. As can be seen in FIG. 64, more documents correspond to the subcategory A61B 3/02 than to A61B 3/00 above it.

Three ways to process lower patent classification code

For the reason given above, a patent classification code whose title information contains a dot must be processed so that it includes the information about all of its lower patent classification codes. This processing is handled by the patent classification code preprocessing module 301-3-1 (or 3500) included in the patent classification code mast DB generation module 301-3.

Every process that extracts information for a given patent classification code needs to include the patent information for its lower patent classification codes. Representative cases of such information extraction are 1) search, 2) statistics, 3) analysis, 4) monitoring, and 5) directory display. Where a wildcard is not used, one of the following processes is necessary in order to include the information about all lower patent classification codes of a given patent classification code.

First, the patent classification code system (tree structure) is searched from the given patent classification code, using a method such as depth-first search, to obtain information on all lower patent classification codes. This search can be run each time a query/search is made. Alternatively, all lower patent classification codes can be obtained and stored in advance, either for every patent classification code in the system or only for those whose title information contains a dot, and the stored information can then be looked up at query/search time. Given data in a tree structure, finding the direct children and/or all descendants of each node, for example by depth-first search, is basic computer science, so its description is omitted. Storing the obtained direct children and/or all descendants in correspondence with a specific node is likewise basic, so its description is also omitted.
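For illustration only, the first method can be sketched as an iterative depth-first traversal over a parent-to-children map. The map below covers the H04B 7/14 branch as it is listed later in this description; the names `CHILDREN`, `all_descendants`, and `DESCENDANT_SET` are hypothetical.

```python
# parent -> direct children, in schedule order
CHILDREN = {
    "H04B7/14": ["H04B7/145", "H04B7/15"],
    "H04B7/15": ["H04B7/155", "H04B7/185", "H04B7/204"],
    "H04B7/155": ["H04B7/165", "H04B7/17"],
    "H04B7/185": ["H04B7/19", "H04B7/195"],
    "H04B7/204": ["H04B7/208", "H04B7/212", "H04B7/216"],
}


def all_descendants(code, children=CHILDREN):
    """Depth-first (pre-order) list of every code below `code`."""
    result = []
    stack = list(reversed(children.get(code, [])))
    while stack:
        node = stack.pop()
        result.append(node)
        # Push children in reverse so the first child is visited first
        stack.extend(reversed(children.get(node, [])))
    return result


# Variant described above: compute once and store, then look up at query time
DESCENDANT_SET = {code: all_descendants(code) for code in CHILDREN}
```

Searching per query costs a traversal each time; the stored variant trades storage and an update step (whenever the classification changes) for a constant-time lookup.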

Second, for each patent classification code, the patent classification codes corresponding to its parents are stored level by level with reference to the patent classification code system (tree structure). When a patent classification code is given, the level at which that code first appears is found, and information is obtained for all patent classification codes that carry the given code at that level.

Third, a modified patent classification code can be generated by changing only the notation of the patent classification code while preserving the given patent classification code system (tree structure), and the first and second methods above can be applied to the modified patent classification codes. Moreover, by choosing the modification method appropriately with reference to the patent classification code system, the notation can be modified so that it supports a range search / range query.
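The description does not fix a particular modification method. As one possible choice, and purely as an assumption for illustration, each code can be given a pre-order number together with the highest number in its subtree, so that "the code and everything below it" becomes a single range condition. The names below are hypothetical.

```python
# parent -> direct children (H04B 7/14 branch as listed later in this description)
CHILDREN = {
    "H04B7/14": ["H04B7/145", "H04B7/15"],
    "H04B7/15": ["H04B7/155", "H04B7/185", "H04B7/204"],
    "H04B7/155": ["H04B7/165", "H04B7/17"],
    "H04B7/185": ["H04B7/19", "H04B7/195"],
    "H04B7/204": ["H04B7/208", "H04B7/212", "H04B7/216"],
}


def assign_ranges(children, root):
    """Number nodes in pre-order; return code -> (first, last).

    `first` is the node's own number and `last` is the largest number
    found in its subtree, so a subtree is exactly the range first..last.
    """
    ranges = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        first = counter
        for child in children.get(node, []):
            visit(child)
        ranges[node] = (first, counter)

    visit(root)
    return ranges


def in_subtree(ranges, code):
    """Range query: the code itself plus all codes below it, in pre-order."""
    first, last = ranges[code]
    ordered = sorted(ranges.items(), key=lambda item: item[1][0])
    return [c for c, (number, _) in ordered if first <= number <= last]


RANGES = assign_ranges(CHILDREN, "H04B7/14")
```

With such numbers stored next to each code, a database query of the form `number BETWEEN first AND last` would play the role of `in_subtree`.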

Example description via H04B 7/00

It will be described in more detail below.

The patent classification codes and the title information described above correspond 1:1, and in the case of the IPC, the upper-lower hierarchy information that cannot be expressed by the patent classification code itself is expressed, at the subgroup level and below, by the number of dots. In the IPC it is difficult to determine the upper-lower relationship from the patent classification code alone at the subgroup level and below, so the hierarchy is determined from the dots contained in the title information corresponding to the code. That is, at the subgroup level and below, the hierarchy between patent classification codes is identified from the dot structure contained in the titles. The upper-lower hierarchy between patent classification codes has the form of a tree structure.

In the present invention, the tree structure between patent classification codes is described using the main group H04B 7/00 of the IPC 7th edition (titled radio transmission systems) and the patent classification codes below it. The description is not limited to this embodiment: it applies in the same or an equivalent manner to the entire IPC, and to other patent classification systems that express the hierarchy as a dot structure in the title (for example, USPC, FI, FT, or ECLA).

The patent classification code and its title corresponding to the main group H04B 7/00 and the lower part thereof have the following configuration as of May 5, 2006 based on the IPC 7th edition.

H04B 7/00 Radio transmission systems, i.e. using radiation field

H04B 7/005 .Control of transmission; Equalising

H04B 7/01 .Reducing phase shift

H04B 7/015 .Reducing echo effects

H04B 7/02 .Diversity systems

H04B 7/04 ..using a plurality of spaced independent aerials

H04B 7/06 ...at transmitting station

H04B 7/08 ...at receiving station

H04B 7/10 ..using a single aerial system characterised by its polarisation or directive properties

H04B 7/12 ..Frequency-diversity systems

H04B 7/14 .Relay systems

H04B 7/145 ..Passive relay systems

H04B 7/15 ..Active relay systems

H04B 7/155 ...Ground-based stations

H04B 7/165 ....employing angle modulation

H04B 7/17 ....employing pulse modulation

H04B 7/185 ...Space-based or airborne stations

H04B 7/19 ....Earth-synchronous stations

H04B 7/195 ....Non-synchronous stations

H04B 7/204 ...Multiple access

H04B 7/208 ....Frequency-division multiple access

H04B 7/212 ....Time-division multiple access

H04B 7/216 ....Code-division or spread-spectrum multiple access

H04B 7/22 .Scatter propagation systems

H04B 7/24 .for communication between two or more posts

H04B 7/26 ..at least one of which is mobile

When the tree hierarchy is laid out by dot level to make it easier to see, it can be expressed as shown in Table 1 below.

TABLE 1

| Main group | 1-dot subgroup | 2-dot subgroup | 3-dot subgroup | 4-dot subgroup | Title |
|---|---|---|---|---|---|
| H04B 7/00 | | | | | Radio transmission systems |
| | H04B 7/005 | | | | Control of transmission; Equalising |
| | H04B 7/01 | | | | Reducing phase shift |
| | H04B 7/015 | | | | Reducing echo effects |
| | H04B 7/02 | | | | Diversity systems |
| | | H04B 7/04 | | | using a plurality of spaced independent aerials |
| | | | H04B 7/06 | | at transmitting station |
| | | | H04B 7/08 | | at receiving station |
| | | H04B 7/10 | | | using a single aerial system characterised by its polarisation or directive properties |
| | | H04B 7/12 | | | Frequency-diversity systems |
| | H04B 7/14 | | | | Relay systems |
| | | H04B 7/145 | | | Passive relay systems |
| | | H04B 7/15 | | | Active relay systems |
| | | | H04B 7/155 | | Ground-based stations |
| | | | | H04B 7/165 | employing angle modulation |
| | | | | H04B 7/17 | employing pulse modulation |
| | | | H04B 7/185 | | Space-based or airborne stations |
| | | | | H04B 7/19 | Earth-synchronous stations |
| | | | | H04B 7/195 | Non-synchronous stations |
| | | | H04B 7/204 | | Multiple access |
| | | | | H04B 7/208 | Frequency-division multiple access |
| | | | | H04B 7/212 | Time-division multiple access |
| | | | | H04B 7/216 | Code-division or spread-spectrum multiple access |
| | H04B 7/22 | | | | Scatter propagation systems |
| | H04B 7/24 | | | | for communication between two or more posts |
| | | H04B 7/26 | | | at least one of which is mobile |
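The parent of each code in Table 1 can be derived mechanically from the dot counts. The sketch below assumes schedule lines of the form "subclass group title" with the leading dots still attached to the title; the titles are abbreviated and the function names are hypothetical.

```python
# A few schedule lines in listing order; leading dots of the title give the depth
LINES = [
    "H04B 7/00 Radio transmission systems",
    "H04B 7/02 .Diversity systems",
    "H04B 7/04 ..using a plurality of spaced independent aerials",
    "H04B 7/06 ...at transmitting station",
    "H04B 7/08 ...at receiving station",
    "H04B 7/10 ..using a single aerial system",
    "H04B 7/14 .Relay systems",
    "H04B 7/15 ..Active relay systems",
]


def parse_line(line):
    """Split a schedule line into (code, dot count, title without dots)."""
    subclass, group, title = line.split(" ", 2)
    dots = len(title) - len(title.lstrip("."))
    return subclass + group, dots, title.lstrip(". ")


def parents_from_dots(lines):
    """Return code -> parent code (None for the main group).

    `stack[i]` holds the most recent code seen at dot level i, so the
    parent of a code with d dots is the entry at level d - 1.
    """
    parent = {}
    stack = []
    for line in lines:
        code, dots, _ = parse_line(line)
        del stack[dots:]          # forget codes at this depth or deeper
        parent[code] = stack[-1] if stack else None
        stack.append(code)
    return parent


PARENT = parents_from_dots(LINES)
```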

Based on the information shown in Table 1 above, the following is one embodiment of storing the parent-child relationships between patent classification codes in a DB. (In this case the title corresponding to each patent classification code may of course be stored in a separate table. The titles are preferably translated into each language and stored per language; that is, a (translated) title information table keyed by the patent classification code preferably exists for each language.)

Given an IPC patent classification code, the first letter is the section, the next two digits complete the class, the following letter completes the subclass, and the main group follows. Below the main group, the dot level is determined by referring to the corresponding title information. By obtaining all patent classification codes above a code with reference to the patent classification code system (tree structure), a total high patent classification code table such as Table 2 below can be generated. Table 2 can of course be generated not only for the IPC but also for the USPC, FT, ECLA, FI, and so on.

When the patent classification codes are updated, the table is preferably updated with the updated patent classification code information. The total high patent classification code table of Table 2 is generated by the total high patent classification code table generation module of the present invention from data obtained from the patent classification code mast DB, and the module preferably regenerates the table whenever the patent classification code information is updated. The module may also generate the same kind of table for the modified patent classification code DB of the present invention.
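Splitting an IPC code into its section, class, subclass, and main group can be sketched with a regular expression. The pattern and the returned field names (which follow the S, C, SC, MG, and Self columns of Table 2) are an illustrative assumption, not a normative definition of the IPC format.

```python
import re

# section letter, two-digit class, subclass letter, main group number, "/", subgroup number
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def split_ipc(code):
    """Return the fixed upper levels of an IPC code as a dictionary."""
    match = IPC_PATTERN.match(code.strip())
    if match is None:
        raise ValueError(f"not an IPC group code: {code!r}")
    section, cls, subclass, group, subgroup = match.groups()
    stem = section + cls + subclass
    return {
        "S": section,
        "C": section + cls,
        "SC": stem,
        "MG": f"{stem}{group}/00",
        "Self": f"{stem}{group}/{subgroup}",
    }


parts = split_ipc("H04B 7/06")
```

The dot level below the main group cannot be read from the code in this way, which is why the title information is consulted for it.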

The fields of Table 2 are as follows. IPC_ID is the ID of a specific IPC code, S is the section level, C is the class level, SC is the subclass level, MG is the main group level, 1dot is the level with one dot in the title information, 2dot is the level with two dots, 3dot is the level with three dots, and 4dot is the level with four dots (in general, ndot is the level with n dots in the title information). Self is the IPC code itself.

TABLE 2

| IPC_ID | S | C | SC | MG | 1dot | 2dot | 3dot | 4dot | Self |
|---|---|---|---|---|---|---|---|---|---|
| 69964 | H | H04 | H04B | H04B7/00 | | | | | H04B7/00 |
| 69965 | H | H04 | H04B | H04B7/00 | H04B7/005 | | | | H04B7/005 |
| 69966 | H | H04 | H04B | H04B7/00 | H04B7/01 | | | | H04B7/01 |
| 69967 | H | H04 | H04B | H04B7/00 | H04B7/015 | | | | H04B7/015 |
| 69968 | H | H04 | H04B | H04B7/00 | H04B7/02 | | | | H04B7/02 |
| 69969 | H | H04 | H04B | H04B7/00 | H04B7/02 | H04B7/04 | | | H04B7/04 |
| 69970 | H | H04 | H04B | H04B7/00 | H04B7/02 | H04B7/04 | H04B7/06 | | H04B7/06 |
| 69971 | H | H04 | H04B | H04B7/00 | H04B7/02 | H04B7/04 | H04B7/08 | | H04B7/08 |
| 69972 | H | H04 | H04B | H04B7/00 | H04B7/02 | H04B7/10 | | | H04B7/10 |
| 69973 | H | H04 | H04B | H04B7/00 | H04B7/02 | H04B7/12 | | | H04B7/12 |
| 69974 | H | H04 | H04B | H04B7/00 | H04B7/14 | | | | H04B7/14 |
| 69975 | H | H04 | H04B | H04B7/00 | H04B7/14 | H04B7/145 | | | H04B7/145 |
| 69976 | H | H04 | H04B | H04B7/00 | H04B7/14 | H04B7/15 | | | H04B7/15 |
| 69977 | H | H04 | H04B | H04B7/00 | H04B7/14 | H04B7/15 | H04B7/155 | | H04B7/155 |
| 69978 | H | H04 | H04B | H04B7/00 | H04B7/14 | H04B7/15 | H04B7/155 | H04B7/165 | H04B7/165 |
| 69979 | H | H04 | H04B | H04B7/00 | H04B7/14 | H04B7/15 | H04B7/155 | H04B7/17 | H04B7/17 |
| 69980 | H | H04 | H04B | H04B7/00 | H04B7/14 | H04B7/15 | H04B7/185 | | H04B7/185 |
| 69981 | H | H04 | H04B | H04B7/00 | H04B7/14 | H04B7/15 | H04B7/185 | H04B7/19 | H04B7/19 |
| 69982 | H | H04 | H04B | H04B7/00 | H04B7/14 | H04B7/15 | H04B7/185 | H04B7/195 | H04B7/195 |
| 69983 | H | H04 | H04B | H04B7/00 | H04B7/14 | H04B7/15 | H04B7/204 | | H04B7/204 |
| 69984 | H | H04 | H04B | H04B7/00 | H04B7/14 | H04B7/15 | H04B7/204 | H04B7/208 | H04B7/208 |
| 69985 | H | H04 | H04B | H04B7/00 | H04B7/14 | H04B7/15 | H04B7/204 | H04B7/212 | H04B7/212 |
| 69986 | H | H04 | H04B | H04B7/00 | H04B7/14 | H04B7/15 | H04B7/204 | H04B7/216 | H04B7/216 |
| 69987 | H | H04 | H04B | H04B7/00 | H04B7/22 | | | | H04B7/22 |
| 69988 | H | H04 | H04B | H04B7/00 | H04B7/24 | | | | H04B7/24 |
| 69989 | H | H04 | H04B | H04B7/00 | H04B7/24 | H04B7/26 | | | H04B7/26 |

That is, given the IPC patent classification code H04B7/06, finding all of its upper nodes in the IPC patent classification code system (the branch points of the tree structure and the like are referred to as nodes) yields H, H04, H04B, H04B7/00, H04B7/02, H04B7/04, and H04B7/06 itself. Arranging these according to the field structure, with H04B7/06 placed in both the 3dot field and the Self field, produces the row with IPC_ID 69970.
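Building one such row from a child-to-parent map can be sketched as follows. The `PARENT` map holds only the chain needed for H04B7/06, the IPC_ID column is omitted, and the function name is hypothetical.

```python
LEVELS = ["MG", "1dot", "2dot", "3dot", "4dot"]

# child -> parent for the chain above H04B7/06 (None marks the main group)
PARENT = {
    "H04B7/00": None,
    "H04B7/02": "H04B7/00",
    "H04B7/04": "H04B7/02",
    "H04B7/06": "H04B7/04",
}


def upper_code_row(code, parent=PARENT):
    """Return the Table 2 style row (without IPC_ID) for one code."""
    path = [code]
    while parent.get(path[-1]) is not None:
        path.append(parent[path[-1]])
    path.reverse()  # main group first, the code itself last

    # Section, class, and subclass are fixed-width prefixes of an IPC code
    row = {"S": code[0], "C": code[:3], "SC": code[:4]}
    for i, level in enumerate(LEVELS):
        row[level] = path[i] if i < len(path) else None
    row["Self"] = code
    return row


row_69970 = upper_code_row("H04B7/06")
```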

Given a total high patent classification code table such as Table 2, all the upper patent classification codes of a given patent classification code are obtained as follows: find the given code in the Self field, then move up level by level along the row in which it was found, collecting the patent classification code at each level. The collection of all upper patent classification codes of a given code obtained in this way is the total high patent classification code set DB of the present invention. The lower patent classification code set DB of the present invention is the DB built by using the total high patent classification code table to extract all lower patent classification codes of a specific patent classification code. The table can also be used to extract only the direct lower patent classification codes of a specific code, and this information is stored in the direct lower patent classification code set DB of the present invention.

Given a total high patent classification code table such as Table 2 and a specific patent classification code, a method of obtaining that code and all of its lower patent classification codes is described by way of example. In step 1, H04B7/15 is looked up in the Self field and is found in the row with IPC_ID 69976. In step 2, the level field of that row that holds the same code is identified; here it is the 2dot field. In step 3, every row whose 2dot field holds H04B7/15 is found, which yields H04B7/15, H04B7/155, H04B7/165, H04B7/17, H04B7/185, H04B7/19, H04B7/195, H04B7/204, H04B7/208, H04B7/212, and H04B7/216. These are H04B7/15 itself and all of its lower patent classification codes. Excluding the code itself, all of the lower patent classification codes can be associated with it. A lower patent classification code set DB can thus be generated from the information on a specific patent classification code and all of its lower patent classification codes; when all lower patent classification codes are stored for each specific code, the result is the all lower patent classification code set DB. The number of lower patent classification codes can also be counted (here 11 including the code itself, 10 excluding it). This counting is preferably performed by the counting preprocessing module of the present invention.
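The three steps can be written out as a short sketch. The rows reproduce the H04B 7/14 branch of Table 2 (rows 69974 to 69986, without the IPC_ID, S, C, and SC columns); the helper names are hypothetical.

```python
LEVELS = ("MG", "1dot", "2dot", "3dot", "4dot")
MG, RELAY, ACTIVE = "H04B7/00", "H04B7/14", "H04B7/15"

# One tuple per row: the non-empty level fields from MG down to the code itself
PATHS = [
    (MG, RELAY),
    (MG, RELAY, "H04B7/145"),
    (MG, RELAY, ACTIVE),
    (MG, RELAY, ACTIVE, "H04B7/155"),
    (MG, RELAY, ACTIVE, "H04B7/155", "H04B7/165"),
    (MG, RELAY, ACTIVE, "H04B7/155", "H04B7/17"),
    (MG, RELAY, ACTIVE, "H04B7/185"),
    (MG, RELAY, ACTIVE, "H04B7/185", "H04B7/19"),
    (MG, RELAY, ACTIVE, "H04B7/185", "H04B7/195"),
    (MG, RELAY, ACTIVE, "H04B7/204"),
    (MG, RELAY, ACTIVE, "H04B7/204", "H04B7/208"),
    (MG, RELAY, ACTIVE, "H04B7/204", "H04B7/212"),
    (MG, RELAY, ACTIVE, "H04B7/204", "H04B7/216"),
]


def to_row(path):
    """Pad a path with None up to the 4dot level and add the Self field."""
    padded = path + (None,) * (len(LEVELS) - len(path))
    row = dict(zip(LEVELS, padded))
    row["Self"] = path[-1]
    return row


TABLE = [to_row(p) for p in PATHS]


def self_and_descendants(code, table=TABLE):
    # Step 1: find the row whose Self field holds the given code
    own = next(r for r in table if r["Self"] == code)
    # Step 2: identify the level field of that row that holds the same code
    level = next(f for f in LEVELS if own[f] == code)
    # Step 3: collect Self from every row that holds the code in that level field
    return [r["Self"] for r in table if r[level] == code]


result = self_and_descendants("H04B7/15")
```

`len(result)` gives the count including the code itself, and `len(result) - 1` the count of lower codes only.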

To extract only the direct lower patent classification codes, step 3 is restricted to the rows in which the 3dot field is not null (a lower code exists at that level) and the 4dot field is null. This yields H04B7/155, H04B7/185, and H04B7/204. When the direct lower patent classification codes are stored for each specific patent classification code, the direct lower patent classification code set DB can be generated.
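A sketch of this restricted lookup follows, generalised so that "3dot not null and 4dot null" becomes "next level not null and the level after it null" for a code at any level. The table holds only the rows under H04B7/15, and the helper names are hypothetical.

```python
LEVELS = ("MG", "1dot", "2dot", "3dot", "4dot")


def row(*path):
    """Build a Table 2 style row from the non-empty level values."""
    padded = path + (None,) * (len(LEVELS) - len(path))
    return dict(zip(LEVELS, padded), Self=path[-1])


BASE = ("H04B7/00", "H04B7/14", "H04B7/15")
TABLE = [
    row(*BASE),
    row(*BASE, "H04B7/155"),
    row(*BASE, "H04B7/155", "H04B7/165"),
    row(*BASE, "H04B7/155", "H04B7/17"),
    row(*BASE, "H04B7/185"),
    row(*BASE, "H04B7/185", "H04B7/19"),
    row(*BASE, "H04B7/204"),
    row(*BASE, "H04B7/204", "H04B7/208"),
]


def direct_children(code, table=TABLE):
    """Codes exactly one level below `code`."""
    own = next(r for r in table if r["Self"] == code)
    i = next(n for n, f in enumerate(LEVELS) if own[f] == code)
    if i + 1 >= len(LEVELS):
        return []  # deepest level in this table: nothing below
    child = LEVELS[i + 1]
    grandchild = LEVELS[i + 2] if i + 2 < len(LEVELS) else None
    return [
        r["Self"]
        for r in table
        if r[LEVELS[i]] == code
        and r[child] is not None
        and (grandchild is None or r[grandchild] is None)
    ]


children = direct_children("H04B7/15")
```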

The number of direct lower patent classification codes (here 3) may also be counted, preferably by the counting preprocessing module of the present invention. Direct lower patent classification codes are needed for step-by-step expansion. Step-by-step expansion means that, when the IPC directory structure is displayed, it is unfolded one level at a time (section, class, subclass, main group, 1-dot subgroup, 2-dot subgroup, 3-dot subgroup, 4-dot subgroup, ... n-dot subgroup) and only the patent classification codes of the level being unfolded are shown. Even so, the values displayed at each level (search results, statistical values, calculated values, analysis values, and all other values) preferably take into account both the specific patent classification code and all of its lower patent classification codes. For example, when H04B7/15 is expanded and H04B7/155, H04B7/185, and H04B7/204 immediately below it are shown, the value displayed for H04B7/155 should include the values for H04B7/155 itself and for its descendants H04B7/165 and H04B7/17. Of course, in special cases (such as at the user's choice), the value for only the specific patent classification code may be displayed.

The table structure described above (that is, the table used in the three steps above to find lower patent classification codes) may be used in the search and analysis described below. In particular, when values for each patent classification code are calculated in advance according to predetermined rules for convenience of analysis, a view, a materialized view (a table of calculation results for multidimensional analysis), a cube, or the like can be used. The table can be used when these calculations apply a roll-up operation (an operation that treats the value for an item itself plus the values for its subordinate items as the item's own value). For example, a roll-up operation generates a quarterly value by adding the values of the months that make up the quarter, and a yearly value by adding the quarterly values. When a value (for example, the number of applications per year) is calculated for a particular patent classification code, the value for that code can be generated by adding the value for the code itself and the values for its lower patent classification codes. Details are given in the analysis section of the present invention.
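A roll-up over the classification hierarchy can be sketched as follows. The application counts are made-up numbers used only to show the arithmetic, and `roll_up` is a hypothetical helper rather than a module of the described system.

```python
# child -> parent for part of the H04B 7/15 branch
PARENT = {
    "H04B7/155": "H04B7/15",
    "H04B7/165": "H04B7/155",
    "H04B7/17": "H04B7/155",
}

# Made-up number of applications assigned directly to each code
OWN_COUNTS = {"H04B7/15": 5, "H04B7/155": 3, "H04B7/165": 2, "H04B7/17": 1}


def roll_up(own_counts, parent):
    """Return code -> own value plus the values of all codes below it."""
    totals = dict(own_counts)
    for code, value in own_counts.items():
        ancestor = parent.get(code)
        while ancestor is not None:
            totals[ancestor] = totals.get(ancestor, 0) + value
            ancestor = parent.get(ancestor)
    return totals


TOTALS = roll_up(OWN_COUNTS, PARENT)
# H04B7/155: 3 + 2 + 1 = 6; H04B7/15: 5 + 3 + 2 + 1 = 11
```

In a database setting the same totals could be precomputed and stored, for example in a materialized view, so that directory display does not have to sum the subtree on every request.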

By processing all IPC patent classification codes in the same way, a table of the above format can be generated for all IPC patent classification codes. From these tables and the title information in each language mentioned above, the source IPC patent classification code DB can be generated as part of the source patent classification code DB. The same kind of table can be generated in the same manner for USPC, FT, FI, ECLA, and so on.

Example using US class 002 (Apparel)

This will be described again using the USPC as an example.

The following is the first part of the subclass schedule of US class 002 (Apparel), copied from http://www.uspto.gov/web/patents/classification/uspc002/sched002.htm . The class number 002 is omitted from the numbers below, but every number should be read as being preceded by 002.

1 MISCELLANEOUS

455 GUARD OR PROTECTOR

456 .Body cover

457 ..Hazardous material body cover

458 ..Thermal body cover

2.11 ..Astronaut's body cover

2.12 ...Having relatively rotatable coaxial coupling component

2.13 ...Having convoluted component

2.14 ..Aviator's body cover

2.15 ..Underwater diver's body cover

2.16 ...Having an insulation layer

2.17 ...Having a garment closure

459 .Shoulder protector

Such information is available from the USPTO and may also be obtained by other means. For reference, the Index to U.S. Patent Classification file expresses the same entries as follows. (The first column is the USPC code itself, the second column is the depth/level indication (0 for the class, 1 for no dot, 2 for one dot, and so on), the third column is the serial number, the fourth column is the parent node, and the last column is the title.)

002000000 0 1 APPAREL

002001000 1 2 002000000 MISCELLANEOUS

002455000 1 3 002000000 GUARD OR PROTECTOR

002456000 2 4 002455000 Body cover

002457000 3 5 002456000 Hazardous material body cover

002458000 3 6 002456000 Thermal body cover

002002110 3 7 002456000 Astronaut's body cover

002002120 4 8 002002110 Having relatively rotatable coaxial coupling component

002002130 4 9 002002110 Having convoluted component

002002140 3 10 002456000 Aviator's body cover

002002150 3 11 002456000 Underwater diver's body cover

002002160 4 12 002002150 Having an insulation layer

002002170 4 13 002002150 Having a garment closure

002459000 2 14 002455000 Shoulder protector

Given such USPC information, a person skilled in the art can naturally generate a multi-level hierarchical structure of USPC patent classification codes, as shown in Table 3 below, based on the dot information included in the title information of each USPC patent classification code or on the parent information. The method of generation is the same as for the IPC. Of course, in addition to the table containing title information in English, there may also be tables in which the title information is translated into each language. Table 3 also shows that a super category can be created above the class level of the USPC. Classes are the official top level of the USPC, but there are hundreds of them, so a super category that groups these classes can be created when needed. In Table 3 below, every code of class 002 belongs to the super category M1.

TABLE 3

| USPC_ID | Super category | Class | 0 dot | 1 dot | 2 dot | 3 dot | Self |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2 | M1 | 002000000 | | | | | 002000000 |
| 3 | M1 | 002000000 | 002455000 | | | | 002455000 |
| 4 | M1 | 002000000 | 002455000 | 002456000 | 002002110 | | 002002110 |
| 5 | M1 | 002000000 | 002455000 | 002456000 | 002002110 | 002002120 | 002002120 |
| 6 | M1 | 002000000 | 002455000 | 002456000 | 002002110 | 002002130 | 002002130 |
| 7 | M1 | 002000000 | 002455000 | 002456000 | 002002140 | | 002002140 |
| 8 | M1 | 002000000 | 002455000 | 002456000 | 002002150 | | 002002150 |
| 9 | M1 | 002000000 | 002455000 | 002456000 | 002002150 | 002002160 | 002002160 |
| 10 | M1 | 002000000 | 002455000 | 002456000 | 002002150 | 002002170 | 002002170 |
| 436 | M1 | 002000000 | 002455000 | 002456000 | | | 002456000 |
| 437 | M1 | 002000000 | 002455000 | 002456000 | 002457000 | | 002457000 |
| 438 | M1 | 002000000 | 002455000 | 002456000 | 002458000 | | 002458000 |
| 439 | M1 | 002000000 | 002455000 | 002459000 | | | 002459000 |

From tables for the USPC such as Table 3, 1) the direct subordinate patent classification codes, 2) all subordinate patent classification codes, 3) the number of direct subordinate patent classification codes, and 4) the number of all subordinate patent classification codes of a specific USPC patent classification code can be determined, as in the case of the IPC described above. The method of generating the source USPC patent classification code DB including the table, and the ways of utilizing the table, are also the same as for the IPC.
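As a rough illustration of how rows like those of Table 3 could be derived from the parent column of the index records quoted above, the following Python sketch builds, for each code, the chain of codes from the class down to the code itself. The function names `parse_record` and `ancestor_paths` are hypothetical, and the five-column layout is assumed to be exactly as quoted in this description.

```python
# Records in the layout quoted above:
# code, depth, serial number, parent code (absent for the class), title.
INDEX_LINES = """\
002000000 0 1 APPAREL
002001000 1 2 002000000 MISCELLANEOUS
002455000 1 3 002000000 GUARD OR PROTECTOR
002456000 2 4 002455000 Body cover
002457000 3 5 002456000 Hazardous material body cover
002002110 3 7 002456000 Astronaut's body cover
002002120 4 8 002002110 Having relatively rotatable coaxial coupling component
""".splitlines()


def parse_record(line):
    """Split one index record into (code, depth, parent, title)."""
    code, depth, _serial, rest = line.split(maxsplit=3)
    depth = int(depth)
    if depth == 0:
        # The class record has no parent column.
        parent, title = None, rest
    else:
        parent, title = rest.split(maxsplit=1)
    return code, depth, parent, title


def ancestor_paths(lines):
    """Map each code to the list [class, ..., parent, code]."""
    parent = {}
    for line in lines:
        code, _depth, par, _title = parse_record(line)
        parent[code] = par

    paths = {}
    for code in parent:
        chain, node = [], code
        while node is not None:
            chain.append(node)
            node = parent[node]
        paths[code] = chain[::-1]
    return paths


for code, path in ancestor_paths(INDEX_LINES).items():
    print(code, path)
```

For 002002120 the resulting chain is 002000000, 002455000, 002456000, 002002110, 002002120, which corresponds to the Class, 0 dot, 1 dot, 2 dot, and 3 dot columns of the row with USPC_ID 5 in Table 3.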

The description above covered the IPC and the USPC, and it will be apparent to those skilled in the art that the same method can be applied equally to other patent classification systems that have a hierarchical tree structure, such as FT, FI, and ECLA.

Hierarchical representation between patent classification symbols (how to create a child-parent relationship)

For each patent classification code other than the highest-level codes (for example, section-level codes in the case of the IPC and class-level codes in the case of the USPC), the patent classification code master database generation module either maps the code to the parent patent classification code to which it directly belongs (1:1), or maps the code to the subordinate patent classification codes immediately below it (1:n). Both are computationally feasible, but storing the 1:1 relationship from each child to its single parent is the more reasonable approach. Through these correspondences, all patent classification codes are related to one another in a tree structure.

The former maps each child to its parent, and the latter maps each parent to the children that belong to it. Either can be used to find the children of a given parent: with the former, the children are found by looking up the codes that point to the given parent, and with the latter, the children can be read directly because they are stored with the parent. In short, the former stores the child-parent relationship 1:1 from the child's side, and the latter stores the immediate children of each parent from the parent's side.

A method by which the hierarchical patent classification code database generation module generates the hierarchical patent classification code database is described below, using the IPC as an example.

1) From the section down to the main group, the hierarchical patent classification code database is created according to the classification code system itself, and for each code either its parent information or its child information is stored.

2) For each main group patent classification code, all patent classification codes up to the next main group patent classification code are examined, and those whose title has one dot are found. A main group patent classification code is usually easy to recognize because 00 follows the slash (/). In the child->parent case (where each child points to its only parent), each 1-dot subgroup code stores the main group code, which is its only immediate parent. In the parent->child case (where all children are stored under the parent), the 1-dot subgroup codes belonging to the main group are stored in association with the main group code.

At this time, for the main group itself and for each of its subordinate patent classification codes, either the parent information or the child information is stored.

3) For each patent classification code with one dot in its title, codes with two dots in the title are searched for among all codes that belong to the same main group and lie after that one-dot code and before the next one-dot code. If no further one-dot code exists in the same main group (that is, the main group has only one 1-dot subgroup, or the code is the last of the 1-dot subgroups), codes with two dots are searched for among all codes between that one-dot code and the next main group. The parent and child information is stored in the same way as in 2) above.

4) For each patent classification code with two dots in its title, codes with three dots in the title are found by the same logic as in 3) above and processed in the same manner.

Another embodiment of the hierarchical patent classification code database generation module generates the hierarchical patent classification code database as follows.

With the IPC codes arranged in a table in order (by record number), the parent of each code can be found by moving upward from the bottom of the table, as follows. 1) The number of dots in the title of the bottom patent classification code is checked, and then the number of dots in the title of the code whose record number is its own record number minus 1 (the IPC code just above it) is checked. If that code has exactly one dot fewer, it is the parent. If the number of dots is the same, that code is at the same level. If a code at the main group level or above is encountered, that code is stored as the parent.

2) If the number of dots in the title of the code at its own record number minus 1 is equal to or greater than its own, that code is not the parent; it is skipped, and the same check is repeated for the code whose record number is one less again.

The two embodiments above present methods of generating a hierarchical patent classification code database. These are only examples, and generating the parent information or child information of every IPC code will be readily achievable by those skilled in the art of developing computing systems.
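The second embodiment can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the rows are a subset of the H04B 7/00 schedule, the dot counts are inferred from the parent relationships shown in Table 4 (0 marks the main group), and the function name `parents_by_upward_scan` is hypothetical.

```python
# (code, number of dots in the title) in schedule order.
ROWS = [
    ("H04B 7/00", 0),
    ("H04B 7/02", 1),
    ("H04B 7/04", 2),
    ("H04B 7/06", 3),
    ("H04B 7/08", 3),
    ("H04B 7/10", 2),
    ("H04B 7/14", 1),
    ("H04B 7/15", 2),
    ("H04B 7/155", 3),
]


def parents_by_upward_scan(rows):
    """For each subgroup, scan upward to the nearest record with fewer dots."""
    parent = {}
    for i in range(len(rows) - 1, -1, -1):  # start at the bottom of the table
        code, dots = rows[i]
        if dots == 0:
            continue  # main group: its parent comes from the fixed upper levels
        j = i - 1
        # Records with the same or a larger number of dots are siblings or
        # descendants of siblings, so they are skipped.
        while j >= 0 and rows[j][1] >= dots:
            j -= 1
        if j >= 0:
            parent[code] = rows[j][0]
    return parent


for child, par in parents_by_upward_scan(ROWS).items():
    print(child, "->", par)
```

In a well-formed schedule the first record found with fewer dots has exactly one dot fewer, so the scan yields the immediate parent; for example, H04B 7/10 is mapped to H04B 7/02 after H04B 7/08, H04B 7/06, and H04B 7/04 are skipped.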

The hierarchical patent classification code database generation module generates a hierarchical patent classification code database from given patent classification codes; one embodiment is as follows.

Example using H04B 7/00

For the patent classification codes below H04B 7/00, the results of implementing both methods are presented.

First, each child patent classification code is matched with its parent, the code immediately above it to which it belongs (child->parent). An example of the result is shown in Table 4 below.

TABLE 4

| Child | Parent (its parent) |
| --- | --- |
| H04B 7/005 | H04B 7/00 |
| H04B 7/01 | H04B 7/00 |
| H04B 7/015 | H04B 7/00 |
| H04B 7/02 | H04B 7/00 |
| H04B 7/04 | H04B 7/02 |
| H04B 7/06 | H04B 7/04 |
| H04B 7/08 | H04B 7/04 |
| H04B 7/10 | H04B 7/02 |
| H04B 7/12 | H04B 7/02 |
| H04B 7/14 | H04B 7/00 |
| H04B 7/145 | H04B 7/14 |
| H04B 7/15 | H04B 7/14 |
| H04B 7/155 | H04B 7/15 |
| H04B 7/165 | H04B 7/155 |
| H04B 7/17 | H04B 7/155 |
| H04B 7/185 | H04B 7/15 |
| H04B 7/19 | H04B 7/185 |
| H04B 7/195 | H04B 7/185 |
| H04B 7/204 | H04B 7/15 |
| H04B 7/208 | H04B 7/204 |
| H04B 7/212 | H04B 7/204 |
| H04B 7/216 | H04B 7/204 |
| H04B 7/22 | H04B 7/00 |
| H04B 7/24 | H04B 7/00 |
| H04B 7/26 | H04B 7/24 |

Next, Table 5 shows an example of the result of the method in which each code holds all of its own immediate children (parent->child).

TABLE 5

| Parent | Child (own children) |
| --- | --- |
| H04B 7/00 | H04B 7/005, H04B 7/01, H04B 7/015, H04B 7/02, H04B 7/14, H04B 7/22, H04B 7/24 |
| H04B 7/005 | null |
| H04B 7/01 | null |
| H04B 7/015 | null |
| H04B 7/02 | H04B 7/04, H04B 7/10, H04B 7/12 |
| H04B 7/04 | H04B 7/06, H04B 7/08 |
| H04B 7/06 | null |
| H04B 7/08 | null |
| H04B 7/10 | null |
| H04B 7/12 | null |
| H04B 7/14 | H04B 7/145, H04B 7/15 |
| H04B 7/145 | null |
| H04B 7/15 | H04B 7/155, H04B 7/185, H04B 7/204 |
| H04B 7/155 | H04B 7/165, H04B 7/17 |
| H04B 7/165 | null |
| H04B 7/17 | null |
| H04B 7/185 | H04B 7/19, H04B 7/195 |
| H04B 7/19 | null |
| H04B 7/195 | null |
| H04B 7/204 | H04B 7/208, H04B 7/212, H04B 7/216 |
| H04B 7/208 | null |
| H04B 7/212 | null |
| H04B 7/216 | null |
| H04B 7/22 | null |
| H04B 7/24 | H04B 7/26 |
| H04B 7/26 | null |
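The two representations carry the same information, and one can be derived from the other. As a minimal Python sketch (the function name `to_parent_children` is hypothetical), a child->parent mapping in the style of Table 4 can be turned into a parent->child mapping in the style of Table 5, with an empty list standing in for null:

```python
# A few child -> parent pairs from Table 4, in schedule order.
CHILD_TO_PARENT = {
    "H04B 7/02": "H04B 7/00",
    "H04B 7/04": "H04B 7/02",
    "H04B 7/06": "H04B 7/04",
    "H04B 7/08": "H04B 7/04",
    "H04B 7/10": "H04B 7/02",
    "H04B 7/12": "H04B 7/02",
}


def to_parent_children(child_to_parent):
    """Invert a child -> parent mapping into parent -> [immediate children]."""
    children = {}
    for child, parent in child_to_parent.items():
        children.setdefault(parent, []).append(child)
        children.setdefault(child, [])  # leaves keep an empty list (null in Table 5)
    return children


for parent, kids in to_parent_children(CHILD_TO_PARENT).items():
    print(parent, kids)
```

Because Python dictionaries preserve insertion order, the children of each parent come out in schedule order as long as the input pairs are listed in schedule order.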

Directory Creation Module (405)

The patent classification code information may be arranged as a directory, and the user may systematically reach the subordinate patent classification codes of a specific patent classification code by browsing. Currently, www.delphion.com provides such a service. However, although that site provides the number of direct subordinate patent classification codes, it 1) does not provide the total number of all subordinate patent classification codes, 2) does not provide the number of patent documents for all subordinate patent classification codes, and 3) does not provide, when clicked, the patent documents that correspond to that number and contain any of the subordinate patent classification codes. The present invention can solve these problems through a hierarchical patent classification code database and a modified patent classification code search method.

The directory generation module 405 of the present invention provides the subordinate patent classification codes for each patent classification code, and may display counting values next to the patent classification code and each subordinate patent classification code. The counting value may be any one or more selected from 1) the number of all subordinate classifications, 2) the number of direct subordinate classifications, 3) the depth of classification, 4) the number of patent documents corresponding to all subordinate classifications, and 5) the number of patent documents corresponding to the direct subordinate classifications; preferably 1), 2), and 5) are displayed together. The directory generation module 405 can read at least one of the counting values 1) to 5) from at least one of i) the counting preprocessing module of the present invention, ii) the result table of operations for multi-dimensional analysis used in the total analysis described later, iii) the patent classification code tree table, or iv) the total high-level patent classification code table, and display it.

When the patent classification code information is arranged as a directory, a marker such as a color or an icon may be displayed for each directory to improve accessibility for the user. That is, different colors or preset icons may be assigned according to the number of direct subordinate classifications, the number of all subordinate classifications, the depth of classification, the number of patent documents, and so on. In particular, a patent classification code directory with a large number of subordinate classifications or a large classification depth is often more important than one without, because this indicates that many patents have been filed in that area and that diverse technologies are applied there.

FIG. 65 shows an embodiment of the directory generation module 405 of the present invention. As shown in FIG. 65, one or more patent classification codes may be checked at any level of the directory, and a search by country or a patent analysis based on the patent classification codes can be performed for the codes selected with the check boxes.

The directory generation module 405 of the present invention includes a patent classification code directory generation module 405-1, which generates a directory for each type of patent classification code according to its multi-level classification system, and a thematic directory generation module 405-2, which generates multi-level directories for the various analysis themes introduced in this specification or for other definable themes.

Lower layer patent classification code processing module

The lower layer patent classification code processing module takes the patent classification code obtained from a search expression input by a user and finds, through the hierarchical patent classification code database, the lower layer patent classification codes that depend on it.

For patent documents corresponding to the lower patent classification codes to be included in the search result, at least one of the following methods is required, and the lower layer patent classification code processing module performs it.

First, all lower layer patent classification codes of the obtained patent classification code are found in the hierarchical patent classification code database at the time the code is obtained. This is possible because the patent classification codes in the hierarchical patent classification code database form a tree structure. If the received search expression includes a specific patent classification code, all lower layer patent classification codes of that code can be found by following the tree structure, and the result is transmitted to the modified search expression generation module. The modified search expression generation module may then generate a new search expression reflecting all the received patent classification codes and query the search engine with it.

Second, the lower layer patent classification codes are found in advance for every patent classification code and stored in a matching table. If the obtained search expression contains a specific patent classification code, the lower layer patent classification codes matched to it can be sent immediately to the modified search expression generation module.

The lower layer patent classification codes extracted by the lower layer patent classification code extraction module may cover one of two ranges. One method extracts only the patent classification codes immediately below the obtained patent classification code, and the other extracts all lower patent classification codes that depend on the obtained patent classification code. Even when only the direct subordinate codes are extracted, all subordinate codes of the obtained patent classification code can be obtained by extracting the direct subordinate codes again for each extracted code and repeating this process.

For example, when the patent classification code included in the search expression is H04B 7/15, the direct subordinate patent classification codes are H04B 7/155, H04B 7/185, and H04B 7/204, and all subordinate patent classification codes are H04B 7/155, H04B 7/165, H04B 7/17, H04B 7/185, H04B 7/19, H04B 7/195, H04B 7/204, H04B 7/208, H04B 7/212, and H04B 7/216. Together with the code itself, the full list is H04B 7/15, H04B 7/155, H04B 7/165, H04B 7/17, H04B 7/185, H04B 7/19, H04B 7/195, H04B 7/204, H04B 7/208, H04B 7/212, and H04B 7/216.

It will be apparent to those skilled in the art that the direct subordinate patent classification codes and all lower patent classification codes described above can easily be extracted, directly or recursively, from either the child->parent correspondence or the parent->child correspondence.

The lower layer patent classification code extracting module combines the extracted lower layer patent classification codes and stores them. When the patent classification code included in the search expression is H04B 7/15, "H04B 7/15, H04B 7/155, H04B 7/185, H04B 7/204" is stored if only the direct subordinate patent classification codes are used, and "H04B 7/15, H04B 7/155, H04B 7/165, H04B 7/17, H04B 7/185, H04B 7/19, H04B 7/195, H04B 7/204, H04B 7/208, H04B 7/212, H04B 7/216" is stored if all lower patent classification codes are used.
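The two extraction ranges can be sketched in Python from a child->parent correspondence such as Table 4. The function names `direct_children` and `all_descendants` are hypothetical; this is only one straightforward way to do the recursion, not necessarily how the module is implemented.

```python
# Child -> parent pairs for the H04B 7/15 branch of Table 4, in schedule order.
CHILD_TO_PARENT = {
    "H04B 7/155": "H04B 7/15",
    "H04B 7/165": "H04B 7/155",
    "H04B 7/17": "H04B 7/155",
    "H04B 7/185": "H04B 7/15",
    "H04B 7/19": "H04B 7/185",
    "H04B 7/195": "H04B 7/185",
    "H04B 7/204": "H04B 7/15",
    "H04B 7/208": "H04B 7/204",
    "H04B 7/212": "H04B 7/204",
    "H04B 7/216": "H04B 7/204",
}


def direct_children(code, child_to_parent):
    """Codes whose stored parent is the given code."""
    return [c for c, p in child_to_parent.items() if p == code]


def all_descendants(code, child_to_parent):
    """Repeat the direct-children lookup for every extracted child."""
    found = []
    for child in direct_children(code, child_to_parent):
        found.append(child)
        found.extend(all_descendants(child, child_to_parent))
    return found


code = "H04B 7/15"
print([code] + direct_children(code, CHILD_TO_PARENT))
print([code] + all_descendants(code, CHILD_TO_PARENT))
```

The first list is the combined result when only direct subordinate codes are used, and the second is the combined result when all lower codes are used.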

Modified Search Expression Generation Module

The modified search expression generation module generates the modified search expression by integrating the combined patent classification codes into the original search expression. The integration differs slightly between range search and other cases. The non-range case is described first.

In the existing search expression, the patent classification code and its extracted lower layer patent classification codes are grouped in an OR relationship, and this group replaces the patent classification code at its position in the original search expression. If the original search expression contains two or more patent classification codes, the same integration is naturally performed for each of them.

For example, if the given search expression is "{Keyword = Wireless and Active} and {Applicant = Samsung Electronics} and {IPC = H04B 7/15}", the modified search expression generation module takes IPC H04B 7/15 from the given search expression, obtains its direct or all lower patent classification codes, and integrates them, generating either "{Keyword = Wireless and Active} and {Applicant = Samsung Electronics} and {IPC = H04B 7/15 OR H04B 7/155 OR H04B 7/185 OR H04B 7/204}" or "{Keyword = Wireless and Active} and {Applicant = Samsung Electronics} and {IPC = H04B 7/15 OR H04B 7/155 OR H04B 7/165 OR H04B 7/17 OR H04B 7/185 OR H04B 7/19 OR H04B 7/195 OR H04B 7/204 OR H04B 7/208 OR H04B 7/212 OR H04B 7/216}". The latter search expression is preferable.
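The substitution can be sketched in Python as follows. The clause syntax `{IPC = ...}` is taken from the example above; the function name `expand_ipc_clause` and the lookup table of lower codes are hypothetical, and a real search expression grammar may need a proper parser rather than a regular expression.

```python
import re

# Hypothetical lookup of lower layer codes (direct children only in this sketch).
LOWER_CODES = {
    "H04B 7/15": ["H04B 7/155", "H04B 7/185", "H04B 7/204"],
}


def expand_ipc_clause(expression, lower_codes):
    """Replace each {IPC = code} clause with the code OR-ed with its lower codes."""

    def replace(match):
        code = match.group(1).strip()
        codes = [code] + lower_codes.get(code, [])
        return "{IPC = " + " OR ".join(codes) + "}"

    return re.sub(r"\{IPC = ([^{}]+)\}", replace, expression)


original = (
    "{Keyword = Wireless and Active} and "
    "{Applicant = Samsung Electronics} and {IPC = H04B 7/15}"
)
print(expand_ipc_clause(original, LOWER_CODES))
```

Because the replacement is applied to every matching clause, an expression with two or more IPC clauses is expanded clause by clause.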

To summarize the role of the lower layer patent classification code processing module: when the obtained search expression contains a patent classification code, the module finds the information on the lower layer patent classification codes of that code and passes it to the modified search expression generation module. Of course, if the patent classification code in the obtained search expression can be handled with a truncation operator, as for IPC codes at the main group level or above, the module does not need to find the lower patent classification codes. (For example, when only H04B 7/00 or H04B is obtained, it can be processed as H04B 7/* or H04B*.) Therefore, in the case of the IPC, the lower layer patent classification code processing module is most useful for subgroups, that is, when the title information contains dots. It can also be useful in other classification systems in which the position in the hierarchy is determined by the number of dots.

The search engine receives a search expression including a patent classification code from the user's computer. When the title information of the patent classification code in the obtained search expression contains a dot, the patent classification code preprocessing engine included in the search engine obtains the lower patent classification codes of that code in the classification system, joins the obtained codes with OR, generates a modified search expression containing them, and performs the search with the modified search expression. Each step is as described above.

If an insertion, deletion, or relocation occurs in the patent classification system, an insertion can be treated like adding a new word at a specific place in a dictionary, a deletion like removing a word, and a relocation as a combination of insertion and deletion. Of course, whenever there is an insertion, deletion, or relocation, the entire database can instead be updated or regenerated for the whole changed classification system.

Range search

Introduction to the range search concept

The description above focused on generating a modified search expression by joining the lower patent classification codes with OR. Conventional search engines, however, also support range search. Next, a method of processing patent classification codes that supports range search is described, as an alternative to finding the lower patent classification codes and grouping them with OR. This method does not find each lower layer patent classification code of the obtained code individually; instead, it finds the range that the lower layer codes occupy in the patent classification code system. Range search gives a much higher search speed, particularly for data arranged in dictionary order.

The range of the lower layer patent classification codes in the classification system is used because conventional search engines support range search, which is much faster than looking up each code within the range. For example, for H04B 7/15, searching the range from H04B 7/15 to H04B 7/216 returns results much faster than querying the search engine with H04B 7/15, H04B 7/155, H04B 7/165, H04B 7/17, H04B 7/185, H04B 7/19, H04B 7/195, H04B 7/204, H04B 7/208, H04B 7/212, and H04B 7/216 all joined in an OR relationship. The more lower layer patent classification codes there are, the greater the difference in search speed.

Need for a modified patent classification code as a precondition

However, a typical precondition for range search is that the patent classification codes be arranged in alphabetical order in the search index that the search engine uses. If the patent classification codes are already in alphabetical order, they can be put into the search index as they are. Otherwise, a separate classification system in which the codes are arranged in alphabetical order must be created and put into the search index. The former is called the direct input type and the latter the modified input type. The representative direct input type patent classification system is the IPC, and the representative system that must be modified before input is the USPC. Of course, the IPC can also be converted to the modified input type with a slight modification. The separate classification system is called a modified patent classification code system.

FIG. 91 shows a patent classification code preprocessing engine for processing modified patent classification codes. For at least one patent classification code system selected from among IPC, USPC, FI, FT, and ECLA, it does not use the codes of the classification system as they are but uses modified patent classification codes to support range search.

The modified patent classification code preprocessing engine of the present embodiment generates the modified patent classification code system, processes the lower layer patent classification codes using it, and modifies the obtained search expression to generate a modified search expression. It includes a modified patent classification code database generation module, a lower layer patent classification code processing module that uses the modified patent classification codes, and a modified search expression generation module that uses the modified patent classification codes.

For IPC

To support range search more smoothly, even a direct input type system can be converted to the modified input type using a matching table. For example, IPC codes such as H04B 7/005, H04B 7/01, and H04B 7/185 may be modified to H04B70000050, H04B70000100, and H04B70000185, respectively. The modification rule may vary, but in the present embodiment four digits are allocated to the group and four digits to the subgroup. (The part of the code at the class level is preferably kept as it is.) Of course, when there is an extended symbol below the subgroup, as in FI, the modified input patent classification code can be generated by allocating corresponding digits. Thus, for a direct input type patent classification code system, the matching table for the modified patent classification code system can be generated with a very simple rule.
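As an illustration of why a fixed-width modified code helps, the following Python sketch builds a sortable key from an IPC code. The rule used here is an assumption made for this sketch only: the main group is left-padded to four digits and the subgroup is right-padded to four digits, so the digit layout is not claimed to reproduce the examples given above. The function name `sortable_ipc` is hypothetical.

```python
def sortable_ipc(code):
    """Build a fixed-width key whose string order follows the IPC schedule order.

    Assumed rule (illustrative only): keep the subclass as is, left-pad the
    main group to 4 digits, and right-pad the subgroup to 4 digits so that
    subgroup digits compare like a decimal fraction (005 < 01 < 015 < 02).
    """
    subclass, rest = code.split()
    group, subgroup = rest.split("/")
    return subclass + group.zfill(4) + subgroup.ljust(4, "0")


# Schedule order of some H04B 7/00 codes, as listed in Table 4.
SCHEDULE = [
    "H04B 7/00", "H04B 7/005", "H04B 7/01", "H04B 7/015", "H04B 7/02",
    "H04B 7/04", "H04B 7/12", "H04B 7/14", "H04B 7/145", "H04B 7/15",
    "H04B 7/155", "H04B 7/17", "H04B 7/185", "H04B 7/19", "H04B 7/204",
    "H04B 7/216", "H04B 7/22", "H04B 7/26",
]

for code in SCHEDULE:
    print(code, "->", sortable_ipc(code))
```

Sorting the codes by this key reproduces the schedule order, which is the property a range search over the index relies on.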

For USPC

On the other hand, a matching table can be generated even for a system that is not of the direct input type, such as the USPC. A method of generating a matching table for the USPC is described later. The modified patent classification code database generation module is responsible for generating the modified patent classification code system as described above, and the generated system is stored in the modified patent classification code database.

The modified patent classification code preprocessing engine assigns the codes of the modified patent classification code system one-to-one to the codes of the given classification system, and helps the indexer build the search index using the assigned modified patent classification codes. When the user inputs a search expression in the given patent classification code system, the modified search expression generation module finds the modified patent classification code corresponding to the input patent classification code and performs the search with it to derive the search result. The modified patent classification code need not appear in the search results.

In the case of IPC, the range search is performed

Hereinafter, a process of performing a range search will be described in more detail. This range search is applied not only to IPC but also to other patent classification codes such as USPC. It is important to find the next sibling on the tree structure of a given patent classification code in order to automatically find the lower patent classification code dependent on the given patent classification code and perform a range search. That is, the range of the patent classification code to be searched is greater than or equal to the given patent classification code and becomes the next sibling. The same applies to the modified patent classification code.

The IPC classification system can be regarded as a representative patent classification system of direct input type when the subgroup notation is treated as a string rather than a number. Thinking like a dictionary, H04B 7/005 is followed by H04B 7/01, H04B 7/02 is followed by H04B 7/12, and H04B 7/185 is followed by H04B 7/19. In this case, if there is H04B 7/02 in the search expression obtained, H04B 7/04, H04B 7/06, H04B 7/08, H04B 7/10 below it in the hierarchical patent taxonomy database containing the tree structure. , Find H04B 7/12. The finding method can be used recursively until the next sibling. An example of this method is described in more detail: when there is a specific patent classification code, it finds its parent in the tree structure, the parent has children, and finds the next child among them. More specifically, the next sibling of H04B 7/02 becomes H04B 7/15. At this time, to find the lower patent classification code of H04B 7/02, first find H04B 7/15 that becomes the next sibling, and then ask all patent classification codes that are larger than H04B 7/02 and smaller than H04B 7/15. .

At this time, the next sibling may lie in a node/layer other than that of the code's own parent. For example, the next sibling of H04B 7/15 (two dots) is H04B 7/22 (one dot), which is not a descendant of H04B 7/14 but a descendant of the parent's parent. In such a case the next sibling under the parent is null. When the next sibling is null and the parent node/layer is not null, the next sibling of the parent node/layer is searched for recursively.
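The next-sibling lookup and the resulting range search can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the function names (`build_tree`, `next_sibling`, `range_search`) are hypothetical, and the entry list is only a subset of H04B 7/00 whose dot depths are taken from the examples in this description.

```python
from typing import Dict, List, Optional, Tuple

# (code, dot depth) in classification order. Subset of H04B 7/00 for
# illustration; the main group itself is given depth 0 here (assumption).
ENTRIES: List[Tuple[str, int]] = [
    ("H04B 7/00", 0), ("H04B 7/005", 1), ("H04B 7/01", 1), ("H04B 7/015", 1),
    ("H04B 7/02", 1), ("H04B 7/14", 1), ("H04B 7/145", 2), ("H04B 7/15", 2),
    ("H04B 7/155", 3), ("H04B 7/185", 3), ("H04B 7/204", 3),
    ("H04B 7/22", 1), ("H04B 7/24", 1),
]


def build_tree(entries):
    """Build parent and children maps from codes listed with dot depths."""
    parent: Dict[str, Optional[str]] = {}
    children: Dict[str, List[str]] = {}
    stack: List[Tuple[str, int]] = []  # ancestors of the current entry
    for code, depth in entries:
        while stack and stack[-1][1] >= depth:
            stack.pop()
        parent[code] = stack[-1][0] if stack else None
        children[code] = []
        if stack:
            children[stack[-1][0]].append(code)
        stack.append((code, depth))
    return parent, children


def next_sibling(code, parent, children) -> Optional[str]:
    """Return the next sibling, climbing to the parent when there is none."""
    p = parent[code]
    if p is None:
        return None
    siblings = children[p]
    i = siblings.index(code)
    if i + 1 < len(siblings):
        return siblings[i + 1]
    return next_sibling(p, parent, children)  # recurse on the parent layer


def range_search(code, entries):
    """Codes >= the given code and < its next sibling, compared as strings."""
    parent, children = build_tree(entries)
    upper = next_sibling(code, parent, children)
    return [c for c, _ in entries if c >= code and (upper is None or c < upper)]


print(range_search("H04B 7/15", ENTRIES))
```

For H04B 7/15 the lookup finds no further sibling under H04B 7/14, climbs one layer, and uses H04B 7/22 as the upper bound of the range.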

How to create a modified patent classification symbol for USPC

Next, the USPC will be described in more detail. The USPC, the USPTO's patent technology classification system, has no dot structure in its symbols, and it is difficult to grasp the hierarchy and the relationships between subclasses from the subclass numbers alone. In other words, unlike the IPC, the USPC is not arranged in alphabetical order and requires more systematic modification. The modification can be based on the USPC patent classification system published by the USPTO (http://www.uspto.gov/web/patents/classification/...) or on the Index to U.S. Patent Classification (also known as the Classification Index File) published by the USPTO at www.uspto.gov as of June 2006. Because the USPC codes as they appear visually are not in alphabetical order as the IPC codes are, modification is necessary.

The modification can be performed by the following two methods.

First, a tree structure is created for the entire USPC based on the title information, which includes the dot structure. The tree should be arranged so that each USPC forms a node and the parent-child relationships between USPCs are reflected. The USPC tree is then traversed depth-first, and each node is assigned a modified patent classification code whose arrangement follows alphabetical order. Naturally, the USPC of each node is matched with its corresponding modified patent classification code and stored in the modified patent classification database.

Second, the Index to U.S. Patent Classification file provided by the US Patent and Trademark Office is used. For each title, this file contains the USPC, the description, the number of dots (depth information relative to the root), and information about the parent. A tree of the entire USPC structure is built from this file, each node is visited using a traversal technique such as depth-first search and assigned a modified patent classification code in alphabetical order, and the modified patent classification code for each title is matched with the corresponding USPC and stored.

The above two examples are merely embodiments. Depending on the modification algorithm, there may be many other methods that use the title information, including the dot structure of the USPC, to map a modified patent classification code to each individual USPC and to arrange the modified patent classification code system in alphabetical order. One such example is to build a table that reflects the title information, as was done for the IPC, and to assign the modified patent classification codes in alphabetical order through that table. It is impossible to describe all such modification methods, and such modifications will be apparent to those skilled in the art, such as computer experts. It will be obvious that a modified patent classification code system produced by a variation of the above-described algorithms is included in the present invention.

Hereinafter, this is described in detail through the following example.

1 MISCELLANEOUS

455 GUARD OR PROTECTOR

456 .Body cover

457 ..Hazardous material body cover

458 ..Thermal body cover

2.11 ..Astronaut's body cover

2.12 ...Having relatively rotatable coaxial coupling component

2.13 ...Having convoluted component

2.14 ..Aviator's body cover

2.15 ..Underwater diver's body cover

2.16 ...Having an insulation layer

2.17 ...Having a garment closure

459 .Shoulder protector

From the dot structure, it can be seen that 457, 458, 2.11, 2.14, and 2.15 are directly below 456 (Body cover), and that 2.12 and 2.13 are directly below 2.11. A table reflecting the dot structure can obviously be made from the above listing; since this was already explained in the description of the IPC, the explanation is omitted for the USPC.

FIG. 93 shows a tree structure of the USPC reflecting the dot structure. FIG. 94 shows that a modified patent classification code may correspond to each node of the USPC tree of FIG. 93. FIG. 95 shows that the modified patent classification codes can form a tree structure that is structurally equivalent to the USPC tree structure. As described above, each USPC can be stored in correspondence with its modified patent classification code. An example of the storage is shown in Table 6 below. Table 6 may be one embodiment of the modified patent classification database of the present invention.

TABLE 6

Subclass before   Dots   Modified patent classification   Title
transformation           code (including class number)
1                 0      002001000                        MISCELLANEOUS
455               0      002002000                        GUARD OR PROTECTOR
456               1      002002100                        Body cover
457               2      002002110                        Hazardous material body cover
458               2      002002120                        Thermal body cover
2.11              2      002002130                        Astronaut's body cover
2.12              3      002002131                        Having relatively rotatable coaxial coupling component
2.13              3      002002132                        Having convoluted component
2.14              2      002002140                        Aviator's body cover
2.15              2      002002150                        Underwater diver's body cover
2.16              3      002002151                        Having an insulation layer
2.17              3      002002152                        Having a garment closure
459               1      002002200                        Shoulder protector
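One way to arrive at the codes of Table 6 is sketched below. It is only an assumed numbering scheme inferred from the table (three digits for the zero-dot position, then one digit per dot level), and the function name `assign_modified_codes` is hypothetical. Visiting the subclasses in schedule order is equivalent to a depth-first traversal of the tree implied by the dots.

```python
# (subclass, number of dots) in schedule order, as listed in the example above.
USPC_CLASS_002 = [
    ("1", 0), ("455", 0), ("456", 1), ("457", 2), ("458", 2), ("2.11", 2),
    ("2.12", 3), ("2.13", 3), ("2.14", 2), ("2.15", 2), ("2.16", 3),
    ("2.17", 3), ("459", 1),
]


def assign_modified_codes(class_no, entries, max_dots=3):
    """Assign fixed-width codes whose string order equals the tree order.

    Assumed layout: class number, a 3-digit counter for zero-dot titles,
    then one digit per dot level (so at most 9 siblings per level here).
    """
    counters = [0] * (max_dots + 1)
    mapping = {}
    for subclass, dots in entries:
        counters[dots] += 1
        for deeper in range(dots + 1, max_dots + 1):
            counters[deeper] = 0  # a new node starts with no children yet
        if counters[0] > 999 or any(c > 9 for c in counters[1:]):
            raise ValueError("counter overflow: widen the code layout")
        tail = "".join(str(c) for c in counters[1:])
        mapping[subclass] = f"{class_no}{counters[0]:03d}{tail}"
    return mapping


codes = assign_modified_codes("002", USPC_CLASS_002)
for subclass, _ in USPC_CLASS_002:
    print(subclass, codes[subclass])
```

Because every code has the same width, sorting the codes as strings reproduces the schedule order, which is what makes a range search possible.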

The Index to U.S. Patent Classification file expresses the same subclasses as follows.

002000000 0 1 APPAREL

002001000 1 2 000000 MISCELLANEOUS

002455000 1 3 000000 GUARD OR PROTECTOR

002456000 2 4 455000 Body cover

002457000 3 5 456000 Hazardous material body cover

002458000 3 6 456000 Thermal body cover

002002110 3 7 456000 Astronaut's body cover

002002120 4 8 002110 Having relatively rotatable coaxial coupling component

002002130 4 9 002110 Having convoluted component

002002140 3 10 456000 Aviator's body cover

002002150 3 11 456000 Underwater diver's body cover

002002160 4 12 002150 Having an insulation layer

002002170 4 13 002150 Having a garment closure

002459000 2 14 455000 Shoulder protector

The Index to U.S. Patent Classification file is read as follows. Of the first nine digits, the first three are the class number and the last six are the subclass number. A subclass that is written on the Internet with a decimal point, such as 2.11, is represented as 002110. The next column is the relative depth from the root. APPAREL, class 002, has depth 0; MISCELLANEOUS, which has no dot, has depth 1 because it lies directly below APPAREL; and Thermal body cover, which has two dots, has depth 3. The column after the depth is the serial number, and the column after that is the number of the parent. That is, the 455000 immediately before the title in "455000 Body cover" indicates that the parent is 455000, i.e., that GUARD OR PROTECTOR is the parent of Body cover. This representation corresponds to the dot structure. As a further example, in "456000 Aviator's body cover", the 456000 indicates that the parent of Aviator's body cover is Body cover.
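The column layout just described can be read into a tree with a few lines of code. The sketch below assumes that each record is a single space-separated line exactly as in the listing above (the real file layout may differ), and the names `RECORD`, `parse_index`, and `descendants` are hypothetical.

```python
import re
from collections import defaultdict

INDEX_TEXT = """\
002000000 0 1 APPAREL
002001000 1 2 000000 MISCELLANEOUS
002455000 1 3 000000 GUARD OR PROTECTOR
002456000 2 4 455000 Body cover
002457000 3 5 456000 Hazardous material body cover
002458000 3 6 456000 Thermal body cover
002002110 3 7 456000 Astronaut's body cover
002002120 4 8 002110 Having relatively rotatable coaxial coupling component
002002130 4 9 002110 Having convoluted component
002002140 3 10 456000 Aviator's body cover
002002150 3 11 456000 Underwater diver's body cover
002002160 4 12 002150 Having an insulation layer
002002170 4 13 002150 Having a garment closure
002459000 2 14 455000 Shoulder protector
"""

# class (3 digits) + subclass (6 digits), depth, serial, optional parent, title
RECORD = re.compile(r"^(\d{3})(\d{6}) (\d+) (\d+) (?:(\d{6}) )?(.+)$")


def parse_index(text):
    """Return (children, titles) keyed by the 9-digit class+subclass code."""
    children = defaultdict(list)
    titles = {}
    for line in text.splitlines():
        match = RECORD.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        cls, sub, _depth, _serial, parent_sub, title = match.groups()
        code = cls + sub
        titles[code] = title
        if parent_sub is not None:  # the class root has no parent column
            children[cls + parent_sub].append(code)
    return children, titles


def descendants(code, children):
    """All lower codes of `code`, in depth-first (pre-order) order."""
    found = []
    for child in children.get(code, []):
        found.append(child)
        found.extend(descendants(child, children))
    return found


children, titles = parse_index(INDEX_TEXT)
for code in descendants("002456000", children):
    print(code, titles[code])
```

The parent column alone is enough to rebuild the hierarchy; the depth column can be used as a consistency check.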

Since the USPC classification symbols are not arranged in alphabetical order, the range search is impossible unless reprocessed.

It is easy to create a tree structure as shown in FIG. 96 by using the parent information and the relative depth information given for each USPC in the Index to U.S. Patent Classification file. It is likewise easy to assign modified patent classification codes to the tree structure of FIG. 8, as shown in FIG. 94, and a tree structure of the modified patent classification codes as shown in FIG. 45 can of course be created from the modified patent classification codes. As described above, each USPC can be stored in correspondence with its modified patent classification code. An example of the storage is shown in Table 7 below. Table 7 may be one embodiment of the modified patent classification database of the present invention.

TABLE 7

Class + subclass   Depth   Parent subclass   Modified patent       Parent in modified     Title
before                     before            classification code   patent classification
transformation             transformation                          code system
002000000          0                         002000000                                    APPAREL
002001000          1       000000            002001000             000000                 MISCELLANEOUS
002455000          1       000000            002002000             000000                 GUARD OR PROTECTOR
002456000          2       455000            002002100             002002000              Body cover
002457000          3       456000            002002110             002002100              Hazardous material body cover
002458000          3       456000            002002120             002002100              Thermal body cover
002002110          3       456000            002002130             002002100              Astronaut's body cover
002002120          4       002110            002002131             002002130              Having relatively rotatable coaxial coupling component
002002130          4       002110            002002132             002002130              Having convoluted component
002002140          3       456000            002002140             002002100              Aviator's body cover
002002150          3       456000            002002150             002002100              Underwater diver's body cover
002002160          4       002150            002002151             002002150              Having an insulation layer
002002170          4       002150            002002152             002002150              Having a garment closure
002459000          2       455000            002002200             002002000              Shoulder protector

Of course, modified patent classification codes in alphabetical order may be mapped to the USPC system, including the given title information, through an algorithm different from the one that produces Table 7, and the resulting matching table between the USPCs and the modified patent classification codes can be stored in the same way.

FIG. 92 is a flowchart illustrating an embodiment in which the hierarchical modified patent classification code database generation module generates the modified patent classification code database.

The modified patent classification code database generation module obtains all patent classification codes of the patent classification code system (S4220), arranges the obtained patent classification codes in a tree structure with reference to the dot information of the titles (S4230), traverses the tree using the depth-first search method while assigning a modified patent classification code to each patent classification code node (S4240), and stores the correspondence between each patent classification code and its modified patent classification code in the modified patent classification code database (S4250).

Next, it is explained how the lower layer patent classification code processing module using modified patent classification codes processes an obtained patent classification code to obtain the information needed to generate a modified search expression based on the range search.

In the above case, if the input search expression is obtained as the USPC "002/456", a search that includes the subclasses must cover 002/456, 002/457, 002/458, 002/2.11, 002/2.12, 002/2.13, 002/2.14, 002/2.15, 002/2.16, and 002/2.17. As can be seen from the tree structure of FIG. 93, the next sibling of 002/456 is 002/459. Therefore, using the corresponding modified patent classification codes in the above table, an instruction to search for all modified patent classification codes greater than or equal to 002002100, which corresponds to 002/456, and smaller than 002002200, which corresponds to the next sibling 002/459, yields a search that includes the subclasses.

More specifically, there are two ways to do this. In the first method, when "002/456" is obtained, 002002100 is found as the corresponding modified patent classification code; then 002/459, the next sibling of 002/456, is found, followed by 002002200, the modified patent classification code corresponding to 002/459; and 002002100 and 002002200 are bound into a range. The range is greater than or equal to 002002100 and less than 002002200. The feature of this method is that the next sibling of the obtained patent classification code is first found in the tree structure before modification, and then the modified patent classification codes corresponding to the obtained patent classification code and to its next sibling are found. Because the modified patent classification code system and the original patent classification code system correspond 1:1 in structure, the next sibling of the patent classification code input by the user can be found first, and the modified patent classification codes corresponding to that patent classification code and to its next sibling can be found afterwards.

The patent search engine receives a search expression including a patent classification code. When the title information of the obtained patent classification code includes a dot, the lower layer patent classification code processing module using modified patent classification codes (or the lower layer patent classification code processing module) finds the next sibling patent classification code of the obtained patent classification code in a tree structure reflecting the hierarchical patent classification code system. The module then obtains, from the modified patent classification code database, the modified patent classification code corresponding to the obtained patent classification code and the next sibling modified patent classification code corresponding to the next sibling patent classification code. Subsequently, the modified search expression generation module using modified patent classification codes generates a modified search expression whose range is bounded by the modified patent classification code and the next sibling modified patent classification code, and the search is performed with the modified search expression through the search engine.

In the second method, when "002/456" is obtained, 002002100 is found as the corresponding modified patent classification code; then 002002200, the next sibling of 002002100, is found in the modified patent classification code system; and 002002100 and 002002200 are bound into a range. The range is greater than or equal to 002002100 and less than 002002200. The feature of this method is that the modified patent classification code corresponding to the obtained patent classification code is found first, and the next sibling of that modified patent classification code is then found in the modified patent classification code system. The second method has the advantage of requiring one fewer lookup than the first method.

Next, the flow of an embodiment of a search method that uses a modified search expression according to the second method will be described.

The search engine receives a search expression including a patent classification code. When the title information of the obtained patent classification code includes a dot, the lower layer patent classification code processing module using modified patent classification codes obtains the modified patent classification code corresponding to the obtained patent classification code from the modified patent classification code database, and then finds the next sibling modified patent classification code of that code in a tree structure reflecting the modified patent classification code system. The modified search expression generation module using modified patent classification codes then generates a modified search expression whose range is bounded by the modified patent classification code and its next sibling modified patent classification code, and the search is performed with the modified search expression through the search engine.

Through the first or second method, the lower layer patent classification code processing module using modified patent classification codes obtains the two modified patent classification codes that form the two ends of the range for performing a range search on the obtained patent classification code.
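The two methods can be compared side by side in a short sketch. The data come from Table 6; the helper names (`build`, `next_sibling`, `range_method_1`, `range_method_2`, `codes_in_range`) are hypothetical, and an upper bound of `None` is used here, as an assumption, to mean an open-ended range when no next sibling exists.

```python
# (USPC subclass, dots, modified code) for class 002, taken from Table 6.
ROWS = [
    ("1", 0, "002001000"), ("455", 0, "002002000"), ("456", 1, "002002100"),
    ("457", 2, "002002110"), ("458", 2, "002002120"), ("2.11", 2, "002002130"),
    ("2.12", 3, "002002131"), ("2.13", 3, "002002132"), ("2.14", 2, "002002140"),
    ("2.15", 2, "002002150"), ("2.16", 3, "002002151"), ("2.17", 3, "002002152"),
    ("459", 1, "002002200"),
]


def build(nodes):
    """Parent/children maps for (key, depth) pairs listed in tree order."""
    parent, children, stack = {}, {None: []}, []
    for key, depth in nodes:
        while stack and stack[-1][1] >= depth:
            stack.pop()
        p = stack[-1][0] if stack else None
        parent[key] = p
        children[p].append(key)
        children[key] = []
        stack.append((key, depth))
    return parent, children


def next_sibling(key, parent, children):
    """Next sibling of key, climbing to ancestors when there is none."""
    while key is not None:
        siblings = children[parent[key]]
        i = siblings.index(key)
        if i + 1 < len(siblings):
            return siblings[i + 1]
        key = parent[key]
    return None


TO_MODIFIED = {uspc: mod for uspc, _, mod in ROWS}
USPC_TREE = build([(uspc, dots) for uspc, dots, _ in ROWS])
MODIFIED_TREE = build([(mod, dots) for _, dots, mod in ROWS])


def range_method_1(uspc):
    """Find the next sibling in the original tree, then map both ends."""
    sibling = next_sibling(uspc, *USPC_TREE)
    return TO_MODIFIED[uspc], (TO_MODIFIED[sibling] if sibling else None)


def range_method_2(uspc):
    """Map the code first, then find the next sibling in the modified tree."""
    modified = TO_MODIFIED[uspc]
    return modified, next_sibling(modified, *MODIFIED_TREE)


def codes_in_range(lower, upper):
    """USPC subclasses whose modified code lies in [lower, upper)."""
    return [u for u, _, m in ROWS if m >= lower and (upper is None or m < upper)]


print(range_method_1("456"), range_method_2("456"))
print(codes_in_range(*range_method_2("456")))
```

Both methods return the same pair of range ends because the two trees are structurally equivalent; the second simply needs one lookup in the mapping table instead of two.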

Of course, the search index should include information on the modified patent classification code corresponding to each USPC. The modified patent classification code may consist only of digits with an arbitrary number of positions, or may include both digits and letters. In the latter case, when something is added to the USPC, a modified patent classification code can be generated so that it has the proper position when arranged in alphabetical order, without reflecting the dot structure. Generating such a position in the alphabetical arrangement is the same known technique as giving a new word its proper alphabetical position when it is added to a dictionary, and will be easily implemented by those skilled in the art. In addition, if further subdivisions are created, the trailing positions of the code can continue to be extended in alphabetical order. For example, when several patent classification codes are added in multiple stages below 002/2.12, modified patent classification codes can continue to be generated by extending the trailing positions, such as 0020021310a and 0020021310b, which come after 002002131 (the modified patent classification code corresponding to 002/2.12) and before 002002132 in alphabetical order.
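The ordering claim in the example above can be checked directly with plain string comparison. The sketch below keeps a sorted list as a stand-in for the search index (an assumption made only for illustration) and inserts the two extended codes with the standard `bisect` module.

```python
import bisect

# Existing modified codes, kept sorted as strings (stand-in for the index).
index = ["002002130", "002002131", "002002132", "002002140"]

# New subdivisions below 002/2.12 get codes that extend 002002131.
for new_code in ("0020021310b", "0020021310a"):
    bisect.insort(index, new_code)


def in_range(sorted_codes, lower, upper):
    """Codes with lower <= code < upper, using binary search on both ends."""
    lo = bisect.bisect_left(sorted_codes, lower)
    hi = bisect.bisect_left(sorted_codes, upper)
    return sorted_codes[lo:hi]


print(index)
print(in_range(index, "002002131", "002002132"))
```

The extended codes fall between 002002131 and 002002132, so a range search on 002/2.12 picks them up without renumbering any existing code.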

As described above, even for a patent classification code system that does not itself have an alphabetical structure, such as the USPC, a range search can be performed by introducing the modified patent classification code. A new modified patent classification code can of course also be matched to a system that already has an alphabetical structure, such as the IPC. Therefore, the present invention can be applied to the FI, ECLA, and FT patent classification codes, which have the characteristics of IPC extensions. That is, a modified patent classification code reflecting the dot structure of each classification code may be generated for the range search in the IPC, FI, FT, ECLA, USPC, and so on, and because the modified patent classification codes are arranged in alphabetical order, they can support the range search.

The search index for the search engine preferably includes information in which the original patent classification code is processed and information in which the modified patent classification code is processed. If the user does not want a search result for the lower layer patent classification code, it is not necessary to use an index on which the modified patent classification code is processed.

The modified search expression generation module and the modified search expression generation module using modified patent classification codes of the present invention have been described above. These modules generate the modified search expression by integrating the obtained modified patent classification codes with the existing search expression. For range searches, the modified search component that is generated differs slightly depending on whether the search engine supports heterogeneous range symbol processing. Heterogeneous range symbol processing refers to handling cases in which the symbols on the two sides of the range are different, such as "greater than or equal to A and less than B". When heterogeneous range symbol processing is supported, the modified search component generated for the obtained patent classification code is "(patent classification codes in the range greater than or equal to the obtained patent classification code and smaller than the next sibling of the obtained patent classification code)". When heterogeneous range symbol processing is not supported, the component "{(the obtained patent classification code) or (patent classification codes in the range greater than the obtained patent classification code and smaller than the next sibling of the obtained patent classification code)}" is generated. Of course, when modified patent classification codes arranged in dictionary order are used, the modified patent classification codes are processed in the same way.

If the obtained search expression is "{Keyword = Wireless and Active} and {Applicant = Samsung Electronics} and {IPC = H04B 7/15}", then with heterogeneous range symbol processing the modified search expression becomes "{Keyword = Wireless and Active} and {Applicant = Samsung Electronics} and {IPC >= H04B 7/15 and IPC < H04B 7/22}". Without heterogeneous range symbol processing, it becomes "{Keyword = Wireless and Active} and {Applicant = Samsung Electronics} and {IPC = H04B 7/15 or (IPC > H04B 7/15 and IPC < H04B 7/22)}".
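The rewriting of the classification component can be sketched as simple string generation. The function names and the exact query syntax below are assumptions that follow the example expression above; a real search engine would have its own query grammar.

```python
def range_component(field, code, next_sibling, heterogeneous_ranges):
    """Build the modified search component for one classification code."""
    if heterogeneous_ranges:
        # Engine accepts different symbols on the two ends (>= and <).
        body = f"{field} >= {code} and {field} < {next_sibling}"
    else:
        # Same kind of symbol on both ends (> and <), so the code itself
        # is added back with an explicit equality test.
        body = (f"{field} = {code} or "
                f"({field} > {code} and {field} < {next_sibling})")
    return "{" + body + "}"


def modify_expression(expression, field, code, next_sibling, heterogeneous_ranges):
    """Replace the '{field = code}' component with its range form."""
    original = "{" + f"{field} = {code}" + "}"
    replacement = range_component(field, code, next_sibling, heterogeneous_ranges)
    return expression.replace(original, replacement)


query = ("{Keyword = Wireless and Active} and "
         "{Applicant = Samsung Electronics} and {IPC = H04B 7/15}")
print(modify_expression(query, "IPC", "H04B 7/15", "H04B 7/22", True))
print(modify_expression(query, "IPC", "H04B 7/15", "H04B 7/22", False))
```

Only the classification component changes; the keyword and applicant components are carried over untouched.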

The core of the patent classification code preprocessing engine of the present invention, as described above, is that when a patent classification code is given, the lower patent classification codes in all lower layers of that code are found automatically and can be reflected in every expression used for search, analysis, monitoring, and other queries.

When the search result is output for the modified search expression, the search result may be displayed with clustering according to the searcher's selection. Clustering is a method of grouping search results and presenting them group by group, and it is applied to the present invention as follows. When there is a clustering window that shows the clustered structure, or when a clustering level can be selected, the search results are presented only for the dot subgroup corresponding to the preset or predetermined clustering level and its lower subgroups. When a lower or higher clustering level is selected (the dot level corresponds to the clustering level), the search results corresponding to the selected clustering level are presented.

For example, if clustering at the 1-dot subgroup level is selected for a search result related to H04B 7/00, the clustering window shows the 1-dot subgroups H04B 7/005, H04B 7/01, H04B 7/015, H04B 7/02, H04B 7/14, H04B 7/22, and H04B 7/24 as subdirectory folders of H04B 7/00, and the search result shows all results corresponding to H04B 7/00. When the searcher then selects H04B 7/14, the subordinate H04B 7/145 and H04B 7/15 appear as subdirectory folders, and only the search results corresponding to H04B 7/14 and its lower patent classification codes are displayed. If the searcher then selects H04B 7/15, the subordinate H04B 7/155, H04B 7/185, and H04B 7/204 are displayed, and only the search results corresponding to H04B 7/15 and its lower patent classification codes are displayed.

For the convenience of the searcher, each clustering item (directory folder name) in the clustering window can display the number of patent classification codes directly below it, the number of all lower patent classification codes, or both. The number of patent documents corresponding to the patent classification code and all of its lower patent classification codes may also be displayed. In that case, both the number of applications and the number of registrations may be displayed, or only one of the two. For example, if the patent classification code is H04B 7/15, the directly subordinate patent classification codes are H04B 7/155, H04B 7/185, and H04B 7/204, so "3" can be displayed; all lower patent classification codes are H04B 7/155, H04B 7/165, H04B 7/17, H04B 7/185, H04B 7/19, H04B 7/195, H04B 7/204, H04B 7/208, H04B 7/212, and H04B 7/216, so "10" can be displayed. It is advisable to display both numbers, because there may be few classifications directly below a code but a very large number of classifications beneath those. This is especially true in emerging technologies, or in high-tech areas where the rate of technical divergence or subdivision is increasing rapidly. If only the number of directly subordinate classifications is displayed, some users may ignore the classification because the small number hides the actual importance of the technical field.
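The two counts shown next to a clustering item can be computed from the dot depths alone. In the sketch below the function name `subclass_counts` is hypothetical, and the dot depths assigned to the fourth-level codes (H04B 7/165, 7/17, 7/19, 7/195, 7/208, 7/212, 7/216) are assumptions made for illustration; only their position below H04B 7/15 is stated in the text.

```python
# (code, dot depth) in classification order for the H04B 7/15 subtree,
# followed by H04B 7/22, which closes the subtree.
ENTRIES = [
    ("H04B 7/15", 2),
    ("H04B 7/155", 3), ("H04B 7/165", 4), ("H04B 7/17", 4),
    ("H04B 7/185", 3), ("H04B 7/19", 4), ("H04B 7/195", 4),
    ("H04B 7/204", 3), ("H04B 7/208", 4), ("H04B 7/212", 4), ("H04B 7/216", 4),
    ("H04B 7/22", 1),
]


def subclass_counts(code, entries):
    """Return (directly subordinate codes, all lower codes) for `code`."""
    codes = [c for c, _ in entries]
    start = codes.index(code)
    depth = entries[start][1]
    direct = total = 0
    for _, d in entries[start + 1:]:
        if d <= depth:
            break  # reached the next sibling or a higher layer
        total += 1
        if d == depth + 1:
            direct += 1
    return direct, total


direct, total = subclass_counts("H04B 7/15", ENTRIES)
print(f"H04B 7/15: {direct} direct, {total} in total")
```

Showing both numbers makes it visible when a code with only three direct children still covers a much larger subtree.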

Family Information Preprocessing Module

A patent family is a collection of documents that are directly related to a specific patent document, either domestically or internationally. Within one country, applications such as 1) a divisional application, 2) a converted application or a dual application, and 3) a domestic priority claim application (and, in the United States, a continuation-in-part application, a reissued patent, and so on) usually form a domestic patent family; the conditions that define the scope of the family may differ slightly from country to country. An international patent family is usually formed, on the basis of a single application, by 1) an international application (PCT), 2) a treaty priority claim application, 3) the applications filed with the patent office of a particular country on the basis of 1) or 2), and the divisional applications, converted or dual applications, and domestic priority claim applications (and, in the United States, continuation-in-part applications, reissued patents, and so on) that derive from those national applications. The configuration of the family information preprocessing module for preprocessing the patent family information of the present invention is illustrated in the drawings. To process the family information, a family information preprocessing engine 3810 that processes the information is required, together with various DBs from which the family information is obtained. Examples of the necessary DBs include the treaty priority information DB 3671, the PAJ DB 3673, the KPA DB 3675, the Inpadoc DB 3677, and other family information DBs 3830. In particular, the Inpadoc DB 3677 plays an important role.

The family information preprocessing module is closely related to the processing and updating of the patent document master DB. The reason is that 1) new patent documents are continuously published, and 2) most patent information is obtained country by country, with new documents continuously issued in each country, so the family information may change at any time according to 1) and/or 2). That is, the family information needs to be updated whenever a document of 1) or 2) is obtained, and this update is performed by the family information preprocessing module of the present invention. The family information can be discovered from the presence of a specific application number in the original application information or the priority information of an existing document or a new document.

FIG. 87 shows how the family information preprocessing engine of the family information preprocessing module processes family information. The family information preprocessing module obtains information on at least one patent document (S3720) and determines whether the patent document information includes family information such as a priority claim number, a divisional application, a converted application, or a continuation-in-part application (S3730). If there is family information, the information of the patent document is stored in the patent DB or the search index in association with the unique document number of the family patent document corresponding to the family information (S3740). Optionally, the number of pieces of family information per country is counted from the obtained patent documents and stored, or, when a count already exists for the unique document number of the family patent document, the count is incremented (S3750).
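The flow S3720 to S3750 can be sketched as follows. The record layout (`number`, `country`, `family_refs`), the document numbers, and the function name `preprocess_family` are all hypothetical and chosen only to illustrate the steps; in-memory dictionaries stand in for the patent DB or search index.

```python
from collections import defaultdict


def preprocess_family(documents):
    """Link each document to the family document it refers to and count
    family members per country (steps S3720 to S3750, simplified)."""
    family_members = defaultdict(set)
    country_counts = defaultdict(lambda: defaultdict(int))
    for doc in documents:                              # S3720: obtain document
        for ref in doc.get("family_refs", ()):         # S3730: family info present?
            if doc["number"] in family_members[ref]:
                continue                               # already linked, do not recount
            family_members[ref].add(doc["number"])     # S3740: store the association
            country_counts[ref][doc["country"]] += 1   # S3750: per-country count
    return (
        {ref: sorted(members) for ref, members in family_members.items()},
        {ref: dict(counts) for ref, counts in country_counts.items()},
    )


# Hypothetical documents: three applications referring to KR-0001 through
# a priority claim or a divisional relationship; US-0002 arrives twice.
DOCS = [
    {"number": "KR-0001", "country": "KR", "family_refs": []},
    {"number": "US-0002", "country": "US", "family_refs": ["KR-0001"]},
    {"number": "JP-0003", "country": "JP", "family_refs": ["KR-0001"]},
    {"number": "KR-0004", "country": "KR", "family_refs": ["KR-0001"]},
    {"number": "US-0002", "country": "US", "family_refs": ["KR-0001"]},
]

members, counts = preprocess_family(DOCS)
print(members)
print(counts)
```

Running the same function on each batch of newly obtained documents keeps the family counts current without rescanning documents that were already linked.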

Family information processed by the family information preprocessing module is stored in the family information DB of the present invention.

Citation Information Preprocessing Module

The citation information preprocessing module of the present invention includes one or more of a citation information acquisition module 3400-1 for acquiring information related to citations and a citation information updating module 3400-2 for updating citation information based on the acquired citation information. The citation information acquisition module includes an applicant citation information acquisition module 3400-1-1, which acquires citation-related information from the information cited by the applicant in the applicant's patent document, and an examination citation information acquisition module 3400-1-2, which acquires the information cited by the examiner in the examination process. The applicant citation information acquisition module 3400-1-1 in turn includes an applicant citation patent document information acquisition module 3400-1-1-1 and an applicant citation non-patent document information acquisition module 3400-1-1-2. The applicant citation patent document information acquisition module 3400-1-1-1 operates when the cited information is patent information. The examination citation information acquisition module 3400-1-2 includes an examiner citation patent document information acquisition module 3400-1-1-1 and an examiner citation non-patent document information acquisition module 3400-1-2. The citation information preprocessing module processes the citation information obtained by the citation information acquisition module in a predetermined manner.

The types of patent citation information include 1) prior art citation information, in which the applicant cites other patents as prior art, and 2) examination citation information, in which the examiner cites a patent as a reference when examining a particular patent document. Citations naturally occur both domestically and abroad. In general, citations of other patents as prior art appear in the prior art document information in Korea or Japan, and in the reference information in the United States. Such prior art citation information can be obtained from the bibliography or the text of the patent document. In addition, many countries publish examination citations, which can be found in the public information published by the patent office. The present specification mainly describes prior art citation information, but where the obtained examination citation information can be treated like prior art citation information, the description applies equally.

Preprocessing of the prior art citation information and the examination citation information is performed by the citation information preprocessing module of the present invention. As shown in FIG. 5, the citation information preprocessing module includes a prior art citation information preprocessing module that preprocesses the prior art citation information and a citation preprocessing module that preprocesses the examination citation information. Since the bibliography or the text of a specific document generally includes its forward citations (the documents it cites), the document information about the forward citations is stored in the search index or the document information of that specific document. The backward citations of the specific document (the documents that cite it), however, are not stored in its own document information; they are stored in the search index or the document information of the citing documents. Therefore, it is necessary to find the backward citations of the specific document and store them in association with it, and this function is performed by the prior art citation information preprocessing module.

Prior Art Citation Information Preprocessing Module

The prior art citation information preprocessing module may obtain the backward citation information of a specific document, that is, information on the documents citing it, by querying the search index or the DBMS with the application number or the unique number of the specific document. The module may include the obtained backward citation information in the index when the specific document is indexed, and may store it in the patent document master DB. Forward citation information hardly changes: it is written when the applicant files the application, and although forward citations are occasionally added, such additions are rare. Backward citation information, by contrast, may change whenever patent information is updated, so it must be monitored continuously. Therefore, when a new patent document arrives and the search index or the patent document master DB is updated, the prior art citation information preprocessing module must query whether any newly added patent document cites the specific document and update the backward citation information accordingly.

However, if the backward citation information is refreshed by such queries at every update of the search index or the patent document master DB, a very large number of queries must be executed each time, because the application number or unique number of every specific document has to be queried. Therefore, when the prior art citation information preprocessing module finds that a new (updated) patent document cites a specific document, it adds the new patent document as backward citation information to the information related to that specific document. Preferably, the information related to the specific document is the search index entry for the specific document or the record for the specific document in the patent document master DB. That is, when the module obtains a new patent document, it extracts the forward citation information contained in it and, for each cited document number, stores the new document in the search index entry or the patent document master DB record of the cited document. Because this method uses the forward citation information already contained in the new patent document, the consumption of computational resources can be drastically reduced.
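The incremental approach in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name `add_document`, the in-memory dictionary standing in for the search index or master DB, and the document numbers are all hypothetical.

```python
from collections import defaultdict

def add_document(backward_index, doc_id, forward_citations):
    """Register doc_id as a backward citation of every document it cites."""
    for cited_id in forward_citations:
        backward_index[cited_id].add(doc_id)

# Hypothetical stand-in for the per-document records in the index or master DB.
backward_index = defaultdict(set)

# Two newly received documents and the (made-up) numbers they cite.
add_document(backward_index, "APP-0100", ["APP-0001", "APP-0002"])
add_document(backward_index, "APP-0101", ["APP-0001"])

print(sorted(backward_index["APP-0001"]))  # ['APP-0100', 'APP-0101']
```

Each new document costs one write per document it cites, rather than one query per document already held in the collection.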

In this case, the notation of citation information may differ depending on who records it: the same document may be cited by its application number, registration number, or publication number. For example, if the same document (document unique number #1) is cited by its application number in document A, by its publication number in document B, and by its registration number in document C, errors are more likely to occur when the citation information is processed. Moreover, to find the other documents citing document unique number #1, every number of every attribute associated with document unique number #1 must be queried separately.

Therefore, the prior art citation information preprocessing module of the present invention needs to map each piece of obtained citation information to a single document unique number, independently of the attribute of the number as cited, and to use that number as the representative citation information. That is, when the document unique number is the application number, the module obtains the application number corresponding to the cited publication number or registration number and manages it as the representative citation information, regardless of whether the citation was given as a publication number or a registration number. Citation analysis and counting can then be processed on the basis of the representative number. In other words, the citation information is managed under a representative number. This has become especially important since the United States introduced publication of patent applications: in whatever form a document is cited, it must be processed internally as the same document. The application number is most preferred for this purpose.

In this case, the representative citation information may be expressed as a publication number or "country name + publication number", but is preferably expressed as an application number or "country name + application number". A registration number does not exist for documents that have not been registered, and a publication number (document issue number) has the problem that a new number is issued each time an application publication, a registration publication, or another publication occurs.

The examiner citation information preprocessing module operates in the same manner as the prior art citation information preprocessing module. When a specific document is cited as an examiner citation in a new patent document, the examiner citation information preprocessing module likewise stores the application number or unique number of the new patent document in the information related to the specific document (the search index entry for the specific document or the record for the specific document in the patent document master DB).

A citation information preprocessing method that converts citation numbers to a representative number is illustrated in detail in FIG. 88. The citation information preprocessing module obtains at least one citation number present in a specific document (S3820), identifies any citation number whose attribute is not the preset citation number attribute (S3830), queries the search engine or DBMS with that citation number to obtain the corresponding citation number of the preset attribute (S3840), and replaces the obtained citation number with the citation number of the preset attribute as its representative (S3850).
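The flow of FIG. 88 might look like the following sketch. The `Citation` type, the `LOOKUP` table that stands in for the search engine or DBMS query of S3840, and all document numbers are assumptions made for illustration; the choice of the application number as the preset attribute follows the preference stated in the text.

```python
from typing import NamedTuple, Optional

class Citation(NamedTuple):
    attribute: str  # "application", "publication", or "registration"
    number: str

PRESET_ATTRIBUTE = "application"

# Made-up numbers; a real system would query the search engine or DBMS here.
LOOKUP = {
    ("publication", "PUB-0001"): "APP-0001",
    ("registration", "REG-0001"): "APP-0001",
}

def to_representative(citation: Citation) -> Optional[str]:
    """Return the application number for a citation, or None if it cannot be resolved."""
    if citation.attribute == PRESET_ATTRIBUTE:   # already in the preset form
        return citation.number
    return LOOKUP.get((citation.attribute, citation.number))  # S3830, S3840

def normalize(citations):
    """Map every citation to its representative number (S3850), dropping duplicates."""
    resolved, unresolved = [], []
    for citation in citations:
        representative = to_representative(citation)
        if representative is None:
            unresolved.append(citation)          # keep for later review
        elif representative not in resolved:
            resolved.append(representative)
    return resolved, unresolved

cited = [
    Citation("application", "APP-0001"),
    Citation("publication", "PUB-0001"),
    Citation("registration", "REG-0001"),
    Citation("publication", "PUB-9999"),
]
resolved, unresolved = normalize(cited)
```

Here the three different notations of the same document collapse to a single representative number, so it is counted once; the unresolvable number is kept aside rather than guessed.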

After obtaining the backward citation document information, that is, information on later-filed documents that cite a specific document, it is necessary to include it in the document information of the specific document. An exemplary method for this process is shown in FIG. 89.

The citation information preprocessing module obtains specific document data (S3920), obtains at least one of the application number, registration number, and publication number included in the specific document data (S3940), queries the search engine with the obtained number restricted to the citation field, or queries the DBMS with the query field so defined (S3950), receives backward citation information as the query result (S3960), and includes the backward citation information in the bibliographic information of the document data (S3970).
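As a rough sketch of the query-based flow of FIG. 89, the example below uses Python's built-in `sqlite3` module as a stand-in DBMS. The table name `citation`, its columns, and the document numbers are hypothetical; the specification does not prescribe a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical citation field: one row per (citing document, number as cited).
conn.execute("CREATE TABLE citation (citing_doc TEXT, cited_number TEXT)")
conn.executemany(
    "INSERT INTO citation VALUES (?, ?)",
    [("DOC-B", "APP-0001"), ("DOC-C", "PUB-0001"), ("DOC-D", "APP-0002")],
)

def backward_citations(conn, numbers):
    """Return the documents whose citation field contains any of the given numbers."""
    numbers = list(numbers)
    if not numbers:
        return []
    placeholders = ",".join("?" for _ in numbers)
    rows = conn.execute(
        "SELECT DISTINCT citing_doc FROM citation "
        f"WHERE cited_number IN ({placeholders}) ORDER BY citing_doc",
        numbers,
    ).fetchall()                                               # S3950, S3960
    return [row[0] for row in rows]

# S3920, S3940: the numbers under which the specific document may be cited.
document = {"application": "APP-0001", "publication": "PUB-0001"}
found = backward_citations(conn, document.values())
document["backward_citations"] = found                         # S3970
```

Querying with every known number of the document is what makes this approach expensive when numbers have not been mapped to a representative number beforehand.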

Examiner Citation Information Preprocessing Module

The examiner citation information preprocessing module performs for examiner citation information exactly the same role that the prior art citation information preprocessing module performs for prior art citation information.

Counting Preprocessing Module

The counting preprocessing module will now be described. It includes an individual document unit counting preprocessing module, which performs counting on individual documents, and a multi-document unit counting preprocessing module, which performs counting on sets of two or more documents.

Individual Document Unit Counting Preprocessing Module

The individual document unit counting preprocessing module may calculate at least one of the following numerical values: 1) the number of applicants and/or patentees, 2) the number of inventors, 3) the number of claims at each stage, such as the application stage or the registration stage, 4) the number of specification pages, 5) the number of drawings, 6) the number of patent classification symbols, 7) the number of patent classification symbols of each type, 8) the number of references, 9) the number of examiner citations, 10) the number of patent classification symbols examined by the examiner, 11) the number of priority claims, 12) the number of family patents by country, 13) the number of family patents, 14) the number of independent claims, 15) the number of dependent claims, 16) the number of patents by country among the references, 17) the number of patent documents among the references, and 18) the number of non-patent documents among the references.

In addition, the individual document unit counting preprocessing module may calculate, from the individual patent document or from administrative processing information of the patent office that issued it, at least one of the following periods: 1) the period from the filing date to the registration date, 2) the period from the priority date to the filing date, 3) the period from the filing date to the request for examination, and 4) the period from the notice to submit an opinion to the date the opinion was submitted.
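The period values might be computed as in the sketch below. The field names and dates are invented for the example, and expressing periods in days is an assumption; the specification does not fix a unit.

```python
from datetime import date

def period_days(start, end):
    """Days from start to end, or None when either date is not available."""
    if start is None or end is None:
        return None
    return (end - start).days

# Made-up administrative dates for one document.
record = {
    "priority_date": date(2019, 1, 1),
    "filing_date": date(2020, 1, 1),
    "examination_request_date": date(2020, 3, 1),
    "registration_date": None,  # not registered yet
}

periods = {
    "filing_to_registration": period_days(
        record["filing_date"], record["registration_date"]),
    "priority_to_filing": period_days(
        record["priority_date"], record["filing_date"]),
    "filing_to_examination_request": period_days(
        record["filing_date"], record["examination_request_date"]),
}
```

Returning `None` for a missing date keeps "no registration yet" distinct from a period of zero days.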

In addition, the individual document unit counting preprocessing module may determine, from the individual patent document or administrative information, whether particular country-specific procedures were used or applied, such as whether a divisional application was used, whether the application is a continuation-in-part, whether the reissue patent system was used, whether priority examination was requested, whether a trial against a decision was used, or whether a non-party lawsuit has occurred. At least one of these may be determined; if the procedure was used or applied, the counting value may be set to 1, and otherwise to 0.

In addition, the individual document unit counting preprocessing module may search the patent document master DB (using a search engine or DBMS) for information related to the individual patent document in at least one second country, and count it. The information so counted may be at least one of: 1) the number of countries of application, 2) the number of countries of registration, 3) whether an international application exists, 4) the number of family members in each country, and 5) the total number of family members.

In addition, the individual document unit counting preprocessing module may count, through the search engine or the DBMS, any one or more of 1) forward citations (prior documents cited by the document), 2) backward citations (other domestic and foreign patent documents citing the document), and 3) examiner citations.

In addition, the individual document unit counting preprocessing module may obtain information on events concerning the document, such as 1) oppositions, 2) third-party submissions of information, and 3) inter partes trials. In the case of the Republic of Korea, information on items 1) to 3) can be obtained directly from, or by inquiry to, the Korean Intellectual Property Office or the patent tribunal.

The values counted by the individual document unit counting preprocessing module are preferably stored in the processed bibliographic DB of the patent information master DB. More preferably, the counted values are also included in the index generated when the search engine indexes the patent document. To include a counting value in the index, either or both of the following may be used: 1) running the individual document unit counting preprocessing module at indexing time to obtain the counting value, or 2) using the counting value already stored in the patent information master DB.

By referring to the patent classification code master DB, the counting preprocessing module may count, for each patent classification symbol, 1) the total number of lower-level patent classification symbols and 2) the number of subordinate patent classification symbols, and may store the counts in the master DB.

The manner in which the counting preprocessing module performs counting is illustrated in FIG. 78. First, the counting preprocessing module obtains at least one piece of patent document information (S2820) and calls at least one individual document unit counting preprocessing module for each patent document obtained (S2830). The individual document unit counting preprocessing module performs counting on the patent document information (S2840), and the counting result is stored in the patent DB or search index together with the document unique number of the patent document information (S2850).
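A minimal sketch of the flow of FIG. 78 follows. The record layout (`doc_id`, `claims`, `depends_on`, `is_divisional`) and the plain dictionary used as the store are assumptions for illustration; only a handful of the counting items listed earlier are shown.

```python
def count_document(doc):
    """Count a few per-document values from a parsed patent record (S2840)."""
    claims = doc.get("claims", [])
    independent = sum(1 for claim in claims if claim.get("depends_on") is None)
    return {
        "applicants": len(doc.get("applicants", [])),
        "inventors": len(doc.get("inventors", [])),
        "claims": len(claims),
        "independent_claims": independent,
        "dependent_claims": len(claims) - independent,
        # Procedure flags are stored as 1 (used) or 0 (not used).
        "is_divisional": 1 if doc.get("is_divisional") else 0,
    }

def run_counting(documents, store):
    """Count every document and file the result under its unique number."""
    for doc in documents:                              # S2820, S2830
        store[doc["doc_id"]] = count_document(doc)     # S2840, S2850
    return store

# Hypothetical parsed document; the store stands in for the patent DB or index.
documents = [{
    "doc_id": "APP-0001",
    "applicants": ["X", "Y"],
    "inventors": ["a"],
    "claims": [{"depends_on": None}, {"depends_on": 1}, {"depends_on": 1}],
    "is_divisional": False,
}]
store = run_counting(documents, {})
```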

Multi-Document Unit Counting Preprocessing Module

Hereinafter, the module that obtains scores calculated over multiple documents, that is, the multi-document unit counting preprocessing module, will be described. The multi-document unit counting preprocessing module obtains a score by aggregating, over the plurality of documents forming the target set, the information counted by the individual document unit counting preprocessing module. That is, it generates counting data for a set of at least two documents.
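One simple way to aggregate per-document counts over a target set is shown below. The choice of totals and averages as the aggregate is an assumption; the text only states that per-document counts are collected over the set.

```python
from collections import Counter

def aggregate_counts(per_document_counts):
    """Return (totals, averages) over the counting results of a document set."""
    totals = Counter()
    for counts in per_document_counts:
        totals.update(counts)          # adds the values key by key
    n = len(per_document_counts)
    averages = {key: value / n for key, value in totals.items()} if n else {}
    return dict(totals), averages

# Made-up counting results for a set of two documents.
target_set = [
    {"claims": 10, "inventors": 2},
    {"claims": 20, "inventors": 1},
]
totals, averages = aggregate_counts(target_set)
```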

Weight Preprocessing Module

Document Unit Weight Preprocessing Module

It is impossible to measure accurately and systematically the value of the inventive idea contained in a patent document. However, it would be unreasonable to treat all patent documents as equal merely because their value cannot be measured accurately. Therefore, the present invention discloses a method for assigning a weight to a patent document based on information that the system can obtain as data. This document weighting function is performed by the document unit weight preprocessing module of the present invention.

The weight preprocessing module of the present invention includes a document unit weight preprocessing module 3310 and a subject unit weight preprocessing module 3330. The document unit weight preprocessing module 3310 may include a cost expenditure perspective weight preprocessing module 3311, a citation perspective weight preprocessing module 3313, a dispute perspective weight preprocessing module 3315, and/or a concentration perspective weight preprocessing module 3317. The subject unit weight preprocessing module 3330 may include an applicant unit weight preprocessing module 3331, an inventor unit weight preprocessing module 3333, and/or an agent unit weight preprocessing module 3335.

Cost Expenditure Perspective Weight Preprocessing Module

From a probabilistic, statistical, or social point of view, a patent on which more money has been spent is likely to be a more important one. This assumption is likely to hold when costs are allocated rationally within a single entity, and it is also likely to hold across competing entities. Considering how costs are incurred for a patent, the following weighting factors can be considered.

1) From a quantitative point of view: the number of claims, the number of independent claims, the volume of the specification (number of pages), the number of family members, or the number of patent classification symbols; 2) from a status point of view: whether the patent is pending, registered, rejected, or abandoned; 3) whether examination was requested, whether priority examination was used, whether a trial was requested, and whether domestic priority was claimed or a divisional application was filed; 4) from a subject point of view: the number of applicants or inventors; 5) from an overseas point of view: whether an international application exists and the number of countries in which the national phase was entered at home and abroad. Such factors may affect the value of a patent document. These weighting factors may be obtained from the bibliographic information, specification content information, or administrative processing information of individual documents.

In general, a large number of claims or a long specification suggests that the invention is broad and diverse, and therefore that it took considerable time and expense to make and that the agent charged more. In addition, the greater the number of family members or patent classification symbols, the more comprehensive the invention is likely to be. Registered patents are likely to deserve greater weight than pending applications, and an application for which examination or priority examination has been requested is likely to be more important than one for which it has not. A joint application by two or more applicants is the result of cooperation between two or more different entities, and is therefore statistically likely to be more important than an application that is not. In addition, foreign patent applications are usually much more expensive than domestic ones, so an applicant who files abroad, or in many foreign countries, incurs significant expenditure; it is therefore reasonable to judge that the patent is at least relatively important to that applicant.

From the above points of view, the existence or the numerical value of each weighting factor is determined for a single document. (Where only existence matters, the value is 0 if the factor is absent and 1 if it is present; otherwise the numerical value itself is obtained.) A weighted score may then be assigned to each weighting factor according to its existence and/or numerical value. Determining the existence and numerical values of the detailed weighting factors under viewpoints 1) to 5) for a document, and assigning a weighted score to each factor, is performed by the cost expenditure perspective weight preprocessing module of the present invention.

The method by which the cost expenditure perspective weight preprocessing module processes weights is illustrated in FIG. 79. The module obtains at least one piece of patent document information (S2920); obtains, using the counting preprocessing module or from the patent DB or search index, a counting result value for each of at least one preset counting criterion related to cost expenditure (S2930); and obtains the weight for each counting criterion from the patent information processing policy DB (S2940). Here, the preset counting criteria are those defined in the patent information processing policy stored in the patent information processing policy DB. For example, with respect to claims, the policy DB stores, for each viewpoint, a policy for weighting each item, such as (number of claims * 0.1 + number of independent claims * 0.3). The module then generates a cost expenditure perspective weight value from the weight and the counting result value for each counting criterion (S2940), and stores the generated weight value in the patent DB or search index together with the unique document number of the patent document information (S2950).
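The example policy quoted above (number of claims * 0.1 + number of independent claims * 0.3) can be applied as a weighted sum. In this sketch the policy is a plain dictionary standing in for the patent information processing policy DB, and the key names and counts are hypothetical.

```python
def weighted_score(counts, policy):
    """Sum of count * weight over the counting criteria named in the policy."""
    return sum(weight * counts.get(criterion, 0)
               for criterion, weight in policy.items())

# Policy taken from the example in the text; key names are assumptions.
policy = {"claims": 0.1, "independent_claims": 0.3}

# Made-up counting results for one document; "drawings" is not in the policy
# and therefore does not contribute.
counts = {"claims": 20, "independent_claims": 3, "drawings": 7}

score = weighted_score(counts, policy)  # 20 * 0.1 + 3 * 0.3, about 2.9
```

Keeping the weights in data rather than in code means the policy can be changed without touching the module, which matches the role the text gives to the policy DB.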

Citation Perspective Weight Preprocessing Module

As with academic papers, patents that are cited by many others are likely to be important. The module that processes the weight of individual documents from this citation perspective is called the citation perspective weight preprocessing module. Broken down in detail, the citation perspective includes weighting factors such as: 1) from the perspective of citations by later documents, the total number of backward citations, the number of first-level backward citations, the average duration of backward citations, and the concentration of backward citations over time; and 2) from the perspective of examiner citations, whether the examiner cited the patent. That is, the more documents cite a patent directly or indirectly (indirectly meaning that other patents cite the patents that cite it), the more first-level citations cite it directly, the more the average duration of backward citations lies within an appropriate range, and the more the citing documents are concentrated in a recent period, the more likely it is that the patent is of great importance. In addition, a patent cited by the examiner during examination may be more important than one that is not. The manner in which the citation perspective weight preprocessing module processes weights, illustrated in FIG. 80, is similar to that of the cost expenditure perspective weight preprocessing module.

The citation perspective weight preprocessing module obtains at least one piece of patent document information (S3020); obtains, using the counting preprocessing module or from the patent DB or search index, a counting result value for each of at least one preset counting criterion related to forward citations, backward citations, and/or examiner citations (S3030); obtains the weight for each counting criterion by referring to the patent information processing policy DB (S3040); generates a citation perspective weight value from the weight and the counting result value for each counting criterion (S3050); and stores the generated weight value in the patent DB or search index together with the unique document number of the patent document information (S3060). The counting criteria are defined in the patent information processing policy stored in the patent information processing policy DB. For example, for the citation perspective the policy DB stores a policy for weighting each item, such as "backward citation count * 0.3".
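Two of the citation counts discussed above, first-level and total (direct plus indirect) backward citations, can be obtained with a breadth-first traversal. This is only a sketch: the dictionary mapping each document to the documents that cite it, and the single-letter document names, are assumptions for the example.

```python
from collections import deque

def backward_citation_counts(backward_index, doc_id):
    """Return (first_level, total) numbers of documents citing doc_id."""
    first_level = set(backward_index.get(doc_id, ()))
    seen = set()
    queue = deque(first_level)
    while queue:
        current = queue.popleft()
        if current in seen or current == doc_id:
            continue                      # count each citing document once
        seen.add(current)
        queue.extend(backward_index.get(current, ()))
    return len(first_level), len(seen)

# Hypothetical index: key = cited document, value = documents citing it.
backward_index = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
}
first_level, total = backward_citation_counts(backward_index, "A")
```

Document A is cited directly by B and C, and indirectly by D and E, so the counts are 2 and 4. Either count could then be multiplied by a policy weight such as the 0.3 mentioned in the text.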

Dispute Perspective Weight Preprocessing Module

The more disputes a patent document has been involved in, the more likely the patent is to be important. Dispute elements that can be identified in patent information may include 1) invalidation trials, 2) appeals, 3) third-party submissions of information, and 4) passive or active trials to confirm the scope of rights. Determining the existence and numerical values of the weighting factors from the dispute perspective, and assigning a weighted score to each factor, is performed by the dispute perspective weight preprocessing module of the present invention.

The method by which the dispute perspective weight preprocessing module processes weights is illustrated in FIG. 81, and is similar to the methods used by the cost expenditure perspective and citation perspective weight preprocessing modules.

The dispute perspective weight preprocessing module obtains at least one piece of patent document information (S3120); obtains, using the counting preprocessing module or from the patent DB or search index, a counting result value for each of at least one preset counting criterion related to disputes (S3130); obtains the weight for each counting criterion by referring to the patent information processing policy DB (S3140); generates a dispute perspective weight value from the weight and the counting result value for each counting criterion; and stores the generated weight value in the patent DB or search index together with the unique document number of the patent document information (S3160).

That is, the dispute perspective weight preprocessing module essentially counts the number of disputes and obtains the weight for each count from the patent information processing policy DB to generate the dispute perspective weight value.

Concentration Perspective Weight Preprocessing Module

At a given point in time, a technical field on which several entities concentrate is likely to be an important one. Applicants rarely file evenly across technical fields; most file many applications in fields they consider important and in which their technology investment is concentrated, and file only a few, defensively, in fields they consider less important. Therefore, if, in the technical field to which an application belongs, 1) the number of applications, 2) the rate of increase in applications, 3) the number of applicants, or 4) the change in share is outside a preset range, the technical field is likely to be important. The fields on which several applicants focus can be measured by analysis factors such as technology attractiveness, rate of increase, and change in share. Here, the technical field is preferably identified at each hierarchical level of a hierarchical patent classification code contained in patent documents, such as IPC, USPC, FI, FT, or ECLA. It is of course more desirable to include the subordinate patent classification codes automatically when calculating the numerical value for each analysis factor. That is, when the IPC is H04B 7/26, numerical values for the various analysis factors may be calculated for this IPC, and may also be calculated at the level of H04B 7/00. In this case, it may be preferable to include in the calculation the other patent classification codes subordinate to H04B 7/26. Determining the existence and numerical values of the weighting factors from the concentration perspective, and assigning a weighted score to each factor, is performed by the concentration perspective weight preprocessing module of the present invention.
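Counting applications at each level of the classification hierarchy might be sketched as follows. The sketch is simplified on purpose: it derives only the main group, subclass, class, and section from the IPC symbol string, whereas the nesting of subgroups under other subgroups (which the IPC scheme expresses by dot levels) would require the patent classification code master DB and is not modelled. The function names and sample documents are hypothetical.

```python
from collections import Counter

def ipc_levels(symbol):
    """Return the symbol plus its main group, subclass, class, and section."""
    subclass, group = symbol.split()          # e.g. "H04B", "7/26"
    main_group = group.split("/")[0]
    candidates = [
        symbol,                               # H04B 7/26
        f"{subclass} {main_group}/00",        # H04B 7/00
        subclass,                             # H04B
        subclass[:3],                         # H04
        subclass[:1],                         # H
    ]
    levels = []
    for level in candidates:                  # drop repeats, keep order
        if level not in levels:
            levels.append(level)
    return levels

def rollup(documents):
    """Count documents per classification level, each document once per level."""
    counts = Counter()
    for symbols in documents:
        levels = set()
        for symbol in symbols:
            levels.update(ipc_levels(symbol))
        counts.update(levels)
    return counts

# Made-up classification symbols of three applications.
documents = [["H04B 7/26"], ["H04B 7/00"], ["H04B 1/40", "H04B 7/26"]]
counts = rollup(documents)
```

With these three applications, two are counted at H04B 7/26 but all three at H04B 7/00 and at H04B, which is the effect of including subordinate codes when calculating values at a higher level.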

The concentration perspective weight preprocessing module processes weights in a manner similar to the cost expenditure perspective and citation perspective weight preprocessing modules. That is, it uses the bibliographic information of each individual document to measure the values of patent information analysis indices that express concentration, such as the concentration, activity, and share of the applicant or patentee in the technical field, obtains the weight for each value from the patent information processing policy DB, and derives the concentration perspective weight value.

The manner in which the concentration perspective weight preprocessing module processes weights is illustrated in FIG. 82.

The concentration perspective weight preprocessing module obtains at least one piece of patent document information (S3220); obtains applicant information and patent technology classification information from the patent document DB (S3230); using the obtained applicant information and patent technology classification information as input values and referring to the patent technology classification code DB, calculates, according to at least one criterion, the applicant's concentration, activity, or share with respect to the obtained patent technology classification or at least one higher-level patent technology classification symbol in the classification code system (S3240); obtains weight information for the concentration, activity, or share of each criterion from the patent information processing policy DB (S3250); generates a concentration perspective weight value from the weight and the result value for each criterion (S3260); and stores the generated weight value in the patent DB or search index together with the unique document number of the patent document information (S3270).
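The specification does not give formulas for concentration, share, or activity at this point, so the definitions in the sketch below are assumptions chosen for illustration: concentration is the fraction of the applicant's own filings that fall in the field, share is the applicant's fraction of all filings in the field, and activity is the share in the field divided by the applicant's share of all filings. The data is invented.

```python
def concentration_metrics(filings, applicant, field):
    """filings: list of (applicant, field) pairs, one per application."""
    total = len(filings)
    in_field = sum(1 for a, f in filings if f == field)
    by_applicant = sum(1 for a, f in filings if a == applicant)
    both = sum(1 for a, f in filings if a == applicant and f == field)

    # Assumed definitions; see the paragraph above.
    concentration = both / by_applicant if by_applicant else 0.0
    share = both / in_field if in_field else 0.0
    overall_share = by_applicant / total if total else 0.0
    activity = share / overall_share if overall_share else 0.0
    return {"concentration": concentration, "share": share, "activity": activity}

# Made-up filings: applicant X files mostly in H04B, applicant Y mostly in G06F.
filings = (
    [("X", "H04B")] * 3 + [("X", "G06F")] * 1
    + [("Y", "H04B")] * 1 + [("Y", "G06F")] * 3
)
metrics = concentration_metrics(filings, "X", "H04B")
```

An activity value above 1 indicates, under these assumed definitions, that the applicant is more present in the field than in the collection as a whole.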

The concentration perspective weight preprocessing module may also process weights in inventor units rather than applicant units, as illustrated in FIG. 83. The module obtains at least one piece of patent document information (S3320); obtains inventor information and patent technology classification information from the patent document information (S3330); using these as input values and referring to the patent technology classification code DB, calculates, according to at least one criterion, the inventor's concentration with respect to the obtained patent technology classification or at least one higher-level patent technology classification symbol in the classification code system (S3340); obtains weight information for the concentration of each criterion (S3350); generates a concentration perspective weight value from the weight and the concentration result value for each criterion (S3360); and stores the generated weight value in the patent DB or search index together with the unique document number of the patent document information (S3370).

Subject Unit Weight Preprocessing Module

When the subject involved in a patent document is an important one, the patents filed by that subject are likely to be important as well. The key is to determine which subjects are more important. The subjects named in a patent document may be the applicant, the inventor, and the agent. The module that preprocesses the weight of a patent document from the viewpoint of its subjects is called the subject unit weight preprocessing module. As shown in FIG. 4, the subject unit weight preprocessing module includes an applicant unit weight preprocessing module, an inventor unit weight preprocessing module, and an agent unit weight preprocessing module, according to the type of subject.

Applicant Unit Weight Preprocessing Module

If the technical field of a specific document is one on which the applicant of the document focuses, the application is likely to be important. The fields on which the applicant of a particular document focuses may be measured by analysis factors such as concentration rate, share, and AI. The applicant unit weight preprocessing module of the present invention determines the numerical values of the weighting factors in terms of these analysis factors and assigns a weighted score to each weighting factor.

The manner in which the applicant unit weight preprocessing module processes weights is shown in FIG. 84. The module obtains criterion information for processing weights in applicant units (S3420); obtains, for the documents in the applicant's name that meet the criterion, the result values calculated by at least one document unit weight preprocessing module (S3430); obtains weight information for each document unit weight preprocessing module from the patent information processing policy DB (S3440); generates an applicant unit weight value for each criterion from the weight and the result value of each document unit weight preprocessing module (S3450); and stores the generated weight value in the patent DB or search index, together with the criterion or independently of it, along with the unique document number of the patent document information (S3460).
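The steps of FIG. 84 could be sketched as below. Several points are assumptions made for the example: the criterion is modelled as a predicate on a document (here, a filing-year threshold), the per-module result values are combined by a weighted sum over the applicant's documents, and the record layout and module names (`cost`, `citation`) are invented.

```python
def applicant_unit_weight(documents, applicant, criterion, module_weights):
    """Weighted sum of document-unit result values over an applicant's documents."""
    # S3420, S3430: documents in the applicant's name that meet the criterion.
    selected = [doc for doc in documents
                if applicant in doc["applicants"] and criterion(doc)]
    total = 0.0
    for doc in selected:
        for module, weight in module_weights.items():      # S3440
            total += weight * doc["scores"].get(module, 0.0)
    return total                                            # S3450

# Made-up documents with result values already produced per document.
documents = [
    {"doc_id": "APP-0001", "applicants": ["X"], "year": 2006,
     "scores": {"cost": 2.0, "citation": 4.0}},
    {"doc_id": "APP-0002", "applicants": ["X", "Y"], "year": 2007,
     "scores": {"cost": 1.0}},
    {"doc_id": "APP-0003", "applicants": ["X"], "year": 1999,
     "scores": {"cost": 8.0}},
]
module_weights = {"cost": 0.5, "citation": 0.5}   # stand-in for the policy DB

def filed_since_2005(doc):
    return doc["year"] >= 2005

value_x = applicant_unit_weight(documents, "X", filed_since_2005, module_weights)
value_y = applicant_unit_weight(documents, "Y", filed_since_2005, module_weights)
```

The same function would serve for inventor or agent units by selecting on the corresponding field instead of `applicants`.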

Inventor Unit Weight Preprocessing Module

If the technical field of a specific document is one on which the inventor of the document focuses, the application is likely to be important. The fields on which the inventor of a particular document focuses can be measured, on a per-inventor basis, by analysis factors such as concentration rate, share, and AI (the formulas are the same as the applicant-based formulas with the inventor substituted for the applicant). The inventor unit weight preprocessing module of the present invention determines the numerical values of the weighting factors in terms of these analysis factors and assigns a weighted score to each weighting factor.

The way in which the inventor unit weight preprocessing module processes weights is illustrated in FIG. 85. The module obtains reference information (criteria) for processing weights in inventor units (S3520), obtains the result values calculated by at least one document unit weight preprocessing module for the documents in the inventor's name that meet the criteria (S3530), obtains the weight information for each document unit weight preprocessing module from the patent information processing policy DB (S3540), and generates an inventor unit weight value for each criterion from the weight of each document unit weight preprocessing module and the obtained result values (S3550). The generated weight value is stored in the patent DB or the search index, either together with the criteria or independently, along with the unique document number of the patent document information (S3560). The weight information may include a policy such as "Registration Rate * 0.1 + Overseas Application Family Number * 0.5".
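As a minimal sketch of how a policy such as "Registration Rate * 0.1 + Overseas Application Family Number * 0.5" could be evaluated, the following assumes that the policy is held as a mapping from weighting-element name to coefficient and that the document unit modules have already produced one result value per element. The function name, key names, and sample values are hypothetical and are not taken from the specification.

```python
from typing import Mapping


def subject_unit_weight(results: Mapping[str, float],
                        policy: Mapping[str, float]) -> float:
    """Weighted sum of document-unit result values under a policy.

    `policy` maps a weighting-element name to its coefficient, and
    `results` maps the same names to the values already calculated
    by the document unit weight preprocessing modules.
    """
    return sum(coef * results[name] for name, coef in policy.items())


# Hypothetical policy mirroring the example in the text.
policy = {"registration_rate": 0.1, "overseas_family_count": 0.5}
# Hypothetical result values for one inventor under one criterion.
results = {"registration_rate": 0.8, "overseas_family_count": 4}

weight = subject_unit_weight(results, policy)  # 0.1 * 0.8 + 0.5 * 4
```

The same function would serve for applicant units and agent units, since only the policy and the input values differ.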

Agent Unit Weight Preprocessing Module

If the technical field of a particular document is one on which the agent of that document focuses, the patent specification is likely to be more complete than it otherwise would be. The fields on which the agent of a particular document focuses can be measured by analysis factors such as concentration rate, share, and AI (the formulas are the applicant formulas with the applicant replaced by the agent). The agent unit weight preprocessing module of the present invention determines the numerical value of each weighting element from the viewpoint of these analysis factors and assigns a weight score to each weighting element. The agent unit weight may be less important than the applicant unit weight or the inventor unit weight.

The way in which the agent unit weight preprocessing module processes weights is illustrated in FIG. 86. The module obtains reference information (criteria) for processing weights in agent units (S3620), obtains the result values calculated by at least one document unit weight preprocessing module for the documents in the agent's name that meet the criteria (S3630), obtains the weight information for each document unit weight preprocessing module from the patent information processing policy DB (S3640), and generates an agent unit weight value for each criterion from the weight of each document unit weight preprocessing module and the obtained result values (S3650). The generated weight value is stored in the patent DB or the search index, either together with the criteria or independently, along with the unique document number of the patent document information (S3660). The weight information may include a policy such as "Registration Per Application * 0.3".

The document unit weight preprocessing module and the subject unit weight preprocessing modules of the present invention calculate document-unit weights by applying predetermined weights to the weighting elements obtained by each of the one or more weight preprocessing modules. Each per-viewpoint or per-subject weighting module may likewise calculate a document-unit weight from its own viewpoint by applying a predetermined weight to the weighting elements it obtains. Each of these weights may be set differently according to input from an administrator or a user.

Subject mast DB generation module (301-4)

Type of subject

For patent documents, the subjects are broadly 1) the applicant, 2) the inventor, and 3) the agent. Applicants may be individuals or organizations such as companies. In the present invention, the DB that stores and manages the subjects is called the subject mast DB 204, and the module that generates the subject mast DB 204 is called the subject mast DB generation module 301-4.

The subject mast DB 204 may include a representative applicant name DB, a representative inventor name DB, and a representative agent name DB, and may further include an independent company information DB. Each of these is described below.

Configuration of the subject mast DB generation module 301-4

The subject mast DB generation module 301-4 of the present invention includes a representative name preprocessing module 301-4-1. The representative name preprocessing module includes one or more of an applicant representative name preprocessing module 301-4-1-1 for processing the representative name of the applicant, an inventor representative name preprocessing module 301-4-1-2 for processing the representative name of the inventor, and an agent representative name preprocessing module 301-4-1-3 for processing the representative name of the agent. The subject mast DB generation module 301-4 may further include a company information DB generation module. The company information DB may include one or more of the company's financial information, company status information, accounting information, main products, representatives, corporate registration number, business number, website, telephone number, and fax number. The subject mast DB 204 includes a company information DB 204-1, a representative applicant name DB 204-2, a representative inventor name DB 204-3, a representative agent name DB 204-4, and/or an organization information DB 204-5.

Concept of representative naming

First, the concept of representative naming is explained. Representative naming means unifying the one or more notations of a subject's name that arise in one or more countries or languages. For example, for the Korean company Samsung Electronics Co., Ltd., the official English notation is "Samsung Electronics Co., Ltd.", not "Samsung Electronic Co, ltd.". Discrepancies in notation are natural when the languages differ, but there are also many discrepancies within a single language. Typical inconsistencies involve 1) organization type notation, 2) misspelling, 3) spacing, and 4) the use of special characters such as punctuation. For example, Co, ltd, company, corporation, corp., Corp, co limited, and the like are often used interchangeably. In addition, typos such as omitting the "s" (as in Samsung Electronic) occur frequently, as do omitted spaces, unnecessary spaces, and confusion between periods and commas in abbreviations.

When a company changes its name, the corporate registration number or business registration number is retained after the change, but the applicant's name changes, so patent documents under the old name and patent documents under the new name coexist. In addition, when an application is filed from a first country to a second country, the applicant's notation in the first country and in the second country often differ, and even within the second country several notations are often used. Resolving the problem that the same subject is written in different notations is the function of the applicant representative name preprocessing module 301-4-1-1 of the present invention.

When representative naming becomes a problem

There are three main reasons for such a problem. The first type is where the applicant's notation changes substantively because of a change of company name or a merger. In this case, the information on the change of the applicant's name must be reflected in the subject mast DB 204 through the company information DB. The second type is where the same company is written in different notations. In particular, foreign companies often do not have a uniform notation. In this case, the notation can be unified on the basis of the applicant's notation in its home country, considering that 1) the notation in the home country is relatively consistent, and 2) most applications coming from abroad include priority claims.

Representative Name

In the present invention, the subject representative name preprocessing module 301-4-1 performs representative naming of applicants, inventors, and/or agents. It includes at least one of the applicant representative name preprocessing module 301-4-1-1, the inventor representative name preprocessing module 301-4-1-2, and the agent representative name preprocessing module 301-4-1-3.

Within one country, the applicant representative name preprocessing module 301-4-1-1 may perform applicant representative naming as follows. A country may have both domestic and foreign applicants, and it is preferable to perform representative naming within that country first. The terms are defined first. In "Samsung Electronics Co., Ltd.", the notation as a whole is called the applicant's name, "Samsung Electronics" is called the organization name stem, and "Co., Ltd." is called the organization type.

Create Organization Type Notation Set

First, a set of organization type notations is created. Examples of elements of the organization type notation set include "Corporation", "School corporation", "Kabushiki Kaisha" (in its several transliteration variants), "Co", "ltd", "Co., Ltd.", "GMBH", and many others. The set is created as follows. For one country, all applicant name values of the applicant field are obtained from the bibliographic DB of that country's patent documents (for example, applicant names such as Samsung Electronics Co., Ltd. and hitachi co., Ltd.). Each applicant name is cut at spaces, at other punctuation, or at known organization type notations (for example, Inc., "space + Co", co. Ltd, limited, and the like). All the cut pieces are then sorted in order of frequency. Organization type notations are selected from the high-frequency pieces, and the organization type notation set is generated from the selected notations. The pieces are sorted by frequency because there are far fewer organization types than applicants: one organization type is used repeatedly across many applicant names, so after cutting, organization type pieces occur far more frequently than organization name pieces.
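The frequency-based step above can be sketched as follows. This is only an illustration under stated assumptions: names are cut at whitespace, commas, and periods, each piece is counted once per applicant name, and `min_count` is a hypothetical cut-off. The function name and sample data are not from the specification, and selecting the actual organization types from the high-frequency candidates remains a separate step.

```python
import re
from collections import Counter


def candidate_organization_types(applicant_names, min_count=2):
    """Return cut pieces of applicant names, most frequent first.

    Pieces occurring in at least `min_count` applicant names are
    returned as candidates for the organization type notation set.
    """
    counts = Counter()
    for name in applicant_names:
        # Cut at spaces and punctuation, then drop empty pieces.
        pieces = [p for p in re.split(r"[\s,.]+", name.lower()) if p]
        counts.update(set(pieces))  # count each piece once per name
    return [piece for piece, n in counts.most_common() if n >= min_count]


names = [
    "Samsung Electronics Co., Ltd.",
    "hitachi co., Ltd.",
    "Sony Corporation",
]
candidates = candidate_organization_types(names)
```

On a realistic corpus, common stem words (for example "electronics") can also reach a high frequency, which is why the candidates are reviewed before being added to the set.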


There are also other ways to generate the organization type notation set. The first is to register representative organization type notations that are already known, such as "Corporation". The second starts from at least one known applicant: for each of the various notations of that applicant, the organization name stem is removed and the remaining organization type notations are collected. These organization type notations are then used to separate the organization type notation from other applicant notations, so that organization name stems can be extracted from them. Next, the applicant notations in which an extracted organization name stem is used are found, and removing the organization name stem from those notations yields new organization type notations. This process is repeated, and the organization type notation set extracted by the first method or by the third method below can also be used to separate the organization name stem and the organization type notation from an applicant notation. The third is to cut the applicant notation into pieces of at least one character, calculate the frequency of each piece, and extract the longest phrase whose frequency exceeds a predetermined value. In other words, because an organization type notation is used far more often than any organization name stem, the frequencies of its successively longer substrings are calculated, and the longest one, for example "company", is selected as the organization type notation.

If the applicant is a natural person, the notation of that individual contains none of the organization type notations belonging to the organization type notation set. Therefore, in the present invention, an applicant is treated as a natural person when the same notation as the applicant notation appears among the inventor notations, and also when none of the elements constituting the applicant notation appears in the organization type notation set. Consequently, the number of applicants classified as natural persons may decrease or increase as elements are added to or removed from the organization type notation set.
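The two conditions for treating an applicant as a natural person can be sketched as below. The tokenization (lower-casing and cutting at spaces, commas, and periods), the function name, and the sample names are assumptions made for illustration only.

```python
def is_natural_person(applicant, inventors, organization_types):
    """Apply the two natural-person conditions described in the text.

    1) The applicant notation also appears among the inventor notations.
    2) No piece of the applicant notation is in the organization type set.
    """
    if applicant in inventors:
        return True
    pieces = applicant.lower().replace(",", " ").replace(".", " ").split()
    return not any(piece in organization_types for piece in pieces)


org_types = {"co", "ltd", "inc", "corporation"}  # hypothetical set
```

Note that a name such as "Acme Widgets" would be classified as a natural person under this sketch until a matching organization type is added to the set, which is the dependence on the set's contents that the text points out.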

The organization type notation set is generated in the above manner and is updated periodically. One exemplary update method is as follows. 1) Extract a set of applicant names from a newly acquired patent document set or from any selected patent document set. 2) For the extracted applicant names, separate the organization type from each applicant name using the notations in the organization type notation set. This divides the applicant names into those for which an organization type was separated and those for which no separation occurred. 3) Check whether an organization name obtained through separation appears in the applicant names for which no separation occurred. 4) If it does, remove the organization name from the applicant name and obtain the remaining character string. 5) Collect the remaining strings and sort them by frequency to extract organization type strings from the high-frequency ones, or examine the remaining strings one by one, confirm whether each is an organization type notation, and if so add it to the organization type notation set.

An example follows. Suppose that step 1) yields two applicant names for Samsung Electronics, one ending in "Ltd" and one ending in "Co,", and that "Ltd" is an element of the organization type notation set but "Co," is not. In step 2), separation occurs for the former and not for the latter, and the organization name "Samsung Electronics" is obtained. In step 3), the latter is examined for the organization name "Samsung Electronics". In step 4), the examination finds the organization name "Samsung Electronics", and removing it from "Samsung Electronics Co," leaves "Co,". In step 5), it is determined whether "Co," is an organization type, and if so it is added to the existing organization type notation set.
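The update steps and the worked example above can be sketched as follows. Punctuation is normalized away before comparison (so "Co," is handled as "co"), and a name counts as separated when at least one of its pieces is a known organization type. These simplifications, along with the function names and sample names, are assumptions for illustration.

```python
from collections import Counter


def normalize(name):
    """Lower-case a name and cut it at spaces, commas, and periods."""
    return name.lower().replace(",", " ").replace(".", " ").split()


def discover_type_candidates(applicant_names, known_types):
    """Steps 2) to 5): find strings left over after removing known stems."""
    stems, unseparated = set(), []
    for raw in applicant_names:
        pieces = normalize(raw)
        kept = [p for p in pieces if p not in known_types]
        if kept and len(kept) < len(pieces):
            stems.add(" ".join(kept))           # step 2: separation occurred
        else:
            unseparated.append(" ".join(pieces))  # step 2: no separation
    candidates = Counter()
    for name in unseparated:
        for stem in stems:
            if name.startswith(stem + " "):      # step 3: stem is present
                candidates[name[len(stem):].strip()] += 1  # step 4
    return candidates                            # step 5: review by frequency


found = discover_type_candidates(
    ["Samsung Electronics Ltd.", "Samsung Electronics Co,"],
    known_types={"ltd"},
)
```

Here the leftover string from the second name becomes a candidate, and whether it is accepted into the set is still decided in step 5).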

Applicant representative naming within the first country is performed in the manner described above. Its accuracy can be improved by additionally performing the following two processes.

First, from the standpoint of the first country, the applicant name in a patent document filed from an nth country into the first country may be written quite differently from the notation used in the nth country. For example, from the standpoint of the Republic of Korea, the names of Korean applicants filing in Korea do not vary much, whereas the notations of applicants from the United States, Japan, Germany, and elsewhere filing in Korea vary relatively widely. There are two ways to solve this problem: using abstract data in English and using priority claim numbers. Each is explained in turn.

*** Add Method ***

A single application may be filed in several countries. If those countries use different languages, the applicant of what is in effect the same application, as identified by priority claim information or family data, is written in two or more languages across at least two countries. In this case, the English abstract data issued by each country, such as KPA (Korea Patent Abstract) or PAJ (Patent Abstract of Japan), may be used. Using the application number as a key, the applicant's notation in each country's language can be mapped to the applicant's notation in English, and that English notation can in turn be mapped to the English applicant notation used in the United States, Europe, and elsewhere. In this mapping, the organization name stem and the organization type notation are naturally separated from the applicant notation first. Representative naming with the English abstract then follows the same logic described above for an English-speaking country. For example, instead of trying to unify the local-language notations "Sumcom Inc." of Application No. #1 and "Qualcomm Incorporated" of Application No. #2 directly, representative naming is first performed on the English notations of Application No. #1 and Application No. #2 (that is, both go through the procedure for judging whether they are the same organization). Then, with the application number as the key, "Sumcom Inc." of Application No. #1 and "Qualcomm Incorporated" of Application No. #2 are judged to be the same organization.

Representative naming using priority claim numbers and the like is illustrated in FIG. 99. If a patent document obtained from a first country contains treaty priority information, the representative name preprocessing module obtains the treaty priority number and the applicant name of that document (S4920), uses the treaty priority information to obtain, through a search engine or the DBMS 201, the patent document information of the home country on which the treaty priority is based (S4930), and performs representative naming across countries by treating the applicant name in the home-country patent document and the applicant name of the document obtained in the first country as the same (S4940). When an application is filed from a first country to a second country with a priority claim, the priority claim number, and information on the patent application number on which the priority claim is based, can be obtained from the second-country patent document mast DB 202 and the first-country patent document mast DB 202. If a first-country patent application number filed by an applicant whose company is located in the first country appears as a priority claim number in the second country, the applicant of the second-country application containing that priority claim number can be regarded as substantially the same applicant. (The priority right may of course be transferred in the second country, but this is not common.) Likewise, if the second-country patent document mast DB 202 contains a priority claim number originating in the first country, the same subject in the first and second countries can be found by searching for that number in the first country. Once the same subject is found, its differing notations in the second country may be unified into any one of them.
That is, where applicant e of the first country is written as f, g, or h in the second country, one of f, g, and h (for example, f) may be selected as the unified notation. For example, suppose the priority numbers of "Sumcom Inc." of Application No. #1 and "Qualcomm Inc." of Application No. #2 are US Application No. #1 and US Application No. #2, respectively. If the applicants in those US documents are judged to be the same organization (for example, both are written as Qualcomm Incorporated), then "Sumcom Inc." of Application No. #1 and "Qualcomm Inc." of Application No. #2 are regarded as the same organization, and representative naming is performed on both.
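A minimal sketch of grouping second-country notations by the home-country applicant found through the priority claim number is shown below. The input shapes (a list of notation and priority number pairs, and a lookup from priority number to home-country applicant), the function name, and the identifiers such as "US-1" are hypothetical; in the described system this information would come from the patent document mast DBs or a search engine.

```python
def group_by_priority_applicant(second_country_docs, home_applicant_by_number):
    """Group second-country applicant notations by home-country applicant.

    second_country_docs: iterable of (applicant notation, priority number).
    home_applicant_by_number: maps a priority number to the applicant
    name recorded on the home-country application.
    """
    groups = {}
    for notation, priority_number in second_country_docs:
        home = home_applicant_by_number.get(priority_number)
        if home is not None:  # documents without a resolvable priority are skipped
            groups.setdefault(home, set()).add(notation)
    return groups


docs = [("Sumcom Inc.", "US-1"), ("Qualcomm Inc.", "US-2"), ("Sony", "JP-9")]
home = {"US-1": "Qualcomm Incorporated", "US-2": "Qualcomm Incorporated"}
groups = group_by_priority_applicant(docs, home)
```

Each resulting group holds the notations that may be unified into one of its members; the rare case in which the priority right has been transferred is not handled in this sketch.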

Next, applicant representative naming that is needed for other reasons, such as a change of company name, is described.

The configuration used for this includes an applicant representative name preprocessing engine 3610, which is the core engine containing the algorithm for processing the representative name of the applicant, a representative name rule DB 3630 containing the rules for processing the representative name, and an applicant history DB 3650 containing information on changes to the applicant's name. In particular, the applicant history DB 3650 may include license change information, and may include changes of name resulting from a company name change, merger, or division. Processing the representative name also requires a representative name standard DB 3670 that serves as the reference for representative naming. This includes a treaty priority information DB 3671, a PAJ DB 3673, a KPA DB 3675, an Inpadoc DB 3677, and a family information DB 3830. To resolve notation problems in particular, it is preferable to make full use of the PAJ DB 3673 and the KPA DB 3675. The representative applicant information is stored in the applicant representative name DB 3690. The applicant representative naming performed by the applicant representative name preprocessing module 301-4-1-1 is described below.

An exemplary method by which the representative name preprocessing module performs applicant representative naming is illustrated in FIG. 98.

The representative name preprocessing module removes the non-identifying elements from the applicant name information (S4820), obtains the company name change information (S4830), and finds all documents in the applicant's name before the change and in the applicant's name after the change for a specific period (S4840). It then obtains the address information of the documents before and after the change and compares them (S4850). If the address information is the same, the names are represented as the same applicant. If the address information is not the same, the representatives are compared (S4860), and if the representative is the same, the names are represented as the same applicant. If the representative is not the same, the IPC main groups are extracted from the applications in the name before the change (S4870) and from the applications in the name after the change (S4880), and if the two sets of extracted IPC main groups have anything in common, the names are represented as the same applicant (S4890).
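The cascade of comparisons for a company name change (address first, then representative, then IPC main group) can be sketched as follows. The `NameProfile` structure, its field names, and the sample values are hypothetical; they stand in for the information gathered from the documents filed under each name during the period.

```python
from dataclasses import dataclass


@dataclass
class NameProfile:
    """Information collected from the documents filed under one name."""
    addresses: set
    representatives: set
    ipc_main_groups: set


def same_applicant_after_rename(before: NameProfile, after: NameProfile) -> bool:
    """Decide whether the old and new names denote the same applicant."""
    if before.addresses & after.addresses:
        return True   # address information matches
    if before.representatives & after.representatives:
        return True   # representative matches
    # Last resort: any IPC main group shared by both sets of applications.
    return bool(before.ipc_main_groups & after.ipc_main_groups)


old = NameProfile({"a1"}, {"Kim"}, {"H04L 12"})
new = NameProfile({"c1"}, {"Lee"}, {"H04L 12", "G06F 17"})
unrelated = NameProfile({"z9"}, {"Park"}, {"A61K 31"})
```

The ordering reflects the flow in the text: the weaker IPC evidence is consulted only when the address and representative comparisons both fail.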

The representative naming process is now described in more detail. The company information DB of the present invention includes 1) company identification information such as the company name, address, CEO, corporate registration number, and business registration number, 2) financial information, 3) information on products or items handled, 4) change information such as company name changes, mergers, and the like, and 5) disclosure information. The applicant representative name preprocessing module 301-4-1-1 may unify differently written applicant names under the company name recognized as identical as of at least one specific point in time (for example, the present). The representative name of the applicant is preferably stored in a separate field rather than overwriting the patent document mast DB 202, which consists of the information obtained from the patent publications. The first type of representative naming is performed as follows.

First, organization types, which are non-identifying elements such as "Corporation", "Kabushiki Kaisha", "Inc.", "ltd", and "co", are removed from the applicant notation. These organization types need to be stored and managed separately in an organization type table, and the frequency with which each organization type is found should also be stored.

Second, company name changes are obtained from the company information DB. Assume that at time A the company name is a and the address is a1, and that at another time B the company name is b and the address is b1. Using the period from time A to time B as the period information, all documents in the applicant name a are found through the patent document mast DB 202 or a search engine. If a document's address information is a1 (or b1, since the address may have changed before the company name did), it is identified as belonging to the same applicant. If the address information is some other address c1 to cn, the document is tagged and stored. Likewise, all documents in the applicant name b during the period from time A to time B are found through the patent document mast DB 202 or a search engine. If a document's address information is a1 or b1, it is identified as belonging to the same applicant, and if the address information is c1 to cn, the document is tagged and stored. Even when the address information differs (c1 to cn), the document is classified as belonging to the same applicant if the representative director's name is the same (patent documents filed in certain periods include representative director information), or if the patent classification code of the document in the name b matches, at the subclass and/or subgroup level, a classification found in the patent applications in the name a.

Third, patent information is obtained by querying the search engine or the patent document mast DB 202 with the (company name, address) set identified for each point in time.

Fourth, a union operation is performed on the document sets identified in the second and third steps, and representative naming is performed based on the most recent company name. It is preferable that 1) the company name at each point in time and/or 2) each corresponding patent document be mapped to the representative company name. If published information on the patent documents of the representatively named company is obtained, the applicant representative name may be updated (for example, when patent information is included in the company's disclosure information or in the company information DB).

For example, when SONY, a company based in Japan, files in Korea, its name may be written in various ways, such as different transliterations of "Sony Kabushiki Kaisha", and these can be unified into a single notation. The same holds when a US applicant files in Korea or Japan. In other words, for patent applications that cross a border, the priority claim number allows the subject in the first country and the subject written in various notations in the second country to be identified as the same and given a representative name.

The representative naming described so far unifies names country by country after removing the organization type from the applicant notation. To take representative naming further, the organization name itself must also be unified. For example, in the United States there are more than a dozen organization name notations that could arguably be recognized as Samsung Electronics, and the same holds for other applicant names. Examples for Samsung Electronics include: SAM-SUNG ELECTRONICS, SAM SUNG ELECTRONIC, SAM SUNG ELECTRONICS, SAMASUNG ELECTRONICS, SAMAUNG ELECTRONICS, SAMGSUNG ELECTRONICS, SAMSUG ELECTRONICS, SAMSUGN ELECTRONICS, SAMSUMG ELECTRONICS, SAMSUN ELECTRONIC, SAMSUN ELECTRONICS, SAMSUNE ELECTRONICS, SAMSUNG ELECTRONICS, SAMSUNG EELCTRONICS, SAMSUNG ELCTRONICS, SAMSUNG ELECETRONICS, SAMSUNG ELECRONICS, SAMSUNG ELECRTONICS, SAMSUNG ELECTONICS, SAMSUNG ELECTORNICS. By type, these involve omitted letters, typos, added letters, spaces, and special characters such as "-". Whether to map them to Samsung Electronics and treat Samsung Electronics as their representative name is a policy matter, and it is reasonable to apply strict standards and give the name Samsung Electronics only to those that pass. This problem can be mitigated to some extent by the means described above, such as priority information, but it is more appropriate to add a process such as the editing distance algorithm described below. First, the logic of the editing distance is explained. Given a first string and a second string, the editing distance is a numerical measure of how many unit operations must be applied to the first string to convert it into the second string. Unit operations include adding, deleting, or changing a letter, and adding, deleting, or changing a space or special character.
When the first string is "Samsung electronics", the editing distances to several of the strings in the above example are as shown in Table 8.

TABLE 8

First string | Second string | Edit distance | Remarks
Samsung electronics | SAMSUNG ELECTRONICs | 0 | Same
Samsung electronics | SAMSUNG ELECTRONIC | 1 | s missing
Samsung electronics | SAM SUNG ELECTRONIC | 2 | Spacing, s missing
Samsung electronics | SAMSUN ELECTRONIC | 2 | g missing, s missing
Samsung electronics | Jamsung Electronics | 1 | First letter difference
Samsung electronics | LG Electronics | 7 | First letter difference
Samsung electronics | Sony | 17 | Large editing distance relative to length
Samsung electronics | SAMSUNG EECTRONICS | 1 | e added

Given a set of organization names, one string from the set (preferably one with a high frequency) is selected as the first string, and the editing distance is calculated with each of the other names in the set as the second string. Whether the first characters match, and the editing distance relative to the string length, may also be calculated. Whether two strings are regarded as the same organization name or as different organization names is then decided by an editing policy. For example, the editing policy may be as follows. When the editing distance is 0, the names are the same. When the editing distance is 1, the names are the same unless the first character differs, provided that the editing distance relative to the length of the entire string is within a set limit (for example, within 30%); otherwise they are regarded as different organization names. When the editing distance is 2, the names are the same if the editing distance relative to the length of the entire string is within a set limit (for example, within 10%, or within 20% when a space or special character has been inserted); otherwise they are regarded as different names. When the editing distance is 3 or more, the names are regarded as different unless a special condition applies. Such an exception, for example, changes the editing distance between LG Chemical and LG Chem from 3 to 0. These special conditions are typically abbreviations (for example, University = Univ) and acronyms (for example, IBM = International Business Machines).

When applying the policy as described above, each string shown in Table 8 may be processed as shown in Table 9.

TABLE 9

First string | Second string | Represented as first string? | Remarks
Samsung electronics | SAMSUNG ELECTRONICS | O |
Samsung electronics | SAMSUNG ELECTRONIC | O | Length ratio 6% (1/18)
Samsung electronics | SAM SUNG ELECTRONIC | O | Length ratio 11% (2/18), space exception rule applies
Samsung electronics | SAMSUN ELECTRONIC | X | Length ratio 11%
Samsung electronics | Jamsung Electronics | X | First letter difference
Samsung electronics | LG Electronics | X | First letter difference
Samsung electronics | Sony | X | 94%
Samsung electronics | SAMSUNG EECTRONICS | O |
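The editing distance and the example editing policy can be sketched as follows. Several points are assumptions rather than statements of the specification: the distance is the standard case-insensitive Levenshtein distance (which may count some pairs slightly differently from Table 8), the length used for the ratio is the first string without spaces or special characters (consistent with the "1/18" remark), a space or special character is considered involved when stripping such characters lowers the distance, and the abbreviation and acronym exceptions are not modeled.

```python
import re


def edit_distance(a: str, b: str) -> int:
    """Case-insensitive Levenshtein distance (insert, delete, substitute)."""
    a, b = a.lower(), b.lower()
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def strip_separators(s: str) -> str:
    """Lower-case and drop spaces and special characters."""
    return re.sub(r"[\W_]+", "", s.lower())


def same_organization(first: str, second: str) -> bool:
    """Apply the example policy: thresholds of 30%, 10%, and 20%."""
    d = edit_distance(first, second)
    if d == 0:
        return True
    if d >= 3 or not first or not second:
        return False  # special conditions (abbreviations) not modeled
    ratio = d / max(len(strip_separators(first)), 1)
    if d == 1:
        # Same unless the first character differs, within the 30% limit.
        return first[0].lower() == second[0].lower() and ratio <= 0.30
    # d == 2: looser limit when a space or special character is involved.
    separator_involved = edit_distance(strip_separators(first),
                                       strip_separators(second)) < d
    return ratio <= (0.20 if separator_involved else 0.10)
```

With "Samsung electronics" as the first string, this sketch reproduces the O/X decisions of Table 9 for the listed second strings. The thresholds are ordinary parameters, so the periodic tuning of the editing policy amounts to adjusting them and reviewing the resulting groupings.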

Whether the organization name is representative or not may vary greatly according to the editing policy. By viewing the result of the representative name according to the editing policy, the editing policy may be tuned periodically.

In the case of Korean, the organization name may be decomposed into its letters (jamo) before the editing distance is calculated. For example, the two names rendered here as "Qualcomm" and "Sumcom" become "ㅋ ㅜ ㅓ ㄹ ㅋ ㅓ ㅁ" and "ㅋ ㅗ ㅏ ㄹ ㅋ ㅗ ㅁ", respectively, or, if the decomposition policy treats a compound vowel as a single unit, "ㅋ ㅝ ㄹ ㅋ ㅓ ㅁ" and "ㅋ ㅘ ㄹ ㅋ ㅗ ㅁ". If representative naming is then decided by the editing policy applied to the decomposed names, the two become different organization names. On the other hand, the organization name of "communication" becomes the same organization name.
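One way to obtain such a decomposition is Unicode normalization form NFD, which splits each precomposed Hangul syllable into an initial consonant, a vowel, and an optional final consonant. The sketch below is an assumption about tooling, not part of the specification. The two Hangul strings are reconstructed from the jamo sequences given in the text, and NFD keeps a compound vowel such as ㅝ as a single unit (the second policy mentioned above).

```python
import unicodedata


def to_jamo(name: str) -> str:
    """Decompose precomposed Hangul syllables into conjoining jamo (NFD)."""
    return unicodedata.normalize("NFD", name)


a = to_jamo("퀄컴")  # initial, vowel, final for each of the two syllables
b = to_jamo("콸콤")

# Positions at which the two decomposed names differ (equal lengths here).
differing = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
```

At the syllable level both syllables differ, whereas after decomposition only the two vowels differ out of six letters, so the editing distance relative to length becomes much smaller. An editing distance function can be applied to the decomposed strings in the same way as to Latin-script names.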

At this time, the priority between representative naming by edit distance and representative naming by processing based on the priority claim may be an issue. It is desirable that representative naming by edit distance be supplementary (i.e., subordinate). That is, it may be preferable to apply representative naming by edit distance only to names that have not been given a representative name by other methods.

The applicants given representative names as described above are stored in the representative applicant name DB of the present invention. The representative applicant name DB includes an identification code for each applicant, the representative name notation for each country, and the full set of notations of the applicant for each country. A global representative name may also be specified; it may be in any language, but English may be preferable. Since patent data is obtained at the national level, representative naming is performed on a per-country basis. However, where a plurality of standard languages are recognized within a single jurisdiction, as in the EU, representative naming may be performed for each language. In this case, if the language is indicated separately, as in the EU, representative naming may be performed for each language; if there is no language indication, it may be performed based on the standard language of the country. Differences also arise between American English and British English, for example Center / Centre.

When the applicant representative name is processed as described above, representative name data as shown in Table 10 are generated.

TABLE 10

Global ID | Global representative name | First country representative name | First country name set | N-th country representative name | N-th country name set
1111 | Samsung electronics | Samsung | Samsung Electronics | Samsung electronics | Samsung Electronics, Samsung Electronic, Samsuung Electronics
2222 | Qualcomm | Qualcomm | Qualcomm | Qualcomm | Qualcomm, Qualcom
3333 | LG Electronics | LG Electronics | LG Electronics, Venus | LG Electronics | LG Electronics, LG Electronic, L G Electronics

The n-th country name set further includes incorrect names, such as names with stray spaces and typos. The representative naming data may store, as the representative name for the n-th country, a name that combines the representative name with a representative organization type (for example, Samsung Electronics Co., Ltd.). Alternatively, the first country name set may contain names that include the various organization type notations (e.g., ^Samsung Electronics Co^, ^Samsung Electronics Co. ltd^, ^Samsung Electronics Co., ltd^, etc.). In this case the name set contains many elements, but when an arbitrary applicant name is obtained, the speed and accuracy of assigning the n-th country representative name or the global representative name are increased.

Next, representative naming of inventors will be described. Representative naming of inventor names includes 1) representative naming within one organization, 2) representative naming across organizations, and 3) representative naming of inventor names expressed in two or more languages across countries. These are explained one by one below. The inventor representative naming preprocessing module 301-4-1-2 of the present invention maps one inventor name under one organization name stem to one inventor, as long as no two inventors under that organization name stem share the same name within one country. Note that an inventor's name may include punctuation, as in the United States. If an applicant is a large company, there may be different inventors with the same name; in that case they can be distinguished using address information.

Representative naming across organizations within a country is performed using the representative name and the address information. The inventor's address may change, for example when the inventor moves, but an inventor with the same name under the same organization name stem is regarded as the same inventor. That is, one inventor corresponds to one or more codes (an organization-name-stem-based code, a country-level code, a cross-country code, etc.), and two or more pieces of inventor identification information may correspond to a code within one country ("inventor name (organization name stem a) + address A", "inventor name (organization name stem a) + address B", etc.).

If a patent application invented by a Korean inventor is also filed in the United States, representative naming of the inventor's name in two or more languages is required. Here, the priority claim number or the like is used as a key value to establish the identity of documents filed in two or more countries, and the Korean name and the English name are matched based on the documents whose identity is established. If the English notation of the inventor is known in the first country (Korea), that information is used for the mapping. If the English notation is not known in the first country, then 1) if the patent document has only one inventor, the names are mapped directly, and 2) if there are two or more inventors, the best-matching inventor is matched first using language conversion rules such as Hangul-to-Roman conversion. This matching may draw on all applications with the same organization name stem that include a particular inventor. For example, suppose documents #1 and #2 are each filed in both the first and second countries, document #1 has three inventors (A, B, C), and document #2 has one inventor (A). Inventor A can be resolved first from document #2; if other documents provide a clue that distinguishes inventors B and C, that clue is used to resolve one of them, and the remaining inventor's Korean notation can then be mapped automatically to the English notation in document #1.
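The elimination step in the example above (resolve the single-inventor document first, then use what is already known to resolve the rest) can be sketched as follows. This covers only the elimination part and omits the Hangul-to-Roman matching; the function name, the data layout (pairs of name lists for documents already linked by priority number), and all inventor names are hypothetical. It also assumes that names are unique within one organization name stem.

```python
def map_inventor_names(linked_documents):
    """Map first-country names to second-country names by repeated elimination.

    linked_documents: list of (native_names, english_names) pairs, one pair per
    document family already linked through its priority claim number.
    """
    mapping = {}
    changed = True
    while changed:
        changed = False
        for native_names, english_names in linked_documents:
            # Ignore names that have already been resolved elsewhere.
            native_left = [n for n in native_names if n not in mapping]
            used = set(mapping.values())
            english_left = [e for e in english_names if e not in used]
            # Exactly one unresolved name on each side: they must correspond.
            if len(native_left) == 1 and len(english_left) == 1:
                mapping[native_left[0]] = english_left[0]
                changed = True
    return mapping


# Hypothetical data: document #1 has three inventors, document #2 has one,
# and a third document separates the remaining two.
linked = [
    (["김철수", "이영희", "박민수"], ["Kim Chulsoo", "Lee Younghee", "Park Minsoo"]),
    (["김철수"], ["Kim Chulsoo"]),
    (["이영희", "김철수"], ["Kim Chulsoo", "Lee Younghee"]),
]
result = map_inventor_names(linked)
```

Names that cannot be separated by elimination simply remain unmapped in this sketch; in the described system they would be candidates for the language conversion rules.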

The inventors given representative names as described above are stored in the representative inventor name DB of the present invention, which includes an identification code and the notation for each country.

Next, the agent representative name preprocessing module 301-4-1-3 will be described. Typically, representative naming of agents is performed within one country; if an agent is active in more than one country, the agent is handled in the same manner as an applicant. If the agent is an organization such as a law firm, it may be treated exactly like the non-natural-person applicant described above (provided the name can be separated into an organization name stem and an organization type notation). On the other hand, when the agent is not an organization, the names of several natural-person agents are listed, and it becomes important to determine which single office organization they represent (for example, in the Republic of Korea such an office may be a partnership under civil law, in which case the agents are listed under the members' names rather than the name of the partnership). The office organization can be represented as an anonymous abstract entity (it is enough that one code can be assigned to it; the name of the office need not be determined). In some applications all agents of an office organization are listed, in others only a few, selected in various ways, and in some cases only one. The question is then which agents to assign to the office organization code as an abstract entity. To solve this problem, common data mining techniques are applied, such as frequent pattern mining (e.g., Apriori), association analysis, and correlation analysis. Alternatively, various published clustering algorithms can be applied. Many such algorithms are publicly known, and their principle of operation is briefly as follows. By analyzing the actual combinations of agents in each document, the agents most likely to work together with a particular agent are found, and the frequency and probability with which each pair (particular agent, co-occurring agent) appears are calculated.
Clustering, on the other hand, calculates a distance between agents and groups together the agents whose average distance falls within a predetermined distance. In other words, agents that appear together in one document are given a predetermined minimum distance for that document, and the distance is calculated for all agents over all documents belonging to a specific document set. A cluster of agents whose average distance from a particular agent falls within a preset criterion can then be found, and the code, as an abstract entity, is assigned to the agents belonging to that cluster.
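The pair-frequency analysis described above can be illustrated with a short sketch. It is not a full Apriori or clustering implementation; it only counts how often agents appear together and derives a conditional frequency (often called confidence in association analysis). The function names, the 0.5 threshold, and the agent labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_stats(documents):
    """Count documents per agent and documents per unordered agent pair."""
    single = Counter()
    pair = Counter()
    for agents in documents:
        unique = sorted(set(agents))
        single.update(unique)
        pair.update(combinations(unique, 2))
    return single, pair


def likely_partners(agent, single, pair, min_confidence=0.5):
    """Agents that appear with `agent` in at least `min_confidence` of its documents."""
    partners = {}
    for (a, b), n in pair.items():
        if agent in (a, b):
            other = b if a == agent else a
            confidence = n / single[agent]
            if confidence >= min_confidence:
                partners[other] = confidence
    return partners


# Hypothetical agent lists, one list per application.
documents = [["A", "B", "C"], ["A", "B"], ["A"], ["D", "E"], ["B", "C"], ["D", "E"]]
single, pair = cooccurrence_stats(documents)
partners_of_a = likely_partners("A", single, pair)
partners_of_d = likely_partners("D", single, pair)
```

In this toy data, agent A appears in three documents and shares two of them with B, so B is a likely partner of A, while C (one shared document out of three) is not. Agents whose mutual partner sets overlap strongly would then be candidates for the same office organization code.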

The representative naming of agents described above is performed in the agent representative name preprocessing module 301-4-1-3 of the present invention, and the processed information is stored in the representative agent name DB.

Representative Phrase Extraction Preprocessing Module

Concept of representative phrase

Next, the representative phrase extraction module of the present invention will be described. First, the concept of a representative phrase is explained. A representative phrase is a phrase, consisting of one word or of two or more words, that can represent a specific document as a whole or a part of that document. The basic attribute of a representative phrase in the present invention is that it is quite rare in a given representative phrase extraction entire document set, while it appears frequently, above a predetermined criterion, in a given representative phrase extraction target document set (which may consist of a single document), in a part of a target document, or in the common part of each document in the target document set.

Examples of the representative phrase extraction entire document set include:
1) a patent document set specified by a specific patent classification code of a specific patent classification code system in a specific country DB;
2) a patent document set specified by a specific applicant name in a specific country DB;
3) a patent document set specified by the name of a specific inventor included in the patent documents of a specific applicant (i.e., by applicant name and inventor name) in a specific country DB;
4) a patent document set specified by a specific agent name in a specific country DB;
5) a patent document set specified by a specific patent classification code of a specific patent classification code system and a specific applicant name in a specific country DB;
6) a patent document set specified by a specific patent classification code of a specific patent classification code system, a specific applicant name, and a specific inventor name in a specific country DB;
7) a patent document set specified by a specific applicant name and a specific agent name in a specific country DB;
8) the set of all patent documents of a specific country;
9) the set of all patent documents of at least two countries; and
10) any of the patent document sets of 1) to 9) above restricted to a set period.

On the other hand, the representative phrase extraction target document set may be any subset of the representative phrase extraction entire document set, ranging from a single document to the subset of all documents that share at least one specific attribute (for example, a specific patent classification code such as IPC, applicant, period, country, agent, inventor, or a combination of one or more of these).

The representative phrase extraction target document set is a concept relative to the representative phrase extraction entire document set. Any subset of the entire document set suffices, but a smaller subset is preferable. Examples of the target document set include one specific application, all applications of a specific company, the applications of a specific inventor, the applications of a specific inventor in a specific IPC, all applications of a specific company in a specific IPC, and all applications of a specific IPC in a specific year.

On the other hand, examples of a part of a document from which representative phrases are extracted are the claims, the independent claims, the dependent claims, and the sections on the effect or industrial applicability of the invention. The common part of each document of a given representative phrase extraction target document set would then be, for example, the claims of all the documents constituting the target document set.

The representative phrase may be 1) a phrase consisting only of nouns, 2) a phrase consisting only of nouns and verbs, or 3) a phrase that includes an adjective or an adverb. The length of the representative phrase is preferably 1 to 5 words, more preferably 2 to 3 words. A technical concept can be expressed by one word, but in many cases a technical concept, technical action, or effect is expressed by two or three words.

Next, the representative phrase extraction process will be described in detail. Representative phrase extraction is performed by the representative phrase extraction preprocessing module of the present invention, which includes 1) a morphological analysis engine for each language, 2) a phrase generation engine, 3) a phrase-specific counting engine, 4) various dictionary databases including a thesaurus, and 5) a representative phrase extraction engine. The module is illustrated in FIG. 8. It comprises the representative phrase extraction preprocessing engine 3710, which includes an algorithm for extracting representative phrases; the representative phrase extraction policy DB 3730, which holds the policies on which extraction is based; and the representative phrase-frequency-application number correspondence DB 3750, which holds, for each extracted representative phrase, counting or calculation information such as its frequency value and information on the position (field, etc.) from which it came. In addition, a thesaurus DB 3770 for processing synonyms and near-synonyms may be used, and a representative phrase translation DB 3790, containing translation information for representative phrases obtained with a translation system for two or more languages or with a pre-translated dictionary, may further be included. Hereinafter, the representative phrase extraction preprocessing engine of the present invention will be described in more detail.

A morpheme is the minimal language element that makes up a word. The most familiar examples are parts of speech such as nouns, pronouns, numerals, adjectives, verbs, adverbs, determiners, and interjections, as well as grammatical elements such as particles, word endings, and the like. Morphemes are divided into real morphemes, which carry substantive meaning, and functional morphemes, which carry grammatical function, and separating the two is the first step of morphological analysis for index word extraction. In short, a morpheme is the minimal language element that cannot be divided further into units with their own meaning or function. The functions that a morphological analysis engine should provide are the identification and characterization of string types, electronic dictionary lookup, the application of various rules (e.g., determination rules such as verb determination and part-of-speech transition determination), and compound noun processing. Since morphological analysis engines for each language are well-known technology, a description of their operation is omitted. The morphological analysis engine analyzes a given sentence or phrase into morphemes and outputs the real morphemes.

The phrase generation engine of the present invention receives the output real morphemes and generates sequences of 1 to 5 (preferably 2 to 3) real morphemes. For example, suppose that when one sentence of a patent document is input, the real morphemes output are, in order, a, b, c, d, e, f, g. If the phrase generation unit of the phrase generation engine is three words, the engine generates the phrases abc, bcd, cde, def, and efg. The phrase generation engine generates phrases sentence by sentence for a given document or part of a document. As an example of generating phrases for a part of a document, phrases may be generated from the sentences of the claims. Methods of generating the sequences include: 1) processing one sentence at a time and, when n real morphemes are extracted from the sentence, generating phrases using the combinations nC2, nC3, nC4, and nC5 of the n real morphemes (for example, applying nC2 generates ab, ac, ad, ae, af, ag, bc, ..., fg); 2) when there are m real morphemes in a document, applying method 1) at the document level; and 3) arranging two to five real morphemes sequentially (for example, arranging two real morphemes sequentially yields ab, bc, cd, de, ef, fg, and arranging three yields abc, bcd, cde, def, efg; phrases of both two and three real morphemes may of course be used together).
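The two generation methods described above, sequential arrangement and nCk combinations, can be sketched as follows using the example morphemes a to g. The function names are hypothetical, and the input is assumed to be the list of real morphemes already output by a morphological analyzer for one sentence.

```python
from itertools import combinations


def sequential_phrases(morphemes, size):
    """Method 3): slide a window of `size` adjacent real morphemes over the sentence."""
    return ["".join(morphemes[i:i + size]) for i in range(len(morphemes) - size + 1)]


def combination_phrases(morphemes, size):
    """Method 1): every nCk selection of `size` real morphemes, order preserved."""
    return ["".join(combo) for combo in combinations(morphemes, size)]


morphemes = ["a", "b", "c", "d", "e", "f", "g"]

trigrams = sequential_phrases(morphemes, 3)   # abc, bcd, cde, def, efg
bigrams = sequential_phrases(morphemes, 2)    # ab, bc, cd, de, ef, fg
pairs = combination_phrases(morphemes, 2)     # ab, ac, ..., fg (7C2 = 21 phrases)
```

The sequential method grows linearly with sentence length, while the combination method grows combinatorially (7 morphemes already give 21 pairs and 35 triples), which is one practical reason to apply it per sentence rather than per document.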

On the other hand, the real morphemes that have passed through the morphological analyzer are also called terms. The real morphemes or terms obtained may vary depending on the settings of the morphological analyzer. Examples of such settings include 1) separating only the spoken word, 2) separating only the word and extracting only its representative form, and 3) extracting only the representative form in the case of adjectives or adverbs.

Subsequently, the phrase-specific counting engine generates, at a minimum, (phrase, document number) or (phrase, counting value) information for each generated phrase; preferably (phrase, counting value, document number) or (phrase, document number, name of the field in the document where the phrase is located (e.g., claims)) information; and more preferably (phrase, counting value, document number, name of the field in the document where the phrase is located (e.g., claims)) information. The phrase-specific counting engine stores the generated information in the representative phrase DB 207-2. Table 11 below shows an example of the data included in the representative phrase DB 207-2, generated for specific phrases by country.

TABLE 11

Phrase ID | Phrase | Phrase word count | Cumulative counting value in document (based on document number) | Country code | Document number | Phrase location | ... | Phrase absolute ID
1 | abc | 3 | 1 | KR | 10-2003-0012345 | Claims | | 1
1 | abc | 3 | 2 | KR | 10-2003-0012345 | Claims | | 2
1 | abc | 3 | 3 | KR | 10-2003-0012345 | Detailed description of the invention | | 3
1 | abc | 3 | 4 | KR | 10-2003-0012345 | Detailed description of the invention | | 4
1 | abc | 3 | 5 | KR | 10-2003-0012345 | Detailed description of the invention | | 5
1 | abc | 3 | 6 | KR | 10-2003-0012345 | Detailed description of the invention | | 6
1 | abc | 3 | 7 | KR | 10-2003-0012345 | Detailed description of the invention | | 7
2 | bcd | 3 | 1 | KR | 10-2003-0012345 | Claims | | 8
2 | bcd | 3 | 2 | KR | 10-2003-0012345 | Claims | | 9
2 | bcd | 3 | 3 | KR | 10-2003-0012345 | Claims | | 10
2 | bcd | 3 | 4 | KR | 10-2003-0012345 | Detailed description of the invention | | 11
2 | bcd | 3 | 5 | KR | 10-2003-0012345 | Detailed description of the invention | | 12
2 | bcd | 3 | 6 | KR | 10-2003-0012345 | Detailed description of the invention | | 13
2 | bcd | 3 | 7 | KR | 10-2003-0012345 | Detailed description of the invention | | 14
... | ... | ... | ... | ... | ... | ... | ... | ...
1 | abc | 3 | 1 | KR | 10-2003-0056789 | Claims | | 150
1 | abc | 3 | 2 | KR | 10-2003-0056789 | Detailed description of the invention | | 151
1 | abc | 3 | 3 | KR | 10-2003-0056789 | Detailed description of the invention | | 152
... | ... | ... | ... | ... | ... | ... | ... | ...

It can be seen that the data shown in Table 11 were generated from patent documents 10-2003-0012345 and 10-2003-0056789 and that the phrase generation engine generated phrases in units of three real morphemes. The phrase abc appears in 10-2003-0012345 twice in the claims and five times in the detailed description of the invention, and the phrase bcd appears three times in the claims and four times in the detailed description of the invention. The phrase abc also appears in patent document 10-2003-0056789 once in the claims and twice in the detailed description of the invention. The phrase-specific counting engine of the present invention can thus generate data as shown in Table 11 for all obtained patent documents, repeatedly or recursively. The phrase generation engine can likewise generate phrases in units of two real morphemes, for which the counting engine can generate data in the same form as Table 11, and the same applies to units of one, four, or five real morphemes.
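A minimal sketch of how records in the shape of Table 11 could be produced is given below. It assumes the input has already been reduced to real morphemes per sentence and per field; the function name, the dictionary keys, and the sample document are hypothetical, and phrases are built with the sequential method using three morphemes.

```python
from collections import defaultdict


def count_phrases(documents, size=3):
    """Emit one record per phrase occurrence, with a running count per document.

    documents: {document_number: {field_name: [list of morphemes per sentence]}}
    """
    records = []
    running = defaultdict(int)  # (phrase, document_number) -> occurrences so far
    for document_number, fields in documents.items():
        for field_name, sentences in fields.items():
            for morphemes in sentences:
                for i in range(len(morphemes) - size + 1):
                    phrase = "".join(morphemes[i:i + size])
                    running[(phrase, document_number)] += 1
                    records.append({
                        "absolute_id": len(records) + 1,  # unique per occurrence
                        "phrase": phrase,
                        "count_in_document": running[(phrase, document_number)],
                        "document_number": document_number,
                        "field": field_name,
                    })
    return records


# Hypothetical, already-analyzed document.
documents = {
    "10-2003-0012345": {
        "claims": [["a", "b", "c", "d"]],
        "description": [["a", "b", "c"]],
    }
}
records = count_phrases(documents)
```

Each record corresponds to one row of the table: the running count restarts for every document, while the absolute ID keeps increasing across the whole run.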

In this case, generating the phrase ID may be an issue. Methods of generating phrase IDs are routine in general DB technology, but some examples are given. In the first method, an ID is assigned to the first phrase generated; for each phrase generated subsequently, the existing ID is assigned if the phrase already has one, and a new sequential ID is assigned otherwise. This process is performed repeatedly or recursively for every phrase obtained. In the second method, a temporary ID is sequentially assigned to every phrase obtained, without assigning phrase IDs, to generate data as shown in Table 11; the phrases are then read one by one and the same official ID is assigned to identical phrases. This process is performed repeatedly or recursively for the phrases that have not yet been assigned an official ID.

The phrase-specific counting engine of the present invention may generate a phrase absolute ID every time a phrase is generated and processed and assign it to every phrase.
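The difference between the phrase ID (shared by identical phrases) and the phrase absolute ID (unique to each occurrence) can be shown with a small sketch of the first ID generation method. The function name and the tuple layout are hypothetical.

```python
def assign_phrase_ids(phrases):
    """Return (absolute_id, phrase_id, phrase) for each occurrence, in input order."""
    phrase_ids = {}  # phrase text -> phrase ID, assigned on first appearance
    rows = []
    for absolute_id, phrase in enumerate(phrases, start=1):
        # Reuse the existing ID for a known phrase, otherwise issue the next one.
        phrase_id = phrase_ids.setdefault(phrase, len(phrase_ids) + 1)
        rows.append((absolute_id, phrase_id, phrase))
    return rows


rows = assign_phrase_ids(["abc", "bcd", "abc", "cde", "bcd"])
```

Here the five occurrences receive absolute IDs 1 to 5, while only three phrase IDs are issued, so the third occurrence (abc again) reuses phrase ID 1.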

In addition, the phrase-specific counting engine of the present invention may generate a cumulative counting value for each phrase based on a target document set. Within the target document set, the cumulative counting value may be calculated per document, as in the table above, and cumulative counting is clearly also possible within a specific field (for example, the claims) of a single document.

In addition, the phrase-specific counting engine of the present invention may generate a cumulative counting value for each phrase based on the entire document set and store it as shown in Table 12. (In this example the entire document set is assumed to consist of patent documents 10-2003-0012345 and 10-2003-0056789, and the values are derived from the table above; in practice the present invention applies to a much larger document set.)

TABLE 12

Phrase ID | Phrase | Cumulative counting value (based on entire document set) | Related phrase absolute ID
... | ... | ... | ...
1 | abc | 10 | 1, 2, 3, 4, 5, 6, 7, 150, 151, 152
2 | bcd | 7 | 8, 9, 10, 11, 12, 13, 14
... | ... | ... | ...

In addition, the phrase-counting engine of the present invention may generate a cumulative counting value for each phrase, as shown in Table 13 below, based on a specific field (eg, a claim) on a document.

TABLE 13

Phrase ID | Phrase | Cumulative counting value (based on claims) | Related phrase absolute ID
... | ... | ... | ...
1 | abc | 3 | 1, 2, 150
2 | bcd | 3 | 8, 9, 10
... | ... | ... | ...
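The cumulative values of Tables 12 and 13 follow directly from the occurrence records of Table 11. The sketch below rebuilds those occurrence records in a compact, hypothetical form (absolute ID, phrase, field) and aggregates them either over the entire document set or over one field; the function name and field labels are illustrative only.

```python
from collections import defaultdict


def cumulative_counts(records, field=None):
    """Return {phrase: (count, [absolute IDs])}, optionally limited to one field."""
    ids_by_phrase = defaultdict(list)
    for absolute_id, phrase, field_name in records:
        if field is None or field_name == field:
            ids_by_phrase[phrase].append(absolute_id)
    return {phrase: (len(ids), ids) for phrase, ids in ids_by_phrase.items()}


# Occurrence records corresponding to the rows of Table 11.
records = (
    [(i, "abc", "claims") for i in (1, 2)]
    + [(i, "abc", "description") for i in range(3, 8)]
    + [(i, "bcd", "claims") for i in range(8, 11)]
    + [(i, "bcd", "description") for i in range(11, 15)]
    + [(150, "abc", "claims"), (151, "abc", "description"), (152, "abc", "description")]
)

whole_set = cumulative_counts(records)             # corresponds to Table 12
claims_only = cumulative_counts(records, "claims")  # corresponds to Table 13
```

Keeping the list of related absolute IDs next to each count makes it possible to trace any aggregate value back to the individual occurrences, documents, and fields it came from.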

The representative phrase extraction engine of the present invention will now be described. When (phrase, document number) data extracted from the representative phrase extraction entire document set exist, it is easy to generate (phrase, counting value, document number) data as shown in Table 14 below, by increasing the counting value by 1 each time the same phrase occurs. In this way the representative phrase extraction engine of the present invention can generate (phrase, counting value, document number) data from (phrase, document number) data and, where (phrase, document number, location field name in the document) data are available, (phrase, counting value, document number, location field name in the document) data. This process can be performed not only for all documents belonging to at least one representative phrase extraction target document set, but also for at least one representative phrase extraction entire document set, which is a collection of target document sets.

Examples of the representative phrase extraction entire document set are: 1) all documents belonging to one country database, 2) all documents corresponding to at least one given patent classification code, 3) all documents of a particular applicant, 4) all documents of a particular inventor, 5) all documents of any of these limited to a time period, and 6) all documents produced by combining these (including any set operation such as union, difference, or intersection). For the documents of such a set, the representative phrase extraction engine can generate (phrase, document number) data or (phrase, document number, location field name in the document) data. The same data can also be generated for all documents corresponding to any subset of the representative phrase extraction entire document set.

In this case, for each phrase, and for at least one representative phrase extraction target document set preset within each of at least one representative phrase extraction entire document set, the representative phrase extraction engine may calculate 1) the frequency (F) of the phrase in the target document set and 2) the total frequency (T) of the phrase in the entire document set of which the target document set is a part. It may of course also calculate the total number of phrases in the entire document set (All frequency, A) and the total number of phrases in the target document set (All frequency of target set, AT).

At this time, the representative phrase extraction engine extracts the representative phrases of a given representative phrase extraction target document set from the given representative phrase extraction entire document set with reference to the representative phrase extraction policy DB. The representative phrase extraction policy may be a criterion on (phrase, probability value of the phrase) pairs, namely whether the probability value of the phrase falls within a predetermined reference range under a predetermined condition. Examples of the probability value are the F/T, F/A, and F/AT values of a specific phrase. The policy may also require that 1) the change (rate of change, rate of increase or decrease) in the F, T, A, or AT values, or 2) the change (rate of change, rate of increase or decrease) in the F/T, F/A, or F/AT values satisfy a preset reference range. The predetermined condition of the policy may assign a different reference range to the probability value of a phrase depending on the period, country, applicant, inventor, at least one patent classification symbol, or other document set sharing a predetermined attribute. For example, different reference ranges may be applied to the probability value of a phrase extracted from IPC section H, which has many patent applications, and to that of a phrase extracted from IPC section D, which has few. Likewise, different criteria may be applied flexibly to the changes in F, T, A, and AT and to the changes in F/T, F/A, and F/AT, depending on the nature of the document sets (entire document set and target document set).
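Written out, the probability values named above are simple ratios of the four counts. The labels p_T, p_A, and p_AT are introduced here for illustration only and do not appear in the source. The worked line uses F = 100, T = 200, and A = 1,000,000,000, values that also appear for the phrase bcd in Table 14; no value of AT is given in the text, so F/AT is not evaluated.

```latex
\[
p_{T} = \frac{F}{T}, \qquad
p_{A} = \frac{F}{A}, \qquad
p_{AT} = \frac{F}{\mathit{AT}}
\]
% Worked example with F = 100, T = 200, A = 10^{9}
\[
p_{T} = \frac{100}{200} = 0.5 = 50\%, \qquad
p_{A} = \frac{100}{10^{9}} = 10^{-7} = 10^{-5}\,\%
\]
```

A phrase with a high F/T and a low T/A is concentrated in the target document set while being rare in the entire document set, which is the basic attribute of a representative phrase described earlier.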

Hereinafter, the concept of representative phrase extraction of the present invention is described with a simple table. The representative phrase extraction engine of the present invention 1) assigns to each phrase ID a probability value (= number of occurrences of the specific phrase / total number of phrases) based on the frequency of appearance over all phrases, 2) takes the phrases whose probability values fall within a preset range as candidate representative phrases, and 3) selects the representative phrases for each target document set by referring to the predetermined representative phrase selection rules. The number of appearances can clearly also be calculated per specific field. In the example shown in Table 14 below, the number of occurrences of each phrase is counted per document number, and the representative phrases of each document number are extracted according to the preset representative phrase extraction policy.

TABLE 14

Phrase | Occurrences in the target document set (F) | Document number | Occurrences in the entire document set (T) | F/T (%): occurrences / total occurrences | T/A (%): total occurrences / total number of phrases (1 billion) | Total number of phrases (A)
abc | 40 | #1 | 80000 | 0.05 | 0.008 | 1,000,000,000
abc | 2 | #2 | 80000 | 0.0025 | 0.008 |
abc | 200 | #3 | 80000 | 0.25 | 0.008 |
bcd | 8 | #1 | 200 | 4 | 0.00002 |
bcd | 100 | #2 | 200 | 50 | 0.00002 |
bcd | 2 | #4 | 200 | 1 | 0.00002 |
cde | 15 | #1 | 3000 | 0.5 | 0.0003 |
cde | 100 | #4 | 3000 | 3.3 | 0.0003 |

As can be seen in Table 14, every phrase occurrence has a phrase absolute ID in 1:1 correspondence. The (phrase, document number) data are therefore essentially equivalent to (phrase absolute ID, phrase, document number) data. A phrase ID corresponds to each phrase absolute ID, a document number corresponds to each phrase absolute ID, and bibliographic data correspond to each document number. Counting the occurrences of each phrase ID per document number, as in Table 14, is thus an example in which the document number serves as the target document set. Similarly, a predetermined level of IPC, a specific applicant, or a specific inventor may take the place of the document number, and in each case the number of occurrences of each phrase ID per target document set can be counted. This counting may be performed through join operations on the tables in the DBMS.

Suppose, for example, that the selection rules constituting the representative phrase extraction policy in the representative phrase extraction policy DB are: 1) the ratio of the total number of appearances (T) to the total number of phrases (A, e.g., 1 billion) is less than 1/1000%, 2) the number of appearances in the document number concerned is 1/1,000,000% or more, and 3) the ratio of the number of appearances to the total number of appearances (%) is 10% to 1%. Let us review which phrases can be representative phrases. The phrase abc cannot be a representative phrase of any of document numbers #1 to #4 (violation of condition 1). The phrase bcd can be a representative phrase only of document number #2 (document numbers #1 and #4 violate condition 2). The phrase cde can be a representative phrase only of document number #4 (document number #1 violates condition 3). Accordingly, the representative phrases of document number #1 include neither bcd nor cde, the representative phrases of document number #2 may include bcd, the representative phrases of document number #3 do not include abc, and the representative phrases of document number #4 may include cde.
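The review above can be reproduced with a short sketch. Two readings are assumptions made here, not statements of the specification: condition 2 is taken as the ratio F / A (so that, with A = 1 billion, it requires at least 10 appearances in the document), and condition 3 is taken as a lower bound of 1% on F / T, because those readings reproduce the outcome described in the text. The function and variable names are hypothetical.

```python
from fractions import Fraction

TOTAL_PHRASES = 1_000_000_000  # A in Table 14

# (phrase, document number, F = occurrences in the document,
#  T = occurrences in the entire document set), taken from Table 14
ROWS = [
    ("abc", 1, 40, 80_000), ("abc", 2, 2, 80_000), ("abc", 3, 200, 80_000),
    ("bcd", 1, 8, 200), ("bcd", 2, 100, 200), ("bcd", 4, 2, 200),
    ("cde", 1, 15, 3_000), ("cde", 4, 100, 3_000),
]

def percent(numerator, denominator):
    """Exact ratio expressed in percent (avoids floating-point rounding)."""
    return Fraction(numerator * 100, denominator)

def is_representative(f, t, a=TOTAL_PHRASES):
    cond1 = percent(t, a) < Fraction(1, 1000)        # T / A below 1/1000 %
    cond2 = percent(f, a) >= Fraction(1, 1_000_000)  # assumed: F / A at least 1/1,000,000 %
    cond3 = percent(f, t) >= 1                       # assumed: F / T at least 1 %
    return cond1 and cond2 and cond3

def representative_phrases(rows):
    """Map each document number to the phrases that pass all three conditions."""
    result = {}
    for phrase, doc, f, t in rows:
        if is_representative(f, t):
            result.setdefault(doc, []).append(phrase)
    return result
```

With the rows of Table 14, `representative_phrases(ROWS)` associates bcd with document #2 and cde with document #4, and assigns no phrase to documents #1 and #3.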

As described above, the representative phrase extraction preprocessing module of the present invention can generate counting data for each phrase by using the indexer 401-3 and the index of the search engine and, on the basis of that counting data, can extract representative phrases from the content of each document, of a predetermined document set, of a specific field, or of a specific field across all documents of a predetermined document set. In the example above, bcd corresponds to document number #2 as its representative phrase. Once representative phrases have been extracted for each representative phrase extraction target document set, they may be stored in units of the target document set or in units of the individual documents constituting it. That is, at least one representative phrase may correspond to one target document set or to one individual document. A representative phrase may also correspond to an individual field of the target document set or to an individual field of the individual documents constituting it. For example, the phrase bcd can be stored in association with document number #2, so that when document #2 is called, the phrase bcd can be displayed.

In this case, too many representative phrases may be assigned to a specific document number. For a particular document there may be, for example, 100 phrases satisfying the above-described conditions. In that case the number of representative phrases can be limited by applying a further extraction rule: 4) 10 to 30 representative phrases are presented in descending order of the frequency of occurrence calculated for each representative phrase. Conversely, a particular document may have no representative phrase, or too few. In that case, condition 3) of the above-described conditions may be applied in relaxed form so that the number of representative phrases is maintained at ten to thirty.

If the above process is repeated for all documents or all document sets, a predetermined number (for example, 10 to 30) of representative phrases can be matched to every document. The representative phrases corresponding to each document number can thus be obtained. A representative phrase and the values calculated for it (e.g., the number of appearances in the document concerned and the number of appearances overall) are treated as bibliographic items of the document number and can be used for various analyses.

In the above embodiment, the appearance frequency is calculated in document units, but it may also be calculated on the basis of 1) a document set, 2) a field unit within a document (such as the claims), or 3) the content contained in a field of all documents in a document set (for example, claim 1 of all documents filed in the Republic of Korea from 2000 to 2005 under IPC H04B 7/02). In the case of a single document, the result corresponds directly to the document number. In the case of a document set of two or more documents, the correspondence with document numbers may seem to disappear, but any one of the following three methods easily solves this problem. 1) Representative phrases are extracted in units of the individual documents belonging to the document set, a union operation is performed on them, and the number of representative phrases is then limited to a predetermined range (for example, 10 to 30). 2) If representative phrases have already been extracted for each document number, the representative phrases corresponding to the document numbers belonging to the document set are obtained, a union operation is performed, and the number is limited to a predetermined range; alternatively, all documents belonging to the document set are processed together and the representative phrases are made to correspond to the unique ID of the document set rather than to a document number. 3) For the content contained in a field of all documents of the document set, the union method of 1) or the document set unique ID method of 2) may likewise be used.
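The union-then-limit approach for a document set can be sketched as follows. The ranking rule used to cut the union down to the predetermined range (summed frequency, ties broken alphabetically) is an assumption of this sketch; the specification only states that the number is limited. The function name and data layout are hypothetical.

```python
from collections import Counter

def set_level_representatives(per_document, doc_numbers, max_count=30):
    """Merge per-document representative phrases into one list for a document set.

    per_document: dict mapping document number -> {phrase: frequency}
    doc_numbers:  the document numbers that make up the document set
    max_count:    upper end of the predetermined range (e.g. 30)
    """
    merged = Counter()
    for doc in doc_numbers:
        # Union of the phrases; frequencies of shared phrases are added up.
        merged.update(per_document.get(doc, {}))
    ranked = sorted(merged.items(), key=lambda item: (-item[1], item[0]))
    return [phrase for phrase, _ in ranked[:max_count]]
```

The returned list would then be stored against the unique ID of the document set rather than against any single document number.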

For example, the document set from which representative phrases are extracted may be any one or more of the following patent document sets:
1) a patent document set specified by a specific patent classification code of a specific patent classification code system in a specific country DB;
2) a patent document set specified by a specific applicant name in a specific country DB;
3) a patent document set specified by the name of a specific inventor included in the patent documents of a specific applicant (i.e., by applicant name and inventor name) in a specific country DB;
4) a patent document set specified by a specific agent name in a specific country DB;
5) a patent document set specified by a specific patent classification code of a specific patent classification code system and a specific applicant name in a specific country DB;
6) a patent document set specified by a specific patent classification code of a specific patent classification code system, a specific applicant name, and a specific inventor name in a specific country DB;
7) a patent document set specified by a specific applicant name and a specific agent name in a specific country DB;
8) the set of all patent documents of a specific country;
9) the set of all patent documents of at least two countries;
10) a patent document set obtained by restricting any of the sets of 1) to 9) to a specified unit of time;
11) a document set consisting of documents that directly or indirectly cite, or are cited by, the document sets specified in 1) to 10); and
12) a patent document set generated by attaching specific conditions, such as registration status or whether examination has been requested, to any of 1) to 9).

Hereinafter, a method of extracting the representative phrase by using the indexer 401-3 of the search engine connected to the morphological analysis engine will be described. A search engine is used because search engines generally perform very well at counting the number of occurrences of search terms, and because the data indexed by the indexer 401-3 can easily be converted into data for a DB. The indexer 401-3 stores each term together with the document number from which the term was obtained and, in some cases, the name of the field from which it was obtained. For example, (term 1, #1) means that term 1 was obtained from document #1, and (CL: term 1, #1) means that term 1 was obtained from the claims field (Claim, CL) of document #1. The search engine index stores a large amount of data such as (term 1, #1) or (CL: term 1, #1); when term 1 is entered as a search term, or when term 1 is entered with the search field restricted to the claims, document number #1 is returned as a search result. The search engine also determines very quickly how many times term 1 occurs in all documents or in the claims. (Usually, a search engine first returns the total number of search results and then shows only the first part of them.)
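The behavior attributed to the index above (lookup by term, lookup restricted to a field, and fast counting) can be illustrated with a toy in-memory index. This is a minimal sketch, not the indexer 401-3 itself; the class name `TinyIndex` and its methods are hypothetical.

```python
from collections import defaultdict

class TinyIndex:
    """Toy inverted index storing (term, document) and (FIELD:term, document) entries."""

    def __init__(self):
        self._postings = defaultdict(list)

    @staticmethod
    def _key(term, field):
        return term if field is None else f"{field}:{term}"

    def add(self, term, doc_number, field=None):
        # Every occurrence is recorded under the bare term ...
        self._postings[term].append(doc_number)
        # ... and, when the source field is known, under the field-qualified term too.
        if field is not None:
            self._postings[self._key(term, field)].append(doc_number)

    def search(self, term, field=None):
        """Return the document numbers containing the term (optionally within a field)."""
        return sorted(set(self._postings.get(self._key(term, field), [])))

    def count(self, term, field=None):
        """Return the number of stored occurrences of the term."""
        return len(self._postings.get(self._key(term, field), []))
```

For instance, after adding term 1 once from the claims of document #1, a search restricted to the field `CL` returns document #1, and `count` reports the number of occurrences without scanning the documents again.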

In this case, the phrase generation engine of the present invention obtains n real morphemes (terms) that have passed through the morpheme analysis engine, combines the n terms in a predetermined manner, and transmits the combined terms together with the document number to the indexer 401-3 of the search engine. (If the document number is already known to the indexer, it need not be transmitted; for ease of understanding, the description assumes that the document number that is the source of the combined terms is transmitted.) The terms may be combined by any one or more of the following methods: 1) processing in units of one sentence, in which all terms obtained from one sentence are combined 2 to 5 at a time in every possible way; 2) processing in units of the entire document, in which all terms contained in the document are combined 2 to 5 at a time in every possible way (the number of combinations is then very large, which increases the processing time, but the most precise phrase set can be generated); and 3) sequentially combining 2 to 5 adjacent terms within one sentence. Of course, processing in units of paragraphs or of specific fields (claims, etc.) may be handled in a manner corresponding to the above.

An example follows. The phrase generation engine of the present invention receives the real morphemes that are output and generates arrays of 1 to 5 (preferably 2 to 3) real morphemes. Suppose that, when one sentence contained in a patent document is input, the real morphemes output are, in order, a, b, c, d, e, f, g. If the phrase generation unit of the phrase generation engine is three morphemes, the engine sequentially generates phrases such as abc, bcd, cde, def, and efg; alternatively, it can generate phrases in units of two real morphemes, such as ab, ac, ad, ae, af, ag, bc, ..., fg. The phrase generation engine then transmits phrases (= multiple terms) such as abc, bcd, ab, and ac to the indexer 401-3. The indexer 401-3 stores the received phrases as (abc, #1), (bcd, #1), (ab, #1), (ac, #1), and so on. If the field is restricted, the phrases are stored in the index together with the specific field name, as in (CL: abc, #1), (CL: bcd, #1), (CL: ab, #1), (CL: ac, #1). Table 15 below is a conceptual table showing an example of an index.

TABLE 15

Phrase | Document number | Field
abc | #1 | D
abc | #1 | D
abc | #1 | D
abc | #1 | C
abc | #1 | C
bcd | #1 | D
bcd | #1 | C
ac | #1 | D
abc | #2 | D
abc | #2 | D
abc | #2 | C

D denotes the detailed description of the invention; C denotes the claims.

In Table 15, the phrase abc appears three times in the detailed description of the invention and twice in the claims of document #1; the phrase bcd appears once in the detailed description and once in the claims of document #1; and the phrase ac appears once in the detailed description of document #1. The phrase abc also appears twice in the detailed description and once in the claims of document #2.
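The two ways of forming phrases from the morphemes a to g, and the index entries sent to the indexer, can be sketched as follows. The function names are hypothetical, and joining morphemes by simple concatenation is an assumption made to mirror the notation abc, bcd, ab used in the text.

```python
from itertools import combinations

def sequential_phrases(morphemes, size=3):
    """Phrases made of `size` adjacent morphemes, in sentence order."""
    return ["".join(morphemes[i:i + size])
            for i in range(len(morphemes) - size + 1)]

def combined_phrases(morphemes, size=2):
    """Phrases made of every order-preserving combination of `size` morphemes."""
    return ["".join(combo) for combo in combinations(morphemes, size)]

def index_entries(phrases, doc_number, field=None):
    """Build (phrase, document number) entries, prefixed with the field name if given."""
    prefix = f"{field}:" if field else ""
    return [(prefix + phrase, doc_number) for phrase in phrases]

sentence = ["a", "b", "c", "d", "e", "f", "g"]
entries = index_entries(sequential_phrases(sentence), 1, field="CL")
```

Seven morphemes give 5 sequential three-morpheme phrases but 21 two-morpheme combinations, which illustrates why exhaustive combination over a whole document increases the processing time noted earlier.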

The phrase counting engine of the present invention then receives the index data and generates data on the number of occurrences of each phrase (a plurality of terms) together with the document number that is the source of the phrase. The generated data is the same as or corresponds to the data generated in the manner described above and may be, for example, as shown in Table 16 below.

TABLE 16

Phrase ID | Phrase | Document number | Field | Number of appearances
1 | abc | #1 | D | 3
1 | abc | #1 | C | 2
1 | abc | #2 | D | 2
1 | abc | #2 | C | 1
2 | bcd | #1 | D | 1
2 | bcd | #1 | C | 1
3 | ac | #1 | D | 1
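The step from index rows (Table 15) to counting data (Table 16) amounts to grouping identical rows and counting them. A minimal sketch follows; assigning phrase IDs in order of first appearance is an assumption of this sketch, and the names are hypothetical.

```python
from collections import Counter

# Index rows as in Table 15: (phrase, document number, field)
INDEX_ROWS = [
    ("abc", 1, "D"), ("abc", 1, "D"), ("abc", 1, "D"),
    ("abc", 1, "C"), ("abc", 1, "C"),
    ("bcd", 1, "D"), ("bcd", 1, "C"),
    ("ac", 1, "D"),
    ("abc", 2, "D"), ("abc", 2, "D"), ("abc", 2, "C"),
]

def assign_phrase_ids(rows):
    """Give each distinct phrase an ID in order of first appearance (assumed rule)."""
    ids = {}
    for phrase, _, _ in rows:
        ids.setdefault(phrase, len(ids) + 1)
    return ids

def counting_table(rows):
    """Return rows of (phrase ID, phrase, document number, field, appearances)."""
    ids = assign_phrase_ids(rows)
    counts = Counter(rows)  # insertion-ordered, so output follows first appearance
    return [(ids[phrase], phrase, doc, field, n)
            for (phrase, doc, field), n in counts.items()]
```

Summing the appearance counts over the field column gives the per-document frequency F, and summing over documents gives the overall frequency T used in Table 14.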

As described above, the representative phrase extraction preprocessing module of the present invention can generate counting data for each phrase by using the indexer 401-3 and the index of the search engine and, on the basis of that counting data, can extract representative phrases from the content of each document, of a predetermined document set, of a specific field, or of a specific field across all documents of a predetermined document set.

Once representative phrases have been extracted for each representative phrase extraction target document set, they may be stored in units of the target document set or in units of the individual documents constituting it. That is, at least one representative phrase may correspond to one target document set or to one individual document. A representative phrase may also correspond to an individual field of the target document set or to an individual field of the individual documents constituting it.

The representative phrase extraction engine of the present invention has been described as generating document number information for each phrase. This is so that, when document numbers correspond to a phrase, a document set having that phrase as a representative phrase can be extracted easily. If such a document set can be extracted easily, various analyses can be performed with it as the analysis target document set. For example, for a set of documents having the phrase “radio frequency identification tag” as a representative phrase, the following can be performed: 1) calculation of various analysis indicators, such as the number of applications, the number of registrations, the registration rate, the share, the concentration rate, and the activity rate, by country, year, and applicant; 2) analysis of the distribution of IPC or other patent classification codes by depth for each country, year, and applicant; and 3) citation analysis for detailed document sets (by applicant and by patent classification code) of this document set. In this way, once a single document set is given, the various analyses supported by the analysis module and the patent intelligence module 60 of the present invention can be performed. If documents containing a specific phrase are merely retrieved through a search engine, without document numbers being associated with the phrase, it cannot be determined whether the specific phrase is a representative phrase of each document.

The representative phrase information preprocessing method as described above is well illustrated in FIGS. 105 to 107.

A method of generating a combination of real morphemes for representative phrase extraction through the representative phrase extraction preprocessing module will be briefly described with reference to FIG. 105.

The representative phrase extraction preprocessing module obtains a document set including at least two patent documents (S5520), extracts two or more real morphemes (terms) by processing the whole of each patent document extracted from the document set, or the contents of its various fields, with a morpheme analyzer (S5530), generates all possible combinations of two or more real morphemes from the extracted real morphemes according to a preset criterion (S5540), and stores the generated combinations of real morphemes in the DB together with the patent document information or transmits them to the search indexer 401-3 (S5550), thereby generating the combinations of real morphemes used for extracting representative phrases.

With reference to FIG. 106, a method of generating combinations of real morphemes for extracting representative phrases for each field or partial document set through the representative phrase extraction preprocessing module, and of extracting the combinations of real morphemes that meet a predetermined condition, will be briefly described.

The representative phrase extraction preprocessing module obtains a document set including at least two patent documents (S5620), extracts two or more real morphemes (terms) by processing the individual patent documents extracted from the document set, or the contents of their various fields, with a morpheme analyzer (S5630), generates all possible combinations of two or more real morphemes from the extracted real morphemes according to a preset criterion (S5640), stores the generated combinations in the DB together with the patent document information or transmits them to the search indexer 401-3 (S5650), obtains the bibliographic items of each document constituting the document set and generates subsets of documents on the basis of at least one criterion (S5660), and obtains the number of occurrences of each combination of real morphemes within a subset of documents and within the whole document set, thereby extracting the combinations of real morphemes that satisfy predetermined conditions for the entire document or for its various fields (S5670).

Referring to FIG. 107, a method of generating a combination of real morphemes for extracting representative phrases through the representative phrase extraction preprocessing module and comparing each document set will be described briefly.

The representative phrase extraction preprocessing module obtains a plurality of document sets each including at least two patent documents (S5720), extracts, for each document set, two or more real morphemes (terms) by processing all individual patent documents extracted from the document set, or the contents of their various fields, with a morpheme analyzer (S5730), generates combinations of the extracted real morphemes (S5740), stores the generated combinations in a DB together with the patent document information or transmits them to the search indexer 401-3 (S5750), and compares the document sets with respect to the combinations of real morphemes on the basis of a preset criterion (S5760).

Multiple Patent Classification Symbol Relationship Preprocessing Module

Many patent documents are assigned two or more patent classification codes belonging to one patent classification code system. Moreover, the assignment of a plurality of patent classification codes to a single patent document is a growing tendency, owing to 1) the expansion of the scope of a single patent application, 2) the advancement of inventions, 3) the trend toward technology convergence, and 4) the increasing precision of patent classification code assignment. The fact that two or more patent classification codes are assigned to a patent document may mean that the document has two or more technical viewpoints, technical configurations, technical contents, technical aspects, technical attributes, or technical features corresponding to the respective codes. Therefore, a document to which two or more patent classification codes are assigned needs to be treated differently from a document to which one patent classification code is assigned.

The plural patent classification code relation preprocessing module of the present invention operates only on documents to which at least two patent classification codes are assigned, and one of its objects is to discover and utilize the hidden relationships among the plural patent classification codes. The module comprises a homogeneous plural patent classification code relation preprocessing module, which discovers the hidden relationships among patent classification codes of the same type on the basis of only one of IPC, UPC, and FT, with the aim of generating information on the convergence of technology, and a heterogeneous plural patent classification code relation preprocessing module, which grasps the relationships among patent classification codes of different types when codes of two or more types selected from among IPC, UPC, and FT are included in a single document. FIG. 60 shows the configuration of the plural patent classification code relation preprocessing module, of the homogeneous plural patent classification code relation preprocessing module, and of the heterogeneous plural patent classification code relation preprocessing module.

Homogeneous Multiple Patent Classification Symbol Relationship Preprocessing Module

Hereinafter, the homogeneous patent classification code relation preprocessing module of the present invention is described. Whether a document is assigned a plurality of patent classification codes of one type or patent classification codes of two or more types, the homogeneous patent classification code relation preprocessing module processes the plurality of patent classification codes on one selected patent classification code system. The processing is performed by the homogeneous patent classification code relation preprocessing engine of the present invention, and the results of the processing are stored.

The theory is explained first, followed by concrete examples. As a simple model, a method of processing the relationship between patent classification codes is presented for the case where two patent classification codes are assigned to one document. Assume that a patent document is assigned a patent classification code A1 and a patent classification code B1. In the patent classification system, the parent node (parent patent classification code) of A1 is A2 and the parent node of A2 is A3; continuing in this way, the ancestors of A1 are A2, A3, A4, ..., An. Similarly, the ancestors of B1 are B2, B3, B4, ..., Bn. The highest of these, An and Bn, are sections, and above the sections the name of the patent classification system, such as IPC, is assumed to sit as a node. The two chains then share at least one common node, whether a section or, at the least, the classification system name IPC (even codes in the most distant sections share the IPC classification system itself). Among these common nodes, consider the one at the lowest level of the patent classification code system. If this lowest common patent classification code is denoted Ai+1 on the chain of A1 and Bj+1 on the chain of B1, then Ai+1 = Bj+1. When A1 and B1 are both assigned to one patent document, A1 and B1 are related through that document (that is, the invention or inventions described in the document correspond to both A1 and B1). In this case, A1 is related to all parent nodes of B1 (B2, B3, ..., Bj+1), and similarly B1 is related to all parent nodes of A1 (A2, A3, ..., Ai+1). However, since Bj+1 is the same as Ai+1, it is a parent node of A1 in any case, so the relationship between A1 and Bj+1 need not be discussed; likewise, the relationships between all the A codes and Bj+1 already exist through the parent node and need not be discussed separately. In other words, the lowest common parent node is not considered in the relation calculation table.

In this case, when processing the plurality of patent classification codes, the homogeneous patent classification code relation preprocessing module may generate data as shown in Table 17 below with reference to the patent classification code system (tree structure) data. The cells for Ai+1 and Bj+1 are not needed in the actual table (only A1 to Ai and B1 to Bj are needed in the relation table) and are displayed only for convenience of explanation, to show that these cells have no values. That is, the actual table consists only of A1 to Ai and B1 to Bj.

TABLE 17

 | A1 | A2 | ... | Ai | Ai+1
B1 | 1 | 1 | 1 | 1 |
B2 | 1 | 1 | 1 | 1 |
... | 1 | 1 | 1 | 1 |
Bj | 1 | 1 | 1 | 1 |
Bj+1 = Ai+1 | | | | |

That is, all of the crossing pairs between A1 to Ai and B1 to Bj (for example, (A1, B1), ..., (Ai, Bj)) can be said to be related through the given patent document.

As of January 4, 2006, Korean Patent Application No. 10-2005-0111868 is assigned H04B 7/26 and H04B 7/15, and it is used here as an example. The parents of H04B 7/26 are, in turn, H04B 7/24, H04B 7/00, H04B, H04, and H. The parents of H04B 7/15 are, in turn, H04B 7/14 and H04B 7/00. In this case, the lowest common patent classification code is H04B 7/00. The resulting table is shown in Table 18 below. As described above, the cells associated with H04B 7/00 have no value.

TABLE 18

 | H04B 7/26 | H04B 7/24 | H04B 7/00
H04B 7/15 | 1 | 1 |
H04B 7/14 | 1 | 1 |
H04B 7/00 | | |
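The construction of Table 18 can be sketched as: walk up from each code to its ancestors, find the lowest common code, and pair every node strictly below it on one chain with every node strictly below it on the other. The parent map below is hand-written from the hierarchy stated in the text and is only an illustrative stand-in for real classification tree data; the function names and the synthetic root node "IPC" are assumptions of this sketch.

```python
# Hypothetical parent map, following the hierarchy given in the text.
PARENT = {
    "H04B 7/26": "H04B 7/24", "H04B 7/24": "H04B 7/00",
    "H04B 7/15": "H04B 7/14", "H04B 7/14": "H04B 7/00",
    "H04B 7/00": "H04B", "H04B": "H04", "H04": "H", "H": "IPC",
}

def chain(code, parent=PARENT):
    """Return the code followed by all of its ancestors, lowest level first."""
    result = [code]
    while result[-1] in parent:
        result.append(parent[result[-1]])
    return result

def crossing_pairs(code_a, code_b, parent=PARENT):
    """Return (lowest common code, list of crossing pairs below that code)."""
    chain_a, chain_b = chain(code_a, parent), chain(code_b, parent)
    in_b = set(chain_b)
    # The first ancestor of code_a that also lies on code_b's chain is the
    # lowest common code (Ai+1 = Bj+1 in the notation of the text).
    common = next(code for code in chain_a if code in in_b)
    below_a = chain_a[:chain_a.index(common)]
    below_b = chain_b[:chain_b.index(common)]
    return common, [(a, b) for a in below_a for b in below_b]
```

For H04B 7/26 and H04B 7/15 this returns H04B 7/00 as the lowest common code together with the four valued cells of Table 18.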

Therefore, on the basis of Korean Patent Application No. 10-2005-0111868, four crossing pairs of related patent classification codes are generated, corresponding to the four valued cells of Table 18. When two or more patent classification codes exist for one document, the first one is referred to as the main patent classification code. In this document, therefore, H04B 7/26 is the main patent classification code, and the higher patent classification codes located at its parent nodes are also treated as main patent classification codes for the given patent document. The main patent classification code plays an important role in the handling of documents with three or more patent classification codes. When there are three or more patent classification codes, they can be processed as follows.

First, all patent classification codes may be treated equally. When there are n patent classification codes, the number of ways of choosing two of them is nC2. For each of these nC2 combinations, the processing for the case of two patent classification codes is performed; as a result, nC2 tables are obtained, each containing information on all crossing pairs.

Next, the main patent classification code may be treated separately. The main patent classification code is determined and paired with each of the n-1 sub patent classification codes, giving n-1 combinations. For each combination, the processing for the case of two patent classification codes is performed; as a result, n-1 tables are obtained, each containing information on all crossing pairs. In this case, a weight associated with the main patent classification code may be given to the crossing pairs related to the main patent classification code. Then, (n-1)C2 combinations are obtained from the n-1 sub patent classification codes, and the same processing is performed for each combination to obtain information on all crossing pairs. In this case, a weight associated with the sub patent classification codes may be given to the crossing pairs related to the sub patent classification codes.

As of January 2006, Korean Patent Application No. 10-2006-0012606 is assigned the patent classification codes H04B 7/04, H04B 7/155, and H04Q 7/30. The parent nodes of H04B 7/04 are, in turn, H04B 7/02, H04B 7/00, H04B, H04, and H; those of H04B 7/155 are H04B 7/15, H04B 7/14, H04B 7/00, H04B, H04, and H; and those of H04Q 7/30 are H04Q 7/20, H04Q 7/00, H04Q, H04, and H. The lowest common patent classification code is H04B 7/00 for the combination of H04B 7/04 and H04B 7/155, H04 for the combination of H04B 7/155 and H04Q 7/30, and H04 for the combination of H04B 7/04 and H04Q 7/30.

If all patent classification codes are handled equally, the following 3C2 = 3 tables are obtained. First, Table 19 is the relation table of H04B 7/04 and H04B 7/155.

TABLE 19

 | H04B 7/04 | H04B 7/02 | H04B 7/00
H04B 7/155 | 1 | 1 |
H04B 7/15 | 1 | 1 |
H04B 7/14 | 1 | 1 |
H04B 7/00 | | |

Next, Table 20 is a relation table of H04B 7/04 and H04Q 7/30.

TABLE 20

 | H04B 7/04 | H04B 7/02 | H04B 7/00 | H04B | H04
H04Q 7/30 | 1 | 1 | 1 | 1 |
H04Q 7/20 | 1 | 1 | 1 | 1 |
H04Q 7/00 | 1 | 1 | 1 | 1 |
H04Q | 1 | 1 | 1 | 1 |
H04 | | | | |

Next, Table 21 is a relation table of H04B 7/155 and H04Q 7/30.

TABLE 21

 | H04B 7/155 | H04B 7/15 | H04B 7/14 | H04B 7/00 | H04B | H04
H04Q 7/30 | 1 | 1 | 1 | 1 | 1 |
H04Q 7/20 | 1 | 1 | 1 | 1 | 1 |
H04Q 7/00 | 1 | 1 | 1 | 1 | 1 |
H04Q | 1 | 1 | 1 | 1 | 1 |
H04 | | | | | |
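Treating all codes equally means building one relation table per unordered pair of codes. The sketch below does this for the three codes of Korean Patent Application No. 10-2006-0012606, using a hand-written parent map that follows the hierarchy stated in the text; the map, the function names, and the dictionary layout of a table are assumptions for illustration.

```python
from itertools import combinations

# Hypothetical parent map built from the hierarchy given in the text.
PARENT = {
    "H04B 7/04": "H04B 7/02", "H04B 7/02": "H04B 7/00",
    "H04B 7/155": "H04B 7/15", "H04B 7/15": "H04B 7/14",
    "H04B 7/14": "H04B 7/00", "H04B 7/00": "H04B", "H04B": "H04",
    "H04Q 7/30": "H04Q 7/20", "H04Q 7/20": "H04Q 7/00",
    "H04Q 7/00": "H04Q", "H04Q": "H04", "H04": "H",
}

def chain(code):
    """Return the code followed by all of its ancestors, lowest level first."""
    result = [code]
    while result[-1] in PARENT:
        result.append(PARENT[result[-1]])
    return result

def relation_table(code_a, code_b):
    """Return {(column code, row code): 1} for every crossing pair of the two codes."""
    chain_a, chain_b = chain(code_a), chain(code_b)
    in_b = set(chain_b)
    common = next(code for code in chain_a if code in in_b)  # lowest common code
    columns = chain_a[:chain_a.index(common)]
    rows = chain_b[:chain_b.index(common)]
    return {(col, row): 1 for col in columns for row in rows}

def all_relation_tables(codes):
    """One relation table for each of the nC2 unordered pairs of codes."""
    return {pair: relation_table(*pair) for pair in combinations(codes, 2)}

tables = all_relation_tables(["H04B 7/04", "H04B 7/155", "H04Q 7/30"])
```

The three resulting tables contain 6, 16, and 20 valued cells, matching Tables 19, 20, and 21.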

A method of assigning weights will now be described. Several weighting methods may be used. First, a greater weight may be given to the relation tables involving the main patent classification code described above, and a smaller weight to the relation tables between sub patent classification codes. For example, 1 may be given to a relation table involving the main patent classification code and 0.5 to a relation table between sub patent classification codes.

Then, the relation table of H04B 7/155 and H04Q 7/30 will be as shown in Table 22 below.

TABLE 22

 | H04B 7/155 | H04B 7/15 | H04B 7/14 | H04B 7/00 | H04B | H04
H04Q 7/30 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
H04Q 7/20 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
H04Q 7/00 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
H04Q | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
H04 | | | | | |

Second, each patent document having a plurality of patent classification codes may be weighted equally. For example, when a document has two patent classification codes, the value of each association pair is 1, and when a document has n patent classification codes, the value is divided evenly among the nC2 possible combinations of codes. That is, the weight may be 1/(nC2). In that case, the relation table of H04B 7/155 and H04Q 7/30 is as shown in Table 23 below.

TABLE 23

 | H04B 7/155 | H04B 7/15 | H04B 7/14 | H04B 7/00 | H04B | H04
H04Q 7/30 | 1/3 | 1/3 | 1/3 | 1/3 | 1/3 |
H04Q 7/20 | 1/3 | 1/3 | 1/3 | 1/3 | 1/3 |
H04Q 7/00 | 1/3 | 1/3 | 1/3 | 1/3 | 1/3 |
H04Q | 1/3 | 1/3 | 1/3 | 1/3 | 1/3 |
H04 | | | | | |
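The first two weighting methods reduce to simple functions of the code combination and of n. The sketch below is illustrative only; the function names are hypothetical, and the default values 1 and 0.5 are the example values given in the text.

```python
from fractions import Fraction
from math import comb

def main_sub_weight(combination, main_code,
                    main_weight=Fraction(1), sub_weight=Fraction(1, 2)):
    """First method: weight by whether the combination involves the main code."""
    return main_weight if main_code in combination else sub_weight

def equal_share_weight(n):
    """Second method: divide a document's value of 1 evenly over its nC2 combinations."""
    if n < 2:
        raise ValueError("at least two patent classification codes are required")
    return Fraction(1, comb(n, 2))
```

For the three-code example, the combination (H04B 7/155, H04Q 7/30) does not involve the main code H04B 7/04 and so receives 0.5 under the first method (Table 22), while `equal_share_weight(3)` gives 1/3 (Table 23).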

Third, for a document having three or more patent classification codes, different weights may be assigned to the pairs associated with the main patent classification code and to the pairs associated only with sub patent classification codes. For example, when there are n patent classification codes, the number of combinations involving the main classification code is (n-1); if 75% of the total is allotted to the combinations involving the main classification code, the pairs belonging to them can be given a weight of 0.75 * 1/(n-1). If the remaining 25% is allotted to the combinations among sub patent classification codes, the pairs belonging to them can be given a weight of 0.25 * 1/((n-1)C2). The percentage split may be adjusted to predetermined values. For example, if n > 5, 50% may be allotted to the combinations involving the main patent classification code and 50% to those among sub patent classification codes; if n = 4, 60% versus 40%; and if n = 3, 75% versus 25%. That is, when the number of IPC codes is three or more, (sub IPC, sub IPC) combination pairs arise, so different weights can be given to (main IPC, sub IPC) combination pairs and (sub IPC, sub IPC) combination pairs.

When n = 3 and the split is 75% versus 25%, the following tables are created. First, Table 24 is the relation table of H04B 7/04 and H04B 7/155.

TABLE 24

| | H04B 7/04 | H04B 7/02 | H04B 7/00 |
|---|---|---|---|
| H04B 7/155 | 0.75 / 2 | 0.75 / 2 | |
| H04B 7/15 | 0.75 / 2 | 0.75 / 2 | |
| H04B 7/14 | 0.75 / 2 | 0.75 / 2 | |
| H04B 7/00 | | | |

Table 25 below is a relation table of H04B 7/04 and H04Q 7/30.

TABLE 25

| | H04B 7/04 | H04B 7/02 | H04B 7/00 | H04B | H04 |
|---|---|---|---|---|---|
| H04Q 7/30 | 0.75 / 2 | 0.75 / 2 | 0.75 / 2 | 0.75 / 2 | |
| H04Q 7/20 | 0.75 / 2 | 0.75 / 2 | 0.75 / 2 | 0.75 / 2 | |
| H04Q 7/00 | 0.75 / 2 | 0.75 / 2 | 0.75 / 2 | 0.75 / 2 | |
| H04Q | 0.75 / 2 | 0.75 / 2 | 0.75 / 2 | 0.75 / 2 | |
| H04 | | | | | |

Table 26 below is a relationship table of H04B 7/155 and H04Q 7/30.

TABLE 26

| | H04B 7/155 | H04B 7/15 | H04B 7/14 | H04B 7/00 | H04B | H04 |
|---|---|---|---|---|---|---|
| H04Q 7/30 | 0.25 / 1 | 0.25 / 1 | 0.25 / 1 | 0.25 / 1 | 0.25 / 1 | |
| H04Q 7/20 | 0.25 / 1 | 0.25 / 1 | 0.25 / 1 | 0.25 / 1 | 0.25 / 1 | |
| H04Q 7/00 | 0.25 / 1 | 0.25 / 1 | 0.25 / 1 | 0.25 / 1 | 0.25 / 1 | |
| H04Q | 0.25 / 1 | 0.25 / 1 | 0.25 / 1 | 0.25 / 1 | 0.25 / 1 | |
| H04 | | | | | | |

On the other hand, when the contribution of each individual document having a plurality of patent classification codes is to be the same (for example, 1) on a per-document basis, the following weighting methods can be considered. As an extreme example, Korean Patent Application No. 10-2005-0042032 has two patent classification codes, H04B 7/02 and H04B 7/14, which are siblings (i.e., H04B 7/02 and H04B 7/14 share the same parent node, H04B 7/00, which is their immediately upper patent classification code). There is therefore only one relation table, that of H04B 7/02 and H04B 7/14, and it has only one cell. The table for the document of Korean Patent Application No. 10-2005-0042032 is shown in Table 27 below.

TABLE 27

| | H04B 7/02 | H04B 7/00 |
|---|---|---|
| H04B 7/14 | 1 | |
| H04B 7/00 | | |

Here, H04B 7/02 and H04B 7/14 can be seen to be strongly related in that specific patent document. By contrast, Korean Patent Application No. 10-2006-0012606 yields three tables, having 6, 16 and 20 cells respectively. The question then arises of how, for a single patent classification code combination (H04B 7/02, H04B 7/14), the contribution of the document of Korean Patent Application No. 10-2005-0042032 and the contribution of the document of Korean Patent Application No. 10-2006-0012606 should be evaluated. The contribution may be weighted as follows.

First, whenever a specific combination of patent classification codes (Ai, Bj), e.g. (H04B 7/02, H04B 7/14), occurs in any document, it may be given a contribution weight of the same value (e.g. 1), regardless of the number of tables generated or the number of cells in those tables.

Second, the contribution weight of each patent document for a specific combination (Ai, Bj) of patent classification codes may be varied in consideration of the number of tables and the number of cells. In this case, the contribution weight for the combination (Ai, Bj) may be divided by the total number of cells in all tables generated from the patent document. For example, for (H04B 7/02, H04B 7/14), the contribution weight of Korean Patent Application No. 10-2005-0042032 is 1 (= one table with one cell in total), and the contribution weight of Korean Patent Application No. 10-2006-0012606 is 1 / (6 + 16 + 20) (= the number of all cells in its three tables).

Third, the contribution weight may be divided by the total number of cells in the table in which the specific combination (Ai, Bj) of patent classification codes appears. For example, for (H04B 7/02, H04B 7/14), the contribution weight of Korean Patent Application No. 10-2005-0042032 is again 1 (= one table with one cell in total), and the contribution weight of Korean Patent Application No. 10-2006-0012606 is 1 / 6 (= the number of cells in the table in which the combination appears).

Fourth, the contribution weight of the second or third method may be inverted. In other words, the contribution weight may be multiplied by the total number of cells in the table in which the specific combination (Ai, Bj) of patent classification codes appears. For example, for (H04B 7/02, H04B 7/14), the contribution weight of Korean Patent Application No. 10-2005-0042032 is again 1 (= one table with one cell in total), and the contribution weight of Korean Patent Application No. 10-2006-0012606 is 6 (= the number of cells in the table in which the combination appears). This contribution weighting method is more promising for calculating the degree of convergence of heterogeneous technologies. If the number of cells in the table is small, the two technologies are already grouped as closely related within the classification system itself, whereas a large number of cells indicates technologies that are distant from each other.
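The four contribution weighting methods above can be compared side by side with a small sketch. The function and key names are hypothetical; the cell counts (one table of 1 cell, and three tables of 6, 16 and 20 cells) are those given in the description.

```python
from fractions import Fraction

def contribution_weights(table_sizes, pair_table_index):
    """Four contribution weights of one document for one pair (Ai, Bj).

    table_sizes: cell counts of every relation table the document generates.
    pair_table_index: index of the table in which (Ai, Bj) appears.
    """
    cells_in_pair_table = table_sizes[pair_table_index]
    return {
        "constant": Fraction(1),                              # first method
        "per_all_cells": Fraction(1, sum(table_sizes)),       # second method
        "per_table_cells": Fraction(1, cells_in_pair_table),  # third method
        "times_table_cells": Fraction(cells_in_pair_table),   # fourth method
    }

doc_0042032 = contribution_weights([1], 0)
doc_0012606 = contribution_weights([6, 16, 20], 0)
for name, weight in doc_0012606.items():
    print(name, weight)
```

For the document with a single one-cell table, all four methods give 1, which is why that document serves as the extreme reference case.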

By the above methods, the following information about a patent classification code combination (Ai, Bj) may be obtained.

First, counting information about a combination (Ai, Bj) such as (H04B 7/02, H04B 7/14) is obtained. When the counting applies the weights described in the present invention, weighted counting information, to which at least one weight is applied, may be obtained for the patent classification code combination (Ai, Bj).

Second, for every patent classification code combination, information on the document from which the combination was generated is stored when the combination is created. That is, the combination (H04B 7/02, H04B 7/14) is generated from the documents of Korean Patent Application No. 10-2006-0012606 and Korean Patent Application No. 10-2005-0042032, and the combination (H04B 7/15, H04B 7/24) is generated from the document of Korean Patent Application No. 10-2005-0111868. A document unique number therefore corresponds to each patent classification code combination (Ai, Bj), and the document unique number can in turn be associated with all information about the document, including all of its bibliographic details. In every table, each cell holds a combination of two related patent classification codes; the patent document in which the combination was found corresponds to that combination, and the bibliographic details of the patent document correspond to the document. For example, Korean Patent Application No. 10-2006-0012606 corresponds, as the document unique number, to all patent classification code combinations (Ai, Bj) present in the relation table of H04B 7/155 and H04Q 7/30. The bibliographic details of the document corresponding to this document unique number are: application number 10-2006-0012606, applicant Samsung Electronics, inventors Choi Do-in and Hwang Sung-taek, and application date February 9, 2006. In this way, the bibliographic details and all other information about the document can be mapped to the document unique number. Of course, the document unique number may be the application number, and if the patent application is also filed in the United States or elsewhere, the result of the operation may be mapped to the bibliographic details of the United States application.

It is preferable that all of the above-described correspondences between patent classification code combinations (Ai, Bj) and document numbers be stored in the DB, together with the counting values, when the patent classification code combinations (Ai, Bj) are generated.

When an obtained patent document has two or more patent classification codes, the homogeneous plural patent classification code relation preprocessing module refers to the patent classification code system (tree structure) data and creates a preset number of tables for each combination of patent classification codes. The numerical value entered in each table may be a value reflecting a weight based on a predetermined criterion. The module then applies a predetermined contribution weight to the combinations (Ai, Bj) of patent classification codes extracted from the tables, and may generate relation data for each combination (Ai, Bj) as shown in Table 28.

To illustrate this model, assume that there are only three patent documents: Korean Patent Application No. 10-2005-0111868, Korean Patent Application No. 10-2006-0012606, and Korean Patent Application No. 10-2005-0042032. Of course, iterative processing of all obtained documents makes it possible to generate the following data for all patent classification code combinations (Ai, Bj) based on the entire document set. For the three documents, the combinations (Ai, Bj) produce the following table:

TABLE 28

| Patent classification code combination (Ai, Bj) | Document number | Frequency (simple frequency) | 1 / nC2 equal allocation | (M, S), (S, S) differential allocation |
|---|---|---|---|---|
| (H04B 7/15, H04B 7/26) | 10-2005-0111868 | 1 | 1 | 1 |
| (H04B 7/15, H04B 7/24) | 10-2005-0111868 | 1 | 1 | 1 |
| (H04B 7/14, H04B 7/26) | 10-2005-0111868 | 1 | 1 | 1 |
| (H04B 7/14, H04B 7/24) | 10-2005-0111868 | 1 | 1 | 1 |
| (H04B 7/155, H04B 7/04) | 10-2006-0012606 | 1 | 1/3 | 0.75 / 2 |
| (H04B 7/155, H04B 7/02) | 10-2006-0012606 | 1 | 1/3 | 0.75 / 2 |
| (H04B 7/15, H04B 7/04) | 10-2006-0012606 | 1 | 1/3 | 0.75 / 2 |
| (H04B 7/15, H04B 7/02) | 10-2006-0012606 | 1 | 1/3 | 0.75 / 2 |
| (H04B 7/14, H04B 7/04) | 10-2006-0012606 | 1 | 1/3 | 0.75 / 2 |
| (H04B 7/14, H04B 7/02) | 10-2006-0012606 | 2 | 1/3 | 0.75 / 2 |
| (H04B 7/14, H04B 7/02) | 10-2005-0042032 | 2 | 1 | 1 |
| | | 1 | 1/3 | .. |
| | | 1 | 1/3 | .. |
| (H04Q 7/00, H04B 7/00) | 10-2006-0012606 | 1 | 1/3 | 0.25 / 1 |
| (H04Q 7/00, H04B) | 10-2006-0012606 | 1 | 1/3 | 0.25 / 1 |
| (H04Q, H04B) | 10-2006-0012606 | 1 | 1/3 | 0.25 / 1 |

Here, n is the number of patent classification codes, (M, S) refers to (main IPC, sub IPC), and (S, S) refers to (sub IPC, sub IPC). When n is 3 or more, the (M, S), (S, S) differential allocation gives a weight of 75% / (n-1) to each (M, S) pair and 25% / ((n-1)C2) to each (S, S) pair. When n = 2, the weight is simply 1.

Looking at the (H04B 7/14, H04B 7/02) pair, it can be seen that two document numbers correspond to it. That is, the pair appeared in two documents, so its Frequency value is 2. In addition to the simple frequency, a weight or contribution weight determined by a predetermined method, such as the 1 / nC2 equal allocation or the (M, S), (S, S) differential allocation, may be assigned.
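One possible way to derive the per-pair figures from rows like those of Table 28 is sketched below. Only a subset of the rows is used, the names are hypothetical, and summing the weights of a pair across documents is an assumption about how weighted counting might be carried out, not something the description prescribes.

```python
from collections import defaultdict
from fractions import Fraction

# (pair, document number, 1/nC2 weight, differential weight); a subset of Table 28.
ROWS = [
    (("H04B 7/15", "H04B 7/26"), "10-2005-0111868", Fraction(1), Fraction(1)),
    (("H04B 7/14", "H04B 7/04"), "10-2006-0012606", Fraction(1, 3), Fraction(3, 8)),
    (("H04B 7/14", "H04B 7/02"), "10-2006-0012606", Fraction(1, 3), Fraction(3, 8)),
    (("H04B 7/14", "H04B 7/02"), "10-2005-0042032", Fraction(1), Fraction(1)),
]

def aggregate(rows):
    """Group rows by pair: document numbers, simple frequency, weighted sums."""
    summary = defaultdict(
        lambda: {"docs": [], "equal": Fraction(0), "differential": Fraction(0)}
    )
    for pair, doc, w_equal, w_diff in rows:
        entry = summary[pair]
        entry["docs"].append(doc)
        entry["equal"] += w_equal
        entry["differential"] += w_diff
    for entry in summary.values():
        entry["frequency"] = len(entry["docs"])  # simple frequency
    return dict(summary)

summary = aggregate(ROWS)
print(summary[("H04B 7/14", "H04B 7/02")]["frequency"])  # prints: 2
```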

Table 28 shows the patent classification code combinations (Ai, Bj) generated for the three patent documents of Korean Patent Application No. 10-2005-0111868, Korean Patent Application No. 10-2006-0012606, and Korean Patent Application No. 10-2005-0042032. It will be apparent to those skilled in the art that the above method can be applied repeatedly or recursively to all documents obtained, and that relation data can be generated for each patent classification code combination (Ai, Bj) based on any one or more of the predetermined weights or contribution weights.

As described above, for each combination (Ai, Bj) the homogeneous plural patent classification code relation preprocessing module may store in the DB any one or more of 1) the document number from which (Ai, Bj) was obtained, 2) a weight (Wij) according to a weighting policy j for a specific situation i, 3) whether the pair is (M, S) or (S, S), and 4) other attribute information on (Ai, Bj).
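A minimal sketch of how the stored attributes might be laid out in a relational DB follows. The table name, column names and the in-memory SQLite database are assumptions chosen for illustration; the description itself does not fix a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE pair_relation (
        code_a TEXT, code_b TEXT,   -- the combination (Ai, Bj)
        doc_number TEXT,            -- 1) document in which the pair was found
        policy TEXT, weight REAL,   -- 2) weight Wij under a weighting policy
        kind TEXT                   -- 3) 'MS' or 'SS'
    )
""")
rows = [
    ("H04B 7/155", "H04B 7/04", "10-2006-0012606", "differential", 0.375, "MS"),
    ("H04Q 7/00", "H04B 7/00", "10-2006-0012606", "differential", 0.25, "SS"),
]
conn.executemany("INSERT INTO pair_relation VALUES (?, ?, ?, ?, ?, ?)", rows)

# Look up the documents and weights recorded for one combination.
docs = conn.execute(
    "SELECT doc_number, weight FROM pair_relation WHERE code_a = ? AND code_b = ?",
    ("H04B 7/155", "H04B 7/04"),
).fetchall()
print(docs)
```

Keeping the document number in each row lets the bibliographic details be joined in later from a separate document table keyed by the same number.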

Naturally, the processing of plural patent classification codes by the homogeneous plural patent classification code relation preprocessing module is not limited to the IPC. That is, plural patent classification codes may be processed in the same manner as the IPC for USPC in US patent documents, for FT and FI in Japanese patent documents, and for ECLA in European patent documents.

Technology Fusion Pattern Analysis Module

For a combination (Ai, Bj) of patent classification codes, the following information can be calculated, obtained or extracted by processing any one or more of the items stored by the homogeneous plural patent classification code relation preprocessing module: 1) the document number in which (Ai, Bj) occurs, 2) the weight (Wij) according to a weighting policy j for a specific situation i, 3) whether the pair is (M, S) or (S, S), and 4) other attribute information on (Ai, Bj). This processing is performed by the technology fusion pattern analysis module of the present invention.

The technology fusion pattern analysis module can essentially process the following kinds of information.

First, the homogeneous plural patent classification code relation preprocessing module can obtain the number of occurrences of each patent classification code pair. For example, (H04B 7/26, H04B 7/15) appears once, in Korean Patent Application No. 10-2005-0111868, while (H04B 7/14, H04B 7/02) appears in both 10-2006-0012606 and 10-2005-0042032. (When neither code is the main patent classification code, or when all patent classification codes are treated equally, (Ai, Bj) can be treated as equal to (Bj, Ai). When either code is the main patent classification code, (Ai, Bj) and (Bj, Ai) may be treated differently or treated the same.) Thus, the module can record the number of occurrences for every pair of patent classification codes. It can also store the patent document numbers of the patent documents in which each pair occurs.

Second, when a specific patent classification code is given, the homogeneous plural patent classification code relation preprocessing module may find the patent classification code most frequently combined with it. This is readily obtained by counting all pairs containing the given patent classification code and the number of times each occurs. The most frequently combined patent classification code may also be found after applying a predetermined weight or contribution weight, and, with reference to the patent classification code system (tree structure), the most frequently combined patent classification codes may be found at a predetermined level. For example, the module may find the patent classification code most frequently combined with H04B 7/26, and may do so at the IPC subclass, IPC main group, or IPC subgroup level.
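The lookup of the most frequently combined code can be illustrated as follows. This is a sketch under stated assumptions: the pair counts are taken from the H04B 7/14 rows of Table 28, the function names are hypothetical, and only the subclass and main group levels are handled because those two can be read directly from the IPC code string.

```python
from collections import Counter

def to_level(code, level):
    """Roll an IPC code up to 'subclass' or 'main_group'; 'full' keeps it as is."""
    if level == "subclass":
        return code[:4]                        # e.g. 'H04B'
    if level == "main_group" and "/" in code:
        return code.split("/")[0] + "/00"      # e.g. 'H04B 7/00'
    return code

def most_frequent_partners(pair_counts, given, level="full"):
    """Count the codes paired with `given`, rolled up to the requested level."""
    partners = Counter()
    for (a, b), count in pair_counts.items():
        if given in (a, b):
            other = b if a == given else a
            partners[to_level(other, level)] += count
    return partners.most_common()

PAIR_COUNTS = {
    ("H04B 7/14", "H04B 7/26"): 1,
    ("H04B 7/14", "H04B 7/24"): 1,
    ("H04B 7/14", "H04B 7/04"): 1,
    ("H04B 7/14", "H04B 7/02"): 2,
}
print(most_frequent_partners(PAIR_COUNTS, "H04B 7/14")[0])  # ('H04B 7/02', 2)
```

Passing weighted counts instead of simple frequencies in `pair_counts` would give the weighted variant mentioned above.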

Third, using the correspondences between patent classification code pairs and patent document numbers, the homogeneous plural patent classification code relation preprocessing module can generate analysis results that incorporate the bibliographic details corresponding to the patent document numbers. The analysis may combine the patent classification code pair with at least one selected from 1) country, 2) period, 3) applicant, 4) inventor and 5) agent. For example, the pairs of patent classification codes most frequently combined in Samsung's Korean patent applications from 2000 to 2005 can be identified. Furthermore, when a patent classification code is given, the module may find the patent classification code most frequently combined with it, and may do so within a document set limited by at least one combination selected from 1) country, 2) period, 3) applicant, 4) inventor and 5) agent. For example, the module may find, at each level of the patent classification code system, the patent classification codes most frequently combined with H04B 7/26, and may also find them within US patent applications from 2003 to 2005. In this way, the degree of technology convergence shown in the patent specifications of a particular subject during a specific period can be grasped.

Fourth, given a combination (Ai, Bj) of patent classification codes, the homogeneous plural patent classification code relation preprocessing module can match it with a specific set of patent documents related to that combination. When the module performs various kinds of analysis on the corresponding patent document set, the result is an analysis result for the combination (Ai, Bj). The specific patent document set includes, for example: 1) a set specified by a specific patent classification code of a specific patent classification code system in a specific country DB; 2) a set specified by a specific applicant name in a specific country DB; 3) a set specified by a specific inventor name included in the patent documents of a specific applicant (i.e., by applicant name and inventor name) in a specific country DB; 4) a set specified by a specific agent name in a specific country DB; 5) a set specified by a specific patent classification code and a specific applicant name in a specific country DB; 6) a set specified by a specific patent classification code of a specific patent classification code system, a specific applicant name, and a specific inventor name in a specific country DB; 7) a set specified by a specific applicant name and a specific agent name in a specific country DB; 8) the set of all patent documents in a specific country; 9) the set of all patent documents in at least two countries; 10) a set obtained by restricting any of 1) to 9) to a set period; and 11) a set generated by combining any of 1) to 9) with specific conditions such as registration status or request for examination. The calculation may be performed in advance for these patent document sets. Of course, the above calculation may also be performed for a document set specified or generated by the user.

For any one or more of the document sets specified in 1) to 11), the homogeneous plural patent classification code relation preprocessing module may extract only the patent documents to which two or more patent classification codes are assigned and obtain the combinations (Ai, Bj) of patent classification codes only for the extracted documents. For the obtained combinations (Ai, Bj), relation data as shown in Table 28 is generated, and the generated relation data is sorted or analyzed (frequency, mode, maximum, minimum, number per year, and so on). Analysis of the number of subjects such as applicants and inventors is one example. Through this analysis, information identifying technology convergence from the viewpoint of patent classification code combinations (Ai, Bj) can be obtained for any one or more of the document sets specified in 1) to 11) or for the extracted documents.

The repetitive execution method performed by the homogeneous plural patent classification code relation preprocessing module as described above is merely one embodiment for obtaining the resulting values, and those skilled in the art may apply various calculation methods using computational techniques within the concept of the present invention. Tables holding the results (weighted or unweighted) for each patent classification code pair, for all patent classification codes of all patent documents to which two or more patent classification codes are assigned, may be stored in a DB or provided as a view or a materialized view generated in real time.

An exemplary process for the above is illustrated in FIG. 103 and described as follows. The homogeneous plural patent classification code relation preprocessing module obtains a document set including at least one patent document (S5320); extracts, from the patent documents in the document set, the documents having plural patent classification codes, i.e., two or more patent classification codes (S5330); generates combinations of the patent classification codes included in each such document, according to a predetermined criterion for distinguishing the main patent classification code from the sub patent classification codes (S5340); obtains, for each combination, information on the parent nodes in the patent classification code system of each patent classification code included in the combination (S5350); obtains the lowest common patent classification code among the parent nodes (S5360); generates pairs of the parent-node patent classification codes, including the codes themselves, up to just before the lowest common patent classification code (S5370); stores each pair with an equal weight or a weight based on a preset criterion (S5380); and generates statistical values, parameters or calculated values according to predetermined criteria for each stored pair (S5390).

Next, a method of preprocessing and comparing plural patent classification codes for at least two document sets through the homogeneous plural patent classification code relation preprocessing module will be described with reference to FIG. 104.

The homogeneous plural patent classification code relation preprocessing module obtains two or more document sets, each including at least one patent document (S5410); extracts, for each document set, the documents having two or more patent classification codes from the patent documents in that set (S5430); generates combinations of the patent classification codes included in each such document, according to a predetermined criterion for distinguishing the main patent classification code from the sub patent classification codes (S5440); obtains, for each combination, information on the parent nodes in the patent classification code system of each patent classification code included in the combination (S5450); obtains the lowest common patent classification code among the parent nodes (S5460); generates pairs of the parent-node patent classification codes, including the codes themselves, up to just before the lowest common patent classification code (S5470); stores each pair with an equal weight or a weight based on a preset criterion (S5480); generates statistical values, parameters or calculated values according to predetermined criteria for each stored pair (S5490); and compares the statistical values, parameters or calculated values between the document sets (S5495).
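The comparison step between document sets can be sketched as below. For brevity the sketch counts only the pairs of codes that appear directly in each document (the expansion to parent-node codes is omitted), treats (Ai, Bj) and (Bj, Ai) as the same pair, and uses hypothetical document sets and function names.

```python
from collections import Counter
from itertools import combinations

def pair_counts(documents):
    """Count code pairs over the documents that carry two or more codes."""
    counts = Counter()
    for codes in documents:
        if len(codes) >= 2:
            for pair in combinations(sorted(set(codes)), 2):
                counts[pair] += 1
    return counts

def compare_sets(set_a, set_b):
    """Per pair: (count in A, count in B, difference A minus B)."""
    ca, cb = pair_counts(set_a), pair_counts(set_b)
    return {p: (ca[p], cb[p], ca[p] - cb[p]) for p in sorted(ca.keys() | cb.keys())}

# Hypothetical document sets; each document is given as its list of codes.
SET_A = [["H04B 7/02", "H04B 7/14"], ["H04B 7/02", "H04B 7/14", "H04Q 7/30"]]
SET_B = [["H04B 7/02", "H04B 7/14"], ["H04B 7/26"]]

comparison = compare_sets(SET_A, SET_B)
for pair, figures in comparison.items():
    print(pair, figures)
```

The two sets could be, for instance, the documents of two applicants or of two periods, so that the differences show where their technology combinations diverge.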

FIGS. 115 to 117 show exemplary embodiments of methods for analyzing plural patent classification codes.

FIG. 115 is an exemplary method of obtaining a patent classification code and presenting analysis information on the patent classification codes most frequently fused with the obtained patent classification code.

The homogeneous plural patent classification code relation preprocessing module obtains at least one patent classification code (S6520), obtains information on the patent classification codes most frequently fused with the obtained patent classification code (S6530), and displays the most frequently fused patent classification codes at one or more levels reflecting the patent classification code system, or performs at least one predetermined analysis on the set of patent documents corresponding to the most frequently fused patent classification codes (S6540).

FIG. 116 is an exemplary method of obtaining a document set, extracting the most frequent patent classification codes from the obtained document set, and presenting analysis information on the patent classification codes most frequently fused with the extracted patent classification codes.

The homogeneous plural patent classification code relation preprocessing module receives at least one document set (S6620), obtains the patent classification codes of each document constituting the document set (S6630), ranks the obtained patent classification codes at the document-set level based on their frequency (mode) (S6640), obtains information on the most frequently fused patent classification codes for each ranked patent classification code (S6650), and displays the most frequently fused patent classification codes at one or more levels reflecting the patent classification code system, or performs at least one predetermined analysis on the set of patent documents corresponding to the most frequently fused patent classification codes (S6660).

FIG. 117 is an exemplary method of obtaining a document set, extracting the most frequently fused patent classification codes on the basis of the patent classification codes of each individual document, and presenting analysis information after integrating the extracted most frequently fused patent classification codes.

The homogeneous plural patent classification code relation preprocessing module receives at least one document set (S670), obtains the patent classification codes of each document constituting the document set (S670), obtains information on the most frequently fused patent classification codes on the basis of the patent classification codes obtained for each document (S670), integrates the most frequently fused patent classification code information obtained for the individual documents (S670), and displays the integrated most frequently fused patent classification codes at one or more levels reflecting the patent classification code system, or performs at least one predetermined analysis on the set of patent documents corresponding to the integrated most frequently fused patent classification codes (S670).

The homogeneous plural patent classification code relation preprocessing module described above may also operate without applying the upper patent classification codes, which is a feature of the present invention, and simply form pairs from the patent classification (PC) codes (PC1, ..., PCi, PCj, ..., PCn) included in each document. That is, the patent classification code combination pairs may be formed only for the (PCi, PCj) appearing in the document, without considering the upper patent classification codes of a given PCi in the patent classification code system. Since PC1 is the main patent classification code, the main-versus-sub weighting described above can be applied as it is.
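A minimal sketch of this simplified pairing is shown below. It assumes that the first code in the list is the main patent classification code PC1; the function name and the 'MS' / 'SS' labels are hypothetical.

```python
from itertools import combinations

def plain_pairs(codes):
    """Return (PCi, PCj, kind) for one document; codes[0] is taken as the main code."""
    pairs = []
    for i, j in combinations(range(len(codes)), 2):
        kind = "MS" if i == 0 else "SS"  # i < j, so only i can be the main code
        pairs.append((codes[i], codes[j], kind))
    return pairs

pairs = plain_pairs(["H04B 7/04", "H04B 7/155", "H04Q 7/30"])
for pair in pairs:
    print(pair)
```

Each labeled pair can then be given the weight that the chosen weighting method assigns to its kind.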

Heterogeneous Multiple Patent Classification Symbol Relationship Preprocessing Module

The homogeneous plural patent classification code relation preprocessing module described above handles patent classification codes of the same type. Next, the heterogeneous plural patent classification code relation preprocessing module will be described. Its operation is entirely analogous to that of the homogeneous plural patent classification code relation preprocessing module. Almost all patent documents issued in the United States carry both IPC and USPC codes in one patent document, and patent documents issued in Japan are assigned IPC and FT codes at the same time. Assume that IPC1, ..., IPCn and USPC1, ..., USPCm are assigned to one patent document. From the viewpoint of heterogeneous plural patent classification code relation processing, there are four kinds of relationships between IPC1, ..., IPCn and USPC1, ..., USPCm. The first is the relationship between IPC1 and USPC1, a pair of main IPC and main USPC. The second is the relationship between IPC1 and each of USPC2, ..., USPCm, pairs of main IPC and sub USPC. The third is the relationship between each of IPC2, ..., IPCn and USPC1, pairs of sub IPC and main USPC. The fourth is the relationship between each of IPC2, ..., IPCn and each of USPC2, ..., USPCm, pairs of sub IPC and sub USPC.
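The four kinds of relationships can be enumerated with a short sketch. The function name is hypothetical, the first code of each list is assumed to be the main code, and the IPC and USPC codes in the example are illustrative values, not taken from an actual document.

```python
from itertools import product

def heterogeneous_pairs(ipc_codes, uspc_codes):
    """Pair every IPC code with every USPC code; index 0 is the main code."""
    pairs = []
    for (i, ipc), (j, uspc) in product(enumerate(ipc_codes), enumerate(uspc_codes)):
        kind = ("main" if i == 0 else "sub", "main" if j == 0 else "sub")
        pairs.append((ipc, uspc, kind))
    return pairs

# Illustrative codes for one US patent document (n = 2 IPC codes, m = 3 USPC codes).
pairs = heterogeneous_pairs(
    ["H04B 7/155", "H04Q 7/30"],
    ["455/11.1", "370/315", "455/7"],
)
for pair in pairs:
    print(pair)
```

With n IPC codes and m USPC codes this yields 1 (main, main) pair, m-1 (main, sub) pairs, n-1 (sub, main) pairs and (n-1)(m-1) (sub, sub) pairs.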

For each of these kinds, the heterogeneous plural patent classification code relation preprocessing module may perform, between the heterogeneous patent classification codes, the same processing as that performed by the homogeneous plural patent classification code relation preprocessing module. In this case, since heterogeneous patent classification codes are targeted, a lowest common patent classification code may not exist, and the range of upper patent classification codes may instead be limited to a preset value. That is, the upper patent classification codes for a given IPC and USPC pair can be defined within such a preset range, for example targeting only subclasses and classes in the USPC. Just as the homogeneous plural patent classification code relation preprocessing module generates the relation table described above for a given patent classification code combination (Ai, Bj) (where Ai and Bj are patent classification codes of the same system), the heterogeneous plural patent classification code relation preprocessing module may generate the relation table described above for a patent classification code combination (Ai, Bj) (where Ai and Bj are patent classification codes of different systems). A weight or contribution weight may be given to the generated relation table. Further, a patent document number can be associated with each heterogeneous patent classification code combination (Ai, Bj), and it will be obvious that counting and calculations reflecting the bibliographic details obtained from the patent document number can be performed. In addition, heterogeneous patent classification code combinations (Ai, Bj) can be obtained for a specific document set, and an analysis equivalent to that of the homogeneous plural patent classification code relation preprocessing module can be performed on these combinations.

Statistical preprocessing module by patent classification code

Multilevel of patent classification code

The statistical preprocessing module for each patent classification code of the present invention performs preprocessing to obtain predetermined statistical values from at least one country-specific patent DB for at least one patent classification code in at least one patent classification code system.

FIG. 61 illustrates the structure of the statistical preprocessing module for each patent classification code. The module may include a statistics preprocessing engine 3210 for each patent classification code, which preprocesses statistical values for each patent classification code, and a preprocessed patent classification code statistics DB 3230, which stores the preprocessed statistical values as a view, a table, or another form of information organization.

The patent classification codes handled by the statistical preprocessing module for each patent classification code are any one or more of IPC, USPC, FT, FI, and ECLA. The country-specific patent DBs handled by the module include the patent DBs of Korea, the United States, Japan, and Europe (EPO), and may also include patent DBs issued by the patent offices of other countries. The IPC is common to all countries, and some offices additionally have their own classification schemes (USPC or UPC for the US Patent Office, FT and FI for the Japan Patent Office, ECLA for the European Patent Office, and so on). For the patent DB of a country having two or more patent classification systems, it is desirable to obtain the predetermined statistical values independently for each classification system. That is, for the United States, IPC and USPC need to be preprocessed separately. In addition, each patent classification code system has its own levels. The IPC has sections, subsections, classes, subclasses, groups, and subgroups, and the levels below the subgroup express their hierarchy through the number of dots in the title information corresponding to the patent classification code. Examples of such hierarchies are described in detail for the patent classification code preprocessing module 301-3-1 or 3500. Therefore, the levels of patent classification codes in the present invention may be divided, for the IPC, into section, class, subclass, main group, 1-dot subgroup, 2-dot subgroup, ..., n-dot subgroup. Down to the main group, the hierarchy can be identified from the classification code itself, but from the subgroup downward it can be determined only by using the dot information included in the title. Determining the hierarchy by the number of dots in this way applies from the subclass downward in the case of USPC and, in the case of FT, to the part after the first seven characters (up to "theme code + two alphabetic characters") of the full FT code.

An example of a method by which the statistical preprocessing module for each patent classification code preprocesses a statistical value for each patent classification code is illustrated in FIG. 100. For a specific patent classification code system of a first country, the module readjusts the patent classification codes using the tree structure of the classification system, so that the lower classification codes of a given patent classification code can be included automatically, and stores the result in a DB (S5020). It then obtains a search expression or search query that includes patent classification codes (S5030), converts each given patent classification code in the search expression or query into the readjusted patent classification code (S5040), runs the search expression or query containing the readjusted code against a search engine or the DBMS 201 (S5050), and obtains and quantifies the patent document information returned as the search result (S5060). Finally, for the search result, it calculates statistical values, parameters, or calculated values for subjects such as applicants, inventors, or agents, per predetermined time unit or over the integrated period, and calculates ranking information based on those values.

In addition, the statistical preprocessing module for each patent classification code may generate a statistical value, a parameter, or a calculated value with the lower patent classification codes of a given patent classification code included automatically; an exemplary method is illustrated in FIG. 101.

The statistical preprocessing module for each patent classification code obtains a search expression that includes a patent classification code (S5120), obtains the patent document information related to that patent classification code, including the patent information for its lower patent classification codes (S5130), and processes predetermined statistical values, parameters, or calculated values from the citing and cited information contained in the patent documents (S5140).

In addition, for a document subset of a specific document set, the statistical preprocessing module for each patent classification code may generate statistical values, parameter values, or calculated values from the citing or cited information for each patent classification code, with the lower patent classification codes included. One exemplary method of this is shown in FIG.

The statistical preprocessing module for each patent classification code obtains at least one document set (S5220); extracts the applicant, inventor, agent, and at least one patent technology classification code for every document in the document set (S5230); generates a document subset by applying any one or more of the extracted criteria to the documents (S5240); obtains, for every document in the document subset, information on the earlier-filed documents cited by that document (S5250); obtains, for every document in the document subset, information on the later-filed documents that cite that document (S5260); and generates the preset citation-related statistical values, parameters, or calculated values for the obtained earlier-filed documents and/or later-filed documents (S5270).

Hereinafter, the patent classification code statistical preprocessing module of the present invention will be described in more detail in terms of the patent classification code statistical preprocessing engine.

Option

When the patent classification code statistical preprocessing module processes the predetermined statistical value, the following options are considered, and cross selection for each option may be possible.

First, when a patent classification code is given, the question is whether to automatically include information on the lower classification codes of that code. This matters especially at levels where the hierarchical structure must be determined from the number of dots in the title information. Down to levels such as H, H04, H04B, and H04B 7/00, a truncation (prefix expansion) search easily finds the patent documents whose codes fall under a given IPC level. Below that, the search method that includes lower classifications can be used to find the patent documents that contain codes belonging to the lower classifications. For example, given the IPC code H04B 7/15, generating a statistical value without the lower classification codes requires only the patent documents of each country that contain H04B 7/15 itself. Generating the statistical value with the lower classification codes included requires, in addition, the patent documents containing H04B 7/155, H04B 7/165, H04B 7/17, H04B 7/185, H04B 7/19, H04B 7/195, H04B 7/204, H04B 7/208, H04B 7/212, and H04B 7/216. The same applies to USPC, FT, and the other systems, so the common description is omitted.

When statistical values are processed automatically with the lower classification codes included, the range search method of the present invention may be used (with either the search engine or the DBMS 201). Alternatively, the same purpose can be achieved without a range search: all patent classification codes below a given patent classification code are stored in advance, and when statistical values are processed for the given code, they are processed not only for that code but also for all of the stored lower codes. For example, H04B 7/155, H04B 7/165, H04B 7/17, H04B 7/185, H04B 7/19, H04B 7/195, H04B 7/204, H04B 7/208, H04B 7/212, and H04B 7/216 are obtained from the patent classification code hierarchy information as the codes below H04B 7/15 and are stored as such; the preset statistical values for H04B 7/15 are then obtained over H04B 7/15 together with all ten stored lower codes. The same is true of other patent classification code systems such as USPC and FT.
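The stored-lower-code approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the `DESCENDANTS` table holds the H04B 7/15 example from the text, while the function names `expand` and `count_documents` and the sample documents D1 to D4 are hypothetical.

```python
# Stored lower classification codes for a given code
# (the H04B 7/15 example from the text).
DESCENDANTS = {
    "H04B 7/15": [
        "H04B 7/155", "H04B 7/165", "H04B 7/17", "H04B 7/185", "H04B 7/19",
        "H04B 7/195", "H04B 7/204", "H04B 7/208", "H04B 7/212", "H04B 7/216",
    ],
}


def expand(code):
    """Return the code itself plus every stored lower classification code."""
    return {code, *DESCENDANTS.get(code, [])}


def count_documents(docs, code, include_lower=True):
    """Count documents carrying the code (and, optionally, any lower code)."""
    targets = expand(code) if include_lower else {code}
    return sum(1 for codes in docs.values() if targets & set(codes))


# Hypothetical documents: document id -> classification codes on the document.
docs = {
    "D1": ["H04B 7/15"],
    "D2": ["H04B 7/185", "H04W 84/06"],
    "D3": ["H04B 7/216"],
    "D4": ["H04B 7/26"],
}

print(count_documents(docs, "H04B 7/15", include_lower=False))  # 1
print(count_documents(docs, "H04B 7/15"))                       # 3
```

In this sketch a document is counted once even if it carries several codes under the same parent; whether that is the desired behavior depends on the weighting option chosen for multiple classification codes.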

When a patent classification code of a given level is provided, it is desirable to generate the predetermined statistical value with the lower patent classification codes of that code included automatically, rather than only for the documents that contain the given code itself.

Second, the question is how to handle a document that has a plurality of patent classification codes. The possible methods are 1) to apply statistical processing only to the main patent classification code (usually the first one listed), 2) to give the main patent classification code and the secondary patent classification codes equal weight, and 3) to give the main and secondary patent classification codes different weights (for example, when there are n + 1 patent classification codes, 50% to the main code and 50% * 1/n to each of the remaining n secondary codes). Of methods 1) to 3), methods 1) and 3) are preferred, and method 3) is the more preferable.
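The three weighting methods can be sketched as below. The function name `code_weights` and the scheme labels are hypothetical. Two points are assumptions not stated in the text: "equal weight" is read as equal shares that sum to 1, and a document with only a main code gives that code the full weight under every scheme.

```python
def code_weights(codes, scheme="main_heavy"):
    """Return {code: weight} for the classification codes of one document.

    codes[0] is treated as the main code; codes are assumed to be distinct.
      "main_only"  - method 1): only the main code is counted
      "equal"      - method 2): every code gets the same share (assumed to sum to 1)
      "main_heavy" - method 3): main code 50%, each of the n other codes 50% / n
    """
    if not codes:
        return {}
    main, others = codes[0], codes[1:]
    if scheme == "main_only" or not others:
        return {main: 1.0}
    if scheme == "equal":
        return {code: 1.0 / len(codes) for code in codes}
    if scheme == "main_heavy":
        weights = {main: 0.5}
        for code in others:
            weights[code] = 0.5 / len(others)
        return weights
    raise ValueError(f"unknown scheme: {scheme}")


# Three codes, so n = 2 secondary codes: 50%, 25%, 25%.
print(code_weights(["H04B 7/15", "H04B 7/185", "H04B 7/204"]))
```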

Third, the question is how to handle a document that has a plurality of applicants or inventors. The possible methods are 1) to give each applicant or inventor a weight of 100%, and 2) to divide the weight equally by the number of applicants or inventors (when there are n inventors, 100% * 1/n for each inventor). For applicants, a weight of 100% for each applicant is preferred from the standpoint of implementation. For agents, either of the methods described for applicants and inventors may be employed, and a weight of 100% for each agent is preferred.
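The difference between the two counting methods can be illustrated with a short sketch. The function name `tally` and the inventor names are hypothetical; each inner list stands for the inventors (or applicants) named on one document.

```python
from collections import defaultdict


def tally(documents, fractional=False):
    """Sum per-person weights over documents.

    fractional=False: method 1), each person on a document gets 100%.
    fractional=True:  method 2), each of n persons gets 100% * 1/n.
    """
    totals = defaultdict(float)
    for people in documents:
        if not people:
            continue  # nothing to credit on a document with no names
        share = 1.0 / len(people) if fractional else 1.0
        for person in people:
            totals[person] += share
    return dict(totals)


documents = [["Kim", "Lee"], ["Kim"]]
print(tally(documents))                   # {'Kim': 2.0, 'Lee': 1.0}
print(tally(documents, fractional=True))  # {'Kim': 1.5, 'Lee': 0.5}
```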

Information to be precalculated

Information about the structure of the classification itself

For a given patent classification code, the statistical preprocessing module for each patent classification code may count the number of its immediate lower patent classification codes in the patent classification code system, and may also count the number of all of its lower patent classification codes. For example, H04B 7/15 has 3 immediate lower codes (H04B 7/155, H04B 7/185, and H04B 7/204) and 10 lower codes in total: those three plus H04B 7/165, H04B 7/17, H04B 7/19, H04B 7/195, H04B 7/208, H04B 7/212, and H04B 7/216. The more immediate lower codes and total lower codes a patent classification code has, the more finely its technology has been subdivided, so the code is likely to be relatively important among codes at the same (dot) level of the patent classification system.
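The two counts can be derived from dot depths with a simple stack, as sketched below. The function names are hypothetical, and the dot counts in `ENTRIES` are illustrative assumptions chosen to be consistent with the counts given in the text (3 immediate lower codes, 10 in total); they should be checked against the official classification titles before use.

```python
def build_children(entries):
    """entries: (code, dot_count) pairs in classification-scheme order.

    Returns {code: [immediate lower codes]}. A code's parent is the nearest
    preceding code with a smaller dot count.
    """
    children = {code: [] for code, _ in entries}
    stack = []  # chain of (code, dots) ancestors of the current position
    for code, dots in entries:
        while stack and stack[-1][1] >= dots:
            stack.pop()
        if stack:
            children[stack[-1][0]].append(code)
        stack.append((code, dots))
    return children


def count_descendants(children, code):
    """Count every lower code, at any depth, under the given code."""
    return sum(1 + count_descendants(children, c) for c in children[code])


# Illustrative dot depths (assumed, not taken from the text).
ENTRIES = [
    ("H04B 7/15", 2),
    ("H04B 7/155", 3), ("H04B 7/165", 4), ("H04B 7/17", 5),
    ("H04B 7/185", 3), ("H04B 7/19", 4), ("H04B 7/195", 4),
    ("H04B 7/204", 3), ("H04B 7/208", 4), ("H04B 7/212", 4), ("H04B 7/216", 4),
]

children = build_children(ENTRIES)
print(len(children["H04B 7/15"]))                # 3
print(count_descendants(children, "H04B 7/15"))  # 10
```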

Quantitative distribution

For a given patent classification code and a designated country-specific patent DB, the statistical preprocessing module for each patent classification code may calculate the following information on an application basis and/or a registration basis for a predetermined period. This is possible because the conditions (the patent classification code, the designated country, the set period, and whether applications or registrations are counted) determine a set of patent documents. Counting over the determined patent document set enables the calculation of various parameters, as follows.

For example, on an application basis, the number of applications per year or quarter, the application growth rate, the application growth speed, and the application growth acceleration may be calculated. The number of applications per year is the total number of patent documents filed in that year. The growth rate is calculated as {(current-period count - past-period count) / past-period count} * 100%, the growth speed as {(current-period count - past-period count) / time interval}, and the growth acceleration as the rate of change of the growth speed over the time interval.
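As a worked illustration of these three formulas, the sketch below applies them to a hypothetical series of yearly application counts. The function names are assumptions, and the acceleration is taken here as the difference between the two most recent growth speeds divided by the time interval, which is one straightforward reading of "rate of change of the growth speed".

```python
def growth_rate(current, past):
    """Growth rate in percent: (current - past) / past * 100.

    Undefined when the past-period count is zero.
    """
    return (current - past) / past * 100.0


def growth_speed(current, past, interval=1.0):
    """Growth speed: change in count per unit of time."""
    return (current - past) / interval


def growth_acceleration(counts, interval=1.0):
    """Change in growth speed between the last two intervals, per unit of time."""
    previous_speed = growth_speed(counts[-2], counts[-3], interval)
    current_speed = growth_speed(counts[-1], counts[-2], interval)
    return (current_speed - previous_speed) / interval


# Hypothetical application counts for three consecutive years.
counts = [100, 120, 150]
print(growth_rate(counts[-1], counts[-2]))   # 25.0
print(growth_speed(counts[-1], counts[-2]))  # 30.0
print(growth_acceleration(counts))           # 10.0
```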

In addition, the number of applicants per year can be obtained (for joint applications, the option described above can be applied). Once the number of applicants is counted, the applicant growth rate can be calculated, as can the average number of applicants per application and the growth rate of that average. Likewise, the number of inventors, the inventor growth rate, the average number of inventors, and the average inventor growth rate may be calculated by year.

In addition, since the number of application claims can be calculated for each individual application, it is possible to calculate the number of application claims per year, the application claim increase rate, the average application claim number per application, and the average application claim increase rate. In this case, if the application claims are divided into independent and dependent claims, the number of independent claims and dependent claims may be calculated, and thus, an increase rate thereof may be calculated.

In addition, since the number of patent classification codes can be counted for each application, the number of patent classification codes and its growth rate can be calculated by year, as can the average number of patent classification codes per application and the growth rate of that average. For a patent DB in which two or more patent classification systems are used, such as that of the United States or Japan, these values may be calculated for each classification system. When family information is available for each application, the number of families, the family growth rate, the average number of family countries per year, and the average family country growth rate may also be calculated from the family information.

The above parameters are calculated on an application basis, and the same parameters may be calculated on a registration basis. The parameters that can be calculated on a registration basis include the number of registrations, the registration growth rate, the registration growth speed, and the registration growth acceleration; the number of registrants, the registrant growth rate, the average number of registrants, and the average registrant growth rate; the number of inventors, the inventor growth rate, the average number of inventors, and the average inventor growth rate; the number of registered claims, the registered claim growth rate, the average number of registered claims, and the average registered claim growth rate; the number of patent classification codes, the patent classification code growth rate, the average number of patent classification codes, and the average patent classification code growth rate; and the number of families, the family growth rate, the average number of family countries, and the average family country growth rate.

These parameters are exemplary; various other parameter values may be calculated from the numerical information contained in application or registered patent information and from the pre-counted values of the present invention. The parameter values are essentially count values, the rates and amounts of change of those count values, statistics of the count values (average, standard deviation), and function values over two or more count values (for example, the registration rate).

Quantitative subject discovery

For a given patent classification code and a designated country-specific patent DB, the statistical preprocessing module for each patent classification code may calculate the following information on an application basis and/or a registration basis for a predetermined period. This is possible because the conditions (the patent classification code, the designated country, the set period, and whether applications or registrations are counted) specify a patent document set. Counting over the specified patent document set makes it possible to calculate various parameters, as follows.

For example, by counting over the specified patent document set, the number of applications of each applicant, inventor, or agent can be extracted by year or aggregated over all years, and the applicants, inventors, or agents can be ranked accordingly.
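A per-year and all-years ranking of this kind can be sketched with `collections.Counter`. The record layout, the function names, and the applicant names are hypothetical; each record stands for one application credited to one applicant at a weight of 100%.

```python
from collections import Counter, defaultdict


def rank_by_year(records):
    """records: (year, applicant) pairs. Returns {year: [(applicant, count), ...]}
    with each list sorted from the most to the fewest applications."""
    per_year = defaultdict(Counter)
    for year, applicant in records:
        per_year[year][applicant] += 1
    return {year: counter.most_common() for year, counter in per_year.items()}


def rank_overall(records):
    """Ranking aggregated over all years."""
    return Counter(applicant for _, applicant in records).most_common()


records = [
    (2005, "Alpha Co"), (2005, "Alpha Co"), (2005, "Beta Inc"),
    (2006, "Beta Inc"), (2006, "Beta Inc"),
]
print(rank_by_year(records)[2005])  # [('Alpha Co', 2), ('Beta Inc', 1)]
print(rank_overall(records))        # [('Beta Inc', 3), ('Alpha Co', 2)]
```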

In addition, a concentration, an activity index, and the like may be calculated for each applicant, inventor, or agent extracted above, and the applicants, inventors, or agents may then be ranked by concentration or by activity index. For reference, the concentration based on the number of applications may be obtained as {(number of applications of a specific subject in a specific patent classification code during a specific period) / (number of all applications of the specific subject during the specific period)} * 100%. The activity index may be obtained as [{(number of applications of the specific subject in the specific patent classification code during the specific period) / (number of all applications of the specific subject during the specific period)} / {(number of all applications in the specific patent classification code during the specific period) / (number of all applications during the specific period)}] * 100%. Because the search engine or the DBMS 201 can supply 1) the number of applications of the specific subject in the specific patent classification code during the specific period, 2) the number of all applications of the specific subject during the specific period, 3) the number of all applications in the specific patent classification code during the specific period, and 4) the number of all applications during the specific period, the concentration and the activity index can be calculated. The concentration and the activity index are only examples of patent indicators. For any defined function whose inputs are numerical values obtainable directly or indirectly through the search engine or the DBMS 201, the function value may be calculated and used to rank the applicants, inventors, or agents.
The functions of the present invention may include every conventional patent analysis index that can be defined over values obtainable from the patent DB through the search engine or the DBMS 201.
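The two indicators can be written out directly from the four counts listed above. The sketch below is only an illustration; the function and parameter names and the sample counts are hypothetical.

```python
def concentration(subject_in_code, subject_total):
    """Share of a subject's applications that fall in the given code, in percent."""
    return subject_in_code / subject_total * 100.0


def activity_index(subject_in_code, subject_total, all_in_code, all_total):
    """Subject's share in the code relative to the overall share of the code,
    in percent. A value above 100 means the subject is more focused on the
    code than applicants as a whole."""
    subject_share = subject_in_code / subject_total
    overall_share = all_in_code / all_total
    return subject_share / overall_share * 100.0


# Hypothetical counts for one applicant, one classification code, one period:
# 1) 25 applications by the applicant in the code, 2) 100 by the applicant overall,
# 3) 500 applications in the code overall, 4) 8000 applications overall.
print(concentration(25, 100))              # 25.0
print(activity_index(25, 100, 500, 8000))  # 400.0
```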

Although the rankings of applicants, inventors, or agents above are based on the number of applications, rankings may also be calculated for patent indicators or functions based on other parameters, such as the application growth rate and the application growth speed. Such parameters may include the number of application claims, the application claim growth rate, the average number of claims per application, the growth rate of the average number of claims per application, the number of patent classification codes, the patent classification code growth rate, the average number of patent classification codes per application, the growth rate of that average, the number of families, the family growth rate, the average number of family countries per application, and the average family country growth rate per application.

Although the various patent indices or functions have been described on an application basis, they can equally be defined and calculated on a registration basis, because restricting to applications or registrations is simply a matter of specifying the patent document set. For the same reason, the patent document set may be restricted, at the application stage, to applications for which a request for examination has been filed. Where a utility model system exists, whether utility model applications or utility model registrations are included in the applications or registrations, and more generally the method and criteria for specifying the patent document set, may be determined in various ways at the level of those skilled in the art.

Citation related

For a given patent classification code and a country-specific patent DB that includes citation information (for example, a patent DB built from the patent documents issued by the US Patent Office), the statistical preprocessing module for each patent classification code may generate the following citation-related information on an application basis and/or a registration basis for a predetermined period. This is possible because the conditions (the patent classification code, the designated country, the set period, and whether applications or registrations are counted) specify a patent document set. For example, given USPC = 002/456 (body cover, under GUARD OR PROTECTOR, under MISCELLANEOUS, in Apparel), the set of patent documents that carry USPC = 002/456 as a patent classification code can be specified. Counting over the specified patent document set makes it possible to calculate various parameters, as follows.

The bibliographic data of a US patent document contains citation information, such as the patents of others that the document cites. When document B refers to documents A and a, documents A and a are the cited patents, and document B is the patent citing A and a. In this case, the applicants (including assignees), inventors, agents, and patent classification codes of document B are the citing applicants, citing inventors, citing agents, and citing patent classification codes, respectively, and the applicants, inventors, agents, and patent classification codes of documents A and a are the cited applicants, cited inventors, cited agents, and cited patent classification codes, respectively.

When document B cites document A and document a, the bibliographic data of document B contains the numbers of document A and document a. Computationally, this relationship yields the mappings A -> B and a -> B from the cited side and the mappings B -> A and B -> a from the citing side, so the two-way relationships A <-> B and a <-> B can be established. The documents cited by B can therefore easily be turned into data from the bibliographic data of B, and through the mapping, the set of documents that cite A and the set of documents that cite a can easily be specified. That is, both the set of documents citing A and the set of documents citing a will include document B.
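The two-way mapping can be sketched as a pair of dictionaries built from the reference lists in the bibliographic data. The function name `build_citation_maps` and the input layout are hypothetical; the document labels follow the example in the text.

```python
from collections import defaultdict


def build_citation_maps(references):
    """references: {citing document: [cited documents]} from bibliographic data.

    Returns (cites, cited_by):
      cites[B]    -> documents that B cites          (B -> A, B -> a)
      cited_by[A] -> documents that cite A           (A -> B, a -> B)
    """
    cites = {doc: list(refs) for doc, refs in references.items()}
    cited_by = defaultdict(list)
    for citing, refs in references.items():
        for cited in refs:
            cited_by[cited].append(citing)
    return cites, dict(cited_by)


cites, cited_by = build_citation_maps({"B": ["A", "a"]})
print(cites)     # {'B': ['A', 'a']}
print(cited_by)  # {'A': ['B'], 'a': ['B']}
```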

This process may be performed for every document in the specified set, either by handling the patent documents one by one or by establishing the mapping relationships among all patent documents in a batch. When the documents are handled one by one, each document being processed takes the position of document B. If documents A and a are both included in the US patent DB, the information on documents A and a may be recorded in the information related to document B; even when document A or a is managed in the patent DB of another country, the mapping to document B may be recorded in that country's patent DB.

Once a patent document set that includes citation information (the set containing document B) is specified, the statistical preprocessing module for each patent classification code can collect the cited patent document numbers (the numbers of documents A and a) contained in each patent document of the set. Each cited patent document number corresponds to a cited patent document, and each cited patent document has its own bibliographic data. Accordingly, the set of cited patent documents, that is, the documents cited by the patent documents of the specified set, is also specified as of a particular calculation time, and it can be analyzed, counted, or used in calculations. Duplicate counting is allowed here: when a cited document is cited by several documents in the specified set, it should carry a weight or count equal to the number of times it is cited. For example, when document A is cited five times and document a is cited three times, it is preferable to give document A a weight of 5 and document a a weight of 3 when calculating the parameter values below, such as the number of cited applications. All values calculated for the cited patent document set, taking the specified patent document set as the basis, may be referred to as "citation" values. With duplicate counting, it is also possible to rank the cited document numbers by the number of times they are cited.
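The duplicate counting described here amounts to counting each cited document once per citing document. A minimal sketch with `collections.Counter` follows; the function name `cited_counts` and the documents B1 to B5 are hypothetical and are arranged so that document A is cited five times and document a three times, as in the example.

```python
from collections import Counter


def cited_counts(doc_set, cites):
    """Count how many times each document is cited by the documents in doc_set.

    cites maps a document to the list of documents it cites.
    """
    counts = Counter()
    for doc in doc_set:
        counts.update(cites.get(doc, []))
    return counts


# Five hypothetical documents in the specified set; all cite A, three also cite a.
cites = {f"B{i}": ["A"] for i in range(1, 6)}
for i in range(1, 4):
    cites[f"B{i}"].append("a")

counts = cited_counts(cites.keys(), cites)
print(counts.most_common())  # [('A', 5), ('a', 3)]
print(sum(counts.values()))  # 8, the weighted number of cited applications
```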

For the cited patent document set (the set containing documents A and a), the statistical preprocessing module for each patent classification code may calculate the following quantitative information: the number of cited applications, the cited application growth rate, the cited application growth speed, and the cited application growth acceleration (all of which follow from the number of documents constituting the cited patent document set); the number of cited applicants, the cited applicant growth rate, the average number of cited applicants, and the average cited applicant growth rate; the number of cited inventors, the cited inventor growth rate, the average number of cited inventors, and the average cited inventor growth rate; the number of cited application claims, the cited application claim growth rate, the average number of cited application claims, and the average cited application claim growth rate; and the number of cited patent classification codes, the cited patent classification code growth rate, the average number of cited patent classification codes, and the average cited patent classification code growth rate. These parameters are counted in the same way as described in the parameter calculation method above.
That is, when the application-basis parameters described above (the number of applications, the application growth rate, the number of applicants, the number of inventors, the number of claims, the number of patent classification codes, and so on) are calculated over the cited patent document set, the resulting values are the corresponding cited parameters (the number of cited applications, the cited application growth rate, the number of cited applicants, and so on).

Similarly, for the cited patent document set, the statistical preprocessing module for each patent classification code may calculate rankings of applicants, inventors, agents, or patent classification codes for each parameter. For example, based on the number of applications, it can rank the most-cited applicants and the most-cited inventors. When ranking by patent classification code, the lower patent classification code handling of the present invention makes it possible to rank the most-cited patent classification codes at each level of each patent classification code system. Each cited patent document in the cited patent document set carries at least one patent classification code, and it is reasonable to regard every higher patent classification code (in the patent classification code system) of such a code as cited as well. It is therefore desirable to reflect this when calculating the ranking of the most-cited patent classification codes at each level. For the US patent documents in the cited patent document set, both USPC and IPC may be used as the patent classification code.
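Crediting the higher classification codes can be sketched as a roll-up over a parent map. Everything named here is hypothetical: `PARENT` is only an illustrative fragment of a hierarchy (the parent of H04B 7/19 is an assumption), and the sketch credits each code at most once per cited document, multiplied by the number of times that document is cited. A per-level ranking would additionally filter the result by level.

```python
from collections import Counter

# Illustrative fragment of a classification hierarchy: code -> immediate higher code.
PARENT = {
    "H04B 7/155": "H04B 7/15",
    "H04B 7/185": "H04B 7/15",
    "H04B 7/19": "H04B 7/185",  # assumed for illustration
}


def with_ancestors(code):
    """Return the code followed by all of its higher codes."""
    chain = [code]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def cited_code_totals(cited_docs):
    """cited_docs: (classification codes of a cited document, times cited) pairs.

    Each code and each of its higher codes is credited with the citation
    weight, at most once per cited document.
    """
    totals = Counter()
    for codes, weight in cited_docs:
        credited = set()
        for code in codes:
            credited.update(with_ancestors(code))
        for code in credited:
            totals[code] += weight
    return totals


totals = cited_code_totals([(["H04B 7/19"], 5), (["H04B 7/155"], 3)])
print(totals.most_common(1))  # [('H04B 7/15', 8)]
```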

The statistical preprocessing module for each patent classification code may obtain the applicant information for the patent document set that includes citation information (the set containing document B), and may specify a patent document set for each applicant (the patent document set of one of those applicants will include document B). The cited patent document set can then be specified for each applicant's patent document set, and various parameters, such as the number of cited applications, can be calculated for it. A ranking of applicants can be calculated for each parameter. For example, applicants can be extracted by year from the patent document set specified by USPC = 002/456 (body cover, under GUARD OR PROTECTOR, under MISCELLANEOUS, in Apparel), and a document set can be generated for each extracted applicant (for example, given a document such as U.S. Pat. No. 06401262, assignee Benetton Group SpA, US Cl. 2/456; 2/411, the set of documents with assignee Benetton Group SpA and USPC = 002/456). The module can generate a cited document set for each applicant's document set in the same manner and calculate the various parameters for that cited document set.

In view of the above mapping, it is also possible to generate the set of documents that cite each patent document in the specified patent document set, that is, the citing document set. For example, when document B is cited by document C and document c, the mapping relationship shows that documents C and c cite document B. To find such documents, the number of document B is entered in the cited-document-number field of the search engine, and the documents that contain document B in their citation information, such as document C, are returned as the search result. Documents such as C can likewise be obtained by running a query such as a select on the cited-document-number field in the DBMS 201. If the mapping relationships B -> C and B -> c are organized as data, the set of documents citing document B, including document C, can easily be specified.
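A select query of this kind can be sketched with Python's built-in `sqlite3` module standing in for the DBMS 201. The table name `citation`, its columns `citing_no` and `cited_no`, and the function name are hypothetical; the text does not specify a schema.

```python
import sqlite3

# In-memory database with one row per (citing document, cited document) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE citation (citing_no TEXT, cited_no TEXT)")
conn.executemany(
    "INSERT INTO citation VALUES (?, ?)",
    [("B", "A"), ("B", "a"), ("C", "B"), ("c", "B")],
)


def citing_documents(conn, doc_no):
    """Return the numbers of the documents whose citation data contains doc_no."""
    rows = conn.execute(
        "SELECT citing_no FROM citation WHERE cited_no = ? ORDER BY citing_no",
        (doc_no,),
    )
    return [row[0] for row in rows]


print(citing_documents(conn, "B"))  # ['C', 'c']
```

Storing one row per citing and cited pair lets the same table answer both directions: filtering on `cited_no` gives the documents that cite a given document, and filtering on `citing_no` gives the documents it cites.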

Therefore, for a specified patent document set, the statistical preprocessing module for each patent classification code can specify the citing document set, that is, the set of documents that cite the individual patent documents of the specified set (the set containing documents C and c).

When a patent document set (the set containing document B) is specified, the statistical preprocessing module for each patent classification code can query the search engine or the DBMS 201 to collect the numbers of the patent documents that cite each patent document in the set (the numbers of documents C and c). Since each collected document number has corresponding bibliographic data, the citing patent document set, that is, the set of documents citing the patent documents of the specified set, is also specified as of a particular calculation time. The citing patent document set can therefore be analyzed, counted, or used in calculations.

Duplicate counting is preferably allowed here as well. When documents B and b both belong to the specified patent document set and both are cited by document C, the weight of document C should be 2, because document C appears in the search result for document B and again in the search result for document b.

All calculations performed on the cited patent document set with respect to the specified patent document set may be referred to as "cited" calculations. When duplicate counting is allowed, it is possible to calculate a ranking of the document numbers that most often cite the patent documents belonging to the specified patent document set.
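The duplicate counting and ranking described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the `CITED_BY` mapping and the function names are hypothetical, and the document names follow the B/b and C/c example in the text.

```python
from collections import Counter

# Hypothetical citation mapping: cited document -> documents that cite it.
# B and b belong to the specified set; C and c are the citing documents.
CITED_BY = {
    "B": ["C", "c"],
    "b": ["C"],
}

def citing_weights(specified_set, cited_by):
    """Count, with duplicates allowed, how often each citing document
    cites a member of the specified patent document set."""
    weights = Counter()
    for doc in specified_set:
        weights.update(cited_by.get(doc, []))
    return weights

def rank_citing_documents(weights):
    """Rank citing documents by weight (highest first, ties broken by name)."""
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))

weights = citing_weights(["B", "b"], CITED_BY)
ranking = rank_citing_documents(weights)
```

Here document C receives a weight of 2 because it cites both B and b, which matches the weighting rule stated in the text.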

For the cited patent document set (the document set containing document C and document c), the patent classification code-specific statistical preprocessing module may calculate the following on a quantitative basis: the number of cited applications, the cited application growth rate, and the cited application growth acceleration (the number of cited applications can be taken as the number of documents constituting the cited patent document set); the number of cited applicants, the cited applicant growth rate, the average number of cited applicants, and the growth rate of the average number of cited applicants; the number of cited inventors, the cited inventor growth rate, the average number of cited inventors, and the growth rate of the average number of cited inventors; the number of cited application claims, the cited application claim growth rate, the average number of cited application claims, and the growth rate of the average number of cited application claims; and the number of cited patent classification codes, the cited patent classification code growth rate, the average number of cited patent classification codes, and the growth rate of the average number of cited patent classification codes. The counting method for these parameters is the same as that described in the parameter calculation method above. That is, when the corresponding parameters (the number of applications, the number of applicants, the number of inventors, the number of application claims, and the number of patent classification codes, together with their growth rates, averages, and average growth rates) are calculated for the cited patent document set, the resulting values are the cited parameters listed above.

Rankings may similarly be calculated per applicant, inventor, agent, or patent classification code for each parameter based on the cited patent document set. That is, the ranking of the most cited applicants and the ranking of the most cited inventors may be calculated based on the number of applications. When calculating rankings by patent classification code, the ranking of the most cited patent classification codes can be calculated for each level of each patent classification code system, using the lower patent classification code system of the present invention. In this case, each cited patent document in the cited patent document set includes at least one patent classification code, and it is reasonable to regard each upper patent classification code (on the patent classification code system) of an included patent classification code as also related to the citation. It is therefore desirable to reflect this in calculating the ranking of the most cited patent classification codes for each level. When US patent documents in the cited patent document set are targeted, the patent classification code may be both USPC and IPC.
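One way to reflect upper patent classification codes in such a ranking is to credit each code together with all of its upper codes. The sketch below assumes a hypothetical `PARENT` map (child code to parent code) and hypothetical citing documents; it is illustrative only and is not taken from the specification.

```python
from collections import Counter

# Hypothetical parent links in a classification code system (child -> parent).
PARENT = {
    "H01F 1/032": "H01F 1/03",
    "H01F 1/03": "H01F 1/01",
    "H01F 1/01": "H01F 1/00",
    "H01F 1/00": "H01F",
}

def with_ancestors(code, parent):
    """Return the code followed by all of its upper codes."""
    chain = [code]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain

def rolled_up_counts(citing_docs, parent):
    """Credit each code and its upper codes once per citing-document entry."""
    counts = Counter()
    for codes in citing_docs:
        related = set()
        for code in codes:
            related.update(with_ancestors(code, parent))
        counts.update(related)
    return counts

# Classification codes of three hypothetical citing documents.
citing_docs = [["H01F 1/032"], ["H01F 1/03"], ["H01F 1/01", "H01F 1/032"]]
counts = rolled_up_counts(citing_docs, PARENT)
```

A per-level ranking can then be obtained by filtering `counts` to the codes of one level and sorting by count. If duplicate counting is in use, a citing document simply appears in `citing_docs` once per citation.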

The patent classification code-specific statistical preprocessing module may also analyze the specified patent document set itself (the document set containing documents B and b). In this case too, it is preferable to allow duplicate counting and to weight each document by its number of duplicates. For example, when document B is cited five times and document b is cited three times, it is preferable, when calculating parameter values such as the number of citations for the whole specified patent document set, to assign a weight of 5 to document B and a weight of 3 to document b. Reflecting these weights, the module may calculate the ranking of the most cited patent documents within the specified patent document set. Furthermore, for the specified patent document set (the document set containing document B and document b), the module may calculate on a quantitative basis the number of cited applications (when B is cited by C, this can be used in the sense of the number of times B is cited), the cited application growth rate, and the cited application growth acceleration, as well as the count, growth rate, average, and average growth rate of cited applicants, cited inventors, cited application claims, and cited patent classification codes. The counting method for these parameters is the same as that described in the parameter calculation method above. Likewise, the module may calculate rankings by applicant, inventor, or agent for each of the parameters for the specified patent document set. That is, the ranking of the most cited applicants and the ranking of the most cited inventors may be calculated based on the number of applications.

The patent classification code-specific statistical preprocessing module may obtain applicant information for the specified patent document set (the document set containing document B) and generate a patent document set for each applicant (for example, for each of the most frequent applicants; the patent document set of one such applicant may include document B). The cited patent document set may then be specified for each applicant-specific patent document set, and various parameters, such as the number of cited applications, may be calculated for it. For each parameter, a ranking by applicant can then be calculated. For example, applicants can be extracted from the patent document set specified by USPC = 002/456 (body cover, MISCELLANEOUS, GUARD OR PROTECTOR, in the Apparel class), and a document set can be generated for each extracted applicant. (For example, US Registration No. 06401262, assignee Benetton Group SpA, US Cl. 2/456; 2/411, falls under the document set of Benetton Group SpA for USPC = 002/456.) The module may generate the cited document set in the same manner for each applicant-specific document set and calculate the various parameters.

The method by which the patent classification code-specific statistical preprocessing module of the present invention calculates the various parameters is characterized in that it generates calculated values for each determined or specified patent document set. From the standpoint of the module, what matters is that a specific patent document set is input to it; the nature and size of the patent document set are not an issue. To perform citation-related calculations, however, citation information must be included in the specified patent document set. At least one of the parameter values may be calculated. For example, the patent document set subject to calculation by the module may be at least one of: 1) a patent document set specified by a specific patent classification code on a specific patent classification code system in a specific country DB; 2) a patent document set specified by a specific applicant name in a specific country DB; 3) a patent document set specified by a specific inventor name included in the patent documents of a specific applicant (i.e., by applicant name and inventor name) in a specific country DB; 4) a patent document set specified by a specific agent name in a specific country DB; 5) a patent document set specified by a specific patent classification code on a specific patent classification code system and a specific applicant name in a specific country DB; 6) a patent document set specified by a specific patent classification code on a specific patent classification code system, a specific applicant name, and a specific inventor name in a specific country DB; 7) a patent document set specified by a specific applicant name and a specific agent name in a specific country DB; 8) a patent document set of 1) to 6) specified in units of set time periods; and 9) a patent document set of 1) to 8) specified according to a predetermined option included in the patent documents of a specific country, such as whether a request for examination has been made. In addition, the module may calculate at least one of the parameters for a patent document set designated by a user of the module.

Patent Information Processing Basic Module (40)

Next, the patent information processing basic module 40 of the present invention will be described. The patent information processing basic module 40 includes 1) a search engine module, 2) a calculation result table generation module 402 for multidimensional analysis, 3) an analysis module, 4) a monitoring module 403, 5) a patent document set acquisition module 404, 6) a directory generation module 405, 7) a reporting module 406, 8) a simple analysis module 407, and the like.

Heterogeneous Multiple Patent Classification Symbol Relationship Preprocessing Module

The processing of patent classification codes of the same kind has been described above for the homogeneous plural patent classification code relational preprocessing module. Next, the heterogeneous plural patent classification code relational preprocessing module will be described. Its operation is entirely analogous to that of the homogeneous plural patent classification code relational preprocessing module. Almost all patent documents issued in the United States carry both IPC and USPC codes, and patent documents issued in Japan are assigned IPC and FT codes simultaneously. Assume that a patent document is assigned IPC1, ..., IPCn and USPC1, ..., USPCm.

In this case, there are four kinds of relationship combinations between IPC1, ..., IPCn and USPC1, ..., USPCm. First, the pair of IPC1 and USPC1 is a pair of main IPC and main USPC. Second, the pairs of IPC1 with each of USPC2, ..., USPCm are pairs of main IPC and sub USPC. Third, the pairs of each of IPC2, ..., IPCn with USPC1 are pairs of sub IPC and main USPC. Fourth, the pairs of each of IPC2, ..., IPCn with each of USPC2, ..., USPCm are pairs of sub IPC and sub USPC.
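The four kinds of IPC and USPC pairs described above can be enumerated mechanically. The sketch below is a hypothetical illustration that assumes the first code in each list is the main code and the remaining codes are sub codes; the function name and label strings are not from the specification.

```python
from itertools import product

def classify_pairs(ipc_codes, uspc_codes):
    """Pair every IPC with every USPC of one document and label each pair.

    Assumption: index 0 of each list is the main code, the rest are sub codes.
    """
    pairs = []
    for (i, ipc), (j, uspc) in product(enumerate(ipc_codes), enumerate(uspc_codes)):
        ipc_kind = "main" if i == 0 else "sub"
        uspc_kind = "main" if j == 0 else "sub"
        pairs.append((ipc, uspc, f"{ipc_kind} IPC / {uspc_kind} USPC"))
    return pairs

pairs = classify_pairs(["IPC1", "IPC2", "IPC3"], ["USPC1", "USPC2"])
kinds = {kind for _, _, kind in pairs}
```

With n IPC codes and m USPC codes this yields n x m labeled pairs, each of which could then be fed to the same relation-table processing used for homogeneous code pairs.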

For each of these kinds, the heterogeneous plural patent classification code relational preprocessing module may perform, between the heterogeneous patent classification codes, the same processing that the homogeneous plural patent classification code relational preprocessing module performs. Since heterogeneous patent classification codes are targeted, the minimum common patent classification code need not be limited to a preset value. For example, for a given IPC and USPC pair, the upper patent classification codes may be defined by targeting only the subclass for the IPC and only the class for the USPC. Just as the homogeneous module generates the relation table described above for a given patent classification code combination Ai, Bj (where Ai and Bj are patent classification codes of the same system), the heterogeneous module may generate the same kind of relation table for a combination Ai, Bj in which Ai and Bj are patent classification codes of different systems. A weight or contribution weight may be given to the generated relation table. Further, a patent document number can be associated with each heterogeneous combination Ai, Bj, and counting and calculations that reflect the bibliographic matters obtained from the patent document number can obviously be performed. In addition, heterogeneous combinations Ai, Bj can be obtained for a specific document set, and analysis equivalent to that of the homogeneous plural patent classification code relational preprocessing module can be performed on these combinations.

Search engine module

Structure of a search engine module

First, the search engine module will be described. The search engine comprises an indexer 401-3 that processes data so that it becomes searchable, a search index 401-2 that is the result of processing the data with the indexer 401-3, and a searcher 401-1 that runs a query against the index. In general, a search engine in the broad sense includes all three, while in the narrow sense only the searcher 401-1, which actually performs the search, may be referred to as the search engine. To provide a search result from the user's point of view, however, the data to be searched, the indexer 401-3, the search index 401-2, and the searcher 401-1 are all required.

Additional Search Engine Module Components

In addition to the indexer 401-3, the search index 401-2, and the searcher 401-1, the search engine module of the present invention may further include 1) a morphological analysis module and 2) a modified search expression generation module. When the search expression includes a patent classification code, the modified search expression generation module ensures that information on that patent classification code and on all of its lower patent classification codes is returned as the search result. The method of generating the modified search expression is as described above.
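The expansion of a patent classification code into itself plus all of its lower codes can be sketched as below. The `CHILDREN` map, the function names, and the `FIELD="value"` query syntax are assumptions made for illustration; the specification's own method of generating the modified search expression is described elsewhere and may differ.

```python
# Hypothetical child links in a classification code system (parent -> children).
CHILDREN = {
    "H01F 1/03": ["H01F 1/032"],
    "H01F 1/032": ["H01F 1/04"],
    "H01F 1/04": ["H01F 1/047"],
    "H01F 1/047": ["H01F 1/053"],
}

def descendants(code, children):
    """Return the code followed by all of its lower codes, depth-first."""
    result = [code]
    for child in children.get(code, []):
        result.extend(descendants(child, children))
    return result

def modified_expression(field, code, children):
    """Build an OR expression covering the code and every lower code.

    The FIELD="value" form is an assumed, illustrative query syntax.
    """
    terms = [f'{field}="{c}"' for c in descendants(code, children)]
    return "(" + " OR ".join(terms) + ")"

expr = modified_expression("IPC", "H01F 1/04", CHILDREN)
```

In this sketch a query on H01F 1/04 is rewritten to also cover H01F 1/047 and H01F 1/053, so documents classified only under a lower code are still returned.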

Application of Representative Applicant Name DB

When the indexer 401-3 of the present invention indexes patent data, it may index each applicant name under its representative applicant name.

The modified search expression generation module of the search engine module of the present invention may use the representative name applicant DB: it queries the representative name applicant DB with the value entered in the applicant field and may further include the resulting representative name in the search expression.

Patent Classification Symbol Search Module (401)

Next, the patent classification code search module 401 will be described. A patent classification code search outputs, when a technology keyword is entered as a search term, the patent classification codes whose title information includes that keyword. Using patent classification codes helps improve search accuracy and structurally removes noise from search results. However, a patent classification code is a symbol or number whose meaning is not intuitive, and the codes number in the tens of thousands to hundreds of thousands, so the title information describing them is also enormous and it is not easy to find the desired code. Alternatively, the user can look for the desired technology keyword by browsing the multi-level patent classification system, but this is quite difficult because of 1) the complexity of the classification system and 2) the likelihood of not reaching the intended technology keyword when descending from the upper patent classification codes.

Therefore, when a technology keyword is input, it must be possible to search for the desired patent classification code. An index is generated over the patent classification codes and their corresponding title information, the search term is queried against the title information through the searcher, and the title information containing the search term is provided together with the corresponding patent classification code. This is conventional technology, and anyone skilled in the art can easily implement such a patent classification code search.

Next, a characteristic patent classification code search of the present invention will be described by the following example.

Section: H Electric

Class: H01 Basic electric element

Subclass: H01F Magnet

Main group: H01F 1/00 Magnet or magnetic material characterized by magnetic material

1-dot subgroup: H01F 1/01 * Inorganic materials

2-dot subgroup: H01F 1/03 ** Characterized by coercivity

3-dot subgroup: H01F 1/032 *** Of hard magnetic material

4-dot subgroup: H01F 1/04 **** metal or alloy

5-dot subgroup: H01F 1/047 ***** Alloy characterized by composition

6-dot subgroup: H01F 1/053 ****** containing rare earth metals

The first feature of the patent classification code search of the present invention is that, when a lower patent classification code is found, the upper patent classification codes of that code are provided together in the result. For example, when "hard magnetic material" is entered as a search term, "H01F 1/032 *** Of hard magnetic material" is found, and by querying H01F 1/032 against the patent classification code mast DB 203, the upper patent classification codes of H01F 1/032 up to a predetermined level on the patent classification code system tree may be provided as part of the search result. That is, when H01F 1/032 is found, the following is a preferable search result. The preset level is preferably the subclass in the case of IPC, the class in the case of USPC, the theme in the case of FT, and the subclass in the case of ECLA or FI, but a higher or lower level may also be presented in the search result.

Subclass: H01F Magnet

1-dot subgroup: H01F 1/01 * Inorganic materials

2-dot subgroup: H01F 1/03 ** Characterized by coercivity

3-dot subgroup: H01F 1/032 *** Of hard magnetic material

To output the above result, the following steps are performed. First, the keyword entered as a search term is queried against a patent classification code index in which patent classification codes are indexed together with their title information, and one or more patent classification codes whose title information contains the search term are obtained as search results. (When "hard magnetic material" is input, H01F 1/032 is found.) Second, the found patent classification code is queried against the patent classification code mast DB 203 to obtain its upper patent classification codes up to the predetermined level. (H01F 1/03, H01F 1/01, H01F 1/00, and H01F are found.) Third, the found upper patent classification codes are queried against the patent classification code mast DB 203 to obtain their title information. Fourth, the patent classification codes and title information obtained in the first to third steps are combined and displayed together with information indicating the hierarchy, such as the dot structure.
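The four steps above can be sketched in a few lines. In this illustration the `TITLES` and `PARENT` dictionaries are hypothetical stand-ins for the patent classification code index and the mast DB 203, a plain substring match stands in for the searcher, and indentation stands in for the dot structure.

```python
# Hypothetical stand-ins for the title index and the classification master data.
TITLES = {
    "H01F": "Magnet",
    "H01F 1/00": "Magnet or magnetic material characterized by magnetic material",
    "H01F 1/01": "Inorganic materials",
    "H01F 1/03": "Characterized by coercivity",
    "H01F 1/032": "Of hard magnetic material",
}
PARENT = {
    "H01F 1/032": "H01F 1/03",
    "H01F 1/03": "H01F 1/01",
    "H01F 1/01": "H01F 1/00",
    "H01F 1/00": "H01F",
}

def search_with_upper_codes(keyword, titles, parent, top_level="H01F"):
    """Find codes whose title contains the keyword and attach their upper codes."""
    results = []
    for code, title in titles.items():
        # Step 1: match the keyword against the title information.
        if keyword.lower() not in title.lower():
            continue
        # Step 2: walk up to the predetermined level (here the subclass).
        chain = [code]
        while chain[-1] != top_level and chain[-1] in parent:
            chain.append(parent[chain[-1]])
        chain.reverse()
        # Steps 3 and 4: look up each title and render the hierarchy by indentation.
        results.append([f"{'  ' * depth}{c} {titles[c]}" for depth, c in enumerate(chain)])
    return results

results = search_with_upper_codes("hard magnetic material", TITLES, PARENT)
```

Each entry of `results` is a list of display lines running from the subclass down to the matched code.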

On the other hand, when the query "hard magnetic material" AND "rare earth" is made in the patent classification code search, no search result can be provided, because none of the above rows contains both "hard magnetic material" and "rare earth". Considering the patent classification code system, however, "H01F 1/053 ****** containing rare earth metals" should be output as a search result. Simply indexing on a row basis thus cannot provide search results that reflect the hierarchical structure of the patent classification codes. To solve this problem, the patent classification codes and their title information are modified as follows. The key to the modification is to include in each code's title information all the title information of its upper patent classification codes. Table 29 below shows one embodiment.

TABLE 29

IPC sign | Merged title information
H | Electricity
H01 | Electricity; Basic electrical components
H01F | Electricity; Basic electrical components; magnet
H01F 1/00 | Electricity; Basic electrical components; magnet; Magnets or magnetic bodies characterized by magnetic materials
H01F 1/01 | Electricity; Basic electrical components; magnet; Magnets or magnetic bodies characterized by magnetic materials; Made of inorganic materials
H01F 1/03 | Electricity; Basic electrical components; magnet; Magnets or magnetic bodies characterized by magnetic materials; Made of inorganic materials; Characterized by coercive force
H01F 1/032 | Electricity; Basic electrical components; magnet; Magnets or magnetic bodies characterized by magnetic materials; Made of inorganic materials; Characterized by coercive force; Of hard magnetic materials
H01F 1/04 | Electricity; Basic electrical components; magnet; Magnets or magnetic bodies characterized by magnetic materials; Made of inorganic materials; Characterized by coercive force; Of hard magnetic materials; Metal or alloy
H01F 1/047 | Electricity; Basic electrical components; magnet; Magnets or magnetic bodies characterized by magnetic materials; Made of inorganic materials; Characterized by coercive force; Of hard magnetic materials; Metal or alloy; Alloys characterized by the composition
H01F 1/053 | Electricity; Basic electrical components; magnet; Magnets or magnetic bodies characterized by magnetic materials; Made of inorganic materials; Characterized by coercive force; Of hard magnetic materials; Metal or alloy; Alloys characterized by the composition; Containing rare earth metals

Meanwhile, the title information may be merged up to the highest patent classification code, or only up to a predetermined level. Preferably, the predetermined level is the subclass for IPC, the class for USPC, the theme for FT, and the subclass for ECLA or FI. Table 30 below shows one example.

TABLE 30

IPC sign | Merged title information
H | Electricity
H01 | Electricity; Basic electrical components
H01F | magnet
H01F 1/00 | magnet; Magnets or magnetic bodies characterized by magnetic materials
H01F 1/01 | magnet; Magnets or magnetic bodies characterized by magnetic materials; Made of inorganic materials
H01F 1/03 | magnet; Magnets or magnetic bodies characterized by magnetic materials; Made of inorganic materials; Characterized by coercive force
H01F 1/032 | magnet; Magnets or magnetic bodies characterized by magnetic materials; Made of inorganic materials; Characterized by coercive force; Of hard magnetic materials
H01F 1/04 | magnet; Magnets or magnetic bodies characterized by magnetic materials; Made of inorganic materials; Characterized by coercive force; Of hard magnetic materials; Metal or alloy
H01F 1/047 | magnet; Magnets or magnetic bodies characterized by magnetic materials; Made of inorganic materials; Characterized by coercive force; Of hard magnetic materials; Metal or alloy; Alloys characterized by the composition
H01F 1/053 | magnet; Magnets or magnetic bodies characterized by magnetic materials; Made of inorganic materials; Characterized by coercive force; Of hard magnetic materials; Metal or alloy; Alloys characterized by the composition; Containing rare earth metals
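The title merging shown in the tables above, and its effect on an AND query, can be sketched as follows. This is an illustrative assumption-laden example: `TITLES` and `PARENT` are hypothetical in-memory stand-ins for the classification data, merging stops at the subclass level, and a case-insensitive substring match stands in for the real index and searcher.

```python
# Hypothetical per-code titles (unmerged) and parent links up to the subclass.
TITLES = {
    "H01F": "magnet",
    "H01F 1/00": "Magnets or magnetic bodies characterized by magnetic materials",
    "H01F 1/01": "Made of inorganic materials",
    "H01F 1/03": "Characterized by coercive force",
    "H01F 1/032": "Of hard magnetic materials",
    "H01F 1/04": "Metal or alloy",
    "H01F 1/047": "Alloys characterized by the composition",
    "H01F 1/053": "Containing rare earth metals",
}
PARENT = {
    "H01F 1/053": "H01F 1/047",
    "H01F 1/047": "H01F 1/04",
    "H01F 1/04": "H01F 1/032",
    "H01F 1/032": "H01F 1/03",
    "H01F 1/03": "H01F 1/01",
    "H01F 1/01": "H01F 1/00",
    "H01F 1/00": "H01F",
}

def merged_title(code, titles, parent):
    """Join the titles of all upper codes and the code itself, top-down."""
    parts = []
    current = code
    while current is not None:
        parts.append(titles[current])
        current = parent.get(current)
    return "; ".join(reversed(parts))

def search_all_terms(terms, titles, parent):
    """Return codes whose merged title contains every search term (AND)."""
    index = {code: merged_title(code, titles, parent).lower() for code in titles}
    return [code for code, text in index.items()
            if all(term.lower() in text for term in terms)]

terms = ["hard magnetic material", "rare earth"]
hits = search_all_terms(terms, TITLES, PARENT)
# Row-based matching on unmerged titles finds nothing for the same query.
plain_hits = [c for c, t in TITLES.items() if all(term in t.lower() for term in terms)]
```

The contrast between `hits` and `plain_hits` mirrors the motivation given in the text: only the merged titles let a query that spans several hierarchy levels reach the lowest matching code.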

When the patent classification codes and the merged title information are indexed on a row basis, H01F 1/053 can be found as a search result when "hard magnetic material" and "rare earth" are given as search terms. When a patent classification code is found as a search result (for example, H01F 1/053), all or a predetermined range of its upper patent classification codes and their title information can be provided together, as described above. That is, when the search expression "hard magnetic material" AND "rare earth" is input, the following data should be included in the search results.

Subclass: H01F Magnet

1-dot subgroup: H01F 1/01 * Inorganic materials

2-dot subgroup: H01F 1/03 ** Characterized by coercivity

3-dot subgroup: H01F 1/032 *** Of hard magnetic material

4-dot subgroup: H01F 1/04 **** metal or alloy

5-dot subgroup: H01F 1/047 ***** Alloy characterized by composition

6-dot subgroup: H01F 1/053 ****** containing rare earth metals

If "metal AND coercive force" is entered as a search expression, the final search result preferably appears as follows.

Subclass: H01F Magnet

1-dot subgroup: H01F 1/01 * Inorganic materials

2-dot subgroup: H01F 1/03 ** Characterized by coercivity

3-dot subgroup: H01F 1/032 *** Of hard magnetic material

4-dot subgroup: H01F 1/04 **** metal or alloy

FIG. 16 is a diagram illustrating one embodiment of the operation of the patent classification code search module 401 of the present invention. The search may include selecting a patent classification code, receiving a selection of a search language for searching for a patent classification code, receiving at least one search term, and generating a search result by performing a search using the received search expression.

Patent Document Set Acquisition Module (404)

Next, the patent document set acquisition module 404 of the present invention will be described. The patent document set acquisition module 404 includes an automatically selected document set acquisition module 404-1 and a user-generated document set acquisition module 404-2. The user-generated document set acquisition module 404-2 includes a document acquisition module 404-2-1 that obtains a document set through a search expression, and a document acquisition module 404-2-2 that obtains a document set through a selection on a directory in which a document set can be specified, such as an IPC directory. Depending on the query target of the search expression, the document acquisition module 404-2-1 through the search expression includes a document set acquisition module 404-2-1-1 that queries the search engine and a document set acquisition module 404-2-1-2 that queries the DBMS 201. The automatically selected document set acquisition module 404-1 may automatically obtain any of the determined patent document sets introduced in this specification, other than those generated by the user.

Operational Results Table Generation Module (402) for Multidimensional Analysis

Introduction background

Next, the calculation result table generation module 402 for multidimensional analysis of the present invention will be described. When patent information data is built into a DB, the desired result is usually output using SQL query statements. Even if the DB structure is well designed, multiple tables must be joined to derive a single result, and when each table is large, processing a select statement takes considerable time.

For example, suppose the yearly share of Samsung Electronics Co., Ltd. is to be obtained for each of its most-filed IPCs at the 1-dot subgroup level using plain select statements. Although the details depend on the design of the DB schema, usually several tables (e.g., a country applicant table, an IPC table, a document table) must first be joined, and Samsung Electronics' most-filed IPCs at the 1-dot subgroup level must be extracted; if lower IPCs are to be included automatically, the lower IPCs of each extracted IPC must also be extracted. Then, for the extracted 1-dot subgroup IPCs, the number of applications by year is obtained from all of Samsung Electronics' patent application data in Korea. Next, for each of these most-filed 1-dot subgroup IPCs, the ratio of the number of Samsung Electronics applications to the total number of applications is computed, which gives the share of Samsung Electronics for each 1-dot subgroup IPC. Written as a single SQL query statement, this is not only long but also slow to process. Automatically including the documents that correspond to lower patent classification codes requires a great deal of computation, which can seriously impair response speed. Moreover, many variations are likely to recur, such as processing the same content for LG Electronics or computing the share of LG Electronics at the IPC subgroup level.

In such cases, if the patent information is processed in advance to anticipate these varied requests, the operations for multidimensional analysis are performed, and the results are stored as table data, then querying that table dramatically improves the response speed. Multidimensional operations include rollup operations and cube operations, and the results of performing these operations on patent data are stored in a table in a DB. Such a table may be called by various names, such as a cube, a materialized view (physicalized view), a result table for multidimensional analysis, or a view; regardless of the name, they refer to the same thing. The desired data is extracted by running an SQL query against this table. A result table produced by a rollup operation is generally accessed with an SQL query, whereas a result table produced by a cube operation is generally accessed with a multidimensional expressions (MDX) query. In this specification both cases access a result table for multidimensional analysis, and since SQL and MDX queries are similar in form, both are collectively referred to as SQL for convenience of description. In other words, querying the result table for multidimensional analysis with SQL should be understood to mean 1) querying with an SQL query if the table was produced by a rollup operation, and 2) querying with an MDX query if the table was produced by a cube operation. Meanwhile, "multidimensional" here means that the analysis is performed over one or more dimensions. (Analysis over a single dimension is of course included.)
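The idea of computing a result table once and then answering requests from it can be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 module; the table names `filing` and `share_by_ipc_year`, their columns, and the sample rows are assumptions made for the example and are not taken from the specification.

```python
import sqlite3

# Hypothetical, simplified source data: one row per (document, applicant, IPC, year).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE filing (doc_id INTEGER, applicant TEXT, ipc TEXT, year INTEGER);
INSERT INTO filing VALUES
  (1, 'A', 'H04B 7/02', 2004), (2, 'A', 'H04B 7/02', 2004),
  (3, 'B', 'H04B 7/02', 2004), (4, 'B', 'H04B 7/02', 2005),
  (5, 'A', 'H04B 7/02', 2005), (6, 'B', 'G06F 17/30', 2005);

-- Result table computed once, ahead of time (plays the role of a materialized view).
CREATE TABLE share_by_ipc_year AS
SELECT f.applicant, f.ipc, f.year,
       COUNT(*) AS applicant_count,
       MAX(t.total_count) AS total_count,
       1.0 * COUNT(*) / MAX(t.total_count) AS share
FROM filing AS f
JOIN (SELECT ipc, year, COUNT(*) AS total_count
      FROM filing GROUP BY ipc, year) AS t
  ON t.ipc = f.ipc AND t.year = f.year
GROUP BY f.applicant, f.ipc, f.year;
""")


def share(applicant, ipc, year):
    """Look up a precomputed share with a single-table query (no joins)."""
    row = conn.execute(
        "SELECT share FROM share_by_ipc_year "
        "WHERE applicant = ? AND ipc = ? AND year = ?",
        (applicant, ipc, year),
    ).fetchone()
    return row[0] if row else None
```

The same lookup then serves a request for applicant 'B' or for another year without repeating the join and aggregation.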

Conventional Data Warehouse (DW) modeling differs from typical Online Transaction Processing (OLTP) database modeling in its construction methodology, notably in its use of denormalization and the star schema. Denormalization is the opposite of normalization: it deliberately relaxes a normalized database schema (the patent document mast DB 202, the patent classification code mast DB 203, the subject mast DB 204, etc. of the present invention preferably use normalized schemas) in order to reduce DB joins. The rollup and cube operations described below are examples in which denormalization works well. Denormalization consolidates columns of a frequently referenced table by declaring them redundantly in the table that refers to it. With this design, the redundantly declared columns can be read directly, which greatly reduces joins.

The multi-dimensional analysis operation result table generation module 402 of the present invention is an engine that performs rollup or cube operations on patent information and stores the results as tables so that the desired information can be output quickly. In particular, when a roll up, drill down, or drill through occurs, the analysis module of the present invention can quickly produce the desired data by running a simple SQL query against the tables created by the multi-dimensional analysis operation result table generation module 402.

The multi-dimensional analysis operation result table generation module 402 performs rollup operations and/or cube operations on the patent document mast DB 202, the patent classification code mast DB 203, and the subject mast DB 204 for each of one or more analysis subjects, and generates the results as tables. As described above, the patent document mast DB 202 includes a bibliographic item mast DB, the patent classification code mast DB 203 includes a patent classification code DB for each type of patent classification code system, and the subject mast DB 204 includes an applicant name DB, an inventor name DB, and the like; the applicant name DB is preferably organized by country.

Build Analysis DW

Composition of the analysis DW

The multi-dimensional analysis operation result table generation module 402 performs multidimensional analysis operations on the data contained in the patent document mast DB 202, the patent classification code mast DB 203, the subject mast DB 204, and the like, for at least one analysis subject, in order to produce analysis results suited to that subject, and generates the results as tables. There may be a plurality of such tables, and together they constitute an analysis data warehouse (DW). Preferably, the DB schema of the patent information tables that the multi-dimensional analysis operation result table generation module 402 uses as source material is constructed as a star schema.

The star schema is preferably reconstructed from the patent document mast DB 202, the patent classification code mast DB 203, the subject mast DB 204, and the like. The mast DBs are normalized and are often not optimized for rollup or cube operations. Therefore, the multi-dimensional analysis operation result table generation module 402 preferably uses DBs or tables reconstructed from the mast DBs in a star schema structure.

The E-R diagram of the tables reconstructed with the star schema has a FACT table in the center, surrounded by dimension tables, which are reference information tables linked to the FACT table. The FACT table contains the bibliographic data of each patent document, excluding the data held in the dimension tables; in place of that data it preferably holds, field by field, an ID for each dimension, for example an IPC_ID pointing to the specific IPC(s) contained in the patent document, an applicant ID indicating the specific applicant(s), and at least one date ID corresponding to a date. The dimension tables include a patent classification code table (IPC is mandatory; a country-specific FACT table also has the patent classification code of that country), a date table (year, etc.), a subject table (applicant / agent / inventor, etc.), a location table (country / region / property, etc.), and tables of other objects that can form a dimension. Naturally, each dimension table includes the IDs that correspond to the IDs held in the FACT table. For example, the IPC dimension table includes IPC_ID, and each IPC_ID corresponds to one specific IPC. The tables reconstructed with the star schema may be built per country or built with the countries integrated. When the countries are integrated, and in other cases as well, data that spans multiple countries, such as family information data or INPADOC data, is preferably managed in a separate table.
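A star schema of this kind might be laid out as in the following sketch. Only IPC_ID is named in the text; every other table name, column name, and sample row here is hypothetical, and the sketch assumes one FACT row per combination of document, IPC, and applicant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: each carries the ID that the FACT table points to.
CREATE TABLE dim_ipc (ipc_id INTEGER PRIMARY KEY, ipc TEXT);
CREATE TABLE dim_applicant (applicant_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER, month INTEGER);

-- FACT table in the center: bibliographic facts plus one ID per dimension.
CREATE TABLE fact_document (
    doc_id INTEGER,
    ipc_id INTEGER REFERENCES dim_ipc(ipc_id),
    applicant_id INTEGER REFERENCES dim_applicant(applicant_id),
    filing_date_id INTEGER REFERENCES dim_date(date_id)
);

INSERT INTO dim_ipc VALUES (1, 'H04B 7/02'), (2, 'H04B 7/04');
INSERT INTO dim_applicant VALUES (1, 'Company A', 'KR');
INSERT INTO dim_date VALUES (1, 2005, 1, 2), (2, 2005, 3, 8);
INSERT INTO fact_document VALUES (100, 1, 1, 1), (101, 2, 1, 2), (102, 2, 1, 2);
""")


def count_by_ipc_and_year(connection):
    """Count documents per IPC and filing year by joining FACT to two dimensions."""
    return connection.execute("""
        SELECT i.ipc, d.year, COUNT(DISTINCT f.doc_id)
        FROM fact_document AS f
        JOIN dim_ipc AS i ON i.ipc_id = f.ipc_id
        JOIN dim_date AS d ON d.date_id = f.filing_date_id
        GROUP BY i.ipc, d.year
        ORDER BY i.ipc, d.year
    """).fetchall()
```

Every join goes from the FACT table to exactly one dimension table, which is what keeps rollup queries over such a layout short.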

The multi-dimensional analysis operation result table generation module 402 of the present invention generates multi-dimensional analysis operation result tables for the various analysis subjects of the present invention from the data of the patent information tables reconstructed with the star schema. The module 402 may also use an existing multi-dimensional analysis operation result table as the source material for another result table, for the analysis of other or composite analysis subjects. That is, the source material of the multi-dimensional analysis operation result table generation module 402 is at least one of 1) the patent information tables reconstructed with the star schema, 2) the patent document mast DB 202, the patent classification code mast DB 203, and the subject mast DB 204, and 3) a multi-dimensional analysis operation result table. Hereinafter, for convenience of description, 1) to 3) are collectively referred to as the reconstructed patent information table, but it will be obvious that 2) and 3) are not excluded.

Relationship between the multi-dimensional analysis operation result table generation module 402 and the analysis module

The multi-dimensional analysis operation result table generation module 402 of the present invention comprises an analysis DW generation module 402-1 that generates the analysis DW, modules that generate the multi-dimensional analysis operation results for each analysis subject, and modules that perform the other functions described in this specification that are needed to generate those results. The modules for each analysis subject include the multi-dimensional analysis operation result table generation module 402-2 for total amount analysis, the module 402-3 for citation analysis, the module 402-4 for competitive analysis, the module 402-5 for inventor analysis, the module 402-6 for analysis by patent technology classification, the module 402-7 for fusion analysis, the module 402-8 for representative phrase analysis, and the like. These modules respectively generate the analysis DW 205-1, the multi-dimensional analysis operation result table 205-2 for total amount analysis, the result table 205-3 for citation analysis, the result table 205-4 for competitive analysis, the result table 205-5 for inventor analysis, the result table 205-6 for analysis by patent technology classification, the result table 205-7 for fusion analysis, and the result table 205-8 for representative phrase analysis.

Next, the relationship between the multi-dimensional analysis operation result table generation module 402 and the analysis module of the present invention will be described. The analysis module holds an analysis expression (SQL query) corresponding to at least one analysis subject and obtains the desired analysis result for each subject by querying the multi-dimensional analysis operation result table with that query. The result is processed according to the interface provided by the system 1 of the present invention and provided to the user of the system.

Meanwhile, in consideration of the performance of the analysis module (reducing the time required for data extraction / calculation / acquisition, saving computational resources, etc.), the multi-dimensional analysis operation result table generation module 402 may create, for each analysis subject, at least one table of a predetermined size / stage or more. That is, the module 402 could generate the final analysis result screen data provided on the user's screen, but it is more preferable for it to generate data only up to a predetermined level (data up to an intermediate stage) for each analysis subject and to produce the final screen data using the various commands that SQL itself provides. The latter is preferable because pre-generating final results is enormously inefficient (a waste of computational resources) when there are many kinds of screens; the former may nevertheless be acceptable if optimization of computational resources is given up in favor of maximum response speed.

Roll up and drill down

Before describing how patent information data is processed into materialized view data, the concepts of roll up and drill down are explained. Rollup and drilldown are basic data operations and are described with examples for ease of understanding. Assume that the application data for 2006 are as shown in Table 31 below.

Table 31

2006                 | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec
Monthly applications | 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   | 10  | 1   | 2

Such data can be combined by quarter and by year as follows. Table 32 below is for conceptual description only; the data structure actually used for the rollup may differ in an implementation.

Table 32

2006                           | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec
Monthly applications           | 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   | 10  | 1   | 2
Quarter                        | Q1  |     |     | Q2  |     |     | Q3  |     |     | Q4  |     |
Quarterly applications         | 6   |     |     | 15  |     |     | 24  |     |     | 13  |     |
Year                           | 2006
Number of applications by year | 58

An operation that merges values step by step from smaller units into larger units along one direction is called a rollup operation. In other words, combining monthly values into quarterly values and quarterly values into yearly values is an example of a rollup operation, and accessing / obtaining / extracting the result merged into higher units along one dimension is called rollup.

Conversely, the 58 applications of 2006 are the sum of 6 in the first quarter, 15 in the second quarter, 24 in the third quarter, and 13 in the fourth quarter, so the yearly value can be divided back into 6, 15, 24, and 13 (because it was originally composed of them), and each quarter can in turn be divided by month. Accessing / obtaining / extracting / dividing from larger units to smaller units along one dimension is called drilldown. Rollup and drilldown are therefore two sides of the same coin: once multidimensional data has been produced by rolling up from the smallest unit to the largest unit, the data can also be viewed by drilling down from the largest unit to the lower units.
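The arithmetic of Tables 31 and 32 can be reproduced with a short sketch. The function names `roll_up` and `drill_down` are illustrative only; the monthly values are those of Table 31.

```python
# Monthly application counts for 2006, January to December (Table 31).
MONTHLY_2006 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2]


def roll_up(values, group_size):
    """Merge consecutive groups of `group_size` smaller units into one larger unit."""
    return [sum(values[i:i + group_size]) for i in range(0, len(values), group_size)]


def drill_down(values, group_size, index):
    """Return the smaller units that make up the larger unit at position `index`."""
    start = index * group_size
    return values[start:start + group_size]


quarterly = roll_up(MONTHLY_2006, 3)  # months -> quarters: [6, 15, 24, 13]
yearly = roll_up(quarterly, 4)        # quarters -> year: [58]
```

Drilling down on the fourth quarter, `drill_down(MONTHLY_2006, 3, 3)`, returns the October to December values 10, 1, and 2 that were merged into 13.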

In general, accessing a result table produced by a cube operation for multidimensional analysis and obtaining the desired data is called online analytical processing (OLAP). A typical OLAP engine supports functions such as drill down and drill through.

Roll up and drill down

The data in the following table are used to describe roll up and drill down along the patent classification code dimension in more detail. The table shows why, on a patent classification code system such as IPC, it is important to secure information about the patent classification codes immediately below a given code. This is especially necessary for patent classification codes whose upper / lower relationship cannot be determined from the notation of the code alone (for example, in IPC, codes at the 1-dot subgroup level and below). The example thus makes clear the importance of the information processing method that automatically includes lower patent classification codes.

Rollup and drilldown are described by way of example using Table 33 below, which shows the number of patent applications of a specific company A around H04B 7/02. The number in parentheses ( ) is the number of Company A's patent documents to which that specific IPC is assigned. The number in braces { } is the multi-dimensionally calculated number. Title information is added for reference and is unrelated to rollup, but providing it is reasonable because users who drill down can hardly know what a classification code means without it (it also helps to show why lower patent classification codes should be included). No 4-dot subgroup exists under H04B 7/02; the column is included only to show the multi-level nature of the patent classification code. In the tables after Table 33, the 4-dot subgroup and the title information are generally omitted.

Table 33

Main group | 1-dot subgroup | 2-dot subgroup | 3-dot subgroup | 4-dot subgroup | Title (IPC description / patent classification code description)
H04B 7/00 (2000) {2605} | | | | | Wireless transmission system
 | H04B 7/005 (64) {64} | | | | Control of transmission; Equalization
 | H04B 7/01 (4) {4} | | | | Reduction of phase shift
 | H04B 7/015 (3) {3} | | | | .Reduction of echo effects
 | H04B 7/02 (88) {207} | | | | Diversity system
 | | H04B 7/04 (57) {114} | | | .. using multiple independent aerials
 | | | H04B 7/06 (36) {36} | | ... at the transmitting station
 | | | H04B 7/08 (21) {21} | | ... at the receiving station
 | | H04B 7/10 (3) {3} | | | Using a single aerial system characterized by polarization or directional characteristics
 | | H04B 7/12 (2) {2} | | | Frequency diversity system

Consider H04B 7/02 to understand the rollup. The multi-dimensionally calculated number for H04B 7/02 is 207 (= 88 + 114 + 3 + 2): 88 is the number of patent documents marked H04B 7/02 itself, 114 comes from H04B 7/04, and 3 and 2 come from H04B 7/10 and H04B 7/12, respectively. Likewise, the multi-dimensionally calculated number for H04B 7/04 is 114 (= 57 + 36 + 21): 57 is the number of patent documents marked H04B 7/04 itself, and 36 and 21 come from H04B 7/06 and H04B 7/08, respectively. The multi-dimensionally calculated number for H04B 7/00 is 2605, which is calculated from the number of patent documents marked H04B 7/00 and those marked with the other patent classification codes below it; the "..." in Table 33 means that H04B 7/00 has more lower patent classification codes than those shown in the table.

When retrieving patent document information for H04B 7/02 (for searching, counting, or other statistics, analysis, calculation, etc.), note that it is not enough to retrieve only the documents marked H04B 7/02. It is more reasonable for the patent information for H04B 7/02 to include the patent information for all patent classification codes below H04B 7/02 in the hierarchical structure of the patent classification code system. This is because a request concerning H04B 7/02 is a request for all documents belonging to that technical field, so it should cover not only documents marked H04B 7/02 but also documents marked with any patent classification code below H04B 7/02. This is why the information processing (searching, counting, or other statistics, analysis, calculation, etc.) of the present invention that includes lower patent classification codes is necessary. As Table 33 shows, if H04B 7/08 appears in the IPC information of a patent document, the document also belongs to H04B 7/04, and to H04B 7/02 as well. Therefore, when counting applications or registrations, the count for H04B 7/04 is the sum of the counts for H04B 7/04 itself, H04B 7/06, and H04B 7/08.
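The inclusion of lower patent classification codes can be sketched as a recursive sum over an explicit parent map. The map is needed because, as noted above, the hierarchy cannot be read off the code notation. The per-code counts are those shown in parentheses in Table 33; the names `PARENT`, `OWN_COUNT`, and `rolled_up` are illustrative, and the sketch simply adds per-code counts, as the arithmetic in the text does.

```python
# Parent of each code in the H04B 7/02 subtree (taken from the layout of Table 33).
PARENT = {
    "H04B 7/04": "H04B 7/02",
    "H04B 7/06": "H04B 7/04",
    "H04B 7/08": "H04B 7/04",
    "H04B 7/10": "H04B 7/02",
    "H04B 7/12": "H04B 7/02",
}

# Number of Company A documents marked with each code itself: the ( ) values.
OWN_COUNT = {
    "H04B 7/02": 88,
    "H04B 7/04": 57,
    "H04B 7/06": 36,
    "H04B 7/08": 21,
    "H04B 7/10": 3,
    "H04B 7/12": 2,
}


def rolled_up(code):
    """Own count of `code` plus the rolled-up counts of all codes below it."""
    children = [child for child, parent in PARENT.items() if parent == code]
    return OWN_COUNT.get(code, 0) + sum(rolled_up(child) for child in children)
```

With these inputs, `rolled_up("H04B 7/04")` gives 114 and `rolled_up("H04B 7/02")` gives 207, the { } values of Table 33.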

It will be apparent that such rollup and drilldown may be applied to every level of the IPC and of all other patent classification code systems. An example of drilling down by IPC is as follows.

Next, the following table shows an embodiment of drilling down by year. Drilling down from the information in Table 33 above with respect to the year gives the data in Table 34 below. For simplicity of notation, only the multi-dimensionally calculated numbers are kept, and the 4-dot subgroup, which has no data, is not displayed.

Table 34

Main group | 1-dot subgroup | 2-dot subgroup | 3-dot subgroup
H04B 7/00 (200) {2605} | | |

If H04B 7/00 is drilled down to its lower patent classification codes, the data shown in Table 35 below are displayed.

Table 35

Main group | 1-dot subgroup | 2-dot subgroup | 3-dot subgroup
H04B 7/00 (200) {2605} | | |
 | H04B 7/005 (64) {64} | |
 | H04B 7/01 (4) {4} | |
 | H04B 7/015 (3) {3} | |
 | H04B 7/02 (88) {207} | |

Drilling down on H04B 7/015 changes nothing further, since H04B 7/015 has no lower patent classification codes. If H04B 7/02 is then drilled down, the data shown in Table 36 below are displayed.

Table 36

Main group | 1-dot subgroup | 2-dot subgroup | 3-dot subgroup | 4-dot subgroup | Title
H04B 7/00 (7) {37} | | | | | Wireless transmission system
 | H04B 7/005 (64) {64} | | | | Control of transmission; Equalization
 | H04B 7/01 (4) {4} | | | | Reduction of phase shift
 | H04B 7/015 (3) {3} | | | | .Reduction of echo effects
 | H04B 7/02 (88) {207} | | | | Diversity system
 | | H04B 7/04 (2) {9} | | | .. using multiple independent aerials
 | | H04B 7/10 (2) {2} | | | Using a single aerial system characterized by polarization or directional characteristics
 | | H04B 7/12 (3) {3} | | | Frequency diversity system

Then, if you drill down on H04B 7/04, you will see the data shown in Table 37 below.

Table 37

Main group | 1-dot subgroup | 2-dot subgroup | 3-dot subgroup
H04B 7/00 (200) {2605} | | |
 | H04B 7/005 (64) {64} | |
 | H04B 7/01 (4) {4} | |
 | H04B 7/015 (3) {3} | |
 | H04B 7/02 (88) {207} | |
 | | H04B 7/04 (57) {114} |
 | | | H04B 7/06 (36) {36}
 | | | H04B 7/08 (21) {21}
 | | H04B 7/10 (3) {3} |
 | | H04B 7/12 (2) {2} |

Tables 34 to 37 show the result of rolling up over the application date / application year for specific patent classification codes of Company A. That is, each multi-dimensionally calculated number aggregates all periods: the period up to 2000, each of the years 2001, 2002, 2003, 2004, and 2005, and the most recent one year and six months. To make this possible, the number of documents corresponding to each patent classification code must be rolled up in advance for each year over Company A's patent document set. Table 38 shows an example.

For ease of explanation, rollup and drilldown by year will be described, focusing on H04B 7/02.

Table 38

Main group | 1-dot subgroup | 2-dot subgroup | 3-dot subgroup | Before 2000 | 2001 (+) | 2002 (+) | 2003 (+) | 2004 (+) | 2005 (+) | Recent (2006-present) (+) | Sum (+)
 | H04B 7/02 (88) {207} | | | 25 | 15 | 7 | 30 | 60 | 64 | 6 | 207
 | | H04B 7/04 (57) {114} | | 10 | 6 | 3 | 10 | 42 | 39 | 4 | 114
 | | | H04B 7/06 (36) {36} | 3 | 0 | 0 | 3 | 17 | 12 | 1 | 36
 | | | H04B 7/08 (21) {21} | 5 | 3 | 1 | 5 | 4 | 3 | 0 | 21
 | | H04B 7/10 (3) {3} | | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 3
 | | H04B 7/12 (2) {2} | | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 2

For example, when a user clicks the (+) sign next to 2001 (a symbol used here, for convenience of explanation, to indicate that drill down is possible), the system 1 accepts the action, extracts the data of the lower unit (for example, quarters) for 2001, and provides it to the user, as shown in Table 39 below.

Table 39

Main group | 1-dot subgroup | 2-dot subgroup | 3-dot subgroup | Before 2000 | 2001 (+) | 01 / 1Q | 01 / 2Q | 01 / 3Q | 01 / 4Q | 2002 (+) | Sum (+)
 | H04B 7/02 (88) {207} | | | 25 | 15 | 1 | 2 | 4 | 8 | 7 | 207
 | | H04B 7/04 (57) {114} | | 10 | 6 | 0 | 0 | 6 | 0 | 3 | 114
 | | | H04B 7/06 (36) {36} | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 36
 | | | H04B 7/08 (21) {21} | 5 | 3 | 3 | 0 | 0 | 0 | 1 | 21
 | | H04B 7/10 (3) {3} | | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 3
 | | H04B 7/12 (2) {2} | | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 2

Reference direction of rollup and dimension

Next, the dimension used in the present invention will be described in more detail. Typically, time may be a dimension that is a direction axis of rollup or drilldown in various units such as day-week-month-quarter-year and the like. Meanwhile, the patent classification code on the patent classification code system may also be a dimension that serves as a direction axis of the roll up or drill down.

First, the most basic dimension is time, which may be any one or more units selected from day, week, month, quarter, year, and multi-year periods. The same multi-level time dimension can be applied to every kind of time attribute included in patent documents, such as the application date, publication date, and registration date.

Next, the patent classification code can be a dimension. At least one patent classification code system is used in each country, and the patent classification code dimension may be set to reflect the multi-level structure of the patent classification code system itself. Meanwhile, a separate dimension may be set by treating a bundle of at least one selected patent classification code as one unit. For example, under a broad subject such as RFID, subtopics may be set in multiple levels, and a separate dimension may be set by mapping bundles of patent classification codes to the subtopics. Such separate dimensions are particularly useful for tables / cubes that individuals create for their own analytical purposes, that is, for personalized multidimensional analysis result tables.

Regions can also be a dimension. Among the regional units, the unit that can be obtained most easily from the patent document is a country, and from the address information, a dimension having a multi-level hierarchy can be created by dividing by region within a country.

The subject can also be a dimension. Subjects include applicants, inventors, agents, etc. Attributes of applicants can also be dimensions: qualitative attributes such as organization type (corporation, research institute, university, etc.), quantitative attributes such as scale (large, medium, small, etc.), and attributes between applicants such as group affiliation. Corporate financial information, such as stock price, sales, and profit margin, may also be a dimension, as can meta attributes that can be arbitrarily assigned to each company, such as global company or local company.

Attributes of the document, such as application, registration, and rejection, can also be a dimension. In addition, when various count values are grouped, the group to which each count value belongs may be a dimension. Examples are groups by number of claims (1 to 5, 6 to 10, 10 to 15, 15 or more), by number of co-applicants, by number of co-inventors, and by number of family members (domestic family, foreign family). The number itself cannot be a dimension, but once the numbers are grouped, the group to which a number belongs can be a dimension.
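Turning a count value into a group that can serve as a dimension might look like the following sketch. The bucket boundaries are an assumption: the ranges given in the text share their endpoints, so non-overlapping ranges (1 to 5, 6 to 10, 11 to 15, 16 or more) are used here purely for illustration, and all names are hypothetical.

```python
from collections import Counter

# Assumed, non-overlapping claim-count buckets: (low, high, label).
CLAIM_BUCKETS = [(1, 5, "1-5"), (6, 10, "6-10"), (11, 15, "11-15")]


def claim_bucket(n_claims):
    """Map a raw claim count to the label of the group it belongs to."""
    for low, high, label in CLAIM_BUCKETS:
        if low <= n_claims <= high:
            return label
    return "16+" if n_claims > 15 else "none"


def count_by_bucket(claim_counts):
    """Count documents per claim-count group; the group acts as the dimension."""
    return Counter(claim_bucket(n) for n in claim_counts)
```

For claim counts 1, 5, 6, 12, 20, and 30, `count_by_bucket` yields two documents in "1-5", one in "6-10", one in "11-15", and two in "16+".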

The multi-dimensional analysis operation result table generation module 402 of the present invention generates a multi-dimensional analysis operation result table by performing a multidimensional operation for each of at least one selected dimension. When a patent classification code is one of the dimensions, the module 402 generates the multi-dimensionally calculated value by taking into account not only each patent classification code but also its upper patent classification codes. When values are generated this way, the value obtained for any patent classification code reflects the numbers for that code and all of its lower patent classification codes. In other words, when a patent document contains a patent classification code, the module 402 reflects the document in the multidimensional operation not only for that code but also for its upper patent classification codes. For example, if document #1 is assigned the IPC H04B 7/06, then when multidimensional operation data are generated from this document, a count of 1 is given to H04B 7/06 and also to its immediate parent H04B 7/04 and to H04B 7/02 above that. Naturally, the count is also given to H04B 7/00 and the levels above it.
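The propagation of a document's count to upper patent classification codes can be sketched per document as follows. The parent map covers only the codes discussed above, the sample documents are invented, and counting a document once per code even when it carries several codes in the same branch is an assumption of this sketch rather than something stated in the text.

```python
from collections import Counter

# Immediate parent of each code (hypothetical extract of an IPC hierarchy table).
PARENT = {
    "H04B 7/06": "H04B 7/04",
    "H04B 7/08": "H04B 7/04",
    "H04B 7/04": "H04B 7/02",
    "H04B 7/02": "H04B 7/00",
}


def with_ancestors(code):
    """Return the code followed by all of its upper codes, nearest first."""
    chain = [code]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def count_documents(doc_codes):
    """Give every document a count of 1 at each of its codes and their upper codes."""
    counts = Counter()
    for codes in doc_codes.values():
        covered = set()
        for code in codes:
            covered.update(with_ancestors(code))
        counts.update(covered)  # assumption: one count per document per code
    return counts
```

For a single document marked H04B 7/06, `count_documents({1: ["H04B 7/06"]})` assigns a count of 1 to H04B 7/06, H04B 7/04, H04B 7/02, and H04B 7/00.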

Thus the axis of rollup and drilldown is a dimension. Representative dimensions are the patent classification code (IPC, etc.) and time, and country, region, applicant, inventor, status, citation, and family information can also be dimension axes. The multi-dimensional analysis operation result table generation module 402 of the present invention calculates analytical index values such as total amount, share (occupancy rate), concentration rate, and activity rate in advance, using at least one selected dimension as the axis of the rollup.

First, spatially, regions can be broken down step by step as all countries - individual country - region within the country (patent information includes the country and / or address information of the applicant and / or inventor), for example, all countries - Korea - Seoul - Gangnam-gu. Other administrative hierarchies, such as city and province, can be handled in the same way.

Applicants can be subdivided as applicant (countries integrated) - applicant by country, and an intermediate regional level (North America, Asia, Europe, the Middle East, South America, Africa, etc.) can be introduced between all countries and individual countries. To introduce such an intermediate level, the countries must be mapped to it; for example, the North American countries (e.g. the US, Canada, Mexico, etc.) must be mapped to the North America category.
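The mapping that an intermediate regional level requires can be sketched as a simple lookup applied during rollup. The country codes, the region assignments beyond the North American example in the text, and the "Unmapped" fallback are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical mapping from country code to intermediate region.
REGION_OF = {
    "US": "North America",
    "CA": "North America",
    "MX": "North America",
    "KR": "Asia",
    "JP": "Asia",
}


def roll_up_to_region(count_by_country):
    """Sum per-country counts into per-region counts using the mapping."""
    totals = defaultdict(int)
    for country, n in count_by_country.items():
        totals[REGION_OF.get(country, "Unmapped")] += n
    return dict(totals)
```

Countries missing from the mapping are collected under "Unmapped" rather than dropped, so the regional totals still add up to the overall total.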

In addition, it is possible to roll up / drill down from the applicant to the inventor. Examples of the direction are: 1) applicant (countries integrated) - applicant by country - inventor, or 2) applicant (countries integrated) - inventor (countries integrated) - inventor by country. Within one country, it is also possible to roll up / drill down by the attributes of the applicant. For example, applicants can be divided by organization type into corporations, universities, and research institutes; corporations can be divided into large corporations, mid-sized enterprises, and small and medium enterprises; universities can be divided into public universities, private universities, and technical universities, and further down to laboratories; roll up and drill down can then be performed on these divisions. In this case, mapping information must of course be provided for each applicant. For example, it must be recorded that Applicant A is a corporation and that its size class is SME.

If financial statement data for companies and the like are available, more varied rollup / drill down is possible. For example, if the financial statements contain sales, profits, stock prices, and their respective rates of increase and decrease, rollup and drill down can be performed over categories such as company - sales - sales range, company - profit - profit growth range, or company - stock price - stock price growth range. That is, the essence of rollup / drilldown is to arrange the results of information processing by category when one object carries several kinds of category information. One patent document carries country, time, applicant, inventor, and patent classification code information, and the applicant information includes the applicant's regional information. If the applicant is a company, a company information DB containing the company's financial statement information may also be built. In this situation, over all or any subset of the patent documents, rollup / drilldown can be performed 1) along one selected category that has a multi-level hierarchical structure, such as the patent classification code, performing various information processing (searching, counting, or other statistics, analysis, calculation, etc.) and outputting the results, and 2) along intersections of two or more categories. If there are n kinds of categories, the number of possible intersections of r categories is nCr (n combination r, where r is at least 1 and at most n). Not all of these combinations need be offered for selection; those with special analytical meaning may be chosen as the basis for roll up / drill down.
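The number of category intersections mentioned above can be checked with a small sketch using the standard library. The five category names are those listed in the text for a single patent document; the function names are illustrative.

```python
from itertools import combinations
from math import comb

# Categories carried by one patent document, as listed in the text.
CATEGORIES = ["country", "time", "applicant", "inventor", "patent classification code"]


def intersections(categories, r):
    """All ways of choosing r categories to intersect (nCr of them)."""
    return list(combinations(categories, r))


def total_intersections(n):
    """Sum of nCr over r = 1..n, which equals 2**n - 1."""
    return sum(comb(n, r) for r in range(1, n + 1))
```

With five categories there are 10 two-way intersections and 31 choices in total over all r, which illustrates why only the analytically meaningful combinations would normally be rolled up in advance.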

Note the following when rolling up / drilling down through an intersection:

The first case uses two categories. For example, when rolling up / drilling down over category A (e.g. time) and category B (e.g. IPC), it is preferable, for ease of presenting the results of information processing (searching, counting, or other statistics, analysis, calculation, etc.), to drill down in only one direction (for example, expanding only the IPC without drilling down the time axis, as illustrated in the rollup / drilldown concept above). Drilling down both the IPC and the time axis is possible, but the result may look too complicated, so it is advisable to avoid it except in special cases. For the rollup, taking as the base the information value of each cell formed by every combination of the preset units of category A and the preset units of category B, roll-up information processing must be performed in the direction of category A and also in the direction of category B. The table above is an example: each number in a cell is the information value for that cell, and the numbers in { } are the roll-up of the yearly numbers of patents from 2000 to 2005 at each patent classification code level. For example, the 17 in an entry such as H04B 7/02 {17} is the number of patent documents rolled up over H04B 7/02 and its lower patent classification codes. (The table does not show rollups along the year axis, such as the number of all patent documents in 2005 alongside the number in 2005 for H04B 7/02, but these can naturally be rolled up and displayed, and the quarterly and monthly values that make up 2005 are naturally also multi-dimensionally calculated.)

The second case is when more than two categories are used. Here it is preferable to drill down in one selected direction; unfolding in two directions may be allowed, but unfolding in three or more directions is best avoided because it is impossible or extremely difficult to present visually. Even in this case rollup is preferably calculated for all categories, but depending on the amount of rollup calculation and the frequency of use, some rollups need not be calculated in advance. Information that has not been rolled up in advance can still be rolled up when it is needed, although the rollup may then take a relatively long time. For example, if rollup by year has not been done, the year information can be obtained from the bibliographic data of all target documents, counted separately for each year, and output by year. If, however, the counts were divided by year and rolled up in advance, a later drill-down by year only has to read and display the stored yearly information, which is advantageous for response speed.

Result table data for multi-dimensional analysis for total amount analysis


Hereinafter, the multi-dimensional analysis calculation result table generated by the multi-dimensional analysis calculation result table generation module 402 of the present invention will be described in more detail. In addition, a description will be given of how the analysis module accesses the multi-dimensional analysis calculation result table and generates some data.

In general, since the filing date of a patent document is given down to the day, once patent classification codes have been counted on a daily basis, the multi-dimensional calculation may be performed on a monthly, quarterly, and annual basis.
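The date hierarchy described here can be sketched as follows: daily filing dates are counted and rolled up to month, quarter, and year. The function name and the sample dates are hypothetical and serve only to show the idea.

```python
from collections import Counter
from datetime import date

def roll_up_dates(filing_dates):
    """Count documents per month, per quarter, and per year from daily filing dates."""
    by_month, by_quarter, by_year = Counter(), Counter(), Counter()
    for d in filing_dates:
        quarter = (d.month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
        by_month[(d.year, d.month)] += 1
        by_quarter[(d.year, quarter)] += 1
        by_year[d.year] += 1
    return by_month, by_quarter, by_year

# Hypothetical filing dates, not taken from the specification.
dates = [date(2005, 1, 3), date(2005, 2, 14), date(2005, 7, 1), date(2004, 12, 30)]
by_month, by_quarter, by_year = roll_up_dates(dates)
```

Each coarser level is simply the sum of the finer level beneath it, which is what allows yearly values to be served from stored results at drill-down time.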

Table 40 below shows part of one embodiment of a result table on which multi-dimensional operations have been performed. The data in Table 40 is calculated by the multi-dimensional analysis operation result table generation module 402 from the patent information table reconstructed in the star schema, on the basis of applicant, IPC, and filing year, by counting the number of patent documents at each layer of the multi-level IPC. Data such as Table 40 may be stored in any format, such as a table, view, or materialized view, and the data in the stored table can be extracted with an appropriate SQL query. One example is retrieving the multi-dimensionally calculated number of applications per IPC for applicant A at the IPC 1-dot subgroup level (the C5 level below). (For this purpose it is better to also hold total data across the years; it is not shown in Table 40 for lack of space, and even if it is not stored it can be obtained by adding up the yearly values.) The generation module 402 must refer to a patent classification code DB for the IPC or the like so that the multi-dimensional operation also covers lower patent classification codes that cannot be reached from a given code by wildcards (*, ?, etc.). This is especially important at levels whose titles carry dots, in particular IPC subgroups and the subclass levels of the USPC.

TABLE 40

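GID | C5        | C6        | C7        | C.. | 01 | 02 | 03 | 04 | 05
0   | H04B 7/02 | H04B 7/04 | H04B 7/06 |     | 0  | 0  | 3  | 17 | 12
1   | H04B 7/02 | H04B 7/04 | H04B 7/06 |     | 0  | 0  | 3  | 17 | 12
0   | H04B 7/02 | H04B 7/04 | H04B 7/08 |     | 3  | 1  | 5  | 4  | 3
1   | H04B 7/02 | H04B 7/04 | H04B 7/08 |     | 3  | 1  | 5  | 4  | 3
0   | H04B 7/02 | H04B 7/04 |           |     | 3  | 2  | 2  | 21 | 24
1   | H04B 7/02 | H04B 7/04 |           |     | 3  | 2  | 2  | 21 | 24
3   | H04B 7/02 | H04B 7/04 |           |     | 6  | 3  | 10 | 42 | 39
0   | H04B 7/02 | H04B 7/10 |           |     | 0  | 0  | 0  | 0  | 0
1   | H04B 7/02 | H04B 7/10 |           |     | 0  | 0  | 0  | 0  | 0
3   | H04B 7/02 | H04B 7/10 |           |     | 0  | 0  | 0  | 0  | 0
0   | H04B 7/02 | H04B 7/12 |           |     | 0  | 1  | 0  | 0  | 1
1   | H04B 7/02 | H04B 7/12 |           |     | 0  | 1  | 0  | 0  | 1
3   | H04B 7/02 | H04B 7/12 |           |     | 0  | 1  | 0  | 0  | 1
0   | H04B 7/02 |           |           |     | 9  | 3  | 20 | 18 | 24
1   | H04B 7/02 |           |           |     | 9  | 3  | 20 | 18 | 24
3   | H04B 7/02 |           |           |     | 9  | 3  | 20 | 18 | 24
7   | H04B 7/02 |           |           |     | 15 | 7  | 30 | 60 | 64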

The table is described as follows. AppName is the name of the applicant, and the number after C is the depth of the node from the root node of the IPC: C1 is the section, C2 the class, C3 the subclass, C4 the main group, C5 the 1-dot subgroup, C6 the 2-dot subgroup, and C7 the 3-dot subgroup. C8 to C20 may also be assigned, but C15 is typically sufficient. Year columns also exist for years before 00, but they are omitted here. For the current year, the value of each cell is counted by filing date from the documents published so far.

Meanwhile, when values are generated from the table, the value entered in each cell of the total field is the sum of the values for all years. The gray period is the period from the year containing the date 1 year and 6 months before the current date (the period after which an application is normally laid open compulsorily after filing) up to the current date, and its value is the sum of the yearly values within that period.
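A minimal sketch of the total and gray period values follows, using the yearly values of the H04B 7/02 rollup row of Table 40. The current date of 1 September 2005 and the function name are assumptions made only for the example.

```python
from datetime import date

def gray_period_years(today):
    """Years from the year containing (today - 18 months) through the current year."""
    months = today.year * 12 + (today.month - 1) - 18  # month index 18 months ago
    start_year = months // 12
    return list(range(start_year, today.year + 1))

# Yearly values of the H04B 7/02 rollup row in Table 40 (years 01 to 05).
counts = {2001: 15, 2002: 7, 2003: 30, 2004: 60, 2005: 64}

total = sum(counts.values())
# Assumed current date for illustration only.
gray_years = gray_period_years(date(2005, 9, 1))
gray_total = sum(counts.get(year, 0) for year in gray_years)
```

With the assumed date, the gray period covers 2004 and 2005, the years in which some filed applications may not yet have been published.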

The GID indicates up to what level the result has been multi-dimensionally calculated. H04B 7/02 with GID 7 shows the result calculated up to the code itself (C5) (for example, 64 = 24 + 39 + 0 + 1), whereas H04B 7/02 with GID 3, one step below, shows the result calculated only up to the C6 level (that is, excluding the rollup into the code itself). The 24 is the value from GID 1, which is the number of patent documents marked with the patent classification code H04B 7/02 itself. GID values may be notated in any way (the notation is an arbitrary convention), but here they are expressed so as to form the sequence "2 to the power n, minus 1".

The GID denotes the rollup stage. A rollup to the C8 level is marked GID 0, to the C7 level GID 1, to the C6 level GID 3, to the C5 level GID 7, to the C4 level GID 15, to the C3 level GID 31, and to the C2 level GID 63. A GID could of course also be allocated for levels deeper than C8 and for the C1 level, but this is of little practical use and need not be reflected in the system 1. For H04B 7/06, GID 0 is the rollup to the C8 level and GID 1 is the rollup to the C7 level. (In this example there are no values at C8.)
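The rollup stages can be sketched as below, using the year-05 column of Table 40 as input. The data layout (a path of C5 to C8 codes with None for unused levels) and the function name are assumptions for illustration; the specification does not prescribe this representation.

```python
from collections import defaultdict

LEVELS = ("C5", "C6", "C7", "C8")

# Own (not yet rolled-up) counts for year 05, keyed by the (C5, C6, C7, C8) path.
# None marks a level that the classification code does not reach.
base = {
    ("H04B 7/02", "H04B 7/04", "H04B 7/06", None): 12,
    ("H04B 7/02", "H04B 7/04", "H04B 7/08", None): 3,
    ("H04B 7/02", "H04B 7/04", None, None): 24,
    ("H04B 7/02", "H04B 7/10", None, None): 0,
    ("H04B 7/02", "H04B 7/12", None, None): 1,
    ("H04B 7/02", None, None, None): 24,
}

def rollup(base_counts):
    """Sum counts over progressively shorter paths; GID = 2**dropped - 1."""
    result = defaultdict(int)
    n = len(LEVELS)
    for dropped in range(n):      # number of trailing levels rolled up
        gid = 2 ** dropped - 1    # 0, 1, 3, 7
        keep = n - dropped        # number of leading levels kept as the group key
        for path, count in base_counts.items():
            result[(gid, path[:keep])] += count
    return dict(result)

rolled = rollup(base)
```

The GID 7 entry for H04B 7/02 reproduces the sum 24 + 39 + 0 + 1 from the text, and the GID 0 and GID 1 entries for H04B 7/06 are identical because nothing exists at C8 in this data.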

An example of the correspondence between the GID and the levels of the patent classification codes used in the present specification is shown in Table 41 below.

Table 41

GID | Target range | IPC level  | USPC level     | FT level       | ECLA level
0   | C8           | 4 dot      | 4 dot          | 4 dot          | 4 dot
1   | C7           | 3 dot      | 3 dot          | 3 dot          | 3 dot
3   | C6           | 2 dot      | 2 dot          | 2 dot          | 2 dot
7   | C5           | 1 dot      | 1 dot          | 1 dot          | 1 dot
15  | C4           | main group | 0 dot          | 00 level       | main group
31  | C3           | subclass   | class          | theme code     | subclass
63  | C2           | class      | super category | super category | class
127 | Applicant    | Applicant  | Applicant      | Applicant      | Applicant
255 | all          | all        | all            | all            | all

The GID is a code meaning that data has been rolled up to the corresponding level. For example, GID 15 corresponds to the C4 level, and marks information rolled up to the main group in the IPC, the 0-dot level in the USPC, the 00 level in the FT, and the main group in the ECLA. Therefore, to see information at any given level, specifying the GID corresponding to that level yields the information multi-dimensionally calculated up to that level. This specification uses the abbreviation GID, but semantically it may be called a "rollup level code". Above and below, rollup level code and GID are used interchangeably. In the FT, the 00 level is normally the level of two letters without a number, such as AA. For the USPC and the FT, a super category is a grouping by category introduced because there are too many items at the class and theme code levels.
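Selecting information by GID could look like the following sketch, which stores two rolled-up values from Table 40 in an in-memory SQLite table and queries them by target level. The table name, column names, and helper function are hypothetical; only the GID values and counts come from Tables 40 and 41.

```python
import sqlite3

# Target level -> GID, following the sequence 2**n - 1 listed in Table 41.
TARGETS = ["C8", "C7", "C6", "C5", "C4", "C3", "C2", "Applicant", "all"]
GID_BY_TARGET = {target: 2 ** n - 1 for n, target in enumerate(TARGETS)}

# Hypothetical storage layout for a result table (year 05 only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE result_table (applicant TEXT, gid INTEGER, code TEXT, y05 INTEGER)"
)
conn.executemany(
    "INSERT INTO result_table VALUES (?, ?, ?, ?)",
    [("A", 3, "H04B 7/04", 39), ("A", 7, "H04B 7/02", 64)],
)

def count_at_level(applicant, target, code):
    """Return the year-05 value rolled up to the given target level, or None."""
    row = conn.execute(
        "SELECT y05 FROM result_table WHERE applicant = ? AND gid = ? AND code = ?",
        (applicant, GID_BY_TARGET[target], code),
    ).fetchone()
    return row[0] if row else None
```

Because the rollup level is stored as a plain column, a drill-down to another level only changes the GID in the WHERE clause rather than requiring a new aggregation.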

As can be seen from Table 40, each patent classification code in each row of the table has its own upper patent classification codes to its left. The calculation result generation module for multi-dimensional analysis of the present invention generates the multi-dimensionally calculated values for each IPC symbol using data such as that shown in Table 40. Such data can be generated, for the patent document set of a specific applicant A in each country, for every patent classification symbol included in that document set. Data such as Table 40 may be generated from all application documents, or from registered documents only.

The multi-dimensional analysis operation result table generation module 402 may generate the multi-dimensional analysis operation result table shown in Table 40 for each document set having a specific predetermined attribute. To do so, the module 402 performs the steps of 1) obtaining the key values (keys, i.e., document unique information) that specify the patent documents included in the document set having the predetermined attribute, such as their application numbers, 2) extracting from the FACT table of the table reconstructed in the star schema structure only the entries whose key values match those obtained, and 3) generating, for the patent documents corresponding to the extracted key values only, a result table for performing multi-dimensional analysis for at least one predetermined analysis subject. Through steps 1) and 2), the module 402 produces the document set that is the target of the predetermined processing. The FACT table holds FACT information for the entire set of all documents. The FACT information naturally includes the document unique information and may further include various bibliographic items. The document unique information is preferably an application number, a document unique code, or a document serial code.

The multi-dimensional analysis operation result table generation module 402 preferably generates at least one multi-dimensional analysis operation result table for each preset analysis subject, for all documents of each country or of several countries combined. The arbitrary document set may be any set of documents that share at least one attribute that can be defined in advance. Types of such arbitrary document sets include, for example:
1) a patent document set specified by a specific patent classification symbol on a specific patent classification system in a specific country DB;
2) a patent document set specified by a specific applicant name in a specific country DB;
3) a patent document set specified by a specific inventor name appearing as an inventor in the patent documents of a specific applicant (i.e., by applicant name and inventor name) in a specific country DB;
4) a patent document set specified by a specific agent name in a specific country DB;
5) a patent document set specified by a specific patent classification symbol on a specific patent classification system and a specific applicant name in a specific country DB;
6) a patent document set specified by a specific patent classification symbol on a specific patent classification system, a specific applicant name, and a specific inventor name in a specific country DB;
7) a patent document set specified by a specific applicant name and a specific agent name in a specific country DB;
8) the set of all patent documents in a specific country;
9) the set of all patent documents in at least two countries;
10) a patent document set obtained by restricting any of 1) to 9) to preset period units.
Meanwhile, the arbitrary document set may be limited to documents having family information in a foreign country other than the first country, or may consist only of documents reissued in the United States. The document sets mentioned in this paragraph are illustrative, and for such sets the multi-dimensional analysis operation result table generation module 402 of the present invention preferably generates at least one calculation result table for multi-dimensional analysis for each of at least one predetermined analysis subject.

Meanwhile, the multi-dimensional analysis calculation result table generation module 402 may also generate a multi-dimensional analysis calculation result table such as Table 40 for an arbitrary document set generated by the user. In that case the module 402 1) obtains the key values (document unique information) that specify the patent documents included in the user-generated document set, such as their application numbers, 2) extracts from the FACT table of the table reconstructed in the star schema structure only the entries whose key values match those obtained, and 3) generates, for the patent documents corresponding to the extracted key values only, a result table for performing multi-dimensional analysis for at least one predetermined analysis subject.
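The three steps can be sketched as follows, assuming the FACT table is available as a list of records keyed by application number. The field names, sample rows, and function name are hypothetical, and a simple count stands in for the result table.

```python
from collections import Counter

# Hypothetical FACT rows; real FACT information would carry more bibliographic items.
fact_table = [
    {"application_number": "KR-1", "applicant": "A", "ipc": "H04B 7/06"},
    {"application_number": "KR-2", "applicant": "B", "ipc": "H04B 7/08"},
    {"application_number": "KR-3", "applicant": "A", "ipc": "H04B 7/06"},
]

def build_result(fact_rows, document_keys, subject):
    """Count documents per value of `subject`, restricted to the given document set."""
    keys = set(document_keys)                                              # step 1
    selected = [r for r in fact_rows if r["application_number"] in keys]   # step 2
    return Counter(r[subject] for r in selected)                           # step 3

result = build_result(fact_table, ["KR-1", "KR-3"], "ipc")
```

Keys that do not occur in the FACT table are simply ignored, so a user-generated document set can be passed in without prior validation.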

Method of processing operation result table generation module 402 for multidimensional analysis of total amount data

In order to generate a multi-dimensional analysis operation result table such as the table above, the multi-dimensional analysis operation result table generation module 402 performs the following steps on the preset or received document set (for example, all documents filed or registered in a specific country).

First, the module obtains the key values (keys, i.e., document unique information) that specify the patent documents included in the document set having the predetermined attribute, such as their application numbers, extracts from the FACT table of the table reconstructed in the star schema structure only the entries whose key values match, and obtains, for each of the at least one preset analysis subject, a set of instructions (which may be a script) that generates the base table for multi-dimensional operations from only the patent documents corresponding to the extracted key values. (At this time, a set of instructions for executing the multi-dimensional operation on that base table for each analysis subject may also be obtained.) No further processing of the patent documents is done at this point; only the per-subject instruction sets needed to generate a result table for multi-dimensional analysis for at least one predetermined analysis subject are prepared. That is, in this step the multi-dimensional analysis operation result table generation module 402 determines the document set to be processed and the set of commands for each analysis subject.

Second, a base table for multi-dimensional operations is generated by executing the set of commands for each analysis subject on each document in the document set to be processed. The base table includes information on at least one dimension, and each record must hold, for each dimension, the basic data needed to perform the multi-dimensional operation. The basic data includes presence information (1 or null, i.e., whether or not an application / registration occurred) and may include counting information (one or more of the items referred to as counting information in this specification, such as the number of claims). The dimensions are one or more selected from 1) a patent classification symbol dimension for each type of patent classification symbol, 2) a date dimension, 3) a location dimension (country, region, etc.), and 4) an attribute dimension of the subject (any one selected from applicant, inventor, and agent, or a scale attribute), or one or more composite dimensions formed by combining them. (For example, applicants can be divided into companies, universities, research institutes, individuals, and others, and companies can be further divided into multinational companies, large corporations, small and medium enterprises, and so on; this can be obtained by referring to the attributes of each applicant in the subject mast DB 204.)

In this case, fields must be designed in the base table for the multi-dimensional operation to hold the bibliographic items required for each dimension. To accept all upper and lower patent classification symbols of a patent classification system, there must be a field for each level: for the IPC, from the section down to the n-dot subgroups, and for the USPC, from the class (or super category) down to the n-dot level. For the time dimension, on the other hand, a field for the smallest unit used in analysis is sufficient; if patent analysis does not require daily or weekly analysis, monthly fields will suffice. If there is an applicant attribute dimension, there would be fields for companies (multinational companies, large enterprises, small and medium enterprises), universities, research institutes, individuals, and others.

For example, if it is sufficient for the base table for the multi-dimensional operation to include only the patent classification code dimension and the filing date dimension on an applicant basis, then for a document filed by applicant A on January 3, 2005 with the IPC H04B 7/06, data such as that shown in Table 42 may be generated.

Table 42

Station | C1 | C2  | C3   | C4        | C5        | C6        | C7        | 05/01 | 05/02 | 05/03
        | H  | H04 | H04B | H04B 7/00 | H04B 7/02 | H04B 7/04 | H04B 7/06 | 1     |       |

The period and the applicant's attributes may be entered as follows. The data shown in Table 43 below, which is the result of partially performing the multi-dimensional calculation on the year and the applicant's attributes, could be produced at the time of the multi-dimensional operation described later, but there is no problem in generating it in advance.

TABLE 43

C1 | C2  | C3   | C4        | C5        | C6        | C7        | 05 | 05/01 | 05/02 | 05/03
H  | H04 | H04B | H04B 7/00 | H04B 7/02 | H04B 7/04 | H04B 7/06 | 1  | 1     |       |

In addition, the base table may further include the applicant's location, a company evaluation such as the applicant's financial rating, family number categories based on the number of families, claim number ranges based on the number of claims, and so on. This can be done by including additional data such as that in Table 44.

Table 44

Columns: Country, State, City, Number of families category 1, Number of families category 2, Enterprise rating n-1, Enterprise rating n-2, Claim range 1, Claim range 2
Row: KR, Gyeonggi-do, Suwon, 1, 1, 1

The method of generating such data is as follows. 1) The patent classification code included in the document is looked up in the patent classification code mast DB 203 to obtain all of its upper patent classification codes, and the code and each of the obtained upper codes are entered in the fields for their respective levels. For example, if the IPC is H04B 7/06, enter H04B 7/06 at the 3-dot subgroup (C7) level, H04B 7/04 at the 2-dot subgroup (C6) level, H04B 7/02 at the 1-dot subgroup (C5) level, H04B 7/00 at the main group (C4) level, H04B at the subclass (C3) level, H04 at the class (C2) level, and H at the section (C1) level. 2) From the various dates such as the filing date and the registration date, enter the necessary value in the time dimension according to the chosen reference date (here, the filing date). If the filing date of January 3, 2005 is the reference date and the time dimension is divided into month, quarter, and year, enter 1 for January 2005. 3) Enter the other bibliographic items in the other dimensions. For example, if the applicant is a large company, enter 1 in the large company field. Fill in field values such as the applicant's location using the address information. Query the patent document mast DB 202 for the number of family members of the target patent document, determine which category that number falls into, and record 1 in that category. Claim ranges are handled in the same way as the number of families.
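Steps 1) and 2) can be sketched as below for the H04B 7/06 example. The parent lookup dictionary stands in for a query against the patent classification code mast DB 203, and the function and field names are assumptions made for illustration.

```python
from datetime import date

# Stand-in for the classification code DB: each code maps to its direct upper code.
PARENT = {
    "H04B 7/06": "H04B 7/04",
    "H04B 7/04": "H04B 7/02",
    "H04B 7/02": "H04B 7/00",
    "H04B 7/00": "H04B",
    "H04B": "H04",
    "H04": "H",
}

def ancestors_and_self(code):
    """Return the chain from the section (root) down to the code itself."""
    chain = [code]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return list(reversed(chain))

def base_record(code, filing_date):
    """Build one base-table record: level fields C1..Cn plus a monthly presence flag."""
    record = {f"C{i}": c for i, c in enumerate(ancestors_and_self(code), start=1)}
    record[filing_date.strftime("%y/%m")] = 1  # e.g. "05/01" for January 2005
    return record

record = base_record("H04B 7/06", date(2005, 1, 3))
```

The resulting record corresponds to the row of Table 42: one field per classification level and a 1 in the month of filing.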

If a document has two or more types of patent classification codes (for example, IPC and USPC), a base table for the multi-dimensional operation may be generated for each type, or two or more types of patent classification symbols may be handled in one table. When a document includes a plurality of patent classification codes of the same type, it may be desirable to create an independent record for each patent classification code. Likewise, for a joint application by two or more applicants, one record is generated for each applicant. (If the subject is the inventor, a record is generated for each inventor; if it is the agent, a record is generated for each agent.)
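One way to read the record-generation rule is sketched below: a document with several applicants and several classification codes of one type yields one record per applicant and code pair. Treating the two rules as a cross product is an assumption of this sketch, as are the function and field names.

```python
from itertools import product

def expand_records(application_number, applicants, codes):
    """Create one base-table record per (applicant, classification code) pair."""
    return [
        {"application_number": application_number, "applicant": a, "code": c}
        for a, c in product(applicants, codes)
    ]

# Hypothetical joint application by A and B carrying two IPC codes.
records = expand_records("KR-1", ["A", "B"], ["H04B 7/06", "H04B 7/08"])
```

Because each record still carries the application number, later processing can count distinct documents instead of records where double counting must be avoided.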

Third, a multi-dimensional operation is performed on the base table for the multi-dimensional operation to generate the multi-dimensional analysis operation result table. The data in Table 45 below shows the result of performing the rollup operation along the IPC dimension and then up to the year level of the date dimension. (For convenience of description, quarterly values rolled up into the year are shown for the year 04 only.)

TABLE 45

C5 C6 C7 C .. 01 02 03 04 04 / 1Q 04 / 2Q 04 / 3Q 04 / 4Q H04B 7/02 H04B 7/04 H04B 7/06 0 0 3 17 One 2 3 11 H04B 7/02 H04B 7/04 H04B 7/06 0 0 3 17 One 2 3 11 H04B 7/02 H04B 7/04 H04B 7/08 3 One 5 4 One 2 One 0 H04B 7/02 H04B 7/04 H04B 7/08 3 One 5 4 One 2 One 0 H04B 7/02 H04B 7/04 3 2 2 21 4 4 4 9 H04B 7/02 H04B 7/04 3 2 2 21 4 4 4 9 H04B 7/02 H04B 7/04 6 3 10 42 6 8 8 20 H04B 7/02 H04B 7/10 0 0 0 0 H04B 7/02 H04B 7/10 0 0 0 0 H04B 7/02 H04B 7/10 0 0 0 0 H04B 7/02 H04B 7/12 0 One 0 0 H04B 7/02 H04B 7/12 0 One 0 0 H04B 7/02 H04B 7/12 0 One 0 0 H04B 7/02 9 3 20 18 6 6 6 0 H04B 7/02 9 3 20 18 6 6 6 0 H04B 7/02 9 3 20 18 6 6 6 0 H04B 7/02 15 7 30 60 12 14 14 20

The multi-dimensional operation can be performed either as a rollup operation or as a cube operation. The two are explained here using the IPC, date, and applicant dimensions. With these three dimensions, the rollup operation proceeds in only one direction selected from among them, whereas the cube operation proceeds in 3P3 directions (the number of permutations of three items taken three at a time, here 6). In other words, if the IPC dimension is 1, the date dimension 2, and the applicant dimension 3, the rollup operation is performed in a single direction such as 1->2->3, while the cube operation may proceed not only in 1->2->3 but in all six directions, 1->2->3, 1->3->2, 2->1->3, 2->3->1, 3->1->2, and 3->2->1, or in one or more directions selected from them. Therefore, when performing a rollup operation, if the full 1->2->3 result (in which the final rollup result is generated per applicant type) is not needed, that is, if only the IPC rollup result is needed, or if, as in Table 45 above, only the result rolled up over the IPC and date dimensions is needed, the rollup operation may be performed on only the selected one or more dimensions by designating the desired rollup direction.
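The six directions can be enumerated with a short sketch; the dimension labels are illustrative only.

```python
from itertools import permutations

dimensions = ("IPC", "date", "applicant")

# A rollup operation follows one chosen direction.
rollup_direction = ("IPC", "date", "applicant")

# A cube operation may cover every ordering of the three dimensions (3P3 = 6).
cube_directions = list(permutations(dimensions))

for direction in cube_directions:
    print(" -> ".join(direction))
```

The single rollup direction is one element of the list of cube directions, which is the sense in which a rollup covers only part of what a cube covers.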

Therefore, when there are many purposes (when result tables for multi-dimensional analysis are required from various viewpoints) and rollup operations are to be used, a result table for multi-dimensional analysis must be created for each required selection of one, two, or more dimensions. In other words, a plurality of multi-dimensional analysis operation result tables, one per analysis subject, are generated as needed, and the analysis module generates the desired analysis result data by querying the relevant result table for each analysis subject. Cube operations, on the other hand, perform the multi-dimensional calculation for many combinations of dimensions in many directions at once, so the number of result tables is relatively small, but each result table becomes large, which is a drawback. The cube operation is therefore preferable when analysis from various viewpoints is wanted for each analysis subject (because the desired dimensions can be selected and combined in any order), or when analysis from a dynamic viewpoint such as drill-through is wanted. When only a small number of viewpoints is needed, such as rolling up and drilling down around one dimension, or viewing results rolled up in one direction even over several dimensions, it is more desirable to create the result table for multi-dimensional analysis by a rollup operation.

The result table for multi-dimensional analysis generated by a rollup operation is a subset, or a slice, of the result table generated by a cube operation. (With the same number of dimensions, the cube operation is performed over the permutations of all possible dimensions, so the result for one direction is a subset of the result for all permutations.) Because a cube operation typically covers more dimensions than a rollup operation and covers the permutations of all possible dimensions (some of which may of course be excluded), a rollup result can be one slice of the cube result. For example, from the result of a cube operation with the IPC, date, and applicant dimensions as axes, one can extract the result of an IPC-only rollup or of a rollup in the IPC dimension -> date dimension direction; conceptually each is a slice of the cube result set.

Conversely, any cube calculation result can be reproduced by combining a plurality of rollup calculation results. For example, for the IPC, date, and applicant dimensions, the result table generated by a cube operation is equivalent to the result tables generated by selecting one, two, or three of the dimensions and performing a rollup operation for each permutation of the selected dimensions (one permutation corresponds to one rollup direction). That is, n rollup operations are equivalent to one cube operation, and describing the invention in terms of rollup operations reduces the complexity of the description. The rollup operation results are therefore described below. Of course, the present invention is not limited to rollup operations; a person skilled in the art can easily replace or modify the rollup operation with a cube operation. With multiple dimensions, each dimension is assigned its own fields, and every dimension has at least one level of lower fields (the IPC being the typical case), so it is difficult to show all of these fields within the physical width of a page. Even if they are not all shown, there should be no difficulty in understanding and applying the idea of the present invention. Hereinafter, therefore, to reduce the complexity of the description and notation, the description focuses on the rollup operation and, as far as possible, on a single dimension; the same idea naturally applies as it is to multiple dimensions and to cube operations. The patent classification dimension, such as IPC or USPC, is unique to patent data, and the rollup and cube operations of the present invention are best understood through it. The date dimension, being common to other kinds of data, is not described in great detail and is expressed in units of years where possible, but it is obviously also a target of rollup and cube operations.
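The equivalence between one cube operation and a collection of rollup operations can be checked with a small sketch. It assumes the usual OLAP reading in which a rollup along a direction produces one grouping per prefix of that direction and a cube produces one grouping per subset of the dimensions; the function names are hypothetical.

```python
from itertools import combinations, permutations

def rollup_groupings(direction):
    """Grouping sets of one rollup: every prefix of the direction, including the grand total."""
    return {frozenset(direction[:k]) for k in range(len(direction) + 1)}

def cube_groupings(dims):
    """Grouping sets of a cube: every subset of the dimensions."""
    return {
        frozenset(combo)
        for r in range(len(dims) + 1)
        for combo in combinations(dims, r)
    }

dims = ("IPC", "date", "applicant")

# Union of the rollup groupings over all six directions.
combined = set().union(*(rollup_groupings(p) for p in permutations(dims)))
one_rollup = rollup_groupings(("IPC", "date", "applicant"))
```

Under this assumption a single rollup contributes 4 of the 8 cube groupings, and the rollups over all directions together cover the cube exactly.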

The multi-dimensional analysis calculation result table generation module 402 obtains filing date information and patent classification code information from a given patent document set, and then extracts the upper patent classification codes of each obtained code by referring to the patent classification code mast DB 203, which holds the patent classification code system as data, or to the table reconstructed in the star schema structure. The data of the multi-dimensional analysis operation result table, containing the information shown in the table, is completed from the extracted upper patent classification codes and the year of the filing date. When a patent document has two or more types of patent classification symbols (for example, United States documents carry both IPC and USPC), each type may be processed independently. When a patent document contains two or more patent classification symbols of a single type, the processing may 1) use only the first (main) patent classification symbol, 2) use all patent classification symbols, or 3) if weights exist for the main and sub patent classification codes, apply those weighting factors. In case 3) fractional values may appear, and rounding may then be a valid way of presenting them. Which of 1), 2), and 3) to adopt is a matter of policy, and policies other than these three are of course also possible. With method 2), the number of rows in the multi-dimensional analysis calculation result table increases, because at least one new data value is generated for each patent classification code (that is, the rollup operation is performed for the sub-IPCs as well).
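The three counting policies can be compared with the sketch below. The policy names, the default weights of 1.0 for the main code and 0.5 for each sub code, and the sample documents are all assumptions chosen for illustration; the specification does not fix any weight values.

```python
from collections import Counter

def count_codes(documents, policy, main_weight=1.0, sub_weight=0.5):
    """Count classification codes; each document lists its main code first."""
    totals = Counter()
    for codes in documents:
        main, subs = codes[0], codes[1:]
        if policy == "main_only":      # policy 1): first (main) code only
            totals[main] += 1
        elif policy == "all":          # policy 2): every code counts once
            for code in codes:
                totals[code] += 1
        elif policy == "weighted":     # policy 3): weighted main and sub codes
            totals[main] += main_weight
            for code in subs:
                totals[code] += sub_weight
        else:
            raise ValueError(f"unknown policy: {policy}")
    return totals

# Hypothetical documents, each with its classification codes.
docs = [
    ["H04B 7/06", "H04B 7/08"],
    ["H04B 7/08"],
    ["H04B 7/06", "H04B 7/04", "H04B 7/08"],
]
main_only = count_codes(docs, "main_only")
all_codes = count_codes(docs, "all")
weighted = count_codes(docs, "weighted")
```

Policy 2) yields more distinct codes than policy 1), matching the remark that the result table gains rows, and policy 3) can yield fractional values that may be rounded for display.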

The table presents a quantity-based indicator among patent indicators, such as the number of patents. Data processing such as rollup / drill-down down to lower patent classification codes according to the present invention may also be applied to various other patent information analysis indexes, such as share, concentration rate, and activity index (AI).

Generating the result table for multi-dimensional analysis without a base table for multi-dimensional calculation by analysis subject

The above described how the multi-dimensional analysis operation result table is generated from the base table for multi-dimensional calculation by analysis subject. That base table is not essential. When a table reconstructed into a star schema structure exists, the multi-dimensional analysis operation result table generation module 402 of the present invention can generate the multi-dimensional analysis operation result table directly from it. The base table for multi-dimensional calculation by analysis subject only serves to reduce the calculation amount / information processing amount of the module 402. Even without it, the module 402 can obtain the necessary information from the table reconstructed into the star schema structure and perform the multi-dimensional operation itself. For example, the base table stores, for each application number, each IPC of the application together with all of its upper IPCs. Even if this information is not stored in a base table, the module 402 can find the IPCs included in the application number and look each of them up in the patent classification code master DB 203, or in another table holding all upper patent classification codes, to obtain every IPC above the found IPC.
Therefore, when there is no base table for multi-dimensional calculation by analysis subject, the internal instructions of the multi-dimensional analysis operation result table generation module 402 naturally become relatively complicated. Once the module 402 has generated the multi-dimensional analysis operation result table, however, all subsequent processing proceeds in the same way.
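The upper-IPC lookup described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the `PARENT` dictionary is a hypothetical stand-in for the patent classification code master DB 203, and the function names are assumptions made for this example.

```python
# Hypothetical stand-in for the patent classification code master DB 203:
# each code maps to its immediate upper (parent) code, None at the top.
PARENT = {
    "H04B 7/06": "H04B 7/04",
    "H04B 7/08": "H04B 7/04",
    "H04B 7/04": "H04B 7/02",
    "H04B 7/02": "H04B 7/00",
    "H04B 7/00": "H04B",
    "H04B": None,
}


def with_upper_codes(code, parent=PARENT):
    """Return the code itself followed by all of its upper codes."""
    chain = []
    while code is not None:
        chain.append(code)
        code = parent.get(code)  # unknown codes end the walk
    return chain


def codes_for_application(ipcs, parent=PARENT):
    """Collect each IPC of one application plus every upper IPC, without duplicates."""
    seen = []
    for ipc in ipcs:
        for code in with_upper_codes(ipc, parent):
            if code not in seen:
                seen.append(code)
    return seen
```

Calling `codes_for_application` once per application number yields the same per-application code list that the base table would otherwise have stored.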

Generating the multi-dimensional analysis operation result table without the table reconstructed into a star schema

The above described how the multi-dimensional analysis operation result table is generated using the table reconstructed into the star schema structure and the base table for multi-dimensional calculation by analysis subject. The table reconstructed into the star schema structure exists to generate the base table efficiently and is not strictly required. Instead of reading from the star-schema table, the multi-dimensional analysis operation result table generation module 402 may read the necessary data directly from the patent document master DB 202, the patent classification code master DB 203 and / or the subject master DB 204 to generate the base table for multi-dimensional calculation by analysis subject. In this case, the program (script, etc.) that generates the base table may become relatively complicated, and the reusability of the entire program, or of each module constituting it, may be relatively poor. For example, the base table stores, for each application number, each IPC of the application together with all of its upper IPCs. To produce this information, the module 402 of the present invention finds the IPCs included in the application number, looks each found IPC up in the patent classification code master DB 203 or in another table holding all upper patent classification codes, obtains the information on all upper IPCs of the found IPC, and then generates the base table for multi-dimensional calculation by analysis subject from that information.

Generating the multi-dimensional analysis operation result table by analysis subject from the patent document master DB 202

As described above, the multi-dimensional analysis operation result table generation module 402 can generate a multi-dimensional analysis operation result table for each analysis subject even when there is neither a table reconstructed into a star schema structure nor a base table for multi-dimensional operation by analysis subject. To generate the multi-dimensional analysis operation result table, the module 402 processes the following steps for a preset or received document set (for example, the set of all documents filed or registered in a specific country).

First, a key (document-unique information) value that identifies each patent document, such as the application number, is obtained and stored for the patent documents included in a document set having a predetermined attribute. Then, in order to generate a multi-dimensional analysis operation result table for each analysis subject as shown in Table 45, at least one of the data items of Tables 42 to 44, which are its source material, is obtained for each patent document from the patent document master DB 202, the patent classification code master DB 203 and / or the subject master DB 204, and is generated and stored in memory. The data stored in memory may be equivalent to the data stored in the base table for multi-dimensional operation. That is, the data stored in memory includes information on at least one dimension, and each record must hold the basic data for each dimension needed to perform the multi-dimensional operation. The basic data may include presence information (1 or null, that is, whether an application / registration occurred) and counting information (one or more of the items referred to as counting information in this specification, such as the number of claims). The dimension is any one or more selected from the following, or any one or more composite dimensions formed by combining them:

1) the patent classification code dimension, for each type of patent classification code;
2) the date dimension;
3) the location dimension (country, region, etc.);
4) the attribute dimension of the subject (any one selected from applicant, inventor and agent, or the scale of the subject). Applicants can be divided into companies, universities, research institutes, individuals and others, and companies can be further divided into multinational companies, large corporations, small and medium enterprises, etc. These attributes may be obtained by referring to the properties of each applicant in the subject master DB 204.

Second, the multi-dimensional analysis operation result table generation module 402 performs the multi-dimensional operation for a predetermined analysis subject on the data stored in memory, in which at least one of the data items of Tables 42 to 44 has been integrated for each patent document, and generates a multi-dimensional analysis operation result table for each analysis subject as shown in Table 45.
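The two steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the `DOCS` records and the `PARENT` lookup are hypothetical stand-ins for data read from the master DBs 202 to 204, and counting each document at most once per code is a design choice of this example rather than something the text prescribes.

```python
from collections import defaultdict

# Hypothetical parent lookup (a stand-in for the classification master DB 203).
PARENT = {
    "H04B 7/06": "H04B 7/04",
    "H04B 7/08": "H04B 7/04",
    "H04B 7/04": "H04B 7/02",
    "H04B 7/02": None,
}

# Hypothetical in-memory records: (application number, applicant, year, IPCs).
DOCS = [
    ("app-1", "A", 2004, ["H04B 7/06"]),
    ("app-2", "A", 2004, ["H04B 7/06", "H04B 7/08"]),
    ("app-3", "A", 2005, ["H04B 7/04"]),
]


def upper_chain(code):
    """Yield the code itself and then each of its upper codes."""
    while code is not None:
        yield code
        code = PARENT.get(code)


def build_result_table(docs):
    """Count documents per (applicant, code, year), rolled up to upper codes."""
    table = defaultdict(int)
    for _app_no, applicant, year, ipcs in docs:
        # A set keeps one document from being counted twice under one code.
        codes = {c for ipc in ipcs for c in upper_chain(ipc)}
        for code in codes:
            table[(applicant, code, year)] += 1
    return dict(table)
```

With the sample records, `build_result_table(DOCS)[("A", "H04B 7/04", 2004)]` is 2, because both 2004 documents fall under H04B 7/04 through their lower codes.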

Ranking information by IPC level for each applicant

Given a multidimensional analysis operation result table as shown in Table 45, the analysis module can generate the following analysis results.

First, when an applicant is given, ranking information based on the number of applications / registrations can be generated for each level of the applicant's patent classification codes. The ranking information is generated by comparing the multi-dimensionally calculated count values of application or registration documents at each level of at least one patent classification code of the applicant. For example, in the multi-dimensional analysis operation result table data, comparing the count value of the multi-dimensionally calculated documents of H04B 7/02 at the 1-dot subgroup level (C5 level) of applicant A with the count values of the multi-dimensionally calculated documents of the other patent classification codes at the same 1-dot subgroup level of applicant A yields the most-applied / most-registered ranking of patent classification codes at the 1-dot subgroup level of applicant A.
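The comparison of count values within one level can be sketched as follows. The rows reuse the rolled-up yearly counts that appear for three sibling codes in Table 46. Ranking by the five-year total and ignoring ties are assumptions of this example.

```python
# Rolled-up yearly counts (columns 01..05) of sibling codes at one level
# for one applicant, taken from the rows of Table 46.
LEVEL_ROWS = {
    "H04B 7/04": [6, 3, 10, 42, 39],
    "H04B 7/10": [0, 0, 0, 0, 0],
    "H04B 7/12": [0, 1, 0, 0, 1],
}


def rank_codes(rows):
    """Rank the codes of one level by their total count, highest first."""
    totals = {code: sum(counts) for code, counts in rows.items()}
    ordered = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(rank, code, total) for rank, (code, total) in enumerate(ordered, start=1)]
```

For these rows the ranking is H04B 7/04 (100 documents), then H04B 7/12 (2), then H04B 7/10 (0).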

Total quantity analysis module

Assume that there is a table of calculation results for multi-dimensional analysis that has been counted for each applicant, IPC level, and application (or registration) as shown in Table 46 below.

TABLE 46

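GC | C4 | C5 | C6 | C7 | C.. | 01 | 02 | 03 | 04 | 05
 | H04B 7/00 | H04B 7/02 | H04B 7/04 | H04B 7/06 | | 0 | 0 | 3 | 17 | 12
 | | H04B 7/02 | H04B 7/04 | H04B 7/06 | | 0 | 0 | 3 | 17 | 12
 | | H04B 7/02 | H04B 7/04 | H04B 7/08 | | 3 | 1 | 5 | 4 | 3
 | | H04B 7/02 | H04B 7/04 | H04B 7/08 | | 3 | 1 | 5 | 4 | 3
 | | H04B 7/02 | H04B 7/04 | | | 3 | 2 | 2 | 21 | 24
 | | H04B 7/02 | H04B 7/04 | | | 3 | 2 | 2 | 21 | 24
 | | H04B 7/02 | H04B 7/04 | | | 6 | 3 | 10 | 42 | 39
 | | H04B 7/02 | H04B 7/10 | | | 0 | 0 | 0 | 0 | 0
 | | H04B 7/02 | H04B 7/10 | | | 0 | 0 | 0 | 0 | 0
 | | H04B 7/02 | H04B 7/10 | | | 0 | 0 | 0 | 0 | 0
 | | H04B 7/02 | H04B 7/12 | | | 0 | 1 | 0 | 0 | 1
 | | H04B 7/02 | H04B 7/12 | | | 0 | 1 | 0 | 0 | 1
 | | H04B 7/02 | H04B 7/12 | | | 0 | 1 | 0 | 0 | 1
 | | H04B 7/02 | | | | 9 | 3 | 20 | 18 | 24
 | | H04B 7/02 | | | | 9 | 3 | 20 | 18 | 24
 | | H04B 7/02 | | | | 9 | 3 | 20 | 18 | 24
 | | H04B 7/02 | | | | 15 | 7 | 30 | 60 | 64
B29C / 100 .. .. H04B ..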

The multi-dimensional analysis operation result table may be generated, on an applicant basis, for all applicants in a specific country. For example, if a specific patent document of applicant C contains a patent classification code and an application date, the patent classification code system is consulted with that patent classification code and application date, and the corresponding cell of the above table is either newly created (patent classification code, AppName, etc.) or has its count value incremented. When the multi-dimensional analysis operation result table is generated for all applicants in a specific country as in the table above, the analysis module can generate the following analysis results from it.

First, to show information grouped at the IPC subclass level for applicant A, a yearly count value can be generated at this level. The value of the "recent" field is preferably a count value derived from the patent documents obtained from the year that falls one year and six months (18 months) before a preset base date (for example, the date on which the multi-dimensional analysis operation was performed) up to the date on which the multi-dimensional analysis operation result table is queried. ("..." refers to the corresponding count value.) Table 47 below shows an example.

TABLE 47

ranking | IPC subclass | 01 | 02 | 03 | 04 | 05 | recent
1 | H04B (+) | ... | ... | ... | ... | ... | ...
2 | B29C (+) | ... | ... | ... | ... | ... | ...

Of course, if H04B in Table 47 is drilled down, the count values for applicant A's patent documents whose patent classification codes fall under the codes directly below H04B are shown as in Table 48 below. These can be counted under the condition "AppName = A and IPC level = C4" in the result table for multi-dimensional analysis shown above, and ordered based on C4 (order by IPC level = C4; C4 means IPC subclass).

TABLE 48

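ranking | IPC subclass | IPC main group | 01 | 02 | 03 | 04 | 05 | recent
1 | H04B (+) | | ... | ... | ... | ... | ... | ...
 | | H04B 1/00 (+) | ... | ... | ... | ... | ... | ...
 | | H04B 3/00 (+) | ... | ... | ... | ... | ... | ...
 | B29C (+) | | ... | ... | ... | ... | ... | ...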

If Table 48 is drilled down again, the multi-dimensionally calculated count values at the next level are shown in the same way. When H04B is drilled down, the count values for the applicant's patent documents whose patent classification codes fall under the codes immediately below H04B are obtained by counting under the condition "AppName = A and IPC level = C5" in the result table for multi-dimensional analysis, and the summed results can be sorted based on C5 (order by IPC level = C5; C5 means IPC main group). The count value in each cell is a multi-dimensionally calculated value that includes all values corresponding to the lower patent classification codes.

If H04B 1/00 (+) above is drilled down, the patent classification codes directly below it are shown. The value of each cell is counted under the condition "AppName = A and IPC level = C6" and ordered based on C6 (order by IPC level = C6; C6 means IPC 1-dot group). The count value in each cell is a multi-dimensionally calculated value that includes all values corresponding to its lower patent classification codes. In this way, drilling down can continue until there are no lower patent classification codes, and the required cell values can be generated at each step by applying the above conditions.
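The drill-down condition can be sketched as a filter over result-table rows. The row layout (`app`, `level`, `code`, `parent`, `counts`) is a hypothetical structure chosen for this example. The counts are the rolled-up values shown in Table 46.

```python
# Hypothetical rows of a result table: one row per (applicant, level, code),
# holding the parent code and the already rolled-up yearly counts (01..05).
ROWS = [
    {"app": "A", "level": "C5", "code": "H04B 7/02", "parent": "H04B 7/00", "counts": [15, 7, 30, 60, 64]},
    {"app": "A", "level": "C6", "code": "H04B 7/04", "parent": "H04B 7/02", "counts": [6, 3, 10, 42, 39]},
    {"app": "A", "level": "C6", "code": "H04B 7/10", "parent": "H04B 7/02", "counts": [0, 0, 0, 0, 0]},
    {"app": "A", "level": "C6", "code": "H04B 7/12", "parent": "H04B 7/02", "counts": [0, 1, 0, 0, 1]},
    {"app": "A", "level": "C7", "code": "H04B 7/06", "parent": "H04B 7/04", "counts": [0, 0, 3, 17, 12]},
    {"app": "A", "level": "C7", "code": "H04B 7/08", "parent": "H04B 7/04", "counts": [3, 1, 5, 4, 3]},
]


def drill_down(rows, app, code):
    """Return the rows directly below `code` for one applicant, sorted by code."""
    children = [r for r in rows if r["app"] == app and r["parent"] == code]
    return sorted(children, key=lambda r: r["code"])
```

Drilling down on H04B 7/02 returns the rows for H04B 7/04, H04B 7/10 and H04B 7/12. Drilling down on a code with no lower codes returns an empty list, which is the stopping condition described above.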

Although the above method has been described with respect to IPC, it will be apparent that the same is true for USPC, FT, FI, and ECLA.

Similarly, the above has described, for the application (or registration) total quantity, a method of generating a multi-dimensionally calculated result table for multi-dimensional analysis, a method of calculating cell values when drilling down, and a method of presenting the calculated values. It will be apparent to those skilled in the art that a result table for multi-dimensional analysis can be generated in the same way for the activity rate.

Likewise, for other analysis indicators whose predetermined defining formula involves patent classification codes, multi-dimensionally calculated values can be generated in the same way, cell values can be calculated when drilling down, and the calculated values can be presented.

When a result table for multi-dimensional analysis as described above exists and at least one specific patent classification code is given, a ranking of applicants and the yearly count values of each applicant can be generated in descending order of count value. For example, assuming that H04B is given, the condition "IPC level = C4 and C4 = H04B" is applied, the counts are summed for each applicant, and the result may be sorted by applicant (order by AppName).

Share and other data generation examples

For example, if the share is defined as "the number of documents in the target document set having a specific attribute divided by the number of documents in the entire document set having that attribute", data in the same table form as Table 46 can be generated for the share. That is, when obtaining the share of a specific company A for a specific patent classification code, the target document set is the set of A's documents that include the specific patent classification code, and the entire document set is the set of all documents that include the specific patent classification code. When obtaining a document set that includes a patent classification code, documents that include any lower patent classification code of that code must naturally be included as well.

The multi-dimensional analysis operation result table generation module 402 of the present invention can also create a table in the same format as Table 46 but without the applicant, as shown in Table 49, that is, a table that integrates the document counts of all applicants. This can be done either by a roll-up operation from the applicant dimension to the country dimension or by a sum operation.

Table 49

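IC | C5 | C6 | C7 | C.. | 01 | 02 | 03 | 04 | 05
 | H04B 7/02 | H04B 7/04 | H04B 7/06 | | 10 | 15 | 17 | 42 | 42
 | H04B 7/02 | H04B 7/04 | H04B 7/06 | | 10 | 15 | 17 | 42 | 42
 | H04B 7/02 | H04B 7/04 | H04B 7/08 | | 7 | 11 | 20 | 32 | 38
 | H04B 7/02 | H04B 7/04 | H04B 7/08 | | 7 | 11 | 20 | 32 | 38
 | H04B 7/02 | H04B 7/04 | | | 11 | 12 | 19 | 79 | 86
 | H04B 7/02 | H04B 7/04 | | | 11 | 12 | 19 | 79 | 86
 | H04B 7/02 | H04B 7/04 | | | 28 | 38 | 56 | 153 | 166
 | H04B 7/02 | H04B 7/10 | | | 6 | 3 | 3 | 4 | 8
 | H04B 7/02 | H04B 7/10 | | | 6 | 3 | 3 | 4 | 8
 | H04B 7/02 | H04B 7/10 | | | 6 | 3 | 3 | 4 | 8
 | H04B 7/02 | H04B 7/12 | | | 0 | 2 | 2 | 1 | 3
 | H04B 7/02 | H04B 7/12 | | | 0 | 2 | 2 | 1 | 3
 | H04B 7/02 | H04B 7/12 | | | 0 | 2 | 2 | 1 | 3
 | H04B 7/02 | | | | 44 | 40 | 67 | 83 | 85
 | H04B 7/02 | | | | 44 | 40 | 67 | 83 | 85
 | H04B 7/02 | | | | 44 | 40 | 67 | 83 | 85
 | H04B 7/02 | | | | 78 | 83 | 128 | 241 | 262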

The multi-dimensional analysis operation result table generation module 402 of the present invention can obtain the share of each level of the applicant's patent classification codes by year by dividing the values of Table 46 by the corresponding values of Table 49. The obtained share data may be generated in the same data structure as those tables, and Table 50 below is one embodiment. The share is computed as a ratio and, for output, is multiplied by 100 and expressed in %, as in Table 50. As the table shows, the share for each patent classification code can be calculated and stored as a table, and drill-down results can also be output from this multi-dimensionally calculated table data. Note that a ratio value cannot be rolled up by simple addition. For example, the sum of the share values of each quarter is not the share for the year, and the sum of the shares of the lower patent classification codes is not the share of the upper patent classification code. The formula / definition of the share must therefore be applied again at each unit of each roll-up operation. The same applies when rolling up other ratio values such as the concentration rate and the activity rate, and the description is not repeated.

TABLE 50

(unit %)

GC | C5 | C6 | C7 | C.. | 01 | 02 | 03 | 04 | 05
 | H04B 7/02 | H04B 7/04 | H04B 7/06 | | 0.0 | 0.0 | 17.6 | 40.5 | 28.6
 | H04B 7/02 | H04B 7/04 | H04B 7/06 | | 0.0 | 0.0 | 17.6 | 40.5 | 28.6
 | H04B 7/02 | H04B 7/04 | H04B 7/08 | | 42.9 | 9.1 | 25.0 | 12.5 | 7.9
 | H04B 7/02 | H04B 7/04 | H04B 7/08 | | 42.9 | 9.1 | 25.0 | 12.5 | 7.9
 | H04B 7/02 | H04B 7/04 | | | 27.3 | 16.7 | 10.5 | 26.6 | 27.9
 | H04B 7/02 | H04B 7/04 | | | 27.3 | 16.7 | 10.5 | 26.6 | 27.9
 | H04B 7/02 | H04B 7/04 | | | 21.4 | 7.9 | 17.9 | 27.5 | 23.5
 | H04B 7/02 | H04B 7/10 | | | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
 | H04B 7/02 | H04B 7/10 | | | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
 | H04B 7/02 | H04B 7/10 | | | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
 | H04B 7/02 | H04B 7/12 | | | | 50.0 | 0.0 | 0.0 | 33.3
 | H04B 7/02 | H04B 7/12 | | | | 50.0 | 0.0 | 0.0 | 33.3
 | H04B 7/02 | H04B 7/12 | | | | 50.0 | 0.0 | 0.0 | 33.3
 | H04B 7/02 | | | | 20.5 | 7.5 | 29.9 | 21.7 | 28.2
 | H04B 7/02 | | | | 20.5 | 7.5 | 29.9 | 21.7 | 28.2
 | H04B 7/02 | | | | 20.5 | 7.5 | 29.9 | 21.7 | 28.2
 | H04B 7/02 | | | | 19.2 | 8.4 | 23.4 | 24.9 | 24.4
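The division of Table 46 by Table 49, and the rule that a share must be recomputed after a roll-up rather than summed, can be checked with a short sketch. The counts are copied from Tables 46 and 49. The key "H04B 7/04 (own)" is a hypothetical label for the rows holding the documents classified directly in H04B 7/04, and returning `None` for a zero denominator is an assumption of this example.

```python
# Yearly counts (columns 01..05) copied from Table 46 (applicant) and
# Table 49 (all applicants) for H04B 7/04 and the codes below it.
APPLICANT = {
    "H04B 7/06": [0, 0, 3, 17, 12],
    "H04B 7/08": [3, 1, 5, 4, 3],
    "H04B 7/04 (own)": [3, 2, 2, 21, 24],
}
ALL = {
    "H04B 7/06": [10, 15, 17, 42, 42],
    "H04B 7/08": [7, 11, 20, 32, 38],
    "H04B 7/04 (own)": [11, 12, 19, 79, 86],
}


def share(num, den):
    """Share as a percentage, or None when the denominator is zero."""
    return None if den == 0 else round(100 * num / den, 1)


def rolled_up_share(applicant, everyone):
    """Roll up the counts first, then apply the share definition per year."""
    num = [sum(col) for col in zip(*applicant.values())]
    den = [sum(col) for col in zip(*everyone.values())]
    return [share(n, d) for n, d in zip(num, den)]
```

`rolled_up_share(APPLICANT, ALL)` gives 21.4, 7.9, 17.9, 27.5 and 23.5, which matches the rolled-up H04B 7/04 row of Table 50. Adding the three lower-level shares for year 01 instead (0.0 + 42.9 + 27.3) would give about 70, which is not a meaningful share.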

When data such as the multi-dimensional analysis operation result table of Table 50 is given, the multi-dimensional analysis operation result table generation module 402 can generate share-based ranking information for each level of the applicant's patent classification codes. The ranking information is generated by comparing the multi-dimensionally calculated share values, on an application or registration document basis, at each level of at least one patent classification code of the applicant. For example, in the result table data for multi-dimensional analysis, comparing the share value based on the number of multi-dimensionally calculated documents of H04B 7/02 at the 1-dot subgroup level (C5 level) of applicant A with the share values of the other patent classification codes at the same 1-dot subgroup level of applicant A yields the share ranking of each patent classification code at the 1-dot subgroup level of applicant A.

Similarly, the multi-dimensional analysis operation result table generation module 402 of the present invention may generate a table of concentration rates. The concentration rate for a particular subject may be defined as "the number of documents in the document set of a particular area of that subject divided by the number of documents in the document set of the upper (e.g. total) area of that particular area of that subject". When calculating the concentration rate of company A in the H04B 7/00 technical area in Korea based on the number of applications, the numerator is the number of documents that include H04B 7/00 among the patent documents filed by company A in Korea, and the denominator may be the number of patent documents filed by company A in Korea. The concentration rate may be calculated for each year or for each quarter. For example, to determine the yearly concentration rate of company A's applications in the United States by IPC-based technology area (e.g. at the IPC subclass level), if there is a table containing company A's overall yearly application counts in the U.S., or yearly application counts at the IPC class level, the values of this table and the values of the table holding the yearly application counts for each IPC-based technology area may be processed to obtain the desired value.

Similarly, the multi-dimensional analysis operation result table generation module 402 of the present invention may generate a table for the activity rate (AI). If the activity rate is defined as "{number of documents of a specific subject in a specific area / total number of documents of the specific subject} / {number of documents of the entire set in the specific area / total number of documents in the entire set}", the module 402 may generate a table for the activity rate by referring to the table information (in the extreme case, single numerical values) generated for each numerator and denominator. The activity rate can also be generated per year or over several years combined, and activity rate data can be created at every level, taking the lower patent classification codes into account, for all patent classification codes belonging to a specific patent classification system.
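The concentration rate and activity rate definitions above reduce to simple ratios of four counts. The sketch below is illustrative only: the function and parameter names are assumptions, and zero denominators are not handled.

```python
def concentration_rate(subject_in_area, subject_total):
    """Documents of the subject in an area / all documents of the subject."""
    return subject_in_area / subject_total


def activity_index(subject_in_area, subject_total, all_in_area, all_total):
    """Concentration of the subject in the area / concentration of the entire set."""
    return (subject_in_area / subject_total) / (all_in_area / all_total)
```

For example, with hypothetical counts of 20 documents in an area out of 100 for a subject, and 500 out of 10000 for the entire set, the concentration rate is 0.2 and the activity rate is 4, meaning the subject is four times as concentrated in that area as the entire set. Each of the four inputs would be read from a rolled-up count table such as Table 46 or Table 49.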

For the analysis indexes mentioned in this specification other than those described above, and for other analysis indexes as well, the present invention may generate rankings by level of the applicant's patent classification codes, by country, by application document basis / registered document basis, and by type of patent classification code.

The same process may be applied in the same manner to various other patent analysis indexes. If n kinds of data are used to define a patent analysis index (1 for total quantity, 2 for share and concentration rate, 4 for activity rate), the required values are found in n multi-dimensional analysis operation result tables, or in fewer than n tables when two or more kinds of data are included in one result table, and are calculated according to the definition of the patent analysis index to obtain the desired value. When the values of each table are associated with a dimension value such as a patent technology classification code, the values are rolled up along the dimension axis. In particular, when the dimension attribute is a patent technology classification code, the result table used is one in which the value for a patent classification code at a given level reflects the dot structure contained in the title information of the code and is rolled up to include the values of all of its lower patent classification codes.

Other Patent Analysis Indicators

The following analysis indicators may also be used. For each analysis indicator, a multi-dimensional analysis operation result table may be generated in the same way as the multi-dimensional analysis operation result table for total quantity analysis (205-2), for citation analysis (205-3), for competition analysis (205-4), for inventor analysis (205-5), for analysis by patent technology classification (205-6), for fusion analysis (205-7), and for representative phrase analysis (205-8). It is of course preferable that detailed result tables based on application / registration total quantity, share, concentration rate, and activity rate are generated within each of these result tables. The following analysis indicators differ from indicators such as total quantity, share, concentration rate, and activity rate only in their formulas. Given a formula, the mechanism of obtaining the value of each element of the formula and computing the obtained values according to the formula to generate the result table for multi-dimensional analysis is completely the same or at least equivalent, and implementing the formula for each analysis index is extremely easy for a person skilled in the art.
Therefore, only the analysis indicators are presented below. The existence, operating principle, operating sequence, specific form of the generated tables, and method (or logic) of accessing the tables for the result table generation module of each analysis indicator will be apparent to those skilled in the art and are not described separately. The analysis indicators are as follows.

1. Intensity analysis indicators of technological innovation activities

1) Revealed Technological Advantage (RTA)

The RTA is one of the most widely used indexes for understanding the status of technology specialization. It provides information on which technology fields the innovation activities of the subject of interest are concentrated in, compared with other subjects. The RTA index is generally known as the Activity Index (AI), and is also used under various names such as Specialization Index, Technological Comparative Advantage (TCA), and Technological Revealed Comparative Advantage (TRCA).

The RTA index is calculated by the following formula. In the formula, the numerator represents the proportion of field i among the patents of subject j, and the denominator represents the proportion of field i among all patents.

(Pij is the number of patents of subject j in field i)
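Based on the description of the numerator and denominator and on the definition of Pij, the RTA index can be written as follows. This is the commonly used form of the index, given here as a sketch. The summation ranges (all fields i, all subjects j) are assumptions consistent with the text.

```latex
\mathrm{RTA}_{ij} =
  \frac{P_{ij} \Big/ \sum_{i} P_{ij}}
       {\sum_{j} P_{ij} \Big/ \sum_{i}\sum_{j} P_{ij}}
```

A value above 1 indicates that subject j is relatively specialized in field i, and a value below 1 indicates the opposite.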

2) Revealed Patent Advantage

The RPA is an indicator that shows the degree of concentration or specialization in a specific technology field, like the RTA index. The analytical meaning is the same as the RTA index, but it is designed to overcome skewness of the RTA index and to ensure the normality of the index.

The relationship between RPA and RTA and the formula for calculating the RPA value are as follows.
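In the literature, the RPA is commonly defined as a hyperbolic-tangent transform of the logarithm of the RTA. The form below is that commonly used definition, given as an assumption. The scaling factor of 100 in particular may differ from the formula intended here.

```latex
\mathrm{RPA}_{ij} = 100 \cdot \tanh\!\left(\ln \mathrm{RTA}_{ij}\right)
                  = 100 \cdot \frac{\mathrm{RTA}_{ij}^{2} - 1}{\mathrm{RTA}_{ij}^{2} + 1}
```

Under this form the index is bounded between -100 and +100 and is symmetric around 0, which addresses the skewness of the RTA index mentioned above.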

3) Concentration Ratio n (CRn)

The Concentration Ratio (CRn) was originally an indicator used to assess the degree of monopoly in a market. By applying this indicator to patent information, we can obtain information on technological monopoly in a given industry sector, which can be used to gauge the intensity of technological competition in that sector. The CRn index is also called the Cn index.

Originally, the CRn index means the sum of the market shares of the top n companies in one industry sector. Moving from the market side to the technology side and using patent share instead of market share to grasp the intensity of technological competition, it can be defined as follows.

(Si is the patent share of company i, Ni is the number of patents of company i, N is the total number of patents)
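With the symbols defined above, and assuming the companies are indexed in descending order of patent share, the definition can be written as:

```latex
CR_n = \sum_{i=1}^{n} S_i, \qquad S_i = \frac{N_i}{N}
```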

4) Herfindahl Index (HHI)

Like the CRn index, the Herfindahl Index (HHI) was originally an indicator used to evaluate the degree of market monopoly and the intensity of competition. By using patent data in the HHI analysis, in the same way as the CRn index was adapted to patent information above, useful information can be obtained on the state of technological monopoly and the intensity of technological competition. The Herfindahl Index is also called the Herfindahl-Hirschman Index (HHI).

The formula for calculating HHI using patent data is as follows.

(Si is the patent share of company i, m is the total number of companies in the industry, Ni is the number of patents of company i, N is the total number of patents)
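With the symbols defined above, the standard form of the index is the sum of squared shares. Whether the shares are expressed as fractions (maximum value 1) or percentages (maximum value 10000) is not stated in the text. The fractional form is assumed here.

```latex
HHI = \sum_{i=1}^{m} S_i^{2}, \qquad S_i = \frac{N_i}{N}
```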

2. Technical level analysis index

1) Patent Count Weighted by Citations

The patent count weighted by citations combines the number of patents, which shows the quantitative aspect of technological innovation activities, with citation counts, which relate to the importance or value of innovation performance, and thereby provides more meaningful information for assessing the technical aspects of innovation performance. Various methods of weighting by citation information may be proposed. Here, the weighted patent count (WPC) of M. Trajtenberg, which uses the simplest weighting method, is introduced.

The weighted patent count (WPC) used by M. Trajtenberg is calculated as follows.

(nt is the number of patents registered in year t, Ci is the number of citations of patent i)
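With the symbols defined above, Trajtenberg's linear weighting scheme counts each patent once plus once for every citation it receives, which can be written as:

```latex
WPC_t = \sum_{i=1}^{n_t} \left(1 + C_i\right)
```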

2) Cites per Patent (CPP)

The number of citations per patent (CPP) is an indicator of how much the patents of the analysis target (country, company, etc.) have influenced subsequent technological innovation activities. This indicator allows us to examine the technical significance of individual patents, as well as the level of technological innovation activity and the value of innovation performance of a particular country or company. The name CPP is the one commonly used by CHI Corporation in the United States, but the underlying concept has long been used under various names and is not an indicator developed by CHI Corporation.

CPP is calculated as follows.
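As a sketch of the calculation, where the symbols are assumptions introduced here (n is the number of patents registered in the period under analysis, and Ci is the number of times patent i is cited by subsequent patents):

```latex
CPP = \frac{1}{n} \sum_{i=1}^{n} C_i
```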

In short, CPP is the average number of times that patents registered in a particular year (or period) are cited by subsequent patents.

3) Patent Impact Index (PII)

The Patent Impact Index (PII) is an indicator that can be used to assess the qualitative level of the technological innovation performance of a particular country or company. PII analysis can be used to determine how important the innovation performance of the particular country or company under analysis is relative to the average technology level in the field.

PII is calculated as the ratio of the average number of times the patents of a specific country or company are cited to the average number of times all patents in the technology field under analysis are cited.

(Ca is the number of citations of the patents of a, Na is the number of patents of a, Ct is the number of citations of all patents, Nt is the total number of patents)
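With the symbols defined above, the ratio described in the text can be written as:

```latex
PII_a = \frac{C_a / N_a}{C_t / N_t}
```

A value above 1 means that the patents of a are cited more often, on average, than patents in the field as a whole.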

4) Current Impact Index (CII)

The Current Impact Index (CII) provides information on the technical impact, at the present time, of the technological innovation performance achieved over the past five years by the particular subject of interest. This index provides a glimpse into the technical significance of the recent innovation performance of a specific subject and into its technical capability.

CII is an index that expresses how often the patents of a specific subject (country, company, etc.) produced in the five years preceding the present are cited at the present time (year), as a value relative to the overall citation frequency. The CII is calculated as follows (in the formula below, the specific country or company to be analyzed is represented by A).

t: past 5 years based on current year

rt: the number of times the year t patent in A is cited on average in the current year

Rt: the number of times all patents of year t are cited on average in the current year

ct: the total number of times the year t patent in A has been cited in the current year

nt: Number of patents in year t of A

Ct: the total number of times a patent in year t is cited in the current year

Nt: Total patents in year t

5) Technology Strength Index

The Technology Strength (TS) index is an indicator used to examine the technical capability of a particular country or institution. The TS index provides information about the technical capability of a particular country or institution, taking into account both the average level of individual technical achievements and the quantitative aspect of technical performance.

TS for a particular year is defined as CII times the number of patents.

(CIIi is the CII value of i of the year, Ni is the number of patents of i of the year)

6) Technology Cycle Time (TCT)

The Technology Cycle Time (TCT) index provides information about the pace of technological development and the speed of innovation activities. It can be used to examine whether technological development in a specific field of technology, or the technological innovation activities of a specific subject, builds on recent research results or on long-past research results. The TCT index is used under various names, such as the technology life cycle index.

The cycle of a technology can be measured in a variety of ways. Here, a method of measuring the technology cycle is introduced with a focus on the TCT index. The TCT index of CHI is defined as follows.

"The median of the differences between the publication year of the citing patent and the publication years of the patents it cites (median age)"
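The median-age definition can be sketched for a single citing patent as follows. The function name and the sample years are hypothetical. How the per-patent values are aggregated over a whole portfolio is not specified in the text and is left out.

```python
from statistics import median


def technology_cycle_time(citing_year, cited_years):
    """Median age, in years, of the patents cited by one citing patent."""
    return median(citing_year - year for year in cited_years)
```

For a patent published in 2007 that cites patents published in 2005, 2001, 1999 and 2006, the ages are 2, 6, 8 and 1 years, so the median age is 4 years. A smaller value suggests that the technology builds on more recent results.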

7) Science Linkage (SL)

The Science Linkage (SL) index shows how closely a patented technology is related to scientific research. It indirectly indicates which countries or companies are leading in the industry and are focusing on basic research or the development of original technologies.

SL is defined as the average number of scientific papers cited by the patents analyzed.

(nt is the number of patents registered in year t, Si is the number of scientific and technical papers cited by patent i)
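With the symbols of the note above, the average can be written as follows (a reconstruction from the verbal definition):

```latex
SL = \frac{1}{n_t} \sum_{i=1}^{n_t} S_i
```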

8) Average Claims per Patent

The number of claims has attracted attention as an indicator for measuring the breadth and scope of the invention covered by a patent. To date, however, there have been few attempts to measure the level of technology based on the number of claims; it is mainly used, along with other indicators (citation index, family index, etc.), as a tool for measuring the value of a patent or the likelihood of a dispute.

The number of claims per patent is calculated as the average number of claims of the patents being analyzed.

(nt is the number of patents registered in year t, Ci is the number of claims of patent i)
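In the same notation, the average number of claims can be written as follows (again a reconstruction from the verbal definition):

```latex
\text{Claims per patent} = \frac{1}{n_t} \sum_{i=1}^{n_t} C_i
```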

9) Family Size

The size of a patent family directly indicates the regional protection scope of the patent, and indirectly provides information about the technical significance and innovation performance of the patent.

The size of a patent family can be defined in various ways. Here, it is defined as the number of countries in which the patent family is formed.

(N is the number of patents, Fi is the number of countries in which the family of patent i is formed)
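Because the note defines both N and Fi, one plausible reading is that the indicator for a set of patents is the average family size over that set. Under that assumption it can be written as:

```latex
\text{Family size} = \frac{1}{N} \sum_{i=1}^{N} F_i
```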

3. Cooperative Relationship and Knowledge Flow Analysis Indicators

1) Number of Joint Patents (patent applications with co-applicants, with co-inventors)

The number of joint patents provides quantitative information on the state of cooperation among research subjects involved in technological innovation activities. Joint patents can be broadly divided into two kinds. One reflects cooperation in the ownership of innovation results, that is, the number of patents whose ownership is shared; the other reflects cooperation in the actual innovation activities, that is, the number of patents resulting from joint invention.

The number of joint patents can be defined in two ways.

Number of jointly owned (filed) patents

"Number of patents jointly owned (filed) by two or more subjects"

Number of joint invention patents

"Number of patents jointly invented by two or more inventors"

2) Salton's Index

Salton's Index shows the strength of different types of partnerships, including cross-border, regional, and institutional collaboration. It is also called Salton's measure, Salton's cosine formula, or Salton's cosine measure.

The Salton index is calculated as follows.

(Pij is the number of joint patents of i and j, Pi is the number of patents of i, Pj is the number of patents of j)
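Salton's cosine measure is conventionally the number of joint items divided by the geometric mean of the two totals. The sketch below applies that conventional form to the symbols in the note above; the function name and the zero-handling are illustrative choices.

```python
import math

def salton_index(p_ij, p_i, p_j):
    """Salton's cosine measure: P_ij / sqrt(P_i * P_j).

    p_ij: number of joint patents of i and j
    p_i, p_j: total number of patents of i and of j
    Returns 0.0 when either party has no patents.
    """
    if p_i == 0 or p_j == 0:
        return 0.0
    return p_ij / math.sqrt(p_i * p_j)

# 6 joint patents between parties holding 9 and 16 patents: 6 / sqrt(144) = 0.5
print(salton_index(6, 9, 16))
```

The value lies between 0 (no joint patents) and 1 (every patent of both parties is a joint patent).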

3) Brain Gain, Brain Drain

The analysis of workforce inflow and outflow rates is used to determine to whom the results of research activities are attributed. It grasps the direction of knowledge flow by focusing on the human side of the technological innovation process.

The workforce inflow rate and outflow rate can generally be defined as follows.

Workforce inflow rate

"Proportion of the patents owned by residents (or nationals) that include the research activity of a non-resident (or foreign) inventor"

Workforce outflow rate

"Proportion of the patents including the research activity of a resident (or national) inventor that are owned by non-residents (or foreigners)"
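The two rates can be sketched in Python as follows. The record layout (one owner country and a list of inventor countries per patent) is an assumption made for illustration; real bibliographic data may list several owners per patent.

```python
def inflow_rate(patents, country):
    """Share of patents owned in `country` that involve at least one foreign inventor."""
    owned = [p for p in patents if p["owner"] == country]
    if not owned:
        return 0.0
    with_foreign = [p for p in owned if any(c != country for c in p["inventors"])]
    return len(with_foreign) / len(owned)

def outflow_rate(patents, country):
    """Share of patents with an inventor from `country` that are owned abroad."""
    invented = [p for p in patents if country in p["inventors"]]
    if not invented:
        return 0.0
    foreign_owned = [p for p in invented if p["owner"] != country]
    return len(foreign_owned) / len(invented)

# Made-up sample records.
patents = [
    {"owner": "KR", "inventors": ["KR", "US"]},
    {"owner": "KR", "inventors": ["KR"]},
    {"owner": "US", "inventors": ["KR"]},
    {"owner": "US", "inventors": ["US"]},
]
print(inflow_rate(patents, "KR"), outflow_rate(patents, "KR"))
```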

4) Index for Knowledge Flow with Patent Citations

Analysis of knowledge flow using patent citation information can be used to identify how "knowledge" or intangible "technical information" spreads. This analysis can be applied in various ways depending on its purpose.

Citation indexes for identifying trends in knowledge flow are difficult to describe as a standardized index. The following description focuses on analytical indicators and methods for identifying knowledge flows between countries.

Citation Relationship Index for Understanding Knowledge Inflow Status

"Share by country of the backward citations cited by the patents of a given country"

Citation Relationship Index for Knowledge Transfer

"Share by country of the forward citations citing the patents of a given country"

In the following, further applicable analysis indicators are introduced; it will be obvious to those skilled in the art that they are treated as fully equivalent to the analysis indicators described above.

1. Technology Attractiveness: It shows the attractiveness of a specific technology field and whether it is a promising technology.

Analytical indicators

1) RGR (Relative Growth Rate): the growth rate of the number of patent applications in one technology field relative to the average growth rate of the number of patent applications across all technology fields within a specific period.

$PA_{F(t_2 - t_1)}$ is the number of patents filed between times $t_1$ and $t_2$ in the specific technical field F.

$d(PA_{F(t_2 - t_1)})$ represents the growth rate of the number of patents filed during this period.

2) RDGR (Relative Development of Growth Rate): an indicator that takes changes over time into account when calculating growth rates. For a particular technical field, two different time intervals, $t_3$ to $t_4$ and $t_1$ to $t_2$, are compared with each other, and the result is again measured relative to the entire technical field.

3) RCT (Relative Technology Cycle Time): the TCT value of a specific technology field relative to the average TCT value of the entire technology field.

TCT (Technology Cycle Time): an index using patent citation information. It is the median of the patent citation ages; the smaller the TCT, the faster the technological development and the greater the technical attractiveness.
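Read literally, the verbal definitions of RGR and RCT above correspond to the ratios below. This is one possible reading, not the original equations: the set of all technical fields $\mathcal{F}$ and the averaging over it are assumed notation, and the exact form of the growth rate $d(\cdot)$ is left open.

```latex
RGR_F = \frac{d\left(PA_{F(t_2 - t_1)}\right)}
             {\dfrac{1}{|\mathcal{F}|} \sum_{F' \in \mathcal{F}} d\left(PA_{F'(t_2 - t_1)}\right)},
\qquad
RCT_F = \frac{TCT_F}{\dfrac{1}{|\mathcal{F}|} \sum_{F' \in \mathcal{F}} TCT_{F'}}
```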

2. Technology Activity: It shows the quantitative aspect of R&D activity and the R&D performance of a company. It is generally measured by the number of patents a company files, or the number of its patents registered, within a certain period.

Analytical indicators

1) Relative Patent Activity (RPA): the number of patents filed by a company in a particular technical field relative to the average number of patents filed by all companies in that field.

$PA_{iF}$ refers to the number of patents filed by company i in a particular technical field F.

2) Relative Patent Position (RPP): the number of patents filed by a company in a specific technical field relative to that of the most active competitor in this field.

3) Revealed Technology Advantage (RTA): the relative value of the share of patents filed by a company in a particular technology sector and the share of patents filed by the company in all technical fields.
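The first two activity indicators, RPA and RPP, reduce to simple ratios over per-company filing counts in one technical field. The sketch below follows the verbal definitions given here; the function names and the dictionary layout are assumptions, and other sources define transformed variants of these indicators.

```python
def relative_patent_activity(counts, company):
    """RPA: a company's filings in a field divided by the average over all companies."""
    average = sum(counts.values()) / len(counts)
    return counts[company] / average if average else 0.0

def relative_patent_position(counts, company):
    """RPP: a company's filings divided by those of its most active competitor."""
    rivals = [n for name, n in counts.items() if name != company]
    top = max(rivals, default=0)
    return counts[company] / top if top else 0.0

# Made-up filing counts of three companies in one technical field F.
counts = {"A": 30, "B": 60, "C": 30}
print(relative_patent_activity(counts, "A"))  # 30 / 40
print(relative_patent_position(counts, "A"))  # 30 / 60
```

An RPP above 1 means the company itself is the most active filer in the field.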

3. Technology Quality: It reflects the R&D performance of the company and allows a qualitative evaluation of the economic value and effectiveness of the company's R&D activities.

Analytical indicators

1) GR (Grant Rate): Indicates the share of registered patents among applied patents. The higher the ratio, the higher the Patent Quality.

2) TS (Technological Scope): It indicates the number of claims in the patent document of the applied patent. Also, the larger the TS, the higher the Patent Quality.

3) CR (Citation Ratio): One of the most frequently used indicators in the evaluation of patent value. It is an indicator that evaluates patent value by the number of times a patent is cited by a patent filed later.

4. Technology Priority: Shows how important a specific technology sector is to the company among the various technology sectors in which it is engaged.

Analytical indicators

1) Importance of Technological Field (ITF): The ratio of the number of patents applied in a specific technical field to the total number of patents filed by a company within a certain period of time.
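Using the PA notation introduced for the activity indicators, the ITF ratio can be written as follows. The sum over all technical fields F' of company i is assumed notation for "the total number of patents filed by a company".

```latex
ITF_{iF} = \frac{PA_{iF}}{\sum_{F'} PA_{iF'}}
```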

5. Technology Collaboration: A measure of the degree of cooperation of a company, which consists of two parts: cooperation with other companies and cooperation among members of the company.

Analytical indicators

1) Internal Collaboration (IC): the degree of cooperation among members of a company, expressed as the share, in the company's total number of patents filed in a particular technical field, of patents filed jointly by members of the company located at different sites.

$PA_{iiF}$ is the number of patents that company i has filed with internal members at different locations in a particular technical field F.

2) External Collaboration (EC): the number of patents that a company files jointly with other companies, calculated as a share of the total number of patents filed by the company in this technical field.

$PA_{ijF}$ represents the number of patents that companies i and j jointly filed in the technical field F.
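Both collaboration indicators are shares of the company's filings in field F, so under the definitions above they can be written as below. This is a reconstruction; note that the sum over partners j counts a patent with several co-applicants once per partner unless such patents are deduplicated first.

```latex
IC_{iF} = \frac{PA_{iiF}}{PA_{iF}},
\qquad
EC_{iF} = \frac{\sum_{j \neq i} PA_{ijF}}{PA_{iF}}
```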

Reference ranking generation method

The method of generating the rankings is described in more detail below. When Applicant A is obtained (identified), the document set of Applicant A in the first or second country can be obtained, and the IPC and other classification codes can be extracted from that document set. The number of documents can then be counted for each extracted IPC at each level of the IPC hierarchy (from the section down to the n-dot subgroup), and the counted values can be used to calculate, at each IPC level, the rankings of the fields by number of applications / registrations, concentration rate, activity rate, and the like. (The generation of ranking information for each patent classification code level of reference applicant A, by patent indicator, has been described above.) The calculated rankings make it possible to extract highly ranked fields, and at least one extracted patent classification code, such as an IPC, may be regarded as a highly ranked technical field. The method described above, given one subject (or, by extension, one attribute), extracts the ranking of the most frequent patent classification codes at each level for at least one patent classification code system. (For inventors, agents, and the like, rankings by patent classification code level can be generated by the same method from the document set that includes their name.)

FIGS. 17 to 25 show that, for each applicant, reference rankings are generated at each level of a patent classification system such as the IPC, by number of applications / registrations, share, concentration rate, and / or activity rate, per country (Korea, USA, Japan, Europe, etc.), on an application basis or a registration basis. In FIG. 17, for Samsung Electronics, H01L ranks first and H04N second at the IPC subclass level based on Korean application documents.

This will be described in more detail as follows. The ranking of the most frequent patent classification codes at each level, in the first or second country, can be extracted by the following steps:
1) determining at least one document set that shares an attribute (e.g., the applicant is A, the inventor is a or the agent is B of applicant A, certain keywords are commonly included, a certain time period is common, or the documents cite or are cited by a given document set; when at least one attribute is shared, the partial document set sharing the attribute can be produced from the entire document set through an SQL query or a query to the search engine);
2) extracting at least one corresponding patent classification code from each individual document constituting the document set;
3) obtaining all higher patent classification codes with reference to the patent classification code system;
4) storing all of the higher patent classification codes by level of the patent classification code (e.g., in the case of the IPC, from the section to the n-dot subgroup);
5) rolling up and counting, by level, the individual documents containing the stored patent classification codes;
6) performing the calculation for each predetermined analysis index with reference to the counting result (total amount, concentration rate, activity rate, etc.); and
7) calculating the ranking for each analysis index, taking the rollup for each level of the patent classification code into account (for example, with an SQL statement).
The document set may share a specific attribute, such as an applicant attribute (e.g., Applicant A, which may be a plurality of applicants), a time limitation such as the last five years, an inventor limitation such as Inventor C, and / or a combination of these (such as inventor C of applicant A). It will also be self-evident that, for applicant-specific document sets, steps 1) to 7) may be carried out for every applicant belonging to the first country.
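Steps 2) to 7) can be illustrated with a minimal in-memory Python sketch for the total-amount index. Everything named here is hypothetical: the `PARENT` mapping stands in for a small excerpt of a classification code master, the document set is made up, and step 6) is reduced to plain counts. The section, class, and subclass are taken from the first characters of the IPC symbol, while the dot hierarchy below the main group has to come from the master data.

```python
from collections import Counter, defaultdict

# Hypothetical excerpt of a classification code master: subgroup -> parent group.
PARENT = {
    "H04B 7/02": "H04B 7/00",
    "H04B 7/04": "H04B 7/02",
    "H04B 7/06": "H04B 7/04",
    "H04B 7/08": "H04B 7/04",
}

def ancestors(code):
    """Step 3: return the code followed by all of its higher codes up to the section."""
    chain = [code]
    while chain[-1] in PARENT:              # walk up the dot hierarchy
        chain.append(PARENT[chain[-1]])
    symbol = chain[-1].split()[0]           # e.g. "H04B" from "H04B 7/00"
    for higher in (symbol[:4], symbol[:3], symbol[:1]):  # subclass, class, section
        if higher not in chain:
            chain.append(higher)
    return chain

def level(code):
    """Step 4: 1 = section, 2 = class, 3 = subclass, 4 = main group, 5 = 1-dot, ..."""
    return len(ancestors(code))

def rollup_counts(doc_codes):
    """Step 5: count each document once under every code it falls under."""
    counts = Counter()
    for codes in doc_codes.values():
        covered = set()
        for code in codes:                  # step 2: codes of one document
            covered.update(ancestors(code))
        counts.update(covered)
    return counts

def rank_by_level(counts):
    """Step 7: rank codes within each level by rolled-up document count."""
    by_level = defaultdict(list)
    for code, n in counts.items():
        by_level[level(code)].append((code, n))
    return {lvl: sorted(rows, key=lambda r: (-r[1], r[0]))
            for lvl, rows in by_level.items()}

# Made-up document set for one applicant: document id -> classification codes.
docs = {"d1": ["H04B 7/06"], "d2": ["H04B 7/06", "H04B 7/08"], "d3": ["H01L 21/00"]}
ranking = rank_by_level(rollup_counts(docs))
print(ranking[3])   # ranking at the subclass level
```

Collecting the covered codes of a document in a set before counting prevents a document that carries two sibling codes from being counted twice under their common parent.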

The multi-dimensional analysis operation result table generation module 402 of the present invention may perform one or more of steps 1) to 7) to generate the multi-dimensional analysis operation result table data. When the document set of step 1) is determined automatically, as for an applicant, an agent, or an inventor, steps 1) to 7) may be performed in sequence; when a document set is given, steps 2) to 7) may be performed on it. Alternatively, steps 2) to 7) may be performed for each analysis index on all application / registration documents of one country, and the results stored as the data. Each patent analysis index corresponds to a calculation formula, and to an SQL statement (or, in complex cases, an application) that retrieves from the database the information making up the formula and performs the necessary calculation (a repeated description is omitted). The patent analysis indexes may be provided in the system 1 in which the present invention is implemented and may be selected by the user. When the user of the system selects one or more of a specific subject, a specific country, a specific index, a type of patent classification code, and a level of patent classification code, an SQL query can be provided that extracts the result corresponding to the selected combination. In general, the specific subject, the specific country, the type of patent classification code, the level of the patent classification code, and so on correspond to conditions in the WHERE clause of the SQL statement. Meanwhile, when the module 402 generates multi-dimensional analysis operation result table data for each specific index, it is desirable that the target table designated in the SQL statement be the table data holding the result of the multidimensional analysis calculation for that index.
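How the user's selections turn into WHERE conditions can be sketched with Python's built-in sqlite3 module. The table name `result_total`, its columns, and the sample rows are all hypothetical; the sketch assumes one precomputed result table per analysis index, here the total amount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical result table for the "total amount" index (one table per index).
conn.execute("""CREATE TABLE result_total (
    app_name TEXT, country TEXT, code_type TEXT,
    code_level INTEGER, code TEXT, year INTEGER, cnt INTEGER)""")
conn.executemany(
    "INSERT INTO result_total VALUES (?, ?, ?, ?, ?, ?, ?)",
    [   # made-up sample rows
        ("A", "KR", "IPC", 3, "H01L", 2005, 120),
        ("A", "KR", "IPC", 3, "H01L", 2006, 150),
        ("A", "KR", "IPC", 3, "H04N", 2005, 90),
        ("A", "US", "IPC", 3, "H04N", 2005, 300),
        ("B", "KR", "IPC", 3, "H01L", 2005, 500),
    ],
)

def top_codes(conn, app_name, country, code_type, code_level, limit=10):
    """Rank codes for one subject; each user selection is a bound WHERE condition."""
    sql = """SELECT code, SUM(cnt) AS total
             FROM result_total
             WHERE app_name = ? AND country = ? AND code_type = ? AND code_level = ?
             GROUP BY code
             ORDER BY total DESC, code
             LIMIT ?"""
    return conn.execute(sql, (app_name, country, code_type, code_level, limit)).fetchall()

print(top_codes(conn, "A", "KR", "IPC", 3))
```

Binding the selections as parameters rather than concatenating them into the statement keeps the query text fixed per index, so switching to another index only changes the target table.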

FIGS. 17 to 31 are exemplary views in which all of the above descriptions are implemented.

FIG. 17 is a diagram of an exemplary analysis result of yearly application counts for the most-filed IPCs at the IPC subclass level, for Samsung Electronics Co., Ltd., based on documents filed in Korea. Note that the rankings at the IPC subclass level are generated in the left column. The numerical values shown in this exemplary embodiment are produced by the computation applied to the values input to the analysis module and vary as data are supplemented, deleted, or changed, so they may differ from actual values at a specific point in time. It is therefore clear that the inventive concept of the present invention lies not in the numerical values shown in the various tables of this specification, but in the structure, tools, methods, information processing procedures, system, method of use, and the like that generate those values. The same applies below.

FIG. 18 is an exemplary diagram of the analysis result generated when drilling down into H01L. On drill-down, analysis values are generated only for the part drilled into. Using AJAX technology, only the values for the portion where the change (drill-down) occurs can be generated and provided quickly, without reloading the entire page. In the present invention, "the user" should be interpreted as the user computer 300 in relation to the system of the present invention. That is, from the system's point of view, 1) what is provided to the user is actually sent to the user's computer, and 2) what the user enters is actually sent from the user's computer. Likewise, all of the user's designations, presets, and selections are actually transmitted from the user's computer. They are described as actions of the user only for simplicity of description.

FIG. 19 is an exemplary diagram of the analysis result generated when drilling down into H01L 21/00 and its lower patent classification codes. The drill-down shows the analysis results based on the total amount. Drill-down is possible to the lowest level of the IPC, as long as data is available. The same is true for other patent classification codes.

FIG. 20 is a diagram of an exemplary analysis result of yearly application counts for the most-filed IPCs at the IPC main group level, for Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention. When the main group is selected as the reference IPC level, the most frequent IPCs are extracted at the main group level, and the analysis result for the extracted IPCs is generated and provided to the user. The reference IPC level can be extended down to the n-dot subgroup, and the same applies to other patent classification codes.

FIG. 21 is a diagram of an exemplary analysis result of yearly application counts for the most-filed IPCs at the IPC 1 dot subgroup level, for Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention. When the 1 dot subgroup is selected as the reference IPC level, the most frequent IPCs are extracted at the 1 dot subgroup level, and the analysis result for the extracted IPCs is generated and provided to the user.

FIG. 22 is a diagram of an exemplary analysis result of yearly application counts for the most-filed IPCs at the IPC subclass level, based on the application documents of Samsung Electronics Co., Ltd. filed in the United States, among all applicants in the DB held by the patent information system 1 of the present invention.

FIG. 23 is a diagram of an exemplary analysis result of yearly counts for the most-filed IPCs at the IPC main group level, based on the documents of Samsung Electronics Co., Ltd. registered in the United States, among all applicants in the DB held by the patent information system 1 of the present invention.

FIG. 24 is a diagram of an exemplary analysis result of yearly application counts for the most-filed codes at the USPC no dot (subclass) level, for General Motors based on documents filed in the US, among all applicants in the DB held by the patent information system 1 of the present invention.

FIG. 25 is a diagram of an exemplary analysis result of yearly application counts for the most-filed codes at the 1 dot level, for General Motors based on documents filed in the US, among all applicants in the DB held by the patent information system 1 of the present invention.

FIG. 26 is an exemplary diagram of total application amount analysis and drill-down for IPC H04B, based on Korean patent application documents. This analysis result is independent of any applicant.

FIG. 27 is an exemplary diagram of total application amount analysis and drill-down for IPC H04B, based on US application documents.

FIG. 28 is a diagram of a total-amount analysis of the most-filing companies for IPC H04B, based on Korean patent documents. In this analysis, when a technical classification code is given for a country, the analysis index values for that code are generated and provided to the user.

FIG. 29 is an example of technical field analysis using the patent classification code of the present invention, and is an exemplary view of the top applicants by share for IPC H04B across all Korean applications.

FIG. 30 is an example of technical field analysis using the patent classification code of the present invention, and is an exemplary view of the top applicants by activity rate for IPC H04B, based on Korean application documents.

FIG. 31 is an example of technical field analysis using the patent classification code of the present invention, and is an exemplary view of total application amount analysis, including drill-down, for IPC H04B and its sub-classifications across all US application documents.

Multidimensional analysis operation result table for competitive analysis

In order to perform competitive analysis, the multi-dimensional analysis operation result table generation module 402 generates multi-dimensional analysis operation result table data for the competitive analysis. How it does so is described below.

Type of competition (subject, technology)

In the spirit of the present invention, there are two types of competition. The first is competition from the viewpoint of a subject: an applicant, an inventor, an agent, or the like is the subject, and competition is defined from that subject's viewpoint. The second is competition from the viewpoint of a technical field, where the technical field is, for example, a field defined by at least one patent classification code such as an IPC, or a field formed by a document set produced by technology-related keywords. Each is described in turn below.

Applicant's view of competition

Competition from an applicant's point of view may be defined as a conflict between Applicant A and another applicant B in the country to which Applicant A belongs (first country) or in at least one other country (second country). Conflicts between applicants may arise where they have in common 1) a field with many applications, 2) a field with a high concentration rate, or 3) a field with a high activity rate. The field is preferably defined, level by level, by at least one patent classification code such as an IPC. For example, when Applicant A has filed many applications in H01L at the IPC subclass level in the first country, the competitors may be 1) applicants with many applications in H01L at the IPC subclass level in the first country (or those with a high share), 2) applicants with a high concentration rate, and 3) applicants with a high activity rate. (In the second country, applicants of types 1), 2), and 3) may be direct or potential competitors.) Likewise, when Applicant A has a high concentration rate or activity rate in H01L, applicants of types 1), 2), and 3) in the first or second country may be competitors.

How to Obtain Competitive Information from the Applicant's Perspective

A method of obtaining competitive information from the applicant's point of view will be described. The method of generating ranking information at each level of the patent classification code, for each analysis index (total amount, share, concentration rate, activity rate, etc.) and each country of Applicant A, has been described above. For example, when H04B 7/02 is selected as a most-filed IPC at the IPC 1 dot subgroup level (C5 level) of the Korean patent application documents of Samsung Electronics Co., Ltd., the multi-dimensional analysis operation result table generation module 402 generates the following information:

For the obtained IPC H04B 7/02, it is possible to extract 1) applicants with many applications / registrations in the first country, 2) applicants with a high concentration rate in H04B 7/02, 3) applicants with a high activity rate, or 4) other applicants with high calculated values of patent analysis indexes. The extraction can basically be handled by an SQL query. If the multi-dimensional analysis operation result table data has been generated for each level of the patent classification code from analysis indexes such as total amount, share, concentration rate, and activity rate, a simple SQL query against it yields at least one highly competitive applicant, along with information such as that applicant's application / registration amount over a predetermined period. If the multi-dimensional analysis operation result table data has not been generated, the target information is obtained with a relatively long and complicated SQL statement whose logic is as follows.

The logic of the SQL statement is: 1) obtaining the patent classification code of a specific level (for example, IPC H04B 7/02) and its lower patent classification codes (the lower codes can be obtained from the patent classification code master DB 203), and extracting all documents containing any of them from the patent document master DB 202 of the national / nationally integrated unit (if the same document appears two or more times, the duplicates must be removed); 2) obtaining the applicant and date information such as the filing date / registration date from the bibliographic information of the extracted documents; 3) sorting by applicant to obtain ranking information for the applicants with the most applications / registrations; and / or 4) counting the application / registration amount in a predetermined period from the filing date / registration date information. Of course, two or more of steps 1) to 4) can be processed at once. Although the SQL logic has been described in terms of total amount, other patent analysis indexes such as share, concentration rate, and activity rate can be processed in a similar way.
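The four-step logic can be sketched as a single statement using a recursive common table expression, run here through Python's sqlite3 module. The schema is hypothetical: `code_master` stands in for the classification code master DB 203, `doc` and `doc_code` for the patent document master DB 202, and all rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE code_master (code TEXT PRIMARY KEY, parent TEXT);
CREATE TABLE doc (doc_id INTEGER PRIMARY KEY, applicant TEXT, filing_year INTEGER);
CREATE TABLE doc_code (doc_id INTEGER, code TEXT);
INSERT INTO code_master VALUES
    ('H04B 7/02', 'H04B 7/00'), ('H04B 7/04', 'H04B 7/02'),
    ('H04B 7/06', 'H04B 7/04'), ('H04B 7/08', 'H04B 7/04');
INSERT INTO doc VALUES (1, 'A', 2005), (2, 'A', 2006), (3, 'B', 2006), (4, 'C', 2006);
INSERT INTO doc_code VALUES
    (1, 'H04B 7/06'), (1, 'H04B 7/08'), (2, 'H04B 7/02'),
    (3, 'H04B 7/04'), (4, 'H01L 21/00');
""")

COMPETITOR_SQL = """
WITH RECURSIVE sub(code) AS (      -- step 1: the given code and all lower codes
    SELECT ?
    UNION
    SELECT m.code FROM code_master m JOIN sub s ON m.parent = s.code
),
hits AS (                          -- steps 1 and 2: matching documents, deduplicated
    SELECT DISTINCT d.doc_id, d.applicant, d.filing_year
    FROM doc d JOIN doc_code c ON c.doc_id = d.doc_id
    WHERE c.code IN (SELECT code FROM sub)
)
SELECT applicant, COUNT(*) AS n    -- steps 3 and 4: count per applicant in the period
FROM hits
WHERE filing_year BETWEEN ? AND ?
GROUP BY applicant
ORDER BY n DESC, applicant
"""

def top_applicants(conn, code, first_year, last_year):
    """Rank applicants by documents under `code` or any of its lower codes."""
    return conn.execute(COMPETITOR_SQL, (code, first_year, last_year)).fetchall()

print(top_applicants(conn, "H04B 7/02", 2005, 2006))
```

Document 1 carries two codes below H04B 7/02 but is counted once because of the DISTINCT in `hits`, which corresponds to the duplicate removal required in step 1).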

For example, to find applicants with a high concentration rate, the applicant list is made after step 2), and the number of applications / registrations related to H04B 7/02 of each applicant is divided by that applicant's total number of applications / registrations (which can be obtained from DB 202 with SQL) to calculate the concentration rate; the sorting of step 3) is then performed on the calculated concentration rate. If the multi-dimensional analysis operation result table generation module 402 has generated the multi-dimensional analysis operation result table data for each level of the patent classification code as described below, this can be processed by simple SQL. (Such simple SQL would be extremely easy for a person skilled in the art.)
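The concentration-based sorting can be sketched as follows, assuming that the concentration rate of an applicant is its number of documents in the given field divided by its total number of documents. That reading, the function name, and the sample figures are assumptions for illustration.

```python
def rank_by_concentration(field_counts, total_counts):
    """Rank applicants by (documents in the field) / (all documents of the applicant)."""
    rates = {
        name: field_counts[name] / total_counts[name]
        for name in field_counts
        if total_counts.get(name)       # skip applicants without a known total
    }
    return sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))

# Made-up figures: A files more in the field, but B is far more concentrated on it.
print(rank_by_concentration({"A": 50, "B": 30}, {"A": 1000, "B": 60}))
```

Ranking by concentration rather than by total amount can surface small but specialized applicants that a total-amount ranking would place low.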

If the lower patent classification codes of the code given in step 1) cannot easily be obtained with a wildcard (*, ?, etc.), the data processing of step 1) imposes a heavy load. In this case, data generated by the multi-dimensional analysis operation result table generation module 402 from the document counts of the lower patent classification codes, as shown in Table 64, is more useful.

TABLE 65

GC  C5         C6         C7         C..  01  02  03  04  05
    H04B 7/02  H04B 7/04  H04B 7/06        0   0   3  17  12
    H04B 7/02  H04B 7/04  H04B 7/06        0   0   3  17  12
    H04B 7/02  H04B 7/04  H04B 7/08        3   1   5   4   3
    H04B 7/02  H04B 7/04  H04B 7/08        3   1   5   4   3
    H04B 7/02  H04B 7/04                   3   2   2  21  24
    H04B 7/02  H04B 7/04                   3   2   2  21  24
    H04B 7/02  H04B 7/04                   6   3  10  42  39
    H04B 7/02  H04B 7/10                   0   0   0   0   0
    H04B 7/02  H04B 7/10                   0   0   0   0   0
    H04B 7/02  H04B 7/10                   0   0   0   0   0
    H04B 7/02  H04B 7/12                   0   1   0   0   1
    H04B 7/02  H04B 7/12                   0   1   0   0   1
    H04B 7/02  H04B 7/12                   0   1   0   0   1
    H04B 7/02                              9   3  20  18  24
    H04B 7/02                              9   3  20  18  24
    H04B 7/02                              9   3  20  18  24
    H04B 7/02                             15   7  30  60  64
    H04B 7/02                             50  40  30  30  30

When result table data for multi-dimensional analysis such as Table 65 exists, the records having H04B 7/02 at the IPC 1 dot level (column C5) are extracted, grouped by AppName (the applicant field), and ranked by count. In this way, starting from A's H04B 7/02, competing applicants such as B can be extracted, and B's number of applications / registrations per year / period can be provided to the user by reading, from the result table for multidimensional analysis, the values for applicant B's documents containing H04B 7/02 or its lower patent classification codes.

The multi-dimensional analysis operation result table generation module 402 also generates multi-dimensional analysis operation result table data for share, concentration rate, activity rate, and the like. Once such data exists, these indexes can likewise be processed by simple SQL as described above.

The multi-dimensional analysis operation result table generation module 402 generates at least one set of multi-dimensional analysis operation result table data for competitive analysis as follows. First, it generates multi-dimensional analysis operation result table data as shown in Table 65 for each type of competition. When it generates tables of the same form as Table 65 for share, concentration rate, activity rate, or other analysis indexes, the AppName, the patent classification code levels, and the like are the same, and the numerical values for each year / period are the values of the share, concentration rate, activity rate, or other analysis index, respectively.

The above describes the method of extracting, for the obtained IPC H04B 7/02 and based on the application / registration documents of the first country, 1) applicants with many applications / registrations, 2) applicants with a high concentration rate in H04B 7/02, 3) applicants with a high activity rate, or 4) applicants with high calculated values of other patent analysis indexes. For the second country, 1) to 4) can be extracted for IPC H04B 7/02 from the patent document data of the second country in the same manner as from that of the first country. (Where another classification code system is used, its codes have a multi-level hierarchical structure like the IPC, and the hierarchy can be distinguished by the number of dots and the like, so those codes may be treated in the same way as the IPC.)

FIG. 32 is a diagram of an exemplary representative-competitor analysis based on total amount, for Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention. The representative competitors of an applicant are analyzed by obtaining the rankings and application amounts of the patent classification codes at the IPC group level among the applicant's most-filed patent classification codes, scoring the other applicants filing under those codes by a predetermined formula based on their application amounts, and presenting the applicants with high scores together with their rankings. In the representative competitor analysis, it may be desirable to show the number of applications / registrations of the representative competitors for each year. When an application / registration count is clicked, the documents corresponding to that count are retrieved by the query embedded in the count and transmitted to the simple analysis module 407, which provides a simplified analysis of these documents. The figures in all cells of the present invention are based on queries, so the underlying documents from which each figure is derived can be obtained from the corresponding query. This applies to all cells below. If a cell's value is a ratio, a query is associated with each number in the formula that produces the ratio, so the documents can be obtained from those queries.

FIG. 33 is a diagram illustrating an analysis of competing applicants by multi-application patent technology classification symbol, based on the total application amount of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention. Competing applicants for each technology classification code level and technology classification code are obtained as follows: 1) ranking information of the reference patent classification codes (1st H01L, 2nd H04N, etc. in FIG. 33) is obtained for the specific technology classification code level; 2) for each reference patent classification code (for example, H01L in FIG. 33), applicants with a high application volume, a high share (occupancy rate), a high concentration rate, a high activity rate, or high values of other patent indicators are extracted in a specific country selected by the user or selected automatically by the system 1; and 3) the number of applications / registrations of each extracted applicant can be provided together with the extracted applicant information. When a drill down is performed, steps 1) through 3) are carried out on the lower patent classification codes of the drilled-down patent classification code in which the applicant has at least one document, and information about the competing applicants is provided. The drill down can proceed to the lowest patent classification code. The competitor analysis can be performed in the same way not only for IPC but also for USPC, FT, etc., although the selection of the country will then be limited. (USPC is a US patent classification code, so it cannot be used to find competitors among patent documents of Korea and Japan.) The competing applicants may be generated based on the applicant's total application amount, but may also be generated based on the applicant's total registration amount.

Meanwhile, the reference patent classification codes are generated based on the patent documents of a specific applicant in the first country, while the multi-application, high-concentration, high-occupancy, and high-activity competing applicants for each reference patent classification code can be extracted from a second country rather than the first country. The first country and the second country may be selected by the user, or may be selected by the system 1 as default values.

It is also possible to analyze entry competing applicants; the analysis information on entry competitors can be generated by processing only the documents of a preset period (for example, within the last seven years). That is, 1) the reference patent classification codes for each patent classification code level are generated based on the applicant's documents filed / registered in the first country in the recent period, and, for each reference patent classification code, multi-application, high-concentration, high-occupancy, and high-activity competing applicants in the first or second country may be extracted based on i) the recent period or ii) the whole period; or 2) the reference patent classification codes for each patent classification code level are generated based on the applicant's documents filed / registered in the first country over the entire period, and multi-application, high-concentration, high-occupancy, and high-activity competing applicants for each reference code may be extracted in the first or second country.

FIG. 34 is an exemplary view of an analysis of competing applicants at the IPC main group level, based on the total amount of US patents of Samsung Electronics Co., Ltd. among the applicants in the DB held by the patent information system 1 of the present invention.

FIG. 37 is an exemplary view of the results of a ranking analysis of multi-application competing applicants at the USPC subclass (no dot) level, based on the total amount of all US patent applications of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention.

Competition from the inventor's point of view

In the above, the multi-dimensional analysis operation result table generation module 402 generates data in order to obtain competitive information from the applicant's point of view. Competition from the inventor's point of view is handled in the same way as competition from the applicant's point of view: document sets are produced per inventor, tables / data are generated for each inventor's document set in the same way as for an applicant, and the competitive information is then generated by accessing this data on an inventor basis. Table 66 below shows an example of such data.

Result table for multi-dimensional analysis for inventor analysis

The multi-dimensional analysis operation result table generation module 402 has been described as generating multi-dimensional analysis operation result table data for each patent analysis index, including the applicant, with respect to the obtained document set. It will be apparent that the same inventive concept may be applied to generate operation result table data for multi-dimensional analysis for various patent analysis indexes based on the inventor instead of the applicant. The following shows an example of such inventor-based data. The data generated by the module 402 on an inventor basis may fall into two series. In the first, the inventor is placed under one applicant, and data is generated by processing the document set of each of the applicant's inventors (i.e., the applicant is typically a large company or organization with many inventors, and each inventor is linked beneath the applicant, as in Applicant A AND Inventor 1, Applicant A AND Inventor 2, ...; data can be generated for each set of documents satisfying such a condition). In the second, the data is generated by processing the document set of each inventor independently (the inventor takes the applicant's place). An example of the former will be apparent. Table 66 shows an example of the data format generated by the multi-dimensional analysis operation result table generation module 402 for the latter.

TABLE 66

IAppName Inventor C3 C4 C5 C6 00 01 02 03 04 05 A a H04N H04N 5/00 H04N 5/64 H04N 5/655 0 0 0 One 0 0 A a H04N H04N 5/00 H04N 5/64 H04N 5/655 0 0 0 One 0 0 A a H04N H04N 5/00 H04N 5/64 H04N 5/655 0 0 0 One 0 0 A a H04N H04N 5/00 H04N 5/64 One 2 0 0 0 0 A a H04N H04N 5/00 H04N 5/64 One 2 0 0 0 0 A a H04N H04N 5/00 H04N 5/64 One 2 0 0 0 0 A a H04N H04N 5/00 H04N 5/64 One 2 0 One 0 0 A a H04N H04N 5/00 H04N 5/72 0 One 0 0 0 0 A a H04N H04N 5/00 H04N 5/72 0 One 0 0 0 0 A a H04N H04N 5/00 H04N 5/72 0 One 0 0 0 0 A a H04N H04N 5/00 H04N 5/72 0 One 0 0 0 0 A a H04N H04N 5/00 H04N 5/74 0 One 0 One One 0 A a H04N H04N 5/00 H04N 5/74 0 One 0 One One 0 A a H04N H04N 5/00 H04N 5/74 0 One 0 One One 0 A a H04N H04N 5/00 H04N 5/74 0 One 0 One One 0 A a H04N H04N 5/00 One 4 0 2 One 0 A a H04N One 4 0 2 One 0 A b B z

When the multi-dimensional analysis operation result table generation module 402 holds such total amount information, the method of generating data such as the occupancy rate, concentration rate, and activity rate for each inventor of a specific applicant is equivalent to the method described above for generating such information per applicant. Likewise, when data such as that shown in Table 66 holds inventor information for all applicants, the multi-dimensional analysis operation result table generation module 402 can refer to the total amount information of individual inventors who are not subordinate to a specific applicant and thereby generate values of patent analysis indexes such as the occupancy rate, concentration rate, and activity rate.

FIG. 35 is a view illustrating a by-year analysis result of the multi-application inventors of Samsung Electronics Co., Ltd., based on the total amount of Korean patent applications among all applicants in the DB held by the patent information system 1 of the present invention. The inventor analysis for each patent classification code level comprises the steps of: 1) generating a reference ranking of patent classification codes at a specific patent classification code level for a specific applicant; 2) for each ranked reference patent classification code, calculating either i) the top-filing inventor for each year (as applied in FIG. 35) or ii) the ranking (for example, 1st to 10th) of multi-application inventors over all years combined; and 3) providing the user with the inventor analysis information thus extracted. The inventor analysis for each patent classification code level may also be performed on the application documents or registration documents of a whole country, without specifying an applicant. In this case, it is broken down into the steps of: 1) generating the reference ranking of patent classification codes for each patent classification code level based on the entire target document set; 2) for each ranked reference patent classification code, calculating either i) the top-filing inventor for each year (as applied in FIG. 35) or ii) the ranking (for example, 1st to 10th) of inventors over all years combined; and 3) providing the user with the inventor analysis information thus extracted.

FIG. 36 is a view illustrating an individual analysis result of the multi-application inventors of Samsung Electronics Co., Ltd., based on the total amount of Korean patent applications among all applicants in the DB held by the patent information system 1 of the present invention. The individual inventor analysis shown in FIG. 36 generates individual analysis information for all inventors related to a specific applicant (inventors of documents whose applicant is a company or the like) in a country unit: 1) the application / registration documents of the applicant are obtained, 2) inventors are extracted from the obtained documents, 3) the number of documents for each inventor is counted according to a predetermined policy, 4) a ranking by inventor is generated, and 5) the number of applications / registrations by inventor and by year / specific period is generated in ranking order and provided to the user. When the analysis is not tied to a specific applicant, all patent documents of the country are obtained and steps 2) to 5) are executed. The documents may be limited to a preset period, such as the last seven years. This is essentially a matter of how the document set is generated, and the document set may be generated by any of the methods mentioned in various places in the present specification.
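The counting and ranking steps 2) to 5) above can be sketched as follows. This is a minimal illustration only: the sample records, the inventor names, and the function name `rank_inventors` are hypothetical, and the "predetermined policy" is assumed here to be a full count of one per document for every co-inventor, which the specification does not fix.

```python
from collections import Counter, defaultdict

# Hypothetical sample records: (application year, inventors named on the document).
documents = [
    (2001, ["Kim", "Lee"]),
    (2001, ["Kim"]),
    (2002, ["Lee", "Park"]),
    (2003, ["Kim", "Park"]),
]

def rank_inventors(docs):
    """Count documents per inventor and per year, then rank the inventors.

    Assumed counting policy: each co-inventor receives a full count of 1
    for every document on which he or she is named.
    """
    totals = Counter()
    by_year = defaultdict(Counter)
    for year, inventors in docs:
        for name in set(inventors):  # count a document once per inventor
            totals[name] += 1
            by_year[name][year] += 1
    # Descending by count; ties are broken by name so the order is stable.
    ranking = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranking, by_year

ranking, by_year = rank_inventors(documents)
```

Restricting `documents` to a preset period before calling the function corresponds to the period limitation mentioned above.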

Result table of operations for multidimensional analysis for citation analysis

In order to perform citation analysis, the multi-dimensional analysis operation result table generation module 402 generates multi-dimensional analysis operation result table data for citation analysis. How the module 402 generates this data will be described using, as an example, US Patent Application No. 09/802,847 (applicant: Samsung Electronics Co., Ltd. (KR); title of the invention: User request processing method and apparatus using upstream channel in interactive multimedia contents service; filing date: March 12, 2001).

The bibliographic information of the above-mentioned US patent application Ser. No. 09/802,847 is shown in Table 51.

Table 51

Application number | Filing date | Registration number | Registration date | Citation information (US patent) | IPC | USPC
09/802,847 | March 12, 2001 | 7,302,464 | November 27, 2007 | 5680322; 5805804; 6044397; 6130898; 6317131; 6611262; 6631403; 6654761; 6654931; 6697869; 2002/0026642 | G06F 15/16; H04N 7/16 | 709/203; 709/217; 725/135

In this case, the multi-dimensional analysis operation result table generation module 402 obtains the following citation parent-child data from the bibliographic information of 09/802,847. The child is the document's own number, and the parent is the number of another document that it cites. (The document's own number may be an application number or a registration number. A registration number may appear as a parent, but a publication number or an application number may also appear there, so it is preferable to unify on the application number, which is a number that all documents have in common. As long as the complexity of processing is not a problem, several kinds of numbers may also be mixed, provided that each document can still be identified.) This data is called citation parent-child data.

Table 52

child-country | child (application number) | parent (as obtained data) | parent-country
US | 09/802,847 | 5680322 | US
US | 09/802,847 | 5805804 | US
US | 09/802,847 | 6044397 | US
US | 09/802,847 | 6130898 | US
US | 09/802,847 | 6317131 | US
US | 09/802,847 | 6611262 | US
US | 09/802,847 | 6631403 | US
US | 09/802,847 | 6654761 | US
US | 09/802,847 | 6654931 | US
US | 09/802,847 | 6697869 | US
US | 09/802,847 | 2002/0026642 | US
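As a rough sketch of how rows like those of Table 52 could be derived from one bibliographic record, the following expands the citation list of a document into one (child-country, child, parent, parent-country) row per cited number. The dictionary layout and the function name `to_parent_child_rows` are assumptions made for illustration; the values are those listed in Table 51, and all parents are assumed to be US documents.

```python
# Bibliographic record for 09/802,847 (values as listed in Table 51).
# The dictionary keys are a hypothetical layout chosen for this sketch.
record = {
    "country": "US",
    "application_number": "09/802,847",
    "citations": [
        "5680322", "5805804", "6044397", "6130898", "6317131", "6611262",
        "6631403", "6654761", "6654931", "6697869", "2002/0026642",
    ],
}

def to_parent_child_rows(rec, parent_country="US"):
    """Expand one record into citation parent-child rows.

    Each row is (child-country, child, parent, parent-country); the parent
    number is kept exactly as obtained, before any unification.
    """
    return [
        (rec["country"], rec["application_number"], parent, parent_country)
        for parent in rec["citations"]
    ]

rows = to_parent_child_rows(record)
```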

From the obtained citation parent-child data shown in Table 52, citation parent-child data unified by application number is generated as shown in Table 53 below. The application number of each parent is obtained from the parent's registration number or publication number. (Numbers other than the application number may also be used for unification, and all number information includes the document type, such as A for registration and A1 for publication.) Such data is called unified number-based unit citation parent-child data. However, this is only a name, and the attributes of the data indicated by the name are the same. The country columns for the documents in the child and parent columns refer to the country in which the document was filed or registered. Such a country may be not only the United States but any of a number of countries. In the following description the country column is omitted for convenience of explanation, but it should be understood to be present. The application number may be represented in any form, such as "two digits + / + 6 digits" or "year + 6 digits".

Table 53

child (application number) | parent (application number)
09/802,847 | 08/451,470
09/802,847 | 08/816,207
09/802,847 | 09/055,929
09/802,847 | 08/969,965
09/802,847 | 09/113,748
09/802,847 | 09/152,003
09/802,847 | 09/309,895
09/802,847 | 09/124,474
09/802,847 | 09/236,462
09/802,847 | 09/138,782
09/802,847 | 09/736,393
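The unification step can be pictured as a lookup from each obtained parent number to its application number. The sketch below is illustrative only: the lookup table stands in for a query against the bibliographic database, its three entries assume that Tables 52 and 53 list the parents in the same row order, and the number "9999999" is a made-up value used to show an unresolved parent.

```python
# Hypothetical lookup from registration / publication number to application
# number. The pairs assume Tables 52 and 53 share the same row order.
NUMBER_TO_APPLICATION = {
    "5680322": "08/451,470",
    "5805804": "08/816,207",
    "2002/0026642": "09/736,393",
}

def unify_by_application_number(rows, lookup):
    """Replace each parent number with its application number.

    Rows whose parent cannot be resolved are returned separately instead of
    being dropped, so that they can be inspected or resolved later.
    """
    unified, unresolved = [], []
    for child, parent in rows:
        application_number = lookup.get(parent)
        if application_number is None:
            unresolved.append((child, parent))
        else:
            unified.append((child, application_number))
    return unified, unresolved

obtained = [
    ("09/802,847", "5680322"),
    ("09/802,847", "2002/0026642"),
    ("09/802,847", "9999999"),  # made-up number with no lookup entry
]
unified, unresolved = unify_by_application_number(obtained, NUMBER_TO_APPLICATION)
```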

The data shown in Table 53 is generated from one given document, but the same processing may be applied, for at least one document set, to each individual document belonging to that set. The result is summarized in Table 54. Duplicate parent (application number) values are allowed within a document set (i.e., if document 1 and document 2 of one document set both include document 3 in their citation information, document 3 appears as a parent twice). In this way, unified number-based unit citation parent-child data can be generated for all individual documents in the document set. The most representative document sets are the set of US published applications and the set of US registered patents. It will be apparent that user-generated document sets (obtained as the search result of a specific search expression) and document sets generated automatically by the computerized system 1 embodying the present invention can also be targeted.

TABLE 54

In addition, if the patent document mast DB 202 is queried for all documents included in the unified number-based unit citation parent-child data, bibliographic information for each document can be added as shown in Table 55 below. The added bibliographic items may be any one or more of the elements constituting the bibliography. Table 55 shows data to which the application date and the registration date have been added; these are bibliographic items of which exactly one corresponds to each application number.

TABLE 55

child registration date | child application date | child application number | parent application number | parent application date | parent registration date
November 27, 2007 | March 12, 2001 | 09/802,847 | 08/451,470 | May 26, 1995 | October 21, 1997
November 27, 2007 | March 12, 2001 | 09/802,847 | 08/816,207 | March 12, 1997 | September 8, 1998
November 27, 2007 | March 12, 2001 | 09/802,847 | 09/055,929 | April 7, 1998 | March 28, 2000
November 27, 2007 | March 12, 2001 | 09/802,847 | 08/969,965 | November 25, 1997 | October 10, 2000
November 27, 2007 | March 12, 2001 | 09/802,847 | 09/113,748 | July 10, 1998 | November 13, 2001
November 27, 2007 | March 12, 2001 | 09/802,847 | 09/152,003 | September 11, 1998 | August 26, 2003
November 27, 2007 | March 12, 2001 | 09/802,847 | 09/309,895 | May 11, 1999 | October 7, 2003
November 27, 2007 | March 12, 2001 | 09/802,847 | 09/124,474 | July 29, 1998 | November 25, 2003
November 27, 2007 | March 12, 2001 | 09/802,847 | 09/236,462 | January 25, 1999 | November 25, 2003
November 27, 2007 | March 12, 2001 | 09/802,847 | 09/138,782 | August 24, 1998 | February 24, 2004
November 27, 2007 | March 12, 2001 | 09/802,847 | 09/736,393 | December 15, 2000 |

Count information for each application number (such as the number of claims, the number of drawings, the number of family members, the number of inventors, and the number of applicants, each in its own count field) may additionally be added to the data shown in Table 55. It is preferable to create one table for those items of which exactly one value corresponds to each application number.

Subsequently, fields for which two or more values can correspond to each application number may be added to the above data. These may be applicants, inventors, agents, patent classification symbols, and the like. The patent classification code will be described later. When two or more values correspond to each application number, all required field values may be included in one table, but the table may become large. For example, 09/309,895 in Table 56 has two applicants, AT&T Corp. (New York, NY) and Sun Microsystems (Palo Alto, Calif.). In this case, when an applicant field is added to the row of Table 56 below that relates to this application number, it should be processed as shown in Table 57 below.

TABLE 56

child registration date | child application date | child application number | parent application number | parent application date | parent registration date
November 27, 2007 | March 12, 2001 | 09/802,847 | 09/309,895 | May 11, 1999 | October 7, 2003

Table 57

child registration date | child application date | child application number | parent application number | parent application date | parent registration date | Applicant
November 27, 2007 | March 12, 2001 | 09/802,847 | 09/309,895 | May 11, 1999 | October 7, 2003 | AT&T Corp.
November 27, 2007 | March 12, 2001 | 09/802,847 | 09/309,895 | May 11, 1999 | October 7, 2003 | Sun Microsystems

As such, if there are two applicants, the number of rows is doubled. If the application also has six inventors, twelve rows are required when the applicants and the inventors are shown in one table (2 * 6 = 12). In this case, all data other than the (applicant, inventor) pair is duplicated in every row. (Of course, with such duplication, analysis results (for example, applicants by year, or inventors of a specific applicant by year) can be produced efficiently because no table join is needed, but the table size becomes large.)
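The trade-off can be made concrete with a short sketch that builds both layouts for the 09/309,895 parent row. The six inventor names are hypothetical placeholders, and the tuple layout is an assumption for illustration; only the row counts are the point.

```python
from itertools import product

applicants = ["AT&T Corp.", "Sun Microsystems"]
inventors = [f"Inventor {i}" for i in range(1, 7)]  # six hypothetical inventors

# Columns shared by every row: (child application number, parent application number).
base = ("09/802,847", "09/309,895")

# Layout A: one wide table holding every (applicant, inventor) pair.
single_table = [base + pair for pair in product(applicants, inventors)]

# Layout B: one narrow table per multi-valued field, joined when needed.
applicant_table = [base + (a,) for a in applicants]
inventor_table = [base + (i,) for i in inventors]

print(len(single_table), len(applicant_table) + len(inventor_table))
```

Layout A needs 2 * 6 = 12 rows with the shared columns repeated in each, while layout B needs 2 + 6 = 8 rows at the cost of a join at analysis time.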

For reference, when processing information based on time / date, it is preferable to proceed as follows. For example, for the date March 12, 2001, it is preferable to subdivide the processing into March 12, 2001 (day), March 2001 (month), the first quarter of 2001 (quarter), and 2001 (year). In the present specification this subdivision is omitted because of the limits of notation (a large number of fields would have to be shown on a screen of limited width), but it will be apparent to those skilled in the art.
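A minimal sketch of this subdivision, using only the Python standard library, is shown below. The function name `date_buckets` and the string formats of the keys are assumptions chosen for illustration, not formats prescribed by the specification.

```python
from datetime import date

def date_buckets(d):
    """Return the day, month, quarter, and year buckets for one date."""
    quarter = (d.month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
    return {
        "day": d.isoformat(),
        "month": f"{d.year}-{d.month:02d}",
        "quarter": f"{d.year}-Q{quarter}",
        "year": str(d.year),
    }

buckets = date_buckets(date(2001, 3, 12))
```

Counting documents per bucket key then yields the daily, monthly, quarterly, and yearly figures from the same date field.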

Next, with regard to the patent classification code, the processing of the data so that each level of the classification hierarchy is included will be described. First, the data processing for 09/802,847, which corresponds to a child, is shown in Table 58 below; the same applies to documents belonging to a parent. 09/802,847 has G06F 15/16 and H04N 7/16 as IPC, and 709/203; 709/217; 709/231; 725/135 as USPC.

TABLE 58

Filed March 2001 | Application year 2001 | n dot subgroup (C(n+4)) | 1 dot subgroup (C5) | IPC group (C4) | IPC subclass (C3) | IPC class (C2) | IPC section (C1) | child application number
1 | 1 | | G06F 15/16 | G06F 15/00 | G06F | G06 | G | 09/802,847
1 | 1 | | H04N 7/16 | H04N 7/00 | H04N | H04 | H | 09/802,847

In Table 58, C3 refers to the IPC subclass level, and the other columns are named in the same manner. The 1 in the 2001 field means that there is one case in 2001, and the 1 in the March 2001 field means that there is one case in that month on a monthly basis. Both G06F 15/16 and H04N 7/16 belong to the 1-dot subgroup level, but if the 09/802,847 document contained H04N 7/169, Table 58 would look like the following Table 59. (H04N 7/169 is introduced arbitrarily for the purpose of explaining the idea of the invention.)
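The upper levels C1 to C4 can be read directly off the IPC symbol, as the following sketch shows. The regular expression and the function name `ipc_levels` are assumptions made for illustration. The dot levels (C5 and deeper) are deliberately not computed: the parent of a subgroup is defined by the IPC scheme itself and cannot be derived from the symbol string, so it would have to be looked up in a classification definition table.

```python
import re

# Section letter, two-digit class, subclass letter, main group, subgroup.
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")

def ipc_levels(symbol):
    """Split an IPC symbol into section (C1), class (C2), subclass (C3),
    and main group (C4); the normalized symbol is returned as well."""
    match = IPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a recognizable IPC symbol: {symbol!r}")
    section, class_digits, subclass_letter, group, subgroup = match.groups()
    subclass = f"{section}{class_digits}{subclass_letter}"
    return {
        "C1": section,
        "C2": f"{section}{class_digits}",
        "C3": subclass,
        "C4": f"{subclass} {group}/00",
        "symbol": f"{subclass} {group}/{subgroup}",
    }

levels = ipc_levels("G06F 15/16")
```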

TABLE 59

Filed March 2001 | Application year 2001 | 3 dot (C7) | 2 dot (C6) | 1 dot subgroup (C5) | IPC group (C4) | IPC subclass (C3) | IPC class (C2) | IPC section (C1) | child application number
1 | 1 | | | | | | | | 09/802,847
1 | 1 | H04N 7/169 | H04N 7/167 | H04N 7/16 | H04N 7/00 | H04N | H04 | H | 09/802,847

The corresponding processing of USPC data, including each level of the classification hierarchy, may be as shown in Table 60 below.

TABLE 60

Filed March 2001 | Application year 2001 | USPC n dot level (C(n+4)) | USPC 1 dot level (C5) | USPC no dot level (C4) | USPC class (C3) | Meta class (C2) | Meta super class (C1) | child application number
1 | 1 | | 709/203 | 709/201 | 709 | | | 09/802,847
1 | 1 | | | 709/217 | 709 | | | 09/802,847
1 | 1 | | 709/231 | 709/230 | 709 | | | 09/802,847
1 | 1 | | | 725/135 | 725 | | | 09/802,847

USPC 709/203 (title: Client/server) has as its immediate parent 709/201 (title: DISTRIBUTED DATA PROCESSING), which belongs to Class 709. The same is true for the others. The USPC codes above all belong to the 1-dot or no-dot level, but if the document 09/802,847 included 725/45, Table 60 would be as shown in the following Table 61. (725/45 is introduced arbitrarily for the purpose of explaining the idea of the invention.)

TABLE 61

Filed March 2001 | Application year 2001 | 4 dot (C8) | 3 dot (C7) | 2 dot (C6) | USPC 1 dot level (C5) | USPC no dot level (C4) | USPC class (C3) | Meta class (C2) | Meta super class (C1) | child application number
1 | 1 | | | | | | | | |
1 | 1 | 725/45 | 725/44 | 725/39 | 725/38 | 725/37 | 725 | | | 09/802,847

The IPCs H04L 12/56 and H04L 12/28 and the USPCs 370/395 and 370/235 are assigned. (These IPCs may differ from those granted by the Korean Intellectual Property Office; the document is preferably processed with the IPC granted by the US Patent and Trademark Office.)

Since one row is created for each patent classification code, IPC and USPC could be placed in one table, but this is not preferable. The same is true when a field has multiple values, such as a plurality of applicants or a plurality of inventors, but the essential reason is that one child has several parents. That is, one child has k parents, m IPCs, and n USPCs, and each of the k parent documents has its own IPCs and USPCs. Placing all of these in one table requires a large number of rows, and much of the data is duplicated. This is especially true when the document set is large (e.g., the entire set of US registered patents). Therefore, it is desirable to separate IPC and USPC. With respect to patent classification symbols, the table types are: 1) a type processed only for the child-side patent classification symbols of the citation child-parent data, 2) a type processed only for the parent-side patent classification symbols of the citation child-parent data, and 3) a type processed for the patent classification symbols of both child and parent. In case 3) the number of rows increases, but the need for table joins is reduced.

That is, the multi-dimensional analysis operation result table generation module 402 generates unit citation parent-child data for each document belonging to a given / preset document set, obtains at least one bibliographic item of the individual documents on the child side and / or the parent side of each unit citation parent-child data, and generates operation result table data for multi-dimensional analysis using the obtained bibliographic items as field contents.

Examples of the given / preset document sets include: 1) the complete set of application or registration documents of a specific country, 2) the set of documents of any one or more specific IPC / USPC codes, 3) the set of documents of a specific applicant, 4) a search-based document set generated by a specific search expression, and 5) the integrated set of all application documents or all registered documents across countries.

If the data shown in Table 62 is generated for the entire set of documents of one country, the following benefits may be obtained. Bibliographic items correspond to each application number / registration number constituting the data, and any one or more of the bibliographic items may be combined into the data as described above.

TABLE 62

The above method of generating data puts all application / registration documents into the child column and, for each document, puts the numbers of the parent documents that it cites into the parent column. If there is no parent, a null value is entered; the row then has data only in the child column, with no parent document corresponding to the child document. As a result, the child column contains all the document numbers.

For simplicity, consider the simplest model. Suppose Document 1, Document 2, Document 3, and Document 4 contain citation information, and the chain of citations is Document 1 -> Document 2 -> Document 3 -> Document 4 (Document 2 cites Document 1, Document 3 cites Document 2, and Document 4 cites Document 3). If Document 2 is taken as the reference (Document 2 is in the child column), Document 1 is in its parent column; if Document 3 is the reference (Document 3 is in the child column), Document 2 is in the parent column; and with Document 4 as the reference, Document 3 is in the parent column. That is, Documents 1 to 4 are all in the child column and at least Documents 1 to 3 are in the parent column. Consider the depth of citation: with Document 4 as the reference, Document 3 is at citation depth 1, Document 2 at citation depth 2, and Document 1 at citation depth 3. With Document 3 as the reference, Document 2 (obtained from the citation information included in Document 3) is at citation depth 1 and Document 1 is at citation depth 2, while Document 4 is at depth 1 in the opposite direction. Since Documents 1 to 4 are all in the child column, the cited document at citation depth 1 of any document can be found in the parent column; if that cited document number is then looked up in the child column and the parent of the matching row is found, the cited document at citation depth 2 is obtained. This model is shown in Table 63.

Table 63

child (application number) | parent (application number)
2 | 1
3 | 2
4 | 3
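The repeated lookup described for this model (find the parent of a document, then look that parent up in the child column again) can be sketched as below. The function name `cited_at_depth` is hypothetical; the rows are exactly those of Table 63.

```python
# Child-parent rows of the simplest model (Table 63): each document cites
# the one numbered just below it.
ROWS = [(2, 1), (3, 2), (4, 3)]

def cited_at_depth(rows, start, depth):
    """Return the documents reached from `start` by following the parent
    column `depth` times (citation depth 1, 2, 3, ...)."""
    parents_of = {}
    for child, parent in rows:
        parents_of.setdefault(child, []).append(parent)
    frontier = [start]
    for _ in range(depth):
        # Look every current document up in the child column and collect
        # the parents of the matching rows.
        frontier = [p for doc in frontier for p in parents_of.get(doc, [])]
    return frontier

print(cited_at_depth(ROWS, 4, 1), cited_at_depth(ROWS, 4, 2), cited_at_depth(ROWS, 4, 3))
```

A document with no parent row, such as Document 1 here, simply yields an empty result, which matches the null-parent case described for the full table.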

Although described above for one document, the same obviously holds for a document set. This is likewise shown as a model in Table 64 below. The description takes Document Set 3 as the reference. (In Table 64, the content of each cell is conceptual, for convenience of explaining the idea of the present invention, and should not be interpreted as actual cell data of the operation result table for multi-dimensional analysis.)

TABLE 64

child (application number) | parent (application number)
Document Set 2 | Document Set 1 (set generated by all parents of Document Set 2)
Document Set 3 (sharing specific attribute 3) | Document Set 2 (set generated by all parents of Document Set 3)
Document Set 4 (set generated by all children of Document Set 3) | Document Set 3 (set generated by all parents of Document Set 4)

In Table 64, document sets 1 to 4 are all subsets of the entire set of the child column, and document sets 1 to 3 are all subsets of the entire set of the parent column. Being a subset means that all documents belonging to the set belong to the whole set, so that they can be identified in and extracted from the whole set. The attribute that defines a document set may be any attribute, and may include 1) the name of the applicant, 2) the name of the inventor, 3) the IPC or USPC at each level, 4) the country, 5) the agent, 6) the period (application date / registration date range), 7) the status of the document (pending, registered, rejected, etc.), or 8) a combination of at least one of these.

For example, assuming that document set 3 is the set of all documents registered in the United States that belong to IPC H01L and whose applicant is Samsung Electronics, document set 2, consisting of all cited documents at forward citation depth 1 relative to document set 3, can be extracted. Likewise, document set 4, consisting of all citing documents at backward citation depth 1 relative to document set 3, and document set 1, at forward citation depth 2 relative to document set 3, can be obtained. Document set 1, document set 2, and document set 4 can then be analyzed. The analysis targets are: 1) the total amount of forward / backward citations, 2) the forward / backward multi-citation applicants, 3) the forward / backward multi-citation inventors, 4) the forward / backward multi-citation IPC / USPC (at each level of each patent classification code system; drill down has already been explained), and 5) the forward / backward multi-cited individual documents. In case 1), the total amount may be displayed divided by year / preset period, and the amount and rate of increase / decrease can be determined from these numbers. In case 2), the forward / backward citation amounts may be displayed as totals and / or by year / preset period for each multi-citation applicant, and the same holds for 3) to 5). With document set 3 as the reference, if the backward multi-citation applicants of 2) above are selected, the later applicants citing the US-registered documents belonging to IPC H01L of Samsung Electronics become known. The set of later applicants may include Samsung Electronics itself, which is self citation. If the self-citation count is divided by another count or subjected to predetermined processing, various analyses of self citation become possible.
In addition, by looking at the later applicants in ranking order, it is possible to know which applicants most often cited the documents belonging to IPC H01L of Samsung Electronics. Each analysis result (backward citations) has a link for obtaining the related documents, and when the user clicks the link, the relevant documents (backward citations) are obtained from the patent document mast DB 202 and provided to the user.
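A ranking of citing applicants together with one possible self-citation figure can be sketched as follows. The sample rows, the applicant names "Applicant X" and "Applicant Y", and the function name are hypothetical, and dividing the self-citation count by the total number of citing rows is only one assumed choice of the "other count" mentioned above.

```python
from collections import Counter

# Hypothetical citing rows: (applicant of the citing document, cited reference document).
citing_rows = [
    ("Applicant X", "ref-1"),
    ("Samsung Electronics", "ref-1"),
    ("Applicant X", "ref-2"),
    ("Applicant Y", "ref-2"),
    ("Samsung Electronics", "ref-3"),
]

def citing_applicant_ranking(rows, reference_applicant):
    """Rank the citing applicants and compute a self-citation ratio.

    Assumed definition: ratio = rows whose citing applicant equals the
    reference applicant, divided by all citing rows.
    """
    counts = Counter(applicant for applicant, _ in rows)
    ranking = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    self_ratio = counts[reference_applicant] / len(rows) if rows else 0.0
    return ranking, self_ratio

ranking, self_ratio = citing_applicant_ranking(citing_rows, "Samsung Electronics")
```

Keeping the rows behind each count, rather than only the totals, is what allows a figure in the ranking to link back to its evidence documents.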

When a document set is given (for example, document set 3), it is naturally also possible to find document set 2, the parent of document set 3, generate a document set 2' consisting of the documents in document set 2 that satisfy a set condition (for example, a period condition such as documents filed within the last 10 years), perform at least one predetermined analysis with document set 2' as the analysis target document set, and generate the analysis result.

4 types of documents for citation analysis

4 types

For citation analysis, four kinds of analysis target document sets are possible. Given a reference document set, the following four citation analysis document sets are determined:

FIG. 38 shows that, for a reference document set, four types of analysis target document sets can be generated for citation analysis. The labels shown in FIG. 38, including "Company's patent (other->other)" cited by the company, "Company's patent (other->party's) cited by its own company", and "Cited proprietary patents (za->ta)", correspond sequentially to the first to fourth sets described below.

First, there is the entire forward citation document set, consisting of the documents cited by the individual reference documents included in the reference document set. It may be determined as the union of all documents listed as cited references in each of the reference documents. The union may either eliminate duplicates or keep them. For example, if document #1 and document #2 both contain document a as citation information, then from the counting point of view a is an important document that has been cited twice, so it is preferable to allow the duplicate (the same parent appears in more than one record of the multi-dimensional analysis operation result table; that is, there are two records with parent a, one due to #1 and one due to #2). From the standpoint of the existence of a itself (its applicant, dates, or other bibliographic items), it is preferable to eliminate the duplicate by performing a union operation. Since counting is central to the analysis of the present invention, allowing duplicates is the more reasonable choice.
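The difference between the two treatments of duplicates can be illustrated with a minimal sketch. The `references` mapping is hypothetical data that mirrors the #1 / #2 / a example; it is not taken from any actual table of the system.

```python
from collections import Counter

# Hypothetical citation information: reference document -> documents it cites.
references = {"#1": ["a", "b"], "#2": ["a", "c"]}

# Duplicate-preserving view: one (child, parent) record per citation,
# so parent "a" appears twice (once for #1 and once for #2).
records = [(child, parent)
           for child, parents in references.items()
           for parent in parents]
citation_counts = Counter(parent for _, parent in records)

# Duplicate-free view: each cited document exists once (set union).
distinct_cited = set(citation_counts)
```

The duplicate-preserving records support counting (a is counted twice), while the duplicate-free set is sufficient when only the bibliographic items of each cited document are needed.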

Second, there is the entire backward citation document set, consisting of the documents that cite the individual reference documents included in the reference document set. It can be obtained by 1) searching for the document numbers of the documents citing each reference document and combining them, or 2) using the child-parent table generated for the entire patent document collection: each reference document number is looked up in the parent column, the corresponding child document numbers (there may be two or more) are read from the child column, and the results are combined. Method 2) is preferable.

Third, there is the entire backward citation-related reference document set. It consists of those reference documents, among all the reference documents constituting the reference document set, that have been cited by other documents. It is the document set composed of the parents of the second document set.

Fourth, there is the entire forward citation-related reference document set. It consists of those reference documents, among all the reference documents constituting the reference document set, that contain citation information. It is the document set composed of the children of the first document set.
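The four sets above can be sketched against a child-parent table as follows. The table contents and the function name `four_citation_sets` are hypothetical and serve only to illustrate the definitions; duplicates are eliminated here for brevity.

```python
def four_citation_sets(reference_set, child_parent):
    """Return the first to fourth citation analysis document sets.

    child_parent is an iterable of (child, parent) pairs, where the child
    document cites the parent document.
    """
    ref = set(reference_set)
    first = {p for c, p in child_parent if c in ref}    # documents cited by reference documents
    second = {c for c, p in child_parent if p in ref}   # documents citing reference documents
    third = {p for c, p in child_parent if p in ref}    # reference documents that are cited
    fourth = {c for c, p in child_parent if c in ref}   # reference documents that cite something
    return first, second, third, fourth


# Hypothetical child-parent table; R* are reference documents.
child_parent = [("R1", "X1"), ("R2", "X1"), ("Y1", "R1"), ("Y2", "R1"), ("R3", "R1")]
first, second, third, fourth = four_citation_sets({"R1", "R2", "R3", "R4"}, child_parent)
```

In this example R3 cites R1, so R1 appears in the first set and R3 in the second set, which corresponds to self-citation within the reference document set.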

Determination of the reference document set

The reference document set may be determined under arbitrary conditions, but representative reference document sets include the following units: 1) the applicant; 2) an individual inventor of the applicant; 3) a patent classification symbol at one or more levels of one or more patent classification symbol systems contained in the applicant's documents; 4) to 6) further units defined by patent classification symbols at one or more levels of one or more patent classification symbol systems, including units delimited by inventor or by country; 7) an individual inventor; 8) a specific period; 9) a document set sharing a common keyword; 10) a document set created from conditions received from the user; and 11) a document set generated by further combining any of 1) to 7) with a specific condition (for example, documents filed or registered in the last five years). Because analysis in units of a single document already existed in the prior art, such a unit obviously cannot by itself be an independent subject of the present invention.

The uppermost tabs of FIG. 39 include 1) "citation analysis for the entire patent document set", 2) "citation analysis by technical area", 3) "citation analysis by inventor", and 4) "citation analysis by multi-citation individual document". All of these relate to the determination of the reference document set. 1) "Citation analysis for the entire patent document set" means that the reference document set is the entire patent document set determined by the applicant or by any one of the document set determination methods of the present invention. 2) "Citation analysis by technical area" means that the reference document sets are the document sets subdivided by patent classification symbol, with the symbols ranked at each level of the patent classification code system (IPC, USPC, etc.). 3) "Citation analysis by inventor" means that the inventors are extracted from the entire patent document set and ranked, and the document sets of the highly ranked inventors are used as reference document sets. 4) "Citation analysis by multi-citation individual document" means that the number of times each document in the patent document set cites or is cited is examined, the documents with a high citing/cited ranking are extracted, and each extracted individual document is used as a reference document set.

Multi-dimensional analysis operation result table generation module (402)

When the reference document set is determined, the multi-dimensional analysis operation result table generation module 402 determines, as the citation analysis target document sets corresponding to the reference document set, 1) the entire forward citation document set, 2) the entire backward citation document set, 3) the entire forward citation-related reference document set, and/or 4) the entire backward citation-related reference document set, and generates and stores the multi-dimensional analysis operation result table described herein for any one or more of the document sets 1) to 4).

Results of citation analysis

The analysis module of the present invention obtains analysis results by applying predetermined analysis expressions to the multi-dimensional analysis operation result table generated for each of the determined analysis targets: 1) the entire forward citation document set, 2) the entire backward citation document set, 3) the entire forward citation-related reference document set, and/or 4) the entire backward citation-related reference document set. The analysis module could instead perform the various analyses described in this specification directly on any one or more of the document sets 1) to 4) rather than on the multi-dimensional analysis operation result table, but this is not preferable, because the multi-dimensional operation results either cannot be obtained or require a large amount of computational resources.

The analysis module accesses the multi-dimensional analysis operation result table generated for each of the document sets 1) to 4) and extracts the desired citation analysis results using predetermined analysis expressions. The desired citation analysis results may include the following.

First, centered on applicants, the following may be extracted: i) the applicant ranking, ii) for each applicant, the ranking of patent classification symbols at one or more levels of one or more patent classification symbol systems, iii) the ranking of frequently appearing inventors, and iv) the ranking by document appearance frequency, and/or numerical data rolled up along the time dimension or another dimension for each of these, and/or information on the document numbers corresponding to each of these. That is, for each of the citation analysis target document sets 1) to 4) above, the ranking values for each field of the bibliographic items, the values rolled up by time or by another dimension, and/or the information on the document numbers needed to retrieve the individual documents behind each value may be generated.

Second, based on patent classification symbols, the following may be extracted: i) the ranking of patent classification symbols at one or more levels of one or more patent classification symbol systems, ii) the ranking of applicants belonging to the patent classification symbols at each level, iii) the ranking of inventors belonging to the patent classification symbols at each level, and iv) the ranking by appearance frequency of the documents belonging to the patent classification symbols at each level, and/or numerical data rolled up along the time dimension or another dimension for each of these, and/or information on the document numbers corresponding to each of these.

That is, for each of the citation analysis target document sets 1) to 4) above, the ranking values for each field of the bibliographic items, the values rolled up by time or by another dimension, and/or the information on the document numbers needed to retrieve the individual documents behind each value may be generated.

Third, based on applicant attributes (the type of applicant (company, university, individual, etc.) and, for companies, financial attributes, corporate valuation indicators, etc.), the following may be extracted: rankings by i) type of applicant, ii) company size indicators such as sales, and iii) company financial valuation factors such as annual average returns, and/or numerical data rolled up along the time dimension or another dimension for each of these, and/or the document numbers corresponding to each of these. That is, for each of the citation analysis target document sets 1) to 4) above, the ranking values for each field of the bibliographic items, the values rolled up by time or by another dimension, and/or the information on the document numbers needed to retrieve the individual documents behind each value may be generated.

Fourth, for each of the first to third results for which numerical data exist, data on their change, such as the rate of increase/decrease and the speed of increase/decrease, may be extracted.
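A minimal sketch of an applicant ranking, a roll-up along the time dimension, and the derived change values is given below. The records are hypothetical, and the definitions used here are assumptions made for illustration only: the rate of increase is taken as the relative change between consecutive recorded years, and the speed of increase as the change of that rate.

```python
from collections import Counter, defaultdict

# Hypothetical records of a citation analysis target set: (applicant, year).
records = [("A", 2004), ("A", 2005), ("A", 2005), ("B", 2005),
           ("A", 2006), ("A", 2006), ("A", 2006), ("A", 2006), ("B", 2006)]

# Applicant ranking by total number of records.
totals = Counter(applicant for applicant, _ in records)
ranking = [applicant for applicant, _ in totals.most_common()]

# Roll-up along the time dimension: applicant -> year -> count.
by_year = defaultdict(Counter)
for applicant, year in records:
    by_year[applicant][year] += 1


def increase_rates(series):
    """Relative change between consecutive recorded years (assumed definition)."""
    years = sorted(series)
    return {y: (series[y] - series[p]) / series[p] for p, y in zip(years, years[1:])}


def increase_speeds(rates):
    """Change of the rate between consecutive years (assumed definition)."""
    years = sorted(rates)
    return {y: rates[y] - rates[p] for p, y in zip(years, years[1:])}


rates_a = increase_rates(by_year["A"])
speeds_a = increase_speeds(rates_a)
```

Keeping the document numbers alongside each count, rather than only the counts, would additionally allow the individual documents behind each value to be retrieved.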

For citation analysis, the description above has focused on the case in which the multi-dimensional analysis operation result table generation module 402 generates the multi-dimensional analysis operation result table from the patent documents cited in the applicant's own documents (applicant citations). It will be apparent that the same table can be generated for examiner-cited documents. That is, when the documents cited by the examiner during the examination of a specific application document (the child document) are treated as parent documents of that application document, the module 402, the multi-dimensional analysis operation result table for citation analysis, and the method of using the result are completely equivalent to those based on applicant citations. For integrated citation analysis, an integrated citation document set (parent-combined) can be generated by combining, for each application document, the applicant citations (parent) and the examiner citations (parent 2). Applying to the integrated citation document set the same processing as described above for applicant citations yields equivalent results, and the method of using those results is likewise equivalent. Since what is described in this paragraph will be apparent to those skilled in the art from the description based on applicant citations, it is not described again in detail.
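The construction of the integrated citation document set can be sketched as follows. The document identifiers and the function name `combine_parents` are hypothetical, and merging a document cited by both the applicant and the examiner into a single entry is an assumption of this sketch; the description does not state how such overlaps are treated.

```python
# Hypothetical citation data: application document -> cited documents.
applicant_citations = {"APP-1": ["P1", "P2"]}          # parent
examiner_citations = {"APP-1": ["P2", "P3"]}           # parent 2


def combine_parents(applicant, examiner):
    """Build parent-combined: applicant and examiner citations per application document."""
    combined = {}
    for child in sorted(set(applicant) | set(examiner)):
        merged = []
        for parent in applicant.get(child, []) + examiner.get(child, []):
            if parent not in merged:        # assumed: overlapping citations are merged
                merged.append(parent)
        combined[child] = merged
    return combined


parent_combined = combine_parents(applicant_citations, examiner_citations)
```

The combined mapping can then be flattened into (child, parent) records and processed exactly like the applicant-citation table.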

Next, a detailed description is given with reference to the drawings.

FIG. 38 is an exemplary diagram showing the yearly analysis result for the total amount of citations, where the reference document set is all US patent applications of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention, and the analysis target document set is the entire forward citation document set of the present invention. FIG. 38 shows the state in which the total amount tab is selected. In the exemplary analysis screens of the accompanying drawings, various tabs are shown, and a tab whose label is displayed with emphasis is the selected tab. The types of tabs basically cover: criteria on the properties of the document set fixed when the document set is determined, such as application document basis or registered document basis; criteria by analysis index, such as the total number of applications/registrations, share, concentration rate, activity rate, or other analysis indices; criteria on the target or purpose of the analysis, such as total amount, applicants, inventors, technologies, and individual documents; criteria on the displayed quantity, such as the numerical value, its rate of increase/decrease, and its speed of increase/decrease; and the selection of a patent classification symbol system, such as IPC, USPC, FT, FI, or ECLA, together with the level within the selected system.

FIG. 39 is an exemplary diagram showing the yearly analysis result for multi-citation applicants, where the reference document set is all US patent applications of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention, and the analysis target document set is the entire forward citation document set of the present invention.

FIG. 40 is an exemplary diagram showing the yearly analysis result for multi-citation patent classification codes (IPC main group level), where the reference document set is all US patent applications of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention, and the analysis target document set is the entire forward citation document set of the present invention. The IPC codes may be drilled down, in which case the citation analysis result for the lower patent classification codes reached by the drill-down is extracted and provided to the user. FIG. 41 shows this.

FIG. 41 is an exemplary diagram showing the yearly analysis result after drilling down from the multi-citation patent classification codes (IPC main group level), where the reference document set is all US patent applications of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention, and the analysis target document set is the entire forward citation document set of the present invention.

FIG. 42 is an exemplary diagram showing the yearly analysis result for multi-citation inventors, where the reference document set is all US patent applications of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention, and the analysis target document set is the entire forward citation document set of the present invention.

FIG. 43 is an exemplary diagram showing the yearly analysis result for the most cited applicants, where the reference document set is all US patents registered by Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention, and the analysis target document set is the entire forward citation document set of the present invention.

FIG. 44 is an exemplary diagram showing the yearly analysis result for the most cited inventors, where the reference document set is all US patents registered by Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention, and the analysis target document set is the entire forward citation document set of the present invention.

FIG. 45 is an exemplary diagram showing the yearly analysis result when an IPC code is drilled down in the analysis of the most cited technologies by IPC main group, where the reference document set is all US patents registered by Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention, and the analysis target document set is the entire forward citation document set of the present invention.

FIG. 46 is an exemplary diagram showing the yearly analysis result when drilling down in the analysis of the most cited technologies at the USPC subclass (no dot) level, where the reference document set is all US patents registered by Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention, and the analysis target document set is the entire forward citation document set of the present invention.

FIG. 47 is an exemplary diagram showing the total amount analysis result and the chart generated for that result by the chart generation module 406-2 of the reporting module 406 of the present invention, where the reference document set is all US patents registered by Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention, and the analysis target document set is the entire citation-related reference document set of the present invention. The reporting module 406 of the present invention includes one or more of a table generation module 406-1 for generating a table, a chart generation module 406-2 for generating a chart, a graph generation module 406-3 for generating a graph, and a report generation module 406-4 for generating a report.

FIG. 48 is an exemplary diagram, where the reference document set is all US patents registered by Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention and the analysis target document set is the entire citation-related reference document set of the present invention, showing the yearly total citation analysis result for the most cited inventors and, for the case where a specific number shown in the analysis result is clicked, the document list among the simple analysis results provided by the simple analysis module 407 (a document list, the number of applications/registrations per year of the top applicants, the number of applications/registrations per year of the top inventors, and the number of applications/registrations per year, including drill-down, by top technical field (IPC, USPC, FT)).

FIG. 49 is an exemplary diagram showing the simple analysis module 407 of the present invention providing a drill-down function in the top technical field (IPC, USPC, FT).

FIG. 50 is an exemplary diagram showing the yearly analysis result for the total amount of citations, where the reference document set is the document set of a multi-application IPC subclass among all US application documents of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system 1 of the present invention, and the analysis target document set is the citation document set 1) of the present invention.

Multi-dimensional analysis operation result table for technology analysis by patent technology classification

Next, technology analysis utilizing patent technology classification is described. Technology analysis using patent technology classification means the analysis of a document set, among the document sets of the present invention, that is determined through patent classification codes. A document set for such technology analysis is either 1) a document set of documents containing at least one patent classification code selected from a specific type of patent classification code (IPC, USPC, FT, ECLA, etc.), where various operations such as OR, AND, and NOT may be applied when two or more patent classification codes are involved, or 2) a document set involving two or more types of patent classification codes (IPC AND USPC, etc.). Each of these may be formed per individual country or across integrated countries. In the integrated case, duplicates should be eliminated (as the result of a union operation), and whether and how such documents are included and processed depends on a pre-established policy or the user's choice. All of this relates to the determination of the document set.

The information that can be extracted by technology analysis using patent technology classification (hereinafter, technology analysis) is, for the document set for technology analysis (hereinafter, the technology analysis document set), information on 1) share (occupancy rate), 2) concentration rate, 3) activity rate, and 4) other patent analysis indices. Obviously, in the data processing for 1) to 4), the documents corresponding to the lower patent classification codes of a given patent classification code are included in the analysis target document set.

First, total amount analysis in the technology analysis using patent technology classification is described. Total amount analysis provides results in terms of quantity, such as the number of applications/registrations by period or year at each level of the patent technology classification symbols. The documents carrying lower patent classification symbols of a specific patent classification symbol are rolled up into that symbol, and the result of this multi-dimensional calculation is stored. (The number of patent documents carrying the specific patent classification symbol and the number carrying its lower patent classification symbols are summed; obviously, duplicates are eliminated.) Table 67 below is an exemplary multi-dimensional analysis operation result table for the yearly distribution of the total number of applications by USPC. USPC 002048000 denotes 2/48, and 002049100 denotes 2/49.1. This is a matter of USPC notation: the first three digits are the class number, and the remaining six digits divided by 1000 give the subclass; the two are placed to the left and right of "/" to form the USPC code as normally printed in publications.

TABLE 67

no dot | 1 dot | 2 dot | 3 dot | 4 dot | 01 | 02 | 03 | 04 | 05 | 06
002048000 | 002049100 | 002049200 | | | 4 | 1 | 6 | 3 | 1 | 8
002048000 | 002049100 | 002049200 | | | 4 | 1 | 6 | 3 | 1 | 8
002048000 | 002049100 | 002049200 | | | 4 | 1 | 6 | 3 | 1 | 8
002048000 | 002049100 | 002049200 | | | 4 | 1 | 6 | 3 | 1 | 8
002048000 | 002049100 | 002049300 | | | 0 | 1 | 1 | 0 | 1 | 0
002048000 | 002049100 | 002049300 | | | 0 | 1 | 1 | 0 | 1 | 0
002048000 | 002049100 | 002049300 | | | 0 | 1 | 1 | 0 | 1 | 0
002048000 | 002049100 | 002049300 | | | 0 | 1 | 1 | 0 | 1 | 0
002048000 | 002049100 | 002049400 | 002049500 | | 0 | 1 | 3 | 1 | 1 | 2
002048000 | 002049100 | 002049400 | 002049500 | | 0 | 1 | 3 | 1 | 1 | 2
002048000 | 002049100 | 002049400 | 002049500 | | 0 | 1 | 3 | 1 | 1 | 2
002048000 | 002049100 | 002049400 | | | 1 | 2 | 6 | 0 | 2 | 4
002048000 | 002049100 | 002049400 | | | 1 | 2 | 6 | 0 | 2 | 4

To find the number of applications/registrations for a specific USPC code obtained above, the multi-dimensionally calculated count, or its sum by year or period, can be read from the column for that USPC level (class to n dot).
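The USPC notation rule stated above (first three digits as the class number, remaining six digits divided by 1000 as the subclass) can be sketched as follows. The function name `format_uspc` is hypothetical, and integer arithmetic is used instead of floating-point division only to avoid rounding artifacts.

```python
def format_uspc(code: str) -> str:
    """Convert a 9-digit stored USPC code such as '002049100' to printed form '2/49.1'."""
    if len(code) != 9 or not code.isdigit():
        raise ValueError(f"expected a 9-digit USPC code, got {code!r}")
    class_number = int(code[:3])                      # first three digits: class
    whole, fraction = divmod(int(code[3:]), 1000)     # last six digits divided by 1000
    if fraction == 0:
        subclass = str(whole)
    else:
        # Keep leading zeros of the fractional part, drop trailing zeros.
        subclass = f"{whole}.{str(fraction).zfill(3).rstrip('0')}"
    return f"{class_number}/{subclass}"


printed = [format_uspc(code) for code in ("002048000", "002049100", "002049200")]
```

Applied to the codes of Table 67, this gives the printed forms used in the text, for example 2/48 and 2/49.1.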

Meanwhile, the multi-dimensional analysis operation result table generation module 402 may generate multi-dimensional analysis operation result table data as shown in Table 68 in order to perform analysis by applicant and by multi-level patent classification code.

TABLE 68

IAppName | C3 | C4 | C5 | C6 | C7 | C8 | 05
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/10 | 2
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/12 | 1
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/31 | C12N 15/33 | 1
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | 2
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/10 | 2
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/12 | 1
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/31 | 1
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | 2
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/10 | 2
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | 4
A | C12N | C12N 15/09 | C12N 15/00 | 6
A | C12N | C12N 15/00 | 6
A | C12N | 6
B

When the top applicants for C12N 15/00 are to be extracted from data such as Table 68, note that the rows with GID 15 hold, for C12N 15/00 (the IPC main group level, abbreviated as the C4 level), the document count rolled up over the code itself and all of its lower patent classification codes. Applying the conditional expression "GID = 15 and IPC = C12N 15/00 and IPC level = C4" to the table therefore yields the correct number of documents per applicant (the number of documents listed in the rows with GID 15).
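The roll-up of document counts to higher patent classification codes can be sketched as follows. The `PARENT` map, the documents, and the applicant names are hypothetical illustration data, and the GID column of the actual table is not modeled; the sketch only shows how a document is counted once, per applicant, under its own code and every higher code.

```python
from collections import defaultdict

# Hypothetical excerpt of a classification hierarchy: code -> parent code.
PARENT = {
    "C12N 15/10": "C12N 15/09",
    "C12N 15/11": "C12N 15/09",
    "C12N 15/12": "C12N 15/11",
    "C12N 15/09": "C12N 15/00",
    "C12N 15/00": "C12N",
}


def with_ancestors(code):
    """Return the code followed by all of its higher codes."""
    chain = [code]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


# Hypothetical documents: (document id, applicant, classification codes).
documents = [
    ("D1", "A", ["C12N 15/10"]),
    ("D2", "A", ["C12N 15/11", "C12N 15/12"]),
    ("D3", "A", ["C12N 15/12"]),
    ("D4", "B", ["C12N 15/00"]),
]

# Sets of document ids eliminate duplicates when a document reaches
# the same higher code through more than one of its own codes.
docs_by_key = defaultdict(set)
for doc_id, applicant, codes in documents:
    for code in codes:
        for node in with_ancestors(code):
            docs_by_key[(applicant, node)].add(doc_id)

rolled_up = {key: len(docs) for key, docs in docs_by_key.items()}
```

Looking up `rolled_up[("A", "C12N 15/00")]` then plays the role of the conditional expression in the text: it returns the applicant's document count for the main group including all lower codes.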

Once the total amount count information described above is available, multi-dimensional analysis operation result table data can be generated for each analysis index, such as occupancy rate, concentration rate, and activity rate. Examples of such data are presented below.

Table 69 below is multi-dimensional analysis operation result table data related to the occupancy rate.

TABLE 69

IAppName | C3 | C4 | C5 | C6 | C7 | C8 | 05
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/10 | 5.88%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/12 | 2.41%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/31 | C12N 15/33 | 1.32%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | 5.56%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/10 | 5.88%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/12 | 2.41%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/31 | 1.03%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | 1.85%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/10 | 5.88%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | 1.15%
A | C12N | C12N 15/09 | C12N 15/00 | 0.96%
A | C12N | C12N 15/00 | 0.92%
A | C12N | 0.47%
B

Table 70 below is multi-dimensional analysis operation result table data related to the concentration rate.

TABLE 70

IAppName | C3 | C4 | C5 | C6 | C7 | C8 | 05
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/10 | 18.18%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/12 | 9.09%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/31 | C12N 15/33 | 9.09%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | 18.18%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/10 | 18.18%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/12 | 9.09%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/31 | 9.09%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | 18.18%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/10 | 18.18%
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | 36.36%
A | C12N | C12N 15/09 | C12N 15/00 | 54.55%
A | C12N | C12N 15/00 | 54.55%
A | C12N | 54.55%
B

Table 71 below is multi-dimensional analysis operation result table data related to the activity rate.

TABLE 71

IAppName | C3 | C4 | C5 | C6 | C7 | C8 | 05
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/10 | 909
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/12 | 454.5
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/31 | C12N 15/33 | 227.25
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | 909
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/10 | 909
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/12 | 454.5
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | C12N 15/31 | 181.8
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | 303
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/10 | 909
A | C12N | C12N 15/09 | C12N 15/00 | C12N 15/11 | 213.8824
A | C12N | C12N 15/09 | C12N 15/00 | 175.9677
A | C12N | C12N 15/00 | 170.4688
A | C12N | 86.5873
B

The above has described the multi-dimensional analysis operation result table data generated per applicant by the multi-dimensional analysis operation result table generation module 402 and the method of extracting the necessary information from that data. The module 402 can generate information such as the total amount, occupancy rate, concentration rate, and activity rate for each patent technology classification symbol level per inventor in the same manner as it generates the data per applicant.

Multi-dimensional analysis operation result table for fusion analysis

Hereinafter, the multi-dimensional analysis operation result table data generated by the multi-dimensional analysis operation result table generation module 402 for fusion analysis will be described.

For the fusion analysis of the present invention, the multi-dimensional analysis operation result table generation module 402 generates multi-dimensional analysis operation result table data as follows. The module 402 receives an arbitrary patent document set that has been input, set, or determined, and, for the documents in the set that carry a plurality of patent classification symbols of the same type, generates data as shown in Table 72 for the patent classification symbols of that type. (For example, when a document carries two or more IPC codes, or two or more IPC codes together with USPC codes, the two or more IPC codes are processed among themselves; two or more USPC codes are likewise processed among themselves into the multi-dimensional analysis operation result table for USPC.)

TABLE 72

Document number | AppName | Main C1 | Main Cn | Sub 1 C1 | Sub 1 Cm | Sub i C1 | Sub i Ck | date | Etc
1 | A
2 | B

One patent document carries patent classification symbols of one or more types, such as IPC, USPC, FI, FT, and ECLA, and each type carries at least one symbol. Table 72 shows that, for each document number, there is at least one Main patent classification symbol and there may optionally be Sub patent classification symbols. For each patent classification symbol of the document, the table lists the symbol itself together with all of its higher patent classification symbols. For example, if the document's own Main patent classification symbol is an IPC 2-dot subgroup, the multi-dimensional analysis operation result table generation module 402 refers to the patent classification code master DB 203, determines that the symbol is at the C6 level, and enters it in the C6 column. It then enters the 1-dot-level higher symbol in the C5 column to its left and the main-group-level symbol in the C4 column, and continues in this way up to the highest level. Each Sub IPC included in the patent document is processed in the same way as the Main IPC. Other bibliographic data, such as the applicant, can be obtained through the document number and are therefore optional elements.

Korean Patent Application No. 10-2005-0111868 is assigned H04B 7/26 and H04B 7/15 under the IPC as of January 2006. This is illustrated in Table 73. (For convenience, the subclass part is omitted from the patent classification codes at and below the main group.)

TABLE 73

Document number | AppName | Main C1 | C2 | C3 | C4 | C5 | C6 | Sub C1 | C2 | C3 | C4 | C5 | C6 | Etc
10-2005-0111868 | Samsung | H | H04 | H04B | 7/00 | 7/24 | 7/26 | H | H04 | H04B | 7/00 | 7/14 | 7/15 |

If there are two or more Sub IPCs, the same processing is performed in parallel to the right of the H04B 7/15-related information, as shown in Table 74 below. This is described by way of example using Korean Patent Application No. 10-2006-0012606, which is assigned the patent classification codes H04B 7/04, H04B 7/155, and H04Q 7/30 under the IPC as of January 2006. (AppName is omitted for convenience of notation, and H04Q 7/30 is shown on the line below.)

TABLE 74

Document number | Main C1 | C2 | C3 | C4 | C5 | C6 | Sub 1 C1 | C2 | C3 | C4 | C5 | C6 | C7
10-2006-0012606 | H | H04 | H04B | 7/00 | 7/02 | 7/04 | H | H04 | H04B | 7/00 | 7/14 | 7/15 | 7/155
Sub 2 C1 | C2 | C3 | C4 | C5 | C6 | C7
H | H04 | H04Q | 7/00 | 7/20 | 7/30 | ..

Table 75 below shows Korean Patent Application No. 10-2005-0042032 which includes H04B 7/02 and H04B 7/14 as patent classification code information.

Table 75

Document number | AppName | Main: C1 | C2 | C3 | C4 | C5 | C6 | Sub: C1 | C2 | C3 | C4 | C5 | C6 | Etc
10-2005-0042032 | SK Telecom | H | H04 | H04B | 7/00 | 7/02 | | H | H04 | H04B | 7/00 | 7/14 | |

It will be apparent to those skilled in the art that the multi-dimensional analysis operation result table generation module 402 can generate multi-dimensional analysis operation result table data of the same form for USPC, FT, and the like.

The multi-dimensional analysis operation result table generation module 402 generates patent classification code pair information, as shown in Table 76 below, from the level-by-level hierarchy information of the plural patent classification codes. The types of patent classification code pair were described above in connection with the same patent classification code relationship preprocessing module. Table 76 includes only one bibliographic item, the filing date, but further items (for example, a registration date) may be added. For co-applicants or co-inventors, however, a separate row must be created for each, as noted above. That is, records that are identical except for the applicant or the inventor are generated, and this information makes fusion analysis by applicant and fusion analysis by inventor possible.

To illustrate this model, assume that there are only three patent documents: Korean Patent Application Nos. 10-2005-0111868, 10-2006-0012606, and 10-2005-0042032. Processing every obtained document in the same way would of course generate the following data for all patent classification code combinations (Ai, Bj) over the whole document set. The number of combinations is 4 pairs for Korean Patent Application No. 10-2005-0111868; 22 (Main IPC, Sub IPC) pairs (6 + 16) and 20 (Sub IPC, Sub IPC) pairs for Korean Patent Application No. 10-2006-0012606; and 1 pair for Korean Patent Application No. 10-2005-0042032, for a total of 47 pairs across the three applications. All 47 pairs could be listed, but for convenience of description Table 76 below shows only a representative subset that conveys the idea of the present invention.
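The pair counts quoted above (4, 22 + 20, and 1, totalling 47) can be reproduced with a short sketch. The pairing rule used here, namely pairing every level of one code with every level of the other below their shared ancestors, is an assumption inferred from those counts rather than a rule stated in this specification; the helper names `path` and `fusion_pairs` are likewise hypothetical.

```python
from itertools import combinations, product


def path(subclass, *groups):
    """Build a full hierarchy path, e.g. path("H04B", "7/00", "7/24")."""
    return [subclass[0], subclass[:3], subclass] + [f"{subclass} {g}" for g in groups]


def fusion_pairs(a, b):
    """Pair every level of `a` with every level of `b` below their shared ancestors."""
    shared = 0
    while shared < min(len(a), len(b)) and a[shared] == b[shared]:
        shared += 1
    return list(product(a[shared:], b[shared:]))


# Hierarchy paths as listed in Tables 73 to 75 (main IPC first, then sub IPCs).
DOCS = {
    "10-2005-0111868": [path("H04B", "7/00", "7/24", "7/26"),
                        path("H04B", "7/00", "7/14", "7/15")],
    "10-2006-0012606": [path("H04B", "7/00", "7/02", "7/04"),
                        path("H04B", "7/00", "7/14", "7/15", "7/155"),
                        path("H04Q", "7/00", "7/20", "7/30")],
    "10-2005-0042032": [path("H04B", "7/00", "7/02"),
                        path("H04B", "7/00", "7/14")],
}

# Count the pairs produced by every two codes of each document.
pair_counts = {
    doc: sum(len(fusion_pairs(a, b)) for a, b in combinations(codes, 2))
    for doc, codes in DOCS.items()
}
total_pairs = sum(pair_counts.values())
print(pair_counts, total_pairs)
```

With this rule, codes that share the subclass H04B and main group 7/00 are paired only below the main group, while H04Q and H04B codes are paired from the subclass level down, which is why the H04Q rows of Table 76 include pairs such as (H04Q, H04B).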

Table 76

Patent classification code combination (Ai, Bj) | Document number | Applicant | Filing date | Apply weight policy 1 | Apply weight policy 2
(H04B 7/15, H04B 7/26) | 10-2005-0111868 | Samsung | 2006.12.04 | 1 | 1
(H04B 7/15, H04B 7/24) | 10-2005-0111868 | Samsung | 2006.12.04 | 1 | 1
(H04B 7/14, H04B 7/26) | 10-2005-0111868 | Samsung | 2006.12.04 | 1 | 1
(H04B 7/14, H04B 7/24) | 10-2005-0111868 | Samsung | 2006.12.04 | 1 | 1
(H04B 7/155, H04B 7/04) | 10-2006-0012606 | Samsung | 2006.12.04 | 1/3 | 0.75 / 2
(H04B 7/155, H04B 7/02) | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.75 / 2
(H04B 7/15, H04B 7/04) | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.75 / 2
(H04B 7/15, H04B 7/02) | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.75 / 2
(H04B 7/14, H04B 7/04) | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.75 / 2
(H04B 7/14, H04B 7/02) | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.75 / 2
(H04B 7/14, H04B 7/02) | 10-2005-0042032 | SK Telecom | 2006.12.04 | 1 | 1
(H04Q 7/00, H04B 7/00) | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.25 / 1
(H04Q 7/00, H04B) | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.25 / 1
(H04Q, H04B) | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.25 / 1

When data is generated in the combined (Ai, Bj) form, sorting, index processing, and/or roll-up operations may be hindered. It is therefore preferable to hold the two patent classification codes of each combination in separate fields, as shown in Table 77 below, and the multi-dimensional analysis operation result table generation module 402 may generate the patent classification code pair information in that form.

Table 77

From Main IPC | From Sub IPC | Document number | Applicant | Filing date | Apply weight policy 1 | Apply weight policy 2
H04B 7/15 | H04B 7/26 | 10-2005-0111868 | Samsung | 2006.12.04 | 1 | 1
H04B 7/15 | H04B 7/24 | 10-2005-0111868 | Samsung | 2006.12.04 | 1 | 1
H04B 7/14 | H04B 7/26 | 10-2005-0111868 | Samsung | 2006.12.04 | 1 | 1
H04B 7/14 | H04B 7/24 | 10-2005-0111868 | Samsung | 2006.12.04 | 1 | 1
H04B 7/155 | H04B 7/04 | 10-2006-0012606 | Samsung | 2006.12.04 | 1/3 | 0.75 / 2
H04B 7/155 | H04B 7/02 | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.75 / 2
H04B 7/15 | H04B 7/04 | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.75 / 2
H04B 7/15 | H04B 7/02 | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.75 / 2
H04B 7/14 | H04B 7/04 | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.75 / 2
H04B 7/14 | H04B 7/02 | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.75 / 2
H04B 7/14 | H04B 7/02 | 10-2005-0042032 | SK Telecom | 2006.12.04 | 1 | 1
H04Q 7/00 | H04B 7/00 | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.25 / 1
H04Q 7/00 | H04B | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.25 / 1
H04Q | H04B | 10-2006-0012606 | Samsung | 2006.02.09 | 1/3 | 0.25 / 1

For the purpose of statistical processing and analysis, the multi-dimensional analysis operation result table generation module 402 may further expand the patent classification code pair information into the multi-dimensional analysis operation result table shown in Table 78 below. Although the document numbers are not shown, the rows follow the same order as the document numbers in Table 77. (It is also desirable to include the other bibliographic items associated with each document number, such as the applicant, inventor, filing date, and registration date. The M/S field is marked M when the code originates from the main IPC and S when it originates from a sub IPC. The distinction is kept because it is preferred that fusion be analyzed around the main IPC as far as possible.)

Table 78

C3 C4 C5 C6 M / S Document number M / S C1 C2 C3 C4
H04B 7/00 7/14 7/15 S H04B 7/15 H04B 7/26 M H H04 H04B 7/00
H04B 7/00 7/14 7/15 S H04B 7/15 H04B 7/24 M H H04 H04B 7/00
H04B 7/00 7/14 S H04B 7/14 H04B 7/26 M H H04 H04B 7/00
H04B 7/00 7/14 S H04B 7/14 H04B 7/24 M H H04 H04B 7/00
H04B 7/00 7/14 7/15 7/155 S H04B 7/155 H04B 7/04 M H H04 H04B 7/00
H04B 7/00 7/14 7/15 7/155 S H04B 7/155 H04B 7/02 M H H04 H04B 7/00
H04B 7/00 7/14 7/15 S H04B 7/15 H04B 7/04 M H H04 H04B 7/00
H04B 7/00 7/14 7/15 S H04B 7/15 H04B 7/02 M H H04 H04B 7/00
H04B 7/00 7/14 S H04B 7/14 H04B 7/04 M H H04 H04B 7/00
H04B 7/00 7/14 S H04B 7/14 H04B 7/02 M H H04 H04B 7/00
H04B 7/00 7/14 S H04B 7/14 H04B 7/02 M H H04 H04B 7/00
H04Q 7/00 S H04Q 7/00 H04B 7/00 M H H04 H04B 7/00
H04Q 7/00 S H04Q 7/00 H04B M H H04 H04B
H04Q S H04Q H04B M H H04 H04B

When result table data for the multi-dimensional analysis, including the fusion information, exists for all patent documents containing two or more patent classification codes, the following information can be extracted or calculated.

The first is fusion analysis that uses no subject information, performed when a patent classification code is obtained. When one patent classification code is obtained or given, a ranking can be found of the patent classification codes that fuse well with it (that is, are often paired with it). In the tables above, for example, the codes that fuse with H04B 7/04 can be obtained at each IPC level, such as H04B 7/155 at the C7 level (IPC 3-dot level) and H04B 7/15 at the C6 level, and statistics/analysis information can be generated for them (by group by, count, and rank commands) to yield rank information on the IPCs with a high fusion frequency. For example, fusion-related ranking information can be generated such as "The C6 level IPC that fuses best with H04B 7/04 is H04B 7/15, and the C6 level IPC that fuses next best is H04Q 7/30." The specific documents in which the fusion occurs may be identified through the document number, or they may be obtained by querying the search engine or the DBMS 201 for the IPC pair in which the fusion has occurred. Statistics/analysis information on the patent classification codes that fuse well with the obtained code can also be generated by year/period and by IPC level (IPC subclass, main group, 1 dot, 2 dot, ...). To do so, the obtained patent classification code (e.g., H04B 7/04) is looked up on one side (preferably the main IPC side), a level among C1 to Cn is determined in all records (rows) containing the found code, and statistics are computed (by group by, count, and rank commands) over all patent classification codes present at the determined level. If period/date information is included in the result table data for the multi-dimensional analysis, statistics/analysis data on the patent classification codes that fuse well with the obtained code can be generated for each period (e.g., by year).
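The group by, count, and rank processing mentioned above can be sketched in a few lines. The records below are hypothetical and are assumed to have already been restricted to one IPC level; the function name `rank_partners` is an illustrative assumption, and a real system would presumably issue the equivalent query against the result table.

```python
from collections import Counter

# Hypothetical pair records: (code A, code B, document number).
ROWS = [
    ("H04B 7/15", "H04B 7/04", "doc-1"),
    ("H04B 7/15", "H04B 7/04", "doc-2"),
    ("H04Q 7/30", "H04B 7/04", "doc-3"),
    ("H04B 7/15", "H04B 7/26", "doc-4"),
]


def rank_partners(rows, given):
    """Count how often each other code is paired with `given`, most frequent first."""
    counts = Counter()
    for a, b, _doc in rows:
        if given == a:
            counts[b] += 1
        elif given == b:
            counts[a] += 1
    return counts.most_common()


ranking = rank_partners(ROWS, "H04B 7/04")
print(ranking)
```

Adding the filing year to each record and counting per (partner, year) would give the period-by-period statistics in the same way.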

Although the fusion analysis above was described for one patent classification code, when two or more patent classification codes are obtained, the codes with a high fusion frequency are found for each of the obtained codes and the results are summed. Depending on the policy, a single document may then be counted more than once; the statistics/analysis information may be generated either while allowing this duplication or after removing it. In the latter case, the obtained patent classification codes are queried, a deduplication command such as distinct is applied to the document numbers in the records returned by the query, and statistics/analysis information on the well-fused patent classification codes is then generated by IPC level using only the distinct records. The fundamental reason this duplication arises is that, when fusion information is processed, a fusion between lower-level patent classification codes is naturally also treated as a fusion between their respective upper-level codes. The obtained patent classification codes need not be at the same IPC level, and the idea naturally applies to any two or more patent classification codes selected anywhere in the IPC classification scheme.
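The difference between the two counting policies can be illustrated with rows taken from Table 77. The sketch below is only one possible reading: it treats the second field as the side on which the obtained codes are looked up, and the function name `partner_counts` and its `distinct_docs` flag are hypothetical.

```python
from collections import defaultdict

# (partner code, queried-side code, document number), taken from Table 77.
ROWS = [
    ("H04B 7/155", "H04B 7/04", "10-2006-0012606"),
    ("H04B 7/155", "H04B 7/02", "10-2006-0012606"),
    ("H04B 7/15", "H04B 7/04", "10-2006-0012606"),
    ("H04B 7/15", "H04B 7/02", "10-2006-0012606"),
    ("H04B 7/14", "H04B 7/04", "10-2006-0012606"),
    ("H04B 7/14", "H04B 7/02", "10-2006-0012606"),
    ("H04B 7/14", "H04B 7/02", "10-2005-0042032"),
]


def partner_counts(rows, given_codes, distinct_docs):
    """Count fusion partners of the given codes, optionally once per document."""
    hits = defaultdict(list)
    for partner, code, doc in rows:
        if code in given_codes:
            hits[partner].append(doc)
    return {p: len(set(d)) if distinct_docs else len(d) for p, d in hits.items()}


given = {"H04B 7/04", "H04B 7/02"}
with_duplicates = partner_counts(ROWS, given, distinct_docs=False)
deduplicated = partner_counts(ROWS, given, distinct_docs=True)
print(with_duplicates, deduplicated)
```

Because H04B 7/02 is the parent of H04B 7/04, document 10-2006-0012606 is hit once for each of them; counting distinct document numbers collapses those hits into one.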

Second is fusion analysis that uses subject information, performed when a patent classification code is obtained. When there is a fusion occurrence document set associated with the patent classification codes that fuse best with the obtained code (one or several), statistics/analysis information can be generated for each bibliographic field by using the bibliographic information tied to the document numbers of that set. For example, the set of patent documents filed with the Korean Patent Office by Samsung Electronics Co., Ltd. that contain a patent classification code fused with H04B 7/00 may be taken as the document set to be analyzed, and the IPCs in the set may be broken down level by level (IPC subclass, main group, 1 dot, 2 dot, ...; sub patent classification codes are of course also included). This information can also be provided for each inventor of Samsung Electronics Co., Ltd.

Third is the analysis of fusion within a given document set. The given document set may be generated by combining one or more of the fields that make up the bibliographic matters, such as 1) applicant, 2) inventor, 3) patent classification code, 4) country, and 5) date. Once the document set is obtained, it is queried against the operation result table for multi-dimensional analysis including the fusion information, which identifies the documents in the set in which fusion has occurred (that is, the document numbers and the records containing them). For the document set in which fusion has occurred, the following information can be generated. 1) The most frequent IPCs can be extracted at each IPC level, and statistics/analysis information can be generated on the IPCs of each level that have fused with each extracted IPC. The information consists of counts and, where count values exist, derived values such as the rate of increase or decrease. (Ranking information such as the top inventors, the top applicants, and the most frequently fused patent classification codes can also be obtained.)

Fourth is statistics/analysis for fusion target discovery. A criterion is found within a given document set, and fusion statistics/analysis information is obtained for that criterion by querying the result table data for multi-dimensional analysis including fusion information, generated for all documents of a first or second country or for a subset of them (for example, the documents filed within the last seven years by filing date). The criterion may be, for example, the most frequent code at each patent classification level, or a code selected by total count, share, concentration rate, activity rate, or another analysis indicator, for a specific applicant or per applicant. More specifically, the main-group IPCs in which Samsung Electronics Co., Ltd. has a high concentration rate in Korea can be extracted, and the extracted IPCs are then used to query the result table data for multi-dimensional analysis including fusion information. The queried result table data may come from the first country (the country in which the criterion was created) or may be generated from a set of patent documents of a second country (a country unrelated to the creation of the criterion). In this way, the high-concentration main-group IPCs are extracted from the patent documents of Samsung Electronics in Korea, and the result table data generated from the patent documents of Japan or the United States is queried with them. This yields the patent classification codes at each level that fuse well (with a high fusion frequency) with the extracted IPCs in Japan or the United States.

The present invention has been described above in terms of generating the multi-dimensional analysis operation result table data through roll-up operations on patent classification codes over the multi-level patent classification code system. Whether or not roll-up is presupposed, information on the immediate child classifications of a given patent classification code can also be generated in real time, so that drill-down information on the child codes of a specific patent classification code can be provided to the user. This idea applies to fusion analysis as it is. Drill-down in fusion analysis is described in more detail below.

In Table 79 below, the reference IPC is an obtained or given IPC, and the target IPC is the most frequent or a high-frequency IPC obtained from a fusion occurrence document set. That set may be the entire fusion occurrence document set of the first or second country, a fusion occurrence document set extracted from a preset or obtained document set, or a preset or obtained fusion occurrence document set. The levels of the reference and target IPCs can of course be selected. (The reference IPC may itself be the most frequent or a high-frequency IPC obtained from any of these fusion occurrence document sets.) If the user selects an IPC level independently for the reference IPC and for the target IPC, the selection information is obtained and the results calculated for the selected levels are extracted and provided to the user.

Table 79

Rank | Reference IPC | Target IPC | IPC description | 00 | 01 | Recent | Sum
1 | B60C 3/00 (+) | H01B 1/00 (+) | | 11 | 52 | 4 |
2 | H03M 3/00 (+) | B29C 31/00 (+) | | 2 | 5 | 9 |

When the reference IPC B60C 3/00 in Table 79 is pressed to obtain drilled-down fusion statistics/analysis information, the system 1 of the present invention may generate drilled-down data as in Table 80 below and provide it to the user. (The drill-down shows only the code's own children under the IPC scheme, and among those only the IPCs in which fusion has occurred, not all of them.)

TABLE 80

Rank | Reference IPC | Target IPC | IPC description | 00 | 01 | Recent | Sum
1 | B60C 3/00 (+) | H01B 1/00 (+) | Tires characterized by cross section; Conductor or conductive object characterized by the conductive material | 11 | 52 | 4 | 67
 | B60C 3/04 (+) | | Characterized by the relative dimensions of the cross section | 5 | 30 | 3 | 40
 | B60C 3/06 (+) | | Asymmetric | 4 | 10 | 1 | 12
 | B60C 3/08 (+) | | Collapsible in storage or unused condition | 1 | 2 | 0 | 2
2 | H03M 3/00 (+) | B29C 31/00 (+) | | 2 | 5 | 9 | 16

(Note) 52 = counts from the children (30 + 10 + 2) + the count of the code itself (10)
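The roll-up arithmetic in the note can be written out as a small check. The function name `rolled_up_count` is a hypothetical label; the numbers are the year-01 values from Table 80 and the note above.

```python
def rolled_up_count(own, child_counts):
    """Rolled-up count of a code: its own documents plus those of its children."""
    return own + sum(child_counts)


# Year-01 values: B60C 3/04, 3/06 and 3/08 contribute 30, 10 and 2,
# and B60C 3/00 itself contributes 10.
total_01 = rolled_up_count(10, [30, 10, 2])
print(total_01)
```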

When the user of the system presses H01B 1/00 anywhere in the result data as shown in Table 80, the system 1 may generate and provide drilled down data in a format as shown in Table 81 below.

Table 81

Rank | Reference IPC | Target IPC | IPC description | 00 | 01 | Recent | Sum
1 | B60C 3/00 (+) | H01B 1/00 (+) | Tires characterized by cross section | 11 | 52 | 4 |
 | | H01B 1/02 (+) | Mainly of metals or alloys | 3 | 35 | 2 |
 | | H01B 1/04 (+) | Mainly consisting of carbon-silicon compounds, carbon, or silicon | 2 | 10 | 0 |
 | | H01B 1/06 (+) | Mainly consisting of other nonmetallic substances | 2 | 0 | 0 |
2 | H03M 3/00 (+) | B29C 31/00 (+) | | 2 | 5 | 9 |

The system 1 may also provide the following result data to the user in relation to drill-down. The reference IPCs in Table 82 below are obtained or given IPCs, and the target IPCs ranked 1st to nth are the most frequent or high-frequency IPCs obtained from a fusion occurrence document set: the entire fusion occurrence document set of the first or second country, a fusion occurrence document set extracted from a preset or obtained document set, or a preset or obtained fusion occurrence document set. The levels of the reference and target IPCs can of course be selected. (The reference IPCs may themselves be the most frequent or high-frequency IPCs obtained from any of these fusion occurrence document sets.)

Table 82

Rank | Reference IPC | IPC description | 1st | 2nd | 3rd
1 | B60C 3/00 (+) | Tires characterized by cross section | H01B 1/00 (67) (*) | G03B 5/00 (15) (*) | C07B 31/00 (10) (*)
2 | H03M 3/00 (+) | | | |

In this case, when H01B 1/00 (*) is pressed in the result data as shown in Table 82, it may be desirable to obtain a result of fusion analysis of the reference IPC and the target IPC as shown in Table 83 below.

TABLE 83

Rank | Reference IPC | Target IPC | IPC description | 00 | 01 | Recent | Sum
1 | B60C 3/00 (+) | H01B 1/00 (+) | Tires characterized by cross section (more); Conductor or conductive object characterized by the conductive material | 11 | 52 | 4 | 67
 | | H01B 1/02 (+) | . Mainly of metals or alloys | 3 | 35 | 2 | 40
 | | H01B 1/04 (+) | Mainly consisting of carbon-silicon compounds, carbon, or silicon | 2 | 10 | 0 | 12
 | | H01B 1/06 (+) | . Mainly consisting of other nonmetallic substances | 2 | 0 | 0 | 2

It is desirable for every IPC shown to users to be accompanied by an IPC description. Since users cannot be expected to remember all IPCs, it is preferable that the title of each IPC be shown first as its description. IPC titles, however, are brief and mostly state only the distinguishing feature. Therefore, when the user clicks the description of an IPC, it is preferable as a second step to pop up the titles of all the upper IPCs of that IPC, or of the upper IPCs up to a certain level (for example, the class level).

The way in which the multi-dimensional analysis operation result table generation module 402 generates the multi-dimensional analysis operation result table data for fusion analysis has been described with reference to the IPC. It will be apparent to those skilled in the art that the method applies in an equivalent manner to other patent classification codes such as USPC, FT, FI, and ECLA.

Meanwhile, fusion analysis has been described in terms of combinations (Ai, Bj) of two patent classification codes, but it will be apparent to those skilled in the art that the present invention applies as it is to combinations of three or more codes (Ai, Bj, Ck, ...). In fusion analysis based on combinations of two codes, the fusion occurrence patent document set consists of patent documents containing two or more patent classification codes of the same type; in fusion analysis based on combinations of three codes, it consists of patent documents containing three or more codes of the same type. With combinations of two codes, a document containing n patent classification codes yields nC2 combinations (Ai, Bj); with combinations of three codes, it yields nC3 combinations (Ai, Bj, Ck), and each (Ai, Bj, Ck) can be processed as one unit in the same way as (Ai, Bj). In this case too, any one of Ai, Bj, and Ck may come from the main IPC, or all of them may come from sub IPCs.
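The nC2 and nC3 counts can be checked with the standard library. The sketch below enumerates combinations of the three codes of Korean Patent Application No. 10-2006-0012606; the function name `code_combinations` is an assumption, and expansion to upper-level codes is deliberately left out.

```python
from itertools import combinations
from math import comb


def code_combinations(codes, k):
    """All k-element combinations of the classification codes in one document."""
    return list(combinations(codes, k))


codes = ["H04B 7/04", "H04B 7/155", "H04Q 7/30"]
pairs = code_combinations(codes, 2)      # (Ai, Bj) units
triples = code_combinations(codes, 3)    # (Ai, Bj, Ck) units
print(len(pairs), len(triples))
```

Each tuple can then be stored and counted as one unit, exactly as a pair would be.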

Result table for multi-dimensional analysis for representative phrase analysis

The representative phrase generation method through the representative phrase extraction preprocessing module of the present invention has been described above. As described above, the representative phrase extraction preprocessing module generates data as shown in Table 84 below.

TABLE 84

The multi-dimensional analysis operation result table generation module 402 of the present invention obtains data as shown in Table 84 in order to generate multi-dimensional analysis operation result table data for representative phrase analysis. Using the obtained document numbers, it extracts the information contained in the bibliographic details of each document (applicant, patent classification code, inventor, date information, etc.) from the patent document master DB 202 (a join may be performed using the document number as the key value) and generates multi-dimensional analysis operation result table data of the form shown in Table 85. First, the patent classification codes may be extracted to generate the data below. For convenience, assume that document number #1 carries one IPC, H04B 7/15, and that document number #2 carries the IPCs H04B 7/26 and H04B 7/02.

TABLE 85

Phrase ID | Phrase | Field | Appearance | Document number | C1 | C2 | C3 | C4 | C5 | C6
1 | abc | D | 3 | #1 | H | H04 | H04B | 7/00 | 7/14 | 7/15
1 | abc | C | 2 | #1 | H | H04 | H04B | 7/00 | 7/14 | 7/15
1 | abc | D | 2 | #2 | H | H04 | H04B | 7/00 | 7/24 | 7/26
1 | abc | C | 1 | #2 | H | H04 | H04B | 7/00 | 7/24 | 7/26
1 | abc | D | 2 | #2 | H | H04 | H04B | 7/00 | 7/02 |
1 | abc | C | 1 | #2 | H | H04 | H04B | 7/00 | 7/02 |
2 | bcd | D | 1 | #1 | H | H04 | H04B | 7/00 | 7/14 | 7/15
2 | bcd | C | 1 | #1 | H | H04 | H04B | 7/00 | 7/14 | 7/15
3 | ac | D | 1 | #1 | H | H04 | H04B | 7/00 | 7/14 | 7/15

Document number #2 carries the IPCs H04B 7/26 and H04B 7/02, so each of its phrase records is duplicated, with one record per IPC. Although the above has been described with respect to the IPC, it will be apparent to those skilled in the art that the present invention applies in the same manner to USPC, FT, and the like.

Next, the applicant and date information may be extracted to generate data as shown in Table 86 below. For convenience, suppose that document number #1 has Applicant A and that document number #2 has Applicants B and C. In addition, assume that document #1 was filed in May 2005 and document #2 in January 2006. In this case, the multi-dimensional analysis operation result table generation module 402 of the present invention may generate multi-dimensional analysis operation result table data of the form shown in Table 86 below. (In the table below, 05/1Q refers to the first quarter of 2005.)

TABLE 86

Phrase ID | Phrase | Field | Appearance | Document number | 05 | 05/1Q | 05/2Q | 05/3Q | 05/4Q | 06 | 06/1Q | AppName
1 | abc | D | 3 | #1 | 1 | | 1 | | | | | A
1 | abc | C | 2 | #1 | 1 | | 1 | | | | | A
1 | abc | D | 2 | #2 | | | | | | 1 | 1 | B
1 | abc | C | 1 | #2 | | | | | | 1 | 1 | B
1 | abc | D | 2 | #2 | | | | | | 1 | 1 | C
1 | abc | C | 1 | #2 | | | | | | 1 | 1 | C
2 | bcd | D | 1 | #1 | 1 | | 1 | | | | | A
2 | bcd | C | 1 | #1 | 1 | | 1 | | | | | A
3 | ac | D | 1 | #1 | 1 | | 1 | | | | | A

Document No. #2 includes Applicants B and C, so each of its phrase records is duplicated, with one record per applicant.
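The year and quarter columns of Table 86 can be derived from a filing date as sketched below. The function name `period_columns` is hypothetical, and the label format simply follows the "05/1Q" notation used in this description; the day of the month is not given in the text, so the first of the month is assumed.

```python
from datetime import date


def period_columns(filing_date):
    """Return the (year, quarter) column labels under which a record is flagged."""
    quarter = (filing_date.month - 1) // 3 + 1
    yy = f"{filing_date.year % 100:02d}"
    return yy, f"{yy}/{quarter}Q"


doc1_columns = period_columns(date(2005, 5, 1))   # document #1, filed May 2005
doc2_columns = period_columns(date(2006, 1, 1))   # document #2, filed January 2006
print(doc1_columns, doc2_columns)
```

A record is then given the value 1 in both returned columns, so that counts can be summed by year or by quarter without further date arithmetic.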

The multi-dimensional analysis operation result table generation module 402 of the present invention may also generate the two sets of multi-dimensional analysis operation result table data related to the representative phrase as one integrated set. If translation is applied, multi-dimensional analysis operation result table data may be generated using, for each phrase, the phrase in each language for which language information exists. The AppName field holds the name of the applicant; an inventor field may be generated next to it, with one record generated per inventor. If there are two applicants and three inventors, one record without applicant and inventor information expands to 6 (2 * 3 = 6) records, each containing one applicant and one inventor.
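The 2 * 3 = 6 expansion is a Cartesian product of applicants and inventors. A minimal sketch follows, with hypothetical field names and placeholder inventor names:

```python
from itertools import product


def expand_record(base, applicants, inventors):
    """Create one record per (applicant, inventor) combination."""
    return [
        dict(base, applicant=a, inventor=i)
        for a, i in product(applicants, inventors)
    ]


base = {"phrase_id": 1, "phrase": "abc", "field": "D", "document": "#2"}
records = expand_record(base, ["B", "C"], ["inv-1", "inv-2", "inv-3"])
print(len(records))
```

Because every record keeps the same phrase and document fields, statistics grouped by applicant or by inventor can be computed directly, while counts over documents need deduplication on the document number.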

If there is data as shown in Table 85 and Table 86, the following information can be generated in relation to the representative phrase.

First, a set of phrases can be provided for 1) an applicant, an inventor, or an inventor of an applicant, 2) a specific level of a specific patent classification code (e.g., an IPC main group), 3) a specific period (year, quarter, month, etc.), and 4) a specific country. For example, a list can be extracted of the phrases that appear (in the claims, the abstract, or another specific field) in Korean patent application documents filed by Samsung Electronics Co., Ltd. in the first quarter of 2005 under IPC main group H04B 1/00. It will be apparent to those skilled in the art that this can be done with a simple SQL statement when result table data for multi-dimensional analysis such as Table 85 and/or Table 86 exists.

Second, a set of representative phrases can be extracted and provided for 1) an applicant, an inventor, or an inventor of an applicant, 2) a specific level of a specific patent classification code (e.g., an IPC main group), 3) a specific period (year, quarter, month, etc.), and 4) a specific country. Representative phrase extraction depends on the representative phrase extraction policy. Examples of such a policy include the following.

1) The value obtained for a specific extracted phrase by computing "appearance frequency probability = appearance frequency in the specific document set / appearance frequency in its upper document set * 100" falls within a predetermined first narrow range.

2) The appearance frequency probability of the specific phrase does not fall within the predetermined first narrow range but falls within a predetermined second wide range, and its rank by appearance frequency probability is within a preset ranking. This policy may be applied when policy 1) yields very few representative phrases for a specific document set.

3) The rate of increase or decrease in the appearance frequency of the specific extracted phrase in the specific document set falls within a predetermined first narrow range.
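Policies 1) and 2) can be sketched as follows. The range bounds, the rank cut-off, and the function names are all hypothetical; the specification only says that the ranges and the ranking are predetermined.

```python
def appearance_probability(freq_in_set, freq_in_upper_set):
    """Appearance frequency in the document set relative to its upper set, times 100."""
    if freq_in_upper_set <= 0:
        raise ValueError("the upper document set must contain the phrase")
    return freq_in_set / freq_in_upper_set * 100


def is_representative(prob, rank, narrow, wide, max_rank):
    """Apply policy 1), then fall back to policy 2)."""
    if narrow[0] <= prob <= narrow[1]:          # policy 1: first narrow range
        return True
    in_wide = wide[0] <= prob <= wide[1]        # policy 2: second wide range
    return in_wide and rank <= max_rank         # ... and within the preset ranking


NARROW, WIDE, MAX_RANK = (70, 100), (40, 100), 3   # illustrative values only

p = appearance_probability(30, 40)
print(p, is_representative(p, rank=5, narrow=NARROW, wide=WIDE, max_rank=MAX_RANK))
```

Policy 3) would use the same range check with the rate of increase or decrease of the appearance frequency in place of `prob`.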

The upper document set is referred to when the appearance frequency probability is calculated. There are two types of upper document set. 1) If the elements specifying the document set include at least one dimension, the upper document set is the one obtained by replacing a dimension value with its parent or with all of its ancestors (two or more dimensions may of course be raised to their parents or ancestors at the same time). 2) If the document set is specified by keywords alone, the upper document set is the entire document set (for a country-level document set, all documents of that country). The type 1) upper document set is described in more detail. For a document set specified by IPC H01L 1/00, the document set specified by H01L, or by the upper patent classification codes H01 and H, may be the upper document set. For the document set specified by Samsung Electronics Co., Ltd., the first quarter of 2000, and IPC H01L 1/00, the upper document set may be generated by i) an upper patent classification code, ii) the year 2000 or all years combined, iii) Samsung Electronics Co., Ltd. in all countries (the United States, Japan, the EU, etc.; this may presuppose translating the phrase into the language of each country), and so on. If a document set is created from a combination of specific keywords and dimensions (for example, the keyword "RFID tag" with the country, applicant, and year dimensions, such as Samsung Electronics Co., Ltd. in the year 2000), an upper document set can of course be generated by applying i) to v) above while keeping the keyword. It is preferable that the first narrow range and/or the second wide range be determined separately for each upper document set of i) to v).

When representative phrases are extracted from a document set whose specifying elements include at least one dimension, it is possible to drill down along that dimension as an axis. When drilling down, the representative phrases that satisfy a predetermined condition are extracted for each element of the dimension and the result is provided to users. Table 87 and Table 88 below show one embodiment of drill-down for representative phrases. The example in Table 87 shows representative phrases for year 00. If year 01 is clicked, the representative phrases for year 01 are extracted and shown in the same manner.

TABLE 87

Rank | Reference IPC | Target IPC | IPC description | 00 | 01 | 02 | 03 | 04 | 05 | 06 | Recent | Sum
1 | B60C 3/00 (+) | | Tires characterized by cross section | Slip, improve friction, tire resonance frequency, noise, electrostatic discharge | | | | | | | |
2 | H03M 3/00 (+) | | | Modulation, sigma delta, Pulse width Modulation, Red flag reset | | | | | | | |

Representative phrases are preferably displayed with different colors, font sizes, and so on according to their appearance-frequency probability. Clicking a representative phrase extracts the documents that contain it. Documents can be extracted in two ways: 1) if the multi-dimensional analysis result table data from which the representative phrase was extracted contains document numbers, the patent document master DB 202 is queried by document number and the corresponding documents are provided; or 2) the search engine is queried with a search expression that combines all of the dimensions involved with the representative phrase. As an example of the latter, if the representative phrase "noise" is selected and this phrase was extracted from the claims, the search expression is IPC = "B60C 3/00" AND Claim = "noise" AND Date = (20000101~20001231) AND Country = South Korea. To see US documents containing the phrase, "Country = USA" is substituted and the search engine is queried again. If the keyword is not in English, the basic language of US documents (that is, after determining the language of the keyword), the system (1) of course translates it into English using the translation system it includes and places the translated term in the position of "noise" in the search expression.
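The second extraction method can be sketched as follows. This is only an illustrative sketch: the function names, the dictionary of dimensions, the country codes, and the one-entry translation table are assumptions made for the example and are not part of the described system.

```python
def build_search_expression(dimensions, phrase, field="Claim"):
    """Join every dimension constraint and the representative phrase with AND."""
    terms = [f'{name} = "{value}"' for name, value in dimensions.items()]
    terms.append(f'{field} = "{phrase}"')
    return " AND ".join(terms)


def retarget(dimensions, phrase, country, translate):
    """Swap the country dimension and translate the phrase for that country."""
    new_dimensions = dict(dimensions, Country=country)
    return build_search_expression(new_dimensions, translate(phrase, country))


# Hypothetical stand-in for the translation system included in the system (1).
TRANSLATIONS = {("소음", "US"): "noise"}


def translate(phrase, country):
    return TRANSLATIONS.get((phrase, country), phrase)


dims = {"IPC": "B60C 3/00", "Date": "20000101~20001231", "Country": "KR"}
kr_expr = build_search_expression(dims, "소음")
us_expr = retarget(dims, "소음", "US", translate)
print(kr_expr)
print(us_expr)
```

Keeping the dimensions in a mapping makes the country substitution a one-field change, which mirrors the substitution of "Country = USA" described above.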

To extract drilled-down representative phrases, the user presses B60C 3/00, the reference IPC in Table 87 (pressing the (+) sign would also work; this is a matter of interface design). The system (1) of the present invention then generates drilled-down data such as that in Table 88 and provides it to the user. (Drill-down proceeds only to the codes directly below the selected code in the IPC scheme, and of course not every lower IPC appears. For example, if B60C 3/02 has no documents or no representative phrase, no representative phrase is presented for that IPC. The representative phrases corresponding to B60C 3/00 itself may be shown or hidden.)

TABLE 88

Columns: Rank; Reference IPC; Target IPC; IPC description; 00; 01; Recent; Total
1; B60C 3/00 (+); Tires characterized by cross section; Slip, improve friction, tire resonance frequency, noise, electrostatic discharge
 ; B60C 3/04 (+); Cross section Relative law Characterized; Fixed stability, Radial Tires, compressed air
 ; B60C 3/06 (+); Asymmetric; Abrasion Resistant, Cornering, Asymmetrical Sidewalls, Asymmetric Radial tire
 ; B60C 3/08 (+); Collapsible in storage or unused condition; Rim Escape, Fair Tire Carriers, Storage Devices
2; H03M 3/00 (+); Differential modulation

The above example is only one embodiment of presenting representative phrases; representative phrases can be extracted and provided wherever an analysis result of the present invention is presented. This is because every analysis result of the present invention corresponds to a document set, and the representative phrases of the present invention are generated from that document set (the patent documents constituting it).

The multi-dimensional analysis operation result table generation module 402 of the present invention generates a multi-dimensional (n-dimensional) cube by performing multi-dimensional operations on at least one selected dimension. One characteristic of cube generation in the present invention is that, when a patent classification code is included as a dimension, the rollup value for a given patent classification code is generated by considering both that code and its upper patent classification codes. When rollup values are generated this way, the rollup value obtained for any patent classification code reflects the values of all of its lower patent classification codes. That is, when a patent document carries a patent classification code, the module 402 reflects the processed value not only for that code but also for its upper patent classification codes. For example, if document #1 is assigned IPC H04B 7/06, then when the operation result table for multi-dimensional cube analysis is generated from this document, a count of 1 is given to H04B 7/06 and also to H04B 7/04 and H04B 7/02, which lie immediately above it. The count is of course also given to the higher levels, H04B 7/00 and above. To do this, as in the tables above, for each given patent classification code (e.g. IPC), the code and its upper patent classification codes are arranged in a single row, and each cell corresponding to an upper patent classification code is calculated by reflecting the values derived from the patent classification codes that roll up into that cell. This rollup method, which is unique to the present invention, is described repeatedly in connection with the multi-dimensional analysis operation result table generation module 402 and elsewhere in this specification.
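The rollup described above can be illustrated with a minimal sketch. The parent map below is a hand-written, hypothetical slice of the IPC hierarchy that follows the H04B 7/06 example, and counting each document at most once per code is an assumption of this sketch rather than something the specification states.

```python
from collections import Counter

# Hypothetical parent map: each code points to the code immediately above it.
PARENT = {
    "H04B 7/06": "H04B 7/04",
    "H04B 7/04": "H04B 7/02",
    "H04B 7/02": "H04B 7/00",
    "H04B 7/00": "H04B",
    "H04B": "H04",
    "H04": "H",
}


def code_and_upper_codes(code):
    """Return the code followed by every upper code up to the section."""
    chain = [code]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def rollup_counts(doc_codes):
    """Count documents per code, crediting each document to all upper codes."""
    counts = Counter()
    for codes in doc_codes.values():
        credited = set()
        for code in codes:
            credited.update(code_and_upper_codes(code))
        # Assumption: a document contributes at most 1 to any single code.
        for code in credited:
            counts[code] += 1
    return counts


counts = rollup_counts({"#1": ["H04B 7/06"], "#2": ["H04B 7/02"]})
print(counts["H04B 7/06"], counts["H04B 7/02"], counts["H"])
```

With this arrangement, the value read for any code already includes the documents classified under its lower codes, so no further aggregation is needed at query time.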

Self-Set-Based Self-Set Analysis and Self-Set-Based Other-Set Analysis Types

The analysis module of the present invention consists of a set of analysis modules, one for each analysis topic. Each topic-specific analysis module includes an analysis query expression corresponding to its topic in order to generate the analysis result for that topic. The module queries the multi-dimensional analysis operation result table generated by the multi-dimensional analysis operation result table generation module 402 with this analysis query expression and obtains the analysis result for the target topic.

The analysis result generated by the analysis module of the present invention can relate to the initially determined analysis target document set in two ways.

The first is the "self-set-based self-set analysis" type, in which the analysis result comes from the analysis target document set itself. Examples include the number of applications, share, concentration rate, and activity rate of each applicant by year at each IPC level; the same indicators for each applicant at each level of technical field (IPC); and citation data by applicant or by individual inventor (by technical field). Here, the most frequent technical field (IPC) of a specific applicant is extracted from that applicant's own document set, the extracted most frequent technical field (IPC) becomes the criterion, and analysis results such as the number of applications and share by year are generated according to that criterion. The criterion is thus generated from the applicant's own document set, and the final analysis result according to the criterion is also the applicant's own number of applications by year, so the analysis is an analysis of the self set. The "self-set-based self-set analysis" type is characterized in that the final analysis result to be output can be generated from the document set that was first input (the self document set). (Of course, values for the whole set may also be required.)

The second is the "self-set-based other-set analysis" type, in which the analysis result does not come from the analysis target document set. For example, to obtain data on the citers of applicant A's documents by most-filed technical field, the data first entered concerns A's document set, but the data finally output concerns a set of documents that are not A's. The same holds when ranking and extracting the applicants with the most filings, or with a high concentration rate, in A's most frequent IPCs at the main group level. The analysis is self-set-based in that the most frequent IPCs at the main group level are extracted from the self document set, and it is an other-set analysis in that the final result concerns the document sets of others.

The second type requires two or more calculation steps. In the above example, the document set of applicant A is first divided by technical field, the set of documents by others that cite documents in each technical-field subset is then obtained, and, with that set as the analysis target document set, the most frequent applicants are extracted and provided as the final analysis result.

For self-set-based self-set analysis, the multi-dimensional analysis operation result table generation module 402 can produce everything up to the predetermined final result within a single multi-dimensional analysis operation result table. For example, one such table is often sufficient to generate an analysis result such as the number of applications or the share by year for each applicant's most frequent IPC.

For the self-set-based other-set analysis type, on the other hand, there are two approaches: 1) generating the final result in advance, and 2) dividing the final result into stages and using the result of each stage as the input to the next. The latter requires that 1) the final result can be divided into stages, 2) there exists a multi-dimensional analysis operation result table generation module 402, or multi-dimensional analysis operation result table data or cube data in which the output for each input has already been calculated, that produces the output of the next stage from the initial input and from each stage result, and 3) the multi-dimensional analysis operation result table generation module 402 can obtain the initial input and each stage result as input values.

Consider obtaining data on the citers (applicants or assignees) that frequently cite the documents of applicant A, by most-filed technical field. The result can be obtained through the following steps.
1) Applicant A is determined (A is an element of the set of all applicants, so A can be fixed).
2) A document set of A in one country (e.g. the United States) is determined (for example, the complete set of A's application documents).
3) Every document in A's document set is assigned at least one IPC or USPC. If the type and level of patent classification code are given (for example, IPC at the subclass level), the document subsets belonging to that type and level (e.g. IPC H04N, H01B) are determined on that basis; if not given, they are determined for all types and levels of patent classification code.
4) The document subsets are counted by technical field to extract the ranking of most-filed technical fields.
5) For each document subset (e.g. the documents of A with IPC H04N), the individual document numbers (e.g. document #1, ..., #n) are extracted.
6) For each individual document, the numbers of the other documents that cite it are extracted.
7) The applicant (assignee) of each extracted other document number is extracted.
8) The applicants of the extracted other documents are aggregated for each document subset (the frequency of each applicant is stored with it).
9) The aggregated applicants are sorted in order of frequency for each technical field (type and level of patent classification) (the number of documents and/or the other document numbers of each applicant are also stored).
Through these steps, the applicants that most often cite A's documents can be identified for each most-filed technical field (type and level of patent classification) of applicant A.
The multi-dimensional analysis operation result table generation module 402 may perform steps 1) to 9) for any one or more applicants selected from all applicants and store the results as data. Steps 1) to 2) and/or 3) relate to determining the analysis target document set, which corresponds to the initial self input; steps 4) to 5) correspond to the self analysis based on the initial self input; step 6) relates to generating the set of others' documents needed to obtain the analysis result; and steps 7) to 9) relate to generating the final analysis result, which corresponds to the final other-set output. The analysis module can then load and display the final analysis result generated by the multi-dimensional analysis operation result table generation module 402 according to the conditions of the requested analysis. The conditions may include the type and level of patent classification (such as IPC), the country, and so on, as input by the user or otherwise.
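Steps 2) to 9) can be traced on toy data with the following sketch. The document records, the citation list, and the function name are hypothetical; the sketch also assumes that each citing document is counted once per technical field and that A's own citing documents are excluded, which is one reading of "other documents".

```python
from collections import Counter, defaultdict

# Hypothetical bibliographic data: document number -> applicant and IPC subclass.
DOCS = {
    "A1": {"applicant": "A", "ipc": "H04N"},
    "A2": {"applicant": "A", "ipc": "H04N"},
    "A3": {"applicant": "A", "ipc": "H01B"},
    "X1": {"applicant": "X", "ipc": "H04N"},
    "X2": {"applicant": "X", "ipc": "H04N"},
    "Y1": {"applicant": "Y", "ipc": "H01B"},
}
# Hypothetical citation data: citing document -> documents it cites.
CITES = {"X1": ["A1"], "X2": ["A1", "A2"], "Y1": ["A2", "A3"]}


def citing_applicants_by_field(applicant):
    # Steps 2) and 3): split the applicant's documents into subsets by field.
    subsets = defaultdict(set)
    for number, doc in DOCS.items():
        if doc["applicant"] == applicant:
            subsets[doc["ipc"]].add(number)
    # Step 4): rank the fields by the number of documents in each subset.
    ranked_fields = sorted(subsets, key=lambda f: (-len(subsets[f]), f))
    result = {}
    for field in ranked_fields:
        # Steps 5) and 6): collect others' documents citing any subset member.
        citing_docs = {
            citing
            for citing, cited in CITES.items()
            if subsets[field] & set(cited)
            and DOCS[citing]["applicant"] != applicant
        }
        # Steps 7) to 9): map to applicants, count, and sort by frequency.
        freq = Counter(DOCS[number]["applicant"] for number in citing_docs)
        result[field] = freq.most_common()
    return result


ranking = citing_applicants_by_field("A")
print(ranking)
```

In this toy data, applicant X cites A's H04N documents from two documents and applicant Y from one, so X ranks first for H04N.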

When steps 1) to 9) are carried out as a multi-stage analysis, they may be grouped into the following stages. In this case, the analysis module operates as follows.

First, the analysis module determines the document sets for applicant A's most-filed technical fields. If the multi-dimensional analysis operation result table generation module 402 has already performed steps 1) to 4) for all applicants when generating the multi-dimensional analysis operation result table, the data for applicant A can simply be obtained from that precomputed data.

Second, the document set acquisition module of the analysis module obtains the document numbers of step 5). Each count for a document subset corresponds to an SQL query, so the document numbers of step 5) can be determined by that SQL query. The document numbers could be stored in advance in the multi-dimensional analysis result table or cube data, but it is reasonable not to store them, since they can be retrieved at any time by SQL. The document set acquisition module is a module that obtains document sets according to predetermined conditions and performs set operations such as union on them. Here, the SQL query is the condition for obtaining the document set.

Third, the document set acquisition module of the analysis module performs step 6). It obtains the other-document set of step 6) by querying the bibliography DB, or by querying the search engine, with the document numbers of step 5). This of course requires either that each document in the bibliography DB carry reference information listing the document numbers it cites, or that querying the search index 401-2 with a specific document number return the numbers of the documents that include it as citation/reference information. (For example, if document a lists documents #1 and #2 as citation information, then at indexing time the search indexer 401-3 generates inverted-file entries such as a <- #1 and a <- #2 and builds the search index 401-2 so that the citation field is searchable. Entering document number #1 in the citation search field of the search engine then returns the documents that cite #1, including document a.)
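The inverted-file entries a <- #1 and a <- #2 mentioned above amount to an inverted index on the citation field. A minimal sketch follows, in which the function name and the input layout are assumptions for illustration only:

```python
from collections import defaultdict


def build_citation_index(citations):
    """Invert {citing document: [cited documents]} into
    {cited document: sorted list of citing documents}."""
    index = defaultdict(set)
    for citing, cited_list in citations.items():
        for cited in cited_list:
            index[cited].add(citing)
    return {cited: sorted(citing) for cited, citing in index.items()}


# Document a cites #1 and #2; a second hypothetical document b cites #1 only.
citation_index = build_citation_index({"a": ["#1", "#2"], "b": ["#1"]})
print(citation_index["#1"])
print(citation_index["#2"])
```

Looking up a document number in the inverted index returns its citing documents directly, without scanning every document's reference list.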

Fourth, step 6) yields the set of others' documents to be analyzed, and the subsequent analysis of that set is performed by the direct analysis module of the present invention. The analysis module calls the direct analysis module it includes to perform steps 7) to 9). The direct analysis module is an analysis engine that runs a predetermined analysis process on whatever document set it is given. (When the document set cannot be fixed in advance, or when there are too many document sets to fix in advance, generating their analysis results beforehand is impossible or unreasonable. It is therefore reasonable for the multi-dimensional analysis operation result table generation module 402 to precompute only up to step 4).) The role of the direct analysis module is equivalent to that of the multi-dimensional analysis operation result table generation module 402. That is, through the step 5)

The direct analysis module and the multi-dimensional analysis operation result table generation module 402 differ only in the first step, in which the analysis target document set is input; the process of generating analysis result data after the document set has been obtained is the same.

Types of Self-Set-Based Other-Set Analysis

Self-set-based other-set analysis is now described in more detail. Its first step is to extract criteria from the self set. The criteria may be patent classification codes, at each level, with 1) many applications or registrations, 2) a high concentration rate, share, or activity rate, 3) large domestic or overseas families, 4) many citations, 5) a high degree of convergence, or 6) a high counting value such as the average number of claims. Criteria 1) to 6) may be extracted for a predetermined period (for example, the last seven years as of the filing date), and may also be extracted on the condition that the variation (rate of increase or decrease, rate of change, and so on) in the quantified value of the number of applications or registrations, the share, concentration rate, or activity rate, the family size, the citations, or the convergence exceeds a predetermined threshold.

Rank #1 H01L, rank #2 H04N, and so on, shown in FIG. 17 and elsewhere, are examples of criteria extracted from the self set. The self set is a document set determined according to the document-set determination described in this specification.

The second step is to generate an analysis result for the criteria extracted from the self set. To do so, the analysis module for each analysis topic queries a predetermined multi-dimensional analysis operation result table with a predetermined analysis query expression. Here, the multi-dimensional analysis operation result table may have been generated for a set other than the self set, or for another whole set that may include the self set. A representative example of a set other than the self set is the document set of a second country when the self set is based on a first country. Representative examples of another whole set that may include the self set are i) the full forward-citation document set and ii) the full backward-citation document set used in citation analysis.

Competitive analysis is a representative example of self-set-based other-set analysis. It is described here taking applicant-unit competitive analysis as an example. Taking all or part of the document set of a specific applicant in a first country as the self set, the highly ranked IPCs of that applicant are extracted in units of at least one patent classification level (e.g. IPC subclass) by most applications, most registrations, high concentration rate, high share, high activity rate, or another patent indicator. For the extracted IPCs, a ranking of the applicants with the most applications or registrations, a high concentration rate, a high activity rate, or a high value of another patent indicator can then be extracted for the first country or a second country. The processing described in this paragraph for the applicant unit can be performed in the same way for other units, such as the applicant's inventors, inventors in general, or agents, so that competitive analysis can be performed for each unit. To generate these other-set analysis results, the analysis module of the present invention of course accesses, with the other-set analysis query expression, the multi-dimensional analysis operation result table containing the total applications and registrations, share, concentration rate, and activity rate for each IPC level, and extracts the desired analysis result.

The analysis module obtains and outputs an analysis result by querying the corresponding multi-dimensional analysis operation result table with the analysis query expression for each analysis topic. The analysis result is preferably output for each language.

Given a numerical value, its change values (rate of increase, velocity, acceleration) and statistical values (mean, standard deviation, and so on) can also be useful information. There are two ways to obtain them: 1) computing the change and statistical values when the multi-dimensional analysis operation result table is generated, and 2) computing them on the fly from the numerical values obtained at the analysis module stage. Method 2) is convenient when the value is a count, such as a simple number of applications or registrations, while method 1) is more convenient when the value is a ratio, such as a share, concentration rate, or activity rate. For example, for company A in the H04N 7/02 technical field, the total number of applications from 2000 to 2005 and its increase or decrease can easily be calculated from the yearly application counts alone, because the values are counts. In contrast, a mean share, concentration rate, or activity rate for company A in H04N 7/02 over 2000 to 2005 cannot be obtained by combining the yearly ratio values, because summing ratios is not meaningful; the formula for each indicator must instead be applied to the whole period 2000 to 2005 at once, not to each year. It is therefore preferable to calculate ratio values in advance, together with the generation of the multi-dimensional analysis operation result table, as in method 1).
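The difference between counts and ratios can be checked with a small numerical sketch. The filing counts below are invented for illustration, and "share" is assumed here to mean the applicant's filings divided by all filings in the technical field.

```python
def pooled_share(own_counts, total_counts, years):
    """Apply the share formula once to the whole period."""
    own = sum(own_counts[year] for year in years)
    total = sum(total_counts[year] for year in years)
    return own / total


def mean_of_yearly_shares(own_counts, total_counts, years):
    """Average the per-year shares (shown only for comparison)."""
    yearly = [own_counts[year] / total_counts[year] for year in years]
    return sum(yearly) / len(yearly)


# Invented counts: the applicant's filings and all filings in the field.
own = {2000: 50, 2001: 30}
total = {2000: 100, 2001: 300}
years = [2000, 2001]

period_total = sum(own[year] for year in years)  # counts simply add up
pooled = pooled_share(own, total, years)         # 80 / 400
averaged = mean_of_yearly_shares(own, total, years)  # (0.5 + 0.1) / 2
print(period_total, pooled, averaged)
```

The two ratio figures differ because the yearly totals differ, which is why the ratio indicators are better computed from the underlying counts for the whole period.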

The analysis module of the present invention includes a master DB analysis module that targets the databases themselves, such as the patent document master DB 202, the patent classification code master DB 203, and the subject master DB 204. A master DB analysis module is provided for each analysis topic. It contains an SQL query statement that specifies, for its topic, how to process the values obtained by joining tables in the relevant master DB, or an application that further processes the data returned by that SQL query statement. The master DB analysis module may use a plurality of subqueries, and a subquery may produce part (one record or a plurality of records) or all of the content of a multi-dimensional analysis operation result table equivalent to one generated by the multi-dimensional analysis operation result table generation module 402.

The relationship between the multi-dimensional analysis operation result table generation module 402 and subqueries is described in more detail using the example of a user-generated document set, created when a user enters a search query, as the analysis target document set. Because the document set is not fixed until the user creates it, the multi-dimensional analysis operation result table generation module 402 cannot generate a multi-dimensional analysis operation result table for it in advance. When analysis is performed on a user-generated document set, therefore, the analysis target document set is first determined and then handled in at least one of the following two ways. First, the multi-dimensional analysis operation result table generation module 402 generates a multi-dimensional analysis operation result table for the user-generated document set, and the analysis module for each analysis topic accesses that table with its analysis query expression and generates the analysis result. Second, the master DB analysis module uses at least one subquery to generate, for the documents in the user-generated document set, data (a table or view) equivalent to the data in a multi-dimensional analysis operation result table; the master DB analysis module for each analysis topic then accesses that data with its analysis query expression (main query) and generates the analysis result. Data equivalent to the data in a multi-dimensional analysis operation result table means data equivalent to the result of a dimension operation, such as a rollup operation or a cube operation, on at least one dimension.
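The second approach can be sketched with Python's built-in sqlite3 module. The table layout, column names, and sample rows are hypothetical stand-ins for the master DB, and the UNION ALL subquery is only one way to emulate a rollup over the year dimension in a database without a native ROLLUP clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE docs (doc_id TEXT PRIMARY KEY, ipc TEXT, year INTEGER);
INSERT INTO docs VALUES
  ('d1', 'H04N', 2000), ('d2', 'H04N', 2001),
  ('d3', 'H01L', 2000), ('d4', 'H04N', 2000);
""")


def rollup_for(doc_ids):
    """Return (ipc, year, count) rows for a user-generated document set.
    Rows whose year is NULL hold the subtotal over all years for that IPC."""
    marks = ",".join("?" * len(doc_ids))
    sql = f"""
        SELECT ipc, year, n FROM (
            SELECT ipc, year, COUNT(*) AS n FROM docs
             WHERE doc_id IN ({marks}) GROUP BY ipc, year
            UNION ALL
            SELECT ipc, NULL AS year, COUNT(*) AS n FROM docs
             WHERE doc_id IN ({marks}) GROUP BY ipc
        )
        ORDER BY ipc, year
    """
    # The id list is bound twice, once for each branch of the subquery.
    return conn.execute(sql, list(doc_ids) * 2).fetchall()


rows = rollup_for(["d1", "d2", "d3"])
for row in rows:
    print(row)
```

The outer SELECT plays the role of the main query, and the parenthesized subquery produces data equivalent to a small rolled-up result table for exactly the documents the user selected.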

Direction of Analysis

The multi-dimensional analysis operation result tables generated by the multi-dimensional analysis operation result table generation module 402 may be organized into the following families. Since analysis essentially means obtaining what is wanted from particular data, the type of an analysis is determined by 1) the attributes of the document set to be analyzed, 2) the topic of the analysis, and 3) the form in which the analysis result is output. Each analysis type is therefore named according to 1), 2), and/or 3).

Hereinafter, various types of analysis will be described as examples, with reference to 1), 2), and 3).

First, the attributes of the document set to be analyzed are determined by the criteria used when the document set is determined, as has been mentioned many times in this specification. Applicants, inventors, patent classification codes at particular levels, forward-citation document sets, and so on are all attributes of the document set to be analyzed. These analysis target document sets are examples of the document-set units handled by the multi-dimensional analysis operation result table generation module 402 of the present invention.

Second, the analysis topic determines which multi-dimensional analysis operation result table the multi-dimensional analysis operation result table generation module 402 generates and which analysis query expression is used to query that table.

Third, there are two types according to how the analysis result is output: 1) single analysis and 2) comparative analysis. A single analysis presents one kind of analysis for a single criterion, whereas a comparative analysis presents the same kind of analysis for two or more criteria. The simplest example is a comparison by applicant/company, which presents the same analysis result (e.g. the number of applications in a specific country by IPC level and year) for two or more applicants/companies (e.g. Samsung Electronics and LG Electronics). In this example, the number of applications in a specific country by IPC level and year is the same kind of analysis result for each of the applicants/companies. Because the final output to the user depends on which comparison targets are selected, the system receives the selection of comparison targets from the user, queries the multi-dimensional analysis operation result table with the analysis query expression set for the analysis topic once for each selected target, and combines the analysis results. In the above example, equivalent analysis results are generated for Samsung Electronics and LG Electronics separately and combined side by side to produce the comparative analysis result.
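The combination step of a comparative analysis can be sketched as follows. The in-memory result table, its invented counts, and the function names are assumptions for illustration; in the described system the single analysis would be a query against a multi-dimensional analysis operation result table.

```python
# Hypothetical precomputed results: (applicant, IPC, year) -> application count.
RESULT_TABLE = {
    ("Samsung Electronics", "H04N", 2000): 12,
    ("Samsung Electronics", "H04N", 2001): 15,
    ("LG Electronics", "H04N", 2000): 9,
    ("LG Electronics", "H01L", 2001): 4,
}


def single_analysis(applicant):
    """The same analysis query, run for one comparison target."""
    return {
        (ipc, year): count
        for (name, ipc, year), count in RESULT_TABLE.items()
        if name == applicant
    }


def comparative_analysis(applicants):
    """Run the single analysis per target and merge the results by key."""
    combined = {}
    for applicant in applicants:
        for key, count in single_analysis(applicant).items():
            combined.setdefault(key, {})[applicant] = count
    return combined


comparison = comparative_analysis(["Samsung Electronics", "LG Electronics"])
for key in sorted(comparison):
    print(key, comparison[key])
```

Because every target is analyzed with the same query, the merged rows line up on the same (IPC, year) keys, and a missing entry simply means that target has no filings for that key.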

Monitoring Module (403)

Patent information monitoring refers to watching for updated patent information and issuing alerts about it. Monitoring can be performed 1) using a search engine or 2) using the DBMS 201. In the search-engine method, a search expression for obtaining the desired information (it usually includes technical keywords) and a monitoring period (preferably the period at which the patent data is updated, or an integer multiple of it) are set. At each period, a scheduler automatically submits the search expression to the search index 401-2, with the search restricted to the patent documents of the period concerned (for example, when the monitoring period is one week, the documents of the one week preceding the query time), and obtains the results, if any. The method 2) using the DBMS 201 is equivalent to method 1), except that the query is issued against the master DB, the multi-dimensional analysis operation result table data, or the cube data. When the search expression contains descriptive keywords, the search-engine method is preferred, because the DBMS 201 responds more slowly than the search engine (the DBMS 201 supports full-text search, but performs worse than a search engine specialized for it). To monitor the multi-dimensional analysis operation result table data or the cube data, the data must be rolled up in time units shorter than or equal to the monitoring period. In other words, if the data is rolled up monthly, weekly detail cannot be distinguished within the rolled-up values, so such data is not suitable for weekly monitoring.
Therefore, for monitoring through the DBMS 201 as in 2), monitoring through the master DB is preferable to monitoring the multi-dimensional analysis result table data or cube data.

The module that performs monitoring using the master DB is called the master DB utilization monitoring module 403; the module that monitors the multi-dimensional analysis operation result table data is called the multi-dimensional analysis operation result table DB utilization monitoring module 403; the module that monitors the cube data is called the cube DB utilization monitoring module 403; and the module that performs monitoring using a search engine is called the search engine utilization monitoring module 403.

The master DB utilization monitoring module 403, the multi-dimensional analysis operation result table DB utilization monitoring module 403, the cube DB utilization monitoring module 403, and the search engine utilization monitoring module 403 perform the monitoring activities. When a patent classification code is included in the monitoring search expression or query expression, the inventive idea described for sub-category search applies as is. That is, the search expression or query expression is modified so that the monitoring result covers all documents corresponding to the included patent classification code and to its lower patent classification codes, as described above.
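The modification of a monitoring expression to cover lower patent classification codes can be sketched as below. The child map is a hypothetical, hand-written slice of the IPC hierarchy, and the OR-joined output syntax is an assumed search-expression format, not the syntax of any particular search engine.

```python
# Hypothetical child map: each code lists the codes directly below it.
CHILDREN = {
    "B60C 3/00": ["B60C 3/02", "B60C 3/04", "B60C 3/06", "B60C 3/08"],
}


def with_lower_codes(code):
    """Return the code followed by all of its lower codes, depth first."""
    codes = [code]
    for child in CHILDREN.get(code, []):
        codes.extend(with_lower_codes(child))
    return codes


def expand_classification_term(field, code):
    """Rewrite one classification term so it also matches lower codes."""
    alternatives = " OR ".join(
        f'{field} = "{c}"' for c in with_lower_codes(code)
    )
    return f"({alternatives})"


expanded = expand_classification_term("IPC", "B60C 3/00")
leaf = expand_classification_term("IPC", "B60C 3/08")
print(expanded)
print(leaf)
```

A code with no lower codes is left as a single term, so the same rewriting can be applied uniformly to every classification term in a monitoring expression.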

The monitoring service provided by the monitoring module 403 registers a specific search expression and, when new patent information matching the search expression appears, provides it to the recipient of the monitoring service. In the present invention, the monitoring is performed by the monitoring module 403. The patent information monitoring module 403 registers the search expression to be monitored and queries the search engine or the DBMS 201 with it at predetermined intervals (daily, weekly, monthly, quarterly, and/or annually). When a new patent document appears, the module notifies the monitoring service user of the fact that it has appeared, of the document itself, or of information about the document. Notification may be by one or more of the following: sending e-mail or another kind of message to the recipient of the monitoring result, or notifying the user when the user accesses the monitoring service.

In order to perform a monitoring service, it is necessary 1) to register a search expression to be monitored, 2) to query the search engine or DBMS 201 with the search expression at each predetermined time unit (since the purpose of the query is to find new documents, the query is commonly limited to a time range, for example a range of search dates based on the publication date), and 3) to send any new patent document to the recipient of the monitoring service.
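One monitoring cycle (register, query within a time range, report new documents) can be sketched as follows. The names Monitor, run_monitor, and make_in_memory_search are hypothetical, and the in-memory search function merely stands in for the search engine or DBMS 201; this is an illustrative sketch, not the implementation of the monitoring module 403.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, List, Set, Tuple

Hit = Tuple[str, date]  # (document unique number, publication date)


@dataclass
class Monitor:
    expression: str          # registered monitoring search expression
    last_run: date           # end of the time range covered by the previous query
    seen: Set[str] = field(default_factory=set)


def run_monitor(monitor: Monitor,
                search: Callable[[str, date, date], List[Hit]],
                today: date) -> List[str]:
    """Query only the period since the last run and return documents not reported before."""
    hits = search(monitor.expression, monitor.last_run, today)
    new_ids = [doc_id for doc_id, _ in hits if doc_id not in monitor.seen]
    monitor.seen.update(new_ids)
    monitor.last_run = today
    return new_ids  # the caller would notify the user, e.g. by e-mail


def make_in_memory_search(docs: List[Tuple[str, date, str]]):
    """Stand-in for a search engine: keyword match limited to start < publication date <= end."""
    def search(expression: str, start: date, end: date) -> List[Hit]:
        return [(d, p) for d, p, text in docs if expression in text and start < p <= end]
    return search
```

Calling run_monitor once per time unit (for example weekly) yields only the documents published since the previous call.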

Characteristic

The monitoring search expression input to the monitoring module 403 of the present invention is characterized in that it is set in advance for the applicant-name unit patent information service system and/or the inventor-name unit patent information service system. That is, when the applicant-name unit patent information service system or the inventor-name unit patent information service system is generated, or even after it has been generated, the monitoring search expressions needed at the applicant-name unit or at the inventor-name unit are automatically extracted without user intervention and set in the monitoring service of the respective system. The types of monitoring search expression to be set, and the methods of generating them, may be as follows. First, the types and generation methods of monitoring search expressions set in advance at the document set unit are explained. The information included in a monitoring search expression is often an analysis result generated by the analysis module of the present invention, for example the mode (most frequent value) or the maximum of a count or a ratio. Specific examples are explained one by one.

Unit name of applicant or unit name of inventor

First, at least one of the most frequent patent classification symbols is extracted, ranked in order of frequency, from the set of documents managed by the applicant-name unit patent information service system or the inventor-name unit patent information service system, and the most frequent patent classification symbols may be registered as search expressions to be monitored for at least one country selected from a preset country or the countries of origin of the documents (including WIPO) held in the patent information mast DB. The patent classification code is preferably one or more selected from among IPC, USPC, FI, FT, and ECLA. As for depth, the IPC is divided into section, subsection, class, subclass, group, and subgroup, and the depth below the subgroup is determined by the number of dots; other patent classification symbols have their own systems of upper and lower patent classification symbols. The depth may be determined 1) in units of groups or subgroups, or 2) in units below the subgroup, such as a 2-dot subgroup or a 3-dot subgroup. The depth of the patent classification code may also be the subclass unit, but in that case too many patent documents may be returned, resulting in severe noise.

Second, for at least one of the most frequent patent classification symbols derived by the first method, 1) those who file or register the largest number of patents at home and abroad, and 3) applicants whose patent application speed or patent registration speed within a preset period, at home and abroad, is equal to or higher than a predetermined standard can be extracted, and the extracted applicants can be used as a monitoring search expression.

Third, for at least one of the most frequent patent classification codes derived by the first method, a patent classification code that shows a high rate of increase in the number of new documents within a preset recent period, among that patent classification code and its lower patent classification codes, can be used as a monitoring search expression.

Fourth, at least one domestic or foreign inventor who files or registers the most patents in at least one of the most frequent patent classification symbols derived by the first method may be extracted, and the extracted inventors may be used as a monitoring search expression.

Fifth, citation information (backward citations and forward citations) is extracted from the set of documents managed by the applicant-name unit patent information service system or the inventor-name unit patent information service system, the parties who cite most frequently and/or the parties who are cited most frequently, at home and abroad, are extracted, and the most frequent citing parties or cited parties may be used as a monitoring search expression.

Sixth, in order to monitor anyone who cites one or more individual patent documents of the document set (that is, to find those who cite the patent documents), the application numbers or registration numbers of all the patent documents constituting the document set can be combined with the OR operator and monitored.

Seventh, a monitoring search expression can be generated by combining two or more methods selected from the first to sixth methods. A search expression can be entered for each field supported by the search engine, and for the same field the search expressions derived from the individual methods can naturally be combined using operators such as AND or OR.
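The OR-bundling of document numbers in the sixth method and the combination of per-field expressions in the seventh method can be sketched as follows. The field names ("CITES", "IPC", "AP"), the quoting style, and the sample numbers are hypothetical; an actual search engine would define its own field names and query syntax.

```python
from typing import Iterable, List


def field_clause(field: str, values: Iterable[str], operator: str = "OR") -> str:
    """Join several values of the same field with one operator, e.g. (F="a" OR F="b")."""
    unique = list(dict.fromkeys(values))  # drop duplicates, keep order
    if not unique:
        raise ValueError("at least one value is required")
    joined = f" {operator} ".join(f'{field}="{v}"' for v in unique)
    return f"({joined})"


def combine(clauses: List[str], operator: str = "AND") -> str:
    """Combine clauses produced by different methods into one monitoring expression."""
    return f" {operator} ".join(clauses)
```

For example, the sixth method alone corresponds to a single field_clause over all application numbers of the document set, while the seventh method corresponds to combine applied to clauses from several methods.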

It is preferable that the monitoring search expressions for the entire document set unit managed by the applicant-name unit patent information service system or the inventor-name unit patent information service system be grouped by monitoring subject and exposed to the user under the monitoring service. For example, for the monitoring subject of the first method, it is preferable that a folder named "patent monitoring through the most frequent patent classification code" be created, and that subfolders or a lower hierarchy be provided within the folder for each country. It is preferable that the results of monitoring by the monitoring search expressions of the entire document set unit be checked in that folder or hierarchy. Of course, the administrator of the applicant-name unit patent information service system, or the inventor of the inventor-name unit patent information service system, can modify the search expressions set in advance.

Individual document unit monitoring

Hereinafter, the type and generation method of the monitoring search expression set in advance in each document unit will be described. Since the individual document unit includes at least one patent classification code, the description will be mainly focused on the patent classification code.

First, the patent classification code itself may be used as the monitoring search expression. In this case it is more appropriate to automatically include the lower patent classification codes of the patent classification code. In addition, when there are two or more patent classification codes, a plurality of search expressions may be generated by combining the patent classification codes with the AND operator. In this case the main patent classification code should naturally be included, and it is preferable to include a monitoring search expression that ANDs all the patent classification codes obtained from the individual patent document.

Second, for the patent classification code, 1) those who file or register the most patents at home and abroad, 2) those who file more than a certain number at home and abroad and have a high concentration rate in the most frequent patent classification code, and 3) applicants whose patent application speed or patent registration speed within a preset recent period, at home and abroad, is equal to or greater than a predetermined standard can be extracted, and the extracted applicants can be used as a monitoring search expression.

Third, a patent classification code that shows a high rate of increase in the number of new documents within a preset recent period, among the patent classification code and its lower patent classification codes, may be used as a monitoring search expression.

Fourth, at least one domestic or foreign inventor who files or registers the most patents in the patent classification code may be extracted, and the extracted inventors may be used as a monitoring search expression.

Fifth, in order to monitor a person who cites the individual patent document (to find a person who cites the patent document), the application number or registration number of the individual patent document may be monitored.

Sixth, a monitoring search expression may be generated by combining two or more methods selected from the first to fifth methods. A search expression can be entered for each field supported by the search engine, and for the same field the search expressions derived from the individual methods can naturally be combined using operators such as AND or OR.

It is preferable that the monitoring search expressions of the individual document unit managed by the applicant-name unit patent information service system or the inventor-name unit patent information service system be organized into folders by monitoring subject and be placed and exposed on the bibliographic screen or management screen of the individual document. For example, for the monitoring subject of the first method, it is preferable that a folder named "patent monitoring similar to this document through a patent classification code" be created, with subfolders or a hierarchical structure for each country within the folder. It is preferable that the results of monitoring by these monitoring search expressions be checked in that folder or hierarchical structure.

Patent Intelligence Module (60)

Hereinafter, the patent intelligence module 60 of the present invention will be described. The patent intelligence module 60 performs high-level advanced patent analysis and stores the result of the analysis or provides it to the user.

The patent intelligence module 60 includes any one or more of: an analysis module 601 including a citation analysis module 601-1 and a comparative analysis module 601-2; a core field discovery module 602 for discovering core fields; a relevance discovery module 603 for generating relevant group information; a document set unit collective attribute discovery module 604 for finding collective attributes of document set units, such as the degree of red ocean; a portfolio analysis module 605 for discovering new subjects or starting subjects; a fusion analysis module 606; a promising field analysis module 607 for finding promising fields; a blank technology analysis module 608 for finding blank technologies; and a patent strategy analysis module 609 for analyzing patent strategies.

The core field discovery module 602 includes a core applicant discovery module 602-1 for discovering key applicants, a core inventor discovery module 602-2, a core agent discovery module 602-3, and a core patent classification code discovery module 602-4.

The relevance discovery module 603 includes a related applicant discovery module 603-1, a related inventor discovery module 603-2, a related agent discovery module 603-3, and a related patent classification code discovery module.

The document set unit collective attribute discovery module 604 includes a red ocean attribute discovery module 604-1.

The portfolio analysis module 605 includes a new subject discovery module 605-1 and a starting subject discovery module 605-2. The new subject discovery module 605-1 includes a new entry applicant discovery module 605-1-1, a new entry inventor discovery module 605-1-2, and a new entry agent discovery module 605-1-3, and the starting subject discovery module 605-2 includes a selection applicant discovery module 605-2-1, a selection inventor discovery module 605-2-2, and a selection agent discovery module 605-2-3.

The fusion analysis module 606 includes a technology fusion pattern analysis module 606-1 for each subject, a technology fusion pattern analysis module 606-2 for each technical field, a technology fusion pattern analysis module 606-3 for each country, and a technology fusion pattern analysis module 606-4 for each time period. The fusion analysis module 606 has been described in connection with the plurality of patent classification code relational preprocessing module 3900, the generation module of the multidimensional analysis operation result table 205-7 of the present invention, and the analysis modules using them. Where the technology or technical field is limited to patent classification codes such as IPC, USPC, and FT, the analysis has been sufficiently described in terms of patent classification code by patent classification code, patent classification code by applicant/inventor, and patent classification code by country, and it has been explained that a limitation on the time period can be applied to each of these.

The promising field analysis module 607 includes a promising field analysis module 607-1 and a promising field analysis module 607-2. The patent strategy analysis module 609 includes a patent strategy evaluation index analysis module 609-1, a patent utilization analysis module 609-2, and an acquired patent analysis module 609-3.

Applicant's Unit Patent Information Service System

Concept

Next, a patent information service system and an individual unit patent information service system hierarchically integrated as one application example of the patent information system of the present invention will be described.

The applicant-name unit patent information service system of the present invention refers to a patent information system in units of individual applicant names, and is characterized in that the patent document information it contains includes the applicant name in the applicant and/or patent holder field. The inventor-name unit patent information service system of the present invention refers to a patent information system in units of individual inventor names, and is characterized correspondingly at the level of the inventor name. At least one inventor-name unit patent information service system may correspond to one applicant-name unit patent information service system. The at least one inventor-name unit patent information service system may be configured beneath the applicant-name unit patent information service system, in which case the administrator of the applicant-name unit patent information service system manages access to the unit patent information service system of each specific inventor.

The detailed configuration of the individual unit patent information service system of the present invention is shown in FIG. The individual unit patent information service system includes an applicant-name unit patent information service system generation engine that generates the applicant-name unit patent information service system, an inventor-name unit patent information service system generation engine that generates the patent information system of the inventor unit, a management module that manages the applicant-name unit patent information service system and/or the inventor-name unit patent information service system, applicant-name unit patent information data managed in units of the applicant name, and inventor-name unit patent information data managed in units of the inventor.

The applicant-name unit patent information service system generation engine 1100 includes an applicant-name unit document acquisition module 1110 which, when the applicant name is determined, obtains for at least one country the documents in which that applicant name is the applicant or patent holder, and an applicant-name unit patent information intelligence generation module 1150 which generates the intelligence information of the present invention, such as statistics, analysis, monitoring, and reporting. Strictly speaking, the applicant-name unit document set is applicant-name unit document information data, which may be stored as a physical DB, a physical table, a view (including a materialized view), or any other type of data structure; for convenience of description it is referred to as the applicant-name unit document set DB, and the same applies to the inventor-name unit document set DB.

The inventor unit patent information service system generation engine 1300 of the present invention operates in the same way as the applicant-name unit patent information service system generation engine and includes, beneath it, the inventor unit document acquisition module 1310, the inventor unit document set DB generation module 1330, and the inventor unit patent information intelligence generation module 1350.

The management module 1500 of the present invention includes an applicant-name unit management module 1510 for managing the applicant-name unit patent information service system, an inventor-name unit management module 1530 for managing the inventor-name unit patent information service system of each inventor, and a management information DB 1550 containing management information. The applicant-name unit management module 1510 includes an applicant-name unit document set management module 1511 for managing the list of document sets in units of the applicant name, an applicant-name unit patent information intelligence management module 1513 for managing patent information intelligence in units of the applicant name, an inventor authority management module 1515 that manages the rights of individual inventors belonging to the applicant name unit, and a multi-stage layer generation management module 1517 that manages the inventors in one or more stages. Meanwhile, the inventor-name unit management module 1530 includes an inventor-name unit document set management module 1531 and an inventor-name unit patent information intelligence management module 1533.

The applicant-name unit patent information data 1700 includes the applicant-name unit patent document set DB 1710 and the applicant-name unit patent information intelligence DB 1730, and the inventor unit patent document mast DB 1900 includes the inventor unit patent document set DB 1910 and the inventor unit patent information intelligence DB 1930.

Meanwhile, an individual applicant-name unit patent information service system may hold the unit patent information service system of at least one other applicant name. It may do so either by holding link information to the unit patent information service system of the other applicant name or by having that system under its management authority. The same is possible for an individual inventor who is the subject of an inventor-name unit patent information service system. That is, the unit patent information service system of an individual inventor name, as one subject, may hold link information to the unit patent information service system of at least one other applicant name, or may have management authority over it.

The individual unit patent information service system generation engine of the present invention includes at least one of the applicant-name unit patent information service system generation engine and the inventor-name unit patent information service system generation engine. The applicant-name unit patent information service system generation engine includes an applicant-name unit document set acquisition module and an applicant-name unit document set management module. The inventor-name unit patent information service system generation engine includes an inventor name extraction module, an inventor-name unit document set acquisition module, and an inventor-name unit document set management module. When an applicant name is obtained, the two generation engines obtain from the country-specific patent DB, for at least one country, all patent documents filed or registered in that applicant name, automatically extract the individual inventor names from the obtained patent documents, and automatically obtain the patent documents for each individual inventor name. They thereby generate a patent document set in units of the applicant name and, beneath the applicant name, patent document sets in units of the individual inventors, and they manage the applicant-name unit document set and the inventor unit document sets.
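The generation of an applicant-name unit document set together with the inventor unit document sets beneath it can be sketched as follows. The record layout (keys "id", "applicants", "inventors") and the function name are hypothetical, and the sketch assumes that each inventor unit document set is limited to documents already in the applicant's set, which is one possible reading of the description.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Document = Dict[str, object]  # hypothetical record: {"id": str, "applicants": [...], "inventors": [...]}


def build_document_sets(documents: List[Document],
                        applicant: str) -> Tuple[List[str], Dict[str, List[str]]]:
    """Return (applicant unit document ids, inventor name -> document ids beneath that applicant)."""
    applicant_set: List[str] = []
    inventor_sets: Dict[str, List[str]] = defaultdict(list)
    for doc in documents:
        if applicant not in doc["applicants"]:
            continue
        applicant_set.append(doc["id"])
        # Inventor names are extracted from the applicant's own documents.
        for inventor in doc["inventors"]:
            inventor_sets[inventor].append(doc["id"])
    return applicant_set, dict(inventor_sets)
```

Only document unique numbers are kept here; the remaining content of each document would be looked up by that number when needed.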

Next, a method of generating a patent information system of the applicant's name in the patent information database will be described.

Applicant's Unit Patent Information Service System Generation Engine

Applicant's Unit Document Set Get Module

Applicant's name unit patent information service system of the present invention includes a set of patent documents associated with the applicant's name. The patent document set includes a set of unit documents of the applicant's name filed or registered in the applicant's name, and a set of document units of the inventor's name for each inventor included in the set of documents of the unit of the applicant's name. The applicant's unit document set and the inventor's unit document set may be obtained from an entire patent DB including at least one country-specific patent DB, but preferably from a separate country-specific patent DB.

The applicant-name unit document set may be obtained either through a search engine or through a DBMS.

First, we explain how to obtain through a search engine. The search engine of the present invention indexes a patent document through an indexer so that the patent document can be searched through a search word input to a searcher or a search box. When the indexing is performed, a search index is generated. In the indexing process of generating the search index, an index may be generated for each field (applicant, inventor, patent classification code, agent, name of invention, etc.) included in the patent document. If a field to be searched is designated in the search engine, and a search word is input, a search result limited to the field is provided. For example, if the name of the applicant is entered in the applicant field (for example, "Samsung Electronics Co., Ltd."), a patent document in the name of the applicant appears as a search result. At this time, if a country or a period is limited, a search result limited to the country or the period is displayed. Through such a query to the search engine, a set of patent documents in the applicant's name is determined. In addition, the search results may allow access to a part (eg, bibliographic details) or all (eg, full text) contents of patent documents related to the search results. Of course, by obtaining only a document unique number (for example, an application number, etc.) that can specify a document set of a specific applicant name through a search engine, and querying the patent DB through the DBMS the obtained document unique number, Information on some or all of the contents of the patent document may be obtained.

Next, a method of generating the unit document set of patent documents filed or registered in the applicant's name by querying a patent DB through a DBMS will be described. The patent DB may have various fields (applicant, inventor, patent classification code, agent, title of invention, etc.), and each record constituting the patent DB holds a value for each field. A desired query result can be generated by specifying a field name in the patent DB and issuing a query, for example an SQL SELECT statement. That is, when the DB or table to be queried is specified, and a desired field and field value (for example, "Samsung Electronics Co., Ltd." as the applicant's name) are specified for that DB or table, the DBMS outputs the query result in response to the received query. In this way, a unit document set of the applicant's name is generated.
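A query of this kind can be sketched with Python's built-in sqlite3 module. The table name patents, its columns, and the sample rows are hypothetical; an actual patent DB would have its own schema and DBMS.

```python
import sqlite3
from typing import List, Optional


def demo_connection() -> sqlite3.Connection:
    """Create a small in-memory patent table (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patents (doc_id TEXT PRIMARY KEY, country TEXT, applicant TEXT)")
    conn.executemany(
        "INSERT INTO patents VALUES (?, ?, ?)",
        [
            ("KR-1", "KR", "Example Electronics Co., Ltd."),
            ("KR-2", "KR", "Example Electronics Co., Ltd."),
            ("US-1", "US", "Example Electronics Co., Ltd."),
            ("KR-3", "KR", "Other Corp"),
        ],
    )
    return conn


def applicant_document_ids(conn: sqlite3.Connection, applicant: str,
                           country: Optional[str] = None) -> List[str]:
    """Select the document unique numbers whose applicant field equals the given name."""
    sql = "SELECT doc_id FROM patents WHERE applicant = ?"
    params = [applicant]
    if country is not None:  # optional limitation to one country
        sql += " AND country = ?"
        params.append(country)
    sql += " ORDER BY doc_id"
    return [row[0] for row in conn.execute(sql, params)]
```

The query uses bound parameters rather than string concatenation, so applicant names containing quotes or commas are handled safely.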

Every patent document may be assigned a document unique number (ID) that identifies the document. The application number or registration number may serve as the unique number, as may "country code + application number" or "country code + registration number", and other number systems such as the publication number are also possible. Any numbering scheme that can specify a particular patent document, whether it is independent of these numbers or makes use of them, can serve as the document unique number.

The applicant-name unit document set is generated by the applicant-name unit document set acquisition module included in the applicant-name unit patent information service system generation engine. The applicant-name unit document set acquisition module obtains the patent documents filed or registered under a specific applicant name from a search index or a patent DB, using the applicant name as the search expression or query. The documents obtained by the applicant-name unit document set acquisition module may be used to generate a temporary view or a materialized view in units of the applicant name, or the document unique numbers, such as application numbers or registration numbers, that specify the applicant-name unit document set may be stored under the applicant name.

That is, patent documents can be obtained automatically not only through a search engine but also through a DB query. With a search engine, the applicant or an applicant identification code is entered as the search expression; in this case, of course, the search index must include the applicant or applicant identification code. With a DB query, the DBMS is queried with the applicant or applicant identification code using a standard SQL statement or another corresponding query statement; the patent information data must, of course, also include the applicant or applicant identification code. In other words, the applicant-name unit document set acquisition module receives the applicant or the applicant identification code, queries a search engine or a DBMS with it, and obtains the document unique numbers, such as application numbers, of the patent documents corresponding to the applicant or the applicant identification code. Once a document unique number is determined, the patent document corresponding to it can be obtained.

Get Documents by Country

It is preferable that the applicant-name unit document set acquisition module obtain patent documents country by country. For this purpose, the applicant identification code should be unified across countries. (The representative name of the applicant may be established not only on the basis of the first country but also on the basis of at least one second country. If the applicants of the two countries are the same subject, the same applicant identification code can be assigned, and the applicant is preferably represented by a representative name based on at least one of the first country or the second country. Applicant representative naming is described separately.) When representative names are applied, a correspondence is obtained between the applicant identification code, the applicant name in the first country, the applicant name in the second country, and so on up to the applicant name in the n-th country. Of course, an applicant may be written in a plurality of ways in country i. These variants are found in the course of representative naming, and the plurality of applicant names written in country i can be resolved by representative naming into a single applicant name for that country.

Therefore, the patent documents corresponding to the applicant identification code can be obtained from the search index of each country, from the patent information data of each country, or from the entire patent information data, simply by inputting the applicant name or the applicant identification code into the search engine or the DBMS.

If the applicant's representative name has not been established in the first and second countries, patent documents may be obtained for each country by 1) a method using the treaty priority number, or 2) a method using the name of the applicant as written in each country. In the latter case in particular, in order to reduce errors, it is preferable to exclude components that do not identify the applicant, such as "Co., Ltd.", "Kabushiki Kaisha", and "Ltd". This will be described in detail in the applicant representative naming method.
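The exclusion of non-identifying components from an applicant name before comparison can be sketched as follows. The token list and function names are hypothetical and cover only a few Latin-script legal forms; the sketch is not the applicant representative naming method of the present invention.

```python
import re

# Hypothetical list of tokens that describe a legal form rather than identify the applicant.
NON_IDENTIFYING = {"co", "ltd", "inc", "corp", "kabushiki", "kaisha"}


def normalize_applicant(name: str) -> str:
    """Lower-case the name, drop punctuation, and remove legal-form tokens."""
    cleaned = re.sub(r"[^\w\s]", " ", name.lower())
    tokens = [t for t in cleaned.split() if t not in NON_IDENTIFYING]
    return " ".join(tokens)


def same_applicant(a: str, b: str) -> bool:
    """Treat two written names as the same applicant if their normalized forms match."""
    return normalize_applicant(a) == normalize_applicant(b)
```

Matching on the normalized form reduces misses caused by spelling variants of the legal form, although it can still confuse distinct companies that share the remaining words.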

Through the above method, the set of patent documents to be managed for a specific applicant is first determined for each country or for a group of countries combined. The patent document set obtained by the applicant-name unit document set acquisition module is managed in the form of applicant-name unit document set data. From the DBMS point of view, the applicant-name unit document set data may be held as temporary data, for example as an ordinary view, or it may be stored by recording, under the applicant name, the document unique numbers that constitute it. The applicant-name unit document set data may also be stored as a materialized view supported by recent DBMSs, or stored permanently in a table. Regardless of the form in which it is stored, the applicant-name unit document set data can present a view containing the information the user requires by means of a command supported by the DBMS, such as a join. The applicant-name unit document set data must include the document unique number for accessing each patent document and, in addition to the document unique number, may further include information selected from the entire data of the applicant-name unit patent documents. The generated applicant-name unit document set is managed by the applicant-name unit document set management module.

Document management

The management of patent documents by the applicant-name unit document set management module includes CRUD (create, read, update, delete). Creation, reading, updating, and deletion are managed so that at least one of the four management functions can be performed according to the access right to the document. Through the applicant-name unit management module, the administrator of the applicant-name unit patent information service system may delete a record or document unique number that is wrong or contrary to fact, may add a specific record or document unique number, and, when the content information corresponding to a document unique number has been changed or corrected, may reflect that change or correction. Such deletion, addition, or updating is necessary because the information on a specific patent document may have changed between the time the patent document was obtained and the present. Examples of such changes include: 1) a change of name through assignment during or after application; 2) a change in the information on the inventor or applicant (merger, change of company name, change of address, addition or removal of an applicant or inventor, etc.); 3) revision of the patent classification code; 4) amendment of the patent specification, where changes in the content of the specification are to be reflected; and 5) an increase or decrease in family members, such as a foreign application or a divisional application.
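CRUD management gated by access rights can be sketched as follows. The Right flags, the DocumentSet class, and the way rights are passed in are hypothetical; the sketch only illustrates that each of the four functions is permitted or refused according to the caller's access right.

```python
from enum import Flag, auto
from typing import Dict, List


class Right(Flag):
    CREATE = auto()
    READ = auto()
    UPDATE = auto()
    DELETE = auto()


class DocumentSet:
    """Document unique number -> content information, with per-operation right checks."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}

    @staticmethod
    def _require(rights: Right, needed: Right) -> None:
        if needed not in rights:
            raise PermissionError(f"{needed.name} right is required")

    def create(self, rights: Right, doc_id: str, info: dict) -> None:
        self._require(rights, Right.CREATE)
        self._docs[doc_id] = dict(info)

    def read(self, rights: Right, doc_id: str) -> dict:
        self._require(rights, Right.READ)
        return dict(self._docs[doc_id])  # return a copy so callers cannot bypass update()

    def update(self, rights: Right, doc_id: str, changes: dict) -> None:
        self._require(rights, Right.UPDATE)
        self._docs[doc_id].update(changes)

    def delete(self, rights: Right, doc_id: str) -> None:
        self._require(rights, Right.DELETE)
        del self._docs[doc_id]

    def ids(self) -> List[str]:
        return sorted(self._docs)
```

In this sketch an administrator would hold all four rights, while an individual inventor might hold only Right.READ.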

Through the above process, a subset of patent documents belonging to a specific applicant name unit can be extracted, and an applicant name unit patent information service system can be created with that subset as its managed documents.

FIG. 76 illustrates one embodiment of a process in which an administrator of the applicant name unit patent information service system performs CRUD on the managed document set.

The management module permits access by an administrator who has management authority over the applicant name unit patent information service system (S2620). If the administrator changes the set of managed patent documents, the module obtains the change (S2630); if the administrator changes an inventor name unit patent information service system, it obtains the change (S2640); if the administrator changes a second applicant name unit patent information service system, it obtains the change (S2650); and if the services provided by the applicant name unit patent information service system are changed, it obtains the change (S2660). The obtained changes are then stored (S2670).

FIG. 77 illustrates one embodiment of a process in which an administrator of the inventor name unit patent information service system performs CRUD on the managed document set.

The management module permits access by an administrator who has management authority over the inventor name unit patent information service system (S2720). If the administrator changes the set of managed patent documents, the module obtains the change (S2730); if the administrator changes the inventor name unit patent information service system, it obtains the change (S2740); if the administrator changes a second applicant name unit patent information service system, it obtains the change (S2750); and if the services provided by the applicant name unit patent information service system are changed, it obtains the change (S2760). The obtained changes are then stored (S2770).

Inventor Name Unit Patent Information Service System Generation Engine

Configuration

Once the set of patent documents for the applicant name unit is determined, at least one inventor included in that set can be extracted, and the extracted inventors are managed as inventor list data. Extraction of the inventors and generation and management of the inventor list data are performed by the inventor name extraction module.

A document set for at least one individual inventor can be obtained from the applicant name information and the inventor information in the inventor list data. The inventor name unit document set is obtained by the inventor name unit document set acquisition module included in the inventor name unit patent information service system generation engine. Using the inventor list associated with the applicant name unit patent document set, this module creates, for each inventor, an inventor name unit patent document set that is a subset of the applicant name unit patent document set. The generated inventor name unit document set is managed by the inventor name unit document set management module.

Inventor List and Identification Code

When the patent document set for the applicant name unit is determined, all bibliographic details of the set are obtained, and from them all information in the inventor field can be obtained. In obtaining the inventor information, the full list of inventors associated with the patent document set becomes known, and an inventor identification code can be generated and assigned to each inventor on that list.

Document extraction by inventor

Given the inventor list generated by the inventor name extraction module, the following three methods may be used, by way of example, to extract for each inventor the subset of patent documents in which that inventor is listed. First, a search engine may be used, with a search expression that combines "applicant name or applicant identification code" and "individual inventor name" under an AND condition. Second, a DBMS may be queried with an SQL statement that combines "applicant name or applicant identification code" and "individual inventor name" under an AND condition. Third, when the inventor list is generated from the applicant name unit document set, the correspondence between each document unique number, such as the application number, and its inventors is generated (a 1:n relationship for each patent document); inverting this correspondence yields, for each inventor, the document unique numbers of the patent documents that list that inventor. When this is done for the entire patent document set, at least one patent document subset corresponding to each inventor identification code can be extracted. Through any one or more of these three methods, at least one patent document subset in which a particular inventor is listed is first determined, and the subsets determined per inventor are stored in the inventor unit document DB by the corresponding module.
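The third method, together with the assignment of inventor identification codes, can be sketched as follows. This is only an illustration under stated assumptions: the code format "INV0001", the function names, and the sample data are hypothetical, and inventors are distinguished here by name alone.

```python
from collections import defaultdict

# Document unique number -> inventors listed on that document (1:n per document).
doc_inventors = {
    "KR-001": ["Kim A", "Lee B"],
    "KR-002": ["Kim A"],
    "KR-003": ["Park C", "Lee B"],
}


def assign_inventor_ids(doc_inventors):
    """Give each distinct inventor a code in order of first appearance."""
    ids = {}
    for doc_id in sorted(doc_inventors):
        for name in doc_inventors[doc_id]:
            if name not in ids:
                ids[name] = f"INV{len(ids) + 1:04d}"
    return ids


def invert(doc_inventors, ids):
    """Turn document -> inventors into inventor code -> documents."""
    by_inventor = defaultdict(list)
    for doc_id in sorted(doc_inventors):
        for name in doc_inventors[doc_id]:
            by_inventor[ids[name]].append(doc_id)
    return dict(by_inventor)


inventor_ids = assign_inventor_ids(doc_inventors)
inventor_docs = invert(doc_inventors, inventor_ids)
print(inventor_docs)
```

Because the inversion touches each document-inventor pair once, it avoids issuing a separate search or SQL query per inventor, which is the practical difference from the first two methods.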

FIG. 66 illustrates one embodiment of a method of generating one applicant name unit patent information service system from the patent information data of all applicants, and generating an inventor name unit patent information service system for each inventor included in the patent document set that constitutes that applicant name unit patent information service system.

The applicant name unit patent information service system generation engine obtains the applicant name information (S1620), searches, through a search engine or DB query, for patent documents filed or registered under the obtained applicant name (S1630), and determines whether any such documents exist (S1640). If they exist, it generates an applicant name unit patent information service system with the retrieved patent documents as the managed document set (S1650). Subsequently, the inventor name unit patent information service system generation engine extracts the inventors from the retrieved patent document set and maps the patent documents to each inventor (S1660), and generates an inventor name unit patent information service system with the patent documents mapped to each inventor as the managed document set (S1670). If no documents exist, the applicant name unit patent information service system is not generated.

FIG. 67 illustrates one embodiment of a method of obtaining a list of applicant names, generating from the patent information data of all applicants as many applicant name unit patent information service systems as there are applicants on the list, and generating an inventor name unit patent information service system for each inventor included in the patent document set constituting each generated applicant name unit patent information service system.

The applicant name unit patent information service system generation engine obtains the list information of applicant names (S1720), obtains the information of an individual applicant name on the list (S1730), searches for patent documents filed or registered under the obtained individual applicant name (S1740), and determines whether any documents exist (S1750). If they exist, it generates an applicant name unit patent information service system for the individual applicant with the retrieved patent documents as the managed document set (S1760). Next, the inventor name unit patent information service system generation engine extracts the inventors from the patent document set and maps the patent documents to each inventor (S1770), and generates an inventor name unit patent information service system with the patent documents mapped to each extracted inventor as the managed document set (S1780). It is then determined whether an unprocessed applicant remains in the obtained list of applicants (S1790), and generation of the applicant name unit and inventor name unit patent information service systems is repeated until none remains.
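The loop over a list of applicant names can be sketched as below. The in-memory list PATENTS stands in for the search engine or DB query, and the function names and the dictionary layout of a "system" are assumptions made only for this illustration.

```python
from collections import defaultdict

# Stand-in for the patent information data of all applicants (hypothetical records).
PATENTS = [
    {"doc_id": "KR-001", "applicant": "Acme", "inventors": ["Kim A", "Lee B"]},
    {"doc_id": "KR-002", "applicant": "Acme", "inventors": ["Kim A"]},
    {"doc_id": "KR-003", "applicant": "Other Co", "inventors": ["Park C"]},
]


def search_by_applicant(name):
    """Search for documents filed or registered under the applicant name."""
    return [p for p in PATENTS if p["applicant"] == name]


def build_systems(applicant_names):
    """Generate applicant-unit and inventor-unit document sets for each listed applicant."""
    systems = {}
    for name in applicant_names:            # repeat until no applicant remains
        docs = search_by_applicant(name)
        if not docs:                        # no documents: no system is generated
            continue
        per_inventor = defaultdict(list)
        for patent in docs:                 # map documents to each inventor
            for inventor in patent["inventors"]:
                per_inventor[inventor].append(patent["doc_id"])
        systems[name] = {
            "documents": [p["doc_id"] for p in docs],
            "inventors": dict(per_inventor),
        }
    return systems


systems = build_systems(["Acme", "Nobody Inc"])
print(systems)
```

An applicant with no retrieved documents, such as "Nobody Inc" here, simply produces no entry, which mirrors the branch in which no system is generated.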

FIG. 68 illustrates one embodiment of a method of obtaining a list of applicant names from an obtained document set, generating from the patent information data of all applicants as many applicant name unit patent information service systems as there are applicants on the list, and generating an inventor name unit patent information service system for each inventor included in the patent document set constituting each generated applicant name unit patent information service system.

The applicant name unit patent information service system generation engine obtains a document set from which applicant name unit patent information service systems are to be generated (S1820), extracts the applicant names from the document set, and generates list information of applicant names (S1830).

Subsequently, the applicant name unit patent information service system generation engine obtains the list information of applicant names and the information of an individual applicant name on the list (S1840), searches for patent documents filed or registered under the individual applicant name (S1750), and determines whether any documents exist (S1860). If they exist, it generates an applicant name unit patent information service system for the individual applicant with the retrieved patent documents as the managed document set (S1870). Next, the inventor name unit patent information service system generation engine extracts the inventors from the patent document set and maps the patent documents to each inventor (S1880), and generates an inventor name unit patent information service system with the patent documents mapped to each extracted inventor as the managed document set (S1890). It is then determined whether an applicant name that has not yet been processed remains (S1891), and generation of the applicant name unit and inventor name unit patent information service systems is repeated until none remains.

FIG. 69 illustrates one embodiment of a method of generating, when generation of an applicant name unit patent information service system is ordered, one applicant name unit patent information service system from the patent information data of all applicants, and generating an inventor name unit patent information service system for each inventor included in the patent document set constituting that applicant name unit patent information service system.

The applicant name unit patent information service system generation engine obtains the applicant name information from an external third party (S1920), searches, through a search engine or DB query, for patent documents filed or registered under the obtained applicant name (S1930), and determines whether any such documents exist (S1940). If they exist, it generates an applicant name unit patent information service system with the retrieved patent documents as the managed document set (S1950). Subsequently, the inventor name unit patent information service system generation engine extracts the inventors from the retrieved patent document set and maps the patent documents to each inventor (S1960), and generates an inventor name unit patent information service system with the patent documents mapped to each inventor as the managed document set (S1970). If no documents exist, the applicant name unit patent information service system is not generated.

The methods of FIGS. 66 to 69 may be applied without distinguishing individual countries, or may be applied to each country individually. Where no country is specified, a representative name of the applicant should be assumed, particularly where the countries concerned use different languages.

FIG. 70 relates to a method of generating the applicant name unit patent information service system on a per-country basis. Generating it on a per-country basis means generating the applicant name unit patent information service system and the inventor name unit patent information service system from the per-country patent information data of the applicant. Accordingly, all of the methods of FIGS. 66 to 69 may be applied.

The applicant name unit patent information service system generation engine obtains the applicant name information (S2020), obtains a list of countries (S2030), searches, for a country in the obtained list, for patent documents filed or registered under the applicant name (S2040), and determines whether any documents exist (S2050). If none exist, processing for that country is complete; if they exist, it generates a per-country applicant name unit patent information service system with the patent documents retrieved for that country as the managed document set (S2060). Next, the inventor name unit patent information service system generation engine extracts the inventors from the per-country patent document set and maps the patent documents to each inventor (S2070), and generates a per-country inventor name unit patent information service system with the patent documents mapped to each extracted inventor as the managed document set (S2080). It is then determined whether any country in the list remains unprocessed (S2090); if so, the step of searching for patent documents filed or registered under the applicant name for that country (S2040) is performed again.

FIG. 71 is a view illustrating an embodiment of a process of generating a unit patent information service system of an applicant name and a unit patent information service system of an inventor name by using family information.

The applicant name unit patent information service system generation engine obtains the applicant name information (S2120), searches the first country for patent documents filed or registered under the obtained applicant name (S2130), and determines whether any documents exist (S240). If none exist, processing for the first country is complete; if they exist, it generates a first-country applicant name unit patent information service system with the patent documents retrieved in the first country as the managed document set (S2150). Then the inventor name unit patent information service system generation engine extracts the inventors from the patent document set retrieved in the first country and maps the patent documents to each inventor (S2160), and generates a first-country inventor name unit patent information service system with the patent documents mapped to each inventor extracted in the first country as the managed document set (S2170). Next, the applicant name unit patent information service system generation engine obtains family patent numbers on the basis of the patent application numbers of the patent documents retrieved in the first country (S2180), and generates a second-country applicant name unit patent information service system with the patent documents obtainable in the second country through the family patent numbers as the managed document set (S2190). Then the inventor name unit patent information service system generation engine extracts the inventors from the patent document set obtained in the second country and maps the patent documents to each inventor (S2191), and generates a second-country inventor name unit patent information service system with the patent documents mapped to each inventor extracted in the second country as the managed document set (S2193).

FIG. 72 illustrates an embodiment of a process of generating an applicant name unit patent information service system and an inventor name unit patent information service system by using a priority claim number in a country unit.

The applicant name unit patent information service system generation engine obtains the applicant name information (S2220), searches the first country for patent documents filed or registered under the obtained applicant name (S2230), and determines whether any documents exist (S2240). If none exist, processing for the first country is complete; if they exist, it generates a first-country applicant name unit patent information service system with the patent documents retrieved in the first country as the managed document set (S2250). Then the inventor name unit patent information service system generation engine extracts the inventors from the patent document set retrieved in the first country and maps the patent documents to each inventor (S2260), and generates a first-country inventor name unit patent information service system with the patent documents mapped to each inventor extracted in the first country as the managed document set (S2270).

Subsequently, the applicant name unit patent information service system generation engine searches the priority claim numbers in the second country against the list of patent application numbers of the patent documents retrieved in the first country (S2280), and generates a second-country applicant name unit patent information service system with the patent documents retrieved in the second country as the managed document set (S2290). Next, the inventor name unit patent information service system generation engine extracts the inventors from the patent document set obtained in the second country and maps the patent documents to each inventor (S2291), and generates a second-country inventor name unit patent information service system with the patent documents mapped to each inventor extracted in the second country as the managed document set (S2293).
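The priority-claim lookup can be illustrated with a minimal sketch. The application numbers, field names, and function name below are hypothetical, and each second-country document is assumed to carry a single priority claim number.

```python
# Documents retrieved in the first country for the applicant (hypothetical numbers).
FIRST_COUNTRY_DOCS = [
    {"app_no": "KR10-2007-0000001", "applicant": "Acme"},
    {"app_no": "KR10-2007-0000002", "applicant": "Acme"},
]

# Second-country documents, each recording the priority claim number it relies on.
SECOND_COUNTRY_DOCS = [
    {"app_no": "US12/000001", "priority_no": "KR10-2007-0000001"},
    {"app_no": "US12/000002", "priority_no": "KR10-2006-0000099"},
]


def find_by_priority(first_country_docs, second_country_docs):
    """Return second-country documents whose priority claim number
    matches an application number retrieved in the first country."""
    wanted = {doc["app_no"] for doc in first_country_docs}
    return [doc for doc in second_country_docs if doc["priority_no"] in wanted]


second_country_set = find_by_priority(FIRST_COUNTRY_DOCS, SECOND_COUNTRY_DOCS)
print([doc["app_no"] for doc in second_country_set])
```

Matching on the priority claim number sidesteps differences in how the applicant name is written in each country, since the link is made through application numbers rather than names.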

FIG. 73 is a view illustrating an embodiment of a process of generating a unit patent information service system of the applicant name and a unit patent information service system of the inventor name by using family information in units of countries.

The applicant name unit patent information service system generation engine obtains the applicant name information (S2320), searches the first country for patent documents filed or registered under the obtained applicant name (S2330), and determines whether any documents exist (S2340). If they exist, it generates a first-country applicant name unit patent information service system with the patent documents retrieved in the first country as the managed document set (S2350). Then the inventor name unit patent information service system generation engine extracts the inventors from the patent document set retrieved in the first country and maps the patent documents to each inventor (S2360), and generates a first-country inventor name unit patent information service system with the patent documents mapped to each inventor extracted in the first country as the managed document set (S2370).

The applicant name unit patent information service system generation engine then obtains family patent numbers on the basis of the patent application numbers of the patent documents retrieved in the first country (S2380), and generates a second-country applicant name unit patent information service system with the patent documents obtainable in the second country through the family patent numbers as the managed document set (S2390). The patent documents obtained in the second country through the family patent numbers are integrated into the first-country inventor name unit patent information service system on the basis of the patent application numbers retrieved in the first country (S2391). Next, the inventor name unit patent information service system generation engine generates a second-country inventor name unit patent information service system with the integrated patent documents obtained in the second country as the managed document set (S2393).

FIG. 74 illustrates an embodiment of a process of generating a unit patent information service system of an applicant name and a unit patent information service system of an inventor name by using family information and priority claim information in a country unit.

The applicant name unit patent information service system generation engine obtains the applicant name information (S2420), searches the first country for patent documents filed or registered under the obtained applicant name (S2403), and determines whether any documents exist (S2440). If they exist, it generates a first-country applicant name unit patent information service system with the patent documents retrieved in the first country as the managed document set (S2450). Then the inventor name unit patent information service system generation engine extracts the inventors from the patent document set retrieved in the first country and maps the patent documents to each inventor (S2460), and generates a first-country inventor name unit patent information service system with the patent documents mapped to each inventor extracted in the first country as the managed document set (S2470).

Subsequently, the applicant name unit patent information service system generation engine searches the priority claim numbers in the second country against the list of patent application numbers of the patent documents retrieved in the first country (S2480), and generates a second-country applicant name unit patent information service system with the patent documents retrieved in the second country as the managed document set (S2490). Then the inventor name unit patent information service system generation engine integrates the patent documents obtained in the second country through the priority claim numbers into the first-country inventor name unit patent information service system, on the basis of the patent application numbers retrieved in the first country (S2491), and generates a second-country inventor name unit patent information service system with the integrated patent documents obtained in the second country as the managed document set (S2493).

Same-name inventor problem

If an applicant is a large company, it may have more than one inventor with the same name, in which case address information may be used to distinguish them. In addition, the inventor unit patent documents are preferably obtained for each country, as in the case of obtaining the applicant name unit patent documents. The patent documents of an inventor may be obtained for each country 1) through the representative name of the applicant or of the inventor, or 2) preferably, through the treaty priority claim number. That is, the treaty priority claim number establishes the identity or correspondence between a first-country application and the related second-country application, and the inventors of two applications whose identity or correspondence is thus recognized may be treated as the same across both countries.
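A minimal sketch of distinguishing same-name inventors by address follows. The normalization used here (trimming whitespace and ignoring letter case) is an assumption made for illustration only; real address data would likely need more careful matching, and the records are hypothetical.

```python
from collections import defaultdict

# (document unique number, inventor name, inventor address) - hypothetical records.
records = [
    ("KR-001", "Kim Minsu", "Seoul"),
    ("KR-002", "Kim Minsu", "Busan"),
    ("KR-003", "Kim Minsu", "seoul "),
]


def group_by_name_and_address(records):
    """Group documents by (name, normalized address) so that inventors
    who share a name but not an address are kept apart."""
    groups = defaultdict(list)
    for doc_id, name, address in records:
        key = (name.strip(), address.strip().casefold())
        groups[key].append(doc_id)
    return dict(groups)


inventor_groups = group_by_name_and_address(records)
print(inventor_groups)
```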

Document management

Management of patent documents by the inventor name unit document set management module broadly consists of one or more of deletion, addition, and update (including reflecting changes or corrections). Through the inventor name unit document set management module, the administrator of the inventor name unit patent information service system (who may be the inventor or a third party) may delete a record or document unique number that is factually wrong, add a specific record or document unique number, and, where the content information corresponding to a document unique number has been changed or corrected, reflect that change or correction. Such deletion, addition, or update is necessary because the information on a specific patent document may change between the time the document was obtained and the present. Examples of such changes include: 1) a change of name through assignment during or after application; 2) a change in inventor or applicant information (merger, change of company name, change of address, addition or removal of an applicant or inventor, etc.); 3) a revision of the patent classification code; 4) an amendment of the patent specification, where the change to its contents is to be reflected; and 5) an increase or decrease in the patent family, for example through a foreign application or a divisional application.

Through the above process, a subset of patent documents in which a specific inventor belonging to a specific applicant name unit is listed as an inventor can be extracted, and an inventor name unit patent information service system can be created with that subset as its managed documents.

FIG. 75 illustrates an exemplary process performed by the applicant name unit patent information service system generation engine and the inventor name unit patent information service system generation engine when a new document is added.

The applicant name unit patent information service system generation engine acquires a new patent document (S2620), extracts the applicant information from the new patent document (S2630), and determines whether an applicant name unit patent information service system already exists for that applicant information (S2640).

If such a system exists, the new patent document information is added to the existing applicant name unit patent information service system (S2650), the inventor name unit patent information service system generation engine extracts the inventor information (S2660), and it is determined whether an inventor name unit patent information service system already exists for that inventor information under the existing applicant name unit patent information service system (S2680). If one exists, the new patent document information is added to the existing inventor name unit patent information service system (S2690). If none exists, an inventor name unit patent information service system is generated on the basis of the inventor information, with the new patent document information as its managed document set (S2699).

On the other hand, if no applicant name unit patent information service system exists for the applicant information, the applicant name unit patent information service system generation engine generates an applicant name unit patent information service system with the new patent document as its managed document set (S2693). Then the inventor name unit patent information service system generation engine extracts the inventors from the new patent document and maps the patent document to each inventor (S2695), and generates an inventor name unit patent information service system with the patent documents mapped to each extracted inventor as the managed document set (S2697).
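The branching for a newly added document can be sketched as follows. The dictionary layout of a "system" and the function name are hypothetical; the sketch only shows the existing-versus-new decisions at the applicant level and the inventor level.

```python
def add_new_document(systems, doc):
    """Route a new patent document into applicant-unit and inventor-unit sets,
    creating either kind of set when it does not exist yet."""
    applicant = doc["applicant"]
    if applicant not in systems:
        # No existing applicant-unit system: generate one for this applicant.
        systems[applicant] = {"documents": [], "inventors": {}}
    applicant_system = systems[applicant]
    applicant_system["documents"].append(doc["doc_id"])

    for inventor in doc["inventors"]:
        if inventor not in applicant_system["inventors"]:
            # No existing inventor-unit system under this applicant: generate one.
            applicant_system["inventors"][inventor] = []
        applicant_system["inventors"][inventor].append(doc["doc_id"])
    return systems


existing = {"Acme": {"documents": ["KR-001"], "inventors": {"Kim A": ["KR-001"]}}}
add_new_document(existing, {"doc_id": "KR-010", "applicant": "Acme",
                            "inventors": ["Kim A", "Choi D"]})
add_new_document(existing, {"doc_id": "KR-011", "applicant": "NewCo",
                            "inventors": ["Han E"]})
print(existing)
```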

Unit analysis of applicant name or inventor name

At least one analysis algorithm of the analysis module of the present invention is set in advance in the applicant name unit patent information service system and/or the inventor name unit patent information service system. That is, when the applicant name unit or inventor name unit patent information service system is generated, or afterwards, the analysis algorithms required for the applicant name unit or the inventor name unit are set automatically, without user intervention, to constitute the analysis service of that system. It is desirable that the analysis algorithm be executed when the system is generated, so that the system holds the analysis results in advance. However, if the analysis algorithm runs fast enough that the user does not have to wait in real time, it may instead be executed when the analysis subject is accessed externally (for example, by a click of an administrator or user). Therefore, when the analysis module is composed of two or more analysis algorithms, the analysis subjects and their corresponding algorithms are set in advance, while the timing of execution may differ among them.

The way in which the analysis service is set in advance in the applicant name unit or inventor name unit patent information service system is described in more detail as follows. An analysis service consists of a document set to be analyzed, a selected analysis subject specifying which analysis to perform on that set, and the analysis algorithm corresponding to the analysis subject. Since the analysis subjects and analysis algorithms are already provided in the analysis system of the present invention, what remains is to specify the document set on which the algorithm is to run. That document set is already specified when the applicant name unit or inventor name unit patent information service system is generated. Hence, by treating the documents managed by that system as the analysis target document set, the analysis can be performed automatically for at least one predetermined analysis subject.

In this case, when the documents managed by the applicant name unit or inventor name unit patent information service system are clustered, the preset analysis algorithm for a predetermined analysis subject can be performed on each clustered partial document set.

In addition, for the entire document set managed by the applicant name unit or inventor name unit patent information service system, or for a clustered partial document set, various statistics or analysis results, such as the most frequent IPC (or other patent classification code), the field of maximum concentration, and the field of maximum AI, may be generated in a first or second stage, and an analysis algorithm for each analysis subject may be set on those statistics or results. For example, first, the most frequent IPC at the subclass or group level is extracted from the managed document set of a specific applicant name unit patent information service system; second, the domestic and foreign applicants with the most applications in that most frequent IPC are extracted; and then the number of applications per year by those applicants is output as an analysis result, together with a year-by-year analysis of each applicant's concentration rate and share in that IPC. As in this example, two or more analyses may be combined, with the result of the preceding analysis used as input to the one that follows.
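The chained analysis in the example above can be sketched with Python's collections.Counter. The records, the IPC codes assigned to them, and the function names are hypothetical, each document is assumed to carry a single IPC subclass, and the concentration rate and share calculations are omitted.

```python
from collections import Counter

# Hypothetical patent information data of all applicants, one IPC subclass per document.
ALL_DOCS = [
    {"applicant": "Acme", "ipc": "H01M", "year": 2005},
    {"applicant": "Acme", "ipc": "H01M", "year": 2006},
    {"applicant": "Acme", "ipc": "G06F", "year": 2006},
    {"applicant": "Beta", "ipc": "H01M", "year": 2005},
    {"applicant": "Beta", "ipc": "H01M", "year": 2006},
    {"applicant": "Beta", "ipc": "H01M", "year": 2006},
    {"applicant": "Gamma", "ipc": "H01M", "year": 2006},
]


def most_frequent_ipc(docs):
    """First analysis: the most frequent IPC in the managed document set."""
    return Counter(d["ipc"] for d in docs).most_common(1)[0][0]


def top_applicants(all_docs, ipc, n):
    """Second analysis: the applicants with the most applications in that IPC."""
    counts = Counter(d["applicant"] for d in all_docs if d["ipc"] == ipc)
    return counts.most_common(n)


def filings_per_year(all_docs, ipc, applicant):
    """Third analysis: yearly application counts for one applicant in that IPC."""
    years = Counter(d["year"] for d in all_docs
                    if d["ipc"] == ipc and d["applicant"] == applicant)
    return dict(sorted(years.items()))


managed_set = [d for d in ALL_DOCS if d["applicant"] == "Acme"]
ipc = most_frequent_ipc(managed_set)            # output of step 1 feeds step 2
leaders = top_applicants(ALL_DOCS, ipc, 2)      # output of step 2 feeds step 3
yearly = {name: filings_per_year(ALL_DOCS, ipc, name) for name, _ in leaders}
print(ipc, leaders, yearly)
```

Each function takes the previous result as an argument, which is one simple way to express that the preceding analysis supplies the input to the next.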

The analysis results may be generated in advance, without user input, for the whole document set managed by the applicant-unit or inventor-unit patent information service system or for a partial document set of that set. Because the execution time varies with the size of the document set and with the analysis algorithm, the at least one analysis algorithm of the present invention may be executed when the applicant-unit or inventor-unit patent information service system is created (for example, when the document set is small or the algorithm for the analysis subject takes little time), or it may be executed after the system has been created. Since patent information is updated daily or at a predetermined interval and new documents continue to appear, the analysis of the present invention is preferably performed automatically at a predetermined interval.

The patent document sets managed in an applicant-unit patent information service system are broadly divided into 1) a set of patent documents related to the applicant's own company, 2) a set of documents related to third parties (for example, the administrator of one applicant-unit patent information service system may also hold applicant-unit patent information service systems for other companies, such as competitors), and 3) at least one subset of the patent document sets managed by the administrator (a patent document set clustered or grouped by the administrator's own criteria). At least one of the analyses described above can be performed on each of these sets, and at least one further analysis can be performed using the result of that analysis as an input value.

Applicant's Unit vs. Inventor's Unit Patent Information Service System Relationship

Hereinafter, the relationship between the applicant-unit patent information service system and the inventor-unit patent information service system will be described in more detail. At least one inventor-unit patent information service system may be generated under an applicant-unit patent information service system. The number of inventor-unit systems is preferably less than or equal to the total number of inventors named in the patent documents filed or registered in the applicant's name.

Hereinafter, the administrative aspects of the applicant's unit patent information service system and the inventor's unit patent information service system will be described in detail.

First, the administrator of the applicant-unit patent information service system can set permissions for each inventor-unit patent information service system. This permission setting includes managing access rights to a specific inventor-unit system that exists under the applicant-unit system. For example, when a specific inventor belongs to the administrator's organization, that inventor may be given access to his or her own inventor-unit patent information service system. From the administrator's standpoint, when an inventor belonging to the organization leaves the company, the inventor-unit system for that inventor no longer needs to be maintained. In that case, either 1) the inventor-unit system may be deleted, or 2) the inventor's access rights may be removed (by deleting the ID, changing the password, and so on) while the system itself is kept in case the inventor rejoins.

Second, the administrator of the applicant-unit patent information service system can delete an inventor-unit patent information service system itself. Deletion applies when a person initially identified as an inventor does not belong to the administrator's organization or when there is no need to manage that inventor. The inventors listed in a patent application usually belong to the applicant organization, but in many cases they do not. Because inventor-unit systems are generated for such outside inventors as well, it must be possible to delete them. Of course, the administrator may instead allow an inventor who does not belong to the applicant's organization to access his or her inventor-unit system.

Third, the administrator of the applicant-unit patent information service system can add and update inventor-unit patent information service systems. Addition means generating a further inventor-unit system. When, after at least one inventor-unit system has been generated, an inventor is listed for the first time in a newly added document, it is sufficient to create an inventor-unit system for that inventor with the new document as its managed document.

Fourth, when the patent documents are updated, each newly added patent document may be added to the documents managed by the applicant-unit patent information service system, and, for each inventor named in the new document, it may also be added to the documents managed by that inventor's inventor-unit system. Updates also include the lapse (extinguishment) of rights; when a right lapses, the existing patent document may be deleted from the managed documents or marked as lapsed.
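
The third and fourth points above can be illustrated with a short sketch. It assumes, purely for illustration, that each service system is a small in-memory object keyed by application number; the class names, the status strings, and the sample application numbers are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceSystem:
    """Hypothetical stand-in for one patent information service system."""
    name: str
    documents: dict = field(default_factory=dict)  # application number -> status


@dataclass
class ApplicantSystem(ServiceSystem):
    """Applicant-unit system holding its dependent inventor-unit systems."""
    inventors: dict = field(default_factory=dict)  # inventor name -> ServiceSystem

    def apply_update(self, app_no, inventor_names, lapsed=False):
        """Reflect one updated document in the applicant and inventor systems."""
        status = "lapsed" if lapsed else "active"
        self.documents[app_no] = status
        for inventor in inventor_names:
            # An inventor seen for the first time gets a new inventor-unit system.
            if inventor not in self.inventors:
                self.inventors[inventor] = ServiceSystem(inventor)
            self.inventors[inventor].documents[app_no] = status


system = ApplicantSystem("A Corp")
system.apply_update("APP-001", ["Kim", "Lee"])
system.apply_update("APP-002", ["Kim", "Park"])  # Park appears for the first time
system.apply_update("APP-001", ["Kim", "Lee"], lapsed=True)  # mark as lapsed
```

This sketch marks a lapsed document rather than deleting it; deleting the entry from the managed documents would be the other option the text mentions.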

Problems in generation from published documents (public information vs. own information)

Hereinafter, a problem that arises when the applicant-unit and inventor-unit patent information service systems are generated from published patent documents, and a solution to it, will be described. In general, a patent application is compulsorily laid open 1 year and 6 months after the application date or at the time of registration, and it may be laid open earlier at the applicant's request. Published patent documents are available from outside sources. However, an applicant may know that an application with a specific application number has been filed while it is still unpublished, and such an unpublished patent document cannot be reflected as a managed document if the system relies only on patent information obtained from outside. In this case, the administrator of the applicant-unit patent information service system may enter information on the unpublished patent document into the applicant-unit system. For security reasons, the information held on an unpublished patent document is preferably less than that held on a published document. Examples of the information entered for an unpublished patent document are the application number, the inventors, and the application date; the title of the invention and other information needed for management may also be included. When inventor information is included, the unpublished application number may also be registered as a managed document of the corresponding inventor-unit patent information service system. 
When the unpublished patent document is later published, the applicant-unit patent information service system generation engine and/or the inventor-unit patent information service system generation engine may use the application number as a key to update the record for the previously unpublished document with the published patent information.
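
As a minimal sketch of this key-based update, assume that the managed records and the published patent information are both dictionaries keyed by application number. The field names, the sample values, and the function name `merge_published` are illustrative assumptions, not part of the specification.

```python
def merge_published(managed, published):
    """Fill in unpublished placeholder records once matching published data exists.

    Both arguments are dictionaries keyed by application number. Returns the
    application numbers whose records were updated.
    """
    updated = []
    for app_no, record in managed.items():
        if not record["published"] and app_no in published:
            record.update(published[app_no])  # add the published bibliographic data
            record["published"] = True
            updated.append(app_no)
    return updated


# Placeholder records entered by the administrator (deliberately minimal).
managed = {
    "APP-100": {"published": False, "inventors": ["Kim"], "filed": "2007-12-12"},
    "APP-101": {"published": False, "inventors": ["Lee"], "filed": "2008-01-15"},
}

# Published information obtained from outside; APP-101 is still unpublished.
published_feed = {
    "APP-100": {"title": "Example title", "ipc": ["G06F"]},
}

updated = merge_published(managed, published_feed)
```

Running the same function again after each periodic data update would pick up APP-101 once it is published, without any manual matching.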

Multi-tiered patent information service system management

Management module

Hereinafter, the multi-tier management function included in the management module will be described, beginning with the situation in which the problem of multi-tier management arises. If a single applicant has a large number of inventors, each inventor may belong to an organization with at least one level of hierarchy. For example, one inventor may belong to team a1 < department a < affiliate a < company A < alpha group. From an administrative point of view, it is natural for team a1 to manage the inventor-unit patent information service systems and patent documents related to team a1, for department a to manage the inventor-unit systems and/or patent documents related to its teams a1 through an, for each higher unit (the affiliate and then company A) to manage those related to the units beneath it, and for the alpha group to manage the applicant-unit patent information service systems of company A through company Z together with their inventor-unit systems and/or patent documents. However, the only such information normally available in a published document is the inventor's name and the company's name (that is, a patent document generally carries no information about the group to which an inventor belongs other than the applicant information). The administrator of each level and/or the administrator at the highest level therefore needs to specify where a particular patent document belongs. 
Because a patent information service system is generated from the patent documents to be managed, a system can be generated for any unit once affiliation information is available that states which inventor, team, department, affiliate, and company each patent document corresponds to. The patent document itself supplies the document, its inventors, and the company. The unit to which a patent document belongs can therefore be determined either by knowing which team, department, affiliate, and company each inventor belongs to, or by knowing directly which team, department, affiliate, or company the document belongs to.

Generation method

The multi-tier patent information service system may be generated by one or both of the following two methods. In the first method, inventor-unit patent information service systems are created first; they are then integrated into the team to which each inventor belongs, the team-level systems are integrated into departments, the department-level systems into affiliates, the affiliate-level systems into companies, and the company-level systems into the group, and each level manages the patent information service systems of the levels below it. Integration can be physical integration or meta-integration. In physical integration each level holds its own independent data; in meta-integration the upper-level patent information service system holds only meta information about the lower-level systems. With physical integration, for example, team a1 holds the inventor-unit systems of its inventors, and department a also holds its own copy of those inventor-unit systems. With meta-integration, the inventor-unit systems belong to team a1, and department a holds only management information for team a1 without holding the inventor-unit systems itself. In the second method, if a team, department, affiliate, company, and/or group knows (can specify) the patent documents it needs to manage, a patent information service system is generated for those specified documents. 
In this case, the patent information service system for the specified documents preferably takes the form of an applicant-unit patent information service system. Here too, either or both of physical integration and meta-integration may be employed. An administrator may exist for each patent information service system in the multi-tier hierarchy, and the administrator's management functions are provided by the management module of that multi-tier system unit. When managing the multi-tier patent information service system, the administrator of an upper-level system may grant permissions for at least one lower-level system, and this function may likewise be provided in the management module of the system unit.
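
The difference between physical integration and meta-integration can be shown with a small sketch. It is an illustration under stated assumptions only: the `Tier` class, its `mode` argument, and the document identifiers are hypothetical, and a real system would hold full service systems rather than sets of document identifiers.

```python
class Tier:
    """Hypothetical node in the multi-tier hierarchy (inventor, team, department, ...)."""

    def __init__(self, name, mode):
        if mode not in ("physical", "meta"):
            raise ValueError("mode must be 'physical' or 'meta'")
        self.name = name
        self.mode = mode
        self.own_docs = set()  # data held independently at this tier
        self.children = []     # lower tiers known to this tier

    def integrate(self, child):
        """Attach a lower tier; a physical tier also copies the child's data."""
        self.children.append(child)
        if self.mode == "physical":
            self.own_docs |= child.documents()

    def documents(self):
        """Documents visible at this tier."""
        docs = set(self.own_docs)
        if self.mode == "meta":
            # A meta tier holds only references and reads through to lower tiers.
            for child in self.children:
                docs |= child.documents()
        return docs


inventor = Tier("inventor Kim", "physical")
inventor.own_docs.update({"P1", "P2"})

team_physical = Tier("team a1 (physical)", "physical")
team_physical.integrate(inventor)

team_meta = Tier("team a1 (meta)", "meta")
team_meta.integrate(inventor)

inventor.own_docs.add("P3")  # a later update at the inventor tier
```

After the later update, the meta tier sees P3 immediately, while the physically integrated tier keeps its independent copy and would need to be re-synchronized. That trade-off between independence and freshness is one consideration when choosing between the two forms of integration.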

Through the above process, the following result may be generated.

First, the applicant-unit patent information service system can obtain, at the applicant level, the information on the documents managed for that applicant. The document information may be organized per country or integrated across countries, and the applicant-unit patent information service system may itself consist of one or more tiers. The administrator of the applicant-unit system can manage the document information of the applicant-unit system and of its inventor-unit systems through the applicant-unit management module, and can also manage the lower-tier patent information service systems under it.

Second, the inventor-unit patent information service system can obtain, at the inventor level, the information on the documents managed for that inventor. The document information may be organized per country or integrated across countries. The administrator of an inventor-unit patent information service system is preferably the inventor authorized by the administrator of the applicant-unit system, and the inventor can manage the document information of his or her own inventor-unit system through the inventor-unit management module.

Patent Information System Batch Generation Engine

The foregoing has described the generation, operation, and management of an applicant-unit patent information service system for a single applicant and of the inventor-unit patent information service systems dependent on it. Next, a method of generating and managing applicant-unit patent information service systems for a plurality of applicants will be described. The patent information service system batch generation engine of the present invention generates the applicant-unit systems for two or more applicants and the inventor-unit systems dependent on each of them. The configuration of the patent information system batch generation engine is shown in FIG.

Applicant Name Acquisition Module

For the patent information service system batch generation engine to generate applicant-unit patent information service systems for several applicants, the applicant names must first be obtained. Acquiring the applicant names for which patent information service systems are to be generated is the function of the applicant name acquisition module of the present invention. The applicant name acquisition module preferably obtains at least one applicant name in the form of a list. The applicant names are not limited to one country and may be selected from several countries.

Individual Unit Patent Information System Generation Engine Driving Module

Once the list of applicant names is obtained, an applicant-unit patent information service system and the inventor-unit patent information service systems dependent on it are generated for each applicant name. Generating them one by one for each name in the list is the role of the individual unit patent information system generation engine driving module of the present invention. That is, the driving module takes one applicant name, drives the patent information service system generation engine to generate the applicant-unit system and the inventor-unit systems for that name, and, upon being notified that generation is complete, drives the generation engine for the next applicant.

When the applicant-unit and inventor-unit patent information service systems are generated, the specific monitoring search expressions and analysis search expressions included in the monitoring service and the analysis service may be generated while the individual unit patent information system generation engine driving module is operating, or they may be added later to the individual applicant-unit and inventor-unit systems.

In some cases the driving module may generate only the applicant-unit patent information service system and not the inventor-unit systems dependent on it. For example, a company may wish to create an applicant-unit system for each of its competitors in order to obtain systematically the patent information of several domestic and foreign competitors. If the patent activity of each competitor's inventors needs to be known, inventor-unit systems may also be generated; otherwise the applicant-unit systems alone may suffice.

Thus inventor-unit patent information service systems need to be generated for some companies, while for others the applicant-unit system alone is sufficient. It is therefore preferable to obtain generation-scope information for each applicant name in the list and then to operate the individual unit patent information system generation engine driving module so that it reflects that scope.
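
The sequential driving of the generation engine, with a per-applicant generation scope, might be sketched as below. This is a hedged illustration: the scope labels `"applicant_only"` and `"with_inventors"`, the dictionary layout of documents and systems, and the function names are assumptions made for this example.

```python
def generate_system(applicant, documents, with_inventors):
    """Hypothetical generation engine: build one applicant-unit system."""
    own = [d for d in documents if d["applicant"] == applicant]
    system = {
        "applicant": applicant,
        "documents": [d["app_no"] for d in own],
        "inventors": {},  # inventor name -> managed application numbers
    }
    if with_inventors:
        for d in own:
            for inventor in d["inventors"]:
                system["inventors"].setdefault(inventor, []).append(d["app_no"])
    return system


def run_batch(applicant_list, documents):
    """Driving module: generate systems one applicant at a time, in list order."""
    systems = {}
    for applicant, scope in applicant_list:
        # The next applicant starts only after this call has returned.
        systems[applicant] = generate_system(
            applicant, documents, with_inventors=(scope == "with_inventors")
        )
    return systems


documents = [
    {"app_no": "APP-1", "applicant": "A Corp", "inventors": ["Kim", "Lee"]},
    {"app_no": "APP-2", "applicant": "A Corp", "inventors": ["Kim"]},
    {"app_no": "APP-3", "applicant": "B Corp", "inventors": ["Sato"]},
]

# Own company: full scope. Competitor: applicant-unit system only.
applicant_list = [("A Corp", "with_inventors"), ("B Corp", "applicant_only")]
systems = run_batch(applicant_list, documents)
```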

When applicant names selected from several countries are obtained and the patent information data are organized per country, the patent information service systems may be generated country by country. This is a matter of the order of batch processing and can therefore be easily implemented by those skilled in the art.

Individual unit patent information service system multi-stage grouping module

When a plurality of applicant-unit patent information service systems have been created, they may be grouped in at least one stage according to attributes of the applicants. Conversely, several applicant-unit patent information service systems may be placed under one group. For example, the applicant-unit systems of Hankook Tire Co., Ltd. and Kumho Tire Co., Ltd. may be placed under a group having the multi-tier structure manufacturing industry-automobile-tire. Such grouping is within the ordinary skill of those skilled in the art, and a detailed description is therefore omitted.
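
As one possible illustration of multi-stage grouping, the group hierarchy can be held as nested dictionaries, with each applicant-unit system placed at the end of a group path. The `place` function and the reserved `"_systems"` key are hypothetical names chosen for this sketch.

```python
def place(tree, path, applicant):
    """Put an applicant-unit system under a multi-stage group path."""
    node = tree
    for level in path:
        node = node.setdefault(level, {})  # create intermediate groups as needed
    node.setdefault("_systems", []).append(applicant)


groups = {}
tire_path = ["Manufacturing industry", "Automobile", "Tire"]
place(groups, tire_path, "Hankook Tire Co., Ltd.")
place(groups, tire_path, "Kumho Tire Co., Ltd.")

tire_group = groups["Manufacturing industry"]["Automobile"]["Tire"]["_systems"]
```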

Individual Unit Patent Information System Management Module

When two or more applicant-unit patent information service systems have been generated, they need to be managed, and this management is performed by the individual unit patent information system management module. The management module handles authentication, permission management, transfer of permissions, deletion, and changes of location or affiliation for the applicant-unit patent information service systems.

As described above, when applicant-unit patent information service systems have been created for a plurality of applicants, they need to be integrated and managed from the viewpoint of the entire system, and the integrated management module of the present invention performs this integrated management. FIG. 65 illustrates an embodiment of the configuration of the integrated management module. The integrated management module includes an individual unit patent information system multi-stage grouping module 6100 and an individual unit patent information system management module 6300, so that the individual patent information service systems can be grouped in multiple stages and the grouped systems can be managed.

Implementation of Applicant Unit Patent Information Service System

Hereinafter, the embodiment of the applicant unit patent information service system of the present invention will be described in more detail by taking an example.

FIG. 120 shows an applicant list screen for the top 500 applicants of the Republic of Korea, as one embodiment of a screen implementing the applicant-unit patent information system of the present invention. South Korea is selected in the Country tab, and Top 500 is selected from among the subtabs Top 500, Exchange Registered Companies, KOSDAQ Registered Companies, Multi-Application Companies, and All Companies. Tabs for universities, public institutions, research institutes, and multi-application individuals can of course be added under each country. The category to which an applicant belongs is determined with reference to the applicant representative name preprocessing module of the present invention, the multi-dimensional analysis calculation result table for total amount analysis, and the company information DB. A country's Top 500 may be extracted using analysis indexes such as most applications or most registrations in the multi-dimensional analysis calculation result table for total amount analysis. The stock exchange on which an applicant is listed can be obtained from the company information in the company information DB, and whether an applicant is a company or a university may be determined from the organization type information produced during applicant representative name processing and/or from the company information DB. The same applies to other countries. A link is attached to each applicant in the list, and clicking the link moves to that applicant's patent information service system. This is illustrated in FIG.

FIG. 121 shows an applicant list screen for NASDAQ registered companies in the US, as one embodiment of a screen implementing the applicant-unit patent information system of the present invention. The United States is selected in the Country tab, and NASDAQ is selected from among the subtabs Top 500, NYSE, NASDAQ, AMEX, and All Companies. Because there are many NASDAQ registered companies, there are further tabs for alphabetical (ABC) ranges and for all NASDAQ companies. Companies shown in bold have patents applied for or registered under the company's name, indicating that an applicant-unit patent information service system has been generated for them. Companies shown in light text either have no application or registered patent identified under the company name in the patent information master DB of the present invention, or have no applicant-unit patent information service system generated. This occurs mainly 1) when there is no application or registration under the company's name, or 2) when the company name and the applicant name do not match, or the applicant name has not been normalized to a representative name, so that the computer system cannot map them. The number of companies shown in light text will therefore decrease as applicant representative name processing becomes more complete. The same applies to other countries.

FIG. 122 shows an applicant list screen for JASDAQ registered companies in Japan, as one embodiment of a screen implementing the applicant-unit patent information system of the present invention. Japan is selected in the Country tab, and JASDAQ is selected from among the subtabs Top 500, JP1 (Tokyo 1 Exchange registered companies), JP2 (Tokyo 2 Exchange registered companies), JPM (MOTHERS registered companies), JASDAQ registered companies, and All Companies.

FIG. 123 shows an applicant list screen for London Stock Exchange registered companies in Europe, as one embodiment of a screen implementing the applicant-unit patent information system of the present invention. Europe is selected in the Country tab, and UK1 is selected from among the subtabs Top 500, UK1 (London stocks), AIM (London stocks), OVERSEAS LISTED (foreign companies listed in London), EURONEXT, and All Companies. The list of companies listed in Frankfurt is not shown. Companies registered on other stock exchanges may of course be added.

FIG. 124 shows an embodiment of the U.S. patent tab, selected by country in the patent list of the patent portfolio of 3COM's patent information system, which appears when 3COM (No. 6), one of the NASDAQ registered companies, is selected in FIG. 121. When 3COM is selected, the patent information service system of the present invention can search the US documents for 3COM, or query the patent document master DB, to generate a list of 3COM's documents and present it to the user.

FIG. 125 is a screen showing the inventor list of 3COM when the inventor list is selected in FIG. 124 and the United States is then selected in the Country tab. The patent information service system of the present invention may extract the list of all inventors of 3COM by querying the patent document master DB or by querying the multi-dimensional analysis result table for inventor analysis. Although the inventors of 3COM are listed here in alphabetical (ABC) order, it will be apparent to those skilled in the art that they may be listed in other orders, such as by number of applications, number of registrations, or number of citations.

FIG. 126 is a screen showing the list of patent documents related to the inventor Aldous Stepha .. (No. 9) when that inventor is selected from the inventor list in FIG. The patent information service system of the present invention obtains the selection information for the inventor and queries the patent document master DB with the obtained inventor information, thereby generating a list of the documents corresponding to the inventor and presenting it to the user.

FIG. 127 shows an embodiment of the number of applications per year for each IPC at the IPC subclass level, based on total applications, displayed when the Statistical Analysis tab is clicked in FIG. 124 for 3COM and the US is selected in the Country tab under the Technical Analysis SA (Systematic Analysis) menu. The screen also shows the simple analysis pop-up that appears on drilling down by pressing a specific cell value. It is structurally equivalent to FIG. 17, which is the corresponding embodiment for Samsung Electronics.

FIG. 128 shows an embodiment presenting competing-applicant information, based on the number of applications per year for each USPC at the USPC subclass level and on total applications, displayed when the Statistical Analysis tab is pressed in FIG. 124 for 3COM and the United States is selected in the Country tab under the Competitor SA (Systematic Analysis) menu. This screen is structurally equivalent to FIG. 38, which is the corresponding embodiment for Samsung Electronics.

FIG. 129 shows an exemplary year-by-year analysis result for the most frequently citing (or cited) applicants, displayed when the Statistical Analysis tab is pressed in FIG. 124 for 3COM and the United States is selected in the Country tab under the SA (Systematic Analysis) menu. Here the entire set of 3COM's application documents is set as the reference document set, and its forward (or backward) citation document set is used as the analysis target document set. This screen is structurally equivalent to FIG. 39, which is the corresponding embodiment for Samsung Electronics.

FIG. 130 shows an embodiment of the multi-application inventors for each IPC by year at the IPC subclass level, based on total applications, displayed when the Statistical Analysis tab is pressed in FIG. 124 for 3COM and the inventor analysis is selected in the SA (Systematic Analysis) menu and the Country tab. It is structurally equivalent to FIG. 35, which is the corresponding embodiment for Samsung Electronics.

It will be apparent to those skilled in the art that the embodiments of FIGS. 124 to 130, and the embodiments for Samsung Electronics Co., Ltd. or General Motors, apply equally to any of the companies/applicants shown in FIGS. that has a corresponding name in the patent information master DB of the present invention, that is, any company/applicant whose identity has been established through representative name processing. In the context of the present invention, DB refers to a database, a term familiar to those skilled in the art.

The present invention can be utilized throughout the patent information industry. Specifically, the present invention can be comprehensively used in various patent information related industries such as search, analysis, consulting, surveillance, investigation for infringement, analysis of competitors, and technology convergence pattern research.

FIG. 1 is a diagram illustrating an embodiment of the patent information system configuration of the present invention.

FIG. 2 is an exemplary diagram of the DB unit of the present invention.

FIG. 3 is an exemplary diagram of the patent document master DB of the present invention.

FIG. 4 is an exemplary diagram of the patent classification code master DB of the present invention.

FIG. 5 is an exemplary diagram of the subject master DB of the present invention.

FIG. 6 is an exemplary diagram of the calculation result table DB for multi-dimensional analysis of the present invention.

FIG. 7 is an exemplary diagram of the support DB unit and the secondary processing DB unit of the present invention.

FIG. 8 is an exemplary diagram of the preprocessing module unit of the present invention.

FIG. 9 is an exemplary diagram of the master DB generation module of the present invention.

FIG. 10 is an exemplary diagram of the citation information preprocessing module of the present invention.

FIG. 11 is an exemplary diagram of the patent information processing basic module of the present invention.

FIG. 12 is an exemplary diagram of the support module of the present invention.

FIG. 13 is an exemplary diagram of the patent intelligence module of the present invention.

FIG. 14 is an exemplary diagram of the delphion of the present invention based on the subclassification of the present invention is not an analysis of the present invention.

FIG. 15 is a diagram illustrating one embodiment of the directory creation module of the present invention.

FIG. 16 is a diagram illustrating one embodiment in which the patent classification code search module of the present invention operates.

FIG. 17 is a diagram illustrating an exemplary analysis result of application count data at the IPC subclass level for applications of Samsung Electronics Co., Ltd. filed in Korea, according to the present invention.

FIG. 18 is an exemplary diagram of an analysis result generated when drilling down to H01L.

FIG. 19 is an exemplary diagram of an analysis result generated when drilling down to H01L 21/00 and other lower patent classification codes.

FIG. 20 is a diagram illustrating an exemplary analysis result of application count data by year for the multi-application IPCs, at the IPC main group level, of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention.

FIG. 21 is a diagram illustrating an exemplary analysis result of application count data at the IPC 1-dot subgroup level for Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention.

FIG. 22 is a diagram illustrating an exemplary analysis result of application data by year at the IPC subclass level, based on the application documents of Samsung Electronics Co., Ltd. filed in the US, among all applicants in the DB held by the patent information system of the present invention.

FIG. 23 is a diagram illustrating an exemplary analysis result of application data by year at the IPC main group level, based on the registered documents of Samsung Electronics Co., Ltd. registered in the United States, among all applicants in the DB held by the patent information system of the present invention.

FIG. 24 is a diagram illustrating an exemplary analysis result of application count data at the USPC no-dot (subclass) level for General Motors applications filed in the US, among all applicants in the DB held by the patent information system of the present invention.

FIG. 25 is a diagram illustrating an exemplary analysis result of application number data per year of multi-application IPC of US Motor 1 dot level of General Motors filed in the US among all applicants in DB held by the patent information system of the present invention.

FIG. 26 is an exemplary diagram of total application volume analysis and drill down for IPC H04B, based on Korean application documents, according to the present invention.

FIG. 28 is a diagram illustrating total volume analysis of multi-application companies for IPC H04B, based on Korean patent documents, according to the present invention.

FIG. 29 is an exemplary diagram of multi-applicant analysis based on share for IPC H04B, based on Korean application documents, according to the present invention.

FIG. 30 is an exemplary diagram of the activity ratio by applicant for IPC H04B, based on Korean application documents, according to the present invention.

FIG. 31 is an exemplary diagram of total application volume analysis, including drill down, for IPC H04B and its lower classification codes, based on US application documents, according to the present invention.

FIG. 32 is a diagram illustrating an exemplary competitor analysis based on total volume for Samsung Electronics Co., Ltd. of Korea among all applicants in the DB held by the patent information system of the present invention.

FIG. 33 is a diagram illustrating an analysis of competing applicants, based on total volume, for each multi-application patent technology classification code of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention.

FIG. 34 is an exemplary diagram of an analysis of competing applicants based on total volume by IPC main group, based on the US patents of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention.

FIG. 35 is a diagram illustrating an analysis result of multi-application inventors by year for Samsung Electronics Co., Ltd., based on the total volume of Korean patent applications, among all applicants in the DB held by the patent information system of the present invention.

FIG. 36 is a diagram illustrating an analysis result of multiple applicants of multiple applications by Samsung Electronics Co., Ltd. based on the total amount of Korean patent applications among all applicants in the DB held by the patent information system of the present invention.

FIG. 37 is an exemplary diagram of a ranking analysis result of competing applicants by USPC subclass (no dot, below class level), based on the total volume of all US patent applications of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention.

FIG. 38 is an exemplary diagram of a yearly analysis result for the total volume of citations, using the forward citation document set of the present invention as the analysis target document set, when all US patent applications of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention are set as the reference document set.

FIG. 39 is an exemplary diagram of a yearly analysis result for multi-patent applicants, using the forward citation document set of the present invention as the analysis target document set, when all US patent applications of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention are set as the reference document set.

FIG. 40 is an exemplary diagram of a yearly analysis result for multi-patent patent classification codes (IPC main group level), using the forward citation document set of the present invention as the analysis target document set, when all US patent applications of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention are set as the reference document set.

FIG. 41 is an exemplary diagram of a yearly analysis result when drilling down on a multi-patent patent classification code (IPC main group level), using the forward citation document set of the present invention as the analysis target document set, when all US patent applications of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention are set as the reference document set.

FIG. 42 is an exemplary diagram of a yearly analysis result for multi-patent inventors, using the forward citation document set of the present invention as the analysis target document set, when all US patent applications of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention are set as the reference document set.

FIG. 43 is an exemplary diagram of a yearly analysis result for the most cited applicants, using the forward citation document set of the present invention, when all US patents registered by Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention are set as the reference document set.

FIG. 44 is an exemplary diagram of a yearly analysis result for the most cited inventors, using the forward citation document set of the present invention, when all US patents registered by Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention are set as the reference document set.

FIG. 45 is an exemplary diagram of a yearly analysis result when drilling down by IPC in the analysis of the most cited technologies by IPC main group, using the forward citation document set of the present invention, when all US patents registered by Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention are set as the reference document set.

FIG. 46 is an exemplary diagram of a yearly analysis result when drilling down in the analysis of the most cited technologies by USPC subclass (no dot, below class level), using the forward citation document set of the present invention, when all US patents registered by Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention are set as the reference document set.

FIG. 47 is an exemplary diagram of a total-volume-based analysis result, using the back citation document set for the entire reference document set as the analysis target document set, when all US patents registered by Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention are set as the reference document set, together with a chart generated for that analysis result by the chart generation module of the reporting module of the present invention.

FIG. 48 is an exemplary diagram of a yearly analysis of the total number of citations for the most cited inventors, using the back citation document set for the entire reference document set as the analysis target document set, when all US patents registered by Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system of the present invention are set as the reference document set, and of the document list generated by the simple analysis module when the user clicks a specific number in that result. For the document set corresponding to the clicked number, the simple analysis module provides the document list, the number of applications / registrations by year for the top applicants, the number of applications / registrations by year for the top inventors, and the number of applications / registrations by year for the top technical fields (IPC, USPC, FT), including drill down.

FIG. 49 is a diagram of one embodiment showing the drill down function for the top technical fields (IPC, USPC, FT) in the simple analysis module according to the present invention.

FIG. 50 is an exemplary diagram of a yearly analysis result for the total volume of citations, using the forward citation document set of the present invention as the analysis target document set, when the document set of a multi-application IPC subclass among all US application documents of Samsung Electronics Co., Ltd. among all applicants in the DB held by the patent information system is set as the reference document set.

FIG. 51 is a diagram illustrating an exemplary hierarchical patent information service system that is a sub-system of the patent information system of the present invention.

FIG. 52 is a diagram illustrating an embodiment of the individual unit patent information system generation engine according to the present invention.

FIG. 53 is a diagram illustrating the internal configuration of the patent information preprocessing module according to the present invention.

FIG. 54 is a diagram of one embodiment of the weight preprocessing module of the present invention.

FIG. 55 is a diagram of one embodiment of the citation information preprocessing module according to the present invention.

FIG. 56 is an exemplary diagram of the patent classification code preprocessing module 301-3-1 or 3500 of the present invention.

FIG. 57 is an exemplary diagram of the representative preprocessing module of the present invention.

FIG. 58 is an exemplary diagram of the representative phrase extraction preprocessing module of the present invention.

FIG. 59 is a diagram illustrating an embodiment of the family information preprocessing module according to the present invention.

FIG. 60 is a diagram illustrating an embodiment of the multiple patent classification code relation preprocessing module of the present invention.

FIG. 61 is a diagram illustrating an exemplary embodiment of the statistics preprocessing module for each patent classification code according to the present invention.

FIG. 62 is a diagram illustrating one embodiment of the patent information intelligence module of the present invention.

FIG. 63 is an exemplary diagram of the analysis module of the present invention.

FIG. 64 is a diagram illustrating an exemplary embodiment of the patent information system batch generation engine of the present invention.

FIG. 65 is a diagram illustrating the configuration of the integrated management module of the present invention.

FIG. 66 shows one embodiment of a method of generating a unit patent information service system for one applicant name from the entire patent information database 2300, and generating an inventor unit patent information service system for each inventor included in the patent document set constituting that applicant-name unit patent information service system.

FIG. 67 shows an embodiment of a method of obtaining a list of applicant names, generating from the entire patent information database 2300 as many applicant-name unit patent information service systems as there are applicants in the list, and generating an inventor unit patent information service system for each inventor included in the patent document set constituting each generated applicant-name unit patent information service system.

FIG. 68 shows an embodiment of a method of obtaining a list of applicant names from an obtained document set, generating from the entire patent information database 2300 as many applicant-name unit patent information service systems as there are applicants in the list, and generating an inventor unit patent information service system for each inventor included in the patent document set constituting each generated applicant-name unit patent information service system.

FIG. 69 shows an embodiment of a method of, when generation of an applicant-name unit patent information service system is ordered, generating a unit patent information service system for one applicant name from the entire patent information database 2300, and generating an inventor unit patent information service system for each inventor included in the patent document set constituting that applicant-name unit patent information service system.

FIG. 70 relates to a method for generating applicant-name unit patent information service systems on a country basis. Generating them on a country basis amounts to generating the applicant-name unit and inventor-name unit patent information service systems against the country-level patent information database 2300. Therefore, all of the methods of FIGS. 66 to 69 may be applied.

FIG. 76 is a diagram illustrating one embodiment of the present invention.

FIG. 78 is a diagram illustrating one embodiment of the present invention.

FIG. 79 is a diagram of one example of a method by which the cost perspective weight preprocessing module 3311 of the present invention processes weights.

FIG. 80 is an exemplary diagram of a method by which the citation perspective weight preprocessing module 3313 preprocesses weights from the citation perspective.

FIG. 81 is a diagram of one example of a method by which the dispute perspective weight preprocessing module 3315 of the present invention processes weights.

FIG. 82 is a diagram of one example of a method by which the concentration perspective weight preprocessing module 3317 of the present invention processes weights.

FIG. 83 is a diagram illustrating a method of processing a weight in the inventor unit of the present invention.

FIG. 84 is a diagram of one example of a method by which the applicant unit weight preprocessing module 3331 processes weights.

FIG. 85 is a diagram of one example of a method by which the inventor unit weight preprocessing module 3333 processes weights.

FIG. 86 is a diagram of one example of a method by which the agent unit weight preprocessing module 3335 of the present invention processes weights.

FIG. 87 is a diagram describing a method by which the family information preprocessing engine 3810 of the present invention processes family information.

FIG. 88 is a diagram of one example of a citation information preprocessing method of the present invention that expresses the number of citations.

FIG. 89 is a diagram illustrating a method of acquiring rear citation document information, which is information on later-filed documents citing a specific document of the present invention, and including that rear citation document information in the document information of the specific document.

FIG. 90 is a diagram of one embodiment of a citation information preprocessing method of the present invention.

FIG. 91 is a diagram showing an embodiment of a processing method of the patent classification code preprocessing engine for processing modified patent classification codes of the present invention.

FIG. 92 is a diagram explaining a method by which the hierarchical modified patent classification code database module of the present invention generates a modified patent classification code database.

FIG. 93 is a diagram of an exemplary method of creating a tree structure from USPC patent classification codes.

FIG. 94 is a diagram illustrating an exemplary method for allocating modified patent classification codes corresponding to the tree structure of USPC patent classification codes of FIG. 93.

FIG. 95 is a diagram illustrating the tree structure of modified patent classification codes having the same structure as that of the USPC patent classification codes of FIG. 93.

FIG. 96 is an exemplary diagram showing that a tree structure of patent classification codes as shown in FIG. 93 can be created from the Index to US Patent Classification (also known as the Classification Index File).

FIG. 97 is a diagram showing an embodiment of a patent classification code preprocessing method according to the present invention.

FIG. 98 is an exemplary diagram of a method by which the representative preprocessing module of the present invention performs representative naming preprocessing.

FIG. 99 is a diagram illustrating an exemplary naming method utilizing the priority claim number of the present invention.

FIG. 100 is a diagram illustrating a method of preprocessing a statistical value for each patent classification code of the present invention.

FIG. 101 is an exemplary diagram of a method for automatically including lower patent classification codes when generating a statistical value, parameter, or calculated value for a given patent classification code of the present invention.

FIG. 102 is an exemplary diagram of a method for generating a statistical value, parameter value, or calculated value for each patent classification code, including lower patent classification codes, from citing or cited information for a document subset of a specific document set according to the present invention.

FIG. 103 is a diagram illustrating an embodiment of a method for preprocessing multiple patent classification code relations according to the present invention.

FIG. 104 is a diagram illustrating one embodiment of a multiple patent classification code relation preprocessing method from a comparative perspective according to the present invention.

FIG. 105 is a diagram illustrating an exemplary phrase information preprocessing method of the present invention.

FIG. 106 is a diagram illustrating a representative phrase information preprocessing method of the present invention.

FIG. 107 is a diagram illustrating an exemplary phrase information preprocessing method according to the present invention.

FIG. 108 is a diagram explaining an analysis index calculation method according to the present invention.

FIG. 109 is a diagram illustrating a method of obtaining an analysis target patent document set according to the present invention.

FIG. 110 is a diagram illustrating a trend analysis method of the present invention.

FIG. 111 is a diagram illustrating an exemplary analysis method of the present invention.

FIG. 112 is an exemplary diagram of a citation analysis method of the present invention.

FIG. 113 is a diagram showing an exemplary citation analysis method of the present invention.

FIG. 114 is a diagram illustrating an example of a judging citation analysis method according to the present invention.

FIG. 115 is a diagram illustrating an embodiment of a method for analyzing multiple patent classification codes according to the present invention.

FIG. 116 is a diagram of one embodiment of a method for analyzing multiple patent classification codes according to the present invention.

FIG. 117 is a diagram illustrating an exemplary method for analyzing multiple patent classification codes according to the present invention.

FIG. 118 is a diagram of one embodiment of a method for operating the individual unit patent information system multi-stage grouping module of the present invention.

FIG. 119 is a diagram illustrating an embodiment of a method for batch generation of patent information systems according to the present invention.

FIG. 120 is an embodiment of a screen on which the applicant unit patent information system of the present invention is implemented, showing the applicant list for the top 500 applicants of the Republic of Korea. South Korea is selected in the Country tab, and Top 500 is selected among the sub-tabs Top 500, Exchange Registered Companies, KOSDAQ Registered Companies, Multi-Application Companies, and All Companies.

FIG. 121 is an embodiment of a screen on which the applicant unit patent information system of the present invention is implemented, showing the applicant list for NASDAQ registered companies in the US. The United States is selected in the Country tab, and NASDAQ is selected among the sub-tabs Top 500, NYSE, NASDAQ, AMEX, and All Companies. Because there are many NASDAQ registered companies, there are additional alphabetical (ABC) tabs and a tab for all NASDAQ companies.

FIG. 123 is an embodiment of a screen on which the applicant unit patent information system of the present invention is implemented, showing the applicant list for companies registered on the London Stock Exchange in Europe. Europe is selected in the Country tab, and UK1 is selected among the sub-tabs Top 500, UK1 (London stocks), AIM (London stocks), OVERSEAS LISTED (foreign companies listed in London), EURONEXT, and All Companies. The list of publicly traded companies in Frankfurt is not shown.

FIG. 124 is an exemplary embodiment of the screen for the U.S. patent tab, reached by selecting a country in the patent list of the patent portfolio in 3COM's patent information system, which appears when 3COM (No. 6), one of the NASDAQ registered companies, is selected in FIG. 121.

FIG. 125 is a screen showing the inventor list of 3COM when the inventor list is selected in FIG. 124 and then the United States is selected in the Country tab.

FIG. 126 is a screen showing a list of patent documents related to this inventor when selecting Aldous Stepha .. (No. 9) from the inventor list in FIG.

FIG. 129 is an exemplary diagram, for 3COM, of a yearly analysis result for multi-citation applicants, using the forward citation document set for the entire set of 3COM's application documents as the analysis target document set. It is shown when the user presses the Statistical Analysis tab in FIG. 124 and then uses the SA (Systematic Analysis) menu and the United States tab. This screen is structurally equivalent to FIG. 39, the corresponding embodiment for Samsung Electronics.
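By way of illustration only, many of the analysis screens listed above combine three operations: restricting the document set to one applicant, counting applications by year at a chosen IPC level (subclass or main group), and drilling down from a code to its lower codes. The following Python sketch shows one way such an aggregation could be written. The record fields (`applicant`, `year`, `ipc`), the function names, and the sample data are assumptions made for this example and do not come from the present specification. Drill down is approximated by a string prefix match, which does not model the dot-level subgroup hierarchy of the IPC.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample records; field names and values are illustrative only.
DOCS = [
    {"applicant": "ACME", "year": 2005, "ipc": "H01L 21/02"},
    {"applicant": "ACME", "year": 2005, "ipc": "H01L 21/20"},
    {"applicant": "ACME", "year": 2006, "ipc": "H01L 29/78"},
    {"applicant": "ACME", "year": 2006, "ipc": "H04B 1/40"},
    {"applicant": "OTHER", "year": 2006, "ipc": "H04B 7/26"},
]

# Simplified IPC pattern: subclass (e.g. H01L), main group, subgroup.
IPC_RE = re.compile(r"^([A-H]\d{2}[A-Z])\s*(\d+)/(\d+)$")


def ipc_at_level(code, level):
    """Truncate a full IPC code to the requested hierarchy level."""
    match = IPC_RE.match(code.strip())
    if match is None:
        raise ValueError(f"unrecognized IPC code: {code!r}")
    subclass, main, sub = match.groups()
    if level == "subclass":
        return subclass
    if level == "main_group":
        return f"{subclass} {main}/00"
    if level == "full":
        return f"{subclass} {main}/{sub}"
    raise ValueError(f"unknown level: {level!r}")


def yearly_counts(docs, applicant, level, under=None):
    """Count one applicant's documents by (IPC code at `level`, year).

    `under` is an optional code prefix used for drill down, for example
    "H01L" to show only the main groups below that subclass.
    """
    table = defaultdict(Counter)
    for doc in docs:
        if doc["applicant"] != applicant:
            continue
        if under is not None and not doc["ipc"].startswith(under):
            continue
        table[ipc_at_level(doc["ipc"], level)][doc["year"]] += 1
    return {code: dict(years) for code, years in table.items()}


# Subclass-level view for one applicant, then a drill down into H01L.
by_subclass = yearly_counts(DOCS, "ACME", "subclass")
h01l_groups = yearly_counts(DOCS, "ACME", "main_group", under="H01L")
```

In this sketch the subclass-level table plays the role of the top-level yearly view, and calling the same function with a finer `level` and an `under` prefix plays the role of the drill down. A production system would presumably read precomputed values from a calculation result table rather than scanning documents on each request.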

Claims

An automatic analysis support patent information system of the applicant name unit, comprising:

a patent document master DB;

a preprocessing module unit;

a patent information processing basic module; and

an analysis module,

wherein the analysis module provides automated analysis results in units of applicant names.