Summary of the invention
The purpose of the present invention is to address the shortcomings of existing manual information retrieval and updating by providing a method and system in which patent legal status retrieval and updating are performed automatically by computer, minimizing human involvement in the update process.
The invention provides a method for automatically updating patent legal status in an enterprise patent database, the method comprising the following steps:
Step S1: periodically trigger an automatic legal status update request, starting the update of the legal status of the local patents in the patent database;
Step S2: for each local patent to be updated in the enterprise patent database in turn, access the patent legal status information publishing website corresponding to that local patent, and perform step S3;
Step S3: automatically search, by computer, the patent legal status information publishing website corresponding to the local patent to obtain the corresponding search result page, and compare the legal status published on the search result page with the legal status of the local patent, so as to update the legal status in the patent database.
Step S3 further comprises the following:
Step R1: obtain the target search page of the corresponding patent from the patent legal status information publishing website, and compare the legal status feature words of the local patent with the feature words of the target search page; if there is an unmatched feature word, go to step R2; if the feature words match, go to step R3;
Step R2: execute a self-learning algorithm to identify the unmatched feature words, and accordingly update the legal status feature words of the local patent or add new feature words to them;
Step R3: according to the positions of the feature words, analyze the characteristics of the target search page and extract the Document Object Model (DOM) of the target search page;
Step R4: according to the DOM of the target search page, obtain the feature word content corresponding to each feature word, thereby obtaining the current legal status of the local patent;
Step R5: compare the legal status obtained in step R4 with the legal status of the corresponding patent stored in the enterprise patent database; if the two are the same, perform no operation; if they differ, update the legal status of the corresponding patent in the enterprise patent database.
The present invention also provides a system for automatically updating patent legal status in an enterprise patent database, the system comprising:
a timing trigger module, which starts the update operation of the whole system according to a preset update period T;
a search module, which obtains the legal status target search page of a specified patent in the local database by connecting to the patent legal status information publishing website;
a legal status feature word comparison module, which compares the legal status feature words of the target search page with the local feature words;
a data storage module, which performs the extraction and updating of legal status feature words;
a legal status self-learning module, which performs self-learning according to the comparison result of the legal status feature word comparison module and generates new legal status feature words;
a feature analysis module, which performs Document Object Model analysis of the target search page according to the result produced by the legal status feature word comparison module, thereby obtaining the nodes corresponding to the legal status feature words in the DOM tree structure of the target page;
a content extraction module, which extracts feature word content from the target search page according to the legal status feature words;
a legal status comparison and saving module, which compares the feature word content extracted by the content extraction module with the corresponding legal status already in the local database; if the two are the same, no update is needed; if they differ, the corresponding legal status of the patent in the local database is updated;
a central control module, which schedules and controls the operation of the above modules.
The beneficial effect of the present invention is that, by automatically updating the legal status database, changes in the legal status of the patents in the database are tracked in a timely manner; because no human participation is needed, the probability of error is reduced and labor costs are saved.
Embodiment
Typically, an enterprise's patent database system consists of one or several servers linked together by the enterprise's internal local area network (LAN). Fig. 1 is a schematic diagram of the system architecture of one embodiment of the patent database system according to the present invention. The system shown in Fig. 1 comprises an enterprise patent database server, an enterprise patent document server and an application server; users within the enterprise access the database through query terminals connected to the application server. The patent database server, the enterprise patent document server and the application server are logical concepts: each may be implemented on multiple servers or PCs, or all may be implemented on a single server or PC.
The patent database server stores information such as the bibliographic data of patent documents, including the application number, filing date, title, publication (announcement) number, publication (announcement) date, classification number, applicant (patentee), inventor (designer), legal status type and so on, and supports searching of this information. The enterprise patent document server stores the full text of patents (as image files). The application server provides the interface between the database and user terminals, so that users can query, search and update the database from their terminals (such as PCs).
It should be noted that the above logical division of database functions is given only as a preferred example of an environment in which the invention may be implemented. Its purpose is to make the solution of the present invention easier to understand, and it should not be taken as a limitation of the invention. The automatic legal status update method of the present invention is not restricted by the composition of the enterprise patent database or by the form of its data services.
Fig. 2 is a structural block diagram of the patent legal status update system of the present invention. As shown in the figure, the system comprises: a timing trigger module 220; a search module 230; a legal status feature word comparison module 240; a data storage module 250; a legal status self-learning module 260; a feature analysis module 270; a content extraction module 280; a legal status comparison and saving module 290; and a central control module 210, which schedules and controls the operation of the above modules.
The timing trigger module 220 starts the update operation of the whole system according to a preset update period T; that is, it starts the update of patent legal status in the database at intervals of T. So that update operations proceed in order, the update period T should clearly be longer than the time needed to perform one legal status update for all the patents to be updated. Considering the random delays of Internet communication links, T is usually chosen by adding a margin of a certain percentage to the mean update time measured over repeated trials. Alternatively, the update can be set to run at a fixed time, such as once a day, once a week or once a month.
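The choice of T described above can be sketched as a small calculation. This is only an illustration: the function name `choose_update_period` and the default 20% margin are assumptions introduced here, not values given in this description.

```python
from statistics import mean


def choose_update_period(trial_durations_s, margin=0.2):
    """Return an update period T (seconds) from measured full-update durations.

    trial_durations_s: durations of several complete update runs, in seconds.
    margin: fractional allowance added on top of the mean (assumed value).
    """
    if not trial_durations_s:
        raise ValueError("at least one measured duration is required")
    if margin < 0:
        raise ValueError("margin must be non-negative")
    # Mean of the repeated trials, enlarged by the percentage margin.
    return mean(trial_durations_s) * (1.0 + margin)


# Three hypothetical trial runs of a full traversal, in seconds.
period_T = choose_update_period([100, 110, 90], margin=0.5)
```

A period computed this way only reflects average behaviour; a run that takes unusually long could still exceed T, which is one reason a fixed daily or weekly schedule may be preferred.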
The search module 230 is one of the core modules of the present invention. It connects to authoritative intellectual property websites at home and abroad, normally the official websites of the intellectual property offices of individual countries and/or regions and of intergovernmental organizations (such as the State Intellectual Property Office of China, the United States Patent and Trademark Office, the European Patent Office, the Japan Patent Office and the World Intellectual Property Organization; for ease of description, these are hereinafter collectively referred to as patent legal status information publishing websites). It accesses the legal status search page of each patent legal status information publishing website (hereinafter the target web page) and, using an identifier that uniquely determines a patent or patent application, such as the patent number or patent application number, searches the website for the legal status of the corresponding patent in the enterprise patent database, obtaining the target search page for that specific patent (application) number. The concrete implementation is described in detail below in connection with the method flow of the present invention.
The legal status feature word comparison module 240 compares the legal status feature words of the target search page obtained by the search module 230 with the local feature words, and, depending on the comparison result, calls either the legal status self-learning module 260 or the feature analysis module 270.
The data storage module 250 operates on the patent legal status database and can extract and update the legal status feature words of database entries.
The legal status self-learning module 260 is an optional module and is not necessarily called in every update. It is called only when the legal status feature word comparison module 240, after comparison, finds that the legal status feature words of the target search page and the local feature words do not match completely, for example when a feature word outside the local feature words appears on the target web page. In that case the legal status self-learning module 260 performs self-learning according to the comparison result of the legal status feature word comparison module 240 and produces new legal status feature words.
The feature analysis module 270 performs DOM (Document Object Model) analysis of the page according to the result produced by the comparison module, thereby obtaining the nodes corresponding to the legal status feature words in the DOM tree structure of the target page.
The content extraction module 280 extracts the legal status feature word content from the web page according to the legal status feature words.
The legal status comparison and saving module 290 compares the legal status feature word content extracted by the content extraction module 280, that is, the legal status parameter corresponding to the feature word, with the corresponding legal status parameter already in the database. If the two are the same, no update is needed; if they differ, the legal status parameter of the patent in the database is updated.
The flow of automatically updating the legal status of an enterprise patent database with the method of the present invention is described in detail below with reference to Fig. 3. The method comprises the following:
Step S1: the timing trigger module 220 triggers an automatic legal status update request; while the system of the present invention runs continuously, step S1 repeats with period T to ensure that the legal status is updated in a timely manner.
Step S2: the search module 230 traverses the local patents in the enterprise patent database, accesses the patent legal status information publishing website corresponding to each local patent, and looks up the legal status search page for the specified patent; for example, for a Chinese patent it can access
http://search.sipo.gov.cn/sipo/zljs/Searchflzt.jsp.
Step S3: under the control of the central control module, perform the following steps in turn for each local patent:
Step R1: obtain the corresponding search result (hereinafter the target search page) from the patent legal status information publishing website, extract the feature words in the local legal status database (such as patent (application) number, grant publication number, legal status announcement date, legal status type, publication, grant and so on), and use the legal status feature word comparison module 240 to compare them against the content of the target search page; if there is an unmatched feature word, go to step R2; if the feature words match, go to step R3;
Step R2: use the legal status self-learning module 260 to identify new feature words by a self-learning algorithm, and update the feature words in the legal status feature library or add the new feature words to it;
Step R3: call the feature analysis module 270 to analyze the characteristics of the target search page and derive its DOM structure. Whatever language a page is written in, once the corresponding page file (such as an HTML document) has been generated it has a definite DOM structure; and for common patent legal status information publishing websites the page framework is relatively stable, which makes feature word analysis based on string matching both possible and feasible.
Step R4: according to the DOM structure of the target search page, locate the feature word content corresponding to each feature word, call the content extraction module 280 to extract that content, and obtain the current legal status of the patent;
Step R5: use the legal status comparison and saving module 290 to compare the legal status of the corresponding patent in the legal status database with the legal status obtained in step R4; if the two are the same, no operation is needed, or only the current time is recorded as the last update time; if they differ, update the legal status of the corresponding patent in the legal status database.
Step R2 is an optional step that depends on the page structure of the patent legal status information publishing website. It is also possible to first analyze the characteristics of the target search page and derive its DOM structure, and then carry out the self-learning of new feature words in step R2 on that basis, to improve efficiency. For example, the legal status search page of the State Intellectual Property Office of China adds one table to the page for every change in a patent's legal status, which means that one child node is added to the DOM of the corresponding target search page. Legal status feature words may be added at the same time. For example, for a patent that has just been published, the target search page contains only one table: running the method on March 10, 2008 gives the result shown in Table One, where the only legal status feature words are "application (patent) number", "grant publication number", "legal status announcement date", "legal status type" and "publication". When the method is first run after the legal status is updated on April 30, 2008, the target search page of the patent contains Table Two in addition to Table One, and "entry into force of substantive examination" has been added to the legal status feature words. At that point self-learning is needed to add the feature word. For changes such as changes to bibliographic items, even more feature words may need to be learned.
Table one
Table two
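The detection of new feature words in the example above can be sketched as follows. The description does not specify the self-learning algorithm, so this sketch substitutes a plain membership test as a stand-in; the function name `learn_feature_words` is hypothetical, and the English feature words are translations used only for illustration.

```python
def learn_feature_words(local_words, page_words):
    """Return (updated_local_words, newly_learned_words).

    A word found on the target search page but absent from the local
    feature words is treated as new and appended. This is an assumed,
    simplified stand-in for the self-learning step R2.
    """
    known = set(local_words)
    learned = []
    for word in page_words:
        if word not in known:
            known.add(word)       # also guards against repeated table headers
            learned.append(word)
    return list(local_words) + learned, learned


local = [
    "application (patent) number",
    "grant publication number",
    "legal status announcement date",
    "legal status type",
    "publication",
]
# Words seen on the page after the second table has appeared.
page = local + ["entry into force of substantive examination"]

updated, new_words = learn_feature_words(local, page)
```

When `new_words` is empty the feature words match and the flow can go straight to step R3; otherwise step R2 stores the additions first.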
The legal status update process for one Chinese patent is taken as an example below to explain the operation of the automatic legal status update method and system of the present invention.
For example, the automatic update system is installed on the application server, and the legal status database is stored on the enterprise patent database server. After the system starts, the timing trigger module 220 carries out, at the set period, a sequential traversal update of the legal status of all patents in the legal status database. Steps R1 to R5 are implemented as follows:
First, a patent entry is obtained from the legal status database. The entry contains items such as the patent (application) number, filing date and publication date. Items such as "patent (application) number", "grant publication number", "legal status announcement date", "publication" and "entry into force of substantive examination" are called "feature words", and the values corresponding to these items, for example "200710071070.6" or "2008.03.05", are called "feature word content". A patent entry can, for example, be abstracted as the following data structure:
struct patent_legal_state
{
    char *item_application_Number;   /* feature word */
    char *value_application_Number;  /* feature word content */
    char *item_publication_Number;
    char *value_publication_Number;
    char *item_publication;
    char *value_publication;
    /* ...... */
};
Here item_**** denotes a feature word and value_**** denotes the corresponding feature word content. The number of feature words is variable, to accommodate a legal status that is continually updated.
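Because the number of feature words varies, a mapping from feature word to feature word content is one convenient way to hold an entry. The following is a minimal sketch under that assumption; the class name `PatentLegalState` and its methods are hypothetical and are not part of the structure above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PatentLegalState:
    """One patent entry: an identifying number plus a variable set of features."""
    application_number: str
    # Maps each feature word (item_*) to its feature word content (value_*).
    features: Dict[str, str] = field(default_factory=dict)

    def set_feature(self, word: str, content: str) -> None:
        """Add a new feature word or overwrite the content of an existing one."""
        self.features[word] = content

    def feature_words(self) -> List[str]:
        """Return the feature words currently known for this entry."""
        return list(self.features)


entry = PatentLegalState("200710071070.6")
entry.set_feature("legal status announcement date", "2008.03.05")
entry.set_feature("legal status type", "publication")
```

A later legal status change then only adds or overwrites keys, without changing the shape of the record.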
Step R1: obtain the corresponding search result (hereinafter the target search page). First, an item that uniquely determines the patent, such as the patent application number, is extracted from the patent entry, and the program then automatically generates the target page address corresponding to that application number. For application number 200510043446.3, for example, the target page address is http://search.sipo.gov.cn/sipo/zljs/FlztResult.jsp?searchword=%C9%EA%C7%EB%BA%C5%3D200510043446%2E3. In the character string "%C9%EA%C7%EB%BA%C5%3D200510043446%2E3", "%C9%EA%C7%EB%BA%C5%3D" is URL-encoded text meaning "application number="; "200510043446" is the application number of the specified patent; "%2E" stands for "."; and the final "3" is the check digit of the application number. The URL-encoded prefix depends on the pages of the target patent legal status information publishing website and must be set in the program before automatic updating is run; that is, it is obtained by manually analyzing the page, an analysis that anyone skilled in the art with general familiarity with web page programming languages can carry out. Since the page structure of patent legal status information publishing websites is generally quite stable, a manual check at fairly long intervals is enough to keep the invention working. Alternatively, an error monitoring module can be set up to raise an alarm when the result returned after requesting the target page is abnormal or when no result is returned.
Requesting the above target page address over the Internet yields the target search page corresponding to this application number.
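The generation of the target page address can be sketched with the standard library. This assumes the percent-encoded prefix is the GBK encoding of the Chinese text for "application number=", which is an inference from the byte values in the example rather than something stated here; the function name `build_target_page_address` is hypothetical.

```python
from urllib.parse import quote, unquote

# Result page address taken from the example above.
RESULT_PAGE = "http://search.sipo.gov.cn/sipo/zljs/FlztResult.jsp"


def build_target_page_address(application_number, base=RESULT_PAGE):
    """Build the search address for one application number.

    The query text is the Chinese for "application number=" followed by the
    number; it is percent-encoded from its GBK bytes (assumed encoding).
    """
    query = "申请号=" + application_number
    return base + "?searchword=" + quote(query, safe="", encoding="gbk")


address = build_target_page_address("200510043446.3")
# Decoding the query part with the same encoding restores the original text.
decoded = unquote(address.split("=", 1)[1], encoding="gbk")
```

Note that `quote` leaves "." unescaped, whereas the example address writes it as "%2E"; the two forms decode to the same character.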
The feature words of the current patent entry (such as patent (application) number, grant publication number, legal status announcement date, legal status type, publication, grant and so on) are compared with the feature words of the target search page by the legal status feature word comparison module 240; if there is an unmatched feature word, go to step R2; if the feature words match, go to step R3. The feature words can be matched in two ways. One is direct string matching. The other is to first perform page analysis on the target search page to derive its DOM structure and then use the DOM structure for a simpler and more accurate comparison; for example, in the DOM structure shown in Fig. 4, the child nodes under the "element <table>" node can be compared directly to obtain the information in the tables related to legal status (such as Table One and Table Two);
Step R2: use the legal status self-learning module 260 to identify new feature words by a self-learning algorithm, and update the feature words in the legal status feature library or add the new feature words to it;
Step R3: call the feature analysis module 270 to analyze the characteristics of the target search page and derive its DOM structure; for example, part of the DOM structure of the page for application number 200510043446.3 is shown in Fig. 4. The DOM path of the application number is obtained as html[0].body[0].div[1].div[0].div[2].p[0].table[2].tr[0].td[1].font[0].text. If the DOM structure was already analyzed in step R1, that analysis result can be used directly here.
Step R4: according to the DOM structure of the target search page, locate the feature word content corresponding to each feature word, call the content extraction module 280 to extract that content, and obtain the current legal status of the patent; for example, in Fig. 4 the first legal status can be extracted by the DOM path html[0].body[0].div[1].div[0].div[2].p[0].table[2].tr[1].td[3].font[0].text;
Step R5: use the legal status comparison and saving module 290 to compare the legal status of the corresponding patent in the legal status database with the legal status obtained in step R4; if the two are the same, no operation is needed, or only the current time is recorded as the last update time; if they differ, update the legal status of the corresponding patent in the legal status database and record the corresponding update time.
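Steps R4 and R5 can be sketched as follows. This is an illustration under stated assumptions: the page is taken to be well-formed markup so that the standard XML parser can read it (real pages would generally need an HTML parser), the sample page and its values are made up for the example, and the names `extract_by_dom_path` and `compare_and_save` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

_STEP = re.compile(r"^(\w+)\[(\d+)\]$")


def extract_by_dom_path(page_source, dom_path):
    """Step R4 sketch: follow a path such as html[0].body[0].table[0].tr[1].td[2].text."""
    *steps, leaf = dom_path.split(".")
    if leaf != "text":
        raise ValueError("path must end with .text")
    # Wrap the root so that the first step (html[0]) resolves like any other.
    node = ET.Element("document")
    node.append(ET.fromstring(page_source))
    for step in steps:
        match = _STEP.match(step)
        if match is None:
            raise ValueError(f"malformed step: {step!r}")
        tag, index = match.group(1), int(match.group(2))
        # tag[i] means the i-th child carrying that tag name.
        same_tag_children = [child for child in node if child.tag == tag]
        node = same_tag_children[index]
    return (node.text or "").strip()


def compare_and_save(database, application_number, fetched_status, now):
    """Step R5 sketch: update on a difference; always record the check time."""
    record = database[application_number]
    changed = record["legal_status"] != fetched_status
    if changed:
        record["legal_status"] = fetched_status
        record["updated_at"] = now
    record["checked_at"] = now
    return changed


# Made-up, well-formed sample page (not the real site layout).
SAMPLE_PAGE = """<html><body><table>
<tr><td>application number</td><td>announcement date</td><td>legal status type</td></tr>
<tr><td>200510043446.3</td><td>2008.03.05</td><td>publication</td></tr>
</table></body></html>"""

status = extract_by_dom_path(SAMPLE_PAGE, "html[0].body[0].table[0].tr[1].td[2].text")
db = {"200510043446.3": {"legal_status": "", "updated_at": None, "checked_at": None}}
first = compare_and_save(db, "200510043446.3", status, "2008-03-10")
second = compare_and_save(db, "200510043446.3", status, "2008-03-17")
```

In this sketch a hard-coded path breaks when the page layout changes, which matches the need noted in step R1 for occasional manual checks or an error monitoring module.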