WO2010128967A1 - Method, system, and apparatus for searching an electronic document collection - Google Patents

Method, system, and apparatus for searching an electronic document collection

Publication number
WO2010128967A1
Authority
WO
WIPO (PCT)
Prior art keywords
document
collection
profile
query
sections
Prior art date
Application number
PCT/US2009/043174
Other languages
French (fr)
Inventor
Jason David Resnick
Randy W. Lacasse
Original Assignee
Cpa Software Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cpa Software Limited filed Critical Cpa Software Limited
Priority to CN2009801599633A priority Critical patent/CN102483744A/en
Priority to NZ596369A priority patent/NZ596369A/en
Priority to PCT/US2009/043174 priority patent/WO2010128967A1/en
Priority to CA2761713A priority patent/CA2761713A1/en
Priority to AU2009345822A priority patent/AU2009345822A1/en
Priority to KR1020117028646A priority patent/KR101560756B1/en
Priority to EP09789651.8A priority patent/EP2427830B1/en
Publication of WO2010128967A1 publication Critical patent/WO2010128967A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3346 Query execution using probabilistic model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/11 Patent retrieval

Definitions

  • This invention relates to an electronic document collection, and searching the collection in response to receipt of a query. More specifically, the invention relates to creating search profiles by placing an emphasis on each section of an intellectual property document to be searched, and processing the query responsive to selection of at least one of the search profiles.
  • Different classes of searches may be commissioned to achieve different results. For example, a novelty search may be commissioned to ascertain whether or not to submit a filing for an intellectual property asset. A product clearance search may be commissioned to ascertain whether a product is covered under the claims of a current intellectual property asset. An invalidity search may be commissioned to determine if the issued claims of the intellectual property asset are valid, etc.
  • Prior electronic intellectual property document search tools do not support the different classes of searches.
  • the burden is on the person doing the search, also known as the searcher, to limit the sections of the intellectual property document to be reviewed in the search based upon the scope of the search.
  • the burden on the searcher increases since more associated documents need to be reviewed for each search.
  • This invention comprises a method, system, and article for efficiently and effectively searching a collection of patent documents.
  • a computer method for searching an electronic document collection.
  • a collection of patent documents is compiled and indexed, with each of the patent documents in the collection being comprised of multiple sections. Each section of each patent in the collection is identified.
  • a search profile is organized for the document collection. The search profile includes a selection of each identified section of each document in the collection. For each profile, a weight is assigned to each of the selected sections.
  • a search profile is selected and query data is compared with data in each of the sections of the document collection as identified and assigned a weight in the selected profile. A match of the query data to each profile section with an assigned weight yields a compilation of documents to be returned as part of the search results.
  • a computer system in communication with storage media, and an electronic document collection maintained on the storage media.
  • the electronic document collection is a compilation of intellectual property documents. Based upon characteristics of intellectual property documents, each of the documents in the collection has multiple sections.
  • a director is employed to index and compile the collection of documents.
  • the director is in communication with a document manager, which identifies each section of the documents in the collection.
  • a profile manager is provided to organize a search profile for the document collection.
  • the profile manager is in communication with the document manager and employs the search profile to include a selection of each of the identified sections of each document in the compiled collection. In addition to selecting specific sections for inclusion in the profile, the profile manager assigns a weight to each of the selected sections in each profile.
  • the weight is a reflection of the emphasis on the associated section.
  • a query manager submits a query to the document collection.
  • the query includes a selection of at least one search profile and compares query data with data in each of the sections of the document as reflected in the profile.
  • a compilation of relevant patent documents is returned, with the compilation including a match of the query to data in at least each identified profile section having an assigned weight.
  • an article is provided with a computer-readable carrier including computer program instructions configured to search an electronic document collection on computer memory.
  • the computer-readable carrier includes computer program instructions to perform a query over the document collection. Instructions are provided to compile and index a collection of intellectual property documents. Each of the patent documents in the collection is divided into multiple sections. Following indexing of the collection, instructions are provided to identify each of the sections of each document in the collection. Once the sections of the documents are identified, instructions are provided to organize a search profile for the document collection. The search profile is a selection of each identified section of each document in the collection. Additionally, instructions are provided to assign a weight to each of the sections identified in the search profile.
  • results of the query submission include a compilation of relevant documents returned based upon a match of the query data in at least each identified profile section with one or more documents in the underlying collection.
  • FIG. 1 is a flow chart illustrating a process for identifying sections of a patent document for creation of one or more profiles.
  • FIG. 2 is a flow chart illustrating a process for creating a secondary weight for one or more profiles.
  • FIG. 3 is a flow chart illustrating a process for employing the secondary weight to reflect the location within each profile section in which the string match occurs.
  • FIG. 4 is a block diagram of a user interface to support creation of a profile.
  • FIG. 5 is a flow chart illustrating a process for submitting a query to a compiled and indexed document collection, according to the preferred embodiment of this invention, and is suggested for printing on the first page of the issued patent.
  • FIG. 6 is a block diagram illustrating a set of tools employed to create a search profile and to assign one or more weights to different sections of the underlying document collection as identified in the profile.
  • An identified manager and/or director of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions which may, for instance, be organized as an object, procedure, function, or other construct. Nevertheless, the executables of an identified manager and/or director need not be physically located together, but may comprise disparate instructions stored in different locations which, when joined logically together, comprise the manager and/or director and achieve the stated purpose of the manager and/or director.
  • a manager and/or director of executable code could be a single instruction, or many instructions, and may even be distributed over several different code segments, among different applications, and across several memory devices.
  • operational data may be identified and illustrated herein within the manager and/or director, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices, and may exist, at least partially, as electronic signals on a system or network.
  • An intellectual property document collection is a compilation of issued and published applications.
  • a patent document collection is a subset of the intellectual property document collection.
  • Patent documents come in the form of issued patent grants and published patent applications. The difference between the two categories of documents identifies their enforceable value. More specifically, a patent grant is an actual property right that can be enforced in a court of law, whereas a published patent application is a pending application that is a pending patent right.
  • Each patent document is parsed into multiple sections, with each section containing written words and phrases, also known as string data. To accommodate searching of the collection, each document in the collection is parsed based upon sections within each document, and a weight is assigned to each of the parsed sections of the intellectual property documents.
  • the weight is a numerical measure of emphasis to be placed on one or more specific sections of the document for the query.
  • a selection of document sections together with weights assigned to the selected sections creates a search profile.
  • the search may be limited to specific sections of the documents, or different emphasis may be placed on matching query data in each section of the document. Accordingly, the creation and selection of a search profile is directly related to the search results.
  • Prior to submission of a query string to the document collection, at least one search profile is selected based upon the intended scope of the search. A compilation of matching documents is returned based upon matching data between the query string data and the document string data in each section of the patent document having an assigned weight as indicated in the selected profile. Accordingly, weights are assigned to one or more sections of a patent document in a patent document collection to efficiently and effectively create a result set with data pertinent to the query submitted to the collection, wherein the result set includes one or more documents in the patent document collection with a string matching the submitted query string in a section assigned a weight value in excess of zero.
  • Fig. 1 is a flow chart (100) illustrating a process for identifying sections of a patent document for creation of one or more profiles. Under current rules of practice, each patent document submitted to the U.S. Patent and Trademark Office will contain the following sections: title; background, including the technical field and a description of the prior art; summary of the invention; brief description of the drawing figures; drawing figures; detailed description of the preferred embodiment(s); claims; and abstract.
  • not all patent documents will contain drawing figures, such as in chemical practice and some international patent documents.
  • there may be a different quantity of sections in a patent document or the sections may be presented in a different order. Accordingly, prior to placing an emphasis on one or more sections of a patent document in the collection with a query, the origin of the documents, the different sections of the documents, and the order in which the sections are organized in the collection need to be identified.
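The relationship between a profile, its section weights, and the result set can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `SearchProfile`, `active_sections`, and `matching_documents` are hypothetical, the collection is modeled as a dictionary of section texts, and a case-insensitive substring test stands in for whatever string matching an actual implementation would use.

```python
from dataclasses import dataclass, field


@dataclass
class SearchProfile:
    """Hypothetical container: a selection of sections plus a weight per section."""
    name: str
    weights: dict[str, float] = field(default_factory=dict)

    def active_sections(self) -> list[str]:
        # Only sections weighted above zero take part in the search.
        return [s for s, w in self.weights.items() if w > 0]


def matching_documents(collection, profile, query):
    """Return ids of documents whose weighted sections contain the query string."""
    q = query.lower()
    hits = []
    for doc_id, sections in collection.items():
        if any(q in sections.get(s, "").lower() for s in profile.active_sections()):
            hits.append(doc_id)
    return hits


# Toy collection: each document is a mapping of section name to section text.
collection = {
    "doc-1": {"title": "Widget assembly", "claims": "A widget comprising a hinge."},
    "doc-2": {"title": "Hinge lubricant", "claims": "A lubricant composition."},
}
claims_only = SearchProfile("clearance", {"title": 0.0, "claims": 1.0})
all_sections = SearchProfile("novelty", {"title": 1.0, "claims": 1.0})

print(matching_documents(collection, claims_only, "hinge"))
print(matching_documents(collection, all_sections, "hinge"))
```

With the claims-only profile, the title of `doc-2` is ignored because its weight is zero, so only `doc-1` is returned; the second profile returns both documents.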
  • a collection of patent documents is compiled and indexed (102). It is recognized in the art that patents and patent publications are comprised of multiple sections. Following the compilation of the documents, each section in each patent in the collection of documents is identified (104). The variable N Total is assigned to the number of sections in the patent document (106). Different profiles are created to address different searching needs. A profile is created by placing an emphasis on different combinations of sections of the patent documents, and/or by omitting one or more sections of the document from consideration during the search itself by assigning a value of zero to that section. To support profile based searching, at least one profile is created. However, in one embodiment, there are multiple profiles created to support selection of a profile to meet the needs of the search.
  • a counting variable X associated with the profile designation is initialized and assigned to the integer one (108) and the counting variable N pertaining to the sections of the patent document is assigned to the integer one (110).
  • for section N of the patent document collection, it is determined if section N will be employed as part of the profile being created, profile X (112).
  • a positive response to the determination at step (112) joins section N to profile X (114).
  • a primary weight is assigned to section N (116).
  • the primary weight is a numerical value that signifies the importance of section N to profile X with respect to other sections of the patent document collection, including any previously selected sections and other sections to be joined or omitted from the profile.
  • following step (116) or a negative response to the determination at step (112), the variable N associated with the sections of the patent documents is incremented (118). It is then determined if all of the identified sections of the patent documents in the compiled and indexed collection have been evaluated for joining or omitting from profile X (120). A positive response to the determination at step (120) concludes the profile creation process for profile X (122). Conversely, a negative response to the determination at step (120) is followed by a return to step (112) for consideration of additional sections in the collection for profile X.
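The profile creation loop of Fig. 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: `create_profile`, `choose_weight`, and the `SECTIONS` list are hypothetical names, and the decision at step (112) is modeled as a callback that returns a primary weight, or `None` when the section is omitted.

```python
# Section names taken from the list of sections described above (illustrative only).
SECTIONS = [
    "title", "background", "summary", "brief description of drawings",
    "drawings", "detailed description", "claims", "abstract",
]


def create_profile(sections, choose_weight):
    """Walk every identified section once and build one profile.

    choose_weight(section) is a hypothetical callback: it returns the primary
    weight for a section joined to the profile, or None if the section is omitted.
    """
    profile = {}
    n = 0                                   # counting variable N (zero-based here)
    while n < len(sections):                # step 120: all sections evaluated?
        weight = choose_weight(sections[n])  # step 112: join this section?
        if weight is not None:
            profile[sections[n]] = weight    # steps 114 and 116: join and weight
        n += 1                               # step 118: next section
    return profile


# A clearance-style profile that emphasizes the claims.
clearance = create_profile(SECTIONS, {"claims": 3.0, "abstract": 1.0}.get)
print(clearance)
```

Repeating the call with a different callback yields additional profiles, which corresponds to incrementing the profile counter X.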
  • Fig. 2 is a flow chart (200) demonstrating an added dimension of emphasis that may be added to each created profile. More specifically, an added weight in the form of a secondary weight may be employed to either add or subtract from the weight score based upon a quantity of matching strings in select sections of each profile.
  • the variable X Total is assigned to represent the quantity of profiles created (202), as demonstrated in Fig. 1, and a counting variable X is assigned to the integer one (204).
  • variable Y Total is assigned to represent the quantity of sections in profile X with a weight assignment (206), as demonstrated in Fig. 1.
  • a counting variable Y is assigned to the integer one (208).
  • It is then determined if a secondary weight will be added to section Y of profile X (210).
  • a negative response to the determination at step (210) is followed by a jump to step (230) to evaluate the next section in the profile, if any.
  • a positive response to the determination at step (210) is followed by a second query to determine if the secondary weight assignment will be a tiered structure (212).
  • each profile may include a hierarchy of weight values depending upon a quantity of data string matches returned during the search process with the selected profile.
  • a negative response to the determination at step (212) is followed by setting the minimum threshold of data string matches that must be returned in order to employ a secondary weight assignment to section Y (214).
  • the secondary weight value is set for profile X, section Y (216).
  • the input at steps (214) and (216) is to set the parameters satisfying the secondary weight structure as established at step (212). Accordingly, for each profile section, a secondary weight value may be set to provide emphasis on the search results when a threshold value of matches has been exceeded.
  • each select section of a profile may be configured to accommodate a hierarchy of secondary weight threshold values.
  • the variable Z Total is assigned to the quantity of hierarchical thresholds to be assigned to profile X, section Y (218), and a tier counting variable Z is set to the integer one (220).
  • the minimum threshold of data string matches that must be returned in order to employ a secondary weight assignment to profile X, section Y, tier Z is set (222), and the secondary weight value is set for profile X, section Y, tier Z (224).
  • the tier counting variable Z is incremented (226), followed by a determination as to whether all the weight values have been set for all of the tiers for profile X, section Y (228).
  • a negative response to the determination at step (228) is followed by a return to step (222).
  • a positive response to the determination at step (228) or following step (216) is followed by an increment of the counting variable Y to proceed to evaluation of the next section of the select profile (230).
  • It is then determined if all of the sections of the select profile have been evaluated for assignment of a hierarchy of secondary weight threshold values (232). A negative response to the determination at step (232) is followed by a return to step (210), and a positive response to the determination at step (232) is followed by an increment of the profile counting variable X (234). Following step (234), it is determined if all of the created profiles have been evaluated for assignment of a secondary weight (236). A negative response to the determination at step (236) is followed by a return to step (206), and a positive response to the determination at step (236) concludes the assignment of a hierarchy of secondary weight threshold values to select sections of created profiles (238).
  • each profile may be configured with a hierarchy of secondary weights to place emphasis on both the select sections of each profile as well as the quantity of matching strings within a profile.
  • a hierarchy of secondary weights, i.e., tiers, may be applied to each individual section of a profile, with the secondary weights based upon one or more threshold values for the quantity of matches between the query string and the data in the document collection being parsed.
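A tiered secondary weight of the kind configured in Fig. 2 could be evaluated along the following lines. The sketch makes several assumptions that the text leaves open: a tier applies when the match count is at least its minimum threshold, the highest satisfied tier wins, and the secondary weight is added to the product of match count and primary weight. The names `secondary_weight` and `section_score` are hypothetical.

```python
def secondary_weight(match_count, tiers):
    """Return the secondary weight of the highest tier whose threshold is met.

    tiers is a list of (minimum_matches, weight) pairs; a single pair models the
    non-tiered case. A weight may be negative to subtract from the score.
    """
    bonus = 0.0
    for threshold, weight in sorted(tiers):
        if match_count >= threshold:   # assumption: threshold is inclusive
            bonus = weight
    return bonus


def section_score(match_count, primary_weight, tiers):
    # Assumption: the secondary weight is added to the primary weighted score.
    return match_count * primary_weight + secondary_weight(match_count, tiers)


claims_tiers = [(3, 0.5), (10, 2.0)]       # two tiers for one profile section
print(secondary_weight(2, claims_tiers))    # below every threshold
print(secondary_weight(3, claims_tiers))    # first tier reached
print(section_score(12, 1.0, claims_tiers))  # second tier reached
```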
  • the secondary weight may reflect the location within one or more profile sections in which the string match occurs, as demonstrated in Fig. 3. This secondary weight may be separate from or supplemental to the secondary weight demonstrated in Fig. 2.
  • the variable X Total is assigned to represent the quantity of profiles created (302), as demonstrated in Fig.
  • a counting variable X is assigned to the integer one (304).
  • the variable Y Total is assigned to represent the quantity of sections in profile X with a weight assignment (306).
  • a counting variable Y is assigned to the integer one (308). It is then determined if a secondary weight will be added to profile X, section Y (310).
  • a positive response to the determination at step (310) is followed by dividing profile X, section Y into multiple subsections (312).
  • profile X, section Y may be divided into multiple sections, with each section length pertaining to a percentage of the profile X, section Y as a whole. Regardless of the method employed for determining the quantity of subsections, each profile X, section Y may be divided into two or more subsections with a secondary weight assigned to reflect a matching string not only in profile X, section Y but also the location of the match in the select subsection.
  • the variable Z Total is assigned to the quantity of subsections created for profile X, section Y (314), and a counting variable Z is assigned to the integer one (316).
  • a secondary weight is assigned to profile X, section Y, subsection Z (318).
  • the counting variable Z is incremented (320), followed by a determination as to whether there are any more subsections in profile X, section Y that have not been evaluated for a secondary weight assignment (322).
  • a negative response to the determination at step (322) is followed by a return to step (318).
  • a positive response to the determination at step (322) or a negative response to the determination at step (310) is followed by an increment of the counting variable Y (324).
  • a profile section may be subdivided into multiple subsections based upon their physical location, with a secondary weight assigned to one or more of the identified subsections.
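The location-based secondary weight of Fig. 3 can be illustrated with a short sketch. It assumes equal-length subsections, looks only at the first occurrence of the query string, and uses the hypothetical names `subsection_index` and `positional_weight`; the text itself allows other ways of dividing a section.

```python
def subsection_index(position, section_length, n_subsections):
    """Map a character offset to the index of the equal-length subsection containing it."""
    return min(position * n_subsections // section_length, n_subsections - 1)


def positional_weight(text, query, subsection_weights):
    """Return the secondary weight of the subsection holding the first match.

    subsection_weights holds one weight per subsection, in document order.
    Returns 0.0 when there is no match.
    """
    if not text or not query:
        return 0.0
    pos = text.lower().find(query.lower())
    if pos < 0:
        return 0.0
    idx = subsection_index(pos, len(text), len(subsection_weights))
    return subsection_weights[idx]


weights = [2.0, 1.0, 0.5, 0.25]            # four quarters, earliest weighted highest
early = "hinge" + "x" * 95                 # match in the first quarter
late = "x" * 90 + "hinge" + "x" * 5        # match in the last quarter
print(positional_weight(early, "hinge", weights))
print(positional_weight(late, "hinge", weights))
```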
  • Fig. 4 is a block diagram (400) of a user interface to support creation of a profile for a search to be submitted to an intellectual property document collection.
  • the user interface functions as a veneer to the underlying code for application of weights to sections of the underlying documents.
  • there are multiple blocks presented within the interface with each box associated with a section identified in the document collection. More specifically, in the example shown herein there are five boxes (410), (420), (430), (440), and (450), with each box having indicia identifying the respective sections of the underlying documents in the collection.
  • the first box (410) is associated with a first section present in the documents (412)
  • the second box (420) is associated with a second section present in the patent documents (422)
  • the third box (430) is associated with a third section present in the patent documents (432)
  • the fourth box (440) is associated with a fourth section present in the patent documents (442)
  • the fifth box (450) is associated with a fifth section present in the patent documents.
  • although the interface represented herein only shows the underlying documents in the collection divided into five sections, the invention should not be limited to this quantity. In one embodiment, the document collection may be parsed into a larger quantity or a smaller quantity of sections, with each section represented in the interface (400).
  • the first box (410) is provided with a slide (414)
  • the second box (420) is provided with a slide (424)
  • the third box (430) is provided with a slide (434)
  • the fourth box (440) is provided with a slide (444)
  • the fifth box (450) is provided with a slide (454).
  • each box (410) - (450) is scaled with a position of the slide indicating the weight to be applied to a match of the query with data present in the specified section of the documents.
  • although the numerical indicia of the weights are not shown herein, in one embodiment, the numerical indicia may be provided on a vertical axis of each box (410) - (450). As the individual slide of each box is raised, the weight of the associated section is increased. Similarly, as the individual slide of each box is lowered, the weight of the associated section is decreased. Accordingly, the interface provides a graphical tool to support the allocation of weights to different sections of the intellectual property documents for creation of a profile.
  • Fig. 5 is a flow chart (500) illustrating a process for submitting a query to a compiled and indexed document collection. Initially, one or more document collections are selected to receive the query (502), together with a profile for the query submission (504). It is understood that a properly selected profile will reflect the intended scope of the document query submission. In other words, a search limited to the claims should reflect a profile that substantially limits the document collection to the claims section. Accordingly, a search that does not intend to review the document sections in their entirety should include selection of an appropriately categorized profile.
  • the searcher provides the query and submits it to the document collection (506).
  • the variable X Total is assigned to the quantity of documents that are determined to match with data submitted in the query (508), and an associated counting variable, X, is assigned to the integer one (510).
  • the variable N Total is assigned to the quantity of sections identified in the selected profile with at least one occurrence of the query input (512), and an associated counting variable, N is assigned to the integer one (514).
  • a score is calculated for document X, section N (516) based upon the following mathematical formula:
  • $\text{score}(X, N) = (\text{number of matches in section } N) \times (\text{weight assigned to section } N)$
  • variable N is incremented (518) followed by a determination as to whether all of the sections in the profile have been evaluated for document X (520).
  • a negative response to the determination at step (520) is followed by a return to step (516).
  • a positive response to the determination at step (520) is followed by aggregating a score for document X as the summation of the weighted score value of document X, section N for each of the sections in the document (522). This aggregation is compiled for each patent document in the collection that has a match with the query input.
  • the variable X is incremented (524), followed by a determination of whether a weight has been calculated for all of the documents with a match (526).
  • a negative response to the determination at step (526) is followed by a return to step (514). Conversely, a positive response to the determination at step (526) is an indication that the weight has been calculated and assigned to each document in the collection based upon the selected profile (528). Accordingly, a compilation of documents is returned with an attached weight that reflects the relevancy of the document based upon the profile employed in the search.
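The scoring loop of Fig. 5 can be sketched as below: each section score is the number of matches multiplied by the section weight, and the document score is the sum over the profile sections. The function names `count_matches` and `score_documents` are hypothetical, and a case-insensitive substring count stands in for the actual string matching; sorting by score is added here only for readability.

```python
def count_matches(text, query):
    """Count non-overlapping, case-insensitive occurrences of the query string."""
    return text.lower().count(query.lower())


def score_documents(collection, profile, query):
    """Return (doc_id, score) pairs for documents with at least one weighted match."""
    scores = {}
    for doc_id, sections in collection.items():          # documents X
        total = 0.0
        matched = False
        for section, weight in profile.items():          # sections N in the profile
            if weight <= 0:
                continue                                  # zero weight: section omitted
            n = count_matches(sections.get(section, ""), query)
            if n:
                matched = True
            total += n * weight                           # step 516: matches x weight
        if matched:
            scores[doc_id] = total                        # step 522: summed score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


collection = {
    "A": {"claims": "hinge and hinge", "abstract": "a hinge"},
    "B": {"claims": "a bolt", "abstract": "hinge hinge hinge"},
    "C": {"claims": "a nut", "abstract": "a washer"},
}
profile = {"claims": 3.0, "abstract": 1.0}
ranked = score_documents(collection, profile, "hinge")
print(ranked)
```

In this toy collection document A scores 2 × 3.0 + 1 × 1.0 = 7.0, document B scores 3 × 1.0 = 3.0, and document C is not returned because it has no match.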
  • the compilation returned in Fig. 5 is based upon a query submitted to a document collection with the employment of a selected query.
  • the query selection may be dynamically modified.
  • the profile may be adjusted, with the query re-submitted to the selected document collection.
  • the search results in the form of the returned compilation of documents may be different.
  • the profile may be dynamically adjusted through the graphical user interface shown in Fig. 4 to solicit a different compilation of documents for the same search query. Accordingly, the same search query may be submitted to the document collection with a dynamic modification of the query profile to solicit a return of a different compilation of returned documents.
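The effect of adjusting the profile and resubmitting the same query can be shown with a compact sketch. The `rank` helper and the two example profiles are hypothetical, and the scoring (matches multiplied by weight, summed per document) follows the formula given for Fig. 5.

```python
def rank(collection, profile, query):
    """Return document ids ordered by summed (matches x weight), dropping zero scores."""
    q = query.lower()
    scores = {
        doc_id: sum(w * sections.get(s, "").lower().count(q) for s, w in profile.items())
        for doc_id, sections in collection.items()
    }
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [doc_id for doc_id, score in ordered if score > 0]


collection = {
    "A": {"claims": "hinge and hinge", "abstract": "a hinge"},
    "B": {"claims": "a bolt", "abstract": "hinge hinge hinge"},
}
claims_heavy = {"claims": 3.0, "abstract": 1.0}
abstract_only = {"claims": 0.0, "abstract": 1.0}

# Same query, two profiles: the order of the returned documents changes.
print(rank(collection, claims_heavy, "hinge"))
print(rank(collection, abstract_only, "hinge"))
```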
  • FIG. 6 is a block diagram (600) illustrating a set of tools for creation of search profiles and assignment of weights to different sections of the intellectual property documents identified in the search profile.
  • a computer system (602) is provided with a processor unit (604) coupled to memory (606) by a bus structure (608). Although only one processor unit (604) is shown, in one embodiment more processor units may be provided in an expanded design.
  • the system (602) is shown in communication with storage media (640) configured to house a document collection (642).
  • the electronic document collection includes a compilation of patent documents, including issued patents and published patent applications.
  • the storage media (640) is in communication with the processor unit (604).
  • the system is shown in communication with a visual display (650) for presentation of visual data.
  • Each of the elements shown and described herein supports query submission to the document collection (642).
  • a director (660) is provided local to the computer system (602) and in communication with memory (606).
  • the director (660) is responsible for compiling and indexing the document collection (642).
  • the director (660) is in communication with a document manager (662) which identifies each section of each document in the collection.
  • each patent or published patent application is comprised of specific uniform sections.
  • the document manager (662) is employed to identify the sections of the documents in the collection, and in one embodiment, the order of the presentation of the identified sections.
  • a profile manager (664) is provided in communication with the document manager (662).
  • the profile manager (664) organizes a search profile for the document collection (642).
  • the profile manager (664) facilitates the selection of one or more sections of the documents, as identified by the document manager (662) for inclusion in a query, and assigns a weight to each selected section. In one embodiment, the weight is a numerical value to identify the importance of matching data in the selected section(s). Accordingly, the search profile as organized by the profile manager (664) provides an outline for the sections of the document collection that are pertinent to the query.
  • a query manager (666) is in communication with the profile manager (664), also provided local to the computer system (602) and in communication with memory (606).
  • the query manager (666) is responsible for selection of at least one search profile with submission of a query to the document collection (642). More specifically, the query manager (666) compares query data with data in the sections of the document collection (642) that are identified in the profile and assigned a weight. The comparison as performed by the query manager (666) yields a compilation of relevant patent documents (646). In one embodiment, the compilation is presented on the visual display (650). Similarly, in one embodiment, the compilation may be retained on storage, either volatile or persistent.
  • the director (660), document manager (662), profile manager (664), and query manager (666), may reside in memory (606) local to the computer system (602). However, the invention should not be limited to this embodiment.
  • the director, document manager, profile manager, and query manager (660) - (666) may each reside as hardware tools external to local memory (606), or they may be implemented as a combination of hardware and software.
  • the director and managers (660) - (666) may reside on a remote system in communication with the storage media (640). Accordingly, the director and managers may be implemented as a software tool or a hardware tool to support submission of one or more queries to an electronic patent document collection to yield a compilation of relevant patent documents.
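As a software-only illustration of how the four tools of Fig. 6 might cooperate, the sketch below models each one as a small Python class. The class and method names (`Director.add`, `DocumentManager.sections`, `ProfileManager.organize`, `QueryManager.submit`) are hypothetical, the collection is held in memory, and a case-insensitive substring test stands in for the comparison of query data with section data.

```python
class Director:
    """Compiles and indexes the document collection (660)."""
    def __init__(self):
        self.collection = {}

    def add(self, doc_id, sections):
        self.collection[doc_id] = sections


class DocumentManager:
    """Identifies the sections present in the collection (662)."""
    def __init__(self, director):
        self.director = director

    def sections(self):
        seen = []
        for doc_sections in self.director.collection.values():
            for name in doc_sections:
                if name not in seen:
                    seen.append(name)
        return seen


class ProfileManager:
    """Organizes search profiles: selected sections with weights (664)."""
    def __init__(self, document_manager):
        self.document_manager = document_manager
        self.profiles = {}

    def organize(self, name, weights):
        unknown = set(weights) - set(self.document_manager.sections())
        if unknown:
            raise ValueError(f"unknown sections: {sorted(unknown)}")
        self.profiles[name] = dict(weights)


class QueryManager:
    """Submits a query with a selected profile and returns matching documents (666)."""
    def __init__(self, director, profile_manager):
        self.director = director
        self.profile_manager = profile_manager

    def submit(self, query, profile_name):
        profile = self.profile_manager.profiles[profile_name]
        q = query.lower()
        return [
            doc_id
            for doc_id, sections in self.director.collection.items()
            if any(w > 0 and q in sections.get(s, "").lower() for s, w in profile.items())
        ]


director = Director()
director.add("doc-1", {"title": "Widget", "claims": "a hinge"})
director.add("doc-2", {"title": "Hinge oil", "claims": "a lubricant"})
documents = DocumentManager(director)
profiles = ProfileManager(documents)
profiles.organize("claims-only", {"title": 0.0, "claims": 1.0})
queries = QueryManager(director, profiles)
print(queries.submit("hinge", "claims-only"))
```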
  • the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
  • the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system.
  • a computer-usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Embodiments within the scope of the present invention also include articles of manufacture comprising program storage means having encoded therein program code.
  • program storage means can be any available media which can be accessed by a general purpose or special purpose computer.
  • program storage means can include RAM, ROM, EEPROM, CD-ROM, or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired program code means and which can be accessed by a general purpose or special purpose computer. Combinations of the above should also be included in the scope of the program storage means.
  • the medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium.
  • Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, random access memory (RAM), read-only memory (ROM), a rigid magnetic disk, and an optical disk.
  • Current examples of optical disks include compact disk - read only (CD-ROM), compact disk - read/write (CD-R/W) and DVD.
  • a data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus.
  • the memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
  • I/O devices can be coupled to the system either directly or through intervening I/O controllers.
  • Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks.
  • the software implementation can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system.
  • Each intellectual property document is known in the art to have a defined outline of sections that are required to meet statutory filing requirements.
  • One or more profiles are created to facilitate submission of a query to the document collection.
  • Each profile imparts a weight to one or more of the identified sections in the document. The weight represents the importance of the identified section and adds value to each document in the returned compilation.
  • Not all queries are the same.
  • intellectual property documents in the chemical technologies have a limited number of drawing figures, if any. As such, a query in the chemical technology may de-emphasize the drawing figures, and place a greater emphasis on the written text. Different queries are submitted to the collection to achieve different results. Accordingly, the creation of multiple profiles, with each profile employing a different selection of the identified sections, and imparting different weights to the different selected sections, enables a query submission to be efficiently and effectively processed to yield a focused compilation of document results.
  • the electronic document collection has been specifically described pertaining to intellectual property documents, including issued patents and published patent applications, trademark registrations and applications, and copyright registrations and applications.
  • the invention should not be limited to these specific categories of electronic documents.
  • the electronic document collection may include any type of document that has a defined plurality of sections. This would enable the managers to parse the documents into the defined sections, create multiple profiles with associated weights for one or more of the defined sections, and submit a query to the document collection with a selected profile.
  • selection of a query profile may be dynamically modified. In one embodiment, modification of the query profile while maintaining the query content may change the documents returned in the compilation as well as the order in which the documents in the compilation are presented. Accordingly, the scope of protection of this invention is limited only by the following claims and their equivalents.
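The effect of switching profiles while keeping the query fixed can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function name `rank`, the toy collection, the profile weights, and in particular the scoring rule (the sum of the weights of the sections that contain the query string), which the text above does not prescribe.

```python
def rank(collection, query, profile):
    """Return document ids ordered by descending profile-weighted score.

    Assumed scoring rule: a document scores the sum of the weights of the
    profile sections that contain the query string. Sections with a weight
    of zero are ignored, and documents scoring zero are left out.
    """
    q = query.lower()
    scores = {}
    for doc_id, sections in collection.items():
        score = sum(
            weight
            for name, weight in profile.items()
            if weight > 0 and q in sections.get(name, "").lower()
        )
        if score > 0:
            scores[doc_id] = score
    # Ties are broken by document id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical three-document collection, already split into sections.
collection = {
    "A": {"claims": "a polymer coating", "description": "applied by spraying"},
    "B": {"claims": "a spraying nozzle", "description": "for a polymer coating"},
    "C": {"claims": "a valve", "drawings": "polymer coating shown in FIG. 2"},
}

# Two hypothetical profiles; both drop the drawings, as in the chemical example.
clearance = {"claims": 5, "description": 1, "drawings": 0}
novelty = {"claims": 1, "description": 5, "drawings": 0}

print(rank(collection, "polymer coating", clearance))  # ['A', 'B']
print(rank(collection, "polymer coating", novelty))    # ['B', 'A']
```

The same query returns the same two documents under both profiles but in a different order, and document C never appears because its only match falls in a section weighted zero.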

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Document Processing Apparatus (AREA)

Abstract

A method, system, and article are provided for efficiently and effectively searching an electronic document collection. Each of the documents in the collection is pre-divided into sub-sections. One or more profiles are created, with each profile including a selection of one or more of the sections of the documents in the collection. In addition, a weight is assigned to each of the selected sections in the profile. Based upon the parameters of a query and selection of a profile, select sub-sections of each document are employed in a comparison of query data to the underlying document collection. A compilation of documents is created based upon all documents with data matching the query data within the sections of the document as identified in the submitted profile.

Description

Method, System, and Apparatus for Searching an Electronic Document Collection
BACKGROUND OF THE INVENTION
Technical Field
[0001] This invention relates to an electronic document collection, and searching the collection in response to receipt of a query. More specifically, the invention relates to creating search profiles by placing an emphasis on each section of an intellectual property document to be searched, and processing the query responsive to selection of at least one of the search profiles.
Description of the Prior Art
[0002] All intellectual property documents submitted for examination before any of a worldwide selection of patent offices, hereinafter Patent Office, must meet certain requirements, including that each intellectual property document must be deemed new, useful, and non-obvious. To properly prepare an intellectual property document for examination, it is useful to have knowledge of prior intellectual property documents, i.e. prior art, in related areas of technology, as only one intellectual property document may be granted per invention. The process of ascertaining prior art is known as a search. The results of the search generally help the drafters of any subsequent intellectual property application to focus their efforts on what appears to be patentable or otherwise protectable subject matter and aid in developing a reasonable strategy for achieving the goals of the inventor or owner of the intellectual property rights.
[0003] Prior to the evolution of technology into the current electronic information age, it was known that intellectual property searches were conducted manually. A searcher would review a disclosure and based upon a classification system, ascertain where the disclosure should be classified, and thereafter conduct a search. It was recognized that the searcher would visually review appropriate sections of the intellectual property document based upon the defined scope of the search being conducted. With the advent of information technology, manual searches are no longer available in most jurisdictions as most intellectual property grants and published applications are only available in electronic form. With the advent of the electronic format of the intellectual property document, similar strategies employed with the manual search may be used for searching an electronic intellectual property database.
[0004] Different classes of searches may be commissioned to achieve different results. For example, a novelty search may be commissioned to ascertain whether or not to submit a filing for an intellectual property asset. A product clearance search may be commissioned to ascertain whether a product is covered under the claims of a current intellectual property asset. An invalidity search may be commissioned to determine if the issued claims of the intellectual property asset are valid, etc. Prior electronic intellectual property document search tools do not support the different classes of searches. Rather, the burden is on the person doing the search, also known as the searcher, to limit the sections of the intellectual property document to be reviewed in the search based upon the scope of the search. As the quantity of granted intellectual property rights and published pending intellectual property applications in the database grows, the burden on the searcher increases since more associated documents need to be reviewed for each search.
[0005] Accordingly, there is a need for a tool and technique to be used by a searcher to mitigate or avoid the burdens associated with the search and related search scope and to take advantage of the electronic format of the intellectual property documents. The tool should enable the searcher to leverage the different sections of the intellectual property document during the search to more efficiently and effectively yield accurate and desirable search results.
SUMMARY OF THE INVENTION
[0006] This invention comprises a method, system, and article for efficiently and effectively searching a collection of patent documents.
[0007] In one aspect of the invention, a computer method is provided for searching an electronic document collection. A collection of patent documents is compiled and indexed, with each of the patent documents in the collection being comprised of multiple sections. Each section of each patent in the collection is identified. A search profile is organized for the document collection. The search profile includes a selection of the identified sections of each document in the collection. For each profile, a weight is assigned to each of the selected sections. At the time of submission of a query to the collection, a search profile is selected and query data is compared with data in each of the sections of the document collection as identified and assigned a weight in the selected profile. A match of the query data to each profile section with an assigned weight yields a compilation of documents to be returned as part of the search results.
[0008] In another aspect of the invention, a computer system is provided with a processor in communication with storage media, and an electronic document collection maintained on the storage media. The electronic document collection is a compilation of intellectual property documents. Based upon characteristics of intellectual property documents, each of the documents in the collection has multiple sections. A director is employed to index and compile the collection of documents. The director is in communication with a document manager, which identifies each section of the documents in the collection. In addition, a profile manager is provided to organize a search profile for the document collection. The profile manager is in communication with the document manager and employs the search profile to include a selection of the identified sections of each document in the compiled collection. In addition to selecting specific sections for inclusion in the profile, the profile manager assigns a weight to each of the selected sections in each profile. The weight is a reflection of the emphasis on the associated section. At query time, a query manager submits a query to the document collection. The query includes a selection of at least one search profile and compares query data with data in each of the sections of the document as reflected in the profile. Following the submission by the query manager, a compilation of relevant patent documents is returned, with the compilation including a match of the query to data in at least each identified profile section having an assigned weight.
[0009] In yet another aspect of the invention, an article is provided with a computer-readable carrier including computer program instructions configured to search an electronic document collection on computer memory. The computer-readable carrier includes computer program instructions to perform a query over the document collection. Instructions are provided to compile and index a collection of intellectual property documents. Each of the patent documents in the collection is divided into multiple sections. Following indexing of the collection, instructions are provided to identify each of the sections of each document in the collection. Once the sections of the documents are identified, instructions are provided to organize a search profile for the document collection. The search profile is a selection of the identified sections of each document in the collection. Additionally, instructions are provided to assign a weight to each of the sections identified in the search profile. Upon submission of a query to the document collection, instructions are provided to select at least one search profile and to compare query data with data in the sections of the documents in the collection as identified in the profile. Results of the query submission include a compilation of relevant documents returned based upon a match of the query data in at least each identified profile section with one or more documents in the underlying collection.
[0010] Other features and advantages of this invention will become apparent from the following detailed description of the presently preferred embodiment of the invention, taken in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The drawings referenced herein form a part of the specification. Features shown in the drawing are meant as illustrative of only some embodiments of the invention, and not of all embodiments of the invention unless otherwise explicitly indicated. Implications to the contrary are otherwise not to be made.
[0012] FIG. 1 is a flow chart illustrating a process for identifying sections of a patent document for creation of one or more profiles.
[0013] FIG. 2 is a flow chart illustrating a process for creating a secondary weight for one or more profiles.
[0014] FIG. 3 is a flow chart illustrating a process for employing the secondary weight to reflect the location within each profile section in which the string match occurs.
[0015] FIG. 4 is a block diagram of a user interface to support creation of a profile.
[0016] FIG. 5 is a flow chart illustrating a process for submitting a query to a compiled and indexed document collection, according to the preferred embodiment of this invention, and is suggested for printing on the first page of the issued patent.
[0017] FIG. 6 is a block diagram illustrating a set of tools employed to create a search profile and to assign one or more weights to different sections of the underlying document collection as identified in the profile.
DESCRIPTION OF THE PREFERRED EMBODIMENT
[0018] It will be readily understood that the components of the present invention, as generally described and illustrated in the Figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the apparatus, system, and method of the present invention, as presented in the Figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention.
[0019] The functional units described in this specification have been labeled as managers and directors. A manager and/or director may be implemented in programmable hardware devices such as field programmable gate arrays, programmable array logic, programmable logic devices, or the like. The manager and/or director may also be implemented in software for execution by various types of processors. An identified manager and/or director of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions which may, for instance, be organized as an object, procedure, function, or other construct. Nevertheless, the executables of an identified manager and/or director need not be physically located together, but may comprise disparate instructions stored in different locations which, when joined logically together, comprise the manager and/or director and achieve the stated purpose of the manager and/or director.
[0020] Indeed, a manager and/or director of executable code could be a single instruction, or many instructions, and may even be distributed over several different code segments, among different applications, and across several memory devices. Similarly, operational data may be identified and illustrated herein within the manager and/or director, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices, and may exist, at least partially, as electronic signals on a system or network.
[0021] Reference throughout this specification to "a select embodiment," "one embodiment," or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases "a select embodiment," "in one embodiment," or "in an embodiment" in various places throughout this specification are not necessarily referring to the same embodiment.
[0022] Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the invention.
[0023] The illustrated embodiments of the invention will be best understood by reference to the drawings, wherein like parts are designated by like numerals throughout. The following description is intended only by way of example, and simply illustrates certain selected embodiments of devices, systems, and processes that are consistent with the invention as claimed herein.
Overview
[0024] An intellectual property document collection is a compilation of issued and published applications. A patent document collection is a subset of the intellectual property document collection. Patent documents come in the form of issued patent grants and published patent applications. The difference between the two categories of documents identifies their enforceable value. More specifically, a patent grant is an actual property right that can be enforced in a court of law, whereas a published patent application is a pending application that is a pending patent right. Each patent document is parsed into multiple sections, with each section containing written words and phrases, also known as string data. To accommodate searching of the collection, each document in the collection is parsed based upon sections within each document, and a weight is assigned to each of the parsed sections of the intellectual property documents. The weight is a numerical measure of emphasis to be placed on one or more specific sections of the document for the query. A selection of document sections together with weights assigned to the selected sections creates a search profile. Depending upon the scope of the search, the search may be limited to specific sections of the documents, or different emphasis may be placed on matching query data in each section of the document. Accordingly, the creation and selection of a search profile is directly related to the search results.
[0025] Prior to submission of a query string to the document collection, at least one search profile is selected based upon the intended scope of the search. A compilation of matching documents is returned based upon matching data between the query string data and the document string data in each section of the patent document having an assigned weight as indicated in the selected profile. Accordingly, weights are assigned to one or more sections of a patent document in a patent document collection to efficiently and effectively create a result set with data pertinent to the query submitted to the collection, wherein the result set includes one or more documents in the patent document collection with a string matching the submitted query string in a section assigned a weight value in excess of zero.
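The membership rule stated above (a document belongs to the result set when the query string matches in a section whose weight exceeds zero) might be modeled as follows. This is only a sketch under stated assumptions: the class name `SearchProfile`, the function `result_set`, the case-insensitive substring comparison, and the sample documents are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchProfile:
    """A named selection of sections with a numerical weight for each."""
    name: str
    weights: dict  # section name -> weight; zero means the section is ignored

    def active_sections(self):
        # Only sections with a weight in excess of zero take part in matching.
        return [s for s, w in self.weights.items() if w > 0]


def result_set(collection, query, profile):
    """Return ids of documents whose active sections contain the query string."""
    q = query.lower()
    return [
        doc_id
        for doc_id, sections in collection.items()
        if any(q in sections.get(s, "").lower() for s in profile.active_sections())
    ]


# Hypothetical documents, already parsed into sections.
docs = {
    "US1": {"abstract": "weighted search", "claims": "a method"},
    "US2": {"abstract": "a method", "claims": "weighted search"},
}
claims_only = SearchProfile("claims-only", {"claims": 1.0, "abstract": 0})
print(result_set(docs, "weighted search", claims_only))  # ['US2']
```

US1 also contains the query string, but only in its abstract, which this profile weights at zero, so it is not returned.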
Technical Details
[0026] In the following description of the embodiment, reference is made to the accompanying drawings that form a part hereof, and which show by way of illustration the specific embodiment in which the invention may be practiced. It is to be understood that other embodiments may be utilized because structural changes may be made without departing from the scope of the present invention.
[0027] It is recognized that documents describing issued and published intellectual property documents are divided into multiple sections. Each section is required for a submission of a completed application, and each section has a purpose. The details of each section of the underlying intellectual property are not going to be discussed herein. However, for purposes of disclosure, the different sections of a patent, as an example of an intellectual property document, will be identified. For the most part, each patent application includes a title, a priority filing date, an abstract, a background description, a summary, a brief description of the drawing figures (if any), drawing figures (if any), a detailed description of the invention, and claims. There are different search categories that are employed in the patent arena depending upon the purpose of the search. For example, an infringement and/or product clearance search is concerned with the language in the claims, and therefore should be essentially directed to the claims present in the document collection. A validity and/or invalidity search is concerned with any known prior art, and requires identification of the priority filing date of the patent document. When an inventor(s) seeks to determine the novelty of their invention prior to or following submission of a patent application, the inventor(s) or his/her agent or representative may commission a novelty search. Such a search may de-emphasize the claims and focus on the detailed description of the invention. Accordingly, as shown herein, each search places emphasis on different sections of a patent document in the document collection.
[0028] Fig. 1 is a flow chart (100) illustrating a process for identifying sections of a patent document for creation of one or more profiles. Under current rules of practice, each patent document submitted to the U.S. Patent and Trademark Office will contain the following sections: title, background (including the technical field and a description of the prior art), summary of the invention, brief description of the drawing figures, drawing figures, detailed description of the preferred embodiment(s), claims, and abstract. In one embodiment, not all patent documents will contain drawing figures, such as in chemical practice and some international patents and patent documents. Similarly, in other countries and regional offices and in prior domestic practice, there may be a different quantity of sections in a patent document, or the sections may be presented in a different order. Accordingly, prior to placing an emphasis on one or more sections of a patent document in the collection with a query, the origin of the documents, the different sections of the documents, and the order in which the sections are organized in the collection need to be identified.
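One way the identification of sections and their order could be approached for plain-text documents is sketched below. It assumes, purely for illustration, that each section is introduced by a heading on its own line; the heading strings in `SECTION_HEADINGS`, the short section names, and the function `identify_sections` are hypothetical, and real collections would need a table per office and era.

```python
import re

# Hypothetical heading markers for one document origin.
SECTION_HEADINGS = {
    "background": "BACKGROUND OF THE INVENTION",
    "summary": "SUMMARY OF THE INVENTION",
    "drawings": "BRIEF DESCRIPTION OF THE DRAWINGS",
    "description": "DESCRIPTION OF THE PREFERRED EMBODIMENT",
    "claims": "CLAIMS",
}


def identify_sections(text):
    """Return {section name: body}, ordered as the headings occur in the text.

    Sections whose heading is absent (for example drawings in a chemical
    case) are simply missing from the result.
    """
    hits = []
    for name, heading in SECTION_HEADINGS.items():
        # The heading must fill a whole line to count as a section start.
        m = re.search(r"^" + re.escape(heading) + r"\s*$", text, flags=re.MULTILINE)
        if m:
            hits.append((m.start(), m.end(), name))
    hits.sort()  # document order, which may differ between offices

    sections = {}
    for i, (_start, end, name) in enumerate(hits):
        stop = hits[i + 1][0] if i + 1 < len(hits) else len(text)
        sections[name] = text[end:stop].strip()
    return sections


sample = (
    "BACKGROUND OF THE INVENTION\nOld tools.\n"
    "SUMMARY OF THE INVENTION\nA method.\n"
    "CLAIMS\n1. A method.\n"
)
print(identify_sections(sample))
```

Because the result preserves the order in which headings were found, the same function reports both which sections a document has and how they are arranged.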
[0029] Initially, a collection of patent documents is compiled and indexed (102). It is recognized in the art that patents and patent publications are comprised of multiple sections. Following the compilation of the documents, each section in each patent in the collection of documents is identified (104). The variable NTotal is assigned to the number of sections in the patent document (106). Different profiles are created to address different searching needs. A profile is created by placing an emphasis on different combinations of sections of the patent documents, and/or by omitting one or more sections of the document from consideration during the search itself by assigning a value of zero to that section. To support profile based searching, at least one profile is created. However, in one embodiment, there are multiple profiles created to support selection of a profile to meet the needs of the search. Once the sections of the patent documents are identified at step (106), a counting variable X associated with the profile designation is initialized and assigned to the integer one (108) and the counting variable N pertaining to the sections of the patent document is assigned to the integer one (110). Starting with sectionN of the patent document collection, it is determined if sectionN will be employed as part of the profile being created, profileX (112). A positive response to the determination at step (112) joins sectionN to profileX (114). With the selection of sectionN, a primary weight is assigned to sectionN (116). The primary weight is a numerical value that signifies the importance of sectionN to profileX with respect to other sections of the patent document collection, including any previously selected sections and other sections to be joined or omitted from the profile. Following step (116) or a negative response to the determination at step (112), the variable N associated with the sections of the patent documents is incremented (118).
It is then determined if all of the identified sections of the patent documents in the compiled and indexed collection have been evaluated for joining or omitting from profileX (120). A positive response to the determination at step (120) concludes the profile creation process for profileX (122). Conversely, a negative response to the determination at step (120) is followed by a return to step (112) for consideration of additional sections in the collection for profileX. It is then determined if there are any additional profiles to create for the document collection (124). A positive response to the determination at step (124) is followed by an increment of the counting variable X (126) and a return to step (110). Conversely, a negative response to the determination at step (124) concludes the creation of the profiles with assignment of the number associated with X to the variable XTotal (128). Accordingly, one or more profiles may be created for a patent document collection, with each profile placing an emphasis on one or more identified sections in the patent document collection.
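The nested loop of Fig. 1 can be sketched in a few lines. The step numbers in the comments refer to the flow chart described above; the function name `create_profiles`, the shape of the `requests` input, and the sample weights are assumptions made for this illustration only.

```python
def create_profiles(section_names, requests):
    """Build one profile per request, following the loop of Fig. 1.

    section_names: the NTotal identified sections (steps 104-106).
    requests: hypothetical user input, one dict per profile, mapping a
              section name to the primary weight wanted for it.
    """
    profiles = []
    for request in requests:               # profileX, X = 1..XTotal (108, 124-126)
        profile = {}
        for name in section_names:         # sectionN, N = 1..NTotal (110, 118-120)
            weight = request.get(name, 0)  # step 112: is sectionN part of profileX?
            profile[name] = weight         # steps 114-116; a weight of 0 omits it
        profiles.append(profile)           # step 122: profileX is complete
    return profiles                        # len(profiles) plays the role of XTotal (128)


sections = ["title", "abstract", "description", "claims"]
profiles = create_profiles(sections, [
    {"claims": 10, "abstract": 2},       # e.g. a claims-focused profile
    {"description": 8, "claims": 1},     # e.g. a description-focused profile
])
print(profiles[0])
```

Storing an explicit zero for every omitted section mirrors the statement that a section is left out of a search by assigning it a value of zero.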
[0030] As demonstrated in Fig. 1, one or more profiles may be created to emphasize or de-emphasize employment of select sections of the patent documents during the search process. Fig. 2 is a flow chart (200) demonstrating an added dimension of emphasis that may be added to each created profile. More specifically, an added weight in the form of a secondary weight may be employed to either add or subtract from the weight score based upon a quantity of matching strings in select sections of each profile. The variable XTotal is assigned to represent the quantity of profiles created (202), as demonstrated in Fig. 1, and a counting variable X is assigned to the integer one (204). Thereafter, the variable YTotal is assigned to represent the quantity of sections in profileX with a weight assignment (206), as demonstrated in Fig. 1. To assess the individual sections of a profile, a counting variable Y is assigned to the integer one (208). It is then determined if a secondary weight will be added to sectionY of profileX (210). A negative response to the determination at step (210) is followed by a jump to step (230) to evaluate the next section in the profile, if any. Conversely, a positive response to the determination at step (210) is followed by a second query to determine if the secondary weight assignment will be a tiered structure (212). More specifically, each profile may include a hierarchy of weight values depending upon a quantity of data string matches returned during the search process with the selected profile. A negative response to the determination at step (212) is followed by setting the minimum threshold of data string matches that must be returned in order to employ a secondary weight assignment to sectionY (214). Following step (214), the secondary weight value is set for profileX, sectionY (216). The input at steps (214) and (216) is to set the parameters satisfying the secondary weight structure as established at step (212).
Accordingly, for each profile section, a secondary weight value may be set to provide emphasis on the search results when a threshold value of matches has been exceeded.
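A single-threshold secondary weight might be applied to one section as in the sketch below. Several points are assumptions rather than statements from the text: the function name `section_score`, counting matches as non-overlapping case-insensitive substring occurrences, treating the threshold as inclusive (the text speaks both of a "minimum threshold" and of a threshold being "exceeded"), and combining the two weights by simple addition.

```python
def section_score(text, query, primary, min_matches, secondary):
    """Score one profile section against a query string.

    primary:     the primary weight of the section (Fig. 1).
    min_matches: the minimum threshold of string matches (step 214).
    secondary:   the secondary weight (step 216); it may be negative,
                 since it can add to or subtract from the score.
    """
    matches = text.lower().count(query.lower())
    if matches == 0:
        return 0.0          # no match in this section, so it contributes nothing
    score = float(primary)
    if matches >= min_matches:
        score += secondary  # threshold reached: apply the secondary weight
    return score


claims_text = "A weight is set. The weight is stored. The weight is applied."
print(section_score(claims_text, "weight", primary=2.0, min_matches=3, secondary=0.5))  # 2.5
print(section_score(claims_text, "weight", primary=2.0, min_matches=4, secondary=0.5))  # 2.0
```

With three occurrences of the query string, the section earns the secondary weight when the threshold is three and falls back to its primary weight when the threshold is four.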
[0031] In addition to setting a single secondary weight value, each select section of a profile may be configured to accommodate a hierarchy of secondary weight threshold values. Following a positive response to the determination at step (212), the variable ZTotal is assigned to the quantity of hierarchical thresholds to be assigned to profileX, sectionY (218), and a tier counting variable Z is set to the integer one (220). Following step (220), the minimum threshold of data string matches that must be returned in order to employ a secondary weight assignment to profileX, sectionY, tierZ is set (222), and the secondary weight value is set for profileX, sectionY, tierZ (224). Once the weight value is set for the select tierZ, the tier counting variable Z is incremented (226), followed by a determination as to whether all the weight values have been set for all of the tiers for profileX, sectionY (228). A negative response to the determination at step (228) is followed by a return to step (222). Conversely, a positive response to the determination at step (228) or following step (216) is followed by an increment of the counting variable Y to proceed to evaluation of the next section of the select profile (230). It is then determined if all of the sections of the select profile have been evaluated for assignment of a hierarchy of secondary weight threshold values (232). A negative response to the determination at step (232) is followed by a return to step (210), and a positive response to the determination at step (232) is followed by an increment of the profile counting variable X (234). Following step (234), it is determined if all of the created profiles have been evaluated for assignment of a secondary weight (236).
A negative response to the determination at step (236) is followed by a return to step (206), and a positive response to the determination at step (236) concludes the assignment of a hierarchy of secondary weight threshold values to select sections of created profiles (238). Accordingly, each profile may be configured with a hierarchy of secondary weights to place emphasis on both the select sections of each profile as well as the quantity of matching strings within a profile. [0032] As shown in Fig. 2, a hierarchy of secondary weights, i.e. tiers, may be applied to each individual section of a profile, with the secondary weights based upon one or more threshold values for the quantity of matches between the query string and the data in the document collection being parsed. In another embodiment, the secondary weight may reflect the location within one or more profile sections in which the string match occurs, as demonstrated in Fig. 3. This secondary weight may be separate from or supplemental to the secondary weight demonstrated in Fig. 2. The variable Xjbtas is assigned to represent the quantity of profiles created (302), as demonstrated in Fig. 1 , and a counting variable X is assigned to the integer one (304) Thereafter, the variable Y Total is assigned to represent the quantity of sections in profiiex with a weight assignment (306), and a counting variable Y is assigned to the integer one (308). It is then determined if a secondary weight will be added to profile*, section^- {310). A positive response to the determination at step (310) is followed by dividing profiiex, section? into multiple subsections (3 12). There are different embodiments that may be employed for the division at step (312). 
For example, in one embodiment, there may be three subsections with a first subsection being limited to the first sentence, a third subsection being limited to the last sentence, and a second subsection being limited to all data located between the first and third subsections. Similarly, in another embodiment, profiiex, sectiony may be divided into multiple sections, with each section length pertaining to a percentage of the profiiex, sectiony as a whole. Regardless of the method employed for determining the quantity of subsections, each profiiex, sectiony may be divided into two or more subsections with a secondary weight assigned to reflect a matching string not only in profiiex, sectiony but also the location of the match in the select subsection.
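The two division embodiments described for step (312) can be sketched in Python as follows. The function names are hypothetical, the sentence splitter is a deliberately naive regular expression, and measuring section length in words for the percentage-based division is an assumption; the specification does not say how length is measured.

```python
import re
from typing import List, Sequence


def split_first_middle_last(text: str) -> List[str]:
    """Divide a section into first sentence, middle, and last sentence.

    Assumption: sentences end in '.', '!' or '?' followed by whitespace.
    Sections with fewer than three sentences are returned sentence by sentence.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if len(sentences) < 3:
        return sentences
    return [sentences[0], " ".join(sentences[1:-1]), sentences[-1]]


def split_by_fraction(text: str, fractions: Sequence[float]) -> List[str]:
    """Divide a section into consecutive subsections sized by word-count fractions."""
    words = text.split()
    parts: List[str] = []
    start = 0
    cumulative = 0.0
    for i, fraction in enumerate(fractions):
        cumulative += fraction
        # The last subsection absorbs any rounding remainder.
        end = len(words) if i == len(fractions) - 1 else round(cumulative * len(words))
        parts.append(" ".join(words[start:end]))
        start = end
    return parts


section = "The device has a lid. The lid is hinged. It locks. The lock is keyed."
print(split_first_middle_last(section))
print(split_by_fraction("a b c d e f g h i j", [0.2, 0.6, 0.2]))
```

Each returned subsection could then carry its own secondary weight, so that a match in the first sentence counts differently from a match in the middle of the section.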
[0033] Following step (312), the variable ZTotal is assigned to the quantity of subsections created for profileX, sectionY (314), and a counting variable Z is assigned to the integer one (316). A secondary weight is assigned to profileX, sectionY, subsectionZ (318). Following the assignment at step (318), the counting variable Z is incremented (320), followed by a determination as to whether there are any more subsections in profileX, sectionY that have not been evaluated for a secondary weight assignment (322). A negative response to the determination at step (322) is followed by a return to step (318). Conversely, a positive response to the determination at step (322) or a negative response to the determination at step (310) is followed by an increment of the counting variable Y (324). It is then determined if there are any sections in profileX that have not been evaluated for assignment of a secondary weight (326). A negative response to the determination at step (326) is followed by a return to step (310). Conversely, a positive response to the determination at step (326) is followed by an increment of the counting variable X (328), and a determination as to whether all of the profiles have been evaluated for a secondary weight assignment (330). A negative response to the determination at step (330) is followed by a return to step (306), and a positive response concludes the secondary weight assignment process. Accordingly, a profile section may be subdivided into multiple subsections based upon their physical location, with a secondary weight assigned to one or more of the identified subsections.
[0034] Fig. 4 is a block diagram (400) of a user interface to support creation of a profile for a search to be submitted to an intellectual property document collection. The user interface functions as a veneer to the underlying code for application of weights to sections of the underlying documents. As shown, there are multiple blocks presented within the interface, with each box associated with a section identified in the document collection. More specifically, in the example shown herein there are five boxes (410), (420), (430), (440), and (450), with each box having indicia identifying the respective sections of the underlying documents in the collection. The first box (410) is associated with a first section present in the documents (412), the second box (420) is associated with a second section present in the patent documents (422), the third box (430) is associated with a third section present in the patent documents (432), the fourth box (440) is associated with a fourth section present in the patent documents (442), and the fifth box (450) is associated with a fifth section present in the patent documents. Although the interface represented herein only shows the underlying documents in the collection divided into five sections, the invention should not be limited to this quantity. In one embodiment, the document collection may be parsed into a larger quantity or a smaller quantity of sections, with each section represented in the interface (400).
[0035] For each section of the underlying document identified in the interface for parsing, a slide mechanism is provided to raise or lower the weight to be allocated to the associated section. As such, the first box (410) is provided with a slide (414), the second box (420) is provided with a slide (424), the third box (430) is provided with a slide (434), the fourth box (440) is provided with a slide (444), and the fifth box (450) is provided with a slide (454). In one embodiment, each box (410) - (450) is scaled with a position of the slide indicating the weight to be applied to a match of the query with data present in the specified section of the documents. Although the numerical indicia of the weights are not shown herein, in one embodiment, the numerical indicia may be provided on a vertical axis of each box (410) - (450). As the individual slide of each box is raised, the weight of the associated section is increased. Similarly, as the individual slide of each box is lowered, the weight of the associated section is decreased. Accordingly, the interface provides a graphical tool to support the allocation of weights to different sections of the intellectual property documents for creation of a profile.
[0036] Once the profiles have been created and the primary and/or secondary weights have been assigned to the different sections and subsections as identified in the profile, the profiles may be employed to create a compilation of relevant documents from a document query. Fig. 5 is a flow chart (500) illustrating a process for submitting a query to a compiled and indexed document collection. Initially, one or more document collections are selected to receive the query (502), together with a profile for the query submission (504). It is understood that a properly selected profile will reflect the intended scope of the document query submission. In other words, a search limited to the claims should reflect a profile that substantially limits the document collection to the claims section. Accordingly, a search that does not intend to review the document sections in their entirety should include selection of an appropriately categorized profile. Once steps (502) and (504) are complete, the searcher provides the query and submits it to the document collection (506). The variable XTotal is assigned to the quantity of documents that are determined to match with data submitted in the query (508), and an associated counting variable, X, is assigned to the integer one (510). Similarly, the variable NTotal is assigned to the quantity of sections identified in the selected profile with at least one occurrence of the query input (512), and an associated counting variable, N, is assigned to the integer one (514). A score is calculated for documentX, sectionN (516) based upon the following mathematical formula:
documentX, sectionN = (number of matches in section N) × (weight assigned to section N)
Following step (516), the variable N is incremented (518), followed by a determination as to whether all of the sections in the profile have been evaluated for documentX (520). A negative response to the determination at step (520) is followed by a return to step (516). Conversely, a positive response to the determination at step (520) is followed by aggregating a score for documentX as the summation of the weighted score value of documentX, sectionN for each of the sections in the document (522). This aggregation is compiled for each patent document in the collection that has a match with the query input. Following step (522), the variable X is incremented (524), followed by a determination of whether a weight has been calculated for all of the documents with a match (526). A negative response to the determination at step (526) is followed by a return to step (514). Conversely, a positive response to the determination at step (526) is an indication that the weight has been calculated and assigned to each document in the collection based upon the selected profile (528). Accordingly, a compilation of documents is returned with an attached weight that reflects the relevancy of the document based upon the profile employed in the search.
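The scoring loop of Fig. 5 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the names `count_matches`, `score_document`, and `rank` are hypothetical, a match is taken to be a case-insensitive, non-overlapping substring occurrence, documents are represented as dictionaries mapping section names to text, and documents whose aggregate score is not positive are omitted from the compilation.

```python
from typing import Dict, List, Tuple


def count_matches(text: str, query: str) -> int:
    """Count non-overlapping, case-insensitive occurrences of the query string."""
    if not query:
        return 0
    return text.lower().count(query.lower())


def score_document(sections: Dict[str, str],
                   profile: Dict[str, float],
                   query: str) -> float:
    """Sum, over the profile's sections, matches in the section times its weight."""
    return sum(count_matches(sections.get(name, ""), query) * weight
               for name, weight in profile.items())


def rank(collection: Dict[str, Dict[str, str]],
         profile: Dict[str, float],
         query: str) -> List[Tuple[str, float]]:
    """Return (document id, score) pairs with a positive score, highest first."""
    scored = [(doc_id, score_document(sections, profile, query))
              for doc_id, sections in collection.items()]
    return sorted((pair for pair in scored if pair[1] > 0),
                  key=lambda pair: pair[1], reverse=True)


collection = {
    "A": {"claims": "a widget and a widget", "abstract": "widget"},
    "B": {"claims": "gadget", "abstract": "widget widget"},
}
profile = {"claims": 3.0, "abstract": 1.0}
print(rank(collection, profile, "widget"))
```

With the sample data, document A scores 2 × 3.0 + 1 × 1.0 = 7.0 and document B scores 0 × 3.0 + 2 × 1.0 = 2.0, so A is listed first.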
[0037] The compilation returned in Fig. 5 is based upon a query submitted to a document collection with the employment of a selected profile. In one embodiment, the profile selection may be dynamically modified. Following the return of the compilation of documents, the profile may be adjusted, with the query re-submitted to the selected document collection. By selecting a different profile, the search results in the form of the returned compilation of documents may be different. Similarly, in one embodiment, the profile may be dynamically adjusted through the graphical user interface shown in Fig. 4 to solicit a different compilation of documents for the same search query. Accordingly, the same search query may be submitted to the document collection with a dynamic modification of the query profile to solicit a return of a different compilation of returned documents.
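The effect of re-submitting one query under two different profiles can be illustrated with a short Python sketch. The per-section match counts below are made-up illustrative data, and the names `rerank`, `broad`, and `claims_focused` are hypothetical. The sketch assumes the match counts for the query are already known, so that only the profile changes between submissions; a weight of zero is treated as removing the section from the scope of the query.

```python
from typing import Dict, List


def rerank(match_counts: Dict[str, Dict[str, int]],
           profile: Dict[str, float]) -> List[str]:
    """Order document ids by profile-weighted match counts, highest first.

    Sections absent from the profile contribute nothing; documents with a
    score that is not positive are left out of the compilation.
    """
    scores = {doc: sum(n * profile.get(section, 0.0)
                       for section, n in counts.items())
              for doc, counts in match_counts.items()}
    return sorted((doc for doc, s in scores.items() if s > 0),
                  key=lambda doc: scores[doc], reverse=True)


# Illustrative match counts for one fixed query (not real data).
match_counts = {
    "A": {"claims": 1, "description": 6},
    "B": {"claims": 3, "description": 1},
}
broad = {"claims": 1.0, "description": 1.0}
claims_focused = {"claims": 5.0, "description": 0.0}

print(rerank(match_counts, broad))
print(rerank(match_counts, claims_focused))
```

Under the broad profile document A (score 7) precedes document B (score 4); under the claims-focused profile the order reverses (B scores 15, A scores 5), even though the query itself is unchanged.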
[0038] As demonstrated above, each patent in the document collection may be parsed to emphasize or de-emphasize the value of data matches in specified sections of a returned compilation of intellectual property documents. Fig. 6 is a block diagram (600) illustrating a set of tools for creation of search profiles and assignment of weights to different sections of the intellectual property documents identified in the search profile. As shown, a computer system (602) is provided with a processor unit (604) coupled to memory (606) by a bus structure (608). Although only one processor unit (604) is shown, in one embodiment more processor units may be provided in an expanded design. The system (602) is shown in communication with storage media (640) configured to house a document collection (642). In one embodiment, the electronic document collection includes a compilation of patent documents, including issued patents and published patent applications. The storage media (640) is in communication with the processor unit (604). In addition, the system is shown in communication with a visual display (650) for presentation of visual data. Each of the elements shown and described herein supports query submission to the document collection (642).
[0039] A director (660) is provided local to the computer system (602) and in communication with memory (606). The director (660) is responsible for compiling and indexing the document collection (642). The director (660) is in communication with a document manager (662) which identifies each section of each document in the collection. As explained above, in the case of a patent document collection, each patent or published patent application is comprised of specific uniform sections. However, not all patent document collections have a uniform layout. As such, the document manager (662) is employed to identify the sections of the documents in the collection, and in one embodiment, the order of the presentation of the identified sections. A profile manager (664) is provided in communication with the document manager (662). The profile manager (664) organizes a search profile for the document collection (642). More specifically, the profile manager (664) facilitates the selection of one or more sections of the documents, as identified by the document manager (662) for inclusion in a query, and assigns a weight to each selected section. In one embodiment, the weight is a numerical value to identify the importance of matching data in the selected section(s). Accordingly, the search profile as organized by the profile manager (664) provides an outline for the sections of the document collection that are pertinent to the query.
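A search profile as organized by the profile manager (664) can be modeled as a small data structure. The following Python sketch is one possible representation and is not taken from the specification: the class name `SearchProfile`, its fields, and the section names are hypothetical. It assumes that a weight of zero leaves a section out of the scope of the query.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SearchProfile:
    """A titled selection of document sections with a weight per section."""
    title: str
    weights: Dict[str, float] = field(default_factory=dict)

    def set_weight(self, section: str, weight: float) -> None:
        """Select a section for the profile and assign (or change) its weight."""
        self.weights[section] = weight

    def active_sections(self) -> List[str]:
        """Sections that take part in the query, i.e. those with a nonzero weight."""
        return [name for name, weight in self.weights.items() if weight != 0]


profile = SearchProfile(title="Infringement search")
profile.set_weight("claims", 10.0)
profile.set_weight("abstract", 0.0)
print(profile.active_sections())
```

Keeping the title alongside the weights allows a set of such profiles to be prepared in advance and selected by name at query time.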
[0040] A query manager (666) is in communication with the profile manager (664), also provided local to the computer system (602) and in communication with memory (606). The query manager (666) is responsible for selection of at least one search profile with submission of a query to the document collection (642). More specifically, the query manager (666) compares query data with data in the sections of the document collection (642) that are identified in the profile and assigned a weight. The comparison as performed by the query manager (666) yields a compilation of relevant patent documents (646). In one embodiment, the compilation is presented on the visual display (650). Similarly, in one embodiment, the compilation may be retained on storage, either volatile or persistent.
[0041] In one embodiment, the director (660), document manager (662), profile manager (664), and query manager (666) may reside in memory (606) local to the computer system (602). However, the invention is not to be limited to this embodiment. For example, in one embodiment, the director, document manager, profile manager, and query manager (660) - (666) may each reside as hardware tools external to local memory (606), or they may be implemented as a combination of hardware and software. Similarly, in one embodiment, the director and managers (660) - (666) may reside on a remote system in communication with the storage media (640). Accordingly, the director and managers may be implemented as a software tool or a hardware tool to support submission of one or more queries to an electronic patent document collection to yield a compilation of relevant patent documents.
[0042] In one embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc. The invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[0043] Embodiments within the scope of the present invention also include articles of manufacture comprising program storage means having encoded therein program code. Such program storage means can be any available media which can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such program storage means can include RAM, ROM, EEPROM, CD-ROM, or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired program code means and which can be accessed by a general purpose or special purpose computer. Combinations of the above should also be included in the scope of the program storage means.
[0044] The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, random access memory (RAM), read-only memory (ROM), a rigid magnetic disk, and an optical disk. Current examples of optical disks include compact disk - read only (CD-ROM), compact disk - read/write (CD-R/W) and DVD.
[0045] A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
[0046] Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks.
[0047] The software implementation can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system.
Advantages Over the Prior Art
[0048] Each intellectual property document is known in the art to have a defined outline of sections that are required to meet statutory filing requirements. One or more profiles are created to facilitate submission of a query to the document collection. Each profile imparts a weight to one or more of the identified sections in the document. The weight represents the importance of the identified section and adds value to each document in the returned compilation. Not all queries are the same. For example, it is recognized that intellectual property documents in the chemical technologies have a limited number of drawing figures, if any. As such, a query in the chemical technology may de-emphasize the drawing figures, and place a greater emphasis on the written text. Different queries are submitted to the collection to achieve different results. Accordingly, the creation of multiple profiles, with each profile employing a different selection of the identified sections, and imparting different weights to the different selected sections, enables a query submission to be efficiently and effectively processed to yield a focused compilation of document results.
Alternative Embodiments
[0049] It will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without departing from the spirit and scope of the invention. In particular, there are different forms of intellectual property documents, including patents, trademarks, and copyrights. Within the category of patent documents, there is a further breakdown of documents, including issued patents, published patent applications, patent abstracts, and utility model registrations. Some of these documents may contain the same quantity of sections in the same order, and others will have a different quantity of sections and/or a different order. The profiles are independently created based upon sections that are present, and not necessarily the order in which they appear in the underlying document.
[0050] In addition, the electronic document collection has been specifically described pertaining to intellectual property documents, including issued patents and published patent applications, trademark registrations and applications, and copyright registrations and applications. However, the invention should not be limited to these specific categories of electronic documents. In one embodiment, the electronic document collection may include any type of document that has a defined plurality of sections. This would enable the managers to parse the documents into the defined sections, create multiple profiles with associated weights for one or more of the defined sections, and submit a query to the document collection with a selected profile. As noted above, selection of a query profile may be dynamically modified. In one embodiment, modification of the query profile while maintaining the query content may change the documents returned in the compilation as well as the order in which the documents in the compilation are presented. Accordingly, the scope of protection of this invention is limited only by the following claims and their equivalents.

Claims

CLAIMS
What is claimed is:
1. A computer implemented method for searching on an electronic document collection comprising: compiling and indexing a collection of intellectual property documents, each of the documents in the collection having multiple sections; identifying each of the sections of each document in the collection; organizing a search profile for the document collection, wherein the search profile includes a selection of at least one of the identified sections of each document in the compiled collection; within the organized search profile, assigning a weight to each of the identified and selected sections; at query time, submitting a query to the intellectual property document collection, including selecting at least one search profile, and comparing query data with data in each of the document sections of the selected profile in the collection with an assigned weight; and a compilation of relevant documents generated from said query submission, including a match of the submitted query to data in at least one identified profile section having an assigned weight.
2. The method of claim 1, further comprising subdividing at least one identified section in a search profile to at least two subsections and assigning a secondary weight to at least one of the two subsections.
3. The method of claim 2, wherein the at least two subsections are defined based upon a size of the subsections.
4. The method of claim 2, wherein the at least two subsections are hierarchical tiers of the compilation of relevant documents.
5. The method of claim 1, further comprising calculating a score for each document in the compilation of relevant documents, wherein the score is an aggregation of a product of the quantity of matches in each profile section of the selected profile with the weight assigned for that section.
6. The method of claim 1, further comprising dynamically modifying the assigned weight to at least one identified section of each intellectual property document in the collection.
7. The method of claim 6, further comprising employing a graphical user interface as a layer for modifying the assigned weights, wherein the graphical user interface includes a field for each identified section of each intellectual property document and a graphical device for changing a parameter within said field that affects said modifying the assigned weights.
8. The method of claim 7, wherein the graphical device is a slide bar and further comprising moving the slide bar in a positive direction to increase relevancy for the corresponding identified section of the patent document.
9. The method of claim 8, further comprising moving the sliding bar in a negative direction to decrease relevancy for the corresponding identified section of the intellectual property document.
10. The method of claim 9, further comprising adjusting the sliding bar in at least one identified section of the patent document to a zero value for removing the identified selection section from the scope of the query.
11. The method of claim 10, further comprising for an infringement search, adjusting the sliding bar in a claim section toward a maximum setting and adjusting the sliding bar in all other sections toward a minimal setting.
12. The method of claim 1, wherein the assigned weights are static.
13. The method of claim 1, further comprising pre-programming weight profiles for the identified sections of the collection of patent documents based upon a scope of the query to be submitted to the document collection.
14. The method of claim 13, further comprising assigning a title for each pre-programmed weight profile to describe an associated search scope.
15. The method of claim 13, further comprising pre-programming weight profiles for the identified sections of the collection of patent documents based upon a specific technology of the query to be submitted to the document collection.
16. A system comprising: a processor in communication with memory and storage media; a collection of intellectual property documents retained on the storage media, with each of the documents in the collection having multiple sections; a director to compile and index the collection of documents; a document manager in communication with the director, the document manager to identify each section of each document in the collection; a profile manager, in communication with the document manager, the profile manager to organize a search profile for the document collection, wherein the search profile includes a selection of at least one of the identified sections of each document in the compiled collection; the profile manager to assign a weight to each of the identified and selected sections within the organized search profile; and at query time, a query manager to submit a query to the document collection, the query to include selection of at least one search profile and comparison of query data with data in each of the document sections of the selected profile in the collection having an assigned weight, said query resulting in a compilation of relevant documents generated from said query submission and returned from the query manager, with each document having a match of the query to data in at least one identified profile section having an assigned weight.
17. The system of claim 16, further comprising the profile manager to subdivide at least one identified section in at least one search profile to at least two subsections, and to assign a secondary weight to at least one of the two subsections.
18. The system of claim 17, wherein the at least two subsections are defined based upon a size of the subsections.
19. The system of claim 17, wherein the at least two subsections are hierarchical tiers of the compilation of relevant documents.
20. The system of claim 16, further comprising the query manager to calculate a score for each document in the compilation of relevant documents, wherein the score is an aggregation of a product of the quantity of matches in each profile section of the selected profile with the weight assigned for that section.
21. The system of claim 16, further comprising the profile manager to support dynamic modification of the assigned weight to at least one identified section of each intellectual property document in the collection.
22. The system of claim 21, further comprising a graphical user interface as a layer to modify the assigned weights, wherein the graphical user interface includes a field for each identified section of each intellectual property document and a graphical device for changing a parameter within said field that affects said modifying the assigned weights.
23. The system of claim 22, wherein the graphical device is a slide bar, and further comprising moving the sliding bar in a positive direction to increase relevancy for the corresponding identified section of the patent document.
24. The system of claim 23, further comprising movement of the slide bar in a negative direction to decrease relevancy for the corresponding identified section of the patent document.
25. The system of claim 24, further comprising adjustment of the slide bar in at least one identified section of the patent document to a zero value to remove the identified selection section from the scope of the query.
26. The system of claim 25, further comprising for an infringement search, adjustment of the sliding bar in a claim section toward a maximum setting and adjustment of the sliding bar in all other sections toward a minimal setting.
27. The system of claim 16, wherein the assigned weights are static.
28. The system of claim 16, further comprising pre-programmed weight profiles for the identified sections of the collection of patent documents based upon a scope of the query to be submitted to the document collection.
29. The system of claim 28, further comprising assignment of a title for each pre-programmed weight profile to describe an associated search scope.
30. The system of claim 28, further comprising pre-programmed weight profiles for the identified sections of the collection of patent documents based upon a specific technology of the query to be submitted to the document collection.
31. An article configured to search an electronic document collection on computer memory, the article comprising a computer-readable carrier including computer program instructions to perform a query, the instructions comprising: instructions to compile and index a collection of intellectual property documents, each of the documents in the collection having multiple sections; instructions to identify each of the sections of each document in the collection; instructions to organize a search profile for the document collection, wherein the search profile includes a selection of at least one of the identified sections of each document in the compiled collection; instructions to assign a weight to each of the identified and selected sections within the organized search profile; and instructions to submit a query to the intellectual property document collection at query time, including selection of at least one search profile, and comparison of query data with data in each of the document sections of the selected profile in the collection with an assigned weight; and a compilation of relevant documents with each document including a match of the submitted query to data in at least one identified profile section having an assigned weight.
32. The article of claim 31, further comprising instructions to subdivide at least one identified section in a search profile to at least two subsections and to assign a secondary weight to at least one of the two subsections.
33. The article of claim 32, wherein the at least two subsections are defined based upon a size of the subsections.
34. The article of claim 32, wherein the at least two subsections are hierarchical tiers of the compilation of relevant documents.
35. The article of claim 31, further comprising instructions to calculate a score for each document in the compilation of relevant documents, wherein the score is an aggregation of a product of the quantity of matches in each profile section of the selected profile with the weight assigned for that section.
36. The article of claim 31, further comprising instructions to dynamically modify the assigned weight to at least one identified section of each intellectual property document in the collection.
37. The article of claim 36, further comprising instructions to employ a graphical user interface as layer to modify the assigned weights, wherein the graphical user interface includes a field for each identified section of each intellectual property document and a graphical device for changing a parameter within said field that affects said modifying the assigned weights.
38. The article of claim 37, wherein the graphical device is a slide bar, and further comprising moving the slide bar in a positive direction to increase relevancy for the corresponding identified section of the patent document,
39. The article of claim 38, further comprising movement of the slide bar in a negative direction to decrease relevancy for the corresponding identified section of the intellectual property document.
40. The article of claim 39, further comprising adjustment of the slide bar in at least one identified section of the intellectual property document to a zero value to remove the identified selection section from the scope of the query.
41. The article of claim 40, further comprising for an infringement search, adjustment of the slide bar in a claim section toward a maximum setting and adjustment of the slide bar in all other sections toward a minimal setting.
42. The article of claim 31, wherein the assigned weights are static.
43. The article of claim 31, further comprising pre-programmed weight profiles for the identified sections of the collection of intellectual property documents based upon a scope of the query to be submitted to the document collection.
44. The article of claim 43, further comprising instructions to assign a title for each pre-programmed weight profile to describe an associated search scope.
45. The article of claim 43, further comprising pre-programmed weight profiles for the identified sections of the collection of intellectual property documents based upon a specific technology of the query to be submitted to the document collection.
46. A document search apparatus for searching on an electronic document collection, comprising: compiling means for compiling and indexing a collection of documents, each of the documents in the collection having multiple sections; section identifying means for identifying each of the sections of each document in the collection; organizing means for organizing a search profile for the document collection, wherein the search profile includes a selection of at least one of the identified sections of each document in the compiled collection; assigning means for assigning a weight to each of the identified and selected sections within the organized search profile; query means for submitting a query to the intellectual property document collection, including selecting at least one search profile, and comparing query data with data in each of the document sections of the selected profile in the collection with an assigned weight; and results means for receiving a compilation of relevant documents generated from said query submission, including a match of the submitted query to data in at least one identified profile section having an assigned weight.
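Illustrative sketch (not part of the claims): the section-weighted scoring recited in claims 31, 35, 40, 41, 43 and 44 could be approximated in Python as shown below. The section names, the profile titles "general" and "infringement", the numeric weights, the sample documents, and the helper names `tokenize`, `count_matches`, `score_document` and `search` are all hypothetical and chosen only for illustration. Counting a "match" as one occurrence of a query term is an assumption, as is modelling the "minimal setting" of claim 41 as a zero weight.

```python
import re
from collections import Counter

# Hypothetical pre-programmed weight profiles, keyed by a descriptive title.
# A weight of 0 takes a section out of the scope of the query.
PROFILES = {
    "general": {"title": 3, "abstract": 2, "claims": 2, "description": 1},
    "infringement": {"title": 0, "abstract": 0, "claims": 1, "description": 0},
}

# Hypothetical two-document collection, already split into identified sections.
COLLECTION = {
    "D1": {
        "title": "Weighted search",
        "abstract": "Ranking by section.",
        "claims": "A method of indexing documents.",
        "description": "Search search search.",
    },
    "D2": {
        "title": "Indexing",
        "abstract": "Ranking documents.",
        "claims": "A search method comprising a search profile.",
        "description": "Indexing documents.",
    },
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def count_matches(query, section_text):
    """Count the tokens in section_text that equal any query term."""
    terms = set(tokenize(query))
    counts = Counter(tokenize(section_text))
    return sum(counts[term] for term in terms)


def score_document(query, document, profile):
    """Sum, over the profile's sections, of match count times section weight."""
    score = 0.0
    for section, weight in profile.items():
        if weight == 0:
            continue  # zero-weighted sections are outside the query scope
        score += weight * count_matches(query, document.get(section, ""))
    return score


def search(query, collection, profile):
    """Return (doc_id, score) pairs for documents with a positive score, best first."""
    scored = [
        (doc_id, score_document(query, doc, profile))
        for doc_id, doc in collection.items()
    ]
    hits = [pair for pair in scored if pair[1] > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, profile in PROFILES.items():
        print(name, search("search", COLLECTION, profile))
```

Under these assumptions, switching from the "general" profile to the "infringement" profile changes both which documents are returned and their order, because only matches in the claims section contribute to the score.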
PCT/US2009/043174 2009-05-07 2009-05-07 Method, system, and apparatus for searching an electronic document collection WO2010128967A1 (en)

Priority Applications (7)

Application Number Priority Date Filing Date Title
CN2009801599633A CN102483744A (en) 2009-05-07 2009-05-07 Method, system, and apparatus for searching an electronic document collection
NZ596369A NZ596369A (en) 2009-05-07 2009-05-07 Method, system, and apparatus for searching an electronic document collection
PCT/US2009/043174 WO2010128967A1 (en) 2009-05-07 2009-05-07 Method, system, and apparatus for searching an electronic document collection
CA2761713A CA2761713A1 (en) 2009-05-07 2009-05-07 Method, system, and apparatus for searching an electronic document collection
AU2009345822A AU2009345822A1 (en) 2009-05-07 2009-05-07 Method, system, and apparatus for searching an electronic document collection
KR1020117028646A KR101560756B1 (en) 2009-05-07 2009-05-07 Method, system, and apparatus for searching an electronic document collection
EP09789651.8A EP2427830B1 (en) 2009-05-07 2009-05-07 Method, system, and apparatus for searching an electronic document collection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2009/043174 WO2010128967A1 (en) 2009-05-07 2009-05-07 Method, system, and apparatus for searching an electronic document collection

Publications (1)

Publication Number Publication Date
WO2010128967A1 (en) 2010-11-11

Family

ID=41328521

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/043174 WO2010128967A1 (en) 2009-05-07 2009-05-07 Method, system, and apparatus for searching an electronic document collection

Country Status (7)

Country Link
EP (1) EP2427830B1 (en)
KR (1) KR101560756B1 (en)
CN (1) CN102483744A (en)
AU (1) AU2009345822A1 (en)
CA (1) CA2761713A1 (en)
NZ (1) NZ596369A (en)
WO (1) WO2010128967A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2457182A1 (en) * 2009-07-22 2012-05-30 Foundationip, LLC Method, system, and apparatus for delivering query results from an electronic document collection
GB2520936A (en) * 2013-12-03 2015-06-10 Ibm Method and system for performing search queries using and building a block-level index

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104572620B (en) * 2014-12-31 2018-11-23 百度在线网络技术(北京)有限公司 A kind of method and apparatus for showing chapters and sections content
CN106156111B (en) * 2015-04-03 2021-10-19 北京中知智慧科技有限公司 Patent document retrieval method, device and system
KR101762252B1 (en) * 2016-04-08 2017-07-31 (주)윕스 Method and apparatus for supporting idea generation
CN108228648B (en) 2016-12-21 2022-03-15 伊姆西Ip控股有限责任公司 Method and device for creating index

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2369698A (en) * 2000-07-21 2002-06-05 Ford Motor Co Theme-based system and method for classifying patent documents
US7406458B1 (en) * 2002-09-17 2008-07-29 Yahoo! Inc. Generating descriptions of matching resources based on the kind, quality, and relevance of available sources of information about the matching resources

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040039734A1 (en) * 2002-05-14 2004-02-26 Judd Douglass Russell Apparatus and method for region sensitive dynamically configurable document relevance ranking
JP4972358B2 (en) * 2006-07-19 2012-07-11 株式会社リコー Document search apparatus, document search method, document search program, and recording medium.

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TROTMAN ET AL: "Choosing document structure weights", INFORMATION PROCESSING & MANAGEMENT, vol. 41, no. 2, 1 March 2005 (2005-03-01), ELSEVIER, BARKING, GB, pages 243 - 264, XP025281816, ISSN: 0306-4573, [retrieved on 20050301] *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2457182A1 (en) * 2009-07-22 2012-05-30 Foundationip, LLC Method, system, and apparatus for delivering query results from an electronic document collection
EP2457182A4 (en) * 2009-07-22 2014-01-15 Foundationip Llc Method, system, and apparatus for delivering query results from an electronic document collection
GB2520936A (en) * 2013-12-03 2015-06-10 Ibm Method and system for performing search queries using and building a block-level index
US10262056B2 (en) 2013-12-03 2019-04-16 International Business Machines Corporation Method and system for performing search queries using and building a block-level index

Also Published As

Publication number Publication date
AU2009345822A1 (en) 2011-12-01
EP2427830A1 (en) 2012-03-14
CN102483744A (en) 2012-05-30
NZ596369A (en) 2014-02-28
CA2761713A1 (en) 2010-11-11
KR101560756B1 (en) 2015-10-15
EP2427830B1 (en) 2015-06-24
KR20120027285A (en) 2012-03-21

Similar Documents

Publication Publication Date Title
JP5534266B2 (en) Method, system and apparatus for sending query results from electronic document collection
US8364679B2 (en) Method, system, and apparatus for delivering query results from an electronic document collection
US20100287177A1 (en) Method, System, and Apparatus for Searching an Electronic Document Collection
US6654744B2 (en) Method and apparatus for categorizing information, and a computer product
CN108804421B (en) Text similarity analysis method and device, electronic equipment and computer storage medium
US20100287148A1 (en) Method, System, and Apparatus for Targeted Searching of Multi-Sectional Documents within an Electronic Document Collection
EP2427830B1 (en) Method, system, and apparatus for searching an electronic document collection
CN111506727B (en) Text content category acquisition method, apparatus, computer device and storage medium
US20100211569A1 (en) System and Method for Generating Queries
CN106815265A (en) The searching method and device of judgement document
US20110295861A1 (en) Searching using taxonomy
EP2438507A1 (en) Method, system, and apparatus for targeted searching of multi-sectional documents within an electronic document collection
WO2011149454A1 (en) Searching using taxonomy
EP2038771A1 (en) Organising and storing documents
JP3422396B2 (en) Similarity search method based on viewpoint
CN117033561B (en) ESG (electronic service guide) index optimization-based enterprise assessment model generation method and system
KR102279490B1 (en) Apparatus for processing information, method thereof and storage including a software thereof
Huang et al. Hierarchical Location and Topic Based Query Expansion.
Yadav et al. Ontdr: An ontology-based augmented method for document retrieval
JP2005234865A (en) Domain-categorized concept dictionary constructing method and device and program
CN118503304A (en) Data table recall method, device, equipment and storage medium
Barat Design and implementation of content based text retrieval system
WO2013043146A1 (en) Searchable multi-language electronic patent document collection and techniques for searching the same

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase (Ref document number: 200980159963.3; Country of ref document: CN)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 09789651; Country of ref document: EP; Kind code of ref document: A1)
DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase (Ref country code: DE)
WWE Wipo information: entry into national phase (Ref document number: 2761713; Country of ref document: CA)
WWE Wipo information: entry into national phase (Ref document number: 2009345822; Country of ref document: AU; Ref document number: 596369; Country of ref document: NZ)
WWE Wipo information: entry into national phase (Ref document number: 2009789651; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 20117028646; Country of ref document: KR; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2009345822; Country of ref document: AU; Date of ref document: 20090507; Kind code of ref document: A)
WWE Wipo information: entry into national phase (Ref document number: 9025/CHENP/2011; Country of ref document: IN)