WO2001067300A1 - Improved parameter-value databases - Google Patents
Improved parameter-value databases
- Publication number
- WO2001067300A1 (PCT/US2000/005638)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- parameter
- user
- values
- database
- terms
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/954—Navigation, e.g. using categorised browsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2428—Query predicate definition using graphical user interfaces, including menus and forms
Definitions
- This invention relates to the field of wide area data networks and databases that index large numbers of different types of items.
- The Internet is by now the world's largest computer network, interconnecting millions of computers.
- One side effect of this large size is that the vast amount of information available on the Internet is often extremely difficult to access. Similar problems tend to occur on any large network, and the Internet is discussed herein as an example of such a network. Similar problems occur in searching large databases in general, whether or not part of a network.
- In interpersonal speech one can sometimes adequately describe an item using decidedly non-specific language, such as "the clippie thing on the end of the rope", but in computer searching such descriptions are not helpful at all.
- An automated yellow pages type search, for example, can only access items if the exact name is known.
- Keyword searches using existing technology are also exceedingly poor at handling ranges. For example, if one is searching the Internet to buy a Lexus™ automobile between
- the present invention includes improved parameter-value databases and methods of using the same that provide significant benefits to individuals loading data and/or conducting searches.
- the database is used to properly identify overlap between search data and target data where the data sets contain any combination of single values, multiple values, and ranges.
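- The overlap test described above can be illustrated with a minimal sketch. The representation is an assumption made for this example only (a tuple is a low/high range, a list or set is multiple values, anything else is a single value), and the function names are hypothetical rather than taken from the patent.

```python
def as_intervals(datum):
    """Normalize a datum into a list of closed (low, high) intervals.

    Assumed convention for this sketch: a 2-tuple is a range, a list or
    set holds multiple values, and anything else is a single value.
    """
    if isinstance(datum, tuple) and len(datum) == 2:
        low, high = datum
        return [(min(low, high), max(low, high))]
    if isinstance(datum, (list, set, frozenset)):
        return [(value, value) for value in datum]
    return [(datum, datum)]


def overlaps(search, target):
    """Return True if any search interval intersects any target interval."""
    return any(
        s_low <= t_high and t_low <= s_high
        for s_low, s_high in as_intervals(search)
        for t_low, t_high in as_intervals(target)
    )
```

- With this sketch, a price range searched as `(30000, 35000)` overlaps a stored single value `32900`, while a search for the years `[1996, 1997]` does not overlap a stored range `(1998, 2000)`.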
- items are loaded onto the database as sets of parameter-value pairs, with subsets of such pairs being further sub-correlated for various purposes, including establishing display order, chronological order, or coupling groupings of parameter-value pairs.
- users are allowed to add new parameters to the database in such manner that the database develops a user-evolved categorization system.
- users are presented with word or other value lists to assist them in searching the database, and the lists become smaller as filters are applied.
- parameters and/or values are presented in listings for selection by users. This is potentially a huge advantage over the common Internet type search engines which provide no guidance at all in selection of parameters and values.
- the listings may advantageously be presented in alternative alphanumeric and relative historical usage formats.
- At least some of the information is classified using a classification structure having at least three levels. More preferably at least one of the levels contains a large number of classes that span (i.e., are applicable to) a high percentage of classes in the other levels.
- systems and methods described herein can be applied to substantially all products, services, and information.
- contemplated systems and methods can be used to describe every conceivable type of product and service currently listed in consumer or business-to-business telephone yellow page books, as well as all products and services commonly listed only in specialty consumer or industry catalogs.
- the types of information that may be stored are equally universal. It is specifically contemplated, for example, that systems and methods described herein may be utilized to index such diverse types of information as news items, historical facts, book reviews, questionnaires, opinion surveys, case law, and the topics discussed in various chat rooms.
- Figure 1 is a schematic of a preferred classification selection interface.
- Figure 2 is a schematic of a preferred interface for adding data.
- Figure 3A is a schematic of a preferred interface for retrieving data.
- Figures 3B - 3E are examples of preferred "complete record" displays.
- Figure 4 is a preferred parameter selection interface.
- Figures 5A - 5B are examples of usage of a preferred values selection interface.
- Figure 6 is a table showing a preferred three-level classification system.
- Figure 7 is a preferred units selection interface.
- Figure 8 is a preferred interface for accessing stored searches.
- Figure 9 is a preferred database structure for storing and retrieving parameter-value data.
- Figure 10 is an example of a preferred data interface providing information on "polyesters".
- Figure 11 is an example of a preferred data interface providing information on "drug use".
- Figure 12 is an example of a preferred data interface providing information on "Clinton".
- Figure 13 is an example of a preferred data interface providing information on a questionnaire.
- a preferred user interface 100 generally includes an item description section 110, a tree search selector 120, a classification display table 130, record navigational buttons 140, and other navigational buttons 150.
- the item description section 110 is typical of many search engines in that a small space is allowed for one or more search terms, and in some embodiments there may be a button that allows for complex Boolean searching.
- the search here may be different from typical searches, however, in several ways.
- the search term or terms may advantageously be compared first against the classification structure itself, i.e., against the names of the various classes, subclasses and so forth, and then only applied against the values of parameter/values stored in the database if there are no matches in searching the classification structure.
- search terms may be coupled together using synonyms, such that searching for one term may pull up records in which the search term is not present, but a synonym is present. For example, searching for the terms "autos", "auto", "car", "cars", and "automobile" may all trigger a search for "automobiles".
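- A minimal sketch of the two search behaviors described above: synonyms are folded into one canonical term, and the term is compared against class names first and against stored values only if no class name matches. The class names, the stored values, and the function names are hypothetical placeholders.

```python
# Synonym table taken from the example in the text.
SYNONYMS = {
    "autos": "automobiles",
    "auto": "automobiles",
    "car": "automobiles",
    "cars": "automobiles",
    "automobile": "automobiles",
}

# Hypothetical classification names and stored values (value -> classes).
CLASS_NAMES = ["automobiles", "flowers"]
STORED_VALUES = {"lexus": ["automobiles"], "roses": ["flowers"]}


def canonical(term):
    """Lower-case the term and replace it with its canonical synonym, if any."""
    term = term.strip().lower()
    return SYNONYMS.get(term, term)


def find_classes(term):
    """Search class names first; fall back to stored values only on no match."""
    term = canonical(term)
    by_name = [name for name in CLASS_NAMES if term in name]
    if by_name:
        return by_name
    return STORED_VALUES.get(term, [])
```

- Here `find_classes("Car")` resolves through the synonym table to the class name itself, while `find_classes("lexus")` matches no class name and so falls through to the stored values.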
- the tree search section 120 preferably navigates to a pop-up or other series of windows (not shown) which display in sequence the various levels of the classification system. Additional details of preferred classification systems are described in more detail below.
- the classification display table 130 lists classifications from the classification system that match the search terms.
- the system is using, or at least displaying, only three levels of classification.
- Level 1 class names are displayed in the first column 131, level 2 class names in the second column 132, and level 3 class names in the third column 133.
- the fourth column 134 shows relative frequency of entries using the displayed classifications. These relative frequencies are intended to assist the user in selecting an appropriate classification.
- the fifth column 135 provides check boxes 135A for users to select a specific classification from among the listed classifications. In this instance the user has selected the first listed classification, and the system has recorded the selection by changing the check box to a check mark 135B.
- systems may also permit users to select multiple classifications so that a single search could cover many areas. Where multiple classifications are selected, it is likely that the system will limit the number of selections to a relatively small number, perhaps three or four.
- the navigational buttons 140, 150 assist the user in navigating the various displays in the system.
- the first row of navigational buttons 140 is used to view subsets of classifications when more classifications meet the search criteria than can be conveniently displayed at the same time.
- the classification table 130 displays 10 rows at the same time, which number is most likely a function of the resolution and size of the display screen, as well as the size of the window in which the classification table 130 is displayed.
- the various navigational buttons 140 would be used to navigate among the many rows.
- the "21-30" button, for example, would display rows 21 - 30.
- the single-arrow button would display rows 41 - 50, then 51 - 60, and so forth, while the double-arrow button would display the last set of rows.
- the "New Search" button of the second set of navigational buttons 150 provides the user with the ability to clear the screen and start a new search.
- the "Add Item” button changes focus to an interface such as that shown in Figure 2, in which the user can add a new item to the database.
- a preferred data entry interface 200 generally includes a selected classification display 210, a data entry table 220, and navigational buttons 230.
- the classification section 210 echoes a classification chosen in the interface of Figure 1. In the event that multiple classifications are chosen, the classification section 210 may advantageously echo all the chosen classifications.
- the first column 221 in table 220 preferably defaults with five, ten or some other relatively small number of the most frequently used parameters for the chosen classification or classifications. Any of the parameters can be modified from the defaults, either by overtyping the displayed parameter with a new parameter, or by selecting a parameter from a parameters listing such as that shown in Figure 4. The parameters listing can be displayed by clicking on the adjacent symbol or other button shown here in column 222.
- the third column 223 in table 220 records values that the user wants to associate with the selected parameters.
- the user has chosen to associate the value 6000248 with the Patent No. parameter.
- the system is preferably designed so that the user can either just type in the value, or select a value from a values interface such as that shown in Figure 5.
- the values interface can be accessed by clicking on the corresponding symbol or other button in column 224.
- the system will limit the maximum number of parameters correlated with any given classification.
- a typical limit may be 75 or 100 parameters. Users wanting to add a parameter beyond the maximum will likely be given a message to that effect, and asked to use an existing parameter, or try again later. Parameters that have minimal or no usage may be deleted periodically to make room for addition of new parameters.
- the number of values allowed to be associated with a given parameter and classification may be limited in a manner analogous to limitations placed on parameters, although the limit would most likely be set much higher. For example, there may well be thousands of different dollar amounts (values) that can be associated with the parameter Price for an automobile classification. On the other hand, there may only be twenty or thirty values that can be reasonably associated with the parameter Color for the same classification.
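- The parameter cap and periodic pruning described above could be sketched as follows. The class name, the method names, and the pruning threshold are hypothetical; only the limit of 75 comes from the text.

```python
class ParameterRegistry:
    """Tracks parameter usage for one classification, with a size cap."""

    def __init__(self, max_parameters=75):
        self.max_parameters = max_parameters
        self.usage = {}  # parameter name -> number of times used

    def use(self, name):
        """Record one use of a parameter, adding it if there is room."""
        if name not in self.usage:
            if len(self.usage) >= self.max_parameters:
                raise ValueError(
                    "parameter limit reached; use an existing parameter "
                    "or try again later"
                )
            self.usage[name] = 0
        self.usage[name] += 1

    def prune(self, min_usage=2):
        """Delete parameters used fewer than min_usage times; return them."""
        removed = sorted(n for n, count in self.usage.items() if count < min_usage)
        for name in removed:
            del self.usage[name]
        return removed
```

- A values registry could follow the same pattern with a much higher limit, since a parameter such as Price may legitimately carry thousands of values.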
- the fifth column 225 of table 220 displays the measurement units corresponding to the value on the same row.
- the corresponding value is pure text so that the units designation is simply listed as "text".
- the corresponding value is usually a number.
- the user can choose among units of measurement in an auxiliary interface (not shown).
- a user entering the numeral 55291 as a value for the parameter Odometer may be given a choice of Miles or Kilometers as units of measurement.
- the system may advantageously keep track of default units in a Units Key field 926 with respect to individual parameter-classifications, so that the same literal parameter name may have different parameters depending upon the classification.
- the parameter Color may be text data (red, blue, white, etc.) for automobiles, but numeric data (1.79, 2.30, etc.) for hair color. Users may also choose to add multiple parameters to describe the same characteristic in different ways.
- the parameter Height (text) may be used in conjunction with the values Tall or Short, while the parameter Height (in) may be used in conjunction with the values 72.5 or 68.
- the sixth column 226 shows the symbol or other button that can be clicked upon to display a values selection interface such as that shown in Figures 5A - 5D.
- a user may well elect not to use the values selection interface, and may instead simply type in a literal.
- the literal would most likely be compared against existing values as a means of confirming spelling, as a means of suggesting alternatives to the user, and as a prelude to adding the value as a new value in the system.
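- One way to compare a typed literal against existing values is approximate string matching. The sketch below uses Python's standard `difflib` module as a stand-in; the patent does not name a matching technique, and the function name and return format are hypothetical.

```python
import difflib


def check_value(literal, existing_values, cutoff=0.75):
    """Classify a typed literal as an exact match, a near match, or new.

    Returns a (status, candidates) pair where status is "match",
    "suggest", or "new".
    """
    lookup = {value.lower(): value for value in existing_values}
    key = literal.strip().lower()
    if key in lookup:
        return ("match", [lookup[key]])
    close = difflib.get_close_matches(key, list(lookup), n=3, cutoff=cutoff)
    if close:
        return ("suggest", [lookup[c] for c in close])
    return ("new", [])
```

- A "suggest" result would prompt the user with alternatives, while a "new" result would be the prelude to adding the literal as a new value.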
- the seventh and eighth columns 227, 228, respectively, may well be displayed only to those users who have identified themselves as being advanced.
- Column 227 stores numbers that can be used to depict correlations between or among the parameter-value pairs for a given entry. These may also be referred to as "sub-correlations" because they provide further correlations among parameters and values that are already correlated by virtue of relating to the same entry or item.
- One contemplated use is to control the sort order in which the values are displayed.
- the value "Patent" in the first data row would be displayed before the value 6000248 for the patent number because the corresponding sort data are 1 and 2, respectively. Examples of using the column 227 data in this manner are set forth in Figures 3B - 3D.
- a cooking recipe or laboratory procedure for example, generally has multiple steps.
- the various steps could be stored in the database using sequence identifying parameter names such as Step 1, Step 2, Step 3, etc.
- a user could enter all of the steps using only one or a small number of parameters with names such as Steps, Preliminary Steps, or Follow Up Steps. In these latter cases the user could still keep the steps in proper order upon later display by entering the step order as column 227 information.
- the data need not be integers, so that one could have steps 1.0, 1.1, 1.2, 1.21, 1.3, 2.1 and so forth. It should be apparent that the chronology of historical events can be designated in a similar manner.
- Another contemplated use of column 227 is the grouping of subsets of parameter-value pairs. For example, a manufacturer selling an item that has multiple color choices would probably want to enter the differently colored telephones in the database as completely different entries. On the other hand the manufacturer may want to store
- Column 228 stores delimiters that can be used in displaying data in desired formats such as those set forth in Figures 3B - 3D. At present it is preferred to use only simple characters such as asterisks, slashes, hyphens, carriage returns ("next line"), and so forth. In the future, however, it is contemplated to include more sophisticated delimiters, or even codes that are not delimiters per se, but affect the format of the information being displayed.
- the navigational buttons 230 assist the user in navigating the various displays in the system.
- the New Search button transfers the user back to a searching interface such as that shown in Figure 1.
- the Cancel button clears the display of Figure 2, and returns the user back to the previous interface.
- the Record button starts the process of performing validity checks on the data of Figure 2, and if the data clears the validity checks, ultimately causes the data to be loaded onto the system.
- the parameter-value database is also a self-evolving (i.e. user-evolved) database.
- the ability to add new parameters and/or values provides users with a mechanism for delineating characteristics of their items (products, information, or other data) that distinguish such items from those of others. For example, a person selling a car may want to advertise that he or she is the original owner. If Original Owner is not listed as an available parameter, the car owner can add that parameter, along with a likely value of Yes.
- the Original Owner parameter will either "bubble to the top" of the listing (because subsequent users tend to choose that parameter), or remain at the bottom (because subsequent users tend not to choose that parameter).
- the evolution process also takes care of variant spelling of parameters.
- the database may well include both Color and Colour as parameters, but one of them will likely bubble to the top, and one of them will likely sink to the bottom.
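- The "bubble to the top" behavior can be sketched as a usage counter that orders the parameter listing by how often each parameter has been chosen. The class and method names are hypothetical, and alphabetical tie-breaking is an assumption of this sketch.

```python
from collections import Counter


class ParameterListing:
    """Orders parameters by historical usage, most used first."""

    def __init__(self):
        self.counts = Counter()

    def record_use(self, parameter):
        """Count one selection of a parameter by a user."""
        self.counts[parameter] += 1

    def by_usage(self):
        """Most frequently used first; ties broken alphabetically."""
        return [name for name, _ in
                sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))]

    def alphabetical(self):
        """The alternative alphanumeric listing."""
        return sorted(self.counts)
```

- Under this scheme variant spellings such as Color and Colour both remain in the listing, but the less popular one sinks without any editorial intervention.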
- a data retrieval interface 300 generally includes a selected classification display 310, a three-row parameter/filter/units selector 320, a main data display table 330, column navigation slider 340, record navigation buttons 350, and other navigation buttons 360.
- the selected classification display 310 serves substantially the same function as the selected classification display 210 of Figure 2 - it displays the classification or classifications that the user is using, preferably classification(s) selected in an interface such as that of Figure 1.
- the information displayed in the main data display table 330 is dependent upon the listed classification(s), as well as the selected parameters and filters as described below.
- the three-row parameter/filter/units selector 320 defaults to the five, ten or some other number of the most frequently used parameters for the chosen classification, in a manner similar to that of Figure 2.
- the parameters are displayed as column headings whereas in Figure 2 the parameters were displayed as row headings.
- columns and rows in display formats are more or less conceptually interchangeable, and all permutations of these are contemplated as alternative embodiments, as well as matrices in which the cells are non-contiguous horizontally, vertically, or in both directions.
- the first row 321 of the parameter/filter/units selector 320 is labeled with the term "Parameters" at the far left.
- the cells to the right are in pairs, with each pair having the same final letter.
- cells in row 1, columns 2 and 3 form a pair labeled 325A, 326A
- columns 4 and 5 form a pair labeled 325B, 326B
- columns 6 and 7 form a pair labeled 325C, 326C, etc.
- the first cell in the pair shows the parameter name used to define the data in the column of main display table 330 immediately below.
- cell 325A shows the parameter Type, which in this example refers to the type of intellectual property. Examples may be Patent, Copyright, Trademark, Trade Secret, Contract, etc.
- the second cell in each pair displays a symbol or other button that leads the user to a parameter selection interface such as that depicted in Figure 4.
- the second row 322 of the parameter/filter/units selector 320 is labeled with the term "Value" at the far left.
- An alternative and possibly preferable label may be "Filter".
- the cells to the right are again in pairs, with the first cell of each pair either blank or displaying a value used for filtering, and the second cell of each pair displaying a symbol or other button that leads the user to a value selection interface such as that depicted in Figures 5A-5D.
- the second row 322 is intended to receive values used in filtering the corresponding parameters.
- the filters are preferably null at the outset, but can be filled in by the users.
- One of the most important aspects of preferred systems is that they can display and filter on any combination of parameters.
- the user could then filter to a particular make, or perhaps filter on some other parameter.
- In competing systems such as the current version of autobytel.com, a user can only access the database by first selecting a make and a model. That type of very limited access to the database is just not satisfactory in many circumstances.
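- Filtering on any combination of parameters, together with value lists that shrink as filters are applied, might look like the sketch below. The sample items and the function names are hypothetical illustrations, not data from the patent.

```python
# Hypothetical items, each stored as a set of parameter-value pairs.
ITEMS = [
    {"Make": "Lexus", "Model": "LS400", "Color": "white", "Year": 1998},
    {"Make": "Lexus", "Model": "ES300", "Color": "black", "Year": 1997},
    {"Make": "Toyota", "Model": "Camry", "Color": "white", "Year": 1998},
]


def apply_filters(items, filters):
    """Keep items whose values equal every filter; no parameter is mandatory."""
    return [item for item in items
            if all(item.get(param) == value for param, value in filters.items())]


def available_values(items, parameter):
    """List the distinct values still present for a parameter."""
    return sorted({item[parameter] for item in items if parameter in item})
```

- Filtering on Color alone is allowed without first choosing a make or model, and after filtering to black the value list for Make shrinks from two entries to one.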
- the third row 323 of the parameter/filter/units selector 320 is labeled with the term "Units" at the far left.
- the cells to the right are once again in pairs, with the first cell of each pair displaying a units measurement, and the second cell of each pair displaying the symbol or other button that leads the user to a units selection interface, as for example the units selection interface depicted in Figure 7.
- Typical units measurements are "text"
- the units information is employed in displaying the data in the corresponding column of the main data display table 330, with the system making appropriate calculations and rounding. Using this system a user can readily filter for and view Odometer data as miles or kilometers, regardless of how such data is stored.
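- A minimal sketch of unit-aware display and filtering for the Odometer example. It assumes each stored value carries its own units tag ("mi" or "km"); the function names and the tags are hypothetical.

```python
KM_PER_MILE = 1.609344  # exact definition of the international mile


def convert(value, from_units, to_units):
    """Convert a distance between miles ("mi") and kilometers ("km")."""
    if from_units == to_units:
        return value
    if (from_units, to_units) == ("mi", "km"):
        return value * KM_PER_MILE
    if (from_units, to_units) == ("km", "mi"):
        return value / KM_PER_MILE
    raise ValueError(f"no conversion from {from_units} to {to_units}")


def odometer_in(records, units, max_value=None):
    """Show (value, stored_units) records in the chosen units, rounded.

    If max_value is given, it is interpreted in the display units and
    used as an upper-bound filter.
    """
    shown = []
    for value, stored_units in records:
        converted = round(convert(value, stored_units, units))
        if max_value is None or converted <= max_value:
            shown.append(converted)
    return shown
```

- The filter is applied after conversion, so a user filtering in miles sees consistent results whether a given record was stored in miles or kilometers.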
- the "Go Fish" button 328 tells the system to apply the parameters and value filters selected in rows 322 and 323, and produce the results in the main data table 330. Of course, other terms could be substituted for "Go Fish", including "Apply", or "Go", "Build Table", or "Submit", and the button 328 could be located elsewhere in the interface 300.
- the main display table 330 preferably contains between 6 and 30 columns, with many of the columns being positioned off the screen at any given time. This can be accomplished by the usual Windows™ type of horizontal slider 340, or any other suitable manner, such as tab type navigational buttons that would show subsets of the columns. Where more columns are utilized than can conveniently fit on the display screen, the columns with filters can advantageously be moved to the far right. Thus, if a user is employing 10 columns in the main display table 330, and 3 of those columns contain a filter, then those three columns would preferably be automatically moved to columns 8-10, respectively. In so doing the first seven columns would contain the variable data of interest to the user. Such automatic movement of columns, however, is not depicted in the main display table 330 of Figure 3A to better illustrate the preferred filtering techniques.
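- The automatic movement of filtered columns to the far right amounts to a stable partition of the column list. In this sketch the function name is hypothetical and a filter is considered present when its entry is not `None`.

```python
def order_columns(columns, filters):
    """Move columns that carry a filter to the far right.

    Relative order is preserved within the unfiltered group and within
    the filtered group.
    """
    unfiltered = [c for c in columns if filters.get(c) is None]
    filtered = [c for c in columns if filters.get(c) is not None]
    return unfiltered + filtered
```

- Because every row in a filtered column shows the same value, pushing those columns off to the right leaves the visible columns for the data that actually varies.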
- users can move directly to the symbol or other button, or alternatively users can type a parameter or value into the corresponding cell of the table.
- the system verifies the validity of the entry, and provides assistance (such as transfer to the appropriate parameter or value interfaces) if the entry is invalid.
- Values are displayed in the various rows of the main data display table 330 that correspond to items matching the selected classification 310, the parameters selected or defaulted in row 321, and the filters selected in row 322, in short for values matching the search request.
- the table sorts by default from left to right, but can advantageously be resorted by data within any given column by clicking on the corresponding ascending or descending sort buttons 321A, 321B at the head of the desired column.
- the cells of the table can include text, icons, hyperlinks to web pages, files or the like. Where a hyperlink is in the cell, users can preferably jump directly to the linked site. Where a video, audio or other file is indicated, users can preferably open and play or display that file as the case may be. Where an e-mail address is indicated, the system preferably opens an interface to facilitate recording and sending of an e-mail to that address.
- the record navigation buttons 350 and other navigation buttons 360 are intended to be similar to those used on other systems, and are self-explanatory.
- the Select button brings up an interface such as those depicted in Figures 3B - 3E.
- the Spreadsheet button is used to send the data in the main display table 330 to the user as an Excel, or perhaps some other spreadsheet.
- Figure 3B depicts a very simple preferred "full record" display.
- the user has chosen to store only text information.
- the text displayed is " 1998 Lexus LS400, white, gold package, 12,000 miles, perfect inside and out, original owner, Fullerton, CA, Bob 714-555-5555, $32,900, firm". This display can be achieved by inputting the following information into the interface of Figure 2:
- Price firmness firm text 14
- Figure 3C depicts substantially the same information as described above with respect to Figure 3B, but with the addition of a picture, modification of the sort order, and modification of the delimiters. This display can be achieved by inputting the following information into the interface of Figure 2:
- Figure 3D depicts yet another format for a "full record" that may be selected by a user. This display can be achieved by inputting the following information into the interface of Figure 2:
- each delimiter other than a space is followed by a space, and hyphens are preceded by a space.
- Another convention may be that all words are displayed in lower case except for recognized proper names, and the first word after a period.
- Still another contemplated convention is that the user can choose not to designate any sort order at all by leaving the sort field null. In such cases the data may be displayed in a simple multi-column parameter-value listing such as that depicted in Figure 3E.
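- The display conventions above can be sketched as a small rendering routine that orders values by their sort number and appends each value's delimiter. The row layout (value, sort number, delimiter) and the sample rows are assumptions made for illustration, since the input tables for Figures 3B - 3D are not reproduced here.

```python
def render_record(rows):
    """Render (value, sort_number, delimiter) rows as one display string.

    Conventions applied: values appear in ascending sort order; each
    delimiter other than a space is followed by a space; a hyphen is also
    preceded by a space. Rows with a null sort number are skipped here,
    since they would instead go to a plain parameter-value listing.
    """
    ordered = sorted((r for r in rows if r[1] is not None), key=lambda r: r[1])
    parts = []
    for value, _, delimiter in ordered:
        parts.append(str(value))
        if delimiter == "-":
            parts.append(" - ")
        elif delimiter == " ":
            parts.append(" ")
        elif delimiter:
            parts.append(delimiter + " ")
    return "".join(parts).rstrip()
```

- For example, the rows `("white", 4, ",")`, `("1998", 1, " ")`, `("LS400", 3, ",")`, `("Lexus", 2, " ")` and `("gold package", 5, "")` render as `1998 Lexus LS400, white, gold package` regardless of the order in which they were entered.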
- alternative conventions are also contemplated, and innumerable combinations of conventions can be developed by clever programmers.
- the system may also store default display formats for various classifications, which can be used by the system to display data in "classified" type listings when the person or company posting the data (i.e., the data provider) does not specify a custom format.
- data can be "mined" from an original classified type listing format, converted into parameter value pairs, and then redisplayed in a classified type format similar or even identical to that of the original.
- One step in a preferred strategy for such data mining would be to locate the target data using a web crawler, or have the data provider transfer or point the target data to the mining system.
- the target data can then be parsed into terms, preferably using a listing of standard parsing characters (i.e., delimiters such as spaces, slashes, hyphens, commas, and so forth), or using one or more parsing characters designated by the data provider.
- the system would also attempt to determine an appropriate classification for the target data, preferably by locating a classification code in the target data, or having the data provider designate a classification in some other manner. It may also be desirable to apply all of the parsed terms against the database to determine which classification contains the highest number of such terms as values. Once the classification is known, the parsed terms can be applied against the database to determine corresponding parameters, and stored as parameter value pairs. If the system also associates the delimiters from the original format with the respective terms as discussed above with respect to Figures 3B - 3E, the stored data can be redisplayed at a later time with an appearance similar to that found in the original listing.
- the data provider would include the data on the web page in either a memo field or in a simple flat table.
- a typical listing may be "Long stem red roses for sale. Shipped daily from Hawaii. Only $24.99 per dozen", with the listing followed by a picture of the dozen roses.
- This information would preferably be forwarded to the mining system by the data provider with a selected classification designation, perhaps a code such as "A27", or less preferably with a corresponding recognized classification such as "agriculture-flowers-marketplace".
- Preferred classifications and codings are discussed below with respect to Figure 6. If, however, the classification were not known, the system could still obtain a working classification by locating those classifications containing as values the parsed terms "red",
- the system can work backwards by locating a parameter for each, or at least many, of the values. In that manner Color may be determined to be a valid parameter for the value Red, and Plant Name may be determined to be a valid parameter for the value Roses.
- the system could thus automatically resolve the parameter-value pairs from which the original listing could be substantially recreated. For the roses example, the data may resolve as follows:
- an automatic data mining system may take substantially the same steps as a human user would take in separating data into parameter- value pairs, and then loading such pairs onto the system.
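- The mining steps described above (parse the listing into terms, pick the classification that contains the most terms as values, then work backwards from each value to a parameter) can be sketched as follows. This is only an illustrative sketch: the `KNOWN_VALUES` dictionary, the function names, and the delimiter set are assumptions made for the example, not part of the described system.

```python
import re

# Hypothetical miniature of the stored data: classification -> parameter -> known values.
KNOWN_VALUES = {
    "agriculture-flowers-marketplace": {
        "Color": {"red", "white", "yellow"},
        "Plant Name": {"roses", "tulips"},
        "Stem Length": {"long", "short"},
    },
    "automobiles-cars-marketplace": {
        "Color": {"red", "white", "gold"},
        "Make": {"lexus", "chevrolet"},
    },
}

def parse_terms(listing, delimiters=r"[ ,/\-\.]+"):
    """Split a listing into lower-case terms on standard parsing characters."""
    return [t.lower() for t in re.split(delimiters, listing) if t]

def guess_classification(terms):
    """Pick the classification whose stored values contain the most parsed terms."""
    def hits(cls):
        values = set().union(*KNOWN_VALUES[cls].values())
        return sum(1 for t in terms if t in values)
    return max(KNOWN_VALUES, key=hits)

def resolve_pairs(terms, classification):
    """Work backwards from each term to a parameter that lists it as a value."""
    pairs = []
    for term in terms:
        for parameter, values in KNOWN_VALUES[classification].items():
            if term in values:
                pairs.append((parameter, term))
                break
    return pairs

listing = "Long stem red roses for sale. Shipped daily from Hawaii."
terms = parse_terms(listing)
classification = guess_classification(terms)
pairs = resolve_pairs(terms, classification)
```

- Terms that match no stored value (such as "sale" or "daily" here) are simply left unresolved in this sketch; a fuller system would need some way to retain them for redisplay.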
- Another contemplated method of mining data is to provide a web crawler that scans web pages or other documents sequentially, or according to some other logic. In that scenario it is preferred that the web pages or other documents would tag selected information using tags that specify classification, parameters, and values.
- the system could use XML type tags for this purpose, some other tagging format, or even a combination of tagging formats - provided that the system could resolve the data into parameter-value pairs.
- a parameter selection interface 400 generally includes a header 405, a parameters table 410, Alpha/Frequency navigation buttons 452, slider 413, a word entry interface 440, and other navigation buttons 454, 456.
- the parameters table 410 preferably lists some or all of the parameters presently stored for a previously chosen classification 405.
- the first column 411 of table 410 lists the available parameters
- the second column 412 lists the respective frequencies with which the corresponding parameter was historically utilized with respect to the chosen classification 405, while the third "column" 413 is really a slider used to view additional rows.
- One or more parameters can be selected by clicking on the desired row(s).
- the default sort for the parameters is by frequency, although users can access alphabetic sort (including alphanumeric or numeric as appropriate) by clicking on the Alpha/Freq toggle button 452.
- the Alpha/Freq toggle button 452 toggles between Alpha when the list is displayed by frequency, and Freq when the list is displayed by Alpha. Users can also access alphabetic sort, and jump to a particular point in the alphabetic sort by entering a literal in the appropriate word entry interface 440, typing a literal in a parameters box (column 1) of table 200, or typing a literal in the parameters box in column 325A of Figure 3.
- the absolute number of parameters allowed on the system for any given classification may advantageously be limited.
- the classification of "real estate-residential-marketplace" may be limited to 80 parameters, while the classification of "office supplies and equipment-desk items-marketplace" may be limited to only 50 parameters.
- a simple viewing mechanism such as slider 413 will be sufficient to select among the various parameters.
- the Cancel and Select navigation buttons 454 and 456, respectively, are intended to be similar to those used on other systems, and are self-explanatory
- a values selection interface 500 generally includes a header 505, a values table 510, a word entry interface 540, and other navigation buttons 552, 554, 556.
- the values table 510 preferably lists some or all of the values presently being stored for a previously chosen classification and parameter. Here the available values are listed in column 511 with corresponding frequencies in column 512. Slider 513 is used to view additional values. As with the parameters selection interface of Figure 4, the default sort is by frequency, although users can access alphabetic sort (including alphanumeric or numeric as appropriate) by clicking on the Alpha/Freq toggle button 452. The Alpha/Freq toggle button toggles between Alpha when the list is displayed by frequency, and Freq when the list is displayed by Alpha.
- Users can also access alphabetic sort, and jump to a particular point in the alphabetic sort by entering a literal in the appropriate word entry interface 440, typing a literal in a parameters box (column 1) of table 200, or typing a literal in the parameters box in column 325A of Figure 3.
- the absolute number of values allowed on the system for any given classification and parameter may well be limited with respect to text values, but is probably unlimited with respect to numeric values. Thus, although it may be that a simple viewing mechanism such as slider 513 will be sufficient to select among the various values, other viewing mechanisms such as the alphabetic buttons or record number selectors discussed above may be utilized.
- the record navigation buttons 530 and other navigation buttons 540 are intended to be similar to those used on other systems, and are self-explanatory.
- Figure 5B depicts the interface of Figure 5A in which the choices are reduced in number because of filtering of other parameters by the user.
- the user is presumed to have selected a classification having automobile information as data.
- the user elected to see the names of all models previously stored on the system with respect to the selected classification, and to list those models alphabetically.
- the user presumably filtered his data by selecting a value of Chevrolet for Make, and therefore the only values showing for the parameter Model are Chevrolet models.
- One of the enormous benefits of preferred systems and methods set forth herein is that users can access stored data using any combination of filters, and view the data using any combination of parameters. It is entirely possible, for example, for a user to select for all houses within a given price range, a given number of bedrooms range, and at least 100 feet of lakeside frontage without selecting a location at all. That degree of flexibility in searching is hitherto unknown on the Internet and elsewhere.
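- The narrowing of displayed values by earlier filters (only Chevrolet models appear once Make is filtered to Chevrolet) can be illustrated with a short sketch. The `ENTRIES` data and the function names are hypothetical, and exact-match filtering is assumed for simplicity.

```python
from collections import Counter

# Hypothetical entries, each stored as a set of parameter-value pairs.
ENTRIES = [
    {"Make": "Chevrolet", "Model": "Malibu", "Price": 9000},
    {"Make": "Chevrolet", "Model": "Camaro", "Price": 14000},
    {"Make": "Chevrolet", "Model": "Malibu", "Price": 11000},
    {"Make": "Lexus", "Model": "LS400", "Price": 32900},
]

def available_values(entries, parameter, filters=None):
    """Count values of `parameter` among entries that satisfy every filter."""
    filters = filters or {}
    counts = Counter()
    for entry in entries:
        if parameter in entry and all(entry.get(p) == v for p, v in filters.items()):
            counts[entry[parameter]] += 1
    return counts

def sorted_values(counts, by="freq"):
    """Sort by descending frequency (the default) or alphabetically."""
    if by == "alpha":
        return sorted(counts)
    return [v for v, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

models = available_values(ENTRIES, "Model", {"Make": "Chevrolet"})
```

- Because any combination of parameters can be passed as `filters`, the same function also covers the case of filtering on price and bedrooms without selecting a location at all.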
- Figure 7 depicts a preferred interface 700 for selecting an alternative units measurement. Focus to this interface would most likely occur by clicking on the " ⁇ " or other symbol in any of the cells of column 226 of the main data entry table 220 of Figure 2, or on the " ⁇ " or other symbol in any of the cells of row 323, columns 326A, 326B, 326C, etc. of the main data retrieval table 330 of Figure 3.
- the interface 700 preferably includes instruction lines 710, 720, and a units display table 730, and navigation buttons 742 and 744.
- the units display table 730 preferably includes only a few conversion units so that the entire units conversion can be readily downloaded to, and operated on the client side of the network.
- the accuracy column indicates how the system will display the converted data.
- the entry "scientific notation" indicates that the system will display the converted data using scientific notation
- the "+1" entry indicates that the system will add one level of accuracy to that found in the source data
- the "same" entry indicates that the system will use the same level of accuracy as that found in the source data.
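- The three accuracy options can be illustrated with a small conversion helper. This sketch assumes that a "level of accuracy" means the number of digits after the decimal point, which the text does not state; the function names and the three-digit mantissa for scientific notation are likewise arbitrary choices for the example.

```python
def decimal_places(text):
    """Number of digits after the decimal point in a numeric literal."""
    return len(text.split(".")[1]) if "." in text else 0

def convert(source, factor, accuracy="same"):
    """Convert a numeric literal by `factor` and format it per the accuracy rule."""
    value = float(source) * factor
    if accuracy == "scientific notation":
        return f"{value:.3e}"  # mantissa precision is an assumption
    # "same" keeps the source precision; "+1" adds one decimal place (assumed meaning)
    places = decimal_places(source) + (1 if accuracy == "+1" else 0)
    return f"{value:.{places}f}"

MILES_TO_KM = 1.609344  # international mile in kilometers
```

- With these definitions, an odometer reading of "12000" miles would be formatted with zero decimals under "same" and one decimal under "+1".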
- Figure 8 depicts a preferred Saved Searches interface 800, generally comprising a title 810, a main stored searches display table 820, and navigation buttons 831, 832, 833,
- the main stored searches display table 820 includes a first column 821 containing user defined search designations, second, third, and fourth columns 822, 823, and 824 containing Level-1, Level-2, and Level-3 class names, respectively, a fifth column 825 containing a designation of how often the search will be automatically run, a sixth column 826 containing a " ⁇ " symbol or other button for accessing the run schedule choices (weekly, monthly, etc), and a seventh column 827 that contains the last run date of the search.
- There is no slider because in at least preferred embodiments users are limited to a relatively small number of saved searches.
- the fourth row of data is partially completed because the user navigated to this interface 800 from a current search in the classification of Pets &
- a Run Schedule is optional. Searches run automatically by the system according to the Run Schedule only include data that is new to the system since the last run date, and searches that are not null are preferably sent to the user via e-mail in spreadsheet format.
- Previously stored searches are accessed by clicking on the appropriate row, and then clicking on the "Select" button 834.
- Stored searches are deleted by clicking on the appropriate row, and then clicking on the "Delete" button 833.
- the Cancel button 831 is self-explanatory.
- One especially useful feature contemplated herein is the ability to perform multi-value and range searches on target data that itself may include multiple values and ranges.
- data rows 2, 3, and 6 have been highlighted by the user, and all three can be simultaneously selected for inclusion in filtering by clicking on the Select button 556.
- the same principle can be advantageously applied to the parameter selection interface of Figure 4.
- preferred embodiments include the ability to enter a value as a literal range, e.g. "between 15000 and 20000". Such a range may, for example, be included as a filter in a values cell of row 322 of Figure 3, or a values cell of column 223 of Figure 2. Because of the way the data structure is set up, (see e.g. discussion below with respect to Figure 9), both multiple value and multiple range searching can be accomplished merely by altering the database query.
- Quadrant 1 represents the simplest type of search, where both the search data and the target data contain a single value for a parameter of interest. For example, a user searching to buy a dog through the Internet may enter the term "dogs" in a search field of a search engine. In the prior art, a search engine would have typically crawled through millions of web pages looking for tagged keywords, and would likely have stored the keyword "dogs" in an index for pages containing that term. The search engine would then match up the single value "dogs" against the index, and identify to the user the various pages that include the keyword "dogs".
- Quadrant 2 is similar to quadrant 1, except that the web pages contain multiple keyed terms for the same parameter.
- a single web page may refer to both "dogs" and "cats", which are either expressly or inherently correlated with a parameter such as Type of Pet.
- the prior art search engines would identify web pages regardless of whether the user searched for "dogs" or "cats".
- the user specifies multiple search terms that are concatenated either expressly or inherently using Boolean logic connectors such as "or" and "and".
- Quadrant 1, 2, 4, and 5 searches are known to the extent that they do not involve searches in databases that store descriptions of items as parameter-value pairs.
- LEXIS™ and NEXIS™ databases for example, all utilize Quadrant 1, 2, 4, and 5 searches. When applied to self-evolving databases, however, even these simple searches are thought to be novel.
- Quadrant 3, 6, 7, 8, and 9 searches are all thought to be novel with respect to parameter-value type databases because they involve ranges.
- a user searching for a car may enter a value such as $17000, and that value is applied against a web page stating that cars are available for between $15000 and $20000.
- Yahoo!™, Goto™, Lycos™, or any of the other prior art search engines there would be no match because the term $17000 does not match either $15000 or $20000. In embodiments of the present invention, however, there would be a match.
- Quadrants 7 and 8 searches a user may use an interface such as that shown in Figure 3 to enter a value of "between $10000 and $15000" in row 322, column 325B.
- a Quadrant 7 search preferred embodiments of the present system would locate a web page having a value of $13000
- a Quadrant 8 search preferred embodiments of the present system would locate a web page having multiple values of $10000, $12500, and $13000.
- a user may be looking for cars with years after 1997, odometer reading less than 50000 miles, and price less than $15000. These searches would apply the indicated ranges against single or multiple stored values for the respective parameters.
- the Quadrant 9 search is particularly powerful because it can identify items in which the search range overlaps the target range, such as the search range "between $70000 and $80000" matching a target range "$75000 - $1000000". This can be extremely useful, for example, in loading and searching insurance policies and other data.
- an insurance policy may limit policies to individuals having incomes of "$75000 - $1000000", while a person shopping for an insurance policy may list his income as "between $70000 and $80000".
- a toy manufacturer may price a particular toy "between $15.95 and $28.95" depending upon the color. A search for that toy would find a match if the searcher entered a price of " ⁇ $20.00".
- preferred embodiments of the present invention thus include methods of searching a database comprising: storing descriptions of a plurality of different items on the database as sets of parameter-value pairs, in which at least some of the values form a target; providing a search criterion such that at least one of the target and the search criterion comprises a numeric range; and identifying a successful search as occurring when there is an overlap between the search criterion and the target.
- Such methods may involve Quadrant 8 searches in which the overlap includes a portion of the target having multiple values for a particular one of the items, Quadrant 3, 6, or 9 searches in which the overlap includes a portion of the target having values stored as a range, Quadrant 6 searches in which at least a portion of the search criterion includes a collection of discrete values, Quadrant 7, 8, or 9 searches in which at least a portion of the search criterion includes a numeric range of values, and permutations thereof.
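- The overlap test underlying the range-based quadrants can be sketched in a few lines. In this illustration a single value is a plain number, multiple values are a list, and a range is a `(low, high)` tuple treated as inclusive on both ends; these representations and the function names are assumptions made for the example.

```python
def as_ranges(spec):
    """Normalize a single value, a list of values, or a (low, high) tuple to ranges."""
    if isinstance(spec, tuple):
        return [spec]
    if isinstance(spec, list):
        return [(v, v) for v in spec]
    return [(spec, spec)]

def overlaps(search, target):
    """True when any search range shares at least one point with any target range."""
    return any(
        s_lo <= t_hi and t_lo <= s_hi
        for s_lo, s_hi in as_ranges(search)
        for t_lo, t_hi in as_ranges(target)
    )

# Examples drawn from the text:
single_vs_range = overlaps(17000, (15000, 20000))                  # Quadrant 3
range_vs_single = overlaps((10000, 15000), 13000)                  # Quadrant 7
range_vs_multi = overlaps((10000, 15000), [10000, 12500, 13000])   # Quadrant 8
range_vs_range = overlaps((70000, 80000), (75000, 1000000))        # Quadrant 9
```

- Treating every operand as a list of inclusive ranges means one comparison covers all nine quadrants, whereas plain keyword matching of $17000 against the two literals $15000 and $20000 would fail.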
- More preferred embodiments are self-evolving in that they supply a user providing the search criterion with an ability to add new parameters to the database, and still more preferred embodiments guide the user in the addition of the new parameters by displaying historical summary usage information, such as relative historical usage information on a percentage scale for other parameters previously employed in the same classification.
- Level 1 (shown in column 1) includes a relatively small number of classes, Advertising & Marketing, Agriculture, Art, Automobiles, Beauty & Grooming . . . Transportation, Travel, and Weapons.
- Level 2 (shown in column 2) includes varying numbers of classes hierarchically related to corresponding Level 1 classes.
- the exemplary classification system of Figure 6 is typical in that many or even most of the Level 2 classes make sense only with respect to the related Level 1 classes.
- Level 1 class of Agriculture one finds Level 2 classes of Animal Production, Chemicals, Crop Production, Florists, etc.
- These Level 2 classes make sense with respect to Agriculture, but are generally inconsistent with respect to other Level 1 classes such as Art or Automobiles.
- Levels 1 and 2 can also be described as having a superior / inferior relationship, with Level 1 being relatively superior and Level 2 being relatively inferior
- An exemplary Level 3 (shown in column 3) includes 89 classes, many of which are referred to herein as "spanning classes" because they are logically related to many or all of the Level 1 / Level 2 classifications.
- the Level 3 class of Awards could well apply to the Level 1 / Level 2 classification path of Advertising & marketing / Personnel Recruitment. But awards also applies to the Level 1 / Level 2 classification path of Art / Artists.
- the Level 3 class of Industry Information applies to the Level 1 / Level 2 classification paths of Agriculture / International, Automobiles / Trucks, and Beauty & Grooming / Nails.
- An alternative Level 3 (shown in column 4) includes only 47 classes. Astute observers will recognize that some of the classes have been collapsed, and some of the categories have been eliminated entirely. For example, Importing and Exporting are subsumed under the more general class entitled Trade.
- Another alternative Level 3 (shown in column 5) is even more collapsed, including only 28 classes.
- the classes of Consortia and Cooperatives are collapsed into a single class named Companies, and Enthusiasts is subsumed under People. Also, Conventions & Conferences and meetings are merely types of Events, and so have been eliminated.
- classification can be readily summarized for users in code format.
- the classification path of Automobiles / Cars / Marketplace could be coded with only four numeric digits as 527 or 8103, or with a three digit alphanumeric code such as G57 or H29. This is because contemplated classification systems may only have about 10,000 or fewer permutations.
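- Because roughly 10,000 classification paths fit comfortably within the 46,656 combinations of a three-character alphanumeric code, a compact code can be derived from a simple path index. The sketch below assumes a base-36 encoding over digits and upper-case letters; the patent does not specify how codes such as G57 are actually assigned, so this scheme is purely illustrative.

```python
import string

ALPHABET = string.digits + string.ascii_uppercase  # 36 symbols: 0-9 then A-Z

def to_code(index, width=3):
    """Encode a non-negative classification index as a fixed-width base-36 code."""
    if not 0 <= index < len(ALPHABET) ** width:
        raise ValueError("index does not fit in the requested width")
    chars = []
    for _ in range(width):
        index, digit = divmod(index, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))

def from_code(code):
    """Decode a base-36 classification code back to its index."""
    return int(code, 36)
```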
- Individual codes can advantageously be provided to web-site developers for inclusion into their web pages, and used in combination with XML or other tagging system to direct a search engine to automatically apply a desired classification.
- the classification codes could thus be used by millions of unrelated users in a manner analogous to the way real estate agents use the Thomas Guide™ page and grid codes.
- Level 1 class of Automobiles may well include a service-related Level 2 class of Repairs & Maintenance.
- Level 1 / Level 2 classification of Automobiles / Trucks addresses services by inclusion of the service-related Level 3 class of Schools & Training.
- the type of classification systems contemplated herein can readily accommodate opinions, polls, and indeed information of all types.
- the presently described systems and methods address these additional types of data very effectively, in part because in the parameter-value aspect the users themselves decide what parameters relate to what classifications, and what values relate to those parameters.
- Such systems are preferably made to be inherently self-evolving at least in part by providing subsequent users with summary comparison usage information based upon the choices of previous users, and in part by permitting subsequent users to add new classifications, parameters, and values instead of being limited by those previously used by others.
- Summary comparison usage information is preferably communicated to users in the form of listings in which the choices are presented in order of descending usage. In that manner parameters and values that are used more frequently bubble to the top of the list, while parameters and values that are used less frequently sink to the bottom of the list. Very poorly used parameters and values can even be deleted periodically.
- the term "user" is employed herein to mean an end-user of the database, i.e. an ordinary person or business who is either listing an item on the database, or looking for an item, or both.
- users of the database can add new classifications, and/or parameters, and/or values, rather than waiting for a programmer or systems designer to do so. In that manner the aggregate of users largely control the evolution of the database rather than a few programmers or other individuals. That is what makes the database or system self-evolving.
- usage information is employed herein in a very broad sense to include information relating to occurrence, absolute or relative frequency, or any other data that indicates the extent of past or present usage with respect to the various choices. It is contemplated, for example, that the choices for which usage information is displayed would include one or more of item classifications, geographic classifications, parameters, and values. It is also contemplated that the usage information displayed may relate to subsets of choices determined by a user's own previous responses, the term "subset" being employed herein to include proper and improper subsets. Thus, when selecting a minor item classification, the system may display a listing of possible minor item classes determined by the user's selection of major classification, along with relative usage information among the displayed minor classes.
- the item descriptions displayed, and the corresponding usage information would preferably be a function of the major and minor item classes selected.
- the parameters and/or values displayed, and the corresponding usage information would preferably be a function of the item selected, and possibly also of the geographic class(es) selected.
- usage information is not unlimited in scope. Usage information as employed herein is meant to be inherently comparative and summary in nature, so that usage information does not include a user successively performing keyword searches and viewing the numbers of hits independently of one another. Also, the terms "historical usage information" and "usage information" are used herein in slightly different manners. Historical usage information necessarily includes data that has accumulated over time, while usage information may or may not include such data, and may therefore be limited to information gleaned from data currently on the system.
- Usage information can be presented in many ways.
- usage information is shown on a relative frequency scale as an integer from 1 to 100, with the data rows sorted from highest frequency at the top to lowest frequency at the bottom.
- usage information can be displayed by depicting absolute frequency, by depicting occurrence of number of uses or "hits", or even by displaying data or data rows in different colors or using other identifying indicia.
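- A relative frequency listing of the kind described (integers from 1 to 100, sorted from most to least used) might be computed as sketched below. Scaling relative to the most used choice and flooring at 1 are assumptions; the text does not say how the 1 to 100 scale is derived, and the sample counts are invented.

```python
def relative_usage(counts):
    """Scale raw usage counts to integers 1..100 and sort in descending order."""
    if not counts:
        return []
    top = max(counts.values())
    # The most used choice maps to 100; rarely used choices are floored at 1.
    scaled = {name: max(1, round(100 * n / top)) for name, n in counts.items()}
    return sorted(scaled.items(), key=lambda kv: (-kv[1], kv[0]))

listing = relative_usage({"Price": 480, "Color": 240, "Sunroof": 2})
```

- Frequently used parameters therefore appear at the top of `listing`, while a rarely used one such as "Sunroof" sinks to the bottom and becomes a candidate for periodic deletion.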
- indices such as used car guides do provide summary comparison information for various makes, models, and years, that information is based upon factors other than usage information for the database at hand, and is not accompanied by offers for individual cars.
- the self-evolving approach systems and methods described herein can be used for all manner of products and services.
- the self-evolving database concept is readily applied to employment want ads.
- users would add parameters such as nature of employer, location, educational requirements, experience requirements, duties, salary, etc.
- the self-evolving database concept is readily applied to personal advertisements. There, likely parameters include marital status, race, sex, sexual preferences, hobbies, likes and dislikes.
- Still other specialized parameters may be employed to conduct auctions. For example, a user may choose to list the items of interest using the parameters of "last price bid", "last bid date", and "closing date/time". This capability is especially powerful because it allows a user to view information stored on all items of interest, whether such items were listed as fixed price offers, auctions, or whatever.
- a user looking for a particular book, for example, would be presented with a single table showing fixed price offerings from volume retailers such as Amazon.com and BarnesandNoble.com, as well as offerings of smaller companies, individuals selling new and used copies of the book, offerings by auction, and so on. It is especially contemplated that both auction and non-auction (sales, lease, rental, etc) offerings can be displayed in the same table at the same time merely by selecting appropriate parameters.
- entries for items being auctioned are stored as sets of parameter-value pairs, and then displayed to a potential customer in a matrix format in which individual cells contain values from the parameter-value pairs.
- the item(s) being auctioned may be similar or even identical to the item(s) being sold other than by auction, or they may be quite different from one another.
- the parameter-value pairs for the item(s) being auctioned may be stored on the same database as the item(s) being sold other than by auction, in overlapping databases, or in completely independent databases.
- the matrices discussed here, as well as elsewhere in this application may have contiguous or noncontiguous cells, and may include navigational aids that accommodate more columns and/or rows than are viewed at a single time.
- search strategies (which would include the classification, parameters, and values used to obtain a results set), can be stored either locally to a user (perhaps as a cookie), or stored centrally. This would allow a user to develop a search over time, and then run the search again using a keyword or other locator, rather than having to reconstruct the entire search.
- one of the parameters utilized across all item characterizations is likely to be listing date or update date, and searches could be stored that only look for items entered after the last time the search was run.
- the system could store a search strategy, run the strategy periodically, and then e-mail the user who entered the search only upon selecting a non-null results set.
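- The stored-search behavior described above (consider only items entered since the last run, and e-mail the user only for a non-null results set) can be sketched as follows. The dictionary layout, the `matches` predicate, and the `send` callback are hypothetical stand-ins; no real mail API is implied.

```python
from datetime import date

def run_saved_search(search, entries, today, send):
    """Run one stored search over entries loaded since its last run date."""
    new = [
        e for e in entries
        if e["load_date"] > search["last_run"] and search["matches"](e)
    ]
    search["last_run"] = today
    if new:  # only non-null results sets are sent to the user
        send(search["user"], new)
    return new

sent = []  # stands in for an outgoing e-mail queue
search = {
    "user": "bob@example.com",
    "last_run": date(2000, 3, 1),
    "matches": lambda e: e["Price"] < 15000,
}
entries = [
    {"load_date": date(2000, 2, 20), "Price": 12000},  # older than the last run
    {"load_date": date(2000, 3, 5), "Price": 13000},   # new and matching
    {"load_date": date(2000, 3, 6), "Price": 20000},   # new but too expensive
]
results = run_saved_search(search, entries, date(2000, 3, 8),
                           lambda user, rows: sent.append((user, rows)))
```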
- Data structure 900 generally comprises a classification table 910, a parameter table 920, a values table 930, an entries (items) table 940, a parameters-value table 950, and several supporting tables (not shown).
- the number of tables and fields is thought to have been optimized to enhance performance for Microsoft™ SQL Server 7.
- additional fields may be added as well to various tables, and additional tables (including vendor records and so forth) may be added.
- the Classification table 910 includes fields for Class key 911, Level-1 912, Level-2 913, and Level-3 914 class literals, and a field storing a maximum number of parameter-value pairs 915. These fields should all be self-explanatory.
- the Parameters table 920 includes a Parameter Key 921, and a Class Key 922 that relates back to the Class Key 911 of Classification table 910. Although parameter literals are thus stored repeatedly, it is thought that there are sufficiently few parameters that the inefficiency of storage is outweighed by the efficiency in accessing parameter frequencies for specific classes. Synonyms for parameter literals are preferably stored by including the synonym literal in the Parameter Literal field 923, and storing the Parameter Key 921 of the root or base parameter in the Synonym Param Key field 925. That way a search for any parameter among a set of synonyms can identify all Parameter Keys 921 in the set.
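- The synonym mechanism can be illustrated with a small lookup in which each synonym row points at the key of its root parameter. The `PARAMETERS` rows and the function name are hypothetical, and the sketch assumes synonyms point directly at a root rather than forming chains.

```python
# Hypothetical rows of the Parameters table:
# Parameter Key -> (Parameter Literal, Synonym Param Key or None for a root).
PARAMETERS = {
    "P1": ("Price", None),
    "P2": ("Cost", "P1"),
    "P3": ("Asking Price", "P1"),
    "P4": ("Color", None),
}

def synonym_keys(literal):
    """Return every Parameter Key in the synonym set that contains `literal`."""
    for key, (name, root) in PARAMETERS.items():
        if name.lower() == literal.lower():
            base = root or key  # a root parameter is its own base
            return sorted(
                k for k, (_, r) in PARAMETERS.items() if k == base or r == base
            )
    return []
```

- A search on "Cost" thus reaches rows stored under "Price" and "Asking Price" as well, since all three resolve to the same base key.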
- the Parameter Freq field 924 stores the historical occurrence with which this particular parameter and classification combination has been used. The system can then calculate relative frequencies with which different parameters have been used for the same class. Frequency fields 924, 956 are maintained on an ongoing basis in Parameter table 920 and
- the Units Key 926 relates back to a Units table (not shown) that includes a corresponding Units Key, and fields for the literal name of the units (miles, kilometer, etc), a base unit of measurement into which the units can be converted, a conversion factor, and a default rounding factor. Thus miles would likely be converted into kilometers with a conversion accuracy indicator. Alternatively, the result of the conversion can use the same level of accuracy as the data being converted. From inclusion of the Units Key 926 in the Parameters table 920 it should be apparent that each parameter within each class is contemplated to have only a single unit of measurement.
- the Values table 930 is straightforward, containing a Value Key field 931, a Value Literal field 932 and a Synonym Value Key field 933.
- The Synonym Value Key field 933 preferably operates in an analogous manner to the Synonym Param Key field 925.
- it may be highly efficient for the Value Key field 931 to contain a very simple alphanumeric key, such as V99231.
- it may be efficient for the Value Key field 931 to contain the numeral in some standard format to facilitate range searching when the Value Keys are used in Value Key-1 953 and Value Key-2 954 in the Parameter-Value table 950,
- the Entries table 940 is used to correlate sets of parameter-value pairs that represent a particular item.
- a person listing a car for sale may have the car data split up into 20 parameter-value pairs, which would be stored in the parameter-value table 950.
- when the system later needs to collect those 20 parameter-value pairs together, it does so because all of those parameter-value pairs have the same Entry Key 941 in Entry Key field 957.
- Entries table 940 also includes a Vendor Key 942 field that contains a key referring back to a Vendors table (not shown) that contains name, address, phone numbers, billing information, and so forth.
- the Entries table 940 also includes a Load Date field 943 and an Expiration Date field 944 that indicate when the particular entry was originally loaded onto the system, and when it is scheduled to be deleted, respectively. Entries may well be deleted a month after being added unless the vendor providing the data pays a fee to maintain the data on the system, especially for vendors having more than a small number (possibly 5 or 10) of entries.
- the Rate Code field 945 is used to store information on how each particular entry is being billed.
- the Rate Code field 945 may, for example, list a monthly billing charge for each item, with items listed for free having a rate code of zero.
- One contemplated use would be for an insurance company to store doctor information on the public access system, and proprietary or confidential information on their local system. Users behind the firewall of the insurance company could readily access all the parameters and values made available to the public through the public access system, but additionally access the proprietary or confidential information as extensions of the public access data, and all of this can be done using a single interface such as the main display table 330 of Figure 3A. The only difference is that users behind the insurance firewall would potentially have more choices for their parameters and values than would the public access users. In the meantime the public access users would have no idea that the proprietary or confidential information even existed. This scenario could be extended even further, with multiple offices, individuals or companies having their own "extensions" to the same basic data.
- the Access Restriction field 946 and Access Code field 947 provide a person or company loading data with a means of keeping that data away from others.
- One contemplated use is to keep adult materials from under-age users.
- a vendor can load images and other data onto the system, using parameters and values to describe whatever is being offered.
- When a subsequent user accesses that information in a table such as the main display table 330 of Figure 3A, only the text will appear in the cells. If the user then clicks on the Select button 332 to view the full record, he would be presented with an access screen (not shown) corresponding to a code stored in the Access Restriction field
- the access screen would have functionality for weeding out the under-age users, and in this instance preventing them from viewing the videos or images contained in the full record. In other contemplated scenarios, the access screen could have functionality that provides access only to users entering a code that matches the data in the Access Code field
- the Parameter-Value table 950 stores a Parameter-Value (PV) Key 951, which is probably not used anywhere else in the database. It is unlikely that the Parameter-Value table 950 will be sorted by PV Key 951, and instead it may be kept more or less sorted by the Parameter Key 952 that relates back to the Parameter Key 921 of Parameters table 920. Most likely, the Value Key-1 always contains data because it stores a key for either a single value where no range is involved, or for the low value where a range is involved. The Value Key-2 most likely contains data only where a range is involved, and in that case includes a key for the high value of the range. To simplify matters, it is preferred that ranges always be interpreted as being inclusive on both ends.
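The single-value versus range convention can be illustrated with a short sketch. It assumes, for simplicity, that the two value keys have already been resolved to numeric values; the class name `PVRecord`, its field names, and the function `matches` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PVRecord:
    pv_key: int
    parameter_key: int
    value_low: float                    # resolved Value Key-1: single value or low end
    value_high: Optional[float] = None  # resolved Value Key-2: high end, ranges only


def matches(record: PVRecord, query: float) -> bool:
    """Return True if the query equals the single value or falls in the range.

    Ranges are treated as inclusive on both ends, as the text prefers.
    """
    if record.value_high is None:
        return record.value_low == query
    return record.value_low <= query <= record.value_high
```

With this convention a record holding the range 10 to 20 matches a query of exactly 20, while a single-value record of 10 matches only 10.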
- the Correlation field 955 can be used for all sorts of purposes, but preferably to store the sort information discussed above with respect to Figures 3A - 3E.
- the system would first display all Level-1 classes stored in the Classification table 910 of Figure 9. Upon selection of one of the Level-1 classes by the user, the system would search for and display all Level-2 classes subordinate to the selected Level-1 class. The user would then select one of the listed Level-2 classes, and upon such selection the system would search for and display all Level-3 classes subordinate to the selected Level-2 class. It is preferred, however, that all of the Level-3 classes would be spanning classes that are more or less applicable to all of the Level-2 classes, and would be displayed no matter what Level-1 / Level-2 choices the user had previously made.
- the system would search the Level-1, Level-2 and Level-3 fields for matches, and display the selected record set in descending frequency order as in the table 130 of Figure 1.
- the system may well employ a synonyms table (not shown). If no matches were found the system would then search the value field of the Parameter-Value table 950, and work backwards from the selected record set to determine the corresponding classifications to display.
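The level-by-level drill-down can be sketched as a sequence of filtered lookups. This is only an illustration of the non-spanning case: the tuple layout and the function names are assumptions, and apart from "Plastics / Polymers / Polyesters" and "Sports / Olympics / Drug Use" the sample classifications are invented.

```python
# Hypothetical Classification rows: (level_1, level_2, level_3).
CLASSIFICATIONS = [
    ("Plastics", "Polymers", "Polyesters"),
    ("Plastics", "Polymers", "Polyamides"),
    ("Plastics", "Resins", "Epoxies"),
    ("Sports", "Olympics", "Drug Use"),
]


def level1_classes(rows):
    """All distinct Level-1 classes, shown first."""
    return sorted({r[0] for r in rows})


def level2_classes(rows, level1):
    """Level-2 classes subordinate to the selected Level-1 class."""
    return sorted({r[1] for r in rows if r[0] == level1})


def level3_classes(rows, level1, level2):
    """Level-3 classes subordinate to the selected Level-1 / Level-2 pair."""
    return sorted({r[2] for r in rows if r[0] == level1 and r[1] == level2})
```

If Level-3 classes are treated as spanning classes, as the text prefers, `level3_classes` would simply ignore its two selection arguments and return every distinct Level-3 value.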
- the system may advantageously employ a synonyms table (not shown). If no entries are found for the selected classification, the user would be notified of same, and prompted to add an item using a display as in Figure 2. Such an event may actually be a powerful incentive to a user, because that user has a wonderful opportunity to shape the parameters and values of that classification for future users.
- Upon selection of a classification, the system would display a data interface such as the three-row parameter/filter/units selector 320, and the main data display table 330 of Figure 3.
- the data in the parameter/filter/units selector 320 is determined by searching through the Parameter-Value table 950 for records having the selected classification.
- the system calculates relative frequencies for the parameters, and displays the top five to ten parameters (depending on system settings) as defaults in the parameters row 321.
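A minimal sketch of the default-parameter calculation follows. The row layout `(classification, parameter, value)`, the function name `top_parameters`, and the `limit` argument (standing in for the "five to ten" system setting) are assumptions for illustration.

```python
from collections import Counter


def top_parameters(pv_rows, classification, limit=5):
    """Return the most-used parameters for a classification.

    Each result is (parameter, relative frequency in percent), ordered from
    most to least frequently used.
    """
    counts = Counter(p for c, p, _ in pv_rows if c == classification)
    total = sum(counts.values())
    return [(p, 100.0 * n / total) for p, n in counts.most_common(limit)]
```

The returned list could populate the parameters row directly, with the user free to override any column afterwards.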
- the user can modify those parameters or add new parameters in other columns.
- the system then fills in the corresponding data in the main data display table 330, which process does not require another search since the relevant record set was already located.
- selecting parameters for the various columns of the parameter/filter/units selector 320, and displaying the parameters selection interface of Figure 4 does not require any further trips to the database since all parameters for the selected classification were already located.
- the system will need to search the Parameter-Value table 950 to determine what values have been used for the selected parameters within the selected classification. Once that information is obtained, the system calculates the relative percentages on the fly, and displays the information in the values portion of the main display table 330 of Figure 3, and as needed the values table 510 of Figure 5.
- the system displays five to ten, or other number of the top frequency parameters for the selected classification, and then waits for the user to enter corresponding values, or change the selection of parameters. Values entered by the user are verified against the Parameter-Value table 950 for the selected classification and parameter, or alternatively values are chosen from the values listing of
- the systems and methods described herein are applicable to storing and retrieving information regarding services, as well as other forms of information.
- the global applicability derives in part because the self-evolving concept allows users to be creative in establishing parameters and values.
- a user may store scientific articles or other information relating to such articles using parameters such as "article name", "author", "abstract", "keywords", and "full text".
- Historical facts can be stored as articles using parameters such as "type of information" (where the value is "historical facts"), "subject matter" (where the value may be "Egyptian Rome"), and "persons involved" (where the value may be "Nero").
- an embodiment provides a classification interface 1000 similar to the interface 100 of Figure 1.
- a user entered the word "polymers" in a data entry field 1010, and the system provided a listing 1020 of all classifications including the term "polymers".
- the listing 1020 has five columns 1021 - 1025 and a vertical slider 1026.
- the system presents frequency information that assists the user in choosing among the various classifications.
- Some of these classifications may deal with offers to buy or sell polymers, but some of the classifications may also deal with miscellaneous information, including scientific articles, historical facts, and so forth.
- the classification system itself can be self-evolving, allowing anyone to enter any classification he wants, where the classifications most commonly used bubble to the top, and those infrequently used sink to the bottom.
- a user has selected a classification 1110 of Plastics / Polymers / Polyesters, either by a keyword path or a tree search, and the system responded by displaying a table 1120 containing information related to the classification.
- the table 1120 has six columns 1121 - 1126, and a vertical slider 1128. By including a horizontal slider more rows can be utilized than can be visualized on the display at any given time.
- Each column contains cells having values for a given parameter, which is named in the first row 1131. If the current user wants to limit the column to records in which the chosen parameter matches a particular value or range of values, he can do so by entering the value or range of values in the second row 1132 of the corresponding column. In Figure 11, no values were selected.
- the data rows 1133 - 1139 can contain data cells, with each row corresponding to a different entry.
- the final row 1150 can be used to add a new item.
- Advertising graphics 1140 may be included as well.
- At least some of the columns are preferably dictated by the user.
- the user chose to include "Type of Story" in the second column 1122, "Company Involved" in the third column 1123, and so forth.
- the second row of each column can be used as a drop-down window to select parameters, where the user is guided by information relating to usage of these parameters by previous users.
- the user did not have to select the parameters shown here, and may well have had several dozen parameters to choose from.
- the parameter choice is limited to parameters previously employed with respect to the classification chosen.
- Sorting may be enforced by the user, or may occur from left to right, or in some other manner.
- the highest level sort may be determined by the data in the first column, the next highest level sort being determined by the data in the second column, and so forth.
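The left-to-right sort precedence can be sketched with a composite sort key. The function name `sort_rows`, the dictionary-per-row representation, and the sample column names are assumptions for illustration.

```python
def sort_rows(rows, columns):
    """Sort rows by the displayed columns, leftmost column first.

    `columns` lists the parameter names in display order, so the first
    column is the highest-level sort, the second breaks ties, and so on.
    """
    return sorted(rows, key=lambda row: tuple(row[c] for c in columns))
```

For example, sorting on `["type", "company"]` orders rows by type and, within equal types, by company; reordering the columns changes the sort precedence accordingly.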
- the system may be configured so that a user can adjust the widths of the columns, and possibly even the number of columns.
- Although each record is shown on a single line in Figure 11, it is contemplated that a user could choose a multiple line format for one or more rows. If multiple line rows are not allowed, then a user could view the entire cell in an overlying window, perhaps by double-clicking on the cell.
- Data entry is contemplated to generally follow the teachings set forth with respect to the marketplace indices.
- a person entering a record i.e. a story line or item
- an idea not described with respect to the marketplace indices is that a user could couple his record to an existing record. In that manner, a subsequent user could find an item of interest, and then click on some portion of the related row, such as the cell in the first column of the row, to bring up all stories that were identified as being related.
- Figure 12 is a sample of a display screen similar to Figure 11. Again the user has selected a classification 1210 of Plastics / Polymers / Polyesters, either by a keyword path or a tree search, and the system responded by displaying a table 1220 containing information related to the classification.
- the table 1220 has four columns 1221 - 1224, and a vertical slider 1225. By including a horizontal slider more rows can be utilized than can be visualized on the display at any given time.
- Each column contains cells having values for a given parameter, which is named in the first row 1231. If the current user wants to limit the column to records in which the chosen parameter matches a particular value or range of values, he can do so by entering the value or range of values in the second row 1232 of the corresponding column.
- the data rows 1233 - 1237 can contain data cells, with each row corresponding to a different entry.
- the final row 1250 can be used to add a new item.
- Advertising graphics 1240 may be included as well. In this instance, the user chose a Sports/Olympics/Drug Use classification. Also, the user limited the items listed to those items in which the Type of
- the system can operate in a manner substantially similar to that described herein for marketplace information, and news and information.
- In Figure 13, for example, a user entered the name, Clinton, and was presented with a listing of classifications relating to people's opinions regarding Clinton.
- the user checked off the third row of data, relating to Hillary's run for a senate office seat.
- the user has chosen to list questions according to frequency. No limitation (in row 2) as to the value of the frequency was selected, so that even questions that garnered minimal interest are included. Other columns selected list numbers of "yes" responses, numbers of "no" responses, total numbers of responses, dollar amounts, and average dollar amounts. Examples of possible other columns which were not selected are standard deviation, and other statistical functions.
- a user could have a text column, which might seek adjectives to describe a particular venture. In data row 3, for example, a question was listed that asks for an adjective. The user chose the fourth column to list adjectives, and in the pull down of the values (not shown) may have received a listing of what adjectives were used by what percentage of respondents. Alternatively, the user could have entered a particular adjective as a value designation in the second row, and then chosen the other columns to obtain data with respect to respondents choosing that adjective. More complex tables can also be provided that statistically analyze specific responses to a given question.
- the system can keep track of identifiers for individual users, so that their opinions about literally hundreds of various things can be accumulated over the years.
- news and information can also be applied to opinions.
- Historical facts may be readily stored as event/outcome pairings. For example, a user in the field of chemistry may choose to store the results of repeated experiments using a given protocol in a single record. Illustrative parameters and values are listed below in a format similar to that of Figure 2:
- the system can readily be configured to perform statistical calculations (averages, standard deviation, etc) on data within a single entry, and on data across multiple entries - and produce corresponding graphs as desired.
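The within-entry and across-entry calculations can be sketched with Python's standard `statistics` module. The function names and the representation of an entry as a list of numeric results are assumptions for illustration; sample standard deviation is used here, which needs at least two values.

```python
from statistics import mean, stdev


def entry_stats(values):
    """Average and sample standard deviation of the results in one entry."""
    return {"mean": mean(values), "stdev": stdev(values)}


def pooled_stats(entries):
    """The same statistics computed over the results of several entries."""
    pooled = [v for values in entries for v in values]
    return entry_stats(pooled)
```

The resulting figures could be written back into a specialty parameter or handed to a plotting routine, as the surrounding text suggests.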
- the statistics and/or graphs can be stored in specialty parameters, or displayed in separate windows.
- One particularly useful embodiment is for the insurance company to store treatment information in parameters such as diagnosis, treatment, and results.
- the system could then automatically calculate the percentages with respect to doctors in a given hospital, geographic region, or specialty, or those utilizing a given treatment for a given diagnosis. Such information would be presented automatically in a values type interface such as that depicted in Figure 5B.
- a user would select a diagnosis and view the relative percentages of the treatments for that particular diagnosis.
- the user could select a diagnosis and a treatment, and view the relative percentages of the corresponding treatment results.
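The two look-ups described above (treatments for a diagnosis, then results for a diagnosis and treatment) can be sketched with one filtered percentage function. The record layout, the function name `percentages`, and any sample data used with it are hypothetical.

```python
from collections import Counter


def percentages(records, target, **filters):
    """Relative percentages of `target` values among records matching filters.

    `records` is a list of dicts of parameter-value pairs; `filters` gives
    the parameter values already selected by the user.
    """
    selected = [
        r for r in records
        if all(r.get(k) == v for k, v in filters.items())
    ]
    counts = Counter(r[target] for r in selected)
    total = sum(counts.values())
    return {value: 100.0 * n / total for value, n in counts.items()}
```

A call such as `percentages(records, "treatment", diagnosis="fracture")` gives the treatment breakdown, and adding `treatment="cast"` with `target="result"` narrows it to the outcomes of that treatment.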
- Still other parameters covering signs and symptoms could be added so that the database would evolve over time to be an incredibly useful statistical resource.
- analogous information could be stored and retrieved in many fields, including herbs and alternative medicine, or even car repair. It would be very interesting, for example, for a user to compare what repairs tend to be performed by a particular service station with repairs performed by the industry in general, or by repair shops in a local area.
- Other parameters may, for example, be useful in providing auction information. It is contemplated, for example, to provide parameters such as Auction Opening Date, Auction Closing Date, Auction Closing Time, No of Bids, Most Recent Bid, and Bidding History.
- the Bidding History parameter would most likely be a specialty parameter that engaged a program to produce the history.
- These parameters can be displayed along with any other selected parameters, so that a single table can contain both auction and "for sale" information. Still further, those skilled in the art will recognize that the same table can also contain any other information desired by the viewer. In looking for a book, for example, a viewer can see not only prices, delivery and other information from many different vendors, but also auctions, rentals, prices of used copies, and so forth. In short, the viewer gets to see all the information that he wants to see, and none of the information that he doesn't want to see.
- the self-evolving type of parameter-value database can be utilized to provide an index to case law.
- the LEXIS™ system does not presently have a sophisticated key indexing system such as that found on the Westlaw™ system.
- a given user may well find that the Westlaw™ key system does not characterize the cases in a manner especially useful to that particular user.
- users could categorize cases in any manner that they choose, and then record their own summaries or other interpretations of cases of interest to them using parameter-value pairs.
- LEXIS™ or Westlaw™ could keep track of such categorizations and parameter-value pairs as a service for the users, but then also make available to all users the categorizations and parameter-value pairs used by others. In that manner categorizations and summaries of cases in each particular field and subfield would eventually evolve to reflect what the users have stored for their own benefit. This would allow LEXIS™ to develop its own key-type system without doing much of anything. In the hands of Westlaw™, systems according to the present invention could be used to supplement the existing key system.
- the contemplated user developed categorization systems can be considered as a method comprising: providing an interface through which a user can categorize a document indexed on the database using at least one parameter-value pair; providing the user with a first listing that displays a set of parameters previously used by others in categorizing other documents; providing the user with a second listing that displays a set of values previously used by others in categorizing the other documents; and allowing the user to add a new value to the set of values such that subsequent users will have access to the new value.
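The method recited above can be sketched as a small in-memory index. This is one possible reading under stated assumptions, not the claimed implementation: the class name `CategoryIndex`, its method names, and the use of usage counts to order the listings are all hypothetical.

```python
from collections import Counter, defaultdict


class CategoryIndex:
    """Tracks parameter-value pairs that users attach to documents."""

    def __init__(self):
        self.parameter_usage = Counter()
        self.value_usage = defaultdict(Counter)
        self.documents = defaultdict(list)

    def categorize(self, doc_id, parameter, value):
        """Attach a pair to a document; new parameters and values are allowed."""
        self.documents[doc_id].append((parameter, value))
        self.parameter_usage[parameter] += 1
        self.value_usage[parameter][value] += 1

    def parameter_listing(self):
        """First listing: parameters used by earlier users, most used first."""
        return self.parameter_usage.most_common()

    def value_listing(self, parameter):
        """Second listing: values previously used for a parameter."""
        return self.value_usage.get(parameter, Counter()).most_common()
```

Because `categorize` accepts any parameter or value, a value added by one user appears in `value_listing` for every subsequent user, which is the self-evolving behavior the method describes.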
- the development or "evolution" of the categorization may be either entirely or only partially dependent upon actions of the users.
- the step of providing the user with a first listing includes displaying historical usage information for individual members of the displayed parameters.
- the step of providing the user with a second listing includes displaying historical usage information for individual members of the displayed values.
- Still another aspect of preferred embodiments further comprises allowing the user to add a new parameter to the set of parameters such that subsequent users will have access to the new parameter.
- Yet another aspect of preferred embodiments further comprises storing the at least one parameter-value pair in a storage system distal to the user, and providing the user with access to the at least one parameter-value pair at a future date through a network.
- Another potentially valuable feature of preferred embodiments of the present invention is that they can provide values lists that are automatically shortened as the user narrows in on his search.
- Matthew Bender™ markets a CD ROM based database product for accessing intellectual property case law.
- system users are assisted in formulating their queries by access to a "word wheel" listing all of the indexed keywords on the system.
- the word wheel is advantageous in many ways, such as preventing mis-spellings of keywords on the part of the searcher, and identifying alternative word spellings and even mis-spellings in the case law.
- the keywords in existing word wheels are all jumbled together rather than being separated according to some sort of usage classification.
- Another problem with existing word wheels is that the keywords displayed are always the entire keyword listing rather than just those keywords located in a previously selected subset of records. This wastes time, since it does a user precious little good to add a keyword to his search strategy if the keyword is not present in any of his documents. Still another problem with existing word wheels is that they either do not show the frequency of the keywords in the documents, or the frequencies are not updated as the search proceeds to narrower and narrower record sets.
- the values listing for the parameter Opinion Date would include only dates, and not the names of parties. This solves the jumbling problem alluded to above.
- the third problem of depicting frequencies would also be solved.
- Preferred embodiments of the present invention automatically display values with their corresponding frequencies of use.
- the frequencies are shown in relative frequency format as a percentage from 0 to 100, but they could readily (and even more simply) be shown as the raw occurrence frequency. Still further, however they are displayed, the frequencies would automatically be updated as filtering occurs.
- these advantages can be considered to result from any method of searching a database having a plurality of records, in which the method comprises: deriving a plurality of terms from the plurality of records; displaying to a user at least some of the terms in a first interface; selecting a search term from the plurality of terms; using the search term to derive a subset of records from the plurality of records; using the search term to derive a subset of terms from the plurality of terms; and displaying to the user at least some of the subset of terms in a second interface.
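The search method recited above can be sketched in a few lines. The representation of the database as a mapping from record identifier to a list of terms, and the function names `term_frequencies` and `refine`, are assumptions made for the example.

```python
from collections import Counter


def term_frequencies(records):
    """Derive the terms of a record set and count the records containing each."""
    counts = Counter()
    for terms in records.values():
        counts.update(set(terms))
    return counts


def refine(records, search_term):
    """Apply a search term and return the record subset with its term subset.

    Only terms that occur in the remaining records are returned, with
    frequencies recomputed for the narrower record set.
    """
    subset = {rid: terms for rid, terms in records.items() if search_term in terms}
    return subset, term_frequencies(subset)
```

The first interface would display `term_frequencies(records)`; after the user picks a term, the second interface would display the counts returned by `refine`, so terms absent from the remaining records drop out of the list and the frequencies track the filtering.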
- the terms would often be ordinary text, or perhaps alphanumeric, and could advantageously be listed in either alphanumeric or frequency order.
- the interfaces would likely comprise one or more windows on a computer operated display screen, although as technology progresses the interfaces may be completely or primarily audial.
- parsing and/or tagging may be used to separate out individual terms (values) from the documents, and however derived the terms may advantageously be stored as values of parameter-value pairs.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Remote Sensing (AREA)
- Radar, Positioning & Navigation (AREA)
- Human Computer Interaction (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2000/005638 WO2001067300A1 (en) | 2000-03-03 | 2000-03-03 | Improved parameter-value databases |
AU2000233946A AU2000233946A1 (en) | 2000-03-03 | 2000-03-03 | Improved parameter-value databases |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2000/005638 WO2001067300A1 (en) | 2000-03-03 | 2000-03-03 | Improved parameter-value databases |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2001067300A1 | 2001-09-13 |
Family
ID=21741116
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2000/005638 WO2001067300A1 (en) | Improved parameter-value databases | 2000-03-03 | 2000-03-03 |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2000233946A1 (fr) |
WO (1) | WO2001067300A1 (fr) |
-
2000
- 2000-03-03 AU AU2000233946A patent/AU2000233946A1/en not_active Abandoned
- 2000-03-03 WO PCT/US2000/005638 patent/WO2001067300A1/fr active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5632009A (en) * | 1993-09-17 | 1997-05-20 | Xerox Corporation | Method and system for producing a table image showing indirect data representations |
US5802361A (en) * | 1994-09-30 | 1998-09-01 | Apple Computer, Inc. | Method and system for searching graphic images and videos |
US5745899A (en) * | 1996-08-09 | 1998-04-28 | Digital Equipment Corporation | Method for indexing information of a database |
US5802525A (en) * | 1996-11-26 | 1998-09-01 | International Business Machines Corporation | Two-dimensional affine-invariant hashing defined over any two-dimensional convex domain and producing uniformly-distributed hash keys |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6633903B1 (en) * | 2000-03-23 | 2003-10-14 | Monkeymedia, Inc. | Method and article of manufacture for seamless integrated searching |
US7739357B2 (en) | 2000-03-23 | 2010-06-15 | Eric Justin Gould | System, method, and article of manufacture for seamless integrated searching |
US7653704B2 (en) | 2000-03-23 | 2010-01-26 | Gould Eric J | System, method, and article of manufacture for seamless integrated searching |
US7062483B2 (en) | 2000-05-18 | 2006-06-13 | Endeca Technologies, Inc. | Hierarchical data-driven search and navigation system and method for information retrieval |
WO2003027901A1 (en) * | 2001-09-21 | 2003-04-03 | Endeca Technologies, Inc. | Scalable hierarchical data-driven navigation system and method for information retrieval |
WO2003027902A1 (en) * | 2001-09-21 | 2003-04-03 | Endeca Technologies, Inc. | Hierarchical data-driven search and navigation system and method for information retrieval |
WO2004057453A2 (en) * | 2002-12-20 | 2004-07-08 | Sap Aktiengesellschaft | Scrolling through data in a graphical user interface |
WO2004057453A3 (en) * | 2002-12-20 | 2004-12-29 | Sap Ag | Scrolling through data in a graphical user interface |
WO2004059526A3 (en) * | 2002-12-30 | 2004-09-23 | Richard Wiedemann | Information management system |
WO2004059525A3 (en) * | 2002-12-30 | 2004-09-10 | Richard Wiedemann | Information management system |
WO2004059525A2 (en) * | 2002-12-30 | 2004-07-15 | Richard Wiedemann | Information management system |
WO2004059526A2 (en) * | 2002-12-30 | 2004-07-15 | Richard Wiedemann | Information management system |
CN100403314C (zh) * | 2006-04-19 | 2008-07-16 | 华为技术有限公司 | 一种数据查询方法 |
Also Published As
Publication number | Publication date |
---|---|
AU2000233946A1 (en) | 2001-09-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1. Designated state(s): AE AL AM AT AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ CZ DE DE DK DK DM EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
| | REG | Reference to national code | Ref country code: DE. Ref legal event code: 8642 |
| | 122 | EP: PCT application non-entry in European phase | |