WO2000026766A1 - Database system with restricted keyword list and bidirectional keyword translation - Google Patents

Database system with restricted keyword list and bidirectional keyword translation

Info

Publication number
WO2000026766A1
WO2000026766A1 PCT/US1998/023492 US9823492W WO0026766A1 WO 2000026766 A1 WO2000026766 A1 WO 2000026766A1 US 9823492 W US9823492 W US 9823492W WO 0026766 A1 WO0026766 A1 WO 0026766A1
Authority
WO
WIPO (PCT)
Prior art keywords
keyword
user
list
restricted
keywords
Prior art date
Application number
PCT/US1998/023492
Other languages
English (en)
Inventor
Sullivan Walter, Iii
Carlos D. Aponte
Ivan K. Saltz
Original Assignee
Sullivan Walter, Iii
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sullivan Walter, Iii
Priority to PCT/US1998/023492 (WO2000026766A1)
Priority to AU13805/99A (AU1380599A)
Publication of WO2000026766A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2452 Query translation
    • G06F16/24522 Translation of natural language queries to structured queries
    • G06F16/242 Query formulation
    • G06F16/2428 Query predicate definition using graphical user interfaces, including menus and forms
    • G06F16/243 Natural language query formulation

Definitions

  • the present invention relates to database query systems.
  • it relates to database query systems which reduce sensitivity to keyword selection by automatically translating keywords when data is inserted into the database and automatically translating keywords when a query is made to the database.
  • For many industries, a database query system can be put to a variety of uses, including billing, sales, personnel records, medical and legal data, parts, inventory, etc.
  • Early database systems typically required highly skilled personnel both to create original data entries and to construct queries to search the database.
  • the training level and language background of the personnel using database systems became increasingly unpredictable.
  • Prior art search systems have tried to increase the number of matches by allowing a user to enter a partial keyword.
  • This type of search is called stem searching. For example, the stem "bath" would return all records with bath, baths, bathroom, etc.
  • Another common prior art technique is to use a thesaurus to generate multiple queries for each keyword that the user enters. A user desiring to locate an apartment in a real estate system may enter the keyword "apartment", but the system will search for apartment, apartments, condominium, condominiums, condo, flat, flats, quarters, duplex, studio, etc. The result is multiple searches for a single query, which makes the database inefficient. While earlier database applications may have had untrained users on the query side, they would usually have users with some training, even if minimal, on the data input side. Now, however, systems are available in which data is subject to poor keyword selection from both directions: data input and data query. For example, a number of database systems have been developed to provide information related to the buying, selling or renting of particular types of property. Real estate listing services, for instance, have cropped up on the Internet which allow anyone to write their own "ad".
  • This type of database query system i.e., systems which are used to sell or trade property
  • the reason for this is that users of these systems who may be experts on the particular types of property they trade in, may be inexperienced with computers or even computer illiterate. Their lack of computer skill results in a poor choice of keywords because they do not understand the implications of a particular keyword selection.
  • the problems related to keyword selection are magnified by this type of system, because the same users who enter poorly formed keywords are often permitted to enter data into the database for use by other users.
  • the problem is compounded by users, who may or may not be experienced with computers, but who have no knowledge of the jargon used in the subject of the database. Of course, this problem is even further complicated when users that speak or use different dialects or languages attempt to enter or query the database.
  • a real estate listing system is most often used by individuals in a relatively confined geographic area. Using colloquial expressions, they may be unconcerned with other users who may have different jargon or language. Therefore, the problems caused by the use of dialectal keywords in a system with a geographically confined audience are minimal.
  • systems similar to a real estate listing system such as those used on an international or global basis to trade property or commodities, are more prone to errors due to keyword selection.
  • High quality or luxury items such as yachts, aircraft, exotic cars, businesses or luxury estate properties are marketed over wide, even global, markets. Wide geographic markets not only create problems due to multilingualism in the user population, but also create problems due to regional dialect or slang variations in a single language. It would be desirable to have a database system capable of use over wide areas or multilingual areas which could insulate users from keyword selection problems caused by language differences.
  • airplanes have substantial values which tend to limit their sale in a local market.
  • the most effective way to market commodities such as airplanes is by reaching an international market through a globally accessible database system in which information can be exchanged between buyers and sellers, and their agents.
  • a listing database for yachts may result in a yacht owner in Hawaii listing a boat located in Baja, Mexico with a yacht broker in San Diego, California.
  • a buyer in Switzerland may use a broker in Turkey to search for a desired yacht type.
  • the opportunity for selecting dialectal, slang, local jargon, obscure or ineffective keywords in such a situation greatly increases the chances that the yacht will not be found when the buyer's broker searches the database. Therefore, it would be desirable to have a system which could insulate all of the parties from missed matches due to data entry and query keyword differences, as well as insulating the system from performance problems which result from inefficiently searching the database with an excessive number of keywords.
  • Prior computer systems directed to these markets have primarily used textual descriptions which are exposed to all of the keyword problems discussed above.
  • prior systems have not provided a structured database which would more easily define data entries in terms of the unique and myriad structural and equipment combinations available to complex properties such as yachts or airplanes. It would be desirable to have a structured database in which the data entry and query processes could be dynamically altered to accommodate variances in equipment descriptions.
  • the prior art has failed to provide a system which reduces keyword related errors by making both the data entry and the data query applications independent of the keywords used to search the database.
  • the prior art has not provided a system which accepts any keyword entered and dynamically selects and substitutes keywords from a restricted list of keywords for data entry in the database such that a uniform searchable field is provided for query and then substitutes keywords from the restricted list of keywords for the data query such that the search locates the desired keyword even though the data entry and the data query portions of database access use different keywords.
  • the prior art has not provided a dialect independent query system that allows complete sentences to be created and retrieved, in a grammatically correct form, utilizing a dynamic and configurable list of keywords.
  • the present invention solves the foregoing problems by providing a restricted keyword list and bidirectional keyword translation (e.g., data entry or search) for use with a database (a database can be any information storage system).
  • New entries received from a user of the system are input to a keyword translator.
  • a restricted keyword list is accessed by the keyword translator which compares the user-entered input with a restricted list of acceptable keywords and acceptable synonyms. If the input is on the list, it is used. If the input is not on the list, but it is a synonym for a keyword on the list, then the keyword is substituted before storing the information.
  • a query keyword is input to the keyword translator.
  • the restricted keyword list is accessed by the keyword translator which compares the query keyword with a list of acceptable keywords and their synonyms. If the query keyword is on the list, it is used. If the query keyword is not on the list, but it is a synonym for a keyword on the list, then the keyword is substituted.
  • the keyword translator is used when the database is being added to or updated and also when the database is being queried by a user, but these activities do not have to occur at the same time or in the same device or system.
  • the bidirectional keyword substitution (i.e., using keyword substitution when necessary in the data input and data query phases of database use) makes the database independent of the keywords on both the input and query sides of the system.
  • the invention uses selected keywords from the keyword list to ensure that in the process of accessing the database, database entries are not missed due to the use of dialectal or obscure keywords on either the data entry or the data query phases of database use.
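  • The bidirectional substitution summarized above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patent's implementation: the function names (translate, enter_listing, query), the listing identifiers, and the list contents are assumptions chosen for the example, and a plain Python list stands in for the database.

```python
# Hypothetical restricted keyword list and synonym list; the entries are
# illustrative and are not taken from the patent's tables.
RESTRICTED_KEYWORDS = {"sofa"}
SYNONYMS = {"couch": "sofa", "settee": "sofa"}

database = []  # stands in for the database; holds (listing_id, keyword) pairs


def translate(word):
    """Return the restricted keyword for `word`, or None if it is unknown."""
    key = word.strip().lower()
    if key in RESTRICTED_KEYWORDS:
        return key            # already on the restricted list: use it as-is
    return SYNONYMS.get(key)  # a synonym: substitute the related keyword


def enter_listing(listing_id, feature_word):
    """Data entry side: translate before storing."""
    keyword = translate(feature_word)
    if keyword is None:
        raise ValueError(f"no restricted keyword for {feature_word!r}")
    database.append((listing_id, keyword))


def query(feature_word):
    """Query side: translate with the same lists before searching."""
    keyword = translate(feature_word)
    if keyword is None:
        return []
    return [listing_id for listing_id, kw in database if kw == keyword]


enter_listing("yacht-1", "settee")
enter_listing("yacht-2", "Sofa")
print(query("couch"))  # both listings match through the keyword "sofa"
```

  • Because both sides pass through the same translate step, the stored field and the search term always agree, and only one search is issued per query keyword.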
  • Figure 1 is a block diagram illustrating the bi-directional keyword substitution system used by the invention.
  • Figure 2A is a block diagram illustrating the preferred embodiment of the restricted keyword list, the synonym list and their relationship to the underlying property item.
  • Figure 2B is an example of the embodiment of figure 2A which shows a specific keyword "Sofa" and its synonyms.
  • Figure 3 is a diagram showing the layout of the data entry screen used in the preferred embodiment.
  • Figures 4A and 4B comprise a flow diagram which illustrates the realtime process of updating the phrase display.
  • Figure 5A illustrates the data entry process in prior art systems.
  • Figure 5B illustrates the data entry process in the preferred embodiment.
  • Figure 5C illustrates the data query process in prior art systems.
  • Figure 5D illustrates the data query process in the preferred embodiment.
  • database can include any form of data storage, including relational databases, flat files, file management systems, object oriented databases, or any data storage which contains discrete fields of information which a user would search for.
  • the terms "dialect" or "dialectal" shall mean any formally recognized foreign language, dialect, colloquialism, local expression, slang or jargon (i.e., technical terms which are normally used in a particular profession or discipline).
  • hit indicates that a match has been made between the search keyword and a word in the database.
  • listing means any organized collection of specifications, facts, and opinions used to describe a commodity or property for the purpose of sales or rent.
  • the use of a specific property type, such as homes, aircraft, etc. is for illustrative purposes only. The features and advantages of the invention can be applied to any subject matter which is suitable for a database.
  • this figure shows a bidirectional keyword substitution system 100 used by the invention.
  • This embodiment is intended for use as an illustrative example to show the benefits of a bidirectional keyword translator.
  • "bidirectional" is defined as the translation of a keyword received from a user, when necessary, to a keyword on a restricted list of keywords both when inputting data records into the database, and when selecting keywords for searching the database. More detailed descriptions of the features and advantages of the system will be discussed in regard to the figures, below.
  • Database 114 is used for storage of relational database information.
  • the database 114 can be stored in a user's station (e.g., a personal computer) or in a central location (e.g., a server) that can be remotely accessed by a user.
  • the data stored in database 114 can be any useful information which would lend itself to searching for particular data items.
  • a property listing system for boats will be used to show the features and advantages of the system 100.
  • information related to any type of property or inventory can be used.
  • the particular arrangement of data in the database 114 will vary to suit the nature of the property information stored in the database 114.
  • a database controller 112 (hereinafter controller 112), is used to add, delete, or update data items in database 114 when listings are added or deleted, or when information related to the property changes.
  • the database controller can be implemented as a data processing apparatus (e.g., a microcomputer) programmed to perform as required for this invention; or, it can be implemented as a system of apparatuses such as two or more computers in a client/server configuration.
  • data entry input 104 hereinafter data input 104
  • Data input 104 can be any suitable device or method used by computers to enter data or commands, such as a keyboard, a mouse, voice recognition equipment, etc., all of which are well known in the art. In prior art systems, data entries were directly made into the database 114, via a data input device 104 and controller 112.
  • the database controller 112 can be implemented with, for example, any commercially available computer, micro controller or microprocessor.
  • a disadvantage of directly storing searchable keyword data into database 114 is that a poor choice of keywords by a particular user will make the data records input by that user difficult to locate by subsequent users making database searches. In the area of property listing systems, this problem is exacerbated in several ways.
  • database systems are widely available for access by individuals having little or no skill with computers. An unskilled user will typically have no appreciation for the effect of inputting a particular keyword on a subsequent search. Likewise, an unskilled user may not find the data being sought because of an equally poor choice of keywords when performing a search of a database. Due to the increasing number of non-computer professionals who enter and/or search data in relational databases, the ability to successfully locate all of the pertinent records has steadily decreased.
  • An alternative embodiment provides language translators which allow data to be entered in one language, stored in the database in a second language, and queried in a third language.
  • the language translator may be an independent function or may be an integral part of the keyword translator.
  • multiple languages can be incorporated into the same database and selected by user preference or automatically selected when a keyword unique to a given language is entered.
  • a principal advantage of database system 100 is the avoidance of errors due to poor keyword selection by a user (e.g., a person buying or selling an object property such as a yacht, or a dealer working with a buyer or seller) in both the data input and data searching operations. Errors are avoided by converting, when appropriate, a keyword received from a user into a keyword which is a member of a limited set of keywords (i.e., a restricted keyword).
  • a keyword entered by a user into the data input 104 is routed to a translator 102. Prior to entry of data into the database, the translator 102 compares the received user-entered keyword with restricted keywords in a restricted keyword list 106 (hereinafter keyword list 106).
  • Synonym list 108 contains a list of keywords which have the same meaning as related words on the keyword list. For example, one may use the keyword "head" in regard to a bathroom on a boat. However, several other words may be used to describe a head, such as bathroom, lavatory, convenience, John, toilet, commode, etc. In prior art systems, entry of obscure keywords could hinder any subsequent search of the data record.
  • the benefit of the synonym list 108 is that it has a list of words, such as those discussed above in regard to the keyword "head", which would be equivalents.
  • the equivalent words in the synonym list 108 are related to a single related keyword in the keyword list 106. If the entered keyword is not on the keyword list 106, but it is on the synonym list 108, then the related keyword on the keyword list 106 is selected by the translator 102 and input to the controller 112 which in turn inputs the related keyword to the database 114 instead of the entered keyword.
  • a benefit of restricting the keywords entered into the database during the data entry operation is that such restriction provides a uniform database 114 resulting in improved and more consistent search results.
  • the keyword list 106 and synonym list 108 can also contain plural forms of each word. Plural forms may be stored as rules or spelled out with each itemcode 206 (shown in figure 2A) and synonym 208 (shown in figure 2A). The plural form will be found by the keyword translator 102 if the user enters the plural form of the word when describing the object 202 to be entered into the database 114. The keyword translator 102 will substitute the singular form of the itemcode 206 before submitting the records to the database controller 112 for insertion into the database 114. For example, in a luxury real estate listing service, a user may describe a property as having 2 aviaries. The item will be stored in the database as quantity 2, itemcode aviary. Then a user building a query may request a property that has a birdhouse. The keyword translator 102, using the synonym list 108, will substitute the word aviary before submitting the query to the database controller 112 and the appropriate properties will be found.
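  • The aviary example can be walked through with a short Python sketch. It assumes the plural forms are spelled out in a lookup table (the text also allows storing them as rules); the names PLURALS, to_itemcode, add_item and find, and the listing identifier, are hypothetical.

```python
# Illustrative lists only; names and contents are assumptions for this sketch.
ITEMCODES = {"aviary"}
SYNONYMS = {"birdhouse": "aviary"}
# Plural forms spelled out explicitly rather than derived by rule.
PLURALS = {"aviaries": "aviary", "birdhouses": "birdhouse"}

records = []  # stands in for database 114


def to_itemcode(word):
    """Map a singular or plural keyword or synonym to its singular itemcode."""
    key = word.strip().lower()
    key = PLURALS.get(key, key)  # plural form -> singular form
    if key in ITEMCODES:
        return key
    return SYNONYMS.get(key)     # None when the word is not recognized


def add_item(listing_id, quantity, word):
    code = to_itemcode(word)
    if code is None:
        raise ValueError(f"unrecognized item {word!r}")
    records.append({"listing": listing_id, "quantity": quantity, "itemcode": code})


def find(word):
    code = to_itemcode(word)
    return [r["listing"] for r in records if code is not None and r["itemcode"] == code]


add_item("estate-7", 2, "aviaries")  # stored as quantity 2, itemcode "aviary"
print(records[0])
print(find("birdhouse"))             # the synonym resolves to "aviary"
```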
  • the user can be queried for more information via conventional means (such as a pop-up dialog box in a graphical system asking for another possibility) such that an acceptable keyword can be determined.
  • the keywords are presented to the user in a list from which he can select them by mouse click.
  • there is a Find button 312 which can help the user find an acceptable keyword.
  • the operation of button 312 is more fully described in Table 4.
  • the invention may be implemented in other ways, though. For example, the user may input the desired keyword into a data entry field and the keyword and synonym lists are subsequently searched via well-known conventional means. If an acceptable match is not then found, the user can be prompted with a request to try a different word.
  • a user makes database queries via the query input 110.
  • the query input 110 can be any suitable device or method used by computers or other data processing apparatus to enter data or commands, such as a keyboard, a mouse, a touch screen or voice recognition equipment, etc., all of which are well known in the art.
  • Query keywords are routed to the translator 102 which searches the keyword list 106 and synonym list 108 in the same manner as the input keyword was searched above with respect to the data entry operation.
  • a principal advantage of the instant invention is that it ensures that the controller 112 uses the same keywords for both the data entry and the data search operations. As a result, the errors and omissions which are caused by the use of obscure or mismatched keywords are substantially or completely eliminated.
  • Another advantage of the invention is that it allows the use of local expressions and colloquialisms in listings. For example, an agent in France may insist on using the word "settee" in his listings because he feels it sounds better than "couch," or some other term.
  • the use of the restricted keyword searching, as discussed above, will allow the listing agent to use keywords which are appropriate for the agent's location, while allowing an agent in a distant location such as California to search the database 114 using a synonym such as "couch" and still find the French agent's listing.
  • the data entry input 104 and the query input 110 are shown as separate devices.
  • When data is input by a user in one location and accessed by users in remote locations, then the devices will typically be separate. However, they can be a single device.
  • a single keyboard in one computer could be made to accomplish both functions.
  • either device can be any apparatus capable of receiving data, such as a touch panel screen, a light pen and screen, voice recognition equipment or a mouse and screen combination.
  • the keyword translator 102 can be a single unit or there can be a keyword translator 102 dedicated to the data entry input 104 and a second keyword translator 102 dedicated to the query input 110.
  • the query output 116 can be a conventional display, a voice synthesizer system, or other suitable device.
  • the data entry input 104, the query input 110, and the query output 116 can also all be implemented by a single touch panel display. Of course, in a large system, multiple data entry inputs 104 and query inputs 110 can be used.
  • the controller 112 can also be implemented using a number of known database search engines. Likewise, controller 112 can be implemented as hardware or as software.
  • Figure 2A illustrates the preferred embodiment of structures for the keyword list 106 and synonym list 108 and their relationship to the underlying object 202.
  • Each object 202 in a database would be the listing for the underlying property (for example, a boat for sale or rent, etc.).
  • each object 202 would normally have many features such as a galley, engine number and type, communications equipment, etc.
  • Each feature would have an entry in the item table 204 (from 1 to n) which contains a description. The description might be couch, sofa, settee, etc.
  • the description entry would have both a pointer 212 to the synonym in xref table 208 that the user prefers, and a pointer 210 to the restricted keyword in itemcode 206.
  • the itemcode 206 would contain a restricted keyword (for example, the term "sofa") from the restricted keyword list 106 discussed above in regard to figure 1.
  • the itemcode 206 will have the same keyword stored in it. There will be one itemcode 206 for each restricted keyword used for searching purposes. While the preferred embodiment uses a table containing multiple itemcodes 206, only one entry in the table is shown for ease of illustration.
  • the xref table 208 will also point to the itemcode 206 entry. Each xref entry in the xref table 208 will contain a synonym for the restricted keyword stored in the itemcode 206 and a pointer back to the itemcode 206.
  • Many itemcodes 206 have several xref 208 entries, each containing a synonym. These are the synonyms that the user is allowed to enter into a description field in the item table 204. These are also the synonyms used for a query operation. For example, the object 202 contains "couch" in the description field of the item table 204 used in the listing. In addition, it has a pointer in the item table 204 to the restricted keyword "sofa" in the itemcode 206. An xref table 208 will be set up if there is at least one synonym for a restricted keyword. For each synonym, there will be a pointer in its associated xref table 208 entry which points back to the itemcode.
  • the xref table 208 entries may contain the terms "couch", "settee", etc. Therefore, the user (e.g., seller) can have "couch" in the description field, the person querying the system (e.g., buyer) can use the synonym "settee", and the controller 112 will use the term "sofa" in the itemcode 206 for the actual database search.
  • the database search is independent of the keywords used by both the data entry operation and the data query operation.
  • the xref table 208 contains any synonyms for the restricted keyword stored in the itemcode 206. For example, “settee” and “couch” may be listed in the xref 1 and xref 2 fields respectively. Each entry in the xref table 208 would have a pointer back to the restricted keyword in the itemcode 206.
  • the itemcodes 206 are checked against the entered keyword. If found, the restricted keyword stored in the itemcode 206 is used to search the database 114. If not found, the xref table 208 is checked. If a match is found in the xref table 208, then the restricted keyword in the itemcode 206 is obtained via the xref table 208 pointer.
  • the system allows users on both sides of the system to have flexible use of synonyms for search terms and for descriptions during data entry while ensuring that the actual searching is done with keywords stored in the itemcode 206 that are standardized for each particular feature in an object 202.
  • the item table 204 contains several other fields that elaborate on the features. For example, each entry in the item table would have quantity, size, unit of measure, manufacturer, model, description, etc.
  • the types of additional data would be customized to suit a particular type of property (i.e., yachts, aircraft, real estate, etc.). In any case, these additional data are called keyword modifiers.
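  • One way to picture the structures of figure 2A is as linked records. The Python sketch below is an assumption-laden illustration: the class names, field names, and the resolve and search helpers are hypothetical, object references stand in for the pointers 210 and 212, and only one keyword modifier (quantity) is shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ItemCode:           # itemcode 206: holds one restricted keyword
    keyword: str


@dataclass
class Xref:               # xref table 208 entry: a synonym plus a pointer back
    synonym: str
    itemcode: ItemCode


@dataclass
class Item:               # item table 204 entry for one feature
    description: str      # the word the listing user prefers
    itemcode: ItemCode    # plays the role of pointer 210
    xref: Optional[Xref]  # plays the role of pointer 212; None if no synonym used
    quantity: int = 1     # one example of a keyword modifier


@dataclass
class ListingObject:      # object 202: the listing for the underlying property
    name: str
    items: List[Item] = field(default_factory=list)


def resolve(word, itemcodes, xrefs):
    """Check the itemcodes first, then the xref entries, as the text describes."""
    for code in itemcodes:
        if code.keyword == word:
            return code
    for xref in xrefs:
        if xref.synonym == word:
            return xref.itemcode  # follow the pointer back to the itemcode
    return None


def search(word, objects, itemcodes, xrefs):
    code = resolve(word, itemcodes, xrefs)
    if code is None:
        return []
    return [o.name for o in objects if any(i.itemcode is code for i in o.items)]


sofa = ItemCode("sofa")
xrefs = [Xref("settee", sofa), Xref("couch", sofa)]
boat = ListingObject("boat-1", [Item("couch", sofa, xrefs[1])])
print(search("settee", [boat], [sofa], xrefs))  # the seller wrote "couch"
```

  • The listing keeps the seller's preferred word for display, while matching is done only on the itemcode that both the description and the query word resolve to.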
  • Figure 2B is an example of the embodiment of figure 2A with specific keywords in the itemcode 206 and synonym lists 208.
  • a restricted keyword "sofa" is stored in itemcode 206.
  • Synonyms for the word "sofa", such as "couch", "settee", etc., are stored in the xref table 208.
  • the item table 204 has an entry for each feature on the boat.
  • the item table 204 entry includes a pointer 212 to the xref table 208 entry with the word selected by the user and a second pointer 210 which points to the restricted keyword that is used by the controller 112.
  • the entries in the xref table 208 are selected by a limited group of expert users who attempt to include any appropriate synonym.
  • Figure 3 shows a preferred embodiment of a data entry screen 300 used with the system discussed above in regard to figures 1 and 2.
  • the upper left quadrant of the screen has a drop down list of categories 302.
  • Each itemcode 206 has a category associated with it which is stored with the itemcode 206 record. This category is used to narrow the list of possible items from which the user chooses. For example, in the case of homes, the categories may include rooms, accommodations, furnishings, entertainment, equipment, kitchen items, landscaping, etc. If the user selects accommodations, then items related to accommodations, such as closets, beds, and pillows can be entered, but not unrelated items such as lawn mowers or swimming pools. If the user selects furnishings in the category menu, then items such as chairs and cushions can be added, but not garage or satellite dish.
  • the structured nature of data entry using categories reduces the number of items that a user must address for a particular feature and reduces the amount of time required to access the database.
  • the system's visual presentation of categorized lists of qualified items acts as a memory jogger and helps the user identify the appropriate context in which to use an item.
  • the categories may include accommodations, deck, sails and rigging, or mechanical equipment. If the user selects accommodations, then items related to accommodations, such as closet, v-berth, and pillows can be entered, but not unrelated items such as generator or fishing chair. If the user selects deck in the category menu, then items such as life vests and cushions can be added, but not stove or mainsail.
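  • The category filter can be sketched as a lookup from itemcode to category. In this Python sketch the table name, the function name items_for, and the category assignments for mainsail and generator are assumptions made for illustration; the text only states which items do not belong under accommodations or deck.

```python
# Each itemcode carries one category. The mapping below is illustrative.
ITEMCODE_CATEGORY = {
    "closet": "accommodations",
    "v-berth": "accommodations",
    "pillow": "accommodations",
    "life vest": "deck",
    "cushion": "deck",
    "mainsail": "sails and rigging",       # assumed category
    "generator": "mechanical equipment",   # assumed category
}


def items_for(category):
    """Return the items offered for a category, in alphabetical order."""
    return sorted(
        item for item, cat in ITEMCODE_CATEGORY.items() if cat == category
    )


print(items_for("accommodations"))
print(items_for("deck"))
```

  • Selecting a category therefore narrows the item list to a handful of qualified entries instead of the full itemcode table.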
  • the lower left quadrant of the data entry screen 300 contains the item list 304.
  • the displayed item list 304 contains both the items that the user may add and the items that the user has already added. They are distinguished from each other by font and/or color.
  • the item list 304 is organized in alphabetical order by item. It is further organized by sub-items for some, but not all items.
  • the structure of the itemcode table 206 allows the software to group some alternative items together under a heading. In this case, the item is not an itemcode, but a heading for the sub-items which themselves are itemcodes.
  • the heading item cannot be added to the description of the object 202, but the sub-items can be added.
  • the sub-items related to a particular item are indented under their respective item heading. In the preferred embodiment, the user may double-click on an item heading to alternately display and hide the sub-items under the item heading.
  • In addition to the items and sub-items displayed in the item list 304, the itemcodes 206 that a user has added to the object description may be indented under the item name and displayed in a different color or font, etc. to denote those additions. The user selects the addition with the data entry mouse to modify it or delete it.
  • The data entry fields in the lower right quadrant 306 are keyword modifiers, which are used to more fully describe the particular item.
  • The keyword modifiers in this example are quantity, size, unit of measure, manufacturer, model, and post description.
  • Real-time updating of the phrase builder box 308 is used. During the real-time updates, the phrase builder box 308 is dynamically redisplayed each time a character in a keyword modifier is entered, deleted, or otherwise edited.
  • Table 1 illustrates the system as it could be used for boats. As an example, Table 1 shows the effect of modifying the fishing chair entry.
  • The system will store and understand that the keyword (or itemcode 206) is a fishing chair. Another user who searches the database for a boat with a fishing chair will find this entry quickly, regardless of how elaborate the textual description has become.
  • A feature such as a television set would also have related data such as the manufacturer, model number, cable readiness, etc.
  • Properties such as avionics for aircraft would list features such as the make and model of the transponder, etc.
  • One advantage of using the structured form of data entry discussed above in regard to Figure 3 is that it helps guide an inexperienced or semi-skilled user, or users from different countries or users with different languages, to enter a restricted set of keywords into an abstract inventory while simultaneously building a textual representation of the listing entry in the local language.
  • This advantage allows the phrase builder box 308 to present a natural language description of the property by automatically constructing the phrase from the individual data items in the property record to form a complete sentence from the individual keyword and its modifiers. When the property is displayed or printed, all of the automatically generated natural language phrases are used together to construct a natural language document describing the property.
  • An additional feature of the preferred embodiment is the ability to switch the item description to the plural form when the property contains more than one of the item.
  • For example, if the user enters a Quantity of 2, the description is dynamically adjusted to the plural form.
  • The quantity may also be indicated by a Many-flag.
  • A house, for example, may have a number of cable TV outlets that is unknown but known to be more than one.
  • The user selects the Many-flag by setting the Many-flag toggle box 316 instead of directly indicating a numerical quantity.
  • The Many-flag indicates a non-specific plural quantity. When the Many-flag is set, the software will display and print the plural form of the itemcode.
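The singular/plural rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DISPLAY_FORMS` table, the `display_form` function, and the choice to print the number in front of the item are hypothetical and are not taken from the patent.

```python
from typing import Optional

# Hypothetical table of (singular, plural) display forms keyed by itemcode.
DISPLAY_FORMS = {
    "fishing chair": ("fishing chair", "fishing chairs"),
    "cable TV outlet": ("cable TV outlet", "cable TV outlets"),
}


def display_form(itemcode: str,
                 quantity: Optional[int] = None,
                 many_flag: bool = False) -> str:
    """Return the text shown for an item, in singular or plural form."""
    singular, plural = DISPLAY_FORMS[itemcode]
    # Plural applies for a quantity above one or when the Many-flag is set.
    use_plural = many_flag or (quantity is not None and quantity > 1)
    noun = plural if use_plural else singular
    # Assumption: a known quantity is printed in front of the item; the
    # Many-flag stands for an unspecified plural, so no number is shown.
    if quantity is not None and not many_flag:
        return f"{quantity} {noun}"
    return noun
```

With these assumptions, a quantity of 2 yields "2 fishing chairs", while setting the Many-flag on the cable TV outlet item yields "cable TV outlets" with no number.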
  • Table 2 shows the implementation of the item sensitive help database table with some examples as they might apply to a boat listing system.
  • A down arrow will appear next to the model keyword modifier entry box and next to the post description entry box in display quadrant 306. If the user then clicks on the down arrow to the right of the model box, a dialog will display the word "gimbaled" with a check box in front of it. If the user checks the check box with the mouse and clicks OK for the dialog, then the word "gimbaled" will show up in the model entry box in the lower right hand quadrant 306.
  • The Xclusive column in the database table shown in Table 2 is Y (yes) for a group in which the help options are mutually exclusive.
  • A radio, for example, may have multi-band and single-band in the model help group. It would be marked Y in the Xclusive column since the radio could be multi-band or single-band, but not both.
  • The dialog for the help in this case would put radio buttons next to the help options, so that the user could select one or the other, but not both.
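A minimal sketch of how an item sensitive help table with an Xclusive column might drive the dialog is shown here. The `HelpGroup` record, the `HELP_TABLE` rows, and the function names are hypothetical; in particular, attaching "gimbaled" to a "stove" item is an assumption made for illustration, since the text above does not name the item.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class HelpGroup:
    itemcode: str              # item the help applies to
    field: str                 # keyword modifier, e.g. "model"
    options: Tuple[str, ...]   # help options offered in the dialog
    exclusive: bool            # Xclusive column: True means "Y"


# Hypothetical rows; the "stove" itemcode is an assumption.
HELP_TABLE = [
    HelpGroup("stove", "model", ("gimbaled",), exclusive=False),
    HelpGroup("radio", "model", ("multi-band", "single-band"), exclusive=True),
]


def _find_group(itemcode: str, field: str) -> Optional[HelpGroup]:
    for group in HELP_TABLE:
        if group.itemcode == itemcode and group.field == field:
            return group
    return None


def help_dialog(itemcode: str, field: str) -> Optional[Tuple[str, List[str]]]:
    """Return (widget kind, options), or None when no help exists."""
    group = _find_group(itemcode, field)
    if group is None:
        return None  # no down arrow would be shown for this field
    widget = "radio" if group.exclusive else "checkbox"
    return widget, list(group.options)


def selection_is_valid(itemcode: str, field: str, chosen: List[str]) -> bool:
    """Reject unknown options and multiple picks from an exclusive group."""
    group = _find_group(itemcode, field)
    if group is None or any(c not in group.options for c in chosen):
        return False
    return not (group.exclusive and len(chosen) > 1)
```

Keeping the exclusivity flag in the table rather than in the dialog code means the same dialog routine can serve every item; only the data changes.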
  • Another advantage of the system's implementation of item sensitive help is its ability to provide the user with unit of measure options that are compatible with the item that is being entered or modified. For example, when the user adds a "sail" item, a down arrow will appear next to the unit of measure keyword modifier entry box. If the user then enters a size and clicks on the down arrow to the right of the unit of measure box, a dialog will display a list of appropriate unit of measure options such as square meters.
  • A find box 310 is also shown in the upper left corner of the data entry screen 300.
  • The list of items in the item list 304 may optionally be organized in alphabetical order or by some other technique. If the list is extensive, or the user is unable to find the desired entry, the user may use the find box 310 to search the synonyms in the xref table 208.
  • A user may wish to add the term "toilet" to the list of items. If the item cannot be found, the term "toilet" is entered into the find box 310. The find button 312 is then selected. The itemcode table 206 and the xref table 208 are then searched for a match using any conventional technique for matching terms, such as stem searching, phonetic scoring, etc. The search results are then displayed for the user, who would determine that the keyword "head" was the appropriate itemcode term.
  • This function allows the user to perform a dialect-equivalency search with the keyword tables. As a result, even semi-skilled users can produce data entries and data queries using the correct terminology.
  • A data entry is an item entered into a record in the database, whereas a data query is an item being searched in the database.
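The find operation described above can be sketched as a lookup over two small tables. This is a simplified illustration: case-insensitive prefix matching stands in for the "conventional technique" (stem searching, phonetic scoring) mentioned in the text, and every table entry other than "toilet" mapping to "head" is hypothetical.

```python
from typing import List

# Hypothetical contents of the itemcode table and the xref (synonym) table.
ITEMCODES = {"head", "galley", "sail"}
XREF = {
    "toilet": "head",      # example taken from the text
    "bathroom": "head",    # assumed synonym
    "kitchen": "galley",   # assumed synonym
}


def find_itemcodes(term: str) -> List[str]:
    """Return candidate itemcodes for a user-typed term, sorted by name."""
    needle = term.strip().lower()
    if not needle:
        return []
    matches = set()
    # Search the itemcodes themselves.
    for code in ITEMCODES:
        if code.startswith(needle):
            matches.add(code)
    # Search the synonyms and report the itemcode each one maps to.
    for synonym, code in XREF.items():
        if synonym.startswith(needle):
            matches.add(code)
    return sorted(matches)
```

The function returns candidates rather than a single answer because, as in the text, the user makes the final choice from the displayed results.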
  • The list of items in the item list 304 is preselected and pre-sorted.
  • The list of items displayed in the item list 304 is limited to those items which are appropriate to that category.
  • A novel feature used for this portion of the system is that the list is dynamically altered based on previous selections, such that only a subset of items which are appropriate at that time are displayed.
  • In a boat listing system, for example, the list of items displayed in the item list 304 would be restricted to items appropriate to the kind and length of boat, and also restricted by the category selection in category box 302. If the user had previously entered that the type of boat was a "sailboat", then the item list for the sail inventory category would include the keyword "sail". However, if the boat was previously listed as a power boat, then the keyword "sail" would not appear in subsequent item lists. Similarly, in a real estate system, if the user had entered a 3-floor townhouse, the keyword "elevator" might appear on the item list for the amenities category.
  • An advantageous feature of the invention is that the keyword entries in the item list 304 are dynamically selected based on previous inputs about the object being described.
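One way to picture dynamic item selection is as a filter applied to a master item table, using the selected category and what is already known about the object. The sketch below is an assumption-laden illustration: the `Item` record, the attribute names (`type`, `floors`), and the rule that an elevator applies from three floors upward are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Item:
    keyword: str
    category: str
    # Predicate over the attributes already entered for the object.
    applies: Callable[[Dict[str, Any]], bool]


# Hypothetical master list; thresholds and attribute names are assumptions.
ITEMS = [
    Item("sail", "sail inventory", lambda obj: obj.get("type") == "sailboat"),
    Item("elevator", "amenities", lambda obj: obj.get("floors", 1) >= 3),
    Item("fireplace", "amenities", lambda obj: True),
]


def item_list(category: str, obj: Dict[str, Any]) -> List[str]:
    """Return the keywords to display for a category, given prior inputs."""
    return sorted(
        item.keyword
        for item in ITEMS
        if item.category == category and item.applies(obj)
    )
```

For a power boat the sail inventory list comes back empty, while a 3-floor townhouse sees both "elevator" and "fireplace" under amenities.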
  • Click delete button (Only enabled if a user-item is highlighted on the left and displayed in the item description entry box.)
  • Table 4 illustrates the "Find dialog" used for word selection in the preferred embodiment.
  • Table 5 illustrates the Item Sensitive Help Dialog used for word selection in the preferred embodiment.
  • Another advantage of the invention is the ability, discussed above, to automatically adjust the words in the phrase builder box 308 to indicate plural or singular form. Since the phrase builder box 308 shows the textual description which will be printed and/or read by those searching for properties, the automatic conversion to the appropriate form of a word enhances readability for the user.
  • The textual output of the system, which fully describes the property, appears to be in natural language format, even though the description of the object has been structured into a normalized database.
  • The invention has another feature that improves the usefulness of the textual output of the system.
  • Each user can set a "dialectal preference" on his or her own local device.
  • The itemcode and the xref synonyms associated with it constitute a set of descriptions.
  • The user can specify which one of the set of descriptions is the preferred display for the local device. Whenever the property description is displayed or printed, the invention will choose the item description that has been marked as the local "dialectal preference" instead of the description that was entered in the database.
  • An entry for a house in the United States may have a "den".
  • A broker in Great Britain may indicate that the term "drawing room" should always be used for this particular itemcode.
  • The device in Great Britain will always display a house as having a drawing room, even though the data was originally entered in the United States with a den.
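The dialectal preference can be sketched as a display-time lookup that leaves the stored record untouched. The names `DESCRIPTIONS`, `display_term`, and the shape of the per-device preference mapping are hypothetical; only the den and drawing room pair comes from the text.

```python
from typing import Dict

# Hypothetical set of descriptions per itemcode: the itemcode itself
# plus its xref synonyms.
DESCRIPTIONS = {
    "den": ("den", "drawing room"),
}


def display_term(itemcode: str, entered: str,
                 local_prefs: Dict[str, str]) -> str:
    """Pick the term to show on this device for a stored itemcode."""
    preferred = local_prefs.get(itemcode)
    # Use the local preference only if it belongs to the itemcode's set.
    if preferred in DESCRIPTIONS.get(itemcode, ()):
        return preferred
    return entered  # otherwise fall back to the term as entered
```

Because the substitution happens only when the description is rendered, two devices with different preferences can show different wording for the same database record.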
  • Figures 4A and 4B illustrate the method used by the preferred embodiment to update the phrase builder box 308 in response to keystrokes as they are entered by the user.
  • A database related to boats is used for ease of illustration.
  • In step 402, the phrase is initialized to an empty state prior to constructing a new phrase.
  • An item description is then created. If a quantity was entered for an item, or if the Many-flag was set, then it would be tested in step 404 and appended to the item description in step 406. If a size is appropriate, then the size would be tested in step 408 and the item description updated in step 410. If the manufacturer identity is appropriate, then the manufacturer identification would be tested in step 412 and the item description updated in step 414. The model numbers would be added to the item description in steps 416 and 418 in like fashion. When the item description is complete, it would be appended to the phrase in step 420. Post descriptions are features which may be added to the description of an item. These would be tested in step 422 and added to the phrase in step 424. In step 426, the phrase display would be updated.
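The sequence of steps 402 through 426 can be sketched as a single function that is called again after every keystroke. This is an illustrative reading, not the patented implementation: the `Entry` record, the word order of the generated phrase, the naive "add an s" plural, and the sample manufacturer and model values are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    item: str
    quantity: Optional[int] = None
    many_flag: bool = False
    size: str = ""
    unit: str = ""
    manufacturer: str = ""
    model: str = ""
    post_description: str = ""


def build_phrase(entry: Entry) -> str:
    """Rebuild the phrase builder text from the current modifier values."""
    phrase = ""                                    # step 402: start empty
    parts = []
    plural = entry.many_flag or (entry.quantity or 0) > 1
    if entry.quantity is not None and not entry.many_flag:
        parts.append(str(entry.quantity))          # steps 404/406: quantity
    if entry.size:
        # steps 408/410: size, with its unit of measure if one was given
        parts.append(f"{entry.size} {entry.unit}".strip())
    if entry.manufacturer:
        parts.append(entry.manufacturer)           # steps 412/414
    if entry.model:
        parts.append(entry.model)                  # steps 416/418
    # Naive plural for illustration only; a real table of forms is assumed.
    parts.append(entry.item + "s" if plural else entry.item)
    phrase += " ".join(parts)                      # step 420: append item
    if entry.post_description:
        phrase += f" {entry.post_description}"     # steps 422/424
    return phrase                                  # step 426: caller redisplays
```

For instance, an entry for a fishing chair with quantity 2, a hypothetical manufacturer "Acme", model "X1", and post description "with footrest" produces "2 Acme X1 fishing chairs with footrest". Rebuilding the whole phrase on each keystroke is simpler than patching it in place and is cheap at this size.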
  • Figures 5A and 5B illustrate a difference between data entry in prior art systems versus the preferred embodiment.
  • In Figure 5A, the data entry process in prior art systems is shown.
  • Data entered in prior systems is input to the database as written.
  • Figure 5B illustrates the data entry process in the preferred embodiment.
  • Keywords input to the database are screened to determine if they are to be directly input into the database or be substituted with a synonym keyword. As can be seen, multiple keywords result in a single searchable keyword being stored in the database.
  • Figures 5C and 5D illustrate a difference between data query in prior art systems versus the preferred embodiment.
  • In Figure 5C, the data query process in prior art systems is shown. Data entered in prior systems is input to the system and expanded into multiple keywords prior to use by the controller 112. This is accomplished by a variety of known techniques, such as stem expansion. As a result, the performance of prior art search engines is degraded by the need to search for multiple keywords.
  • Figure 5D illustrates the data query process in the preferred embodiment.
  • Query keywords input to the database are screened to determine if they are to be directly used or be substituted with a synonym keyword. As can be seen, multiple keywords result in a single searchable keyword being used to query the database.
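The bi-directional substitution of Figures 5B and 5D can be sketched by routing both data entry and data query through one translation function. The class and table names are hypothetical, an in-memory dictionary stands in for the database, and the synonyms other than "toilet" are assumed for illustration.

```python
from typing import Dict, Optional, Set

# Hypothetical restricted keyword list and synonym list.
RESTRICTED = {"head", "galley"}
SYNONYMS = {"toilet": "head", "lavatory": "head", "kitchen": "galley"}


def translate(user_keyword: str) -> Optional[str]:
    """Map a user keyword to a restricted keyword, or None if unknown."""
    key = user_keyword.strip().lower()
    if key in RESTRICTED:       # first try the restricted keyword list
        return key
    return SYNONYMS.get(key)    # then fall back to the synonym list


class ListingDatabase:
    """Toy index from restricted keyword to the listings that carry it."""

    def __init__(self) -> None:
        self._index: Dict[str, Set[str]] = {}

    def add(self, listing_id: str, user_keyword: str) -> str:
        keyword = translate(user_keyword)
        if keyword is None:
            raise ValueError(f"no restricted keyword for {user_keyword!r}")
        # Only the restricted keyword is stored, whatever the user typed.
        self._index.setdefault(keyword, set()).add(listing_id)
        return keyword

    def query(self, user_keyword: str) -> Set[str]:
        keyword = translate(user_keyword)
        if keyword is None:
            return set()
        # A single lookup replaces a search over every synonym.
        return set(self._index.get(keyword, set()))
```

A listing entered with "toilet" and another entered with "Head" are both stored under "head", so a query for "lavatory" finds both with one lookup.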
  • A principal advantage of the keyword substitution method used in the preferred embodiment is that it increases the number of hits made during a search. The reason for this is the elimination of missed hits due to poor user selection of keywords on both the data input and data query sides of the database.
  • Another advantage of the keyword substitution method is that it provides improved performance due to the use of searching for only a single keyword in the database instead of many synonyms.
  • Another advantage of the keyword substitution method is that it gives users the option to create and to retrieve fully qualified expressions that make use of dialectal keywords.
  • The system uses a restricted keyword list internally, but allows the user to customize the description according to a personal preference or the preferences of the user's target audience.
  • A further advantage of the system is dynamic item selection.
  • Dynamic item selection reduces the number of items which a user can select from based on a previous selection by the user. For example, in regard to features that are dependent on other aspects of a property (such as the elevator, discussed above, which is dependent on the number of floors of the property), subsequent screens will display only items which are possible based on the previous selection. Automatic plural selection is an advantage which provides a more readable listing phrase for the convenience of the user.
  • The keyword translator can be implemented separately for the data input and data query operations.
  • The structure of the database can vary.
  • The display presentations can vary, etc.
  • The type of information or property can vary.
  • Various features of the system can be implemented in hardware or software.
  • The database can be implemented in a variety of ways. For example, it can be stored on hard disk, CD-ROM, magnetic tape, distributed on a network, etc. Accordingly, the invention herein disclosed is to be limited only as specified in the following claims.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a system for a database (114) that selects keywords from a restricted keyword list (106) for both update and query (e.g., bi-directional translation). A keyword translator (102) accepts a keyword entered by a user and selects a restricted keyword for use (e.g., as a query input) with a database (114). The entered keyword may be chosen by the user according to user preferences. The system compares the user-entered keyword with a restricted keyword list (106) to determine whether the list (106) contains a keyword matching the user-entered keyword. If so, the system selects the matching keyword for use with the database (114). If there is no match, the system compares the user-entered keyword with a synonym list (108) to find a synonym in the synonym list that matches the user-entered keyword. Each synonym in the list is associated with a restricted keyword. If a synonym matches the user-entered keyword, the system selects the restricted keyword associated with (or corresponding to) the synonym found.
PCT/US1998/023492 1998-11-04 1998-11-04 Database system with restricted keyword list and bi-directional keyword translation WO2000026766A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/US1998/023492 WO2000026766A1 (fr) 1998-11-04 1998-11-04 Database system with restricted keyword list and bi-directional keyword translation
AU13805/99A AU1380599A (en) 1998-11-04 1998-11-04 Database system with restricted keyword list and bi-directional keyword translation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US1998/023492 WO2000026766A1 (fr) 1998-11-04 1998-11-04 Database system with restricted keyword list and bi-directional keyword translation

Publications (1)

Publication Number Publication Date
WO2000026766A1 true WO2000026766A1 (fr) 2000-05-11

Family

ID=22268232

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/023492 WO2000026766A1 (fr) 1998-11-04 1998-11-04 Database system with restricted keyword list and bi-directional keyword translation

Country Status (2)

Country Link
AU (1) AU1380599A (fr)
WO (1) WO2000026766A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1351162A2 (fr) * 2002-03-28 2003-10-08 Matsushita Electric Industrial Co., Ltd. Content retrieval apparatus and method
EP1353280A1 (fr) * 2002-04-12 2003-10-15 Targit A/S A method of processing multilingual queries
EP1442396A1 (fr) * 2001-09-17 2004-08-04 Netpia.Com, Inc. System for accessing a web page using a real name and corresponding method
WO2009033562A2 (fr) * 2007-09-07 2009-03-19 Daimler Ag Method and device for recognizing alphanumeric information
TWI459224B (zh) * 2012-09-06 2014-11-01 Shuttle Inc Content search method for an electronic device and application program thereof

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4688195A (en) * 1983-01-28 1987-08-18 Texas Instruments Incorporated Natural-language interface generating system
US4751635A (en) * 1986-04-16 1988-06-14 Bell Communications Research, Inc. Distributed management support system for software managers
US5765131A (en) * 1986-10-03 1998-06-09 British Telecommunications Public Limited Company Language translation system and method
US4992972A (en) * 1987-11-18 1991-02-12 International Business Machines Corporation Flexible context searchable on-line information system with help files and modules for on-line computer system documentation
US4849898A (en) * 1988-05-18 1989-07-18 Management Information Technologies, Inc. Method and apparatus to identify the relation of meaning between words in text expressions
US5386556A (en) * 1989-03-06 1995-01-31 International Business Machines Corporation Natural language analyzing apparatus and method
US5297039A (en) * 1991-01-30 1994-03-22 Mitsubishi Denki Kabushiki Kaisha Text search system for locating on the basis of keyword matching and keyword relationship matching

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHEN ET AL.: "On the query translation in Ferative Relational databases", IEEE, 1996, pages 491 - 498, XP002922602 *
COTEZ ET AL.: "The Hybrid Application of an Inductive Learning Method and a Neural Network for Intelligent Information Retrieval", INFORMATION PROCESSING AND MANAGEMENT, vol. 31, no. 6, February 1995 (1995-02-01), pages 789 - 813, XP004001084 *
SHAW JR.: "Term Relevance Computations and Perfect Retrieval Performance", INFORMATION PROCESSING & MANAGEMENT, vol. 31, no. 4, February 1995 (1995-02-01), pages 491 - 498, XP004062656 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1442396A1 (fr) * 2001-09-17 2004-08-04 Netpia.Com, Inc. System for accessing a web page using a real name and corresponding method
EP1442396A4 (fr) * 2001-09-17 2007-01-10 Netpia Com Inc System for accessing a web page using a real name and corresponding method
EP1351162A2 (fr) * 2002-03-28 2003-10-08 Matsushita Electric Industrial Co., Ltd. Content retrieval apparatus and method
EP1351162A3 (fr) * 2002-03-28 2004-05-12 Matsushita Electric Industrial Co., Ltd. Content retrieval apparatus and method
US7082431B2 (en) 2002-03-28 2006-07-25 Matsushita Electric Industrial Co., Ltd. Content retrieval apparatus and method
EP1353280A1 (fr) * 2002-04-12 2003-10-15 Targit A/S A method of processing multilingual queries
WO2009033562A2 (fr) * 2007-09-07 2009-03-19 Daimler Ag Method and device for recognizing alphanumeric information
WO2009033562A3 (fr) * 2007-09-07 2009-06-25 Daimler Ag Method and device for recognizing alphanumeric information
TWI459224B (zh) * 2012-09-06 2014-11-01 Shuttle Inc Content search method for an electronic device and application program thereof

Also Published As

Publication number Publication date
AU1380599A (en) 2000-05-22

Similar Documents

Publication Publication Date Title
US5956711A (en) Database system with restricted keyword list and bi-directional keyword translation
US7533089B2 (en) Hybrid approach for query recommendation in conversation systems
US8433711B2 (en) System and method for networked decision making support
US7512609B2 (en) Methods, apparatus, and data structures for annotating a database design schema and/or indexing annotations
US20060224586A1 (en) System and method for improved spell checking
EP3655840A1 (fr) Analyzing web pages to facilitate automatic navigation
US20110320191A1 (en) Text creation system and method
US20040093567A1 (en) Spelling and grammar checking system
KR20040058300A (ko) Data source search system and method
US20090094223A1 (en) System and method for classifying search queries
US20050165819A1 (en) Document tabulation method and apparatus and medium for storing computer program therefor
US20100293162A1 (en) Automated Keyword Generation Method for Searching a Database
JP2002520698A (ja) Method for constructing a context-dependent database
JPH0310134B2 (fr)
JP2002507296A (ja) System, method, and media for intelligent selection of search terms in a keyboardless input environment
US20050209992A1 (en) Method and system for search engine enhancement
US20220414158A1 (en) Systems and methods for generating a search query using flexible autocomplete menus
Vilar et al. Comparison and evaluation of the user interfaces of e‐journals
WO2000026766A1 (fr) Database system with restricted keyword list and bi-directional keyword translation
Matthews Time for new OPAC initiatives: an overview of landmarks in the literature and introduction to WordFocus
France et al. Use and usability in a digital library search system
JP5398202B2 (ja) Translation program, translation system, method of manufacturing a translation system, and method of generating bilingual data
US7421096B2 (en) Input mechanism for fingerprint-based internet search
Clemencin Querying the French Yellow Pages: natural language access to the directory
US20240232272A1 (en) Analyzing web pages to facilitate automatic navigation

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
NENP Non-entry into the national phase

Ref country code: CA

122 Ep: pct application non-entry in european phase