INTELLIGENT CLIENT-SIDE FORM FILLER
BACKGROUND OF THE INVENTION
FIELD OF THE INVENTION
The invention relates to electronic form mapping and recognition. More particularly, the invention relates to an automated client-based method of filling out electronic forms that does not require any prior mapping or examination of the forms.
DESCRIPTION OF RELATED TECHNOLOGY
Internet users are only too familiar with the repetitive chore of filling out online forms. Many websites limit full access to registered accountholders. Registration may require providing personal data by way of entering the data into the fields of an HTML form. Return visits to such sites generally require a sign-on procedure in which the user is asked to provide, for example, a user name and password. E-commerce applications require the user to fill in order forms that ask for personal data, billing address, shipping address, credit card information and such. Often, users, when faced with a request to complete yet another form, just move on because they simply don't want to take the time to complete the form. Thus, filling out online forms, while essential, also constitutes a serious obstacle for the site sponsor and the user alike. The user is deprived of what may be an important source of value, and the site sponsor may lose sales, or at the very least, may miss important opportunities to find out more about the people who visit their site. Thus, it would be an important advantage to provide a system that automates the task of filling out electronic forms.
The parent to the current application, J. Rawat, S. Palnitkar, Method and system of implementing recorded data for automating Internet interactions, U.S. Patent Application Ser. No. 09/561,449, filed April 28, 2000, provides just such a system. User information is maintained in a central database. The user information includes personal data as well as account-specific data, such as the URL, and the login data for sites the user has visited. Client-side program code, integrated with a conventional web browser, provides a utility window through which the user may access the central database, for example, to edit user information. In addition to the database of user information, the system also maintains a database of form information wherein mapping information is stored for forms from previously visited web sites. The forms are mapped by parsing the underlying code, typically HTML, of the form, for example, the field tags. When a mapped form is subsequently encountered, the map allows the form fields to be automatically populated with the required user data in the required format. Occasionally, for example, in the case of optional fields or forms that have changed slightly so that the saved map is no longer completely accurate, user intervention may be required.
The prior art provides several examples of network-based or distributed electronic wallets, in which user information is stored in a database for later use. D. Schutzer, System and method for use of distributed electronic wallets, European Patent Application No. EP1077419, published February 21, 2001, describes a method and system in which two electronic wallets communicate and exchange information; typically a consumer's electronic wallet populates a merchant's wallet with the consumer's personal data. The merchant wallet stores the consumer's personal data for later use. M. Bahdur, G. Huddleston, C. Paltenghe, M. Takata, Distributed network based electronic wallet, European Patent Application Nos. EP0917119 and EP0917120, filed May 19, 1999, provide a system in which various classes of user data are stored in distributed databases; the user may download the stored data to a wallet application. M. Sivadas, D. Steed, J. Main, Server-based electronic wallet system, European Patent Application No. EP1168264, filed January 2, 2002, describes a system in which purchase requests directed to a merchant server from a wireless device are mediated by one or both of a proxy server and a wallet server. The .NET PASSPORT SERVICE, provided by Microsoft Corporation of Redmond, WA, includes a service wherein the user may use a single sign-in and make express purchases from participating merchants and web sites. A wallet application allows the user to store personal data on a secure server, such as billing and shipping addresses and credit card data. When the user makes a purchase from a participating merchant, the user data is automatically supplied to the merchant from the secure server. In all of these previous examples from the prior art, a distributed, or client-server, architecture is essential to the proper function of the system. None of the systems described contemplates an exclusively client-based form-filler that is capable of populating any web-based form with user data.
A. Gupta, A. Rajaraman, Method and system for automatically filling in forms in an integrated network based transaction environment, U.S. Patent No. 6,199,079, filed March 20, 1998 describes a method of automatically filling in online forms presented by web pages, in which a particular form is assigned a unique identifier and a template for the form is stored in a database of form templates, indexed by forms' unique identifiers. When the user encounters a form, the form is filled in according to that form's template from the database. R. Haridas, M. Markus, Method and apparatus for completion of fields on Internet page forms, International Application No. PCT/US00/41802, filed on November 2, 2000 describes an automated method for completion of Internet webpage forms in which user data stored on a centralized server is automatically applied to forms that have been previously registered with the centralized server. Thus, both of these examples from the prior art employ a distributed architecture and derive their form-filling capability from a record of the form stored at a central location. Neither contemplates the ability to analyze any form encountered and populate the form fields with the required user data in the proper display format without any prior mapping or examination.
J. Light, J. Garney, Automatic web based form fill-in, U.S. Patent No. 6,192,380, filed on March 31, 1998, describes an automated form filler that recognizes a form within an HTML page and fills in the form fields with data taken from a database. Recognition of the form and the form fields occurs by parsing the page's HTML code and identifying page tags and field tags.
M. Pennell, A. Martin, Method and apparatus for automatic form filling, International Application No. PCT/US0042073, filed on November 9, 2000, describes a software application intended for use with or integration into a conventional web browser application that automatically populates the fields of a web-based form with the required user data. The described software application gains knowledge of the form's fields and the expected contents by analyzing the underlying code for the page received by the browser from the visited web site, generally HTML or XML code, or the like. Embodiments of the software application in both distributed and client-based implementations are described.
The Roboform user manual, published at http://www.roboform.com/manual.html, ©1999 - 2002, updated January 25, 2002, describes a client-based web form filler that works as an add-on to conventional web browser applications. Forms may be filled in either by means of a "pass card," a record that saves information related to a specific form at a particular web site, or by means of an "identity," a user profile, wherein the software application analyzes the page encoding and populates the form fields with appropriate data from the identity. When the user selects a country, the application applies the appropriate display format to form data such as dates and telephone numbers.
All of the above form-fillers analyze a web form based on the form's underlying code. There is no indication that they analyze a form based on page elements visible to the user on the rendered page, such as field labels. While the Roboform application is able to format data according to a user-selected country format, there is no indication that either Roboform or the form-filler described by Pennell, et al. can format data on an ad hoc basis by analyzing user-visible formatting prompts, generally provided somewhere adjacent to the field.
While parsing HTML field tags provides generally satisfactory results, the lack of any naming convention for the fields of an HTML form has made it impractical to devise a completely automated form-filler. Systems that identify fields by parsing a page's underlying code have difficulty accommodating the limitless variety typically encountered in field names without at least some user intervention. The adoption of standardized markup languages such as ECML (E-commerce markup language) may help to remedy this situation. In the meantime, there exist multitudes of forms that do not follow any sort of naming convention for the fields. On the other hand, field labels, the visual page elements that communicate a field's purpose to the user, display a great deal more uniformity than the underlying field names.
Accordingly, it would be a great advantage to provide an intelligent, fully automated, client-based form filler that maps the fields of an electronic form by parsing visual page elements, such as the user-visible field labels. It would also be desirable to provide the functionality of determining the appropriate contents for a field based on the field's context, the type of neighboring fields to the target field. Furthermore, it would be a significant technical advance to provide the capability of formatting the appropriate data according to visible formatting prompts.
SUMMARY OF THE INVENTION
In recognition of the above needs, the invention provides an intelligent form-filler that does not require any prior mapping or examination of the forms. Client-side program code examines electronic documents such as web pages and automatically fills in fields of forms contained in the document with the appropriate data from a user profile, without requiring prior mapping or examination of the form. The application maps user data to the appropriate form field by examining field label text on the form as the user sees it, i.e. text that is visually nearest the field. For fields lacking labels, the application evaluates the field context by determining the field type of neighboring fields to determine the required data. To enter the information in the correct format, the application parses visual hints concerning, for example, the date format provided to the user and formats the data accordingly. In the complete absence of any usable visual cues or contextual information, the program code is capable of parsing the form's underlying code. Alternate embodiments of the invention are possible, in which the user information is stored either on the client or, to maximize portability, on a server.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 provides a screen shot of a typical web-based electronic form;
Figure 2 provides a second view of the form of Figure 1;
Figure 3 provides a schematic diagram of a client-based system for filling out electronic forms according to the invention; and
Figure 4 provides a data flow diagram of a client-based method for filling out electronic forms according to the invention.
DESCRIPTION OF THE INVENTION
As shown in Figure 1, a typical HTML form 100 includes a plurality of fields 101 that must be filled out by the user, for example, when making an online purchase, or registering at a web site. Form-filling software generally analyzes the fields 101 of the form and maps them to the correct user data by parsing the HTML field names, commonly known as field tags (not shown) and then supplying the correct user data from a stored user profile, generally located on a remote server. Because no naming convention has existed for fields in an HTML form, it has been difficult to produce a fully automated form-filler application. Previously, forms had to be mapped or analyzed in advance and the mapping saved in a database of form descriptions, usually also located on a remote server. Often, user intervention is required to complete the form.
HTML forms also include a plurality of visible field labels 102, in which each label is spatially and visually related to its corresponding field, although no programmatic relationship usually exists between them, as with the HTML field tags. The field labels are provided for the user's benefit to advise them of the correct information to enter into a particular field. The invention recognizes that there exists a great deal more uniformity and consistency among the visible field labels than among the underlying field names, because the labels generally identify the information sought in a well-known, highly conventional manner. Based on such recognition, the invention provides a client-based system and method for filling out electronic forms automatically, in which the fields of an HTML form are identified and mapped to the correct user data based on visible form elements such as field labels. Following mapping and identification, the fields of the form are populated with the correct user data, without reference to a previous, stored mapping or analysis of the form, and without requiring user intervention.
Turning now to Figure 3, a schematic diagram of the system of the invention is shown. A client 301, in communication with a network 307, retrieves an HTML page containing a web form 308 from a remote site on the network. The web form could be substantially similar to the typical form shown by Figures 1 and 2. In one embodiment of the invention, the client 301 is a conventional microcomputer, either desktop or laptop. However, other clients possessing the requisite storage capacity and processing capability are entirely consistent with the spirit and scope of the invention. A client may also be a process, such as a program, that requests a service from another process. In one embodiment of the invention, the client communicates with a publicly accessible HTTP network such as the Internet; however, other network environments employing other networking protocols are also suitable for the invention. The means of connecting to the network includes dialup and broadband connections, as well as other connection methods, such as wireless.
As previously stated, the client 301 has both storage capacity and processing capability. Logic 303 stored and executed on the client implements a probabilistic, rule-based method of analyzing the form in separate steps. In a first step, the logic traverses the form from beginning to end, locating the field labels, associating them with a field, and then mapping the field to the correct metadata based on a best match from a field label dictionary 304 - a file of analogs, or expressions resembling the field label, stored on the client 301. Also incorporated in the field label dictionary are the rules for mapping the field to the correct metadata and for mapping fields lacking labels to the correct metadata based on the field's context. At the same time, the functional blocks of the form are identified, for example, shipping address and billing address. In this way, identical field labels, such as "address" or "zip code" in a shipping address block and a billing address block are mapped to the correct metadata.
In a second step, the logic traverses the form elements in a reverse direction, refining the granularity of the mapping done in the first pass, based on rules contained in a normalization dictionary 305, also stored on the client. The logic traverses the form a third time, identifying visible display format hints 103, mapping them to a display format dictionary 306, also stored on the client. Similar to the field label dictionary 304, the display format dictionary 306 contains a number of regular expressions that are analogs of, or similar to, the visible display format hints 103 found on the form 100. Additionally, the display format dictionary 306 contains rules and code for mapping the field to the correct display format. Finally, after the visible elements of the form have been completely mapped, the correct user data is retrieved from a stored user profile 302, a data file stored on the client, and concatenated, truncated or re-formatted as required by the display format, and the form fields are populated with the data. In an alternate embodiment, the user profile is stored on a server and retrieved by the client.
Figure 4 provides a data flow diagram of the method of the invention 400. In a preferred embodiment, the method is implemented as a JAVASCRIPT function that analyzes forms in a target window and maps each identified form element to the correct user data to be filled into each field. Conventional techniques of computer programming are employed in the implementation of the invention. In addition to JAVASCRIPT, other commonly known scripting and programming languages would also be suitable for the invention such as VBSCRIPT, PERL, JAVA, or JPYTHON. As previously indicated, the invention primarily relies on the visible field label. However, in the absence of a visible field label, the invention also utilizes:
• HTML or ECML field name;
• Default field value, in the case of a field that is a select box;
• A list of possible values for radio buttons; or
• Previous field mapping
The invention matches the values with the dictionaries previously described to map the fields to metadata, wherein metadata comprises a data type, such as:
• Last name;
• First name;
• AddressLine 1;
• AddressLine 2; and
• City.
As will be apparent to the practitioner of ordinary skill, the above metadata classifications are exemplary only. Others will occur to the artisan according to the setting and the functional requirements.
As shown in Figures 3 and 4, the domain-specific data contained in the dictionaries has been kept separate from the processing logic. Thus, the invention is easily modified by substituting dictionaries to support forms written in any language. Furthermore, the mapping and normalization rules are also easily modified to accommodate a variety of settings and applications of use.
Referring again to Figure 4, the core method involves the following steps, each of the steps accomplished in a single traversal of the form elements:
• First, field discovery 401;
• Second, field normalization 402; and
• Third, display format mapping 403.
On the first traversal: The logic loops forward over all form elements and discovers the fields. The primary method of field discovery involves, for an unmapped field, analysis of the visible field label. A number of rule-based approaches to the field label analysis are possible. Among them:
• if a field is positioned in a table cell, analyzing text expressions in adjacent cells; and comparing the analyzed text expressions with entries in the field label dictionary to find the closest match with a metadata expression, wherein the field is mapped to the correct metadata. As Figures 1 and 2 show, the field label is typically situated to the left, or immediately above the field. However, according to need, the rules may be varied, to accommodate a placement below or to the right of the field;
• analyzing text expressions that occur within a predetermined number of words and within predetermined direction and distances from the field;
• based on page coordinates, examining the general vicinity of the form in all directions from the field and analyzing the text expression closest to the field; and
• ignoring supplemental text that does not contribute to the field label while searching for the field label, for example, text within parentheses or quotation marks.
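By way of illustration only, the table-cell rule and the supplemental-text rule listed above might be sketched in JAVASCRIPT as follows. The sketch models a table as an array of rows of plain objects rather than live DOM nodes, and the names stripSupplementalText and findLabelInTable are hypothetical; neither the data model nor the names appear in the specification.

```javascript
// Hypothetical model: a table is an array of rows; each row is an array of
// cells; a cell is either { text: "..." } or { field: "input-name" }.

// Remove supplemental text (parentheses, quotation marks) and label
// punctuation, so that 'Phone (xxx-xxx-xxxx):' is reduced to 'Phone'.
function stripSupplementalText(text) {
  return text
    .replace(/\([^)]*\)/g, " ") // text within parentheses
    .replace(/"[^"]*"/g, " ")   // text within quotation marks
    .replace(/[:*]/g, " ")      // trailing colons and required-field stars
    .replace(/\s+/g, " ")
    .trim();
}

// Look for a label in the cell to the left of the field, then in the cell
// immediately above it. Returns null when neither cell holds usable text.
function findLabelInTable(table, rowIndex, colIndex) {
  const candidates = [
    table[rowIndex] && table[rowIndex][colIndex - 1],
    table[rowIndex - 1] && table[rowIndex - 1][colIndex],
  ];
  for (const cell of candidates) {
    if (cell && typeof cell.text === "string") {
      const label = stripSupplementalText(cell.text);
      if (label !== "") return label;
    }
  }
  return null;
}

const exampleTable = [
  [{ text: "First Name:" }, { field: "fn" }],
  [{ text: 'Phone (xxx-xxx-xxxx) "optional":' }, { field: "ph" }],
];
```

Calling findLabelInTable(exampleTable, 1, 1) inspects the cell to the left of the second field and reduces its text to the bare label. Rules for labels placed below or to the right of a field could be added by extending the candidates list.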
After the field label is located according to one of the above procedures, the analyzed expression is compared with a listing of similar expressions in the field label dictionary to find the closest match. The dictionary expressions, analogs of the analyzed expression, are organized according to metadata. Thus, when a match is found, the field corresponding to the field label is mapped to that metadata 404.
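The dictionary lookup described above might be sketched as follows. This is a minimal illustration under stated assumptions: the dictionary entries, the metadata names and the function mapLabelToMetadata are hypothetical, and the sketch returns the first entry whose regular expression matches rather than scoring candidates for a closest match.

```javascript
// Hypothetical field label dictionary: expressions are organized by
// metadata, and each expression is an analog of labels seen on real forms.
const FIELD_LABEL_DICTIONARY = [
  { metadata: "FirstName", patterns: [/^first\s*name$/i, /^given\s*name$/i] },
  { metadata: "LastName", patterns: [/^last\s*name$/i, /^surname$/i, /^family\s*name$/i] },
  { metadata: "AddressLine1", patterns: [/^(street\s*)?address(\s*(line\s*)?1)?$/i] },
  { metadata: "AddressLine2", patterns: [/^address\s*(line\s*)?2$/i] },
  { metadata: "City", patterns: [/^city$/i, /^town$/i] },
  { metadata: "PostalCode", patterns: [/^zip(\s*code)?$/i, /^postal\s*code$/i] },
];

// Map an already-located label to a metadata type, or null if nothing matches.
function mapLabelToMetadata(label) {
  const normalized = label.replace(/\s+/g, " ").trim();
  for (const entry of FIELD_LABEL_DICTIONARY) {
    if (entry.patterns.some((pattern) => pattern.test(normalized))) {
      return entry.metadata;
    }
  }
  return null;
}
```

Because the expressions live in data rather than in the matching logic, a dictionary for another language could be substituted without changing mapLabelToMetadata.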
There may be cases when a field doesn't have a label. For example, two or three fields may be provided for street address. Often they are labeled "Address 1" and "Address 2," or something similar. But the first field may only be labeled "Address" and the second address field may not bear a label at all. In such a case, it is possible to map the second field according to its context. Mapping according to context requires that the field immediately preceding the field of interest have been mapped. As fields are mapped during the first pass, the algorithm assumes maximum granularity of the data. Thus, in the case of an unlabeled field that follows a field labeled "Address," the "Address" field would have been mapped to the metadata "AddressLine 1." It is probable that an unlabeled field following an "AddressLine 1" field is an "AddressLine 2" field. Accordingly, the unlabeled field will be mapped to the metadata "AddressLine 2."
As stated above, the algorithm assumes maximum granularity of the data. Thus, an unlabeled field following an "area code" field is assumed to be a "prefix" field, rather than a field asking for the entire remainder of the telephone number. A field labeled 'Address' is presumed to be an "AddressLine 1" field, rather than a field that asks for the entire street address. As described further below, during normalization, the granularity of the field mapping is refined.
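A forward pass that applies the maximum-granularity assumption and maps unlabeled fields from context might look like the following sketch. The rule tables and the function discoverFields are hypothetical, and the rule that an unlabeled field after a prefix field is a suffix field is an added assumption that extends the area code example.

```javascript
// Hypothetical first-pass rules. Labels map to the most granular metadata.
const LABEL_RULES = new Map([
  ["address", "AddressLine1"],
  ["area code", "PhoneAreaCode"],
  ["city", "City"],
]);

// Hypothetical context rules: metadata of the preceding field -> most
// probable metadata of an unlabeled field that follows it.
const CONTEXT_RULES = new Map([
  ["AddressLine1", "AddressLine2"],
  ["PhoneAreaCode", "PhonePrefix"],
  ["PhonePrefix", "PhoneSuffix"], // assumption, not stated in the text
]);

// Traverse the fields from first to last. A labeled field is mapped from
// its label; an unlabeled field is mapped from the field just before it.
function discoverFields(fields) {
  const mapped = [];
  let previous = null;
  for (const field of fields) {
    let metadata;
    if (field.label) {
      metadata = LABEL_RULES.get(field.label.trim().toLowerCase()) || null;
    } else {
      metadata = CONTEXT_RULES.get(previous) || null;
    }
    mapped.push({ name: field.name, metadata });
    previous = metadata;
  }
  return mapped;
}
```

An unlabeled field whose predecessor could not be mapped is left unmapped (null), which leaves room for the fallbacks on field name, default value and radio button values.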
Additionally, if a field lacks a label, the algorithm may analyze the field's programmatic name. Following field name analysis, the field name is compared to the entries in the field label dictionary and a match is found. The field is then mapped as described above.
Furthermore, field size can be used to resolve ambiguity. For example, in the scenario "Name: First [ ] Middle [ ] Last [ ]," it may be unclear whether the form is asking for the middle name or middle initial. Considering the size of the form element would help to resolve the ambiguity in this case: size=1 implies a middle initial; otherwise, a middle name is assumed.
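The size rule can be expressed in a few lines. In this sketch the field is a plain object carrying a size property, as an HTML INPUT element does; the function names and the profile property middleName are hypothetical.

```javascript
// size=1 implies a middle initial; any other size (or none) a middle name.
function resolveMiddleField(field) {
  return Number(field.size) === 1 ? "MiddleInitial" : "MiddleName";
}

// Pick the value to fill from a hypothetical profile that stores only the
// full middle name.
function middleValue(field, profile) {
  return resolveMiddleField(field) === "MiddleInitial"
    ? profile.middleName.charAt(0)
    : profile.middleName;
}
```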
In the case of select boxes, if the field lacks a label the algorithm uses the default value for comparison with the field label dictionary followed by mapping as described above. Typically, in such cases, the default value acts as the label, e.g. "Select a state."
Sets of radio buttons are completely separate objects in the DOM (document object model) of the page. Thus, the logic creates a Radiobutton object. The list of possible values is compared to the field label dictionary and a mapping performed as described above.
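One possible way of collecting separate radio button elements into a single object and mapping the set by its values is sketched below. The inputs are plain objects with the type, name and value properties of DOM inputs; the value dictionary, the majority-match rule and the function names are assumptions made for illustration.

```javascript
// Gather radio inputs that share a name into one object per set.
function groupRadioButtons(inputs) {
  const groups = new Map();
  for (const input of inputs) {
    if (input.type !== "radio") continue;
    if (!groups.has(input.name)) {
      groups.set(input.name, { name: input.name, values: [] });
    }
    groups.get(input.name).values.push(input.value);
  }
  return [...groups.values()];
}

// Hypothetical dictionary entries keyed on the values a radio set offers.
const VALUE_DICTIONARY = [
  { metadata: "CardType", patterns: [/visa/i, /master\s*card/i, /amex|american express/i] },
  { metadata: "Gender", patterns: [/^m(ale)?$/i, /^f(emale)?$/i] },
];

// Map a set when at least half of its values match one entry (an assumed
// threshold; the text does not specify how the list is compared).
function mapRadioGroup(group) {
  for (const entry of VALUE_DICTIONARY) {
    const hits = group.values.filter((value) =>
      entry.patterns.some((pattern) => pattern.test(value))
    );
    if (hits.length > 0 && hits.length * 2 >= group.values.length) {
      return entry.metadata;
    }
  }
  return null;
}
```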
IDENTIFYING THE BLOCK TYPE
Also, during the first traversal of the form elements, the fields are mapped to a block type 405. A block is a functional unit of the form, such as:
• billing address;
• shipping address;
• email address; and
• credit card information.
The block type mapping for most fields is reasonably obvious: an email address is mapped to the email block, and a credit card number is mapped to the credit card block. However, in the case of shipping address and billing address, the mapping is more complex. Both block types contain identical field labels, although the underlying field names are different. Figures 1 and 2 show different blocks of the same form, with Figure 1 showing a billing address block and Figure 2 showing a shipping address block.
Thus, when a Name, Address or Phone field is encountered, the block type may be either BILLING or SHIPPING. A new block is assumed to start when an AddressLine 1, NameTitle, FirstName or LastName field is found and the previous field was not a Name field. To identify whether the block is BILLING or SHIPPING, first the HTML field name is analyzed. If it matches any of the expressions in either a billing address array or a shipping address array, the block is mapped to BILLING or SHIPPING respectively.
If it isn't possible to map using the field name, the visible text preceding this field (and up to the previous Country or AddressLine 1 field, if any) is analyzed. If this text contains any of the expressions in the billing address or shipping address array, the block is BILLING or SHIPPING respectively.
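The two-stage test described above, field name first and preceding visible text second, might be sketched as follows. The contents of the two arrays and the function names are hypothetical, and checking the billing array before the shipping array is an arbitrary choice of this sketch.

```javascript
// Hypothetical billing and shipping address arrays of expressions.
const BILLING_EXPRESSIONS = [/bill/i, /card\s*holder/i];
const SHIPPING_EXPRESSIONS = [/ship/i, /deliver/i, /recipient/i];

// Return the block type suggested by a piece of text, or null.
function matchBlockType(text) {
  if (BILLING_EXPRESSIONS.some((pattern) => pattern.test(text))) return "BILLING";
  if (SHIPPING_EXPRESSIONS.some((pattern) => pattern.test(text))) return "SHIPPING";
  return null;
}

// First analyze the HTML field name; only if that fails, analyze the
// visible text that precedes the field.
function identifyBlockType(fieldName, precedingText) {
  return matchBlockType(fieldName) || matchBlockType(precedingText);
}
```

With these arrays, a field named "ship_addr1" is classified from its name alone, while a field named "addr1" is classified only if the preceding text mentions billing or shipping.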
Often, the visual text is too large to analyze and may contain additional information such as the order summary or anchor labels, which may contain strings like "Shipping Information" or "Shipping Options," resulting in mapping to an incorrect block type. In such cases, the following strategy is able to pick out the block label more accurately. It is adapted specifically to cases in which the address block and its label are situated in a table:
1. Search the text from the start of the BODY of the HTML page (or from just after a previous AddressLine 1 or Country element, if any) to the position of the current element.
2. Get the table that the current element is in.
3. If there is no table, exit. (It isn't possible to narrow down the range of text to be searched.)
4. If the start of the table is before the start of the text previously searched in step 1, exit. (It isn't possible to narrow down the range of text to be searched.)
5. Search the text from start of table to start of current element.
6. If any of the text searched matches with an expression in a billing address array or a shipping address array, the block type is determined.
7. Otherwise, if the current table is embedded in another table, treat the enclosing table as the current table and return to step 3.
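The numbered strategy above can be sketched over a simplified page model in which the visible text is one string and each table records its starting offset and its enclosing table. That model, the expressions and the name findBlockLabel are assumptions for illustration; an actual embodiment would walk the page's DOM instead.

```javascript
// Hypothetical billing/shipping expressions, billing checked first.
const BLOCK_EXPRESSIONS = [
  { type: "BILLING", pattern: /billing/i },
  { type: "SHIPPING", pattern: /shipping/i },
];

function matchBlock(text) {
  const hit = BLOCK_EXPRESSIONS.find((entry) => entry.pattern.test(text));
  return hit ? hit.type : null;
}

// pageText:    visible text of the page as a single string
// searchStart: offset where the step 1 search began (start of BODY, or just
//              after a previous AddressLine 1 or Country element)
// elementPos:  offset of the current form element
// table:       innermost table holding the element, { start, parent } or null
function findBlockLabel(pageText, searchStart, elementPos, table) {
  let current = table;                          // step 2
  while (current) {                             // step 3: no table, exit
    if (current.start < searchStart) break;     // step 4: cannot narrow
    const text = pageText.slice(current.start, elementPos); // step 5
    const type = matchBlock(text);              // step 6
    if (type) return type;
    current = current.parent;                   // step 7: enclosing table
  }
  return null; // range could not be narrowed, or no expression matched
}

// Example page: stray text mentions billing, but the table label says shipping.
const head = "See our Billing FAQ. ";
const outerText = "Shipping Address ";
const pageText = head + outerText + "Name ";
const outerTable = { start: head.length, parent: null };
const innerTable = { start: head.length + outerText.length, parent: outerTable };
```

In the example, searching the whole page would hit "Billing" in the stray text, whereas starting at the innermost table and widening one table at a time reaches "Shipping Address" first.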
During the second traversal of the form elements, the logic steps through the elements in reverse order, starting with the last field in the form. During this step, field normalization 402, the granularity of the field mapping is refined. For example, as described above, assuming maximum granularity, a field labeled 'Address' was presumed to be the first line of two or more address lines; thus, it was mapped to 'AddressLine 1.' However, the presumption that the form contains more than one address line may be incorrect. In the current step, such mapping errors can be resolved by examining a field's context in reverse order. Thus, in the case of the 'Address' field, if the field following the labeled address field is mapped to 'City,' it is determined that the original mapping of the field to 'AddressLine 1' was incorrect, and the field can be mapped correctly.
In the case of a telephone number field, on the first pass, the first field, because maximum granularity is assumed, would have been mapped to area code. On the second pass, if the mapped field is immediately followed by another labeled field, the original mapping will be incorrect. Since there is only one field for telephone, it will be necessary to concatenate the separate expressions for telephone number to create a single string to put into the single telephone field. If, however, the field mapped to area code is followed by two unlabeled fields and then a labeled field, the mapping of the first field to 'Area code' was correct, and the two additional unlabeled fields can be mapped to accept the other expressions that together make up an entire phone number: a three-digit prefix, followed by a four-digit suffix. The rules for evaluating the field contexts in this fashion are found in the normalization dictionary 305. While the invention has been described in terms of conventions followed in the United States, the practitioner of ordinary skill will appreciate that the invention can be adapted to those of any country.
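A reverse traversal that coarsens an over-granular mapping might be sketched as below. The rule table stands in for the normalization dictionary; its shape, the coarse metadata names "StreetAddress" and "PhoneNumber", and the function names are hypothetical. The sketch simplifies the test to a single question: is the next field the expected continuation?

```javascript
// Hypothetical normalization rules: if a field mapped to `fine` is not
// followed by a field mapped to `continuation`, remap it to `coarse`.
const NORMALIZATION_RULES = [
  { fine: "AddressLine1", continuation: "AddressLine2", coarse: "StreetAddress" },
  { fine: "PhoneAreaCode", continuation: "PhonePrefix", coarse: "PhoneNumber" },
];

// Step through the mapped fields in reverse order, starting with the last.
// The input array is left untouched; a refined copy is returned.
function normalizeFields(fields) {
  const result = fields.map((field) => ({ ...field }));
  for (let i = result.length - 1; i >= 0; i--) {
    const rule = NORMALIZATION_RULES.find((r) => r.fine === result[i].metadata);
    if (!rule) continue;
    const next = result[i + 1];
    if (!next || next.metadata !== rule.continuation) {
      result[i].metadata = rule.coarse;
    }
  }
  return result;
}

// Concatenate the separate expressions for a single telephone field.
function fullPhoneNumber(profile) {
  return profile.areaCode + profile.prefix + profile.suffix;
}
```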
On the third traversal, for fields that require the data to be entered in a specified format, the visible display format hints 103 are analyzed. Thus, in a display format mapping step 403, the fields that have been mapped to a metadata category 404 are mapped to the correct display format 406.
The display format hints are organized by category, i.e. there is one set of display hints for Phone fields, another for Name fields, another for Date fields, and so on. Display format hint text for the field is captured; the text could be either to the left or to the right of the field. As described above, supplemental text enclosed within parentheses or quotation marks is ignored for the purpose of identifying field labels. However, when identifying display format hints, priority is given to the supplemental text because the display format hints are more likely to be found in the supplemental text, embedded within parentheses or quotation marks. This text is matched against the expressions found within the display format dictionary 306 in the set of hints for the field category, and the display format is identified. (The field has already been identified, so the field category is known at this point.) If no display format is obtained and the previous field category is the same as this field category, then the previous display format is applied to this field, too. This would apply to cases like:
Home Phone (xxx-xxx-xxxx):
Work Phone:
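The hint matching just described, including the priority given to supplemental text and the carry-over of the previous format, might be sketched as follows. The dictionary contents, the format identifiers and the profile properties are hypothetical.

```javascript
// Hypothetical display format dictionary, organized by field category.
const DISPLAY_FORMAT_DICTIONARY = new Map([
  ["Phone", [
    { id: "dashed", hint: /x{3}-x{3}-x{4}/i, format: (p) => `${p.areaCode}-${p.prefix}-${p.suffix}` },
    { id: "dotted", hint: /x{3}\.x{3}\.x{4}/i, format: (p) => `${p.areaCode}.${p.prefix}.${p.suffix}` },
    { id: "digits", hint: /x{10}/i, format: (p) => `${p.areaCode}${p.prefix}${p.suffix}` },
  ]],
]);

// Collect text enclosed in parentheses or quotation marks.
function supplementalText(text) {
  const parts = [];
  const re = /\(([^)]*)\)|"([^"]*)"/g;
  let m;
  while ((m = re.exec(text)) !== null) {
    parts.push(m[1] !== undefined ? m[1] : m[2]);
  }
  return parts;
}

// Try the supplemental text first, then the whole hint text.
function findDisplayFormat(category, hintText) {
  const entries = DISPLAY_FORMAT_DICTIONARY.get(category) || [];
  for (const candidate of [...supplementalText(hintText), hintText]) {
    const entry = entries.find((e) => e.hint.test(candidate));
    if (entry) return entry;
  }
  return null;
}

// Third traversal: when a field yields no format but shares its category
// with the previous field, reuse the previous field's format.
function assignDisplayFormats(fields) {
  const result = [];
  let previous = null;
  for (const field of fields) {
    let entry = findDisplayFormat(field.category, field.hintText);
    if (!entry && previous && previous.category === field.category) {
      entry = previous.entry;
    }
    result.push({ ...field, format: entry });
    previous = { category: field.category, entry };
  }
  return result;
}
```

Applied to the Home Phone and Work Phone example, the first field obtains its format from the parenthesized hint and the second field, which carries no hint of its own, inherits it.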
CHECKOUT FORM DETECTION
The program code object has a flag, which is set whenever a Billing or Shipping Address field or a Credit Card field is identified to indicate the form is a checkout form. Thus, the invention also provides checkout form detection functionality.
MASKED FIELDS
Certain INPUT elements, for example, credit card number fields, are designated as password fields, so as to mask sensitive information from being displayed. The names of these fields are stored in a separate array. To provide added security, the field type can also be changed to password for certain fields. For example, a form may generally allow users to enter the credit card data in visible clear text. The field type can be programmatically changed to password before filling out the card number so that the data shows up as "******" instead of clear text.
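Changing the field type before filling might be sketched as follows. The input is a plain object with the type and value properties of an INPUT element; the array contents and the function name fillField are hypothetical.

```javascript
// Hypothetical array of the metadata names whose values should be masked.
const MASKED_FIELDS = ["CreditCardNumber", "CardVerificationCode"];

// Fill a field, switching a clear-text field to a password field first so
// that the sensitive value is never displayed in clear text.
function fillField(input, metadata, value) {
  if (MASKED_FIELDS.includes(metadata) && input.type === "text") {
    input.type = "password";
  }
  input.value = value;
  return input;
}
```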
After the metadata, block type and display format of each field have been identified, the information is used to generate a form mapping. In the current embodiment of the invention, the form mapping is coded in XML (extensible markup language). However, other page description languages would also be suitable in the practice of the invention.
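The specification does not give the layout of the XML form mapping. Purely as an illustration, the sketch below serializes the three identified properties of each field into hypothetical "form" and "field" elements; the element names, attribute names and function names are all assumptions.

```javascript
// Escape characters that are not allowed in an XML attribute value.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a form mapping from fields of the shape
// { name, metadata, block, format } (all strings).
function buildFormMapping(formName, fields) {
  const lines = [`<form name="${escapeXml(formName)}">`];
  for (const f of fields) {
    lines.push(
      `  <field name="${escapeXml(f.name)}" metadata="${escapeXml(f.metadata)}"` +
        ` block="${escapeXml(f.block)}" format="${escapeXml(f.format)}"/>`
    );
  }
  lines.push("</form>");
  return lines.join("\n");
}
```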
Following mapping, the appropriate user data is retrieved from the user profile data file and formatted as required by the newly generated page mapping, and the fields of the form are populated with the required data.
While the invention has been described with respect to e-commerce applications, such description has been for purposes of illustration only, and is not meant to limit the scope of the invention. The invention finds application with other types of forms as well. Additionally, the invention is also well suited to automating the login process at sites requiring a user login.
Although the invention has been described herein with reference to certain preferred embodiments, one skilled in the art will readily appreciate that other applications may be substituted for those set forth herein without departing from the spirit and scope of the present invention. Accordingly, the invention should only be limited by the Claims included below.