WO2001027849A9

WO2001027849A9 - Electronic shopping management: task models

Info

Publication number: WO2001027849A9
Application number: PCT/US2000/028566
Authority: WO
Inventors: Elizabeth B Charnock; Jesse Burns; Philip Chang; Fabien Gerard Norbert Hertschuh
Original assignee: Troba Inc
Priority date: 1999-10-13
Filing date: 2000-10-13
Publication date: 2002-05-30
Also published as: WO2001027803A3; AU1432601A; WO2001027850A2; WO2001027801A2; AU1333801A; WO2001027849A2; WO2001027849A8; WO2001027803A2; AU1333701A; AU1206501A; WO2001027850A8; WO2001027801A8

Abstract

The present invention relates to a method and system for constructing task models of a web site that are progressively refined, and then analyzing end-user behavior in relation to the web site. The task model can be constructed without human intervention. An ontology/thesaurus and other linguistic analysis, such as stemming, can be applied to the task analysis. The task model can also be constructed with a small amount of data entered by business-knowledgeable users. Data regarding end-user behavior on the web site can then be used to refine the task model. Iterative refinement of the task model increases the fidelity of the results. In addition, such an incremental approach allows business-knowledgeable users to experience almost immediate gratification from getting something to work, while still gaining a benefit. The method and system comprehensively detect and quantify specific usability problems on e-commerce sites, including the generation of suggestions for repair or improvement. Cost-specific analysis can be performed. For example, calculations indicating how much any specific design flaw costs the web-site creators can be performed. In addition, a web-site design is evaluated to see how well it is aligned with the business objectives specified for the web site. An actual task graph, an ideal task graph, and an empirical task graph for a web site can be compared. In addition, comparative benchmarking of related web sites can be performed.

Description

ELECTRONIC SHOPPING MANAGEMENT: TASK MODELS

RELATED APPLICATIONS

This application claims the benefit, under 35 U.S.C. §119(e), of Provisional Application Number 60/159,226, entitled "Method and Apparatus for Electronic Shopping Management" filed on October 13, 1999, by Elizabeth B. Charnock, Loki Der Quaeler, Philip Chang, Jesse Burns, Mishkin Berteig, and Curtis Thompson, and incorporates that application by reference in its entirety.

This application also claims the benefit, under 35 U.S.C. §119(e), of Provisional Application Number 60/201,183, entitled "Electronic Shopping Management" filed on May 2, 2000, by Elizabeth B. Charnock, Loki Der Quaeler, Philip Chang, Jesse Burns, Mishkin Berteig, Curtis Thompson, and Mark Yu-Hung Lin, and incorporates that application by reference in its entirety.

This application is related to copending U.S. application serial number , entitled "Electronic Shopping Management: User States" filed on October 13, 2000, by Elizabeth Charnock, Loki Der Quaeler, Philip Chang, Jesse Burns, Curtis Thompson, Michel Henri Guzy, Fabien Gerard Norbert Hertschuh, and Wenxin Mao, which is incorporated by reference herein in its entirety.

This application is related to copending U.S. application serial number , entitled "Electronic Shopping Management: User Interface" filed on October 13, 2000, by Elizabeth Charnock, Loki Der Quaeler, Mark Yu-hung Lin, and Curtis Thompson, which is incorporated by reference herein in its entirety.

This application is related to copending U.S. application serial number , entitled "Electronic Shopping Management: Intervention" filed on October 13, 2000, by Elizabeth Charnock and Philip Chang, which is incorporated by reference herein in its entirety.

BACKGROUND OF THE INVENTION

TECHNICAL FIELD

The present invention relates generally to web-site management, and more particularly, to analyzing customer behavior at electronic commerce web-sites.

BACKGROUND OF THE INVENTION

The e-commerce industry is growing at an astounding rate. More and more companies are creating web-based stores in which customers can shop. One of the advantages of e-commerce, which the e-commerce industry counts on, is the ease with which customers can shop on-line, instead of physically having to go to conventional stores.

However, despite the numerous advantages that e-commerce businesses have over conventional stores, e-commerce businesses also face some problems that are unique to them. Most e-commerce sites are very poorly designed, and thus frustrate customers trying to perform even simple tasks. Further, conventional stores change their layout very rarely, and repeat customers know the layout and can complete their browsing and buying rapidly and efficiently. In contrast, the majority of commercial web-sites change extremely frequently. Each time a site changes, the customers face the barrier of a new learning curve before they can complete a familiar task.

Moreover, floor managers in conventional stores can physically observe traffic flow and shopping trends. They can see, for example, the locations in the store where customers repeatedly get frustrated, or make a note of items which customers recurrently find hard to locate. Also, since conventional stores change their layout rarely, floor managers in these stores can become intimately acquainted with these trends. Such observations can form the basis for changes in the design of the stores in order to enhance customer satisfaction level. This kind of information, however, has not been available for e-commerce businesses. Thus e-commerce executives have very little understanding of how the design of their web site affects the intents, goals, and moods of their potential customers. It is very unlikely that potential customers of poorly designed web-sites will be able to easily identify, much less complete, their intended tasks. Further, even with a well-designed site, e-commerce executives may still be striving to understand how to make the site more successful.

In the increasingly competitive world of e-commerce, only those sites that are highly task-oriented will succeed. A task-oriented site is one which closely structures its content around different tasks that are of significant value to its customers. Not all tasks are, or can be, of equal importance to the merchant or to their customers. Some tasks are truly mission critical; without a certain percentage of users successfully completing them, the site can't exist. Others are more opportunistic or ancillary.

The two most critical factors in good site design are a) ensuring that the entries (e.g. links) to the really critical tasks stand out so that users can very easily and quickly pick them out, and b) ensuring that the number of steps in each task is kept as small as is reasonably feasible. Various studies show that, on average, 50% of users are lost for each extra click that they must traverse. This latter point renders many seemingly benign designs unsuccessful from a business point of view, as they discard a large chunk of their audience at each step. If some unnecessary clicks are removed from the path, the effective browser -> buyer conversion rate could double, triple, or more, thus significantly impacting the profitability of the site.
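By way of illustration only, the arithmetic behind this claim can be sketched as follows. The sketch assumes that every extra click retains the same fixed fraction of users (0.5, taken from the average cited above); the function name `remaining_fraction` and the click counts are hypothetical and are not part of the described system.

```python
def remaining_fraction(extra_clicks: int, retention_per_click: float = 0.5) -> float:
    """Fraction of users still on the path after `extra_clicks` clicks.

    Assumption: each click independently retains the same fraction of users.
    """
    return retention_per_click ** extra_clicks


# Hypothetical path shortened from five clicks to three.
before = remaining_fraction(5)   # 0.5 ** 5
after = remaining_fraction(3)    # 0.5 ** 3
improvement = after / before     # ratio of users reaching the end of the path
```

Under this assumption, removing two clicks multiplies the fraction of users who reach the end of the path by four.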

Accordingly, a system and method is needed, which determines which tasks may be performed at a web-site, and presents each task in an efficient manner in keeping with customer needs. Further, there exists a need for a system and method which would refine both its assessment of the task model of the web-site, as well as its suggestions for refinement to the web-site, based on data regarding use of the web-site by customers.

SUMMARY OF THE INVENTION

The present invention provides a method and system for continually monitoring web-site activity. This monitoring is performed while taking into consideration the goals (or "tasks") that end-users (or "customers") can perform when they visit a web-site. This notion of tasks allows for definitive and objective measurement and improvement of the ease of use and/or user-friendliness of a web-site.

A system in accordance with the present invention constructs a task model for the web site. In one embodiment, the task model is constructed by first performing a blind analysis of the web-site. Thus, automated task detection can be performed without any human input.

In one embodiment, a business knowledgeable user then reviews this preliminary task model, and provides information to refine the task model to represent the web-site accurately in accordance with the business model. The task model is then revised accordingly, to create a more accurate task model. In one embodiment, a system in accordance with the present invention uses and heavily leverages a relatively small amount of human entered information to build a task model that is good enough for analytical purposes.

In one embodiment, end-user behavior is observed. As more and more customer data is gathered over time, the task model is iteratively updated. Numerous passes of such refinement increase the fidelity of the results. In addition, the above discussed incremental approach allows business knowledgeable users to experience some almost immediate gratification of getting something to work, while still gaining a benefit.

Further, a system in accordance with the present invention comprehensively detects and quantifies specific usability problems on e-commerce sites, including the generation of suggestions for repair or improvement. This is done, in one embodiment, by both analyzing the site statically, and then analyzing end-user behavior in relation to it.

In addition, a system in accordance with the present invention has the ability to test a site design to see how well it is aligned with its stated business objectives.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is an illustration of a system in accordance with the present invention.

Fig. 2 illustrates the different kinds of constructors that can create task models.

Fig. 3 illustrates a flowchart outlining the various steps performed by the blind constructor.

Fig. 4 illustrates a flowchart outlining the various steps performed by the sighted constructor.

Figs. 5A-C are screenshots of a web application that can be used by a business-knowledgeable user to provide information.

Fig. 6 illustrates a flowchart outlining the various steps performed by the observed constructor.

Fig. 7 illustrates a sample task graph.

DETAILED DESCRIPTION OF THE INVENTION

Embodiments of the present invention are now described with reference to figures where like reference numbers indicate identical or functionally similar elements.

Fig. 1 illustrates a system in accordance with an embodiment of the present invention. As shown in Fig. 1, this system comprises a task model creator 110, a site representative profiler 120, an end-user profiler 130, a site profiler 140, a performance analyzer 150, and a report generator 160. Each of these modules is discussed in detail below.

A. Task Model Creator 110

A system in accordance with one embodiment of the present invention creates one or more task models for the user's web site. This involves identifying various tasks on the web site, associating the steps of the various tasks with different pages on the web site, ranking the tasks in their order of importance, and so on. A page is an HTML, XML, WML, or similarly structured document designed to be fetched from an HTTP server. A page may consist of links, forms, and/or other such elements. In one embodiment, an end-user can define as many different task models for a site as she likes. However, each task model will generate a separate set of reports that is uniquely associated with it.

Tasks and Task Steps:

A task is a real-world user goal. It can be thought of as the reason that the end-user came to a specific web-site in the first place. Some examples of tasks are "Book a Plane Ticket," "Buy a Concert Ticket," and "Register for Free Email." In some embodiments of the present invention, tasks must contain two or more steps, in order for successful completion of the task to be measurable, and to be distinguishable from the unsuccessful case. A task step is not an end-user goal itself, but is something that must be done in order to complete the task. Some examples of task steps are: "Provide Credit Information," "Enter Contact Info," and "Verify Purchase Order."

Further, it is to be noted that some tasks ("supertasks") may contain other tasks ("subtasks"). An example of this would be the supertask "Plan a Vacation" containing subtasks such as "Reserve a Hotel Room," "Rent a Car," and so on.

Understanding what a task is, and the various attributes and relationships that tasks can have, will facilitate understanding of how a task model is created. Some of the relationships between tasks include:

• Precursor (task A is usually performed before task B)

• Post-cursor (the reverse)

• Correlated to (task A and task B tend to be performed together, but are order independent of one another)

• Containment (task A is contained by task B)
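By way of illustration only, the four relationships listed above could be recorded as typed edges between task names. This is a minimal sketch; the names `Relation`, `TaskRelation`, and `related_tasks` are hypothetical and do not appear in the described system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Set


class Relation(Enum):
    PRECURSOR = "precursor"        # source is usually performed before target
    POST_CURSOR = "post-cursor"    # source is usually performed after target
    CORRELATED = "correlated"      # performed together, order independent
    CONTAINED_BY = "contained-by"  # source is contained by target


@dataclass(frozen=True)
class TaskRelation:
    source: str
    relation: Relation
    target: str


def related_tasks(task: str, relations: Iterable[TaskRelation]) -> Set[str]:
    """Return every task that shares any relationship with `task`."""
    found = set()
    for r in relations:
        if r.source == task:
            found.add(r.target)
        elif r.target == task:
            found.add(r.source)
    return found


relations = [
    TaskRelation("Reserve a Hotel Room", Relation.CONTAINED_BY, "Plan a Vacation"),
    TaskRelation("Rent a Car", Relation.CONTAINED_BY, "Plan a Vacation"),
    TaskRelation("Reserve a Hotel Room", Relation.CORRELATED, "Rent a Car"),
]
```

Here `related_tasks("Reserve a Hotel Room", relations)` would return the supertask and the correlated subtask.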

Tasks having any of the above relationships are often referred to as "related tasks."

One of the important attributes of a task is its dollar or importance value. This value is calculated from a number of different raw inputs from a business knowledgeable user. These include a) the direct revenue amount realized by the merchant on a successful completion for each distinct type of end-user, b) similarly, the indirect revenue, and c) a business importance.

Direct Revenue: The amount of total money or profit (the user's choice) that the business will receive as a direct consequence of a transaction.

Indirect Revenue: The amount of total money or profit that the business will receive from indirect sources such as advertising referrals, if a task is completed successfully. Indirect revenue could also include money saved. For example, a site might save $30 for each customer who was successfully serviced by web-based customer support instead of the traditional customer call center.

Business Importance: In one embodiment, this has one of three possible values: a) business critical, b) important, or c) opportunistic. As discussed below, this attribute is provided to both a) allow business knowledgeable users to specify a relative priority between tasks for which they do not have exact revenue data, and b) indicate that one task is of greater importance than another task even if the revenues of the two tasks are equivalent. This latter case might occur, for example, if the profit realized from selling two products was the same, but the sale of one of the products was considered to be very important in attracting a critical new type of customer, while the other was not.

The two revenue numbers and the importance level are, in one embodiment, combined to create one number. In one embodiment, the combination occurs as follows.

1. a. At least one of the three fields is required to be filled in, and it is only required for one end-user type. (If no data is explicitly provided for any additional end-user type, the data provided is assumed to hold true for all end-user types.)
   b. Multiply the indirect value by 0.5 and add the result to the direct value.
   c. If an importance level is specified:
      i. If only importance levels are specified throughout, these importance levels determine the ordering.
      ii. If only an importance level is defined in some cases, but revenue numbers are specified for others, then the importance levels are fitted to a Gaussian curve and converted to the appropriate revenue amount.
      iii. If both the importance level and revenue numbers are defined, it is ensured, in one embodiment, that the business knowledgeable user understands that an importance level should only be provided alongside revenue numbers to weight these numbers differently. This requirement serves to prevent business knowledgeable users from making up phony revenue numbers in order to get the outcome they want. The same curve fitting discussed above is performed.

2. In one embodiment, data that differs across different types of end-users:
   a. Is averaged, if all end-user types are equally important.
   b. Assumes the value of that for the "business critical" end-user, if there is one in this task. In one embodiment, if a task is optional for a particular type of end-user, that end-user type is not considered business critical for that task. If it is optional for all end-users, it is less important than any other tasks that are deemed business critical, and is less important than any task with the same values other than optional. A high score on need or a low score on knowledge causes the user to not be considered business critical on that particular task, even if they are generally considered so for other tasks, or for the site as a whole. Otherwise, a weighted average of the different types of end-users is used.

3. For supertasks, the amounts for the various subtasks are summed. However, the business knowledgeable user may decide to add additional value to the completion of the supertask beyond just the sum of its parts. It is to be noted that a similar calculation may be performed for a task, based on the value assigned to its steps. In one embodiment, the value of the whole cannot be less than the sum of its parts.
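By way of illustration only, the parts of the combination described above that are stated numerically can be sketched as follows. The sketch covers the 0.5 weighting of indirect revenue, the handling of values that differ across end-user types, and the supertask sum. The Gaussian curve fitting of importance levels is not shown, and all function names are hypothetical.

```python
from statistics import mean
from typing import Dict, Iterable, Optional


def combined_revenue(direct: float, indirect: float) -> float:
    """Direct revenue plus half of the indirect revenue."""
    return direct + 0.5 * indirect


def value_across_user_types(per_type: Dict[str, float],
                            critical_type: Optional[str] = None) -> float:
    """Use the business-critical end-user type's value if one is named,
    otherwise average over equally important end-user types."""
    if critical_type is not None and critical_type in per_type:
        return per_type[critical_type]
    return mean(list(per_type.values()))


def supertask_value(subtask_values: Iterable[float], extra: float = 0.0) -> float:
    """Sum of the subtask values plus any additional value assigned to the
    supertask. A negative `extra` is ignored so that the whole is never
    worth less than the sum of its parts."""
    return sum(subtask_values) + max(extra, 0.0)
```

For example, a task with $100 of direct revenue and $30 of indirect revenue would receive a combined value of 115 under this sketch.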

In one embodiment, if a business knowledgeable user specifies none of this information, the system will be unable to provide any information as regards a) pre-emption, b) interruptability, or c) ordering constraints.

If the business knowledgeable user specifies some or all of the information described above and decides that she does not like the relative prioritizations between tasks that emerge as a result, she can, in one embodiment, forcibly rank the tasks on a scale from 1 to N, where N is the total number of tasks that are currently defined in that task model. Ties may be permitted. In one embodiment, if forced ranking is done, either all tasks must be forced ranked, or only the M most important ones. Further, in one embodiment, any task that received a forced ranking outranks a task that did not. Forced ranking is not recommended however, since as new tasks are added, they may also have to be forced ranked in order to accurately determine their priority relative to existing tasks that have already been forced ranked. The business importance value provides a means for the business-knowledgeable user to specify the relative priorities of tasks if she does not know what exact numbers to use for revenue. In addition, in one embodiment, there is a means of forcing one task to be more important than another after the system has calculated the importance from inputs a) - c). This is needed because the end result of the importance calculation based on these initial inputs may not exactly conform to the business-knowledgeable user's wishes. This is discussed in detail later in this document.
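By way of illustration only, the rule that any forced-ranked task outranks a task without a forced rank can be sketched as follows. The function name `order_tasks` is hypothetical, and breaking ties alphabetically is an assumption made only to keep the output deterministic.

```python
from typing import Dict, List


def order_tasks(values: Dict[str, float], forced: Dict[str, int]) -> List[str]:
    """Order tasks from most to least important.

    `values` maps task name to calculated importance value.
    `forced` maps task name to a forced rank (1 is most important).
    """
    # Forced-ranked tasks come first, in rank order.
    ranked = sorted(forced, key=lambda t: (forced[t], t))
    # Remaining tasks follow, ordered by calculated value (highest first).
    unranked = sorted((t for t in values if t not in forced),
                      key=lambda t: (-values[t], t))
    return ranked + unranked
```

With calculated values of 10, 50, and 30 for tasks A, B, and C, and a forced rank of 1 for A, this sketch places A ahead of B and C even though its calculated value is the lowest.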

Some other task attributes are as follows:

• Task Steps

The steps that are contained in the task, specifically those that must or may be completed by the user as part of the process of successfully accomplishing the task.

Task steps may be shared with other tasks. Task steps have their own properties, which are listed below.

• Ordering Information

This pertains to whether or not the steps must be done in a particular order, are preferred to be done in a particular order, can equally well be done in any order, or whether a mandatory or preferred partial ordering exists. In the case of partial orderings, the properties of individual task steps are needed to derive the valid orders.

• Pre-emptability

Visual pre-emptability deals with how visually prominent the link(s) to a particular task should be on any page where the link(s) appears, relative to other tasks. In one embodiment, there are three possibilities: a) links to the task can be visually pre-empted by any other task, b) links to the task can be visually pre-empted, but only by tasks specifically selected by the business knowledgeable user, or c) links to the task must always be the most visually prominent on a page where they appear. In another embodiment of the invention, task specific pre-emptability can be specified by the business-knowledgeable user. For example, the task "buy chickens" might be most important when the user is performing the task of "purchase poultry," but might not be important at all while the user is performing the task "research fertilizer."

An example of visual pre-emptability is as follows. If the most visually prominent link on a home page is a large image of a beach in Jamaica that links to a sweepstakes promotion, the task "Enter Sweepstakes" ought to be the most important task on the site. If it is not, business priorities are not being honored in the site design, and an error will be flagged by sighted constructor 220. In one embodiment, one aspect of pre-emptability is language dependent. For English speakers, and for European languages, textual pre-emptability is from left to right and top to bottom. This implies, for example, that a more important link should appear to the left of a less important one. However a different set of ordering rules would be needed for Japanese speakers, or for the Arab world for instance.

In one embodiment, a link that is in a font that is larger, of greater contrast with its background color, is bold, italicized, or of a different font type than neighboring links, will not be counted in order-dependent analysis. Larger image links pre-empt smaller ones that appear to be similarly positioned on a page. Higher contrast images pre-empt lower contrast images that are of similar size and relative position. Blinking or moving images pre-empt everything else that is likely to be visible.
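By way of illustration only, the order-dependent part of this analysis can be sketched for a left-to-right, top-to-bottom language. The sketch assumes that each link has already been assigned a row and column position, an importance value, and a flag indicating whether its font styling sets it apart; the names `Link` and `order_violations` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Link:
    task: str
    importance: float
    row: int              # top-to-bottom position on the page
    col: int              # left-to-right position within the row
    styled: bool = False  # larger, bolder, or otherwise distinct font


def order_violations(links: List[Link]) -> List[Tuple[str, str]]:
    """Return (earlier, later) pairs where a less important link precedes a
    more important neighbor in reading order. Styled links are skipped,
    since they are not counted in order-dependent analysis."""
    plain = sorted((l for l in links if not l.styled),
                   key=lambda l: (l.row, l.col))
    return [(a.task, b.task)
            for a, b in zip(plain, plain[1:])
            if a.importance < b.importance]


page = [
    Link("View Help", 1.0, row=0, col=0),
    Link("Buy a Concert Ticket", 10.0, row=0, col=1),
    Link("Enter Sweepstakes", 5.0, row=1, col=0, styled=True),
]
```

A different sort key would be needed for languages with other reading orders.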

It is to be noted that task steps may also be pre-empted. In one embodiment, the rules applied for task step pre-emption are the same as those applied for task pre-emption, with the exception that links to steps that require other steps to have been performed first cannot pre-empt such steps. In one embodiment, tasks that have a higher importance value are said to loosely pre-empt tasks with lower importance values. If the business-knowledgeable user indicates that a lower importance value task can pre-empt a higher dollar value task, this looser relationship will be ignored. However, if the business-knowledgeable user does not specify any formal pre-emption information, a warning will be issued in one embodiment. If a pre-emption relationship was specified, but was not implemented on the site, an error will be flagged. In one embodiment, the message text will note the severity of the problem in terms of its implication (i.e., the difference in importance between the tasks in question).

• Interruptability

Interruptability has to do with whether it is acceptable for links to other non-related tasks to appear within the task. It is possible that such interrupting links may distract end-users from completing the important task at hand. However, in one embodiment, only the most business critical tasks are non-interruptable. If a task is interruptable, it can either be interrupted by any task, or only by those tasks explicitly specified by the user while building the task model. In one embodiment, with respect to importance value, the calculations are performed by the analyzer in the same way as they are for pre-emption. It is to be noted that steps may also be interrupted.
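By way of illustration only, the three interruptability cases can be sketched as a check over the tasks linked from within a task's pages. The function name `disallowed_interruptions` and its parameters are hypothetical.

```python
from typing import Iterable, List, Optional, Set


def disallowed_interruptions(linked_tasks: Iterable[str],
                             related: Set[str],
                             interruptable: bool,
                             allowed: Optional[Set[str]] = None) -> List[str]:
    """Return the linked tasks that should not appear within the task.

    `related` holds the related tasks, which never count as interruptions.
    `allowed` is None when any task may interrupt, or the explicit set of
    tasks permitted to interrupt.
    """
    unrelated = [t for t in linked_tasks if t not in related]
    if not interruptable:
        return unrelated   # no unrelated link is acceptable
    if allowed is None:
        return []          # any task may interrupt
    return [t for t in unrelated if t not in allowed]
```

For a non-interruptable checkout task, for instance, every link to an unrelated task would be reported.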

• Promotion Attributes

A task may be a promotion, which means that it has both a fixed start and end date. In one embodiment, there are two types of promotions: advertisements (which point outside of the current site), and site-internal promotions.

• Business Objective

In one embodiment, a task may be related to one or more high level business objectives for purposes of convenient reporting to the merchant.

• Departmental Ownership

A task may be associated with one or more departments, in order to be able to automatically send out via email, or display information for, only those tasks that a particular individual or department has ownership of.

• Related Usability Metrics & Goals

Certain usability metrics, such as time spent, may be appropriate for some tasks, but not others. Therefore, in one embodiment the relevant metrics are specified by the business-knowledgeable user while creating the task model.

• Related End-User Profiles

Not every task is performed by every type of end-user. However, in one embodiment, unless a list of possible end-users is provided explicitly by the business-knowledgeable user, a task is presumed to be applicable to all end-user types. In one embodiment, it is possible for a business-knowledgeable user to specify whether or not a task is even visible to certain end-user types.

• Task type

This attribute allows the name of a task template to be compared to a specific implementation of the task. Not all tasks will have such a template that is relevant. A business knowledgeable user might want to do specific comparisons between two tasks on his site(s), or between his site and some other arbitrary site. Allowing a type to be optionally specified allows all tasks of the same designated type to be compared conveniently in reports.

In addition to the attributes described above, tasks may also have some management-related attributes such as a textual description, a version number, a creation date, a last edit date, and a creator. In some embodiments, the task may have an image to represent it in various visualizations. In addition, a task may have one or more forms associated with it that require a set of valid values to be entered in order for the sighted evaluator to navigate through it. The business-knowledgeable user's ability to provide such data for forms navigation, including fake credit card information, is one of the advantages that the sighted constructor 220 has over the blind constructor 210. It is to be noted that in some embodiments forms are associated with one or more individual steps. However, in one embodiment, from a user interface usability standpoint, forms are most conveniently displayed to end-users as a property of tasks as well as of individual steps. This facilitates having values specified only once for a form that appears in several possible steps in a task.

Task steps, like tasks, have different properties that impact the analysis performed by the sighted constructor 220. Some of these different properties and their consequences are as follows.

Specifier: A specifier is a regular expression or other pattern specifier that indicates at least one, and possibly a very large number of, pages that are considered to belong to that step. An example of the single page scenario is a page that allows the user to enter credit card data. An example of a case where a very large number of pages might constitute a step is browsing any mystery novel at Amazon.com, which might correspond to tens of thousands of dynamically generated pages.
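By way of illustration only, a specifier expressed as a regular expression could be matched against requested page paths as sketched below. The step names, URL paths, and the function `steps_for_page` are hypothetical and are chosen only to mirror the single-page and many-page scenarios described above.

```python
import re
from typing import List

# Hypothetical specifiers: one matches a single page, the other matches
# an open-ended family of dynamically generated pages.
STEP_SPECIFIERS = {
    "Enter Credit Card Data": re.compile(r"^/checkout/payment$"),
    "Browse Mystery Novels": re.compile(r"^/books/mystery/\d+$"),
}


def steps_for_page(path: str) -> List[str]:
    """Return the names of all steps whose specifier matches `path`."""
    return [name for name, pattern in STEP_SPECIFIERS.items()
            if pattern.search(path)]
```

Because a page may match more than one specifier, the sketch returns a list, which is consistent with task steps being shared among tasks.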

Optional: This means that the step need never be performed in order for any user to successfully complete the task. Whether or not it is performed is not taken into account in analysis. In addition, in one embodiment, optional steps are not considered in the step count of a task.

Optional for: This means that the step need not be performed for (only) certain types of end-users. Whether or not it is performed by types of end-users for whom it is not relevant is not taken into account in analysis in one embodiment.

Possible Only For: This means that the step will not even be made available to some profiles of users.

Exists: This attribute is used to indicate that the task being specified has not yet been implemented on the site, but the business-knowledgeable user would nevertheless like to include it in the analysis of the web-site. In one embodiment, the default value of this attribute is: "yes."

Step Type: This information is used because some well known step types such as "error" or "help" impact mood state calculations. Other types of steps such as "registration" may result in one or more particular static images being added to the room in the Cam, and may impact the semantics of the motion of the characters. For example, in a "registration" step, users head over to a table to fill out the needed form. For further information regarding this form, please refer to copending application number entitled "Electronic Shopping Management: User Interface" which is hereby incorporated by reference herein.

Help: The step is an optional step whose function is to provide help to the end-user. In one embodiment, such step accesses are counted as if they were not part of the current task for purposes of determining a confused state. Frequent accesses of these pages will cause the sighted constructor 220 to flag a "missing content" warning in one embodiment. This warning suggests that some or all of the content on the help page should be placed inline, in pages that lie along the main path to completing the task. This attribute is mutually exclusive with other step types.

Error: The step only arises once an error has been committed. Such a step can be associated with one or more tasks or steps. Accesses of these steps will lead to confusion. Repeatedly stumbling on the same error page may anger an end-user, regardless of its task affiliation. This attribute is mutually exclusive with other step types.

Registration/Application: This designation is used to place tables in rooms that animated characters in the user interface can use to fill out the needed forms. For details regarding the user interface, please refer to copending application number entitled "Electronic Shopping Management: User Interface" which is hereby incorporated by reference herein.

Browsing/Shopping: This means that a longer time spent by an end-user is not necessarily a bad thing; reports will therefore treat an increase in the amount of time spent on this step as being of neutral value, as opposed to negative value as in the general case. In one embodiment, this also means that repeated accesses of pages in these steps, or cycling between them, will not flag that the end-user is confused, except after a very large number of cycles. Further, in one embodiment, a task that consists solely of browsing steps need not have a blessed egress defined. By default it will be considered complete if each step in it has been visited.

Research: This is similar to browsing, in that time spent by the end-user is considered not to be a value-bearing metric. Thus, in one embodiment, if specified with browsing, time spent will not be counted as a metric.

Blessed Egress: This means that arrival at this step signals successful completion of a task. This is fundamental to how calculations of task completion are performed. End-users who reach this step are considered to have completed the task successfully. It is to be noted that a task may have more than one blessed egress, if there is more than one distinct means of successfully completing the task. In one embodiment, any errors previously generated by the blind constructor 210 with respect to a missing "next step" are removed. In one embodiment, a blessed egress is potentially identified if words such as "visit," "thank," and "again" are observed on a page after passing it through a stemmer.
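The cue-word heuristic for spotting a potential blessed egress can be illustrated with a minimal sketch. The suffix-stripping function below is only a crude stand-in for a real stemmer, and the names `EGRESS_CUES`, `naive_stem`, and the `min_cues` threshold are assumptions made for illustration; the text does not specify how many cue words must be observed.

```python
import re

# Cue stems named in the text; the set could be extended.
EGRESS_CUES = {"visit", "thank", "again"}

def naive_stem(word):
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def egress_cues_found(page_text):
    """Return the cue stems observed in the page text."""
    stems = {naive_stem(w) for w in re.findall(r"[A-Za-z]+", page_text)}
    return EGRESS_CUES & stems

def looks_like_blessed_egress(page_text, min_cues=2):
    """Flag a page as a potential blessed egress (min_cues is hypothetical)."""
    return len(egress_cues_found(page_text)) >= min_cues
```

A page flagged this way is only a candidate; under the approach described, the sighted constructor 220 or a business-knowledgeable user would still confirm it.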

For a further discussion of browsing/shopping, and confused and angry states of end-users, please refer to copending application number entitled "Electronic Shopping Management: User States" which is hereby incorporated by reference herein.

Exists: This attribute is used to indicate that the task step being specified has not yet been implemented on the site, but the business-knowledgeable user would nevertheless like to include it in the analysis of the web-site. In one embodiment, the default value of this attribute is "yes."

Step Group Order: In some cases, there are several possible valid alternative steps. Completing any one of these steps is sufficient to move on successfully to the next step. Such steps can be said to belong to the same step group. In one embodiment, there is also the notion of step group 0, which is a step that can be performed at any time in the course of performing the task.

Description: An optional textual description that may be filled in by the business-knowledgeable user. This description will be used to clarify the meaning of the step in reports.

Promotion: Like a task, a task step may also be a promotion, which means that it only exists for a predefined, bounded period of time. In one embodiment, this data is used both to bound data appropriately for convenient access in reports, and to automatically provide a before and after contrast of end-user behavior.

In addition, like tasks, steps can also have relationships with other steps (e.g. precursor, post-cursor, etc.). Also like tasks, steps have various management related attributes such as a description, and may have an importance value. In one embodiment, a step may have one or more images associated with it that can be used to identify it in various visualizations.

In one embodiment of the present invention, task models can be created at several different levels of refinement. Fig. 2 illustrates these different levels of task model creation in accordance with one embodiment. These comprise the blind constructor 210, the sighted constructor 220, and the observed or empirical constructor 230.

Different Kinds of Constructors:

The lower levels of refinement provide the user with some immediate gratification, while the higher levels of refinement provide the user with a more detailed and accurate modeling of their web site. In addition, as illustrated in Fig. 2, these different modules can provide information to each other, so that the lower levels of refinement of the task model can be corrected or improved with information from the more refined levels. Further, this feedback of information is iterative. For instance, as the observed constructor 230 gathers more customer behavior data over time, the empirical dimension of the task model is updated continually.

1. Blind Constructor 210

The blind constructor 210 can create a task model of a web-site at a low level of refinement. The blind constructor 210 performs a fast, lightweight analysis of any e-commerce site for generic usability problems. This type of analysis does not require any human-provided knowledge of the tasks on the site or their relative priorities. Such analysis is, however, necessarily less accurate as a result. Any such inaccuracies can be corrected by the sighted constructor 220 and the observed constructor 230, once the task analysis information for the site has been provided by a human user.

For example, one thing the blind constructor 210 can detect is a combination of text color and background color that could cause problems for people who suffer from various kinds of colorblindness. Such a state of affairs is almost never deliberate, and is almost always problematic. A rare exception, where such a situation may be deliberate, might be a site that provides tests for colorblindness. A list of some problems that can be detected by the blind constructor 210, and their suggested solutions, is provided below.

It is to be noted that determining what tasks a web site contains is not a completely computable problem. This is because in general, the exact partitioning of a site into its different constituent tasks is closely related to the business objectives and priorities of the site creators. Further, an end-user may come to a site to perform a task that the site creators did not intend, or are not aware of. For example, he may wish to perform several tasks consecutively - effectively performing a supertask - that most end-users would not, and that the site designer did not intend. Thus the tasks identified by the blind constructor 210 only provide the site creator with a crude and low-fidelity identification of tasks. These initially identified tasks may be refined with the help of the sighted 220 and observed 230 constructors, as described below.

In one embodiment, the initial creation of the task model is performed by such a blind constructor 210. During this stage, the constructor is "blind" in that it has absolutely no knowledge of the site beyond the publicly available HTML of the web site.

Fig. 3 illustrates a flowchart outlining the various steps performed by the blind constructor 210. These include spidering 310 the web-site, assembling 320 a preliminary task graph, and assigning 330 a certainty factor to the task graph. The blind constructor 210 spiders 310 the site in order to create an index so that key aspects of the individual pages' content can be analyzed as part of a broader whole, so that an initial pass of task identification may be performed. The blind constructor 210 performs the initial detection of tasks based on analyzing all, or in some cases a subset of, the links on each page of the user's web site. (It is to be noted that in some embodiments, the blind constructor 210 will not be able to bypass certain kinds of forms for which very specific information is required, such as a user password or credit card number. It may also initially be set to limit itself to detecting only a certain level of links, for example only one link away from the home page.)
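The depth-limited spidering step can be sketched as a breadth-first traversal. This is only an illustration under stated assumptions: the site is represented by an in-memory dictionary `links` (page to list of linked pages) rather than by live HTTP fetches, and the function name `spider` is hypothetical.

```python
from collections import deque

def spider(links, home, max_depth=1):
    """Breadth-first crawl over `links` (page -> list of linked pages).

    Returns a dict mapping each reached page to its depth from `home`.
    A page is recorded only the first time it is encountered.
    """
    depth = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        if depth[page] == max_depth:
            continue  # do not follow links past the configured depth
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth
```

Raising `max_depth` on a later pass corresponds to pursuing links beyond the initially configured level.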

In another embodiment, the blind constructor 210 performs the initial detection of tasks based on analyzing the forms on the user's web site. One place on a page with a high probability of being a task is in a form. In one embodiment, for each form that is detected on the page, an examination is made of the words around the form, as they may have a high probability of being a task. For words highlighted in bold or italic, the probability of the words identifying a task is increased. The blind constructor 210 also looks, in one embodiment, for noun verb phrases.

For forms that have a list of values, the blind constructor 210 may look at each of the values of the form, as these might be options to a single task, or each value may represent an entirely new task. In some cases, a site designer may choose to put different tasks in the form elements, such as "buy a book", "buy a record", "search for airplane tickets". In other cases, the web site designer may choose to put different options to a single task, such as a list of countries for an address. The list of countries represents options for a single task. In one embodiment, a heuristic employed for determining whether the options in a form are of a single task or multiple tasks is to look at the regularity and parts of speech of the options, as is done elsewhere. For example, are they all noun verb phrases, and hence likely a task? Or are they just nouns, or do they have capitalization that suggests proper names (such as a list of states), in which case they might be options for one task?

Another way to determine whether the values in a form are just options to a single task or multiple tasks is to take the first option on the form and do an HTTP POST. In one embodiment, the document object model of the resulting page is examined. The document object model is a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of documents.
By examining the document object model, the blind constructor 210 can determine the structure of the page, such as what elements are contained inside of other elements and where they are relative to each other in a page. If the document object models of the pages are similar, then the pages have similar structure. This would imply that the form values might be options to a single task, since different tasks would tend to generate different types of pages, resulting in a different document object model. A backtrack to the previous page with the form on it is then performed. The second option is then taken, and an HTTP POST is performed. The document object models of the resulting pages are then compared. If the pages are similar, then there is a high likelihood of them being the same task. If they are very different, then they are labeled as different tasks in one embodiment. Various other parts of the document object model (such as navigation bars) may be looked at for potential tasks.

In one embodiment of the present invention, tasks and their initial links are identified by phrases that have one or more of the following constructions: Verb noun; Verb verb; Noun verb; (Implicit verb) noun; Verb (implicit noun); Verb (article) noun; and/or any of the previous constructions with no more than one trailing preposition phrase or relative clause. "Implicit" in this case means that an ellipsis can be created within the same set of HTML elements, for example within the same table. For example, a set of radio buttons titled "Buy:", with each of the buttons labeled with a noun such as "books", could be readily understood as the task phrase "buy books."
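The comparison of page structure after posting different form options can be approximated as follows. This minimal sketch reduces each page to its flat sequence of opening tags, which is only a rough proxy for a full document object model (nesting is ignored), and compares sequences with the standard library's `difflib`. The `threshold` value and the function names are assumptions; the text does not state how similarity is scored.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Records the sequence of opening tags, ignoring text content."""
    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

def tag_sequence(html):
    collector = TagCollector()
    collector.feed(html)
    return collector.tags

def structure_similarity(html_a, html_b):
    """Ratio in [0, 1]; 1.0 means identical tag sequences."""
    return SequenceMatcher(None, tag_sequence(html_a), tag_sequence(html_b)).ratio()

def same_task(html_a, html_b, threshold=0.8):
    """Hypothetical rule: similar structure suggests options of one task."""
    return structure_similarity(html_a, html_b) >= threshold
```

Two result pages that differ only in their text (for example, address forms for two countries) score 1.0, while a search form and a results table score much lower.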

In one embodiment, the blind constructor 210 operates a commercial spider modified by the system. The spider has been modified by the addition of code to increase its functionality, allowing it to traverse pages that other spiders cannot read. The modified spider reads a page and decodes the HTML tags in the page, which allows it to perform the following analyses of the links it detects:

a) analyze font type and characteristics such as color, boldness, etc. (the blind constructor looks at the HTML tags that determine font type);
b) analyze the position of the link relative to other links (the blind constructor looks at the structure of the document to determine the layout of the links);
c) consider HTML tables of certain dimensions to see if they might be navigation bars (the blind constructor parses the HTML text looking for table and table element tags and counts them);
d) perform and record counts of how many links are on a page (the blind constructor parses the page and counts the number of link tags found);
e) perform and record counts of how many links lead to any particular destination;
f) pass link names to a stemmer in order to perform various kinds of linguistic analysis;
g) detect high contrast images, and note the visual prominence of their location (the blind constructor 210 examines the color values of gif and jpeg images for bright and dark color values);
h) detect blinking or otherwise moving images, and note the position of their location;
i) store an index of the non-navigational text content of the page, including alt text ("alternate text" for images that appears in mouse-overs amongst other usages) for subsequent lexical analysis.

Some of these are described in more detail below.

In one embodiment, the blind constructor 210 recognizes tasks by performing linguistic analysis.
In some embodiments of the present invention, stemmer technology is used in order to perform certain types of lexical analysis on link names in order to determine how the link text or name is impacting the usability of the site. In order to perform such analysis, it is important for the analyzer to be able to understand that the names of links such as 'Book Buying,' 'Buy Books', 'Buy a Book,' 'Book Buys' etc. are all, from an end-user perception point of view, the same name, and therefore should lead to the same destination page.
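The idea that differently worded link names reduce to the same set of stems can be illustrated with a short sketch. The suffix stripper below is a deliberately naive stand-in for real stemmer technology, and `STOP_WORDS`, `naive_stem`, and `stem_key` are hypothetical names introduced only for this example.

```python
import re

STOP_WORDS = {"a", "an", "the"}

def naive_stem(word):
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    word = word.lower()
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def stem_key(link_name):
    """Order-independent set of stems, with stop words removed."""
    words = re.findall(r"[A-Za-z]+", link_name)
    return frozenset(naive_stem(w) for w in words if w.lower() not in STOP_WORDS)
```

Under this sketch, the four example names from the text all map to the same key, so an analyzer can treat them as one name from the end-user's point of view.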

A stemmer works by identifying the stems of words, so that these stems may be used for the purposes of name comparison. Part-of-speech identification is performed by a "tagger" once the phrase has been tokenized (i.e. split into different pieces) by a "tokenizer." It is to be noted that the least ambiguous link names are often those that are closest to a well-formed phrase, specifically a verb noun phrase. For example, 'Plan a Vacation' is a verb noun phrase. It connotes a clear, user understandable goal. In contrast, a link that just contained the word 'Computers' is, apart from its context, inherently ambiguous. It does not inform the end-user of what it is exactly that can be done with computers by clicking on the link. For example, it is unclear to the end-user whether, after clicking on the link, she will be able to buy a computer, sell one, research or price one, or all of these.

Several types of analyzers use the stemmer, and the related tagger and tokenizer. One analyzer performs a check to see if links on a page where all the stems match lead to the same destination, since their names are for all intents and purposes identical. If this is not the case, in one embodiment, an automated suggestion will be generated that suggests that these links must go to the same page. In one embodiment, the page whose lexical content is the closest match to the stems in the link text will be proposed as the destination page. In the worst case scenario, the profile of the union of clusters contained within the two tasks is identical. In this scenario, the constructor can only suggest that the names (in both the analysis and in the link anchor texts) be different than what they currently are, presumably longer in order to disambiguate the two tasks. It is to be noted that in one embodiment, this suggestion will be made only if the observed constructor 230 has determined that there is a problem with a statistically noticeable number of end-users falling out of task A to attempt task B and/or vice versa. As discussed elsewhere, if the two names contain exactly the same stems, a warning will be flagged in one embodiment, regardless of whether or not a noticeable end-user problem was observed as a result. In any less than worst case scenario, between the primary verb and noun, the pair of stems that are least statistically similar between the two tasks will be selected, regardless of whether or not the stem is a verb or a noun.
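The first analyzer's check, that links whose stems all match lead to the same destination, can be sketched as a simple grouping step. The sketch assumes each link has already been reduced to a hashable stem key (for example, a `frozenset` of stems) paired with its destination URL; the function name `conflicting_links` is hypothetical.

```python
from collections import defaultdict

def conflicting_links(links):
    """links: iterable of (stem_key, destination_url) pairs.

    Returns {stem_key: sorted destinations} for every stem key whose
    links lead to more than one destination.
    """
    destinations = defaultdict(set)
    for key, url in links:
        destinations[key].add(url)
    return {k: sorted(v) for k, v in destinations.items() if len(v) > 1}
```

Each entry in the result is a candidate for the automated suggestion that the links be pointed at a single page.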

A second analyzer checks to see whether a link is a verb noun phrase. If not, in one embodiment, the analyzer will suggest that the missing part of speech be added to the link text to reduce ambiguity. (It is to be noted that in one embodiment, there is a look-up table of well known exceptions to having verb noun phrases as link texts, such as links with names like 'exit.') A third type of analyzer tests to see that stems, and especially proper nouns, in a link name are adequately represented in the content of the destination page. For example, a link that was titled 'Bruce Springsteen Concert Retrospective' could be expected to lead to a page where the stem 'concert' and the proper nouns 'Bruce' and 'Springsteen' appeared with a statistically noticeable frequency. If not, in one embodiment, the automated suggestion will be to change the link text to a name that contains both a verb and a noun whose stem appears relatively frequently on the destination page. Yet another type of analyzer related to the stemmer compares searches entered by end-users to determine how many times overall different searches are performed by the full population of end-users. This information can be used, in one embodiment, both to alert merchants to searches that yield no results, as well as to the level of end-user demand for different searches. This in turn can impact the desired navigational structure. In one embodiment, an automated suggestion is generated to prominently place links with a name suggested by the most linguistically common form of the search phrase.

In one embodiment of the present invention, the blind constructor 210 assembles 320 a preliminary task graph. This is done, in one embodiment, by associating pages with each likely identified task. In one embodiment, a task graph is a non-acyclic directed graph. One example of a task graph is shown in Fig. 7. Fig. 7 is discussed in detail below. In one embodiment, the blind constructor 210 assembles 320 the preliminary task graph by using a "breadcrumb" approach as follows, starting from links on the home page of the web site that either a) contain a verb-noun phrase, b) are located in a navigation bar (with the specific exceptions of common things such as 'FAQ', 'About Us,' 'Contact Us', etc.), or c) are very visually obvious links such as alt text associated with large image links, bolded links that are surrounded by a certain amount of white space, etc. The words in these links are passed through a stemmer both to eliminate unimportant words such as 'the' and to normalize different tenses and persons of the same word. Next, each page to which each of these links points is similarly analyzed. It is assumed that at least one of the pages after the current page represents a valid next step in the task. Which one or more pages represent valid next steps in the task is determined by examining the density and importance (discussed below) of occurrence of the stems in the original link at the start of the chain. The stems are hence the breadcrumbs, as they provide a trail that can be followed to trace the task to its conclusion. In one embodiment, the so-called breadcrumb trail ends at the point at which there is no non-trivial equivalence relation that can be defined between the Nth and (N+1)th proposed step in a task. Specifically, if the stem profile of the two pages is no more similar than that of any other two randomly selected pages, the trail has stopped, and the task is (until analyzed by the sighted constructor 220) considered to end at the Nth step.

Density is the raw number of occurrences of the stem relative to the population and frequency of other stems. The importance criterion measures how focal the words in the original link are. In one embodiment, importance is measured by assessing features such as differentiated font characteristics (e.g., underlining), the appearance of words in links, whether the stem appears in a single phrase bounded in both directions by paragraph tags, etc.

In one embodiment, the depth level to which the blind constructor 210 is set will determine the initial set of pages that is viewed to determine where the likely tasks are. However, once it has determined the identity of the initial links, and subsequent pages within the originally set depth, the blind constructor 210 will pursue links past this level on a second pass of the spidering in order to pursue these tasks to either their concluding page, or to the first form that it cannot bypass without human provided information such as a credit card number. Whether a page is a concluding page is determined by where the breadcrumb trail terminates. In one embodiment, a page is determined to be a concluding page when there is no next page a link away where the stems in question appear with any noticeable density or importance. This is the cardinality(C) ~ N case, after all possible passes have been performed. This means that it is not possible to determine that one set of pages is more closely associated with a task than any other set. This causes the analysis to stop.

One embodiment is described in further detail below:

• The site is first spidered to a depth of 1 from the home page. Call this set of pages, including the home page, the set P₁.

• For each page in P₁, two indices are built. The first index I₁ contains stems of all the text that appears in links that are not embedded within regular text. If a specific link name appears in every single page of P₁, this link name is not taken into account. This simply means that its stems are not added to I₁. However, these same stems could be in I₁ because of other link names that do not appear in every page. The second index I₂ contains stems of all other text content. Stems of text that appears in alt text tags are placed into I₁ too. For each page, a total word count calculation is performed and the result stored at this point for density calculations at a later stage in the process. (It is to be noted that if the blind constructor 210 has determined that the navigation bars for all pages in question are identical, the per-page I₁ indexes will not be built. Instead exactly one such I₁ index will be built, since that is all that is needed.)

• The contents of both I₁ and I₂ are stripped of common "stop" words such as prepositions, pronouns, articles, and forms of very common verbs such as "have." The stripped down results are stored for each page.

• The spidering, stripping, and indexing process is repeated to the next level of depth to form the set P₂ of pages that are 2 levels from the home page. Each page is only counted once; a page is processed and added to a set only the first time it is encountered by the spider. It is to be noted that this analysis could go deeper, but a depth of 2 will hit the entry points for most tasks at most sites. In the case of the ride along, where the user can provide the system with valid next pages, the process is continued to the blessed egress in each task. Adding the stems of pages from deeper levels provides another global set of stems against which one can determine the randomness or non-randomness of stems.

• A frequency count is performed on stems that occur in all of the I₁'s. Stems that have either a statistically even distribution across the pages or a statistically random distribution throughout all pages covered are added to a new "stop" list. These stems are then stripped from all indices; this should leave intact only stems that are relatively distinct. A frequency count for each remaining stem is performed for each page and recorded.

• Each page can now be considered to have a profile defined by an n-tuple, where n is the number of unique distinct stems remaining in the union of all the indices of all the pages. For example, if a page had the stems "translate" (8 occurrences), "visualize" (4 occurrences), and "dilate" (3 occurrences), but not the stem "conjugate", the stem profile for this page could be expressed as (8x₁, 4x₂, 3x₃, 0x₄). In this way, the content of different pages can be compared. If the vectors thus expressed are either a) identical for two pages, or b) one is a multiple of the other, the two pages are said to belong to the same set or cluster under any mapping from pages to clusters; that is, they will always be in the same cluster, regardless of the cardinality of C, the set of all clusters.

• If two pages have no stems in common, no equivalence relation can be defined other than the null mapping that would put them in the same set. So, logically, these pages cannot be in the same cluster, regardless of how few clusters there are.
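The two clear-cut cases above, pages that must share a cluster and pages that cannot, can be expressed over stem profiles as in the following minimal sketch. The function names are hypothetical, and "multiple" is interpreted here as any positive scalar ratio between the two profiles, which is an assumption about the intended meaning.

```python
def profile(stem_counts, vocabulary):
    """Return the page's n-tuple of stem counts in vocabulary order."""
    return tuple(stem_counts.get(stem, 0) for stem in vocabulary)

def proportional(p, q):
    """True when p and q are identical or one is a scalar multiple of the other."""
    if not any(p) or not any(q):
        return False  # an all-zero profile carries no information
    # Cross-multiplication avoids division: p[i]*q[j] == p[j]*q[i] for all i, j.
    n = len(p)
    return all(p[i] * q[j] == p[j] * q[i] for i in range(n) for j in range(n))

def share_no_stems(p, q):
    """True when no stem occurs on both pages."""
    return not any(a and b for a, b in zip(p, q))
```

Pages for which neither predicate holds fall into the weak-equivalence middle ground that the remaining steps address.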

In one embodiment, the real work thus involves those sets of pages that share weak equivalence relations amongst themselves, since in this scenario there are typically a large number of different ways to partition the elements into different sets, no one of which is clearly superior to any other.

• If the stems contained in one page are a subset of those in another page, a meaningful equivalence relation exists; whether or not to actually instantiate a new cluster will depend on an examination of the stems that are in the complement set. These provisional clusters are re-evaluated after one complete set of pairwise comparisons has been done (see below). Both the overlap set and its complement set are stored for another possible round of computation. It is to be noted that each round of the comparison process is reasonably efficient because most of the "noise" stems were stripped out in the initial stage. As a result, if the computation starts with the least frequently occurring stems and works its way up, a minimal number of comparisons will have to be performed.

• If two pages have some stems in common, but others that do not overlap, again, the complement set, which is the set of stems that appear only in one of the two pages, is calculated and stored.

• As a result of the first pass, the set C of all clusters will have a cardinality of no greater than N, where N is the total number of pages that have been analyzed to that point. However, if cardinality(C) is close to N, the result of the first pass, while valid, may not be optimally useful for analysis purposes.

• If this is the case, another round of stripping is performed on stems that have the highest cardinality, and the lowest cardinality in the set C which is the union of all complement sets. Specifically, the left and right tails of a Gaussian curve are removed. But before the stems are actually removed from processing, the URLs of the pages in question are compared and considered. If the presence of a particular regular expression element beyond the base URL (e.g. www.travelocity.com) statistically correlates to the appearance of a high cardinality stem, this stem is not removed, since it is very likely to be semantically meaningful.

In the case of a low cardinality stem, if there is a difference in the number of common elements in the Uniform Resource Identifiers (URIs) of pages between page(s) that contain the stem and page(s) that don't, the stem should be left in. But if there are common elements in the URIs beyond the host name, and these common elements in the URIs are the same, the business knowledgeable user should be queried if possible as to whether the page should be "considered the same as" the other page. (For that purpose an "element" in a URI is defined as either an element of the directory name (name between slashes) or a filename (HTML or JSP, ASP... file name), or the name of any dynamic content generator executable (script, servlet...) or a parameter / value pair of the URI.) If so, the definition of the equivalence relationship for that cluster is redefined to be the original set of shared stems. If the business knowledgeable user indicates that the pages are not to be considered the same, then the stem remains in the analysis, and cannot be removed in subsequent passes (unless there was a human user involved, who subsequently changed her mind on the topic). This can be terminated at any stage, or be continued until every stem has been handled.

• The process of clustering is an iterative process during which clusters are merged or separated. Initially, every single page forms its own cluster. As equivalences are determined between clusters, clusters are merged. If the number of clusters is too low, some refinements will be used to split clusters.

• Then the comparison checks for equivalence between clusters are re-performed. This will reduce the cardinality of C on each pass. Each potential cluster is checked to see to what extent the URIs of its pages have a common structure, and in some applications how links in its pages point to other pages in the cluster (it is expected that every page in the cluster is referenced by at least one other page of the cluster). If there is no good correlation between URI structure and proposed clusters, it is likely because there are too few clusters and therefore too many pages in each cluster.

• If there are more clusters than desired, the stem stripping process described above is reiterated until a reasonable number of clusters is obtained.

• If there are fewer clusters than desired after the above process terminates, a further fibering of the existing clusters will be sought. At this point, several other factors are considered:

o Density of the stem's occurrence when the full text is considered (including importance of the stem's occurrences): The above calculations are now redone with the additional requirement that the densities be roughly equivalent. "Important" occurrences of the stem are given 1.5 times the weight of regular occurrences. Important occurrences are defined as those that were in I₁ and not just I₂.

o Physical nearness of different stems to one another: This is a search for pairs of stems that are no more than 5 words apart from one another, and which do not have terminal punctuation (such as a period) that lies between them. These pairs are then treated as an additional stem, and the process is started again.
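The two refinement factors above, importance-weighted density and physical nearness of stems, can be sketched as follows. Several details are assumptions made for illustration: density is computed as weighted occurrences divided by the page's total word count, the input text is taken to be already stemmed, "no more than 5 words apart" is read as a position difference of at most 5, and the function names are hypothetical.

```python
import re

def weighted_density(stem, i1_counts, i2_counts, total_words, weight=1.5):
    """Occurrences per word, counting link-index ("important") hits at 1.5x."""
    hits = weight * i1_counts.get(stem, 0) + i2_counts.get(stem, 0)
    return hits / total_words if total_words else 0.0

def near_pairs(text, stems, max_gap=5):
    """Find pairs of distinct stems at most `max_gap` words apart,
    with no terminal punctuation between them."""
    pairs = set()
    for sentence in re.split(r"[.!?]", text.lower()):
        words = re.findall(r"[a-z]+", sentence)
        positions = [(i, w) for i, w in enumerate(words) if w in stems]
        for a, (i, first) in enumerate(positions):
            for j, second in positions[a + 1:]:
                if j - i <= max_gap and first != second:
                    pairs.add(tuple(sorted((first, second))))
    return pairs
```

Each pair returned by `near_pairs` would then be treated as one additional stem when the clustering process is restarted.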

In one embodiment, a preliminary task graph is assembled 320 in which the common root node represents the home page, and edges emanating from the root node correspond to links that were identified during the above-described process as being the entry points into likely tasks. In one embodiment, duplicates are removed automatically. That is, if two links to the same task were identified, only one node will appear. The node will have the name of whichever link had the stem with the greatest combined density and importance rating.

In one embodiment, there are two different approaches to generate a name for a task. The first one is to look at the text of the links pointing to the task. If the texts are coherent enough, this is a good candidate for the task name. When the first approach cannot be used (for instance when the appropriateness of the link name is to be evaluated), or when it does not give good results, the second approach, based on the page stems, is used. In one embodiment, this is done by searching for the most frequent occurrence of a pair of likely (as determined by the tagger) verb noun stems that appear within a predetermined number (e.g. five) of words of one another in the full text and are not separated by terminal punctuation such as a period or exclamation point. The name will be proposed as being the first person form of the verb and the plural form of the noun. It is to be noted that the name chosen is only intended to ensure that an appropriate name is selected, even if one of the initial links pointing to the task is oddly worded. However, any alternate names for the nodes may be chosen.

In one embodiment, the likely average number of steps per task computed is derived from such a graph of the site assembled 320 by the blind constructor. In one embodiment, this is done by literally computing the average length of every path that starts with the home page and which ends at either the conclusion of the task, or at an impermeable form. A task step is a set of one or more pages that are involved in the completion of a task. The use of a pattern specifier that can cover an arbitrary number of pages over a plain URL is one of the distinguishing characteristics of the invention. In one embodiment, task steps are associated with pages via a regular expression. The average length of all paths between two nodes is calculated from a task graph in order to determine the average number of steps in a task.
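The average-steps computation and the association of steps with pages via regular expressions can be sketched together. The graph is assumed to be a dictionary from node to successor nodes, `terminals` is assumed to hold the concluding pages and impermeable forms, path length is counted in edges, and the step patterns shown in the usage are purely hypothetical.

```python
import re

def paths_to_terminals(graph, node, terminals, seen=()):
    """Yield every cycle-free path from `node` that ends at a terminal node."""
    path = seen + (node,)
    if node in terminals:
        yield path
        return
    for nxt in graph.get(node, []):
        if nxt not in path:  # task graphs may contain cycles; skip revisits
            yield from paths_to_terminals(graph, nxt, terminals, path)

def average_steps(graph, home, terminals):
    """Average number of edges per path from the home page to a terminal."""
    lengths = [len(p) - 1 for p in paths_to_terminals(graph, home, terminals)]
    return sum(lengths) / len(lengths) if lengths else 0.0

def step_for(url, step_patterns):
    """Return the first step whose regular expression matches the URL."""
    for step, pattern in step_patterns.items():
        if re.search(pattern, url):
            return step
    return None
```

Because a single pattern such as `^/catalog/item/\d+$` covers every item page, one task step can stand for an arbitrary number of pages.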

The blind constructor 210, in one embodiment, then assigns 330 a certainty factor to the task graph. The certainty factor is the extent to which the blind constructor 210 believes that the task graph is likely to be accurate. In general, the certainty factor for the preliminary task graph generated by the blind constructor is never 100%, even if the graph looks very well behaved, since there is no input available from the site creator at this point to validate it or disprove it. In one embodiment, the default certainty factor of a very well behaved preliminary task graph is no more than 90%, or some other arbitrarily high initial value chosen to show that graphs generated by the blind constructor 210 are less accurate than human generated graphs. This certainty factor is then decremented substantially for each undesirable behavior observed in the preliminary task graph. Examples of such undesirable behavior include task graphs that are highly unbalanced, task graphs that resemble lattices more than graphs, and task graphs where the average path length is less than three nodes. It is to be noted that, in one embodiment, the certainty factor is not intended to be a precise measure, but is rather used to provide an indication to site creators of the accuracy of the preliminary task graph.
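A certainty factor of this kind might be computed as in the sketch below. The 90% ceiling comes from the text; the size of the decrement (`PENALTY`) is a hypothetical value, since the text says only that the factor is decremented "substantially," and the balance and lattice checks are assumed to be supplied by the caller as booleans.

```python
DEFAULT_CEILING = 90  # percent; ceiling for a well behaved blind-constructed graph
PENALTY = 15          # percent per undesirable behavior (hypothetical value)

def undesirable_behaviors(average_path_nodes, is_unbalanced, is_lattice_like):
    """Collect the undesirable behaviors named in the text."""
    problems = []
    if is_unbalanced:
        problems.append("highly unbalanced")
    if is_lattice_like:
        problems.append("resembles a lattice")
    if average_path_nodes < 3:
        problems.append("average path length under three nodes")
    return problems

def certainty_factor(problems):
    """Start at the ceiling and subtract a penalty for each problem."""
    return max(0, DEFAULT_CEILING - PENALTY * len(problems))
```

The result is meant only as a rough indication for site creators, consistent with the statement that the factor is not a precise measure.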

In one embodiment, the blind constructor 210 is designed to cross over barriers to dynamic content that stop regular spiders. Some of the barriers that stop most commercial spiders today are dynamic looking content, cookies, redirection via meta tags or location HTTP directives, agent identifiers, robots.txt files, Secure Sockets Layer (SSL), Macromedia Flash, forms, frames, and Javascript based navigation or linking. These are addressed in further detail below.

Many spiders that are in use today for creating indexes of web content will not fetch a link if it is possible for that link to invoke dynamic looking content such as an Internet Server Application Programming Interface (ISAPI), Common Gateway Interface (CGI), or Netscape Server Application Programming Interface (NSAPI) module of the remote server. However, in one embodiment, the blind constructor 210 does not avoid dynamic content. Instead, it only avoids actual downloads of EXEs, zip files, tgzs, Java applets, images, or any page whose text exceeds a user configured upper boundary of size (for instance, 25 KB).
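The download rule described above might be sketched as follows in Python. The list of file suffixes, the constant names, and the use of a declared content length as the size test are assumptions for illustration; only the categories of skipped content and the 25 KB example come from the description.

```python
from urllib.parse import urlparse

# Assumed suffixes standing in for EXEs, zip files, tgzs, Java applets and images.
SKIPPED_SUFFIXES = (".exe", ".zip", ".tgz", ".class", ".jar",
                    ".gif", ".jpg", ".jpeg", ".png")
MAX_TEXT_BYTES = 25 * 1024  # user-configured upper bound; 25 KB is the example given

def should_download(url, content_length=None):
    """Return True unless the URL names a skipped file type or the page is too large."""
    path = urlparse(url).path.lower()
    if path.endswith(SKIPPED_SUFFIXES):
        return False
    if content_length is not None and content_length > MAX_TEXT_BYTES:
        return False
    # Dynamic looking URLs (CGI, ISAPI, NSAPI) are deliberately not excluded.
    return True
```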

In one embodiment, the blind constructor 210 deals with cookies by optionally storing them for a site. Other spiders generally do not process cookies because they have been designed to process only statically generated pages. Most statically generated pages do not have state and session information, and so they do not have cookies. The blind constructor 210, however, processes dynamically generated pages that generally do have state and session information, and can process cookies. In one embodiment, it interprets the get and set cookie headers from HTTP traffic, stores the cookie information in an internal data structure, and returns the cookie information when requested by the web server.
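As a hedged illustration of the cookie handling just described, the Python sketch below stores cookies per host and returns them on later requests. The class and method names are hypothetical, and the sketch ignores cookie attributes such as path, domain and expiry, which a real implementation would need to honor.

```python
from http.cookies import SimpleCookie

class CookieStore:
    """Keeps the cookies set by each host and replays them on later requests."""

    def __init__(self):
        self._cookies = {}  # host -> {cookie name: cookie value}

    def remember(self, host, set_cookie_header):
        """Record the cookies carried by one Set-Cookie header value."""
        parsed = SimpleCookie()
        parsed.load(set_cookie_header)
        jar = self._cookies.setdefault(host, {})
        for name, morsel in parsed.items():
            jar[name] = morsel.value

    def cookie_header(self, host):
        """Build the Cookie header value to send back to the host."""
        jar = self._cookies.get(host, {})
        return "; ".join(f"{name}={value}" for name, value in jar.items())
```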

The blind constructor 210 can support redirections. In one embodiment, redirections inside a page are reported as a special form of link that the blind constructor 210 follows just like it would a normal HREF.

In one embodiment, the blind constructor 210 will use an agent identifier that indicates that it is Mozilla 4.x. Both Internet Explorer and Netscape Navigator identify their agents in such a manner. This is done in order to maximize coverage, since some pages will not be served if the remote server thinks that the end-user's browser is the wrong version.

In one embodiment, the blind constructor 210 deals with the content of robots.txt files by ignoring them. Robots.txt is a file that appears on web servers and that specifies a list of URLs that should be ignored by spiders. Since such files are not part of a normal end-user's experience of a web site, they are ignored.

Some conventional constructors or "spiders" have trouble accessing SSL protected areas of a site. However, in one embodiment of the present invention, by using a pure Java library called Java Secure Sockets Extension (JSSE), the SSL protected areas of a site can be accessed.

In addition, forms present a major challenge to conventional web-site spidering. Forms will be navigated by both the blind 210 and sighted 220 constructors. However, in blind mode, forms with non-trivial parameters such as end-user password or credit card number will cause the blind constructor 210 to drop pursuit of that particular path. In general, only forms whose controls are all of an enumerated type such as "radio button" can be successfully navigated by the blind constructor 210. However, some forms that contain text fields may allow any non-null entry, so in one embodiment, the blind constructor 210 will attempt this once per form unless the text fields have certain special titles, for example "password," "ID" or "credit card." The HTTP POST operation that is required to submit a form will be executed like a single link on that page. The form will not be entered more than once by the blind constructor 210 in one embodiment of the present invention.
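The form-handling rule above can be sketched, under stated assumptions, as a small decision function in Python. The set of control types treated as enumerated, the representation of a form as (control type, title) pairs, and the function names are assumptions for illustration; the special titles are the examples named in the description.

```python
# Control types assumed to be "enumerated" or otherwise harmless to submit.
ENUMERATED_TYPES = {"radio", "checkbox", "select", "hidden", "submit"}

def is_sensitive(title):
    """True if a text field's title matches one of the special titles."""
    lowered = title.lower()
    return ("password" in lowered
            or "credit card" in lowered
            or "id" in lowered.split())  # whole-word match for "ID"

def form_action(fields, already_entered=False):
    """Return 'submit' if the blind constructor may try the form, else 'drop'.

    fields is a list of (control_type, title) tuples for the form's controls.
    """
    if already_entered:  # a form is not entered more than once
        return "drop"
    for control_type, title in fields:
        if control_type in ENUMERATED_TYPES:
            continue
        if control_type == "text" and not is_sensitive(title):
            continue  # any non-null entry may be accepted, so try once
        return "drop"
    return "submit"
```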

Javascript based navigation or linking is another common difficulty for regular spiders. Frames present a problem to conventional spiders, but not to the blind constructor. Conventional spiders consider each page separately. Conventional spiders can process each page separately because their main purpose is to index the content of a page and not to analyze the navigational aspects of a web site. Because each page is independently processed, conventional spiders do not understand the relationship of the navigational aspects of individual frames in a frameset, so they do not process frames correctly. Instead, in one embodiment, the blind constructor 210 considers each frame in a frameset to be a separate page, but then ties the pages together as if the frameset linked the pages together. By performing this action, the navigational aspects of a frameset can be correctly determined. Javascript interpretation can be extremely complex. In one embodiment, the blind constructor 210 will attempt the simplest possible way of finding links that are concealed in Javascript. It will also indicate a problem when a site appears to be using Javascript as a form of HREF. The Javascript in the onClick event handler option in links will be parsed to look for a static string representing the URL that the Javascript link is associated with. This link will then be treated like any other HREF by the blind constructor 210.
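One simple way of looking for a static URL string inside an onClick handler, as described above, is sketched below in Python. The two regular expressions, the list of page extensions, and the function name are assumptions for illustration; the sketch does not interpret Javascript and will miss URLs that are assembled at run time.

```python
import re

# Any single- or double-quoted string literal in the handler text.
QUOTED = re.compile(r"""['"]([^'"]+)['"]""")
# Assumed heuristic: absolute URLs, site-relative paths, or common page extensions.
URL_LIKE = re.compile(r"^(https?://|/)|\.(html?|jsp|asp|cgi)(\?|$)")

def link_from_onclick(handler):
    """Return the first quoted string that looks like a URL, or None."""
    for candidate in QUOTED.findall(handler):
        if URL_LIKE.search(candidate):
            return candidate
    return None
```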

In addition to handling normal HTML content, the blind constructor 210 will also handle arbitrary markup languages such as Wireless Markup Language (WML), which is used in wireless phone applications. The components of the spider such as the parser and analyzers are pluggable, i.e. one is able to plug another implementation into the existing framework by providing an implementation for each spider interface. For example, to parse WML, a technical user registers with the spider an XML parser and WML DTD for the MIME type corresponding to WML, which is text/vnd.wap.wml. When the blind constructor 210 encounters content with the MIME type text/vnd.wap.wml it calls the WML based XML parser to parse the page. Likewise, a technical user can plug in WML based analyzers that measure phone specific usability problems. This framework allows for specific parsers and analyzers for different types of devices besides the PC, such as wireless applications and other appliances such as web based TVs and kitchen appliances.
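The pluggable arrangement described above might look roughly like the following Python sketch, in which parsers are registered per MIME type. The registry class, its method names, and the toy WML parser are hypothetical; the sketch also performs no DTD validation, which the description does contemplate.

```python
import xml.etree.ElementTree as ET

class ParserRegistry:
    """Maps a MIME type to the parser callable registered for it."""

    def __init__(self):
        self._parsers = {}

    def register(self, mime_type, parser):
        self._parsers[mime_type] = parser

    def parse(self, mime_type, content):
        try:
            parser = self._parsers[mime_type]
        except KeyError:
            raise ValueError(f"no parser registered for {mime_type}")
        return parser(content)

def wml_links(content):
    """Toy WML parser: return the href of every <go> element in the deck."""
    root = ET.fromstring(content)
    return [go.get("href") for go in root.iter("go")]

registry = ParserRegistry()
registry.register("text/vnd.wap.wml", wml_links)
```

An analyzer for phone specific usability problems could be registered in the same way, keyed by the same MIME type.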

Common Usability Problems that can be detected by the blind constructor 210:

The blind constructor 210 performs a large number of usability checks that require only a low fidelity task graph, if any. Some of the types of likely usability problems that can be detected with the blind constructor 210 include:

1. Inappropriate use of red text outside an error condition. In one embodiment, text is assumed to be inappropriate if it is of a red hue, and appears the first time a page shows up, when the end-user has not yet had a chance to make an error. However, in one embodiment, if red later proves to be a branding color for the site, these errors are removed.

2. The words "Click here" in a link. Such wording tends to draw an end-user's eye without providing any information itself as to what the end-user will get when they "click here".

3. Presence of a splash screen. This may be undesirable, since it automatically adds another click in the path to accomplishing a task.

4. Too many links leading from one page to any particular page. This can be thought of as the "all roads lead to Rome" problem, regardless of whether or not the end-user wants to go to Rome.

5. Links with very different names leading to the same location. This is another variant of the "all roads lead to Rome" problem.

6. Too many links on a page. This causes a significant amount of "visual noise" which may distract end-users. In one embodiment, links on navigation bars that are not specific to a particular page are removed from this analysis. While they do reduce the number of links that are allowed in the body of the page, if half or more of the total number of allowable links for that page are consumed by the navigation bar, in one embodiment, only a global warning about the number of the links in the navigation bar will be reported.

7. Too many images outside an HTML table. Again, this causes a significant amount of "visual noise" which may distract end-users.

8. Colorblindness problems arising from certain combinations of background and text color, for example, red/green colorblindness.

9. Large numbers of circular references between pages. Such circular references can literally send end-users around in circles, leading to a sense of disorientation.

10. Insufficient content on a page. This may be undesirable as it squanders a valuable click. In one embodiment, insufficient content is defined as all of the following being true:

• No controls, apart from any controls that appear in navigation bars that are not specific to that page.

• Only one paragraph of text - or no unlinked content

11. Too many links in a list of links. This makes it difficult for the end-user to pick out the one she is interested in. In one embodiment, a milder warning is given if such links are presented in alphabetical order, as providing a well known ordering makes it much easier for end-users to quickly locate the thing they are looking for, or to quickly conclude that it is not present amongst the list.

12. Missing alternate text from an image or area tag. This is important for a myriad of reasons including searchability, access for the blind, and access for people who have "graphics" turned off on their browser.

13. No contextually appropriate "next step" link at the bottom of a page - or alternately too many such links. In either case, the outcome is the same: there is not a simple choice to be easily made by the end-user to proceed to another relevant location on the site. It is to be noted that in one embodiment, this kind of analysis cannot detect whether the "next step" link is contextually appropriate. It can merely note that it differs from page to page. Further, in one embodiment, a lookup table is used, which contains phrases/words that are typically included in what might be considered tertiary navigation bars so that these are always excluded from consideration. For example, such a lookup table may include phrases such as "FAQ" and "About Us."

14. Change in navigation bars. Such changes can be disorienting to end-users, especially if all the navigation bars change. In one embodiment, a navigation bar is presumed to be one of a) an HTML table of dimensions W x H where either H = 1 or 2 and W > 3 or H > 3 and W = 1, or b) a series of juxtaposed text whose font characteristics are different from that of the rest of the page, and the text is linked, or c) similarly for images. In another embodiment, there are three types of navigation bars:

• Lists of links:

These navigation bars are basically a set of at least 4 contiguous <A> tags with hrefs that are separated by at most 2 separators (separators must be consistent throughout the navigation bar) and that are nested in distinct children of a common parent. Both text and image links are used the same way; navigation bars can mix the two types. Detected navigation bars include navigation bars based on tables (vertical and horizontal), navigation bars with full text and/or image links, lists based on <OL>, <UL>, <BR> tags, and horizontal lists separated by characters like |, •, or -. Small images are neglected when looking for navigation bars as they usually are bullets or arrows.

• Image maps:

Image maps with more than 4 areas are considered to be navigation bars.

• Pulldown menus:

Forms that have only one input (except hidden fields) which is a pulldown menu and which have 0 or 1 submit buttons are considered to be navigation bars because they are usually shortcuts to the various tasks of the site.

15. Links with names that are too long. These can be problematic because a) the underline of the link can become a major visual distraction, and b) the longer the text in the link, the more difficult it can become for the end-user to understand exactly what kind of information they are likely to see on the resulting page.

16. Missing search capability, or missing search refinement or scoping ability (e.g. only search through the "books" part of the site), or failure to include a search textbox in each page. Ubiquitous search can be a critical factor in lowering the number of clicks the user has to perform to reach their goal.

17. Invalid default value in pulldown. This forces 100% of the users to commit a click, while picking a common value at least removes this action for some subset of the users.

18. No link on each page to return to the home page. Having such a consistent link can provide end-users with a sense of orientation much the way a tall skyscraper does in a city.

19. Redundant Content. In one embodiment, two pages are said to contain redundant content if and only if all of the following are true: the character count on the two pages is within 20%, and the probability that a stem that survived the pre-complement set stoplisting appears in either page is close to equal.

20. Too many steps to complete the task. In one embodiment, it is not considered advisable to have more than three steps per task.
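To make the flavor of these page-level checks concrete, the Python sketch below implements two of the items from the list above: the words "Click here" in a link (item 2) and missing alternate text on an image or area tag (item 12). The class name and warning strings are hypothetical, and treating an empty alt attribute as missing is an assumption of this sketch.

```python
from html.parser import HTMLParser

class UsabilityChecker(HTMLParser):
    """Collects warnings for 'click here' links and images lacking alt text."""

    def __init__(self):
        super().__init__()
        self.warnings = []
        self._in_link = False
        self._link_text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "area") and not attrs.get("alt"):
            self.warnings.append(f"missing alternate text on <{tag}>")
        if tag == "a" and "href" in attrs:
            self._in_link = True
            self._link_text = []

    def handle_data(self, data):
        if self._in_link:
            self._link_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            # Collapse whitespace so "Click   here" is still recognized.
            text = " ".join("".join(self._link_text).split()).lower()
            if "click here" in text:
                self.warnings.append('link text contains "click here"')
            self._in_link = False
```

A checker instance is fed the page text with `feed()`, after which its `warnings` list holds one entry per likely problem found.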

In addition, more specific automated suggestions can be generated based on knowing the importance of different tasks. So errors like "too many links on a page" can be met with an automated suggestion to remove the ones to tasks that are of a relatively lower importance.

It should be noted that the blind constructor 210 can only detect possible errors. However, this can be a valuable first pass. Once human-entered data is added, the warnings flagged by the blind constructor 210 can be modified in several ways by the sighted constructor 220. Such modifications may include some or all of the following:

• Red text messages can be removed if red is stated to be one of the site's branding colors.

• Missing next step errors can be removed from the last steps of tasks.

• Changes in navigation bars may not be flagged as an error automatically. Instead, this error may only be flagged if the change was not contextually appropriate. In one embodiment, only links to related tasks and steps within the task are considered contextually appropriate. Errors will be flagged if:

• The appearance of a new text entry in the navigation bar is not appropriate.

• An inappropriate link remains.

• If all roads really do proverbially lead to Rome, but this reflects the reality that a large number of links are appropriately leading to a task with one of the highest importance levels, this error may be removed.

• In one embodiment, there should not be links to the home page from within uninterruptible steps or tasks.

• Correction of the average and maximum numbers of steps to complete the task. The sighted constructor has certain information from the human user on this topic while the blind constructor does not; the blind constructor will also get stuck in the face of most forms while the sighted constructor will not.

• Redundant content warnings will be removed within the same task.

In addition, more specific automated suggestions can be generated based on knowing the importance of different tasks. So errors like "too many links on a page" can be met with an automated suggestion to remove the ones to tasks that are of a relatively lower importance.

The observed constructor 230 may refine the above analysis further. For instance, the definition of what is too many or too few in the above discussion may be modified due to empirical data that is gathered on end-user behavior across different web-sites. Thus these thresholds should not be thought of as having rigid fixed values. It should be noted that while the exact value of the threshold that is applied can vary, a system in accordance with the different embodiments of the present invention takes into account that there is some threshold that can cause problems for end-users.

2. Sighted Constructor 220

In one embodiment of the present invention, the sighted constructor 220 is based on the assumption that of the N many tasks that appear on an e-commerce site, some have much greater importance than others. Further, some tasks are of greater importance to some of the e-businesses' customers (that is, end-users) than others. For example, college students are unlikely to have much interest in retirement plans, but may have great interest in checking accounts. For a small site that consists of only several web pages, such considerations of prioritization may be relatively unimportant. However, for more complex e-businesses that may offer tens or even hundreds of tasks, the alignment of the site's presentation to its business objectives can be critical to the success of the site.

The sighted constructor 220 interacts with the site creator (or with a business knowledgeable user familiar with the web-site) to provide a more refined task model for a web site. In one embodiment of the present invention, the sighted constructor 220 may bootstrap on and correct the lower fidelity task model created by the blind constructor 210, by incorporating into it information provided by a business-knowledgeable user familiar with the

Fig. 4 illustrates a flowchart outlining the various steps performed by the sighted constructor 220 in one embodiment of the present invention. These include creating/refining 410 a task model, receiving 420 attributes/relationships for identified tasks, and assembling 430 a more refined task graph.

The task model created 410 by the sighted constructor 220 can either be based on the one created by the blind constructor 210, or can be created from scratch. In one embodiment, the sighted constructor 220 starts off at this stage with a blank slate. The representatives who have knowledge and understanding of the business model of the web site can often include mid-level to senior marketing or e-commerce people. In particular, one or more such business-knowledgeable users provides the sighted constructor 220 with a list of tasks implemented on their web site as they perceive it themselves, the regular expressions (or other pattern specifiers) for the pages that represent the steps of the various tasks on the site, and a prioritization of these tasks by their completion value to the site creator's business.

It is to be noted that multiple business-knowledgeable users may edit the same task model. In one embodiment, it may be possible to record and report, for comparison purposes, the differences in what each user entered. Widespread disagreement among the users who performed the edits may be highlighted in such a report. Such highlighting can help organizations identify areas of gross disagreement internally, which can be a large source of a confused or inconsistent site navigational structure. However, in order to always have a task model that is usable by the system, a single business-knowledgeable user edit model is implemented in one embodiment of the present invention. Thus in one embodiment, while many business-knowledgeable users can record their views, edits from only one business-knowledgeable user actually modify the task graphs used for analysis. In another embodiment, each business-knowledgeable user's changes overwrite the previous changes. If logical conflicts occur (for example, having a task whose total value is less than the sum of its steps), they are treated in the same way as they would be if only a single business-knowledgeable user were editing the task model. In one embodiment, a warning panel comes up, and the offending changes may not be saved until other needed changes are also made.

In one embodiment, the sighted constructor 220 uses the information generated by the blind constructor 210 in two ways. First, the sighted constructor 220 will look for the business-knowledgeable user to specify the tasks that the blind constructor 210 automatically detected. If the list of tasks the business-knowledgeable user creates does not include one or more of the tasks uncovered by the blind constructor 210, a warning panel will appear asking the business-knowledgeable user to either create entries for the missing task(s), or to confirm that these "tasks" are not considered tasks by the business-knowledgeable user. Secondly, as discussed above, the sighted constructor 220 will look at the errors found by the blind constructor for subsequent possible correction.

The sighted constructor 220 can create 410 a task model through any user interface that allows the various attributes of different tasks and their steps to be specified through it. Such user interfaces include: a) a simple forms-based web application, b) a drag & drop means of assembling the graph visually, and c) a "ride along" capture tool.

In one embodiment of the invention, a business knowledgeable user provides information required for a task model via a web-based application. Each attribute or possible attribute of tasks, task steps, promotions, sites, and customer profiles is entered through this application. Examples of this application can be seen in Figs. 5A-C.

In another embodiment of the invention, the user directly assembles the task graph by building the graph element by element via a drag & drop user interface. The end result closely resembles the graph in Fig. 7, which is discussed in detail below. Text properties of the various objects in the graph are defined by clicking on the object in the graph, which brings up a panel in which this information can be specified.

In yet another embodiment, a "ride along" tool is used by a business knowledgeable user to construct the more refined task graph. The ride along tool is a browser plug-in that allows business knowledgeable users to specify a task and the various preferred paths through it without having to master the complexities of regular expressions (or other pattern specifiers). The ride-along is an extension of the task creator 110. It is invoked by pressing the "Show me" button in the task analysis application. This brings up a small recording panel that records each page that the user went to prior to hitting the stop button. It has both a 'record task' and a 'record step' button. The latter allows all the different pages that were visited before the stop button is pressed to be understood as being part of the same step. In addition, in one embodiment, the ride along tool allows key value pairs entered in forms to be captured, as well as cookies from the URLs. While this surface level approach is sufficient to capture individual URLs, it by itself may not be powerful enough to yield the needed regular expressions that could match to many thousands of URLs. In order to capture this otherwise missing information, in one embodiment, the same clustering-based approach used for automatic task identification by the blind constructor 210 is used here to build "like" sets of pages for each page that was part of the recording.

The regular expression that unifies the largest number of pages within the cluster, but none outside it, is the one that is used in one embodiment. In one embodiment, this generation process works in the following manner:

• The very first approximation consists of taking the common base of all the URIs of the pages in the cluster with the .* wildcard affixed to it. ^ is added at the beginning of the regex and $ at the end so that partial URIs are not incorrectly matched. It is to be noted that although, in one embodiment, the $ sign is useless at this point because it is preceded by the .* wildcard, it can become useful after a few transformations of the regular expression. As a consequence the initial regular expression will generally look like ^http://www\.domain\.com/directory/.*$. (It is to be noted that in one embodiment, dots have to be escaped.)

• The protocols of each page of a cluster should be checked. If they are not all identical, an OR should be generated. Usually the protocol is not very relevant for a task (for instance it could be either http or https, but in the end the page should look the same if the rest of the URI is identical). With that transformation the regex could look like: ^(http|https)://www\.domain\.com/directory/.*$.

• If the host names are different for the different pages, it is assumed that load balancing is in use. For instance the server names could be cgi1.domain.com, cgi2.domain.com. In that case, the specific server name is replaced by the [^.]* wildcard and the domain name is kept intact. The regex could therefore read: ^(http|https)://[^.]*\.domain\.com/directory/.*$. A smarter implementation would detect that the servers are numbered (which happens quite often) and use the [0-9]* wildcard for the server number part. The regex would therefore look like: ^(http|https)://cgi[0-9]*\.domain\.com/directory/.*$.

• If the common part of the URIs of the pages goes up to the end of the file specification (i.e. the end of the URL or the question mark in the URI), it is the case of dynamically generated pages. The base regex should look like: ^(http|https)://cgi[0-9]*\.domain\.com/cgi/query?.*$. In that case, each parameter of the URIs should be analyzed separately to determine the parts that are common and the parts that differ. If a parameter appears in all the page URIs, the part param=[^&]* is included in the regex (the wildcard [^&]* is used to grab only this specific parameter). If some common part is detected in the value of that parameter, it is added to that part of the regex; for instance, the pattern for the parameter could be: param=foo[^&]* or param=[^&]*bar or param=foo[^&]*bar. Some regularity could be detected in the non-common part of the value to make the regex more specific. For instance, the non-common part could be a number; in that case, the pattern for that parameter would be: param=foo[0-9]*. However, in the case when the values for that parameter totally differ and the number of different values is quite limited (for instance lower than 4 different values), an OR is used instead of the [^&]* wildcard. For instance: param=(foo|bar|value).

o If the same set of parameters appears for all the pages, they can be appended to the end of the regex with & signs between them. For instance the complete regex would read: ^http://www\.domain\.com/cgi/query?param1=foo[0-9]*&param2=(foo|bar).*$.

o Otherwise, the different sets of patterns of parameters have to be ORed. A smart implementation would detect the parts that can be factored and the parts that have to be ORed. The resulting regex would for instance be: ^http://www\.domain\.com/cgi/query?param1=foo[0-9]*&(param2=(foo1|bar1)|param3=(foo2|bar2)).*$.

• If the common part of the URIs does not go up to the end of the file specification or a question mark, patterns in the directory names and the filenames have to be detected.

o For the directory names, the approach is close to the approach for parameter values. The common part is found, and the [^/]* wildcard is added for the rest. If some regularity is found in the names of the directory (like numbers), a more precise wildcard is used. However if directory names have nothing in common and the number of them is quite limited, an OR pattern is created for directories (can be for instance directory/ or product[0-9]*/ or (cars|vans|suvs)/ or in the worst case [^/]*/).

o Hopefully the only directory name that changes is the last one. If not, the patterns for the different directory depths are appended one next to the other. The regex for the pages could for instance be: ^http://www\.domain\.com/(sell|buy)/(computers|printers)/.*$.

o In the case where the pages have different directory depths, an OR has to be created between all the possible directory names. However this case is relatively suspicious and should not appear often. For instance: ^http://www\.domain\.com/(buy/new|buy/auction|sell)/.*$.

o For the file name itself, the same pattern as for the directories and parameter values applies, except that the default wildcard is [^?]* to stop before the query parameters. A pattern for filename could be: buy[^?]*\.html or product[0-9]*\.jsp.

o In the end the pattern for the file is added to the pattern for the directories.

• When both the directory/file name and the parameters on the line change for the different pages, the patterns for the directories, file name and parameters are detected as described above and appended to create a global regex. The final result could be: ^(http|https)://cgi[0-9]*\.domain\.com/cgi/(query|submit)?param1=foo[0-9]*&param2=(foo|bar).*$.

• In the case when there are a few totally different URIs in the cluster, a global OR for the URIs will be used instead of a complex succession of ORs for the directories, file names, and so on. For instance the regex would be: ^http://www\.domain\.com/(products/listing\.html|services/introduction\.html)$.
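The first two steps of the generation process above (the common base with the .* wildcard, and an OR over differing protocols) can be sketched in Python as follows. The function name and the use of `os.path.commonprefix` and `re.escape` are choices made for this sketch, not part of the description; the later refinements for host names, directories, file names and parameters are not attempted. Note that `re.escape` escapes every regex metacharacter in the common base, not only dots.

```python
import os
import re
from urllib.parse import urlsplit

def first_approximation(urls):
    """Build ^<protocol alternatives>://<escaped common base>.*$ for a cluster."""
    parts = [urlsplit(u) for u in urls]
    schemes = sorted({p.scheme for p in parts})
    # Use a plain protocol when all pages agree, otherwise an OR group.
    if len(schemes) == 1:
        scheme_pattern = schemes[0]
    else:
        scheme_pattern = "(" + "|".join(schemes) + ")"
    # Compare everything after the protocol so http/https variants share a base.
    rests = [p.netloc + p.path + ("?" + p.query if p.query else "")
             for p in parts]
    base = os.path.commonprefix(rests)  # character-wise common prefix
    return "^" + scheme_pattern + "://" + re.escape(base) + ".*$"

cluster = [
    "http://www.domain.com/directory/a.html",
    "https://www.domain.com/directory/b.html",
]
pattern = first_approximation(cluster)
```

Because the common prefix is computed character by character, it may end in the middle of a file name; the .* wildcard absorbs the remainder, which is consistent with this being only a first approximation.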

A canonical exemplar of each such set is shown to the business knowledgeable user, who then has to indicate whether or not the page in question is the "same" as the page he visited during the recording, or whether it is different. To select the exemplar, the sighted constructor will at random select a page that is a member of the strongest equivalence relation that prevails for the whole set of pages of the cluster.

In this way, reasonably accurate regular expressions with constructs such as OR and NOT can be built up in the background. Where the clusters are too large, or the numbers of clusters are too large, the business knowledgeable user is prompted by the ride along tool to answer questions such as whether pages like X are to be considered the same as pages like Y.

Regardless of the user interface used to create 410 the task model, in one embodiment, the data entered by the user is checked for logical consistency. For example, it is not possible both for task A to contain task B and for task B to contain task A. If the user does not correct such logical consistency errors, a flag will be set that disables analysis of the tasks that are affected. Further, in one embodiment, as data is gathered on user behavior via the observed constructor 230, omissions or errors in the task model are aggressively sought through automated tests. For example, if more people end up completing a task than started it, it is clear that there were some entry points to the task that were missed in the definition of the task model. Similarly, an error is indicated if completions do not match known figures obtainable from application server logs and other sources with which the sighted constructor is integrated. In some embodiments, such errors are prominently displayed as errors both on all reports that are relevant to that task and in the user interface for editing task attributes.
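The containment check mentioned above amounts to looking for cycles in the "contains" relation between tasks. A minimal Python sketch is given below; the dictionary representation and the function name are assumptions, and only this one kind of logical inconsistency is covered.

```python
def containment_conflicts(contains):
    """Return the tasks that directly or indirectly contain themselves.

    contains maps each task to the set of tasks it directly contains.
    """
    def reachable(start):
        # Depth-first walk over everything contained, at any depth, by start.
        seen, stack = set(), list(contains.get(start, ()))
        while stack:
            task = stack.pop()
            if task not in seen:
                seen.add(task)
                stack.extend(contains.get(task, ()))
        return seen

    return {task for task in contains if task in reachable(task)}
```

The tasks returned are the ones for which analysis would be disabled until the user corrects the model.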

Alternately, the sighted constructor 220 may refine 410 the task model created by the blind constructor 210. Modifications or corrections to the blind constructor's 210 task model include, in one embodiment, removing pages that correspond to non-tasks (for example, glossaries, "about us," etc.). Additions to the blind constructor's 210 task model include, in one embodiment, showing divergences in importance by different end-user type if appropriate. In particular, in one embodiment, the primary function of the sighted constructor 220 is to provide the "correct" task ordering and presentation attributes based on the user's business objectives, and to compare and contrast this correct order with the order as it currently appears on the site as assessed by the blind constructor 210. In one embodiment, the following factors are considered in this regard: order, interruptability, preemption, and ordering constraints. Each of these involves verifying - or disproving - the faithful following of the task and task step properties indicated in the task model. For example, in the case of a mandatory step ordering, if one task step must be performed before another, it should be impossible for the end-user to haphazardly attempt to perform the steps out of order.

Further, page-to-task associations, as well as the number of steps per task, can also be modified 410 by the sighted constructor 220 as appropriate. As discussed above, the number of steps calculation performed by the blind constructor 210 may be inaccurate for any number of reasons, and could be corrected by the sighted constructor 220. In addition, the sighted constructor 220 may have to correct the task model created by the blind constructor 210 due to the concept of the "blessed egress." A blessed egress is a step that, if reached by the user, signifies successful completion of the task. By definition, blessed egresses do not have any "next steps." It is to be noted that there may be more than one such blessed egress for any task.

Once a task model has been created/refined 410, the sighted constructor 220 receives 420 various task attributes and relationships. As described above, tasks have certain attributes and relationships to one another. These relationships, coupled with certain key attributes of tasks such as their dollar value to the merchant, determine the optimal navigational structure for the site in one embodiment of the present invention. An "optimal structure" seeks to minimize the number of clicks needed to complete critical tasks.

As illustrated in Fig. 4, the sighted constructor 220 assembles 430 a more refined task graph. In the various embodiments of the invention, in certain cases, such as highly personalized sites, the site may present a substantially different task graph for each customer profile type for which a personalized or specialized version of the site exists. This does not however impact the appropriateness of the methodology. It simply means that more than one task graph must be created for the site.

In one embodiment, the sighted constructor 220 associates pages of the web site with each identified task by spidering the site and searching for pages that map to the various regular expressions generated in the task model creation. It is to be noted that in one embodiment, the sighted constructor 220 spiders the site again even if it has already been previously spidered by the blind constructor 210. This is due to several factors, including: the blind constructor 210 does not have valid values to navigate past most forms; the relatively limited task identification capabilities of the blind constructor 210 may have caused it to overlook certain tasks that the business knowledgeable user has specified and which were therefore not originally spidered to the needed depth; etc. For each page that is scanned, the page's URL is compared against the task step's regular expression map. (A page is an HTML, XML, WML, or similarly structured document designed to be fetched from an HTTP server. A page may consist of links, forms, and/or other such elements. As noted previously, in one embodiment, if a page contains multiple frames, each of these frames is considered a separate page.)
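The comparison of a page's URL against the task steps' regular expressions might look roughly like the sketch below. The task names, step names, and patterns are hypothetical placeholders; real ones would come from the task model.

```python
import re

# Hypothetical (task, step) -> URL pattern map; not taken from any real site.
STEP_PATTERNS = {
    ("buy a computer", "select a computer"): r"^/products/computers/[^/]+$",
    ("buy a computer", "checkout"): r"^/checkout(/.*)?$",
}


def steps_for_url(url_path, patterns=STEP_PATTERNS):
    """Return every (task, step) whose pattern matches the page's URL path."""
    return [key for key, pattern in patterns.items() if re.search(pattern, url_path)]
```

A page that matches no pattern, such as an "about us" page, would simply not be associated with any task step.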

Form data: Users will be able to input sample form data in key-value pairs; the keys may be specified either by type or by name. The values will be used on any set of forms on the site that has either the same key names or the same key types for its input parameters.

In one embodiment, the various constructors 210, 220 and 230 build and compare three different task graphs: the ideal graph, the actual graph, and the empirical graph. In theory, all three graphs would be structurally identical to one another. However, in practice this is rarely the case.

The first graph is the graph defined by the task model created by the sighted constructor 220 solely on the basis of input from a business knowledgeable user. This can be thought of as the "ideal" task graph, since it should correctly reflect the business priorities of the e-business. The second type of graph is the actual graph. The actual graph is produced by the constructor spidering the site according to the task and task step definitions provided by the business knowledgeable user. It is to be noted that in the case where more than one page maps to the regular expression or equivalent pattern specifier for a task step, the set of pages that do map are tested for lexical and navigational homogeneity. A representative of each cluster that is detected is added to the graph. Each set of sibling links is numbered from 1 to N, with a 1 being assigned to the link that is most visually prominent on the page. Ties are permitted. The algorithms for determining prominence are discussed below. There would be differences between the ideal graph and the actual graph if, for example, there were a link missing between two related tasks. The empirical graph is discussed below in the section on the observed constructor 230.

It is to be noted that business knowledgeable users may specify tasks which do not yet exist on the site. This ability can be useful in utilizing the system with pre-production sites, which may not yet be fully complete when the use of the system is begun. In one embodiment, such tasks appear as nodes in the ideal task graph. If specified, so too will the steps of these tasks. Such "missing" tasks and steps will be considered for purposes of step counting, interruptability, pre-emption, and ordering constraint issues. They will not be counted in delta computations between the ideal and actual task graphs.

3. Observed Constructor 230

The observed or empirical constructor 230 makes use of end-user behavior data to detect significant divergences in end-user behavior from the ideal task model defined by the user in his interactions with the sighted constructor.

Fig. 6 illustrates the steps performed by the observed constructor 230. These include gathering 610 end-user behavior data, and assembling 620 an empirical task graph based on observed end-user behavior data.

The observed constructor 230 gathers 610 end-user behavior data. The type of data gathered 610 includes the nodes and edges in the graph that the end-user travels through, including the order of visitation and the time spent, and end-user behavior within the page, including scrolling events and mouseovers. In one embodiment, such data can be collected using various data collectors such as a log file sniffer, a packet sniffer, an observer applet, and a log analyzer. Details regarding these various types of data collectors can be found in copending application number entitled "Electronic Shopping Management: User States" which is hereby incorporated by reference herein.

The observed constructor 230 assembles 620 an empirical task graph based on the data gathered 610. In one embodiment, the observed constructor 230 uses the actual graph and/or the ideal graph in assembling 620 the empirical graph. In one embodiment, the observed constructor 230 can add new edges between the nodes in the actual task graph. It can assign traffic numbers to both nodes and existing edges in the actual task graph. It can detect which deltas between the actual task graph and the ideal task graph do actually seem to pose problems for end-users. This information allows the report generator to assign both severities and even specific costs to different errors that had previously been detected on the site by both the blind 210 and sighted 220 constructors.

The empirical graph reflects the actual traffic patterns of users against the actual graph. It differs from the actual graph in several respects. First, both links and nodes are assigned values based on the number of users that traveled through them. In the case of links, the direction of travel is also recorded. This information can then be compared to the importance values assigned by the site creators to the different tasks in the task analysis application. The second difference is that links that are wholly unused are removed from the graph. However, virtual links may also appear. For example, if a large number of users are completing task A, returning to the home page and then immediately selecting a link to task B - and there was within task A no link to task B, a virtual link would be added to the empirical graph between task A and task B. Thus, previously undefined supertasks may be identified and subsequently tracked. In addition, the system also looks at long statistically non-random path sequences - including those that span multiple tasks - in the log file information to construct tasks not identified by either the ideal or actual graph. Where appropriate, edges will be added to connect tasks into supertasks in the empirical task graph. For example, an edge will be added if many end-users are transporting between two steps by performing key word searches. Edges will also be added to the empirical graph between steps or tasks that end-users were transported to in significant numbers as a result of interaction with the real-time intervention user interface if the preponderance of these transports were either a) confirmed as successful by the human operator, or b) resulted in the successful completion of a task in the fully automated system. Real-time intervention is described in copending application number entitled "Electronic Shopping Management Intervention" which is hereby incorporated by reference herein.

In one embodiment, the observed constructor 230 will construct a graph that may differ from the actual task graph in various respects, including:

• Adding an edge to the graph where a user has gone from step A to step B by:

   • Using the "back" or "forward" button on the browser

   • Being taken there by an interaction with the real-time intervention user interface (see patent #4)

   • Doing a keyword search

• Assigning additional weight to an edge that is disproportionately traveled relative to the business importance assigned to it in the task model

• Removing edges that have not been traveled since observation began

• Removing optional nodes (and links to them) that have not been traveled since observation began

• Removing the optional designation from nodes that virtually all customers visit

• Assigning additional weight to frequently traveled paths that may contain one or more tasks, in order to detect missing supertasks. Note that if there are no prerequisite constraints between the tasks in question, any path that contains all N tasks contiguously will be counted as the same path, regardless of the order they were performed in.
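Some of the adjustments listed above (traffic counts on edges, removal of untraveled edges, and addition of virtual edges) might be sketched as follows. This is a simplified illustration under stated assumptions: each session is treated as a plain sequence of visited steps, and the `min_virtual_traffic` threshold is a hypothetical stand-in for whatever test of "significant numbers" an embodiment would use. It does not collapse pass-throughs via the home page.

```python
from collections import Counter


def build_empirical_graph(actual_edges, sessions, min_virtual_traffic=2):
    """Return {edge: traffic} for the empirical graph.

    actual_edges: set of (src, dst) links found by spidering.
    sessions: list of step sequences, one per end-user visit.
    """
    traffic = Counter()
    for path in sessions:
        # Each consecutive pair of steps counts as one directed traversal.
        traffic.update(zip(path, path[1:]))

    empirical = {}
    for edge, count in traffic.items():
        if edge in actual_edges:
            empirical[edge] = count  # existing link, annotated with traffic
        elif count >= min_virtual_traffic:
            empirical[edge] = count  # virtual link (back button, search, etc.)
    # Edges in actual_edges that were never traveled are left out.
    return empirical
```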

Fig. 7 illustrates a sample task graph. In Fig. 7, the $100 task is clearly the most valuable one to the merchant, since the other tasks pictured are valued at $20 or less. The shortest - and hence best - path to the $100 blessed egress is from the home page -> A -> B -> $100 blessed egress. A slightly longer, and therefore less good, path is from home page -> A -> C -> D -> $100 blessed egress. However, a much worse path from the merchant's vantage point is home page -> A -> C -> G -> the $10 blessed egress, since he realizes only 1/10 the value he would from the better outcome. The lines extending from home page -> E -> G -> $10 blessed egress, and from home page -> A -> C -> G -> $10 blessed egress, indicate where the observed constructor 230 has detected significant differences between end-user behavior and the behavior preferred by the merchant, and in support of which the site was probably designed.
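The notion of a shortest path to a blessed egress can be illustrated with a breadth-first search over the edges named in the description of Fig. 7. The dictionary below is an assumed encoding of only those edges mentioned in the text; the figure itself may contain more.

```python
from collections import deque

# Edges named in the description of Fig. 7; node labels are shorthand.
FIG7_EDGES = {
    "home": ["A", "E"],
    "A": ["B", "C"],
    "B": ["$100"],
    "C": ["D", "G"],
    "D": ["$100"],
    "E": ["G"],
    "G": ["$10"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a fewest-click path or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None
```

Calling `shortest_path(FIG7_EDGES, "home", "$100")` yields the three-click path through A and B described above.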

B. Site Representative Profiler 120

In one embodiment, the site representative profiler 120 permits site representatives to set up different profiles for various representatives. There are two distinct kinds of site representatives who generally interact with a system in accordance with the present invention. One type of user representative includes marketing or e-commerce executives or senior managers (business knowledgeable users), while the other type of representative includes technical people (technical users).

The marketing or e-commerce executive or senior manager user representative may be at virtually any technical skill level, including none. The only requirement for this type of business knowledgeable user is that they have a detailed grasp of the site's business objectives. The real value such business knowledgeable users bring is knowledge of the business itself. For instance, such business knowledgeable users can include people having a retail background. Such people are often very familiar with traditional retail terms, but are less likely to be very web-savvy. Thus, in one embodiment, for these business knowledgeable users, the user interface presents all report data in pure business terms. For example, the word "page" never appears in a report; words like "goal," "objective," "direct revenue, " "merchandising," and "conversion rate" do.

The technical type of user generally supports the senior management type users described above. Such a technical user is often a Management Information Systems person. A site representative who identifies herself as a technical user will be expected to have some familiarity with web technology, and will be expected to know, for example, how to handle regular expressions.

It is to be noted that in many cases, both kinds of site representatives edit certain aspects of the task models and site representative profiles. Thus multiple site representative profiles must be supported by the site representative profiler 120, allowing different site representatives to use the tool as appropriate. Having such predefined site representative profiles allows a person to indicate which pre-set user category she falls into, so that the user experience can be customized accordingly. In one embodiment, each pre-defined user type will be followed by a brief description, so that the site representative can easily decide how to best categorize themselves.

In an embodiment of the present invention, a site representative can set up an individual profile to customize all aspects of the reports that he will receive, as well as determine the appearance of his personalized starting page. A name and a password that the site representative wishes to use may also be entered into the site representative profiler 120.

In one embodiment of the invention, the primary differences among the site representative profiles in the user interface are as follows:

Executive User: Only summary level reporting information is displayed by default for users who select this profile. By default, the only parts of the task model they may modify are the user/task matrix and any personal preference settings for the application. Using such settings, they may give themselves access to additional information, but they will be asked for extra confirmation if they try to change any values for any functionality that they did not have access to by default.

Business User: This user type is by default provided with access to all available kinds of reports. They have, by default, access to all parts of the data except for the regular expression (or other pattern specifier) data, and user account information.

Technical User: This user type has access to the full set of data. He can change any kind of data, and can create or delete user accounts from the system.

C. End-User Profiler 130

In one embodiment, the end-user profiler 130 allows the site representatives to set up different profiles for different types of end-users important to the site's business. Information about these different kinds of end-users is used throughout the system.

In one embodiment, the different types of end-users may be identified by domain. In another embodiment, the different types of end-users may be identified by login. In yet another embodiment, the different types of end-users may be identified by using a cookie database or a profiling or personalization engine. Further, in one embodiment, the relative importance of each type of user can be identified as being business critical, important, or opportunistic. The importance of different tasks, and possibly different steps within these tasks (in those tasks in which not every step is mandatory), can also be defined for this type of end-user. In addition, it is also possible to define what image and type of animation should be used to visually identify both the type of end-user and individual instances of the end-user type. These images/animations may be used in some reports generated by the report generator 160.

In one embodiment of the invention, an arbitrary number of different end-user profiles can be defined. In addition, an arbitrary number of different sets of end-user profiles may be defined. This might be done if, for example, it might sometimes be desirable to break down users by gender, but at other times it might be preferable to break down by age group. In one embodiment, the end-user profile set used will impact the ordering of nodes in the task graphs, because it may highlight a different set of both user priorities and problems.

In one embodiment, the importance of different tasks, and possibly different steps within these tasks (in those tasks in which not every step is mandatory,) can be defined for each type of user. This can be done, in one embodiment, by means of an end-user/task matrix. The end-user/task matrix is an editable table in which each task is prioritized for each type of user. Such a matrix can serve several purposes for the site representatives. Such a matrix can be a visual aid for assessing how many discrete types of end-users there really are from an empirical standpoint. That is, if there are two kinds of end-users defined, but the tasks they perform and the priority of these tasks is always the same, there is effectively only one type of end-user.
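The observation that two end-user types with identical task priorities are effectively one type can be checked mechanically. The sketch below assumes the matrix is held as a nested dictionary of numeric priorities; that representation and the function name are illustrative only.

```python
from collections import defaultdict


def redundant_user_types(matrix):
    """Group end-user types whose task priorities are identical.

    matrix: {user_type: {task: priority}} (an assumed representation).
    Returns a list of sorted groups that have more than one member.
    """
    groups = defaultdict(list)
    for user_type, priorities in matrix.items():
        # Sort the items so that dictionary ordering does not matter.
        key = tuple(sorted(priorities.items()))
        groups[key].append(user_type)
    return [sorted(members) for members in groups.values() if len(members) > 1]
```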

Further, such a matrix can be a visual aid for understanding how the priorities of tasks differ by end-user type, and what impact that has on the ideal navigational structure. This is important, as the end-user can get confused if the overall task priority for a site seems totally out of whack with a particular type of defined end-user. A site representative can modify these per end-user priorities.

D. Site Profiler 140

The site profiler 140 is where the relevant properties of the web site as a whole are defined. In one embodiment, the kinds of information obtained here include: the URL of the web site's home page, the location of the webserver's log files, URLs of partner sites that are to be handled as "blessed egresses," related sites (e.g., competitors, partners, partial competitors, and subset or superset relationships) that are to be periodically evaluated with either the blind 210 or sighted 220 constructors, the URLs and profiles of such related sites, an image to graphically represent it in various visualizations, and brand names and other site-specific terminology which should be ignored by the stemmer-related parts of the system in performing linguistic analysis.

This last piece of data is necessary in order to perform proper linguistic analysis of link names, especially comparative analysis. For example, in doing competitive site analysis between GTE and AT&T, the two task phrases "Check out GTE's Long Distance Rates" and "Check out AT&T's Long Distance Rates" should constitute a match. Thus, words such as brand names are added to the "stop" list for this site, or list of words that are automatically eliminated by the tokenizer as the phrase is first parsed.
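The effect of the stop list on the GTE and AT&T example can be sketched as below. The tokenization rule is a simplifying assumption, and no stemming is performed here; the sketch shows only the stop-list step.

```python
import re


def normalize_phrase(phrase, stop_words):
    """Lower-case, tokenize, and drop stop-listed words (no stemming here)."""
    # Strip possessive "'s" so that "GTE's" tokenizes as "gte".
    cleaned = phrase.lower().replace("'s", "")
    tokens = re.findall(r"[a-z0-9&]+", cleaned)
    return [t for t in tokens if t not in stop_words]


# Brand names placed on the stop list for this comparison.
STOP = {"gte", "at&t"}
```

With the brand names removed, both task phrases reduce to the same token list and can therefore be declared a match.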

E. Performance Analyzer 150

Analysis of the performance of the web site is performed with the data provided by all three constructors. The various embodiments of the present invention take an inherently incremental architectural approach. Thus, the blind constructor's 210 input alone would permit business knowledgeable users to experience some immediate gratification from learning something of interest very quickly. However, relatively little useful information can be obtained based on the blind constructor's 210 input alone, and the information that is obtained is more error prone. The input from the sighted constructor 220 and the observed constructor 230 respectively provide the system with increasingly more information. While a business knowledgeable user has to invest more time and effort to obtain this information, she is rewarded by a more detailed and accurate analysis. Different kinds of data are weaved in as they become available, and errors in interpretations from earlier stages in the processing are undone. Such an incremental approach thus combines some immediate gratification for the user along with an iterative refinement of the reports generated by the performance analyzer 150.

Various reports are generated based on the analysis performed by the performance analyzer 150. The reports generated include, in various embodiments, information such as a problem/solution-oriented view of the design flaws and non-optimalities in the site design, a detailed, page by page analysis of problems, and question and answer formats. The primary target audience for such reports includes both the high level decision makers who determine how and when the design of the web site may be changed based on these reports, as well as the implementors of such changes.

The performance analyzer 150 includes several different types of analyzers, which are described below.

1. Blind Analyzers 610:

These analyzers use data obtained from the blind constructor 210. Several kinds of errors can be identified by using input from the blind constructor 210 alone. While some of these identified potential errors may be conscious design choices and may be justified, an initial listing of these potential errors nonetheless often helps the user to identify some unintentional mistakes that often confuse end-users of the web site.

2. Sighted Analyzers 620:

These analyzers use data obtained from the sighted constructor 220.

Pre-emptability analyzer: The pre-emptability analyzer checks to ensure that the pre-emptability requirements of each task and step are met. A warning is flagged if this is not the case. In one embodiment, interruptability requirements are also checked.

Inaccessible Task Step Analyzer: This analyzer checks to ensure that all the steps that the user specified as belonging to a particular task are actually directly accessible from a link within that task.

Missing Task / Link to Task Analyzer: This analyzer serves several different functions.

1) To check that the anchor text of a link has stems similar to the stems in the user's description of the task in the task analysis application. It is to be noted that listings of individual products can cause a problem here without the assistance of an ontology. Accordingly, in one embodiment, this is handled heuristically. In some cases, the names of the links should not match the names of the task steps the links lead to, namely for product selection and research steps. Typically for these steps, pages contain a list of words which are either categories of products or products themselves. For instance, in a step named "select a computer", links could be "desktops, laptops, servers" or "500MHz Intel Celeron, Compaq Presario 350, ...". Several approaches can be used to avoid reporting link naming errors in this situation: a) Check if the items are ordered alphabetically. Product lists as well as product category lists are usually ordered alphabetically, so a long list of ordered links should be ignored for the analysis. This works for both product lists and category lists. b) Define a dictionary proper to the site or even proper to the task. For instance, this dictionary would define that laptops, desktops and servers are computers and that it is therefore acceptable for any of these words to appear instead of "computer". This is applicable for category lists only. c) Detect that a list of links contains only product names. Factors that can be used to detect lists of products include: link phrases contain mostly common nouns and no verbs; many proper nouns, identified by the fact that they are capitalized; many numbers, which are usually model numbers; uppercase words, which can be model numbers or brands; many words with no valid stem (brand names, model names). d) Define a dictionary of brand names and possibly model names proper to the site or the task. This is discussed further under the "Linguistic Analysis" section.

2) To flag any occurrence in the task graph where the first node of a task is one or more nodes further from the root in the actual task graph than it is in the ideal task graph.

3) To report paths that appear with statistical significance in the empirical task graph but do not appear in the actual task graph because one or more edges are missing. These edges could however have been virtually traversed either by using the forward and back buttons on the browser, with bookmarks, or by smoothing out a pass through from the home or other page. For example, if the probability is very high that if the end-user goes from step A to step B, he next goes to step C, and the path A -> B -> C is not part of any defined task, an edge connecting A directly to C would appear in the empirical graph, and the analyzer would flag a warning about a missing link.

4) To detect 'missing' links between related tasks, including tasks that often or necessarily precede one another.
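Two of the heuristics from item 1) above, the alphabetical-order check and the product-name check, might be sketched as follows. The thresholds (`min_length`, `threshold`) and the choice of signals are illustrative assumptions; the description does not give specific values, and a fuller version would also use part-of-speech and stem information.

```python
def is_alphabetical(links, min_length=5):
    """True for a sufficiently long list of link texts in alphabetical order."""
    lowered = [text.lower() for text in links]
    return len(links) >= min_length and lowered == sorted(lowered)


def looks_like_product_names(links, threshold=0.5):
    """Crude check: most words are capitalized or contain digits."""
    words = [w for text in links for w in text.split()]
    if not words:
        return False
    marked = sum(1 for w in words if w[0].isupper() or any(c.isdigit() for c in w))
    return marked / len(words) >= threshold


def skip_link_name_check(links):
    """Decide whether link-naming errors should be suppressed for this list."""
    return is_alphabetical(links) or looks_like_product_names(links)
```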

Backtrack Analyzer: This analyzer combines static semantic analysis with data on end-user behavior to determine whether a particular task or task step has a statistically higher than normal backtrack rate. A 'backtrack' is commonly defined as an almost instantaneous departure from a page to return to the just previous page, often before it is even completely loaded. This is most often accomplished with the use of the browser's back button, but could also be accomplished by clicking on a link at the top of the page that takes the user back to the previous page. In one embodiment, an access of the browser's stop button also is considered as a backtrack. This analyzer also analyzes a) the number of end-users who left this page with statistically unusual haste regardless of whether they returned to the previous page; b) if such end-users then, with any statistical significance, immediately go to a third page for which they demonstrate a sustained intent to complete the task, this is noted by the analyzer; and c) if some links to a particular task or step have a statistically significantly greater backtrack rate than others, this is noted by the analyzer.
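One possible way to decide whether a step's backtrack rate is statistically higher than normal is a one-sided z-test for a proportion against a site-wide baseline. The description does not name a particular test, so the choice of test, the baseline parameter, and the 1.645 cutoff (the one-sided 5% critical value of the standard normal distribution) are all assumptions made for illustration.

```python
import math


def backtrack_z_score(backtracks, visits, baseline_rate):
    """z-score of a step's backtrack rate against a baseline rate."""
    rate = backtracks / visits
    # Standard error of a proportion under the baseline hypothesis.
    std_err = math.sqrt(baseline_rate * (1 - baseline_rate) / visits)
    return (rate - baseline_rate) / std_err


def has_high_backtrack_rate(backtracks, visits, baseline_rate, z_threshold=1.645):
    """True if the observed rate exceeds the baseline by more than the cutoff."""
    return backtrack_z_score(backtracks, visits, baseline_rate) > z_threshold
```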

Missing Content Analyzer: In one embodiment, if a statistically significant number of end-users, or a statistically significant number of an important sub-population (e.g., an end-user profile type), access the help page associated with a particular step, a missing content alert will be reported. What part of the missing content the end-users were looking for may be detectable from the scrolling behavior of end-users. For example, if only a few end-users scrolled to the bottom of the page, in one embodiment it is inferred that the needed content lay near the top of the page.

End-User profile analyzer: This analyzer notes when end-users of a profile that were explicitly excluded from needing to complete that task during the manual creation of the task model, do so in statistically significant numbers. This is important data to know, since if different tasks on the site were very explicitly designed around the notion of only end-users of specific profiles or demographic backgrounds completing them, and this is not in fact the case, it may be appropriate to change the site design.

3. Observed Analyzers 630:

These analyzers use data obtained from the observed constructor 230. Some of the analyzers assess:

Customers immediately backtracking from a step, especially when coming from a specific destination.

Customers going from a higher dollar value task to a lower dollar value task in statistically noticeable numbers.

Customers cycling or spiraling around a particular step.

Customers performing the same N tasks together unexpectedly, suggesting that there is a supertask that should be defined.

Customers unexpectedly traveling from step A to step B via keyword search.

One perspective provided by the observed constructor is the number and kind of divergences from approved or "blessed" paths in the task model. Based upon input from the observed constructor 230, it can be observed when a statistically significant number of end-users take less than efficient paths through the web site. This could occur either because a shorter path was not readily apparent to many end-users or because there is a missing supertask definition and associated navigational structure. If a statistically significant number of end-users fall out of a task at a particular point, and proceed with another task, this could be because the task name was not sufficiently expressive so as to give users an accurate picture of the task. If a statistically significant number of end-users invoke a site service such as search or help at a particular point in a task, it could be because that particular step is confusing and/or is not clearly defined. Appendix A contains an example of a table which includes some indications of blame assignment, or cause and effect relationship between problems found by the blind 210, sighted 220, and observed 230 constructors. It is to be noted that, in one embodiment, the blame assessment involves not just end-user fall-off, but also the end-users' mood state at that time. For a further discussion regarding end-users' mood states, please refer to copending application number entitled "Electronic Shopping Management: User States."

4. Related Site Evaluator 640

The related site evaluator 640 performs competitive analysis between two or more related e-commerce sites. In one embodiment, the related sites are specified by the user in the site profiler 140. For each site that is specified, the business knowledgeable user indicates a spidering interval (i.e., the interval between observing and reporting on the contents of the site) and the relationship of the site to be analyzed to the primary site. As discussed above, the relationship can be one of unrelated, partner, direct competitor, partial overlap, superset of, or subset of. Such information is used to determine which parts of the two sites to usefully compare to one another. For example, one may only want to compare the CD part of Amazon.com (rather than the entire Amazon.com site) to CDNow.

In one embodiment, two related types of comparison are possible. The first causes the task prioritization for the primary site to be applied to the related sites. The second tries to assess what the relative task importances seem to be on the related sites, and how this differs from the primary site.

In either case, in one embodiment, each site to be compared is spidered by the blind constructor 210. Automated task discovery is performed. The task names found during the process are passed through the stemmer in order to extract the stems in the link text and/or page names that are used to identify the task. The stems extracted from these names are then compared, in one embodiment, to the union of the stems from a) the names of the tasks entered by the business knowledgeable user, b) the text of each link that points to the first step of that task on the related site, and c) the text description of the task entered by the business knowledgeable user.

In one embodiment, a synonyms dictionary or a full ontology can be used with the stemmer to handle more cases than just simple stem matching. For instance, such a dictionary can indicate that a laptop is a kind of computer, and task phrases may be matched accordingly. This can be implemented in several ways, including a) integration with generic third party thesauruses or ontologies, b) integration with domain-specific or vertical thesauruses or ontologies, c) allowing the business knowledgeable user to enter specific synonyms they would like to match to specific words as an optional step in the task analysis application. In another embodiment, simple ontological relationships are supported.

In one embodiment, if both the (likely) verb and noun stems match, the tasks on the two sites are said to match. If any of the stems appear in the "match" list (that is, a list of stems that when found automatically match regardless of the presence or absence of other stems), then the tasks are said to very likely match. If more than one of the stems is in the match list, the probability of a match is even greater. If a stem on the "kill" list is detected, the whole phrase will be rejected from analysis. If a stem is on the "stop" list, the stem will be removed from the phrase, but the phrase will be processed. If only one of the stems matches, a match may be declared if the stem in question is rarely occurring on the site. Specifically, in one embodiment, in order to avoid multiple possible matches, the stem would have to either a) appear only in the name and path of one task, or b) appear with close to an order of magnitude greater frequency in one task than it does in any other task.
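The matching rules above can be sketched as follows, assuming the phrases have already been reduced to lists of stems. The list contents, the function names, the use of "two shared stems" as a stand-in for "verb and noun stems both match," and the factor of 10 for "an order of magnitude" are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical lists; their actual contents are not enumerated here.
MATCH_LIST = {"checkout"}
KILL_LIST = {"copyright"}
STOP_LIST = {"the", "your", "a"}


def clean(stems: List[str]) -> Optional[List[str]]:
    """Apply the kill and stop lists; None means the phrase is rejected."""
    if any(s in KILL_LIST for s in stems):
        return None
    return [s for s in stems if s not in STOP_LIST]


def is_rare(stem: str, task_counts: Dict[str, Dict[str, int]], task: str) -> bool:
    """True if the stem occurs only in `task`, or at least 10x more often
    there than in any other task (the factor 10 is an assumption)."""
    here = task_counts.get(task, {}).get(stem, 0)
    others = [c.get(stem, 0) for t, c in task_counts.items() if t != task]
    most_elsewhere = max(others, default=0)
    if here == 0:
        return False
    return most_elsewhere == 0 or here >= 10 * most_elsewhere


def classify(primary: List[str], related: List[str],
             task_counts: Dict[str, Dict[str, int]], task: str) -> str:
    """Classify a pair of stemmed task phrases from two sites."""
    a, b = clean(primary), clean(related)
    if a is None or b is None:
        return "rejected"
    shared = set(a) & set(b)
    if shared & MATCH_LIST:
        return "very likely match"
    if len(shared) >= 2:  # stands in for "verb and noun stems both match"
        return "match"
    if len(shared) == 1 and is_rare(next(iter(shared)), task_counts, task):
        return "match"
    return "no match"
```

Here `task_counts` maps each task on the site to the number of times each stem occurs in its name and path, so that a single shared stem counts only when it is distinctive for one task.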

In one embodiment, a probability of match is computed based on the stem distribution throughout the site. For example, if a site has a large number of very similar tasks in it which thus contain highly similar lexical content, the probability of a match can never be anywhere near 100%.

If the two sites were being evaluated according to the business objectives of the primary site, in one embodiment, what would have been sighted constructor 220 errors on the primary site are translated into messages that advise the business knowledgeable user that a particular task seems to be given more - or less - importance on the related site than it is on the primary site. For instance, a warning is flagged if an important or business critical task for the primary site is being assigned more importance on the competitor's site than on the primary site. In one embodiment, a list of the tasks discovered on each site is compiled in a table. The perceived or known importance of each task is listed, as is which site(s) it was detected on.
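One possible way to produce such advisory messages is sketched below. It assumes that each matched task has already been given a numeric perceived importance on each site (higher meaning more prominent); the scoring scale, the function name, and the sample data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def compare_importance(primary: Dict[str, int], related: Dict[str, int],
                       critical: Set[str]) -> List[Tuple[str, str, bool]]:
    """For each task found on both sites, report whether the related site
    gives it "more" or "less" importance than the primary site. The flag
    is True when a business critical task gets more importance elsewhere."""
    rows = []
    for task in sorted(set(primary) & set(related)):
        if related[task] > primary[task]:
            rows.append((task, "more", task in critical))
        elif related[task] < primary[task]:
            rows.append((task, "less", False))
    return rows


# Hypothetical perceived-importance scores for two matched tasks.
primary_site = {"buy cd": 3, "track order": 2}
related_site = {"buy cd": 5, "track order": 1}
rows = compare_importance(primary_site, related_site, critical={"buy cd"})
```

In this example the first row is flagged, because a task that is critical for the primary site appears to be given more importance on the related site.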

F. Report Generator 160

In one embodiment of the present invention, there are three levels of reports corresponding to the three different constructors 210, 220, and 230.

1. "Blind" Report:

This report contains only data gathered by the blind constructor 210. In one embodiment, such a report is not based on any human provided data. In another embodiment, the only human entered information on which the blind report is based is the starting URL, and the starting URLs for any related sites that are to be spidered. In one embodiment, most of the information in this report is presented in list format, and is provided on a page basis rather than being presented according to tasks and steps. This report is often targeted at the technical user, who can use it to identify specific errors on each page of his site, and thus repair them.

An example of a report is provided in Appendix B. A blind report can contain several kinds of data. Such data can include a preliminary task list; individual errors generated by the blind analyzers; aggregated error information; and basic related site analysis.

Aggregated error information. This includes which errors occurred most frequently, which errors occurred in every page analyzed, which errors occurred on a majority of pages, and so on. Errors may be grouped by severity.
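A minimal sketch of this aggregation is shown below, assuming the blind analyzers have produced a list of error names for each page. The function name, the dictionary layout, and the sample errors are illustrative assumptions.

```python
from collections import Counter
from typing import Dict, List


def aggregate_errors(errors_by_page: Dict[str, List[str]]) -> dict:
    """Summarize per-page error lists into site-wide aggregates."""
    total_pages = len(errors_by_page)
    occurrences: Counter = Counter()  # total occurrences of each error
    pages_with: Counter = Counter()   # number of pages showing each error
    for errors in errors_by_page.values():
        occurrences.update(errors)
        pages_with.update(set(errors))
    return {
        "most_frequent": occurrences.most_common(),
        "on_every_page": sorted(e for e, n in pages_with.items()
                                if n == total_pages),
        "on_majority": sorted(e for e, n in pages_with.items()
                              if n > total_pages / 2),
    }


# Hypothetical analyzer output for three pages.
pages = {
    "/": ["missing alt text", "broken link"],
    "/cart": ["missing alt text", "missing alt text"],
    "/help": ["missing alt text", "broken link"],
}
summary = aggregate_errors(pages)
```

Grouping by severity could be layered on top by mapping each error name to a severity level before counting.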

Competitive Task Phrase Matrices. As discussed previously, links in navigation bars (as detected by the blind constructor) on both the primary site being analyzed, and any competitive sites are analyzed to see if they correspond to a goal-oriented phrase, most typically a verb noun construction. Stems are extracted from all such phrases, and if the stems match, the task is said to exist on all sites that contained it. This is explained in further detail 630.

Preliminary or "Blind" Task List. As discussed earlier, the blind constructor performs automated task detection. These tasks are listed, along with the steps detected for them, as best the blind constructor could discern. Because this analysis is completely "blind," this information can also be gathered on competitive sites, and comparisons in numbers of steps, and names of steps can be made.

All information in this report is presented in list format, and, apart from aggregate information, is provided on a page basis rather than being presented according to tasks and steps. This report is targeted at the technical user, who can use it to identify specific errors on each page of his site, and thus repair them. It is also used when there is no human entered data available.

2. "Sighted" Report:

In one embodiment, the sighted report is based on information provided by the sighted constructor 220. Thus the sighted report has knowledge of human entered data, and the resulting task model. The sighted report can therefore present higher level information, including normative statements involving pre-emption, ordering violations, and interruptability. The sighted report is targeted primarily at business knowledgeable users.

The sighted report will, in one embodiment, have full coverage of tasks. Further, because of knowledge of business objectives, this report can provide automated suggestions as to how to re-arrange the presentation of links so as to align the presentation with the business objectives. In one embodiment, it will suggest correcting any differences between the actual and ideal task graphs. Such suggestions include moving a link to an important task higher up on the page than links to less important tasks, placing a link to a critical task on the home page where it was previously absent, etc. As discussed earlier, the sighted constructor will also refine the interpretations of the blind constructor based on its increased knowledge. For example, it will remove errors involving a missing "next step" link from blessed egress pages.
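As an illustrative sketch of such suggestions, the following assumes the links on a page are known in top-to-bottom order and that each linked task carries a numeric importance from the task model. The function names and sample data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def ordering_suggestions(links_in_page_order: List[str],
                         importance: Dict[str, int]) -> List[Tuple[str, str]]:
    """Return (task_to_move_up, task_it_should_precede) pairs for every
    link that sits below a link to a less important task."""
    suggestions = []
    for i, upper in enumerate(links_in_page_order):
        for lower in links_in_page_order[i + 1:]:
            if importance[lower] > importance[upper]:
                suggestions.append((lower, upper))
    return suggestions


def missing_from_home(home_links: List[str], critical: Set[str]) -> List[str]:
    """Critical tasks that have no link on the home page."""
    return sorted(critical - set(home_links))


# Hypothetical home page links and task importances.
links = ["about us", "buy cd", "track order"]
scores = {"about us": 1, "buy cd": 5, "track order": 3}
moves = ordering_suggestions(links, scores)
absent = missing_from_home(links, {"buy cd", "register"})
```

In this example both "buy cd" and "track order" would be suggested for placement above "about us," and "register" would be reported as a critical task missing from the home page.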

The sighted report will typically have full coverage of tasks, since information will be provided to allow it to drive by all forms. This report is targeted primarily at business users. In one embodiment, minor errors from the blind constructor are removed to make the report shorter. It can be run without any data from the observed constructor. This is important, since it means that this report can be run on test or pre-production sites that do not yet have real user traffic to analyze.

3. "Observed" Report:

The observed report is based on information provided by the observed constructor 230, and thus has knowledge of both the task model and the observed end-user behavior on the site. This combined knowledge allows it to prioritize problems based on their cost to the merchant. The observed report is targeted primarily at business knowledgeable users.

In one embodiment, the top ten most costly problems can be listed in order at the top of the report. In another embodiment, problems are presented in relation to the business objectives they are impairing, as garnered from the mapping of tasks to business objectives in the task model. In still another embodiment, statistically significant changes over previous observations are highlighted at the top of the report. In yet another embodiment, this information is presented in a question and answer format. For example, a question could be: "Which task cost me the most money this month?" Answers to such questions can be computed because the task model contains information about the value of successful completions to the merchant for each task on the site, both by end-user type and on average. When a (negative) dollar value is assigned to such problems, users may have more incentive to remedy them. The table in Appendix A lists some examples of such usability problems, the design problem to which each such problem may be linked, and a rule of thumb stating how such a problem can be detected in some embodiments of the present invention.
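A simple sketch of such a cost ranking follows. It assumes that the cost of a problem can be approximated as the number of completions lost to it multiplied by the value of a successful completion from the task model; that formula, the `Problem` record, and the sample figures are illustrative assumptions rather than the described computation.

```python
from typing import Dict, List, NamedTuple


class Problem(NamedTuple):
    task: str
    description: str
    lost_completions: int        # hypothetical: completions lost to this problem
    value_per_completion: float  # value of a successful completion, per the task model


def cost(p: Problem) -> float:
    """Assumed cost model: lost completions times value per completion."""
    return p.lost_completions * p.value_per_completion


def most_costly(problems: List[Problem], limit: int = 10) -> List[Problem]:
    """The `limit` most costly problems, most expensive first."""
    return sorted(problems, key=cost, reverse=True)[:limit]


def costliest_task(problems: List[Problem]) -> str:
    """Answer "which task cost me the most money?" by summing per task."""
    totals: Dict[str, float] = {}
    for p in problems:
        totals[p.task] = totals.get(p.task, 0.0) + cost(p)
    return max(totals, key=totals.get)


# Hypothetical problems observed during one reporting period.
problems = [
    Problem("buy cd", "checkout form rejects valid input", 40, 12.0),
    Problem("register", "sign-up link hard to find", 100, 1.5),
    Problem("buy cd", "cart page loads slowly", 5, 12.0),
]
top_two = most_costly(problems, limit=2)
worst = costliest_task(problems)
```

Breaking the totals down by end-user type would follow the same pattern with an additional grouping key.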

In one embodiment, the observed report also contains both end-user traffic data and cost data relating to any edges that exist on the empirical graph, but not on the actual graph. Similarly, it notes gross disparities between links to tasks that the merchant has deemed relatively unimportant in the task model, but which are nevertheless receiving a high quantity of end-user traffic. For example, the observed constructor 230 may report whenever a statistically significant number of end-users pursued a link to a lower importance task over one or more tasks of higher importance.
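The two checks described above can be sketched as follows. Edges are represented as (from page, to page) pairs, and a fixed traffic ratio stands in for a proper test of statistical significance; both choices, along with all names and sample data, are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def empirical_only_edges(actual: Set[Edge],
                         empirical: Dict[Edge, int]) -> Dict[Edge, int]:
    """Edges that end-users traversed but that are absent from the actual
    task graph, with their observed traffic counts."""
    return {edge: n for edge, n in empirical.items() if edge not in actual}


def traffic_disparities(traffic: Dict[str, int], importance: Dict[str, int],
                        ratio: float = 2.0) -> List[Tuple[str, str]]:
    """Pairs (low, high) where a lower importance task drew at least
    `ratio` times the traffic of a higher importance task."""
    found = []
    for low in traffic:
        for high in traffic:
            if (importance[low] < importance[high]
                    and traffic[low] >= ratio * traffic[high]):
                found.append((low, high))
    return sorted(found)


# Hypothetical graphs and traffic figures.
actual_edges = {("home", "cart"), ("cart", "checkout")}
observed_edges = {("home", "cart"): 120, ("cart", "help"): 45}
extra = empirical_only_edges(actual_edges, observed_edges)

visits = {"read reviews": 900, "buy cd": 300}
ranks = {"read reviews": 1, "buy cd": 5}
gaps = traffic_disparities(visits, ranks)
```

Cost data could be attached to each reported edge in the same way as the traffic counts.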

In one embodiment, the report generator 160 displays two graphs in conjunction with each other: the actual task graph and the empirical task graph. Where the two graphs differ, the links and nodes that differ are rendered in red in one embodiment. Specifically, this includes:

• Edges that occur only in the empirical graph

• Error and help nodes and related edges that appear only if they are frequently accessed.

• User traffic as detected by the observed constructor that is highly disproportionate to the amount of business importance assigned in the task model.

In one embodiment, the different kinds of information here described in the various reports can be configured by the site representative to appear or not appear, or appear in a particular order in a custom report format. The site representative may also specify at what point to truncate the display of a particular type of information. For example, she might decide to only show the 10 most serious problems detected per step.

From the above description, it will be apparent that the present invention disclosed herein provides a novel and advantageous method and system for constructing and analyzing task models of web sites. The foregoing discussion discloses and describes merely exemplary methods and embodiments of the present invention. As will be understood by those familiar with the art, the invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. In addition, it will be noted by one skilled in the art that the organization and placement of the areas in figures depicting user interfaces, are merely illustrative, and are not limited by the present invention. Variations in the placement, size, and shape of the areas would be readily apparent to those of skill in the art of user interface design. Accordingly, the disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.

Claims

We claim:

1. A computer implemented method for analyzing the effectiveness of a web site directed to electronic shopping, the method comprising: constructing a task model for the web site, wherein the task model includes a plurality of tasks that end-users may perform at the web site, and a gradation of each of the plurality of tasks in accordance with their relative importance; analyzing the effectiveness of the web site based on the task model; suggesting modifications to the web site to improve the effectiveness of the web site; and incrementally refining the task model in accordance with the modifications to the web site.

2. The method of claim 1, wherein constructing a task model for the web site comprises: performing a blind analysis of the web site resulting in a preliminary task model; interactively engaging the user to review the preliminary task model; refining the task model; obtaining data regarding functioning of the web site; and refining the task model further in accordance with the data obtained.

3. The method of claim 2, further comprising: performing a blind analysis of at least one web site competing with the first web site.

4. The method of claim 2, further comprising: comparing the preliminary task model to a business model based on which the web site is constructed; and responsive to the task model being different from the business model, receiving suggestions from the user for refining the task model.