EP3369069A1 - Escrow personalization system - Google Patents

Escrow personalization system

Info

Publication number
EP3369069A1
Authority
EP
European Patent Office
Prior art keywords
tax return
data
escrow
data store
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP15907552.2A
Other languages
German (de)
French (fr)
Other versions
EP3369069A4 (en
Inventor
Tristan Cooper BAKER
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intuit Inc
Original Assignee
Intuit Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intuit Inc
Publication of EP3369069A1
Publication of EP3369069A4
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/12 Accounting
    • G06Q40/123 Tax preparation or submission
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/08 Payment architectures
    • G06Q20/20 Point-of-sale [POS] network systems
    • G06Q20/207 Tax processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/10 Tax strategies

Definitions

  • Embodiments are related to computer-centric and internet-centric technologies such as preparation of electronic tax returns by an online tax return preparation application accessible by browsers executing on user computing devices, data and memory management and computing system efficiencies, user interface generation and real-time user interface adaptation and modification, interfacing with or communicating with on-line computer applications, and modular system components that can be used to implement changes to an on-line tax return preparation application without having to change code or components of the on-line tax return preparation application itself.
  • Embodiments also involve or are related to user interfaces generated during preparation of electronic tax returns and personalizing user interface content for particular users, personalizing tax return preparation experiences, modifying user interfaces and user experiences on the fly in real time during preparation of an electronic tax return, and adapting user interfaces in response to changes in electronic tax return data.
  • Embodiments also involve or are related to data escrow systems and how components of data escrow systems may operate independently of or asynchronously relative to an online tax return preparation application while still interfacing with the online tax return preparation application to respond to an application's request for results generated by the data escrow system.
  • Such results may be in the form of a personalized tax return topic ranking that is incorporated into an interview screen that is presented to a user of an online tax return preparation application during preparation of an electronic tax return.
  • Examples of tax return topics that may be processed by embodiments and integrated into a personalized tax return topic ranking include, but are not limited to, income, deductions, and taxes paid.
  • Embodiments may also be utilized to rank sub-topics of a topic, e.g., certain types of income, such as wages, or other income such as business income, interest, dividends, pension income, annuities, rental income, unemployment compensation, capital gains, gambling income, farming and fishing income, clergy earnings, social security, scholarships, alimony, canceled debt, and 401(k) distributions.
  • sub-topics for the topic of "deductions" may include medical and dental expenses, home mortgage points, interest expenses, charitable contributions, business related expenses (car, travel, etc.), educational expenses, property taxes.
  • a tax return topic ranking may include only "root" node topics (e.g., income, deductions, etc.), a combination of "root" and "leaf" node topics / sub-topics (e.g., income, property tax, mortgage interest), only "leaf" node sub-topics of a certain topic (e.g., property taxes paid, points, mortgage interest), "leaf" node sub-topics of different topics (e.g., property taxes paid, points, mortgage interest, wages, dividends), or a combination of one or more of topics, sub-topics and further drilled down or lower level, more specific sub-topics.
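The root and leaf distinction described above can be pictured as a small topic tree. The following is a minimal sketch in Python, not the patented implementation: the contents of `TOPIC_TREE` and the helper names `roots`, `leaves` and `is_valid_ranking` are hypothetical and only illustrate how a ranking might mix root topics and leaf sub-topics.

```python
# Hypothetical topic tree: each root topic maps to its leaf sub-topics.
TOPIC_TREE = {
    "income": ["wages", "dividends", "interest"],
    "deductions": ["property taxes paid", "points", "mortgage interest"],
}


def roots(tree):
    """Return only the root topics, in tree order."""
    return list(tree)


def leaves(tree, topics=None):
    """Return the leaf sub-topics of the given root topics (all roots if None)."""
    selected = tree if topics is None else topics
    return [leaf for topic in selected for leaf in tree[topic]]


def is_valid_ranking(tree, ranking):
    """A ranking may mix roots and leaves, but must not repeat or invent items."""
    known = set(roots(tree)) | set(leaves(tree))
    return len(set(ranking)) == len(ranking) and set(ranking) <= known
```

For example, `["income", "property taxes paid", "mortgage interest"]` is a mixed root and leaf ranking that this check accepts.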
  • Embodiments also involve or are related to processing of a small portion of data managed by an online tax return preparation application in order to make changes to and improve interview screens and user experiences during preparation of electronic tax returns. Embodiments also involve or are related to reducing the amount of tax return data that is processed in order to implement dynamic changes to interview screens during preparation of an electronic tax return to provide personalized user experience, thus achieving improvements while providing for efficient use of computing resources as a result of, for example, reduced demands on processor and memory components and reduced data transmission between components and/or through networked components.
  • embodiments may involve transformation of on-line tax return preparation experiences as electronic tax return data is changed by a user of an online tax return preparation application.
  • embodiments are adaptive to these changes and provide for generation of interview screens that reflect these changes and include different tax return topic rankings or topic menus with topic items that are arranged in different ways depending on the data provided.
  • Embodiments also involve or are related to transforming static menus or tax return topic listings into dynamic, adaptive menus or tax return topic listings that prioritize what a user is more likely to select or view, thus providing for more efficient menu presentation, electronic tax return preparation and computing resources to implement interview screens with more streamlined or reduced user interactions with interview screens and menu items generated by an online tax return preparation application.
  • an online tax preparation application may manage a "master" persistent data store that includes tax return data of millions of taxpayers, whereas embodiments utilize a separate data store or cache that includes only a small portion of that persistent data store and serves as a low-latency remote persistence mechanism where user data is stored and indexed for fast access and retrieval at decision computation time.
  • Embodiments utilize a cache of data required by models that are used to compute decisions in the form of topic rankings, for example, for currently active users (i.e., users with an active session or who are logged into the online tax return preparation application). While the runtime persistence data store managed by the online tax return preparation application is used for maintaining all application data for all users over all time, the cache component utilized according to embodiments maintains only a small subset of that persistent data for active users, and only for the duration of the active user's session.
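The session-scoped cache described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the class name `SessionCache`, its methods and the in-memory dictionary are hypothetical stand-ins for the second data store, which the source does not specify at this level of detail.

```python
class SessionCache:
    """Hypothetical stand-in for the second data store (cache).

    It holds only the data types needed by models, only for users with an
    active session, and drops everything when the session ends.
    """

    def __init__(self):
        self._records = {}  # user_id -> {data_type: value}

    def start_session(self, user_id):
        # Create an empty escrow area when the user logs in.
        self._records.setdefault(user_id, {})

    def put(self, user_id, data_type, value):
        # Refuse to cache data for users without an active session.
        if user_id not in self._records:
            raise KeyError(f"no active session for {user_id}")
        self._records[user_id][data_type] = value

    def get(self, user_id):
        # Return a copy so callers cannot mutate the cached record.
        return dict(self._records.get(user_id, {}))

    def end_session(self, user_id):
        # Cached data lives only for the duration of the session.
        self._records.pop(user_id, None)
```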
  • Embodiments also involve models, such as predictive models, and integration of predictive models into on-line tax return preparation applications. Results generated by models are incorporated into user interfaces or interview screens generated by a tax return preparation application. Embodiments also involve a modular escrow system that allows for changes to individual components such as models and specification files regarding data required for model execution, without having to modify code or components of an on-line tax return preparation application.
  • Embodiments also involve how a tax topic ranking system can execute independently of or asynchronously relative to an on-line tax return preparation application for which the topic ranking is generated.
  • Embodiments also involve user interfaces, such as user interfaces generated by online tax return preparation applications, and how such user interfaces can be modified during preparation of an electronic tax return, and in the background for an interview screen that is not currently viewed by a user.
  • a computerized tax return preparation system includes an on-line tax return preparation application and a tax return topic ranking system.
  • the on-line tax return preparation application (such as turbotax.com) can be accessed by respective user computing devices executing respective browsers to prepare respective electronic tax returns of respective users.
  • the on-line tax return preparation application is configured to write respective electronic tax return data of respective users to a first data store, or a master persistent data store. This first data store may include electronic tax return data of millions of customers for many years.
  • the tax return topic ranking system is in communication with the on-line tax return preparation application and the first data store and can operate independently of the on-line tax return preparation application while interfacing with the online tax return preparation application to provide tax return topic rankings thereto.
  • the tax return topic ranking system is configured to retrieve from the first data store particular types of data for respective users that are logged into the online tax return preparation application, and the small portions of data of respective logged-in users retrieved from the first data store are included in generated escrow records, which are stored or cached to a different, second data store.
  • a model is executed utilizing the particular types of electronic tax return data to generate a tax return topic ranking, which is provided to the online tax return preparation application.
  • the on-line tax return preparation application is configured to generate an interview screen including tax return topics structured according to the generated tax return topic ranking.
  • the escrow controller is in communication with the on-line tax return preparation application and can receive data from the online tax return preparation application indicating, for example, that a user has initiated an on-line session or logged into the online tax return preparation application.
  • the escrow controller is configured to process the model specification file, or escrow contract, to determine which types of electronic tax return data are required in order to trigger execution of a model.
  • One or multiple models may be applicable to a user and an escrow record may be generated for each active model for a user, and one or more escrow contracts may specify one or more types or combinations of data required for respective model inputs and execution.
  • the selective data retrieval service is in communication with the escrow controller and the persistent or first data store utilized by the online tax return preparation application to maintain all application data for all users over all time.
  • the second data store or cache is in communication with the escrow controller.
  • the selective data retrieval service is configured to retrieve the logged in user's electronic tax return data of types identified by the model specification file (e.g., based on a subscription or request by the escrow controller) and provide the retrieved electronic tax return data to the escrow controller.
  • the selective data retrieval service may be a polling service employed by the escrow controller, or a native element of the first data store that automatically notifies the escrow controller of changes in specified types of electronic tax return data for logged-in users without the need for polling or a subscription for specific data or changes thereto.
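The polling variant of the selective data retrieval service can be sketched as a read-only comparison against what was seen on the previous poll. This is a minimal Python sketch, not the patented design: the function name `poll_changes`, the dictionary-shaped first data store and the `last_seen` snapshot are assumptions made for illustration.

```python
_MISSING = object()  # sentinel so that a stored None still counts as a value


def poll_changes(first_store, user_id, subscribed_types, last_seen):
    """Return the subscribed data types whose values changed since the last poll.

    first_store is only read, never written, mirroring the read-only access
    the ranking system has to the master data store.
    """
    current = first_store.get(user_id, {})
    changes = {}
    for data_type in subscribed_types:
        value = current.get(data_type, _MISSING)
        if value is not _MISSING and value != last_seen.get(data_type, _MISSING):
            changes[data_type] = value
    return changes
```

Data types outside `subscribed_types` are never returned, so only the small portion of data named by the specification file leaves the first data store.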
  • the escrow controller can generate an escrow record for the user or online session that includes the retrieved electronic tax return data, which may be some or all of the data types identified by the specification file.
  • model execution is triggered, and for this purpose, the escrow controller is configured to issue a call to the model processor to execute a model identified in the model specification file.
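The relationship between an escrow contract, an escrow record and the model trigger can be sketched as below. This is an illustrative Python sketch under assumptions: the class names `EscrowContract` and `EscrowRecord`, the field names, and the `model_processor` callable are hypothetical, and the source does not define the format of the model specification file.

```python
from dataclasses import dataclass, field


@dataclass
class EscrowContract:
    """Hypothetical model specification: which data types a model needs."""
    model_id: str
    required_types: frozenset


@dataclass
class EscrowRecord:
    """Hypothetical escrow record: data cached for one user and one model."""
    user_id: str
    model_id: str
    data: dict = field(default_factory=dict)


def is_satisfied(contract, record):
    # The contract is satisfied once every required data type is escrowed.
    return contract.required_types.issubset(record.data)


def maybe_execute(contract, record, model_processor):
    """Call the model processor only when the contract is satisfied."""
    if not is_satisfied(contract, record):
        return None  # data stays escrowed until requirements are met
    inputs = {t: record.data[t] for t in contract.required_types}
    return model_processor(contract.model_id, inputs)
```

Changing a model's data requirements then means editing the contract only, which reflects the modularity the text describes.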
  • a result generated by execution of a model is a ranking of tax return topics, or a ranking of menu items (generally, ranking of tax return topics). This ranking result is provided to the online tax return preparation application.
  • the escrow controller may be configured to store the model result in an independent third data store until the model result is requested by the online tax return preparation application.
  • the third data store may include the most recent or current model result generated and can be modified as a result of iterations of changes to the first data store data, changes to the second data store or cache data, changes to whether model requirements are satisfied and which models are executed with different inputs, and resulting different results.
  • a further system embodiment includes one or more or all components of a computerized tax return topic ranking system discussed above (including an escrow controller, a model specification file, a selective data retrieval service, a second data store or cache for escrowed data, an execution service or model processor, and a third data store) and the on-line tax return preparation application.
  • Further embodiments are directed to computer-implemented methods for generating ranked or prioritized data, e.g. in the form of a list or menu item, such as a list of tax return topics that is presented to a user of an online tax return preparation application.
  • Additional embodiments are directed to computer program products or articles of manufacture comprising a non-transitory computer readable medium embodying instructions executable by a computer to execute a process generating a ranking of tax return topics for presentation to a user of an online tax return preparation application.
  • Yet other embodiments are directed to adaptive or dynamic user interfaces or interview screens that reflect or can change given a current state of a user's tax return data.
  • Further embodiments are directed to how a small portion of data is retrieved from the first or persistent data store that maintains all data of all users of the online tax return preparation application and processed to generate portions of interview screens or user interfaces during preparation of an electronic tax return.
  • a third data store different from and independent of the first data store and the second data store is utilized to store a generated tax return topic ranking for subsequent retrieval in response to a ranking request by the on-line tax return preparation application.
  • the on-line tax return preparation application can write data to and read data from the first data store, but the on-line tax return preparation application cannot write data to or read data from the second data store or the third data store.
  • a model that is utilized to generate a tax topic ranking may be, or be based on, one or more predictive models including logistic regression; naive bayes; k-means classification; K-means clustering; other clustering techniques; k-nearest neighbor; neural networks; decision trees; random forests; boosted trees; k-nn classification; kd trees; generalized linear models and support vector machines.
  • One or multiple models may be published or activated for use, and one or multiple models may be available for execution for an individual user session, and escrow records can be generated for respective models for each logged in user.
  • the tax return topic ranking system is initiated for a user in response to receiving data indicating that a user has logged into the online tax return preparation application or initiated an on-line session, and terminated when the user has logged off from the online tax return preparation application or terminated the on-line session.
  • the tax return topic ranking system receives a request for a tax return topic ranking from the on-line tax return preparation application during preparation of the electronic tax return for a logged-in user.
  • the type of response provided depends in part upon whether a model has been executed and a result for a logged-in user that is the subject of the request is stored in the third data store.
  • a request received by an escrow controller may identify a logged-in user, and in response, the escrow controller may initially access the third data store to determine whether a tax return topic ranking has already been generated for the identified logged-in user. If so, this ranking can be provided in response to the request. If not, a determination can be made whether the escrow record data of the identified user in the second data store satisfies pre-determined criteria or escrow requirements. If these criteria or requirements are satisfied, a corresponding model can be executed to generate a tax return topic ranking, which is provided to the on-line tax return preparation application in response to the request.
  • If the criteria are not satisfied, the tax return topic ranking system can wait for the escrow record to be modified such that the pre-determined criteria are satisfied and the corresponding model can be executed, and/or respond to the request by notifying the online tax return preparation application that no generated tax return topic ranking is available, in which case the online tax return preparation application may utilize a default or pre-determined topic ranking.
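The request-handling order described in the two preceding paragraphs (stored result first, then escrow criteria, then a "no ranking" response) can be sketched as follows. This is a minimal Python sketch, not the patented implementation: the function name, the dictionary-shaped stores and the `run_model` callable are hypothetical, and writing the fresh result to the third data store at request time is an assumption.

```python
def handle_ranking_request(user_id, third_store, second_store,
                           required_types, run_model):
    """Answer a ranking request for a logged-in user, or return None."""
    # 1. A ranking was already generated for this user: return it.
    if user_id in third_store:
        return third_store[user_id]

    # 2. Otherwise check whether the escrow record satisfies the criteria.
    record = second_store.get(user_id, {})
    if all(data_type in record for data_type in required_types):
        ranking = run_model({t: record[t] for t in required_types})
        third_store[user_id] = ranking  # assumption: keep it for later requests
        return ranking

    # 3. No ranking available; the application may use its default order.
    return None
```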
  • When the escrow record is modified as a result of adding, deleting or changing data to generate a different escrow record, such modifications may result in execution of different models, which may result in different topic rankings.
  • data analyzed relative to requirements of an escrow contract or specification file is read from a single source, i.e., the second data store or cache of the portion of the data in the first data store.
  • electronic tax return data is provided by multiple sources, and in one embodiment, this involves both the second data store or cache and the on-line tax return preparation application, e.g., the request provided by the online tax return preparation application.
  • When escrow record data is modified, this may result in execution of the same model again with different data of the same type for that model, which may or may not result in a different tax return topic ranking.
  • changes to an escrow record may trigger execution of a different model, which may result in a different tax return topic ranking.
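How a modified escrow record might lead to re-execution of the same model or execution of a different one can be sketched as a selection step over all published contracts. This is a hedged Python sketch: the function name `models_to_execute`, the contract dictionary, and the rule that a model is re-run only when one of its required data types changed are all assumptions, not details given in the source.

```python
def models_to_execute(contracts, record, changed_types):
    """Pick the models to run after an escrow record changes.

    contracts maps a model id to the set of data types it requires.
    A model is selected when the record satisfies its requirements and
    at least one of its required types is among the changed types.
    """
    return sorted(
        model_id
        for model_id, required in contracts.items()
        if required.issubset(record) and not required.isdisjoint(changed_types)
    )
```

Under this rule, changing a value used by an already executed model selects that model again, while newly added data can bring a different model into play.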
  • a tax return topic ranking system being configured to execute the model independently, asynchronously, or in an uncoordinated fashion relative to the on-line tax return preparation application writing electronic tax return data to the first data store.
  • the online tax return preparation application and the tax return topic ranking system are independent of each other and do not coordinate with each other until a request is made for a tax topic ranking at which point these components are not coordinated or in synchronization with each other.
  • FIG. 1 is a block diagram of a computerized tax return preparation system constructed according to one embodiment including a ranking system that executes a model to determine a ranking of tax return topics for presentation in an interview screen generated by an online tax return preparation application;
  • FIG. 2 depicts how system embodiments may be utilized to change how tax return topics are presented in interview screens that may be presented to a user at a later time relative to or based at least in part upon a currently displayed interview screen or other electronic tax return data in order to provide a more personalized listing of tax return topics for the user;
  • FIG. 3 is a system flow diagram showing components of computerized tax return preparation systems constructed according to embodiments and aspects of computer-implemented methods for providing an escrow-based personalization system that can be utilized to provide tax return topic rankings to an online tax return preparation application to personalize tax return preparation experiences;
  • FIG. 4 is a system flow diagram illustrating one manner in which a polling service may be implemented and operate according to embodiments
  • FIG. 5 is a system flow diagram illustrating how components shown in FIG. 3 can operate in a mode in which model execution to generate a tax return topic ranking is based on retrieval of selected data from a persistent data store managed by an online tax return preparation application;
  • FIG. 6 is a system flow diagram illustrating how components shown in FIG. 3 can operate in a mode in which model execution to generate a tax return topic ranking is based in part upon selected data retrieved from a persistent data store managed by an online tax return preparation application and selected data received from or derived from a request for a tax return topic ranking made by an online tax return preparation application;
  • FIG. 7 is a system flow diagram illustrating how components shown in FIG. 3 can operate in a mode in which the online tax return preparation application sends "heartbeat" signals to trigger analysis of data for execution of a model to generate a tax return topic ranking.
  • FIG. 8 is a system flow diagram illustrating one manner in which an escrow controller can be configured and operate according to embodiments involving a "heartbeat" signal and thread pools;
  • FIG. 9 is a block diagram of components of a computer system that may be programmed or configured to execute embodiments.
  • FIGS. 10A-B are system flow diagrams illustrating embodiments that are utilized to determine personalization experiences.
  • Embodiments of the invention relate to how a personalized list of topics or menu options is generated, communicated to a tax return preparation application (such as turbotax.com), and presented to a user, and how topic or menu item rankings can change as a user prepares an electronic tax return.
  • Embodiments of the invention may be utilized with or be integrated into computing systems or applications such as on-line tax return preparation applications to provide taxpayer users with a list of tax topics or menu options that are arranged in a particular ranking, sequence or order and to change the order of menu options or list of tax topics as a taxpayer user enters or changes data during preparation of an electronic tax return.
  • the order or sequence of tax topics presented to taxpayer users is also changed to reflect changes or updates to the tax return data.
  • Embodiments of the invention may also be utilized in other on-line or networked systems, such as television systems or on-line movie systems (such as netflix.com), to provide viewers with a list of viewing options that are arranged in a particular ranking sequence or order. For example, as a user views various television programs or movies, an order or sequence of other available television programs or movies or types thereof (e.g., comedy, horror, romance, etc.) may be changed as the user browses or views various programs or movies.
  • real time topic ranking changes are implemented using a computing system that utilizes only a small portion of data in a "master" data store managed by the on-line tax return preparation application, while monitoring the master data store for changes to data for users who are currently logged into the on-line tax return preparation system.
  • This small portion of data is escrowed or cached, or temporarily stored, to another, independent data store that the online tax return preparation application does not and cannot access.
  • the system determines whether there is a model that can be triggered to determine a tax return topic ranking for one or more interview screens, and when these model requirements are satisfied, and the models are executed, the resulting rankings that are generated are provided to the on-line tax return preparation application.
  • tax topics that can be selected within the interview screens are arranged according to the determined ranking rather than according to a static, default or fixed sequence.
  • execution of a model may change the order of "income" topics in an interview screen that may be set to be displayed next from a default or static order that begins with "wages" (e.g., for Form W-2) to a new order of: dividends, capital gains, business income, interest, rental income, etc., thus not only resulting in modification of the user interface presented to the user, but also providing the user with a more personalized and pertinent tax return preparation experience.
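The reordering in this example can be sketched as a sort of the default topic list by model output. This is an illustrative Python sketch only: the function name `reorder_topics` and the numeric scores are hypothetical, since the source does not state what form the model output takes.

```python
def reorder_topics(default_order, scores):
    """Order topics by descending score.

    Topics without a score are placed after all scored topics and keep
    their relative position from the default order.
    """
    position = {topic: i for i, topic in enumerate(default_order)}
    return sorted(
        default_order,
        key=lambda t: (-scores.get(t, float("-inf")), position[t]),
    )


# Default, static order beginning with "wages".
default_income_order = [
    "wages", "interest", "dividends",
    "capital gains", "business income", "rental income",
]
# Hypothetical model output for one user.
example_scores = {
    "dividends": 0.9, "capital gains": 0.8, "business income": 0.7,
    "interest": 0.6, "rental income": 0.5,
}
personalized = reorder_topics(default_income_order, example_scores)
```

With these assumed scores, `personalized` starts with dividends, capital gains, business income, interest and rental income, matching the order given in the example above.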
  • the user's data changes may result in re-execution of the same model or execution of other models which, in turn, may result in changes to how tax topics are presented in various interview screens.
  • embodiments not only transform tax topic sequences during preparation of an electronic tax return, but also transform a pre-determined, static user experience into a dynamic, personalized experience, and do so utilizing only small portions of the "master" data, which leads to reduced demands on computing resources compared to other systems and methods that require additional data and interactions. Further aspects of embodiments are described with reference to FIGS. 1-10B.
  • an online tax return preparation application 111 is managed by a host computer 110.
  • the online tax return preparation application 111 is accessed by respective user computing devices 120a-c (generally, computing device) executing respective internet browsers 122a-c (generally, browser) through respective networks 130a-c (generally, network 130) utilizing the online tax return preparation application's Uniform Resource Locator (URL) address.
  • Interview screens 112a-c (generally, interview screen) generated by the online tax return preparation application 111 are presented to respective users through screens or displays 124 of the computing devices 120. Users interact with the displayed interview screens 112 by entering or importing tax return related data into respective fields thereof to prepare respective electronic tax returns.
  • Examples of browsers 122 that may be utilized for this purpose include INTERNET EXPLORER, GOOGLE CHROME and MOZILLA FIREFOX browsers.
  • a user computing device 120 may be a desktop computer, a laptop computer, a mobile communication device such as a smartphone or a tablet computing device.
  • Examples of networks 130 that may be utilized for communications between system components include but are not limited to a Local Area Network (LAN), a Wide Area Network (WAN), Metropolitan Area Network (MAN), a wireless network, other suitable networks capable of transmitting data, and a combination of such networks.
  • For ease of explanation, reference is made to a network 130 generally, but various networks, combinations of networks and communication systems, methods and protocols may be utilized.
  • the online tax return preparation application 111 manages a "master" or “global” persistent data store 141 (otherwise referred to as a "first" data store 141) for respective electronic tax return data or electronic tax returns of respective users of the online tax return preparation application 111.
  • the first or master data store 141 may include data for millions of electronic tax returns for many users for many years. Thus, it is important for this first data store 141 to provide secure, robust and persistent storage of many taxpayer files.
  • the first data store 141 may be hosted by the same computer 110 or may be accessible by the on-line tax return application 111 through a communications network 130 (not illustrated).
  • the online tax return preparation application 111 writes electronic tax return data to the first data store 141 to reflect the data entered, changed or imported (e.g., from a computerized financial management system such as mint.com), reads electronic tax data from the first data store 141 and generates interview screens 112 and tax forms populated with the data.
  • a tax return topic ranking system 150 (generally, ranking system) is shown in FIG. 1 as being in communication with the on-line tax return preparation application 111 and the first data store 141 managed thereby.
  • the ranking system 150 is in communication with the on-line tax return preparation application 111 and the first data store 141, e.g., through respective communication networks 130 (not illustrated) or other communication channels, and may also be a component or module of, or a plug-in into, the online tax return preparation application 111 depending on the system 100 configuration employed.
  • the ranking system 150 is shown as being an independent component in communication with the online tax return preparation application 111 and first data store 141 managed by the online tax return preparation application 111. The ranking system 150 can access the first data store 141 to read only certain types of data 141d ("d" referring to "data"), but is not able to write any data to the first data store 141 given integrity requirements of storing electronic tax return data as noted above.
  • the algorithm performed by the ranking system 150 includes retrieving a small portion of the first data store data 141d for that particular user, i.e., for a specific on-line session, and caching, escrowing or temporarily storing the retrieved data 141d locally until the ranking system 150 determines that conditions or requirements 162 reflected in a specification file or escrow contract concerning an executable model 160 are satisfied such that the model 160 can be executed.
  • the conditions or requirements 162 dictate when a model 160 is allowed to execute.
  • the model 160 may be a customized model or a model such as a predictive model, examples of which include logistic regression; naive bayes; k-means classification; K-means clustering; other clustering techniques; k-nearest neighbor; neural networks; decision trees; random forests; boosted trees; k-nn classification; kd trees; generalized linear models and support vector machines.
  • if model requirements 162 are not satisfied given the current set of data 141d retrieved from the first data store 141 and cached or locally stored by the ranking system 150 in a second data store or cache 142 (second data store), the data 141d remains cached or escrowed for future use, either for subsequent iterations of checking for a different model 160 that can be executed, or for when the first data store 141 is updated or changed and those updates or changes are retrieved and cached locally by the ranking system 150. Then, a model 160 may be able to execute with the retrieved data updates 141d.
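By way of non-limiting illustration only, the escrow check described above can be sketched as follows. The function name, field names and values are hypothetical and are not taken from the specification; the sketch only shows that cached data stays in place until a later update satisfies every requirement.

```python
from typing import Dict, Iterable


def ready_to_execute(required_fields: Iterable[str], escrowed: Dict[str, object]) -> bool:
    """Return True when every required field has a cached, non-empty value."""
    return all(escrowed.get(name) is not None for name in required_fields)


# Hypothetical requirement list standing in for requirements 162.
required = ["zip", "wages", "filing_status"]

# Hypothetical cached data 141d held in the second data store 142.
cache = {"zip": "94043", "wages": 52000}
first_check = ready_to_execute(required, cache)   # data stays escrowed

# A later change in the first data store is retrieved and cached.
cache["filing_status"] = "single"
second_check = ready_to_execute(required, cache)  # model may now execute
```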
  • if the on-line tax return preparation application 111 issues a request 113 to the ranking system 150 for a tax return related topic ranking 161 before a model 160 has been executed, e.g., for an interview screen 112 determined to be displayed to a user next given the interview screen 112 that is currently being displayed, the response 153 to the request 113 indicates that there is no result, and the on-line tax return preparation application 111 may utilize the prior or default ranking.
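As a minimal sketch of the fallback behavior described above, assuming a "no result" response is represented as None (an assumption; the specification does not define the format of the response 153), the application might select a ranking as follows. The default topic list is illustrative only.

```python
from typing import List, Optional

# Hypothetical default (static) ranking used when no personalized result exists.
DEFAULT_RANKING = ["income", "deductions", "taxes paid"]


def choose_ranking(response: Optional[List[str]]) -> List[str]:
    """Use the personalized ranking when present; otherwise fall back to the default."""
    return list(response) if response else list(DEFAULT_RANKING)
```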
  • the data utilized for model 160 execution is only data 141d that was previously retrieved from the first data store 141 and cached locally by the ranking system 150 in the second data store 142.
  • data utilized for model 160 execution may come from multiple sources, such as cached or locally stored data 141d that was previously retrieved from the first data store 141 and data of or derived from the request 113 by the on-line tax return application 111.
  • the model 160 is executed to generate a result 161 in the form of a ranking or sequence of tax topics 161.
  • the online tax return preparation application 111 issues a request 113 for a tax topic ranking, e.g., based on one or more screens 112 that are determined to be displayed next or later in a sequence of interview screens 112 given the interview screen 112 that is currently being displayed, the generated ranking 161 can be communicated to the online tax return preparation application 111 in response to the request 113.
  • examples of tax return topics include, but are not limited to, income, deductions and taxes paid.
  • Embodiments may also be utilized to rank sub-topics of a topic, e.g., certain types of income, such as wages or other income such as business income, interest, dividends, pension income, annuities, rental income, unemployment compensation, capital gains, gambling income, farming and fishing income, clergy earnings, social security, scholarships, alimony, canceled debt and 401k distributions.
  • sub-topics for the topic of "deductions" may include medical and dental expenses, home mortgage points, interest expenses, charitable contributions, business related expenses (car, travel, etc.), educational expenses and property taxes.
  • a tax return topic ranking may include only "root" node topics (e.g., income, deductions, etc.), a combination of "root" and "leaf" node topics / sub-topics (e.g., income, property tax, mortgage interest), only "leaf" node sub-topics of a certain topic (e.g., property taxes paid, points, mortgage interest), "leaf" node sub-topics of different topics (e.g., property taxes paid, points, mortgage interest, wages, dividends), or a combination of one or more of topics, sub-topics and further drilled down or lower level, more specific sub-topics.
  • "Topics" as defined herein include such "topics" and "sub-topics" of various levels of specificity, and a resulting ranking 161 may include "topics" of one or more levels of specificity, or root and/or leaf node level topics or sub-topics.
  • a ranking 161 may include a small number of topics (e.g., three or four), or a larger number of topics (e.g., 10-20), the order of which such topics are displayed to a user being reconfigurable in real-time during preparation of an electronic tax return with embodiments of the invention and is reflective of the user's electronic tax return data managed by the online tax return preparation application 111.
  • the tax return topics in that interview screen 112 that would normally be presented in an interface including a first sequence, order or ranking (e.g., Rank 1, which may be a static or fixed sequence) are transformed into another interface including a different, second sequence, order or ranking (e.g., Rank 2) such that the second ranking 161 generated by model 160 execution is personalized or customized for the user, while utilizing a very small portion of first data store data 141d.
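The transformation from a first sequence (Rank 1) to a second sequence (Rank 2) can be illustrated with the following sketch. The function name and topic lists are hypothetical, and the handling of topics that are absent from the ranking (kept in their original relative order at the end) is an assumption rather than a requirement of the embodiments.

```python
from typing import List


def apply_ranking(screen_topics: List[str], ranking: List[str]) -> List[str]:
    """Reorder a screen's topics so that ranked topics come first, in ranked order."""
    position = {topic: index for index, topic in enumerate(ranking)}
    # sorted() is stable, so unranked topics keep their original relative order.
    return sorted(screen_topics, key=lambda topic: position.get(topic, len(ranking)))


# Rank 1: hypothetical static order of sub-topics on an interview screen.
rank_1 = ["wages", "interest", "dividends", "rental income"]

# Rank 2: the same topics reordered by a hypothetical model result.
rank_2 = apply_ranking(rank_1, ["rental income", "wages"])
```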
  • the ranking 161 that is generated and communicated to the on-line tax return preparation application 111 in response to the request 113 may be one that was previously generated and locally stored or cached, waiting to be delivered to the online tax return application 111 in response to a request 113.
  • the ranking system 150 may then execute a model 160 (assuming the requisite data for the model 160 has been collected and is stored in the second data store 142 per the specification file 162), and the resulting tax topic ranking 161 generated by execution of the model 160 is provided to the online tax return preparation application 111 in response to the request 113 and may be locally stored in a third data store 143 for future reference.
  • the user may be interacting with a currently displayed interview screen 112c ("c" referring to "currently" displayed), and based on a predetermined, predicted or potential sequence of interview screens 112p (that are not currently displayed) given the currently displayed interview screen 112c, the online tax return preparation application 111 may request rankings 161 for these other interview screens 112p that are not currently displayed 112c but that will be, or might be, displayed. As shown in FIG.
  • rankings 161 may be generated for various numbers of "future" or "potential" interview screens 112p (Screen1 - Screen3 112p that are not currently displayed are illustrated) and various numbers of rankings 161 (Rank1 - Rank3 are illustrated) generated by respective models 160 that apply to the interview screens 112.
  • While embodiments are described with reference to how rankings 161 may be generated for future potential interview screens 112p that are not currently displayed, embodiments may also be executed concurrently for multiple users, thus providing a personalized tax return preparation experience for each user.
  • the online tax return preparation application 111 may issue Requests 1-3 for respective users U1-U3, and each request may involve one or multiple topic rankings for interview screens that will or may be displayed to the user after a currently displayed interview screen.
  • the online tax return preparation application 111 may also issue Requests 1-3 for respective interview screens for a particular user. Examples of specific embodiments are described in further detail with reference to FIGS. 3-10B.
  • a computerized tax return topic ranking system 150/350 (generally, ranking system 350) constructed according to one embodiment interfaces with a tax return preparation application 111/311 and comprises an escrow controller 320, an escrow or model specification file 162/362, a selective data retrieval service 380, a model processor or model executor 370, and a second data store 142/342 to which selected data for a logged-in user, retrieved from a first or "master" data store 141/341, is written and temporarily retained as an escrow record 345. For each on-line session, an escrow record 345 is generated for each user and for each model 160/360 applicable to each user.
  • Models 360 and escrow contracts or model specification files 362 are activated or made available for use via a publication service 365.
  • Certain embodiments also include a third data store 143/343 that may be dedicated to storing a current tax return model ranking 161/361.
  • System components are described in further detail together with how they work together with other system components to implement computer-implemented methods for determining and communicating tax return topic rankings 361 that are communicated to an on-line tax return preparation application 311.
  • the escrow controller 320, also referred to as a Decision Engine (DE) Escrow Service in FIG. 3, is a central, intermediate control element that interfaces with the on-line tax return preparation application 311 and manages escrow of retrieved data, model 360 execution, when a model 360 can be executed based on an escrow contract or specification file 362, and how a tax return topic ranking 361 is retrieved and communicated to the on-line tax return preparation application 311.
  • the escrow controller 320 manages escrow records 345 that are generated for data retrieved from the first or master data store 341 and cached or temporarily stored in the second data store 342.
  • there is one escrow record 345 per active user session, per active model 360.
  • as data is collected from the master or first data store 341, it is temporarily held or stored in the second data store 342 and accumulated to satisfy certain model execution requirements 362.
  • the data retrieved from the first data store 141 is held "in escrow" until such time as it can be used to execute a model 160 to generate a result 161 in the form of a tax return topic ranking.
  • the escrow record 345 that is generated by the escrow controller 320 and stored to the second data store 342 includes a user identifier such as a ticket or token generated for the on-line session when the user logged into the on-line tax return preparation application 311 , a model identifier, which identifies the model 360 to which the escrow record 345 pertains (since an escrow record 345 is generated for each user and each model 360 applicable to each user), a description, such as a canonical description or unique identifier of required data and its location in the second data store 342 for the associated model 360, and which will be used to build a payload that is provided to a model execution service or model processor 370 by the escrow controller 320, and a description, such as a canonical description, of the location to which the generated result or model output 361 (in the form of a ranking of tax topics), is to be written, e.g., in a third data store 343 dedicated to storage of model results 361.
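For illustration only, the contents of an escrow record 345 listed above might be represented as in the sketch below. The class name, attribute names and location strings are hypothetical; the specification does not prescribe a particular data layout.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class EscrowRecord:
    session_token: str               # user identifier (ticket or token for the on-line session)
    model_id: str                    # model to which this record pertains
    input_locations: Dict[str, str]  # required data name -> location in the second data store
    output_location: str             # where the model result is to be written

    def build_payload(self, cache: Dict[str, object]) -> Dict[str, object]:
        """Collect the required values from the cache for the model processor."""
        return {name: cache[location] for name, location in self.input_locations.items()}


# Hypothetical record and cache contents.
record = EscrowRecord(
    session_token="ticket-123",
    model_id="model-a",
    input_locations={"zip": "session/ticket-123/zip"},
    output_location="results/ticket-123/model-a",
)
payload = record.build_payload({"session/ticket-123/zip": "94043"})
```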
  • An escrow record 345 is never the source of truth for a particular value for a particular customer; instead, the source of truth will always be the location from where the data was initially retrieved.
  • the set of user data managed by the escrow controller 320 is a much smaller subset of the user data managed by the collection of "source of truth" data stores, i.e., the first or "master" data store 341 in the illustrated embodiment.
  • embodiments operate based only on a sliver or very small set of that data - only for users that are currently logged into the online tax return preparation application 311 , and only for types of data required by models 360 that have been activated or published via a publication service 365.
  • embodiments provide for a system that utilizes data stores 342 that may be a smaller, upper bounded, low latency memory based cache rather than the first data store 141 , which is a very large, growing and permanently persistent storage mechanism.
  • a computer-executable model 360 is executable by the model processor or execute service 370 utilizing cached data of escrow records 345 for that model 360 to generate a result 361 in the form of a ranking of tax return topics.
  • a model 360 includes data such as a unique model identifier, a description, e.g., canonical description expressed in XML Schema Definition (XSD), of the format of data inputs for the model 360, and a specification file or escrow contract 362 specifies the types and locations of data needed to execute a model 360.
  • a specification file 362 may list the canonical location of a piece of data, e.g., it lists the zip code not just as the word "zip" but as an instruction that the escrow controller 320 can read and interpret to determine how to fetch the "zip" data from the second data store 342, and it indicates when a call can be made by the escrow controller 320 to the model processor 370 to execute the model 360.
  • the escrow contract 362 also indicates whether certain inputs are required or optional. For example, a model 360 may require 10 total inputs to execute, and may only execute on those specified 10 inputs, or a model may require 10 total inputs to execute, but if an escrow record 345 includes other types of data, these other types of data may also serve as inputs to a model 360 to supplement the other required data. As shown in Fig. 3, the specification file 362 is provided to the escrow controller 320, but not the model processor 370, and the model 360 is provided to the model processor 370 but not the escrow controller 320.
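The distinction between required and optional inputs can be sketched as follows. The class and field names are hypothetical, and representing "not yet executable" as None is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class EscrowContract:
    model_id: str
    required: Tuple[str, ...]
    optional: Tuple[str, ...] = ()

    def payload(self, escrowed: Dict[str, object]) -> Optional[Dict[str, object]]:
        """Return model inputs, or None while any required input is still missing."""
        if any(name not in escrowed for name in self.required):
            return None
        # Optional inputs supplement the required ones only when they are present.
        names = self.required + tuple(n for n in self.optional if n in escrowed)
        return {name: escrowed[name] for name in names}


contract = EscrowContract("model-a", required=("zip", "wages"), optional=("dividends",))
```

In this sketch only the contract decides whether a payload exists; the model itself is not consulted, which mirrors the separation between the specification file and the model described above.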
  • system embodiments involve a distributed model execution system in which execution of the model 360 is separate from determinations of whether the model 360 should even be executed, and if so, how the model 360 should be executed (i.e., how to locate the data for the model 360 in the second data store 342).
  • the model processor or execution service 370 is a computer processor component that, once called by the escrow controller 320 to execute a model 360 upon satisfying the requirements specified by the specification file 362, provides an execution result 361 in the form of a ranking of tax return topics to the escrow controller 320 based on the cached data provided to it.
  • Embodiments also utilize a selective data retrieval service 380 that is also controlled by the escrow controller 320 for the purpose of retrieving pre-determined types of data for a currently logged in user from the first or master data store 341 and to check for changes in certain types of data in the shared data store 341.
  • the data retrieved from the first data store 341 is utilized by the escrow controller 320 to generate an escrow record 345 or to update an escrow record 345 given a data change, and to monitor or check the first data store 341 for changes in predetermined types of data for a particular user so that corresponding escrow records 345 can be updated in the second data store 342.
  • while the second data store 342 is utilized by the escrow controller 320, the second data store 342 is not accessible by the online tax return preparation application 311. This is further illustrated by the dashed line in FIG. 3 that surrounds ranking system 350 components other than the online tax return preparation application 311 and the first data store 341 managed thereby.
  • suitable caching applications include memcached and redis, which allow the second data store 342 to serve as a low latency remote persistence mechanism in which user data (a small fraction of the amount of data stored by the first data store 341) is stored and indexed for fast access and retrieval at decision computation time.
  • the second data store or cache 342 is a cache of all the data required by all models 160 that will be used to compute rankings or decisions 361 for all the currently active users (i.e., users with an active on-line session, but not users that are not currently logged in).
  • the limited scope and role of the second data store 342 can be described through its contrast with the first data store 341 , which is a runtime persistent storage component responsible for maintaining all online tax return preparation application 311 data for all users over all time. This is in sharp contrast to the second data store 342, which needs only to maintain a small subset of that data for active users and only for the duration of the active user's session.
  • One type of data retrieval service 380 that may be utilized to retrieve data from the first data store 341 and cache it to the second data store 142 is a polling service that manages one or more pollers that are active for a user's on-line session to retrieve the user's data from the first data store 341 and to detect when pre-determined data fields in the first data store 341 have been changed, e.g., added, deleted or modified.
  • a poller is an object that works on behalf of a single active user model 360 to gather data from the first data store 341.
  • a poller may include or utilize data such as a user identifier (such as an authid or user name and a ticket or token or password) for the on-line session, a collection of fields in the first data store 341 to poll or monitor, and a time during which the poller is active or how long the poller should continue to monitor the specified fields of the first data store 341.
  • a poller can be configured to monitor or check the first data store 341 at certain times or periodically with a scheduler. Pollers are configurable such that the frequency of polling can be modified to increase or decrease the frequency of fetching data, or to turn off or terminate the poller after a predetermined time.
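A minimal sketch of a poller object consistent with the description above is given below. The class, its attributes and the `read_field` callback are hypothetical; the callback stands in for whatever read access the first data store 341 actually exposes, and time is passed in explicitly so the sketch does not depend on a real scheduler.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Poller:
    session_token: str       # identifies the on-line session being served
    fields: List[str]        # fields of the first data store to monitor
    expires_at: float        # time after which the poller is no longer active
    last_seen: Dict[str, object] = field(default_factory=dict)

    def is_active(self, now: float) -> bool:
        return now < self.expires_at

    def poll(self, read_field: Callable[[str, str], object], now: float) -> Dict[str, object]:
        """Return the monitored fields whose value changed since the previous check."""
        if not self.is_active(now):
            return {}
        changes: Dict[str, object] = {}
        for name in self.fields:
            value = read_field(self.session_token, name)
            if self.last_seen.get(name) != value:
                changes[name] = value
                self.last_seen[name] = value
        return changes
```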
  • Embodiments may utilize different types of polling services or polling mechanisms.
  • the escrow controller 320 submits a subscription 322 to changes in a specified data field of the first data store 341 such that all modifications to this data field by the on-line tax return preparation application 311 will result in a change event being fired and received by escrow controller 320.
  • This "subscribe" configuration involves an ongoing, continuous subscription 322. In addition to an ongoing subscription 322 to get access to all subsequent changes to a data field of the first data store 141, the data field will also be fetched once at the time of creation and initialization for purposes of generating an escrow record 145. This will allow the system to "catch up" to the last known state of the data field, and then stay caught up while the subscription 322 is active. This procedure can be used for any data field that is likely to change during the course of a user's session.
  • the polling service 380 utilizes a subscription 322 as described above, but rather than "subscribe" generally, the subscription is limited to "subscribe until found."
  • the subscription 322 is active during monitoring of a specified data change in the first data store 141, but when that data change in the form of a change from an empty field to a populated field of the first data store has been detected and the data has been cached to the second data store 142, the subscription 322 can be terminated.
  • a "one time" subscription may also be employed to detect a change in a populated field to different data, after which the subscription 322 is terminated.
  • the polling system 380 may be configured as a "fetch once" system such that a data field will be preemptively retrieved from the system and provided to the escrow controller 320.
  • This configuration may be suitable for data items of the first data store 141 that are highly unlikely to change during the duration of the escrow controller's 320 operation and on-line session. For instance, any data representing an unchanging fact (i.e., "did an event occur at some point in the distant past?"), would likely be appropriate to retrieve with this strategy.
  • This strategy also involves less computing resources than strategies involving a subscription.
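The retrieval strategies described above ("subscribe," "subscribe until found," and "fetch once") differ mainly in when monitoring of a data field stops. The following sketch, with hypothetical names and with an empty field represented as None (an assumption), illustrates that decision.

```python
from enum import Enum


class Strategy(Enum):
    SUBSCRIBE = "subscribe"
    SUBSCRIBE_UNTIL_FOUND = "subscribe until found"
    FETCH_ONCE = "fetch once"


def keep_watching(strategy: Strategy, old_value: object, new_value: object) -> bool:
    """Decide whether a data field should still be monitored after an observation."""
    if strategy is Strategy.FETCH_ONCE:
        return False  # retrieved a single time, never monitored afterwards
    if strategy is Strategy.SUBSCRIBE_UNTIL_FOUND:
        # Stop once an empty field has become populated.
        return not (old_value is None and new_value is not None)
    return True  # ongoing, continuous subscription
```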
  • while different polling systems 380 may be utilized, for ease of explanation, reference is made to a polling system 380 generally, and to one that utilizes a subscription 322 to changes of data in the first data store 141 as shown in FIG. 3.
  • a data retrieval service 380 in the form of a polling service can be utilized in the event that the first data store 341 does not support native automated data feeds that indicate to the escrow controller 320 which fields have been changed or updated.
  • the first data store 341 may be configured to automatically output a data feed in the form of a stream of information identifying changes to data to which the escrow controller 320 has subscribed.
  • with a data retrieval service 380 in the form of a polling service, there is one poller for every active escrow record 345, which in turn will exist for every active user and for every active model 360.
  • the escrow controller 320 manages the lifecycle of a poller, which may be in the form of independent threads of execution across a large cluster of machines. This independence and statelessness of pollers allows for horizontal scaling of the machines, and the escrow controller 320 ensures that there is a single poller per unique active model for an on-line session.
  • FIG. 3 illustrates one example of a selective data retrieval service 380 in the form of a poller service and poller management and how the escrow controller 320 interacts with a polling service 380
  • a poller-based data retrieval service 380 includes a polling manager or service, which includes a collection of poller objects which, as noted above, work on behalf of a single active user model to gather data from the first data store 341 , a collection of buckets, which are sets of poller objects executed for each run of a poller executor, which is activated (e.g., periodically, such as every second) to select a bucket of poller objects from the poller manager for execution in an asynchronous manner, and a timing mechanism to implement poller expiration such that a poller is not active beyond a certain time.
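By way of example only, the bucket and expiration mechanism described above might be sketched as follows. The class is hypothetical, assigns pollers to buckets in round-robin fashion (an assumption), measures time in executor runs rather than seconds, and runs synchronously for simplicity even though the description contemplates asynchronous execution.

```python
from typing import List, Tuple


class PollerManager:
    def __init__(self, bucket_count: int) -> None:
        # Each bucket holds (poller id, tick at which the poller expires).
        self.buckets: List[List[Tuple[str, int]]] = [[] for _ in range(bucket_count)]
        self.added = 0
        self.tick = 0

    def add(self, poller_id: str, expires_at_tick: int) -> None:
        self.buckets[self.added % len(self.buckets)].append((poller_id, expires_at_tick))
        self.added += 1

    def run_once(self) -> List[str]:
        """One executor run: select one bucket, drop expired pollers, return those to execute."""
        index = self.tick % len(self.buckets)
        live = [(pid, exp) for pid, exp in self.buckets[index] if self.tick < exp]
        self.buckets[index] = live
        self.tick += 1
        return [pid for pid, _ in live]
```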
  • FIG. 4 is a system flow diagram reflecting system components and an algorithm for one manner in which a polling system 380 is executed and how the polling system 380 interacts with the escrow controller 320 ("Escrow Service" in FIG. 4).
  • the escrow controller 320 sends a "create poller" request to a PollerCreation message queue 450, and at 404, this message is passed to a single poller application in the cluster 452, and the poller application 454 instantiates one poller process 456 and executes it according to instructions in the escrow controller's 320 request.
  • the escrow controller 320 wants to modify the behavior of a single existing poller process 456 or multiple poller processes 456 and for this purpose, sends a message to the "PollerNotification" topic 458.
  • the message identifies the poller process 456 to modify together with instructions describing the nature of the modification.
  • the modification message is broadcast to all poller applications 454 (represented by three arrows), and the poller application 454 that is managing the poller process 456 with the message's identifier reacts to the message by modifying the poller process 456 as appropriate. Other poller applications 454 can ignore the message.
  • the escrow controller 320 wants to remove the poller process 456 from service, and for this purpose, submits a message to the same "PollerNotification" topic 458.
  • the message identifies the poller process 456 to remove, and this removal message is broadcast to all of the other poller applications 454.
  • the poller application 454 managing the poller process 456 with the removal message's identifier responds to the removal message by removing the poller process 456 as appropriate, and other poller applications 454 can ignore the removal message.
  • a poller process 456 instance detects the data change in the first data store 341 and sends a message containing the details of the detected data change to the "DataChangeNotification" queue 460, and at 412, the escrow controller 320 receives the message from the queue 460 and updates an escrow record 345 in the second data store 342 accordingly.
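The broadcast behavior described above, in which every poller application receives a notification but only the application managing the identified poller process reacts, can be sketched as follows. The message format (a dictionary with "poller_id", "action" and "changes" keys) and all names are hypothetical, and the message queues are replaced by direct function calls for brevity.

```python
from typing import Dict, List


class PollerApplication:
    def __init__(self, name: str) -> None:
        self.name = name
        self.pollers: Dict[str, dict] = {}  # poller id -> settings

    def create(self, poller_id: str, settings: dict) -> None:
        self.pollers[poller_id] = dict(settings)

    def on_notification(self, message: dict) -> bool:
        """React only if this application manages the identified poller."""
        poller_id = message["poller_id"]
        if poller_id not in self.pollers:
            return False  # managed elsewhere: ignore the message
        if message["action"] == "remove":
            del self.pollers[poller_id]
        else:
            self.pollers[poller_id].update(message["changes"])
        return True


def broadcast(apps: List[PollerApplication], message: dict) -> List[str]:
    """Deliver a notification to every application; return the names of those that reacted."""
    return [app.name for app in apps if app.on_notification(message)]
```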
  • the system embodiment that is illustrated also utilizes a distributed and shared data store configuration (341 , 342, 343) such that each data store has its own purpose and can function independently of the other.
  • the on-line tax return preparation application 311 can only access, and write to and read from, the first or master data store 341 , but cannot access the second data store 342 or the third data store 343. Instead, the second data store 342 and the third data store 343 are only accessible by the escrow controller 320.
  • the first or master data store 341, or the runtime persistence data store, is the data store that includes electronic tax return data for numerous, if not all, users of the on-line tax return preparation application 311, for the current year as well as for prior years, and manages data that collectively forms a profile of a user such as user, product purchase history, tax information and electronic tax returns, and interaction events (e.g., pages visited).
  • the first or master data store 341 is thus a very large, secure and robust data store given the amount and sensitive nature of the data stored in the first data store 341.
  • the second data store 342, in contrast to the first or master data store 341, retrieves only a very small fraction of the data of the first data store 341 and caches or temporarily stores the retrieved data for a specific on-line session - it is not necessary to store data retrieved from the first data store 341 when the user is offline since the second data store 342 is configured as a temporary data store or cache to which selected data is escrowed until it can be utilized for model 360 execution for a particular on-line session.
  • the third data store 343 is utilized for storing the results 361 of model 360 execution in the form of tax return topic rankings and for each result or tax return ranking generated, may include data such as an authid that identifies the user to which the tax return topic ranking 361 applies, an entity key that is globally unique among all generated decisions or tax return topic rankings 361 , a flag that is used to indicate whether or not tax return ranking 361 has been provided to or consumed by the online tax return preparation application 311 , and a decision or result document containing the content in the form of the tax return ranking 361 for the identified user and executed model 360.
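As one hypothetical illustration of the result data described above, the third data store 343 might hold records of the following shape. The class and method names are assumptions, and the use of a random UUID as the globally unique entity key is one possible choice, not one stated in the specification.

```python
import uuid
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DecisionRecord:
    authid: str             # user to whom the ranking applies
    entity_key: str         # unique among all generated rankings
    ranking: List[str]      # decision or result document content
    consumed: bool = False  # set once delivered to the application


class ResultStore:
    def __init__(self) -> None:
        self._records: List[DecisionRecord] = []

    def save(self, authid: str, ranking: List[str]) -> DecisionRecord:
        record = DecisionRecord(authid, uuid.uuid4().hex, list(ranking))
        self._records.append(record)
        return record

    def take_latest(self, authid: str) -> Optional[List[str]]:
        """Return the newest unconsumed ranking for a user and flag it as consumed."""
        for record in reversed(self._records):
            if record.authid == authid and not record.consumed:
                record.consumed = True
                return record.ranking
        return None
```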
  • the escrow controller's 320 coordination of the execution of a model 360 is a "fully automatic" algorithm in the sense that the online tax return preparation application 311 that is consuming a generated tax return ranking 361 plays no role in the coordination of the timing of the production of the tax return topic ranking 361.
  • the online tax return preparation application 311 operates as it normally does by incrementally adding, updating, deleting data in the first or master data store 341 , and the changes to data of a logged-in user in the first data store 341 serve as the triggers for the escrow controller 320 to analyze the second data store 342 data relative to the activated models 360 and specification files 362 indicating when such models 360 can be executed.
  • generation and management of escrow records 345 by the escrow controller 320 and eventual model 360 execution is a result of or side effect of a trigger in the form of changed data in the first data store 341.
  • models 360 and specification files or escrow contracts 362 are loaded or published to the system via a publishing service 365, and the model 360 is available for execution by the model processor or execution service 370, and the escrow controller 320 receives a specification file 362 that specifies the conditions that must be satisfied before a call can be made to the model processor 370 to execute the model 360.
  • This is shown by "get model documents" and "get escrow contracts" in FIG. 3 - the model processor or execute service 370 can access published models 360 but not a specification file or escrow contract 362, which is only accessible by the escrow controller 320.
  • model 360 execution and how and when the model 360 is executed are separated from each other and independent of each other.
  • the escrow controller 320 calls the model processor 370, e.g., periodically, and the model processor 370 returns data about or identifying the models 360 that have been published and are activated or live and available for execution.
  • a user logs into the online tax return preparation application 311 with user credentials (such as user name and password), and a login service 550 of the online tax return preparation application 311 verifies the credentials. Assuming the credentials are correct, the login service 550 initiates an on-line user session for that user, and at 508, data indicating initiation of an on-line session is communicated from the login service 550 to the escrow controller 320. This is also illustrated in FIG. 3 as the arrow from the online tax return preparation application 311 to escrow controller 320 indicating "signal user session start/stop" (the beginning of an on-line session concerns the "start" portion of "start/stop").
  • the escrow controller 320 utilizes a message queue 382 to communicate with the first data store 341, and at 512, pulls or retrieves available types of data of the model 360 from the first data store 341 to generate, complete or supplement an escrow record 345, which is stored in or written to the second data store 342. For these purposes, the escrow controller 320 retrieves data from the first data store 341 utilizing a polling service 380, a native automatic data feed of the first data store 341, or both types of selective data retrieval mechanisms.
  • the escrow controller 320 calls the first data store 341, e.g., via a message queue 382, to initiate a subscription 322 for changes in data of the logged-in user or for the activated on-line session in the first data store 341.
  • This is shown in FIG. 3 as "data subscription requests." These data changes may involve data that has been added, deleted or updated.
  • the subscription request 322 is provided as a set of canonically named data fields (e.g., a set of XPaths or instructions for selecting nodes from an XML document).
  • the escrow controller 320 may initiate a subscription 322 for any changes (new or added data, deleted data, updated data) for the specific types of data required for execution of the model 360 as specified by the specification file or escrow contract 362.
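
As an illustration of a subscription expressed as a set of XPath-style field names, the following minimal Python sketch selects only the subscribed nodes from an XML document. The document layout, the element names and the `select_fields` helper are assumptions made for this example and are not taken from the embodiments.

```python
import xml.etree.ElementTree as ET

# Hypothetical tax return document; element names are illustrative only.
RETURN_XML = """
<return>
  <filer><status>single</status><dependents>2</dependents></filer>
  <income><wages>52000</wages></income>
</return>
"""

# Hypothetical subscription: canonically named fields written as simple XPaths.
SUBSCRIPTION = ["./filer/status", "./filer/dependents", "./income/interest"]


def select_fields(xml_text, paths):
    """Return {path: text} for every subscribed path present in the document."""
    root = ET.fromstring(xml_text)
    found = {}
    for path in paths:
        node = root.find(path)  # ElementTree supports a limited XPath subset
        if node is not None:
            found[path] = node.text
    return found


selected = select_fields(RETURN_XML, SUBSCRIPTION)
print(selected)
```

Fields that are not yet present (here `./income/interest`) are simply absent from the result, which mirrors a subscription that reports only data actually available.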
  • the escrow controller 320 calls the polling service 380 through the message queue 382 to initiate a polling process that will poll for changes in the specified types of data in the first data store 341.
  • After the subscription 322 has been established, at 516, the escrow controller 320 calls the polling service 380 to initiate a polling process that monitors the first data store 341 for any changes concerning specified types of data associated with the activated on-line session.
  • the first data store 341, by a native data feed component, sends a data feed, which may be a continuous data feed, indicating which specified data fields had changes, to the escrow controller 320.
  • the selective data retrieval service 380 periodically checks for changes of specified data fields for the active on-line session such that, as polling progresses, each polling check is made to determine changes relative to a prior polling check.
  • the polling service 380 continues with periodic checks for changed data in the first data store 341, but when a change is detected by the polling service 380, the escrow controller 320 is notified of the change by the polling service 380 through the message queue 382.
  • the escrow controller 320 writes the data update to the second data store 342 or cache.
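
The change detection described above, in which each polling check is compared with the prior check and the result is written to the cache, might be sketched as follows. This is a simplified illustration: the snapshot dictionaries, the field names and the `diff_fields` and `apply_changes` helpers are hypothetical.

```python
def diff_fields(previous, current, watched):
    """Compare two snapshots and report changes limited to the watched fields."""
    changes = {"added": {}, "updated": {}, "deleted": []}
    for field in watched:
        in_prev, in_cur = field in previous, field in current
        if in_cur and not in_prev:
            changes["added"][field] = current[field]
        elif in_cur and in_prev and current[field] != previous[field]:
            changes["updated"][field] = current[field]
        elif in_prev and not in_cur:
            changes["deleted"].append(field)
    return changes


def apply_changes(cache, changes):
    """Write detected changes into the cached record (stand-in for the second data store)."""
    cache.update(changes["added"])
    cache.update(changes["updated"])
    for field in changes["deleted"]:
        cache.pop(field, None)
    return cache


# Hypothetical snapshots of the first data store at two successive polls.
baseline = {"wages": 52000, "status": "single"}
latest = {"wages": 54000, "dependents": 2, "unrelated": 1}
watched = ["wages", "status", "dependents"]

changes = diff_fields(baseline, latest, watched)
cache = apply_changes(dict(baseline), changes)
print(changes, cache)
```

Only subscribed fields are compared, so the `unrelated` field never reaches the cache.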
  • the escrow controller 320, based on the data pre-requisites or conditions set forth in the specification file or escrow contract 362, eventually determines that the data requirements of a model 360 applicable to the logged in user have been satisfied. This event triggers the escrow controller 320 to read the required model data from the second data store 342, communicate with the model processor or execution service 370, and call the model processor 370 to execute the model 360 with the provided data that has been retrieved from the first data store 341 and that was cached to the second data store 342.
  • the result 361 generated by execution of the model 360 in the form of a tax return topic or menu option ranking is returned by the model processor 370 to the escrow controller 320, which locally stores the tax return topic result 361, e.g., in the third data store 343, which is a dedicated data store for tax return topics generated by model 360 execution and is not accessible by the on-line tax return preparation application 311.
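
The trigger described above, in which a model is executed only once the contract's data requirements are met, can be illustrated with a short sketch. The `EscrowContract` structure, the field names and the `fake_model` callable are assumptions for illustration and do not reflect the actual format of the specification file.

```python
from dataclasses import dataclass


@dataclass
class EscrowContract:
    """Hypothetical stand-in for the specification file / escrow contract."""
    model_id: str
    required: frozenset
    optional: frozenset = frozenset()


def missing_fields(contract, record):
    """List required fields that the escrow record does not yet hold."""
    return sorted(f for f in contract.required if record.get(f) is None)


def maybe_execute(contract, record, execute_model):
    """Call the model only when every required field is present; else return None."""
    if missing_fields(contract, record):
        return None
    allowed = contract.required | contract.optional
    inputs = {k: v for k, v in record.items() if k in allowed}
    return execute_model(contract.model_id, inputs)


def fake_model(model_id, inputs):
    # Placeholder for the model processor: returns the input names as a "ranking".
    return sorted(inputs)


contract = EscrowContract("topic_ranker", frozenset({"status", "wages"}), frozenset({"dependents"}))
partial = {"status": "single"}
complete = {"status": "single", "wages": 52000, "dependents": 2, "session": "abc"}

first = maybe_execute(contract, partial, fake_model)
second = maybe_execute(contract, complete, fake_model)
print(missing_fields(contract, partial), first, second)
```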
  • the generated tax return topic ranking 361 is stored in the third data store 343 until retrieved by the escrow controller 320 in response to a request by the online tax return preparation application 311.
  • Data that is analyzed relative to requirements of an escrow contract or specification file 362 is read from a single source, i.e., the second data store 342 or cache of the portion of the data in the first data store 341.
  • electronic tax return data is provided by multiple sources, and in one embodiment, this involves both the second data store 342 or cache and the online tax return preparation application 311, e.g., the request provided by the online tax return preparation application 311.
  • all of the data that is required for execution of a model 360 is retrieved from the first data store 341 via a selective data retrieval system 380, i.e., no data for model execution is received from the online tax return preparation application 311 or request thereby.
  • the escrow controller 320 via communications through the message queue 382 with the data retrieval service 380, is able to asynchronously find all of the data elements required by a model 360, cache the data to the second data store 342, and issue a call to the execution service 370 to execute a model 360 utilizing the retrieved data and to generate a decision or result 361 in the form of a tax return topic ranking. This may be done prior to the online tax return preparation application 311 issuing a request for the ranking.
  • the online tax return preparation application 311 requests a ranking 361 from the escrow controller 320 ("get decision" in FIG. 3), and as part of this request or subsequent communication associated with the request, provides the required data elements to the escrow controller 320.
  • the escrow controller 320 then generates or updates an escrow record 345, which may already include data fulfilling the requirements for model 360 execution per the specification file 362, and this triggers the escrow controller 320 to issue a call to the model processor or execution service 370 to execute the model 360 utilizing the data received from the online tax return preparation application 311 to generate a result 361 in the form of a tax return topic ranking, which is then returned by the escrow controller 320 to the on-line tax return preparation application 311 and may also be stored in the third data store 343 for future reference.
  • data that is utilized to fulfill conditions or predetermined model 360 criteria per the specification file 362 is received from multiple sources such that systems implement a "hybrid" type data retrieval system.
  • a first portion of the data required for model 360 execution is retrieved from the first data store 341.
  • a second portion of the data required for execution of that model 360 is received from the online tax return preparation application 311, e.g., as part of the request ("get decision" in FIG. 3).
  • a majority of the required data is retrieved from the first data store 341.
  • a much smaller portion of data is supplied by the online tax return preparation application 311.
  • the data supplied by the online tax return preparation application 311 is the final piece of data required before a call can be made by the escrow controller 320 to the model processor 370 in order to execute the model 360 with the collected or retrieved data.
  • the escrow controller 320 accesses the first data store 341 or runtime persistence via communications made to the data retrieval service 380 through the message queue 382 to retrieve or pull data elements for a model 360 as specified by the escrow contract 362 and for a user that is currently logged into the online tax return preparation application 311.
  • the escrow contract 362 specifies which data is required for model 360 execution, which data is optional, and the location of such data in the second data store 342.
  • the escrow controller 320 retrieves or pulls the data from the second data store 342 or cache to update or complete an escrow record 345 that is stored in the second data store 342. This is repeated for each distinct data store that the escrow controller 320 accesses or manages for this purpose, and in the illustrated embodiment, there is a single data store, i.e., the second data store or cache 342.
  • the escrow controller 320 calls the first data store 341 to initiate a subscription 322 for any changes (adds, updates, deletes) to only those data fields specified by the model 360 / escrow contract 362, and at 608, the escrow controller 320 calls the polling service 480 to initiate a polling process that will poll for data changes/additions on behalf of the user for the user's on-line session.
  • the data feed of the first data store 341 may send a continuous feed to the escrow controller 320 to notify the escrow controller 320 of changes that occurred to the data fields of the first data store 341 during a particular on-line session and involving a particular user.
  • a polling service 480 may be utilized, and at 612, the polling service 480 polls the first data store 341, e.g., periodically.
  • Each poll request is a request for changes to any data fields that have occurred since the last time the poll request was made.
  • the polling service 480 detects a change
  • a notification of that change is sent to the escrow controller 320.
  • some, but not all, of the data required for model 360 execution has been collected.
  • the online tax return preparation application 311 issues a request ("get decision" in FIG. 3) to the escrow controller 320 for a tax return topic ranking 361.
  • the request includes the remaining data elements that are required for model 360 execution.
  • the escrow controller 320 utilizes a combination of the data elements from the request received from the online tax return preparation application 311 and the data of the escrow record 345 of the second data store 342 such that requirements for model 360 execution per the specification file or escrow contract 362 have now been fulfilled.
  • This ranking 361 is provided to the online tax return preparation application 311 in response to the request and may be stored to the third data store 343 by the escrow controller 320 for future reference.
  • data requirements for a model 360 may be fulfilled in different ways during an on-line session, which may involve various updates to the first data store 341 and various requests made by the online tax return preparation application 311.
  • a first request for a tax return topic ranking 361 may involve a model 360 that is executed with data from both the first data store 341 and data of or derived from the request made by the online tax return preparation application 311.
  • the system is configured such that the current or most recent tax return ranking 361 applicable to an interview screen 112 or tax return topic ranking 361 is stored, but the resulting rankings that may have been generated by execution of prior versions of the same model 360 may also be stored in the third data store 343, e.g., for reference to allow developers to see how different model 360 versions perform.
  • the third data store 343 may store a current ranking of tax return topics that is based on the most recent execution of the model 360 (version 3), but also store prior versions (version 2, version 1) of the model 360 and/or resulting tax return rankings 361 in the third data store 343 (but not in at least the first data store 341).
  • the rankings / models in the third data store 343 can be stored on a permanent basis, or with persistence, or for a longer duration of time, since the second data store 342 data is based on the actual data of a temporary on-line session.
  • the online tax return preparation application 311 determines that a ranking 361 generated by execution of a model 360 may be required (e.g., given a first interview screen 112-1, it is known that current interview screen 112c containing rankable tax return topics 361 will be displayed or is a possible interview screen 112p that can be displayed given defined menu options), the on-line tax return preparation application 311 issues a request for the topic ranking 361 ("get decision" in FIG. 3).
  • the escrow controller 320 may be configured to first determine whether a tax return topic ranking 361 has already been generated and is already contained in the third data store 343 (indicated by "fetch" in the escrow controller 320 and "get decisions for immediate consumption" in FIG. 3). If so, then the generated tax return topic ranking 361 can be immediately read from the third data store 343 and provided to the online tax return preparation application 311 in response to the request. A flag can be set in the third data store 343 to indicate that the generated ranking was utilized or provided to the online tax return preparation application 311.
  • the escrow controller 320 is configured to determine whether a model 360 can be executed based on the specification file 362 indicating the data conditions for execution of a model 360 that would generate the requested topic ranking 361 and where such data can be retrieved from the first data store 341. If the data conditions have been satisfied for the model 360, the escrow controller 320 issues a call ("decision computation" in FIG.
  • model processor 370 executes the model utilizing the received data to generate a result 361 in the form of a tax return topic ranking, which is communicated to the escrow controller 320 which, in turn, stores the generated ranking 361 to the third data store 343 and provides the determined ranking 361 to the online tax return preparation application 311 in response to the request.
  • the escrow controller 320 can respond to the online tax return preparation application 311 indicating that no other tax return topic ranking 361 is available, in response to which the online tax return preparation application 311 can use the current, original or default ranking.
  • a current ranking 361 may replace a default or original topic ranking, thus resulting in an interface transformation by modifying one interview screen 112 / user experience into a different interview screen 112 / user experience.
  • a current ranking 361 may also replace or modify a previously generated ranking 361.
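
The request handling described above (serve a stored ranking if one exists, otherwise execute the model if the combined cached and request data satisfies the requirements, otherwise fall back to the default ranking) might look like the following sketch. The function name `get_decision`, the in-memory dictionaries standing in for the second and third data stores, and the `toy_model` are all hypothetical.

```python
def get_decision(user_id, request_data, rankings, escrow_records, required,
                 run_model, default_ranking):
    """Return (ranking, source) for a "get decision" request."""
    # 1. A previously generated ranking can be served immediately.
    stored = rankings.get(user_id)
    if stored is not None:
        return stored, "stored"
    # 2. Combine cached escrow data with data supplied in the request ("hybrid" retrieval).
    merged = {**escrow_records.get(user_id, {}), **request_data}
    if all(name in merged for name in required):
        ranking = run_model({name: merged[name] for name in required})
        rankings[user_id] = ranking  # keep for future reference
        return ranking, "computed"
    # 3. Requirements not met: the caller keeps its current or default ranking.
    return default_ranking, "default"


def toy_model(inputs):
    # Placeholder model: promote the "dependents" topic when dependents exist.
    if inputs["dependents"] > 0:
        return ["dependents", "wages"]
    return ["wages", "dependents"]


rankings = {}                                # stand-in for the third data store
escrow_records = {"u1": {"wages": 52000}}    # stand-in for the second data store
required = ["wages", "dependents"]
default = ["wages", "dependents"]

first = get_decision("u1", {}, rankings, escrow_records, required, toy_model, default)
second = get_decision("u1", {"dependents": 2}, rankings, escrow_records, required, toy_model, default)
third = get_decision("u1", {}, rankings, escrow_records, required, toy_model, default)
print(first, second, third)
```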
  • the online tax return preparation application 311 may continue to utilize the current, original or default ranking for the interview screen 112.
  • the online tax return preparation application 311 may issue a call to the logout service 552 to terminate the on-line session and at 532, notify the escrow controller 320 regarding the terminated session (indicated by "signal user session... end" in FIG. 3).
  • the escrow controller 320 is now able to delete the escrow record 345 from the second data store 342, thus clearing the data from the prior on-line session.
  • the rankings and data of executed models 360 may be maintained in the third data store 343 for future reference or for use during a future on-line session.
  • the escrow controller 320 initiates a re-scan of the first data store 341 for purposes of determining whether an escrow record 345 should be updated for that user / online session and, based on the escrow record 345 content relative to the specification file or escrow contract 362 applicable to a model 360, whether the escrow controller 320 can proceed with executing any models 360 for which data is read and for which data requirements have been fulfilled.
  • a model 360 is published 365 and pushed to the model processor or execution service 370, and at 704, the escrow controller 320 periodically calls the model processor 370, in response to which the model processor 370 returns data about published and live models 360.
  • the online tax return preparation application 311 calls the escrow controller 320 and transmits a "heartbeat" or "check status" message 750 to the escrow controller 320.
  • This "heartbeat" message 750 identifies the user (e.g., by authId or user name) and their credentials (e.g., ticket or password).
  • the escrow controller 320 accesses the second data store 342 to retrieve whatever data elements for a model 360 are available for that particular user or specific on-line session, and at 710, the escrow controller 320, through the message queue 382, calls the data retrieval service 380 to initiate a polling process.
  • the polling process can periodically check the first data store 341 for data changes/additions made during the on-line session.
  • the polling service 380 checks the first data store 341 at some regular interval, and each poll request is a request for changes to any data fields that have occurred since the last time the poll request was made.
  • when the polling service 380 detects a change, the polling service 380, through the message queue 382, transmits a message to the escrow controller 320 to notify the escrow controller 320 of the change.
  • the changed data is cached or temporarily stored to the second data store 342.
  • the escrow controller 320 determines that the escrow record 345 stored in the second data store 342 including the first data store 341 changes fulfills the input requirements of the decision model 360 and that the model 360 can be executed per the specification file 362, the escrow controller 320 issues a call to the model processor 370 together with the requisite data read from the escrow record 345 in the second data store 342 to execute the model 360, and the generated result 361 in the form of a tax return ranking is provided by the model processor 370 to the escrow controller 320 and stored to the third data store 343 at 718.
  • the online tax return preparation application 311 issues a request for the tax return topic ranking 361 to the escrow controller 320, which reads the tax return topic ranking 361 for that user that was previously generated from the third data store 343, and responds to the request by providing the tax return topic ranking 361 to the online tax return preparation application 311.
  • heartbeat calls 750 or checks to the escrow controller 320 for the active on-line session / user are terminated, or will not be issued after a pre-determined amount of time, in response to which the escrow controller 320 can terminate the polling service 380 and delete the escrow record 345 from the second data store 342 since the on-line session has been terminated.
  • an escrow controller 320 includes or utilizes a foreground thread pool 850 and a background thread pool 852 that are utilized to process heartbeat requests 750 and interface with the escrow records 345 in the second data store 342.
  • a foreground thread pool 850 is a pool of processes that works to receive the incoming heartbeat requests 750, or a "request for future work" from the online tax return preparation application 311 and forwards the necessary instructions to the background thread pool 852.
  • Foreground refers to work that is done while the calling application waits for the response, whereas background refers to what will be done at some time later, without the calling application needing to wait.
  • the application 311 that sends the heartbeat signal 750 can receive a rapid response while "heavier" work is scheduled to be done later, such as initializing a cache or escrow record 342 and initializing a poller 380 (in the case of the very first heartbeat signal 750 received for a user) or keeping alive an existing cache or escrow record 342 and an existing poller 380 (in the case of a subsequent heartbeat signal 750 received for a user).
  • Pools include a collection of computer-executable elements, such as threads.
  • a thread is a component of a process and is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system. Multiple threads can exist within the same process and may execute concurrently and share resources.
  • multithreading is generally implemented by time slicing or multitasking, and the CPU switches between different software threads.
  • multiple threads can be executed in parallel or at the same time, with every processor or core executing a separate thread simultaneously.
  • the size of the pool is usually the number of concurrent users.
  • the background thread pool 852 is a pool of threads that performs asynchronous or background tasks consisting of creation and modification messages to a Java Message Service (JMS) broker, creation and modification of each unique escrow record 345, and decision engine or escrow controller 320 execution requests.
  • a thread pool is a pre-initialized collection of Java threads, and threads are small independent executable 'processes' capable of doing work without coordinating with another process.
  • the size of the background thread pool 852 is the ratio of average task completion time of the foreground thread pool 850 divided by the average task completion time of the background thread pool 852.
  • MemCache refers to the cache or second data store 342 utilized by the escrow service or escrow controller 320.
  • the second data store 342 includes escrow records 345 and data about user activity such as unique user identifiers and timestamp data of activity.
  • An evictor thread 854 manages dormant escrow records 342 and the proper sizing of the background thread pool 852.
  • Dormant records 354 are records for which the system has not received a heartbeat signal 750 in some pre-determined amount of time. When an escrow record 342 is dormant, it is deleted and its worker is deleted, which results in "cleaning up," at the end of a user's session, of data that was created when the on-line session was initiated.
  • the online tax return preparation application 311 sends a "heartbeat" or "check" message 750 to the escrow controller 320 to notify the escrow controller 320 that the on-line session is still active, and upon receiving the heartbeat message 750, the foreground thread pool 850 of the escrow controller 320 quickly responds to the online tax return preparation application 311 with a "success" indicator.
  • further instructions are forwarded by the foreground thread pool 850 to the background thread pool 852 for asynchronous processing.
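
The split between a fast foreground acknowledgement and deferred background work can be sketched as follows. This example uses Python's `concurrent.futures.ThreadPoolExecutor` in place of the Java thread pools described in the embodiments, and the `HeartbeatHandler` class, its methods and the per-user counter standing in for an escrow record are assumptions for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class HeartbeatHandler:
    """Acknowledge heartbeats immediately and defer heavier work to a background pool."""

    def __init__(self, workers=2):
        self._background = ThreadPoolExecutor(max_workers=workers)
        self._lock = threading.Lock()
        self.records = {}    # user id -> heartbeat count (stand-in for an escrow record)
        self._pending = []

    def _heavy_work(self, user_id):
        # Placeholder for creating or keeping alive the escrow record and poller.
        with self._lock:
            self.records[user_id] = self.records.get(user_id, 0) + 1

    def heartbeat(self, user_id):
        # Foreground path: schedule the work and reply without waiting for it.
        self._pending.append(self._background.submit(self._heavy_work, user_id))
        return "success"

    def drain(self):
        # Wait for scheduled background work (used here only to make the demo deterministic).
        for future in self._pending:
            future.result()
        self._pending.clear()

    def close(self):
        self._background.shutdown(wait=True)


handler = HeartbeatHandler()
replies = [handler.heartbeat("u1"), handler.heartbeat("u1"), handler.heartbeat("u2")]
handler.drain()
handler.close()
print(replies, handler.records)
```

The caller receives "success" as soon as the task is queued; the record bookkeeping completes later on a background thread.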
  • the background thread pool 852 checks MemCache 856 (the cache or second data store 342) to determine whether a unique escrow record 345 has been created, and if not, then one is created and stored to the MemCache 856 / second data store 342.
  • the background thread pool 852 sends instructions to Java Message Service (JMS) brokers to create poller 380.
  • JMS brokers respond to the background thread pool with further instructions regarding data that has been requested.
  • Resources that may be utilized for this purpose include, for example, Amazon Web Services (AWS) / Simple Queue Service (SQS) 858 available from Amazon Web Services, Inc.
  • the Amazon SQS Java Messaging Library is a JMS interface to Amazon SQS that enables leveraging of Amazon SQS in applications that use JMS, and the interface allows use of Amazon SQS as the JMS provider.
  • the background thread pool 852 calls the model processor or execution service 370 (also referred to as decision engine in FIG. 8) to generate a result 361 in the form of a tax return topic ranking.
  • the evictor thread 854 loops through the escrow records 345 and actively monitors the second data store / cache 342 for any dormant escrow records 345, and at 816, if a dormant escrow record is detected, the evictor thread 854 initiates termination of the poller 380, the escrow record 345, and resizing of the background thread pool 852. This is done in order to recover resources created when a user's session was initiated, rather than allowing resource consumption to increase over time.
  • FIG. 9 generally illustrates certain components of a computing device 900 that may be utilized, or that system components may include, for execution of various computer instructions according to embodiments.
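
A single pass of the eviction logic described above might be sketched as follows. The `evict_dormant` function, the timestamp dictionary and the time-to-live value are assumptions for illustration; thread pool resizing is omitted.

```python
def evict_dormant(last_heartbeat, records, pollers, now, ttl_seconds):
    """Remove records and pollers for users whose last heartbeat is older than the TTL."""
    dormant = [user for user, ts in last_heartbeat.items() if now - ts > ttl_seconds]
    for user in dormant:
        records.pop(user, None)   # delete the escrow record
        pollers.discard(user)     # stop the associated poller
        del last_heartbeat[user]
    return sorted(dormant)


# Hypothetical state: "u1" last sent a heartbeat 100 s ago, "u2" 10 s ago.
last_heartbeat = {"u1": 100.0, "u2": 190.0}
records = {"u1": {"wages": 52000}, "u2": {"wages": 61000}}
pollers = {"u1", "u2"}

evicted = evict_dormant(last_heartbeat, records, pollers, now=200.0, ttl_seconds=60.0)
print(evicted, records, pollers)
```

In a running service this pass would be repeated on a timer by a dedicated thread.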
  • the computing device may include a memory 910, program instructions 912, a processor or controller 920 to execute instructions 912, a network or communications interface 930, e.g., for communications with a network or interconnect 940 between such components.
  • the memory 910 may be or include one or more of cache, RAM, ROM, SRAM, DRAM, RDRAM, EEPROM and other types of volatile or non-volatile memory capable of storing data.
  • the processor unit 920 may be or include multiple processors, a single threaded processor, a multi-threaded processor, a multi-core processor, or other type of processor capable of processing data.
  • the interconnect 940 may include a system bus, LDT, PCI, ISA, or other types of buses, and the communications or network interface may, for example, be an Ethernet interface, a Frame Relay interface, or other interface.
  • the network interface 930 may be configured to enable a system component to communicate with other system components across a network which may be a wireless or various other networks. It should be noted that one or more components of computing device 900 may be located remotely and accessed via a network.
  • the system configuration provided in FIG. 9 is provided to generally illustrate how embodiments may be configured and implemented, and it will be understood that embodiments may also involve communications through one or more networks between a user computer and a computer hosting system embodiments of on-line or cloud based tax return preparation applications.
  • Method embodiments or certain steps thereof may also be embodied in, or readable from, a non-transitory, tangible medium or computer-readable medium or carrier, e.g., one or more of the fixed and/or removable data storage devices and/or data communications devices connected to a computer.
  • Carriers may be, for example, magnetic storage medium, optical storage medium and magneto-optical storage medium.
  • Examples of carriers include, but are not limited to, a floppy diskette, a memory stick or a flash drive, CD-R, CD-RW, CD-ROM, DVD-R, DVD-RW, or other carrier now known or later developed capable of storing data.
  • the processor 920 performs steps or executes program instructions 912 within memory 910 and/or embodied on the carrier to implement method embodiments.
  • embodiments can be combined with a very specific kind of decision model that allows for enhanced and new A/B testing and targeted segmentation.
  • Targeted segmentation is a capability offered by a specific kind of model that involves the model choosing, from a finite set of possible user experiences, the one user experience that should be delivered based on the state of a user's profile.
  • targeted segmentation with embodiments is based on the premise that there are multiple variations of a user experience, each user experience being optimal for a sub segment of the population.
  • This targeted segmentation model has the general property that it will assign a user to an experience if either (1) the experience has been proven to be optimal or (2) there is not enough data gathered to come to a firm conclusion.
  • the targeted segmentation model will not assign a user to an experience if that user experience has been proven to be sub-optimal.
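
The assignment property described above can be expressed as a simple filter. The status labels and the `eligible_experiences` helper below are hypothetical and only illustrate the rule that proven sub-optimal experiences are excluded.

```python
def eligible_experiences(evidence):
    """Keep experiences that are proven optimal or not yet conclusively evaluated."""
    return [name for name, status in evidence.items()
            if status in ("optimal", "insufficient_data")]


# Hypothetical evidence for one user segment.
evidence = {"A": "optimal", "B": "suboptimal", "C": "insufficient_data"}
eligible = eligible_experiences(evidence)
print(eligible)
```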
  • Embodiments distinguish between components responsible for producing a qualification and those that are responsible for producing an assignment, and this enables models 360 to be run, and decisions or generated results in the form of a ranking 361 to be produced, in the background and not coordinated with any direct user activity.
  • the nature of this design means that the models are run and results produced typically well before they are actually needed by the requesting application 311 or, in other words, since the application 311 that will require the ranking 361 has not yet called to retrieve the ranking 361 and the user has not yet been affected by it, it may be the case that the resulting ranking 361 will never be consumed.
  • embodiments address shortcomings associated with situations in which users are assigned to an experiment and potentially never reach the portion of the application 311 that is relevant, thus providing improvements by addressing problems involving: 1. in the case of a set of experiments that are mutually exclusive with each other, the risk of wasting traffic or performing unnecessary communications when a user is assigned to an experiment, which disqualifies them from participating in anything else, but no useful data is ever collected from that user because they never actually get to participate in the experiment to which they were assigned, 2. difficulty in distinguishing users that were assigned to an experiment from those that were assigned and actually affected by the experiment, and 3. difficulty for the escrow controller in knowing whether or not a better assignment could be produced when potentially more data becomes available at some later phase.
  • FIGS. 10A-B are system flow diagrams illustrating how user experiences may be processed according to embodiments, i.e., how a personalized experience flow would be implemented, and involve an experience management system that manages an experience set and produces an experience assignment by considering a set of experience qualifications and performing any additional last minute checks that validate criteria (e.g., mutual exclusion rules).
  • An experience set is defined as including elements including or involving traffic allocation percentage, a set of named variations that are members of the set, a set of other Experience Sets with which this one is mutually exclusive, a simple predicate that further determines the experiment eligibility of a participant, and a reference to the Experiment Qualification Mode
  • the application 311 (such as online tax return preparation application, but general "application") makes a call to Jabba 1050 or another external system that assigns users to one of many equivalent experiences that can benefit from escrow-based ranking embodiments, with a unique identifier of the user and any valid JSON object as inputs to the experimentation mode, and at 1004, Jabba 1050 receives the request. Based on the established rules, Jabba 1050 determines which test and experience to call or serve. If the chosen experience is Experience Decision Engine 1052 (rather than "Baseline" or the current best performing experience from a set of equivalent experiences), then Jabba 1050 makes a request to Decision Engine 370:
  • the decision engine 370 consumes the request and responds back to Jabba 1050 with a list of "buckets" with probabilities that sum 1054 to 1.0:
  • Jabba 1050 then "flips a coin" 1060 on the options 361a-c that the decision engine 370 has provided to return an ordered list 1062 of options.
  • Jabba 1050 then applies business rules to determine the final experience 361f ("f" referring to "final") to serve back to the application 311.
  • the business rules are rules that define or constrain some aspect of business and always resolve to either true or false.
  • An example of a business rule might be: given experiences A, B, and C, new users are not allowed to see experience B; thus experience C should be shown to the user even though experience B might have higher priority on the Jabba Result List.
  • Jabba 1050 also responds to permit the application team to choose which experience to serve to the user. It may also be the case that Jabba 1050 is unreachable. This may occur when the application 311, for whatever reason, cannot communicate with Jabba 1050, and in this situation, the application team can decide which experience to serve to the user.
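
One possible reading of the "coin flip" followed by business rules is a weighted random ordering of the buckets, after which the first option passing every rule is served. The sketch below is an assumption about how this could be done, not the actual behavior of Jabba 1050; the bucket probabilities, the `flip_coin` and `apply_rules` helpers and the rule function are hypothetical.

```python
import random


def flip_coin(buckets, rng):
    """Order options by weighted sampling without replacement."""
    remaining = dict(buckets)
    ordered = []
    while remaining:
        names = list(remaining)
        weights = [remaining[name] for name in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        ordered.append(pick)
        del remaining[pick]
    return ordered


def apply_rules(ordered, rules, user):
    """Return the first option for which every business rule resolves to true."""
    for option in ordered:
        if all(rule(option, user) for rule in rules):
            return option
    return None


def no_b_for_new_users(option, user):
    # Business rule from the example above: new users may not see experience B.
    return not (option == "B" and user["is_new"])


buckets = {"A": 0.5, "B": 0.3, "C": 0.2}           # hypothetical probabilities
ordered = flip_coin(buckets, random.Random(7))     # seeded only for repeatability

# With the result list B, C, A, a new user is served C and a returning user B.
final_new = apply_rules(["B", "C", "A"], [no_b_for_new_users], {"is_new": True})
final_returning = apply_rules(["B", "C", "A"], [no_b_for_new_users], {"is_new": False})
print(ordered, final_new, final_returning)
```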
  • This section discusses the hosting design of the major components that make up the Platform. Since this is hosted in an AWS VPC with connectivity back to Intuit's collocated facilities, there are certain requirements and guidelines around securing the data and components.
  • n is the number of polling instances running.
  • Outbound traffic from poller instances consists essentially of read calls, which are very small in request size and negligible for bandwidth calculations.
  • each server can handle ~50 MB/s consistently with bursts of 100-110 MB/s.
  • An Escrow Service is a background service which provides an alternative way to invoke the Decision Engine Execute Service without having to provide all the data required to execute a model.
  • the Escrow service can use pollers to proactively fetch data from the required sources so that the calling application only needs to provide the data that is not already available somewhere.
  • Escrow Service will use the authId of the user to resolve the Entity Key (EK) for the user's current tax year filing data.
  • EK Entity Key
  • the resolved EK will be stored in the Customer Data Cache.
  • Escrow service will pass along the AuthId, EK, the Model Requirements and the baseline data state (which is initially empty) to the Poller.
  • the poller will talk to specified sources to fetch the actual data. Whenever the Poller detects a change in the data compared to the baseline data, it sends a Data Change event to the JMS along with the details of the new data values. After successful transmission to the JMS, the Poller holds the current state of the data as the new baseline.
  • Whenever the Escrow Service figures out that it has enough data to execute a model, it will execute the model and pass the data to DAS for storage.
  • the insights document will be created if required and the node specific to this decision key will be added or updated (depending on if this is the first time it is being added) whenever the decision is generated. If the decision node is created up front, there is a risk that the app will consume the empty node and mark it as consumed, so the decision node must only be created/updated after the decision has been generated.
  • Model in requirements contains 1 or more tax requirements that require EK to be resolved before DAS queries can be made
  • Each escrow worker has a config document that defines its behavior; how it gets its input, the model to invoke and how it communicates the data to the consumers of the decision.
  • the Escrow config document exposes the contract of the Escrow Service for the model that it's exposing. It is related to the model's major version and needs to change along with it. It also has a life cycle of its own and can change on its own even for the same major version of the model.
  • Escrow Config is an entirely separately managed document from the Model. Importantly, an Escrow Config is independently versioned and published. This allows the isolation of the changes to a Model from the changes to an Escrow Config and vice versa.
  • Version 1.3 of a Model document requires a zipCode and a useragent.
  • Version 1.1 of the Escrow Config is specified to retrieve the zipCode from the 'taxReturn' provider and the useragent from the 'local' provider, meaning that the useragent value is exposed to the calling client as a requirement for executing the model.
  • the useragent value is persisted to a document and retrievable through the triageTaxProfile document.
  • Escrow will talk to DAS once to "resolve" the EK. It will get all the returns and figure out which is the current tax year and the corresponding EK that must be used in the URI to get the data entered for this tax year's returns.
  • Escrow will pass the EK to the Poller to use for its subsequent calls to get the actual data. Escrow will make read and update calls to DAS to store the
  • When the Escrow Worker is updating a decision, it needs to check that the decision has not already been consumed by the client (this is for use cases where the decision cannot be updated, such as an experiment allocation decision). This check is subject to a race condition: the Escrow worker checks whether the decision has been consumed, receives false, and then computes the decision and writes the update to DAS, but right after the Escrow worker checked the consumed flag, the client read the field and set the flag to true. In this case, the Escrow worker ends up updating a decision that has already been consumed, changing data that the client has already seen.
  • DAS will be providing us a way to send a processing instruction as part of the request where we will update the request only if the flag has not been set.
  • the request may be based on, for example:
  • the response on success can be based on, for example:
  • DAS supports a feature where the caller can send across an array of instructions for processing which has multiple GETs and PUTs.
  • the request may be based on, for example:
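
The original request and response examples are not reproduced on this page, and the sketch below does not attempt to reconstruct them. It is only a minimal, hypothetical illustration of the conditional update idea described above for the consumed flag: the check and the write are performed as one atomic step so a client read cannot slip in between. `DecisionStore`, `DecisionNode`, `update_if_not_consumed` and `consume` are assumed names for illustration and are not the DAS API.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecisionNode:
    value: Optional[str] = None
    consumed: bool = False


class DecisionStore:
    """Hypothetical in-memory stand-in for the data store; not the DAS API."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._nodes = {}  # decision key -> DecisionNode

    def update_if_not_consumed(self, key: str, value: str) -> bool:
        """Write the decision only if the client has not consumed it yet.

        The flag check and the write happen under one lock, so a client
        read cannot occur between them.
        """
        with self._lock:
            node = self._nodes.setdefault(key, DecisionNode())
            if node.consumed:
                return False
            node.value = value
            return True

    def consume(self, key: str) -> Optional[str]:
        """Client read: return the decision and mark it consumed atomically."""
        with self._lock:
            node = self._nodes.get(key)
            if node is None or node.value is None:
                return None
            node.consumed = True
            return node.value


store = DecisionStore()
store.update_if_not_consumed("experiment-1", "A")  # accepted
store.consume("experiment-1")                      # client sees "A"
store.update_if_not_consumed("experiment-1", "B")  # rejected, already consumed
```

In a real deployment the atomicity would have to be provided by the data store itself (as the processing instruction described above suggests), since the worker and the client are separate processes.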

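The experience selection described in the list above (probability buckets, a "coin flip" that yields an ordered list, business rules, and an application-side fallback) can be sketched roughly as follows. This is one possible reading, assuming the coin flip is a weighted draw without replacement; the function names, the rule signature and the `is_new` user attribute are hypothetical and not taken from the original.

```python
import random


def ordered_by_coin_flip(buckets, rng):
    """Weighted sampling without replacement over {experience: probability}.

    Experiences with higher probability tend to appear earlier in the list.
    """
    if abs(sum(buckets.values()) - 1.0) > 1e-9:
        raise ValueError("bucket probabilities must sum to 1.0")
    remaining = dict(buckets)
    ordered = []
    while remaining:
        names = list(remaining)
        weights = [remaining[name] for name in names]
        if sum(weights) <= 0:
            # Only zero-probability buckets remain; keep their given order.
            ordered.extend(names)
            break
        pick = rng.choices(names, weights=weights, k=1)[0]
        ordered.append(pick)
        del remaining[pick]
    return ordered


def final_experience(ordered, rules, user, default):
    """Return the first experience that passes every business rule.

    Each rule resolves to True or False. If nothing passes (or the list is
    empty because the service was unreachable), the application's own
    default is used.
    """
    for experience in ordered:
        if all(rule(experience, user) for rule in rules):
            return experience
    return default


def no_b_for_new_users(experience, user):
    # Example rule from the text: new users may not see experience B.
    return not (user.get("is_new", False) and experience == "B")


order = ordered_by_coin_flip({"A": 0.2, "B": 0.5, "C": 0.3}, random.Random())
choice = final_experience(order, [no_b_for_new_users], {"is_new": True}, "A")
```

With the ordered list `["B", "C", "A"]` and a new user, the rule skips B and the sketch serves C, matching the example given for the business rule.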
Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Marketing (AREA)
  • Technology Law (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Computer systems for escrowing of selected data of an online tax return preparation application for use in executing models such as predictive models to generate tax return topic rankings, which are provided to the online tax return application and displayed to a user and are personalized in that they reflect or are based on actual tax return data of the user. Escrow system components may operate independently of the online application such that small portions of a very large amount of tax return preparation data can be retrieved (e.g., periodically) from a data store maintained by the online application, and this retrieved data is cached or stored to a different data store and analyzed relative to an escrow contract specifying types of data required to trigger model execution. Upon satisfying escrow contract requirements, a corresponding model is executed, and a topic ranking is generated and provided to the online application.

Description

ESCROW PERSONALIZATION SYSTEM
SUMMARY
[0001] Embodiments are related to computer-centric and internet-centric technologies such as preparation of electronic tax returns by an online tax return preparation application accessible by browsers executing on user computing devices, data and memory management and computing system efficiencies, user interface generation and real-time user interface adaptation and modification, interfacing with or communicating with on-line computer applications, and modular system components that can be used to implement changes to an on-line tax return preparation application without having to change code or components of the on-line tax return preparation application itself.
[0002] Embodiments also involve or are related to user interfaces generated during preparation of electronic tax returns and personalizing user interface content for particular users, personalizing tax return preparation experiences, modifying user interfaces and user experiences on the fly in real time during preparation of an electronic tax return, and adapting user interfaces in response to changes in electronic tax return data.
[0003] Embodiments also involve or are related to data escrow systems and how components of data escrow systems may operate independently of or asynchronously relative to an online tax return preparation application while still interfacing with the online tax return preparation application to respond to an application's request for results generated by the data escrow system. Such results may be in the form of a personalized tax return topic ranking that is incorporated into an interview screen that is presented to a user of an online tax return preparation application during preparation of an electronic tax return.
[0004] Examples of tax return topics that may be processed by embodiments and integrated into a personalized tax return topic ranking include, but are not limited to, income, deductions, and taxes paid. Embodiments may also be utilized to rank sub-topics of a topic, e.g., certain types of income, such as wages or other income such as business income, interest, dividends, pension income, annuities, rental income, unemployment compensation, capital gains, gambling income, farming and fishing income, clergy earnings, social security, scholarships, alimony, canceled debt, and 401(k) distributions. As another example, sub-topics for the topic of "deductions" may include medical and dental expenses, home mortgage points, interest expenses, charitable contributions, business related expenses (car, travel, etc.), educational expenses, and property taxes. As used herein, "topic" is defined as including both such "topics" and "sub-topics" and further drilling down into additional sub-topic levels, and a tax return topic ranking may include only "root" node topics (e.g., income, deductions, etc.), a combination of "root" and "leaf" node topics / sub-topics (e.g., income, property tax, mortgage interest), only "leaf" node sub-topics of a certain topic (e.g., property taxes paid, points, mortgage interest), "leaf" node sub-topics of different topics (e.g., property taxes paid, points, mortgage interest, wages, dividends), or a combination of one or more of topics, sub-topics and further drilled down or lower level, more specific sub-topics. "Topic" as defined herein includes such "topics" and "sub-topics" of various levels of specificity.
[0005] Embodiments also involve or are related to processing of a small portion of data managed by an online tax return preparation application in order to make changes to and improve interview screens and user experiences during preparation of electronic tax returns.
Embodiments also involve or are related to reducing the amount of tax return data that is processed in order to implement dynamic changes to interview screens during preparation of an electronic tax return to provide a personalized user experience, thus achieving improvements while providing for efficient use of computing resources as a result of, for example, reduced demands on processor and memory components and reduced data transmission between components and/or through networked components. For example, embodiments may involve transformation of on-line tax return preparation experiences as electronic tax return data is changed by a user of an online tax return preparation application. Thus, as a user's data changes, e.g., by adding, importing, deleting or changing data, embodiments are adaptive to these changes and provide for generation of interview screens that reflect these changes and include different tax return topic rankings or topic menus with topic items that are arranged in different ways depending on the data provided. Embodiments also involve or are related to transforming static menus or tax return topic listings into dynamic, adaptive menus or tax return topic listings that prioritize what a user is more likely to select or view, thus providing for more efficient menu presentation, electronic tax return preparation and use of computing resources to implement interview screens with more streamlined or reduced user interactions with interview screens and menu items generated by an online tax return preparation application.
[0006] As another example, an online tax preparation application may manage a "master" persistent data store that includes tax return data of millions of taxpayers, whereas embodiments utilize a separate data store or cache that includes a small portion of the persistent data store and serves as a low latency remote persistence mechanism where user data is stored and indexed for fast access and retrieval at decision computation time.
Embodiments utilize a cache of data required by models that are used to compute decisions in the form of topic rankings, for example, for currently active users (i.e., users with an active session or who are logged into the online tax return preparation application). While the runtime persistence data store managed by the online tax return preparation application is used for maintaining all application data for all users over all time, the cache component utilized according to embodiments maintains only a small subset of that persistent data for active users, and only for the duration of the active user's session. Measured in volume of data managed, this means the scope of data processed according to embodiments is much less than the data in the runtime persistence data store managed by the online tax return preparation application while, at the same time, use of the cache according to embodiments provides for more precise and tailored implementation decisions when choosing the hardware and software that will be used to implement embodiment components.
[0007] Embodiments also involve models, such as predictive models, and integration of predictive models into on-line tax return preparation applications. Results generated by models are incorporated into user interfaces or interview screens generated by a tax return preparation application.
[0008] Embodiments also involve a modular escrow system that allows for changes to individual components such as models and specification files regarding data required for model execution, without having to modify code or components of an on-line tax return preparation application.
[0009] Embodiments also involve how a tax topic ranking system can execute independently of or asynchronously relative to an on-line tax return preparation application for which the topic ranking is generated.
[0010] Embodiments also involve user interfaces, such as user interfaces generated by online tax return preparation applications, and how such user interfaces can be modified during preparation of an electronic tax return, and in the background for an interview screen that is not currently viewed by a user.
[0011] According to one embodiment, a computerized tax return preparation system includes an on-line tax return preparation application and a tax return topic ranking system. The on-line tax return preparation application (such as turbotax.com) can be accessed by respective user computing devices executing respective browsers to prepare respective electronic tax returns of respective users. The on-line tax return preparation application is configured to write respective electronic tax return data of respective users to a first data store, or a master persistent data store. This first data store may include electronic tax return data of millions of customers for many years. The tax return topic ranking system is in communication with the on-line tax return preparation application and the first data store and can operate independently of the on-line tax return preparation application while interfacing with the online tax return preparation application to provide tax return topic rankings thereto. The tax return topic ranking system is configured to retrieve from the first data store particular types of data for respective users that are logged into the online tax return preparation application, and the small portions of data of respective logged-in users retrieved from the first data store are included in generated escrow records, which are stored or cached to a different, second data store. When data of an escrow record satisfies predetermined criteria, a model is executed utilizing the particular types of electronic tax return data to generate a tax return topic ranking, which is provided to the online tax return preparation application. The on-line tax return preparation application is configured to generate an interview screen including tax return topics structured according to the generated tax return topic ranking.
[0012] According to another embodiment, a computerized tax return topic ranking system that is operable with an on-line tax return preparation application comprises an escrow controller, a model specification file, a selective data retrieval service, a second data store or cache for escrowed data, an execution service or model processor, and a third data store for storing results of model generation and that are to be provided to an online tax return preparation application. According to one embodiment, the escrow controller is in communication with the on-line tax return preparation application and can receive data from the online tax return preparation application indicating, for example, that a user has initiated an on-line session or logged into the online tax return preparation application. The escrow controller is configured to process the model specification file, or escrow contract, to determine which types of electronic tax return data are required in order to trigger execution of a model. One or multiple models may be applicable to a user and an escrow record may be generated for each active model for a user, and one or more escrow contracts may specify one or more types or combinations of data required for respective model inputs and execution. The selective data retrieval service is in communication with the escrow controller and the persistent or first data store utilized by the online tax return preparation application to maintain all application data for all users over all time. The second data store or cache is in communication with the escrow controller. The selective data retrieval service is configured to retrieve the logged in user's electronic tax return data of types identified by the model specification file (e.g., based on a subscription or request by the escrow controller) and provide the retrieved electronic tax return data to the escrow controller.
[0013] For example, the selective data retrieval service may be a polling service employed by the escrow controller, or a native element of the first data store that automatically notifies the escrow controller of changes in specified types of electronic tax return data for logged-in users without the need for polling or a subscription for specific data or changes thereto. The escrow controller can generate an escrow record for the user or online session that includes retrieved electronic tax return data, which may be some or all of the data types identified by the specification file. When the escrow controller determines that an escrow record contains the minimum required types of electronic tax return data identified by the model specification file, model execution is triggered, and for this purpose, the escrow controller is configured to issue a call to the model processor to execute a model identified in the model specification file. A result generated by execution of a model is a ranking of tax return topics, or a ranking of menu items (generally, ranking of tax return topics). This ranking result is provided to the online tax return preparation application. For this purpose, the escrow controller may be configured to store the model result in an independent third data store until the model result is requested by the online tax return preparation application.
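
As a rough illustration of the trigger described in paragraph [0013], the following sketch checks an escrow record against the data types named by an escrow contract and, once they are all present, executes a model and stores the result. It is a minimal sketch under stated assumptions: `EscrowContract`, `EscrowRecord`, `on_data_retrieved`, the `filingStatus` field and the dictionary standing in for the third data store are all hypothetical names, not part of the described system.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, FrozenSet


@dataclass
class EscrowContract:
    """Hypothetical model specification: which data types the model needs."""
    model_id: str
    required_types: FrozenSet[str]


@dataclass
class EscrowRecord:
    """Hypothetical per-user record of escrowed (cached) tax return data."""
    user_id: str
    data: Dict[str, Any] = field(default_factory=dict)

    def satisfies(self, contract: EscrowContract) -> bool:
        # True once every required data type is present in the record.
        return contract.required_types.issubset(self.data.keys())


def on_data_retrieved(record, contract, new_data, execute_model, result_store):
    """Merge newly retrieved data; run the model once the contract is met.

    Returns True if the model was executed and its result stored.
    """
    record.data.update(new_data)
    if not record.satisfies(contract):
        return False
    inputs = {name: record.data[name] for name in contract.required_types}
    result_store[record.user_id] = execute_model(contract.model_id, inputs)
    return True


contract = EscrowContract("income_topics", frozenset({"zipCode", "filingStatus"}))
record = EscrowRecord("user-1")
results = {}  # stand-in for the third data store
model: Callable = lambda model_id, inputs: ["dividends", "wages"]
on_data_retrieved(record, contract, {"zipCode": "94043"}, model, results)
on_data_retrieved(record, contract, {"filingStatus": "single"}, model, results)
```

The first call leaves `results` empty because only one of the two required types is present; the second call completes the record and stores a ranking for the user.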
[0014] The third data store may include the most recent or current model result generated and can be modified as a result of iterations of changes to the first data store data, changes to the second data store or cache data, changes to whether model requirements are satisfied and which models are executed with different inputs, and resulting different results.
[0015] A further system embodiment includes one or more or all components of a computerized tax return topic ranking system discussed above (including an escrow controller, a model specification file, a selective data retrieval service, a second data store or cache for escrowed data, an execution service or model processor, and a third data store) and the on-line tax return preparation application.
[0016] Further embodiments are directed to computer-implemented methods for generating ranked or prioritized data, e.g. in the form of a list or menu item, such as a list of tax return topics that is presented to a user of an online tax return preparation application.
[0017] Additional embodiments are directed to computer program products or articles of manufacture comprising a non-transitory computer readable medium embodying instructions executable by a computer to execute a process generating a ranking of tax return topics for presentation to a user of an online tax return preparation application.
[0018] Yet other embodiments are directed to adaptive or dynamic user interfaces or interview screens that reflect or can change given a current state of a user's tax return data.
[0019] Further embodiments are directed to how a small portion of data is retrieved from the first or persistent data store that maintains all data of all users of the online tax return preparation application and processed to generate portions of interview screens or user interfaces during preparation of an electronic tax return.
[0020] In a single or multiple embodiments, a third data store different from and independent of the first data store and the second data store is utilized to store a generated tax return topic ranking for subsequent retrieval in response to a ranking request by the on-line tax return preparation application. With system embodiments utilizing a distributed data store configuration in which each data store is separate and independent of the others, the on-line tax return preparation application can write data to and read data from the first data store, but the on-line tax return preparation application cannot write data to or read data from the second data store or the third data store.
[0021] In a single or multiple embodiments, a model that is utilized to generate a tax topic ranking may be, or be based on, one or more predictive models including logistic regression; naive bayes; k-means classification; K-means clustering; other clustering techniques; k-nearest neighbor; neural networks; decision trees; random forests; boosted trees; k-nn classification; kd trees; generalized linear models and support vector machines. One or multiple models may be published or activated for use, and one or multiple models may be available for execution for an individual user session, and escrow records can be generated for respective models for each logged in user.
[0022] In a single or multiple embodiments, the tax return topic ranking system is initiated for a user in response to receiving data indicating that a user has logged into the online tax return preparation application or initiated an on-line session, and terminated when the user has logged off from the online tax return preparation application or terminated the on-line session.
[0023] In a single or multiple embodiments, the tax return topic ranking system, e.g., via an escrow controller, receives a request for a tax return topic ranking from the on-line tax return preparation application during preparation of the electronic tax return for a logged-in user. The type of response provided depends in part upon whether a model has been executed and a result for a logged-in user that is the subject of the request is stored in the third data store.
[0024] For example, a request received by an escrow controller may identify a logged-in user, and in response, the escrow controller may initially access the third data store to determine whether a tax return topic ranking has already been generated for the identified logged-in user. If so, this ranking can be provided in response to the request. If not, a determination can be made whether the escrow record data of the identified user in the second data store satisfies pre-determined criteria or escrow requirements. If these criteria or requirements are satisfied, a corresponding model can be executed to generate a tax return topic ranking, which is provided to the on-line tax return preparation application in response to the request. If the escrow data does not satisfy pre-determined criteria for a model, the tax return topic ranking system can wait for the escrow record to be modified such that the pre-determined criteria are satisfied, and the corresponding model can be executed and/or respond to the request by notifying the online tax return preparation application that no generated tax return topic ranking is available, in which case the online tax return preparation application may utilize a default or pre-determined topic ranking. As the escrow record is modified, as a result of adding, deleting or changing data to generate a different escrow record, such modifications may result in execution of different models, which may result in different topic rankings.
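
The request handling order in paragraph [0024] (stored result first, then the escrow record, then a "no ranking available" response that lets the application fall back to a default) might look roughly like the sketch below. The function names, the dictionaries standing in for the second and third data stores, and the contents of `DEFAULT_RANKING` are assumptions for illustration only; the "wait for the escrow record to be modified" option is omitted.

```python
DEFAULT_RANKING = ("income", "deductions", "taxes paid")


def handle_ranking_request(user_id, result_store, escrow_cache,
                           required_types, execute_model):
    """Return a topic ranking for the user, or None if none is available.

    result_store stands in for the third data store (model results) and
    escrow_cache for the second data store (escrow records).
    """
    stored = result_store.get(user_id)
    if stored is not None:
        return stored  # a ranking was already generated for this user
    record = escrow_cache.get(user_id, {})
    if required_types.issubset(record):
        ranking = execute_model({k: record[k] for k in required_types})
        result_store[user_id] = ranking
        return ranking
    return None  # escrow requirements not satisfied yet


def ranking_for_screen(response):
    """Application side: use the default ranking when none was generated."""
    return list(response) if response is not None else list(DEFAULT_RANKING)


results = {}
cache = {"user-1": {"zipCode": "94043"}}
required = {"zipCode", "wages"}
model = lambda inputs: ["deductions", "income", "taxes paid"]

first = ranking_for_screen(
    handle_ranking_request("user-1", results, cache, required, model))
cache["user-1"]["wages"] = 50000
second = ranking_for_screen(
    handle_ranking_request("user-1", results, cache, required, model))
```

Here `first` is the default ranking because the escrow record lacks `wages`, while `second` is the model's ranking, which is also kept in `results` for later requests.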
[0025] In a single or multiple embodiments, data analyzed relative to requirements of an escrow contract or specification file is read from a single source, i.e., the second data store or cache of the portion of the data in the first data store. In another embodiment, electronic tax return data is provided by multiple sources, and in one embodiment, this involves both the second data store or cache and the on-line tax return preparation application, e.g., the request provided by the online tax return preparation application.
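
One way to picture the multiple-source case in paragraph [0025] is a configuration that maps each model input to the source that supplies it. The sketch below loosely follows the zipCode and useragent example given earlier on this page (zipCode from a 'taxReturn' provider, useragent from a 'local' provider supplied by the caller); the dictionary layout, key names and function names are hypothetical and not the actual Escrow Config format.

```python
# Hypothetical config shape; only the provider names and the two input
# names come from the example on this page.
ESCROW_CONFIG = {
    "configVersion": "1.1",
    "modelVersion": "1.3",
    "inputs": {"zipCode": "taxReturn", "useragent": "local"},
}


def client_requirements(config):
    """Inputs the calling application must send with its request."""
    return sorted(name for name, provider in config["inputs"].items()
                  if provider == "local")


def resolve_inputs(config, cached, request):
    """Collect model inputs from the cache and the request.

    Returns None if any required input is still missing from its source.
    """
    sources = {"taxReturn": cached, "local": request}
    resolved = {}
    for name, provider in config["inputs"].items():
        source = sources[provider]
        if name not in source:
            return None
        resolved[name] = source[name]
    return resolved


needed_from_client = client_requirements(ESCROW_CONFIG)
inputs = resolve_inputs(ESCROW_CONFIG,
                        cached={"zipCode": "94043"},
                        request={"useragent": "Mozilla/5.0"})
```

Keeping this mapping outside the model is what allows the config and the model to be versioned separately, as described for the Escrow Config above.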
[0026] In a single or multiple embodiments, as escrow record data is modified, this may result in execution of the same model again with different data of the same type for that model, which may or may not result in a different tax return topic ranking. According to another embodiment, changes to an escrow record may trigger execution of a different model, which may result in a different tax return topic ranking.
[0027] In a single or multiple embodiments, a tax return topic ranking system is configured to execute the model independently, asynchronously, or in an uncoordinated fashion relative to the on-line tax return preparation application writing electronic tax return data to the first data store. Thus, after a tax return topic ranking system is initiated for a particular user in response to receiving a "start" signal from the online tax return preparation application indicating that a new online session has been initiated, the online tax return preparation application and the tax return topic ranking system are independent of each other and do not coordinate with each other until a request is made for a tax topic ranking; before that point, these components are not coordinated or in synchronization with each other.
BRIEF DESCRIPTION OF THE DRAWINGS
[0028] FIG. 1 is a block diagram of a computerized tax return preparation system constructed according to one embodiment including a ranking system that executes a model to determine a ranking of tax return topics for presentation in an interview screen generated by an online tax return preparation application;
[0029] FIG. 2 depicts how system embodiments may be utilized to change how tax return topics are presented in interview screens that may be presented to a user at a later time relative to or based at least in part upon a currently displayed interview screen or other electronic tax return data in order to provide a more personalized listing of tax return topics for the user;
[0030] FIG. 3 is a system flow diagram showing components of computerized tax return preparation systems constructed according to embodiments and aspects of computer-implemented methods for providing an escrow-based personalization system that can be utilized to provide tax return topic rankings to an online tax return preparation application to personalize tax return preparation experiences;
[0031] FIG. 4 is a system flow diagram illustrating one manner in which a polling service may be implemented and operate according to embodiments;
[0032] FIG. 5 is a system flow diagram illustrating how components shown in FIG. 3 can operate in a mode in which model execution to generate a tax return topic ranking is based on retrieval of selected data from a persistent data store managed by an online tax return preparation application;
[0033] FIG. 6 is a system flow diagram illustrating how components shown in FIG. 3 can operate in a mode in which model execution to generate a tax return topic ranking is based in part upon selected data retrieved from a persistent data store managed by an online tax return preparation application and selected data received from or derived from a request for a tax return topic ranking made by an online tax return preparation application;
[0034] FIG. 7 is a system flow diagram illustrating how components shown in FIG. 3 can operate in a mode in which the online tax return preparation application sends "heartbeat" signals to trigger analysis of data for execution of a model to generate a tax return topic ranking;
[0035] FIG. 8 is a system flow diagram illustrating one manner in which an escrow controller can be configured and operate according to embodiments involving a "heartbeat" signal and thread pools;
[0036] FIG. 9 is a block diagram of components of a computer system that may be programmed or configured to execute embodiments; and
[0037] FIGS. 10A-B are system flow diagrams illustrating embodiments that are utilized to determine personalization experiences.
DETAILED DESCRIPTION OF ILLUSTRATED EMBODIMENTS
[0038] Embodiments of the invention relate to how a personalized list of topics or menu options is generated, communicated to a tax return preparation application (such as turbotax.com), and presented to a user, and how topic or menu item rankings can change as a user prepares an electronic tax return. Embodiments of the invention may be utilized with or be integrated into computing systems or applications such as on-line tax return preparation applications to provide taxpayer users with a list of tax topics or menu options that are arranged in a particular ranking, sequence or order and to change the order of menu options or list of tax topics as a taxpayer user enters or changes data during preparation of an electronic tax return. Thus, as an electronic tax return is prepared, the order or sequence of tax topics presented to taxpayer users is also changed to reflect changes or updates to the tax return data. Embodiments of the invention may also be utilized in other on-line or networked systems such as television systems or on-line movie systems (such as netflix.com) to provide viewers with a list of viewing options that are arranged in a particular ranking sequence or order. For example, as a user views various television programs or movies, an order or sequence of other available television programs or movies or types thereof (e.g., comedy, horror, romance, etc.) may be changed as the user browses or views various programs or movies.
[0039] For on-line tax return preparation applications, as an example, real time topic ranking changes are implemented using a computing system that utilizes only a small portion of data in a "master" data store managed by the on-line tax return preparation application, while monitoring the master data store for changes to data for users that are currently logged into the on-line tax return preparation system. This small portion of data is escrowed or cached, or temporarily stored, to another, independent data store that the online tax return preparation application does not and cannot access. As data is escrowed, cached or temporarily stored, the system determines whether there is a model that can be triggered to determine a tax return topic ranking for one or more interview screens, and when these model requirements are satisfied, and the models are executed, the resulting rankings that are generated are provided to the on-line tax return preparation application.
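
The monitoring step in paragraph [0039] can be illustrated with a single polling pass that reads only the required data types from the master store, compares them with the last known baseline, and emits a change event when something differs. This is a minimal sketch assuming a callable `fetch` for the master store and a callable `publish` for the event channel; both, along with the event layout, are hypothetical.

```python
_MISSING = object()  # sentinel so that a stored None still counts as a value


def poll_once(fetch, required_types, baseline, publish):
    """Run one polling pass and return the baseline to use next time.

    Only the required data types are copied out of the fetched data. If
    publish raises, the exception propagates and the caller keeps the old
    baseline, so the change is detected again on the next pass.
    """
    current = fetch()
    snapshot = {k: current[k] for k in required_types if k in current}
    changed = {k: v for k, v in snapshot.items()
               if baseline.get(k, _MISSING) != v}
    removed = [k for k in baseline if k not in snapshot]
    if not changed and not removed:
        return baseline  # nothing new; no event is sent
    publish({"changed": changed, "removed": removed})
    return snapshot


events = []
master = {"zipCode": "94043", "notes": "not needed by any model"}
required = {"zipCode", "wages"}

baseline = poll_once(lambda: master, required, {}, events.append)
master["wages"] = 50000
baseline = poll_once(lambda: master, required, baseline, events.append)
```

After the two passes, `events` holds one event for the initial `zipCode` value and one for the newly entered `wages` value; the unrelated `notes` field never leaves the master store.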
[0040] Thus, when a user navigates to other interview screens, tax topics that can be selected within the interview screens, whether presented in a menu or other form, are arranged according to the determined ranking rather than according to a static, default or fixed sequence. As a simplified example, based on the data retrieved from the "master" data store, execution of a model may change the order of "income" topics in an interview screen that may be set to be displayed next from a default or static order that begins with "wages" (e.g., for Form W-2) to a new order of: dividends, capital gains, business income, interest, rental income, etc., thus not only resulting in modification of the user interface presented to the user, but also providing the user with a more personalized and pertinent tax return preparation experience.
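
The reordering in paragraph [0040] can be sketched as a small function that arranges the topics of a screen according to a model ranking and falls back to the static order when no ranking exists. The function name and the choice to append unranked topics in their default order are assumptions; the original does not specify how unranked topics are handled.

```python
def order_topics(default_order, ranking):
    """Arrange a screen's topics by ranking, keeping unranked ones last.

    With no ranking, the static default order is returned unchanged.
    Topics in the ranking that the screen does not offer are ignored.
    """
    if not ranking:
        return list(default_order)
    # dict.fromkeys removes duplicates while preserving ranking order
    ranked = [t for t in dict.fromkeys(ranking) if t in default_order]
    unranked = [t for t in default_order if t not in ranked]
    return ranked + unranked


default_income_topics = ["wages", "interest", "dividends", "capital gains",
                         "business income", "rental income"]
model_ranking = ["dividends", "capital gains", "business income",
                 "interest", "rental income"]
screen_order = order_topics(default_income_topics, model_ranking)
```

With these inputs the screen starts with dividends and capital gains, and "wages", which the ranking did not mention, moves to the end.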
[0041] As the user continues to prepare the electronic tax return, the user's data changes may result in re-execution of the same model or execution of other models which, in turn, may result in changes to how tax topics are presented in various interview screens.
[0042] Thus, embodiments not only transform tax topic sequences during preparation of an electronic tax return, but also transform a pre-determined, static user experience into a dynamic, personalized experience, and do so utilizing only small portions of the "master" data, which leads to reduced demands on computing resources compared to other systems and methods that require additional data and interactions. Further aspects of embodiments are described with reference to FIGS. 1-10B.
[0043] Referring to FIG. 1, in a computer system 100 constructed according to one embodiment, an online tax return preparation application 111 is managed by a host computer 110. The online tax return preparation application 111 is accessed by respective user computing devices 120a-c (generally, computing device) executing respective internet browsers 122a-c (generally, browser) through respective networks 130a-c (generally, network 130) utilizing the online tax return preparation application's Uniform Resource Locator (URL) address. Interview screens 112a-c (generally, interview screen) generated by the online tax return preparation application 111 are presented to respective users through screens or displays 124 of the computing devices 120. Users interact with the displayed interview screens 112 by entering or importing tax return related data into respective fields thereof to prepare respective electronic tax returns.
[0044] Examples of browsers 122 that may be utilized for this purpose include INTERNET EXPLORER, GOOGLE CHROME and MOZILLA FIREFOX browsers. A user computing device 120 may be a desktop computer, a laptop computer, or a mobile communication device such as a smartphone or a tablet computing device. Examples of networks 130 that may be utilized for communications between system components include but are not limited to a Local Area Network (LAN), a Wide Area Network (WAN), a Metropolitan Area Network (MAN), a wireless network, other suitable networks capable of transmitting data, and a combination of such networks. For ease of explanation, reference is made to a network 130 generally, but various networks, combinations of networks and communication systems, methods and protocols may be utilized.
[0045] In the illustrated embodiment, the online tax return preparation application 111 manages a "master" or "global" persistent data store 141 (otherwise referred to as a "first" data store 141) for respective electronic tax return data or electronic tax returns of respective users of the online tax return preparation application 111. The first or master data store 141 may include data for millions of electronic tax returns for many users for many years. Thus, it is important for this first data store 141 to be a secure, robust and persistent storage of many taxpayer files. The first data store 141 may be hosted by the same computer 110 or may be accessible by the on-line tax return application 111 through a communications network 130 (not illustrated). During use, when a user logs into the online tax return preparation application 111 or initiates an on-line session to begin or continue preparation of an electronic tax return, the online tax return preparation application 111 writes electronic tax return data to the first data store 141 to reflect the data entered, changed or imported (e.g., from a computerized financial management system such as mint.com), reads electronic tax data from the first data store 141 and generates interview screens 112 and tax forms populated with the data.
[0046] In the illustrated embodiment, a tax return topic ranking system 150 (generally, ranking system) is shown in FIG. 1 as being in communication with the on-line tax return preparation application 111 and the first data store 141 managed thereby, e.g., through respective communication networks 130 (not illustrated) or other communication channels. The ranking system 150 may also be a component or module of, or a plug-in into, the online tax return preparation application 111 depending on the system 100 configuration employed. For ease of explanation, the ranking system 150 is shown as being an independent component in communication with the online tax return preparation application 111 and first data store 141 managed by the online tax return preparation application 111.
[0047] The ranking system 150 can access the first data store 141 to read only certain types of data 141d ("d" referring to "data"), but is not able to write any data to the first data store 141 given integrity requirements of storing electronic tax return data as noted above. When a user has initiated an on-line session or logged into the online tax return preparation application 111, the algorithm performed by the ranking system 150 includes retrieving a small portion of the first data store data 141d for that particular user, i.e., for a specific on-line session, and caching, escrowing or temporarily storing the retrieved data 141d locally until the ranking system 150 determines that conditions or requirements 162 reflected in a specification file or escrow contract concerning an executable model 160 are satisfied such that the model 160 can be executed. The conditions or requirements 162 dictate when a model 160 is allowed to execute.
[0048] The model 160 may be a customized model or a model such as a predictive model, examples of which include logistic regression; naive bayes; k-means classification; K-means clustering; other clustering techniques; k-nearest neighbor; neural networks; decision trees; random forests; boosted trees; k-nn classification; kd trees; generalized linear models and support vector machines. For ease of explanation, reference is made to "model."
[0049] If model requirements 162 are not satisfied given a current set of data retrieved 141d from the first data store 141 and cached or locally stored by the ranking system 150 in a second data store or cache 142 (second data store), the data 141 d remains cached or escrowed for future use for subsequent iterations of checking for a different model 160 that can be executed or when the first data store 141 is updated or changed and those updates or changes are retrieved and cached locally by the ranking system 150. Then, a model 160 may be able to execute with the retrieved data updates 141d.
[0050] Thus, if the on-line tax return preparation application 111 issues a request 113 to the ranking system 150 for a tax return related topic ranking 161 , e.g., for an interview screen 112 determined to be displayed to a user next given the interview screen 112 that is currently being displayed, the response 153 to the request 113 indicates that there is no result, and the on-line tax return preparation application 111 may utilize the prior or default ranking. In certain embodiments, the data utilized for model 160 execution is only data 141d that was previously retrieved from the first data store 141 and cached locally by the ranking system 150 in the second data store 142. In another embodiment, data utilized for model 160 execution may come from multiple sources such as cached or locally stored data 141 d that was previously retrieved from the first data store 141 and data of or derived from the request 113 by the on-line tax return application 111.
[0051] For example, there may be instances in which a model 160 requires ten types of data and nine of the ten types of data 141d have been retrieved from the first data store 141 and are stored in the second data store 142. Thus, while all of the required data 141d that is available from the first data store 141 has been retrieved, the model 160 still cannot execute until the final data is received or derived from the request 113 by the on-line tax return preparation application 111, after which the model 160 can be executed.
[0052] If data requirements for model 160 execution are satisfied based on the current set of data retrieved 141d from the first data store 141 and cached 142 by the ranking system 150, or based on a combination of data of the second data store 142 and another source such as the request 113, the model 160 is executed to generate a result 161 in the form of a ranking or sequence of tax topics 161. When the online tax return preparation application 111 issues a request 113 for a tax topic ranking, e.g., based on one or more screens 112 that are determined to be displayed next or later in a sequence of interview screens 112 given the interview screen 112 that is currently being displayed, the generated ranking 161 can be communicated to the online tax return preparation application 111 in response to the request 113.
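The ten-input example above can be illustrated with a minimal sketch of the readiness check. The function names, the `field_N` input names and the use of plain dictionaries for the second data store 142 and the request 113 are assumptions made for illustration only; the specification does not prescribe an implementation.

```python
from typing import Dict, Optional, Set


def missing_inputs(required: Set[str],
                   cached: Dict[str, object],
                   request_data: Optional[Dict[str, object]] = None) -> Set[str]:
    """Return required input names found in neither the cache nor the request."""
    available = set(cached)
    if request_data:
        available |= set(request_data)
    return required - available


def can_execute(required: Set[str],
                cached: Dict[str, object],
                request_data: Optional[Dict[str, object]] = None) -> bool:
    """A model may run only when no required input is missing."""
    return not missing_inputs(required, cached, request_data)


# Hypothetical model that needs ten inputs; nine are already escrowed.
required = {f"field_{i}" for i in range(10)}
cached = {f"field_{i}": i for i in range(9)}

print(can_execute(required, cached))                  # False
print(can_execute(required, cached, {"field_9": 9}))  # True
```

In this sketch the tenth input arrives with the request, which mirrors the case in which cached data and request data are combined before the model 160 is executed.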
[0053] Examples of tax return topics that may be processed by embodiments include, but are not limited to, income, deductions and taxes paid. Embodiments may also be utilized to rank sub-topics of a topic, e.g., certain types of income, such as wages or other income such as business income, interest, dividends, pension income, annuities, rental income, unemployment compensation, capital gains, gambling income, farming and fishing income, clergy earnings, social security, scholarships, alimony, canceled debt and 401k distributions. As another example, sub-topics for the topic of "deductions" may include medical and dental expenses, home mortgage points, interest expenses, charitable contributions, business related expenses (car, travel, etc.), educational expenses and property taxes. As used herein, "topic" is defined as including both such "topics" and "sub-topics" and further drilling down into additional sub-topic levels, and a tax return topic ranking may include only "root" node topics (e.g., income, deductions, etc.), a combination of "root" and "leaf" node topics / sub-topics (e.g., income, property tax, mortgage interest), only "leaf" node sub-topics of a certain topic (e.g., property taxes paid, points, mortgage interest), "leaf" node sub-topics of different topics (e.g., property taxes paid, points, mortgage interest, wages, dividends), or a combination of one or more of topics, sub-topics and further drilled down or lower level, more specific sub-topics. "Topic" as defined herein includes such "topics" and "sub-topics" of various levels of specificity, and a resulting ranking 161 may include "topics" of one or more levels of specificity or root and/or leaf node level topics or sub-topics.
A ranking 161 may include a small number of topics (e.g., three or four), or a larger number of topics (e.g., 10-20), the order of which such topics are displayed to a user being reconfigurable in real-time during preparation of an electronic tax return with embodiments of the invention and is reflective of the user's electronic tax return data managed by the online tax return preparation application 111.
[0054] Thus, with embodiments, when a user navigates to that associated interview screen 112, the tax return topics in that interview screen 112 that would normally be presented in an interface including a first sequence, order or ranking (e.g., Rank 1, which may be a static or fixed sequence) are transformed into another interface including a different, second sequence, order or ranking (e.g., Rank 2) such that the second ranking 161 generated by model execution 160 is personalized or customized for the user, while utilizing a very small portion of first data store data 141d.
[0055] The ranking 161 that is generated and communicated to the on-line tax return preparation application 111 in response to the request 113 may be one that was previously generated and locally stored or cached, waiting to be delivered to the online tax return application 111 in response to a request 113. In another embodiment, if a requested 113 ranking has not been generated, the ranking system 150 may then execute a model 160 (assuming the requisite data for the model 160 has been collected and is stored in the second data store 142 per the specification file 162), and the resulting tax topic ranking 161 generated by execution of the model 160 is provided to the online tax return preparation application 111 in response to the request 113 and may be locally stored in a third data store 143 for future reference.
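The request handling described above (return a stored ranking, otherwise execute the model if its data is available, otherwise report no result so that the application keeps its default order) might be sketched as follows. The class name `RankingService`, the toy model, the `has_brokerage_account` input and the default topic order are all hypothetical, and in-memory dictionaries stand in for the second and third data stores.

```python
from typing import Callable, Dict, List, Optional, Set

# Hypothetical static order used when no personalized ranking is available.
DEFAULT_INCOME_ORDER = ["wages", "interest", "dividends", "capital gains"]


def toy_model(data: Dict[str, object]) -> List[str]:
    """Illustrative stand-in for a model 160: promote investment topics."""
    first = ["dividends", "capital gains"] if data["has_brokerage_account"] else []
    return first + [t for t in DEFAULT_INCOME_ORDER if t not in first]


class RankingService:
    def __init__(self, required: Set[str],
                 model: Callable[[Dict[str, object]], List[str]]):
        self.required = set(required)
        self.model = model
        self.escrow: Dict[str, object] = {}       # stands in for the second data store
        self.results: Dict[str, List[str]] = {}   # stands in for the third data store

    def handle_request(self, screen: str) -> Optional[List[str]]:
        """Return a ranking for the screen, or None when there is no result."""
        if screen in self.results:                # previously generated and stored
            return self.results[screen]
        if self.required <= set(self.escrow):     # requisite data has been collected
            ranking = self.model(self.escrow)
            self.results[screen] = ranking
            return ranking
        return None


def topics_for_screen(service: RankingService, screen: str) -> List[str]:
    """Application side: fall back to the default order when there is no result."""
    ranking = service.handle_request(screen)
    return ranking if ranking is not None else DEFAULT_INCOME_ORDER


service = RankingService({"has_brokerage_account"}, toy_model)
print(topics_for_screen(service, "income"))   # default order, nothing escrowed yet
service.escrow["has_brokerage_account"] = True
print(topics_for_screen(service, "income"))   # personalized order
```

The sketch omits invalidation of stored rankings when escrowed data changes; a fuller version would re-execute the model in that case, as described for data changes during the session.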
[0056] Referring to FIG. 2, the user may be interacting with a currently displayed interview screen 112c ("c" referring to "currently" displayed), and based on a predetermined, predicted or potential sequence of interview screens 112p (that are not currently displayed) given the currently displayed interview screen 112c, the online tax return preparation application 111 may request rankings 161 for these other interview screens 112p that are not currently displayed 112c but that will be, or might be, displayed. As shown in FIG. 2, rankings 161 may be generated for various numbers of "future" or "potential" interview screens 112p (Screen1 - Screen3 112p that are not currently displayed are illustrated) and various numbers of rankings 161 (Rank1 - Rank3 are illustrated) generated by respective models 160 that apply to the interview screens 112.
[0057] While embodiments are described with reference to how rankings 161 may be generated for future potential interview screens 112p that are not currently displayed, embodiments may also be executed concurrently for multiple users, thus providing a personalized tax return preparation experience for each user. Thus, with reference again to FIG. 1, the online tax return preparation application 111 may issue Requests 1-3 for respective users U1-U3, and each request may involve one or multiple topic rankings for interview screens that will or may be displayed to the user after a currently displayed interview screen. The online tax return preparation application 111 may also issue Requests 1-3 for respective interview screens for a particular user. Examples of specific embodiments are described in further detail with reference to FIGS. 3-10B.
[0058] Referring to FIG. 3, a computerized tax return topic ranking system 150/350 (generally, ranking system 350) constructed according to one embodiment interfaces with a tax return preparation application 111/311 and comprises an escrow controller 320, an escrow or model specification file 162/362, a selective data retrieval service 380, a model processor or model executor 370, and a second data store 142/342 to which selected data for a logged-in user, retrieved from a first or "master" data store 141/341, is written and temporarily retained as an escrow record 345. For each on-line session, an escrow record 345 is generated for each user and for each model 160/360 applicable to each user. Models 360 and escrow contracts or model specification files 362 are activated or made available for use via a publication service 365.
[0059] Certain embodiments also include a third data store 143/343 that may be dedicated to storing a current tax return model ranking 161/361. System components are described in further detail together with how they work together with other system components to implement computer-implemented methods for determining and communicating tax return topic rankings 361 that are communicated to an on-line tax return preparation application 311.
[0060] The escrow controller 320, also referred to as a Decision Engine (DE) Escrow Service in FIG. 3, is a central, intermediate control element that interfaces with the on-line tax return preparation application 311 and manages escrow of retrieved data, model 360 execution, when a model 360 can be executed based on an escrow contract or specification file 362, and how a tax return topic ranking 361 is retrieved and communicated to the on-line tax return preparation application 311. The escrow controller 320 generates escrow records 345 for data retrieved from the first or master data store 341 and manages those escrow records 345 in the second data store 342, to which the retrieved data is cached or temporarily stored. There is one escrow record 345 per active user session, per active model 360. Thus, as data is collected from the master or first data store 341, it is temporarily held or stored in the second data store 342 and accumulated to satisfy certain model execution requirements 362. In this way, the data retrieved from the first data store 141 is held "in escrow" until such time as it can be used to execute a model 160 to generate a result 161 in the form of a tax return topic ranking.
[0061] The escrow record 345 that is generated by the escrow controller 320 and stored to the second data store 342 includes a user identifier such as a ticket or token generated for the on-line session when the user logged into the on-line tax return preparation application 311 , a model identifier, which identifies the model 360 to which the escrow record 345 pertains (since an escrow record 345 is generated for each user and each model 360 applicable to each user), a description, such as a canonical description or unique identifier of required data and its location in the second data store 342 for the associated model 360, and which will be used to build a payload that is provided to a model execution service or model processor 370 by the escrow controller 320, and a description, such as a canonical description, of the location to which the generated result or model output 361 (in the form of a ranking of tax topics), is to be written, e.g., in a third data store 343 dedicated to storage of model results 361.
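One possible in-memory shape for the escrow record 345 described above is sketched below. The field names, the key format used for locations in the second and third data stores, and the `build_payload` helper are assumptions for illustration; the specification only lists what the record contains, not how it is encoded.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class EscrowRecord:
    session_token: str               # ticket or token for the on-line session
    model_id: str                    # model to which this record pertains
    input_locations: Dict[str, str]  # required input name -> key in the second data store
    result_location: str             # where the ranking is written (third data store)

    def build_payload(self, cache: Dict[str, object]) -> Dict[str, object]:
        """Collect the inputs currently present in the cache for the model processor."""
        return {name: cache[key]
                for name, key in self.input_locations.items()
                if key in cache}


# Hypothetical keys; one record exists per active session and per active model.
record = EscrowRecord(
    session_token="tok-123",
    model_id="income-ranker",
    input_locations={"zip": "tok-123/zip", "filing_status": "tok-123/filing_status"},
    result_location="results/tok-123/income-ranker",
)
second_store = {"tok-123/zip": "94043"}
print(record.build_payload(second_store))   # {'zip': '94043'}
```

Because the record stores locations rather than values, the escrowed copy is never treated as the source of truth; values are looked up in the cache only when a payload is built.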
[0062] An escrow record 345 is never the source of truth for a particular value for a particular customer; instead, the source of truth will always be the location from where the data was initially retrieved. The set of user data managed by the escrow controller 320 is a much smaller subset of the user data managed by the collection of "source of truth" data stores, i.e., the first or "master" data store 341 in the illustrated embodiment. Whereas a "master" or first data store 341 is charged with managing all data for all users of the online tax return preparation application 311 for all time, embodiments operate based only on a sliver or very small set of that data - only for users that are currently logged into the online tax return preparation application 311, and only for types of data required by models 360 that have been activated or published via a publication service 365. Given the very small amount of data of the first data store 141 that is utilized, embodiments provide for a system that utilizes data stores 342 that may be a smaller, upper bounded, low latency memory based cache rather than the first data store 141, which is a very large, growing and permanently persistent storage mechanism.
[0063] A computer-executable model 360 is executable by the model processor or execute service 370 utilizing cached data of escrow records 345 for that model 360 to generate a result 361 in the form of a ranking of tax return topics. A model 360 includes data such as a unique model identifier and a description, e.g., a canonical description expressed in XML Schema Definition (XSD), of the format of data inputs for the model 360, and a specification file or escrow contract 362 specifies the types and locations of data needed to execute a model 360. For example, a specification file 362 may list the canonical location of a piece of data, e.g., it lists the zip code not just as the word "zip" but as an instruction that the escrow controller 320 can read and interpret to determine how to fetch the "zip" data from the second data store 342, and it indicates when a call can be made by the escrow controller 320 to the model processor 370 to execute the model 360.
[0064] The escrow contract 362 also indicates whether certain inputs are required or optional. For example, a model 360 may require 10 total inputs to execute, and may only execute on those specified 10 inputs, or a model may require 10 total inputs to execute, but if an escrow record 345 includes other types of data, these other types of data may also serve as inputs to a model 360 to supplement the other required data. As shown in Fig. 3, the specification file 362 is provided to the escrow controller 320, but not the model processor 370, and the model 360 is provided to the model processor 370 but not the escrow controller 320. Thus, system embodiments involve a distributed model execution system in which execution of the model 360 is separate of determinations of whether the model 360 should even be executed, and if so, how the model 360 should be executed (i.e., how to locate the data for the model 360 in the second data store 342). Thus, the model processor or execution service 370 is a computer processor component that, once called by the escrow controller 320 to execute a model 360 upon satisfying the requirements specified by the specification file 362, provides an execution result 361 in the form of a ranking of tax return topics to the escrow controller 320 based on the cached data provided to it.
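The separation described above, in which the escrow controller holds only the contract and the model processor holds only the model, can be sketched as two small classes. All names, the toy deduction model and the example inputs are hypothetical; the sketch only illustrates required versus optional inputs and the division of responsibility.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class EscrowContract:
    required: Tuple[str, ...]
    optional: Tuple[str, ...] = ()


class ModelProcessor:
    """Holds models only; it never sees an escrow contract."""

    def __init__(self, models: Dict[str, Callable[[dict], List[str]]]):
        self._models = models
        self.last_payload: Optional[dict] = None

    def execute(self, model_id: str, payload: dict) -> List[str]:
        self.last_payload = payload
        return self._models[model_id](payload)


class EscrowController:
    """Holds contracts only; decides whether to call the processor and with what data."""

    def __init__(self, contracts: Dict[str, EscrowContract], processor: ModelProcessor):
        self._contracts = contracts
        self._processor = processor

    def try_execute(self, model_id: str, escrowed: dict) -> Optional[List[str]]:
        contract = self._contracts[model_id]
        if any(name not in escrowed for name in contract.required):
            return None                      # requirements not satisfied yet
        names = list(contract.required) + [n for n in contract.optional if n in escrowed]
        return self._processor.execute(model_id, {n: escrowed[n] for n in names})


def deduction_model(payload: dict) -> List[str]:
    """Illustrative stand-in for a model 360."""
    topics = ["charitable contributions", "medical expenses"]
    if payload["owns_home"]:
        topics.insert(0, "mortgage interest")
    if payload.get("has_student_loan"):
        topics.append("educational expenses")
    return topics


processor = ModelProcessor({"deductions": deduction_model})
controller = EscrowController(
    {"deductions": EscrowContract(required=("owns_home",), optional=("has_student_loan",))},
    processor,
)
print(controller.try_execute("deductions", {}))                   # None
print(controller.try_execute("deductions", {"owns_home": True}))
```

Only inputs named in the contract reach the processor in this sketch, so unrelated escrowed values are not forwarded to the model.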
[0065] Embodiments also utilize a selective data retrieval service 380 that is also controlled by the escrow controller 320 for the purpose of retrieving pre-determined types of data for a currently logged in user from the first or master data store 341 and to check for changes in certain types of data in the first data store 341. The data retrieved from the first data store 341 is utilized by the escrow controller 320 to generate an escrow record 345 or to update an escrow record 345 given a data change, and to monitor or check the first data store 341 for changes in predetermined types of data for a particular user so that corresponding escrow records 345 can be updated in the second data store 342. While the second data store 342 is utilized by the escrow controller 320, the second data store 342 is not accessible by the online tax return preparation application 311. This is further illustrated by the dashed line in FIG. 3 that surrounds ranking system 350 components other than the online tax return preparation application 311 and the first data store 341 managed thereby.
[0066] Examples of suitable caching applications that may be utilized in embodiments include memcached and redis, which provide for the second data store 342 to serve in the role of a low latency remote persistence mechanism to which user data (which is a small fraction of the amount of data stored by the first data store 341) is stored and indexed for fast access and retrieval at decision computation time. The second data store or cache 342, as its name implies, is a cache of all the data required by all models 160 that will be used to compute rankings or decisions 361 for all the currently active users (i.e., users with an active on-line session, but not users that are not currently logged in).
In other words, the limited scope and role of the second data store 342 can be described through its contrast with the first data store 341, which is a runtime persistent storage component responsible for maintaining all online tax return preparation application 311 data for all users over all time. This is in sharp contrast to the second data store 342, which needs only to maintain a small subset of that data for active users and only for the duration of the active user's session. Measured in volume of data managed, this means the scope of data and responsibilities of the second data store 342 are much smaller than those of the first data store 341. As such, embodiments allow for more precise and tailored implementation decisions when choosing hardware and software that will be used to implement the second data store 342 cache component.
[0067] One type of data retrieval service 380 that may be utilized to retrieve data from the first data store 341 and cache it to the second data store 142 is a polling service that manages one or more pollers that are active for a user's on-line session to retrieve the user's data from the first data store 341 and to detect when pre-determined data fields in the first data store 341 have been changed, e.g., added, deleted or modified. A poller is an object that works on behalf of a single active user model 360 to gather data from the first data store 341. A poller may include or utilize data such as a user identifier (such as an authid or user name and a ticket or token or password) for the on-line session, a collection of fields in the first data store 341 to poll or monitor, and a time during which the poller is active or how long the poller should continue to monitor the specified fields of the first data store 341. A poller can be configured to monitor or check the first data store 341 at certain times or periodically with a scheduler. Pollers are configurable such that frequency of polling can be modified to increase or decrease the frequency of fetching data or to run off or terminate the poller after a predetermined time.
[0068] Embodiments may utilize different types of polling services or polling mechanisms. In one embodiment, the escrow controller 320 submits a subscription 322 to changes in a specified data field of the first data store 341 such that all modifications to this data field by the on-line tax return preparation application 311 will result in a change event being fired and received by escrow controller 320. This "subscribe" configuration involves an ongoing, continuous subscription 322. In addition to an ongoing subscription 322 to get access to all subsequent changes to a data field of the first data store 141, the data field will also be fetched once at the time of the creation and initialization for purposes of generating an escrow record 345.
[0069] This will allow the system to "catch up" to the last known state of the data field, and then stay caught up while the subscription 322 is active. This procedure can be used for any data field that is likely to change during the course of a user's session.
[0070] According to another embodiment, the polling service 380 utilizes a subscription 322 as described above, but rather than "subscribe" generally, the subscription is limited to "subscribe until found." Thus, in this embodiment, the subscription 322 is active during monitoring of a specified data change in the first data store 141, but when that data change in the form of a change from an empty field to a populated field of the first data store has been detected and the data has been cached to the second data store 142, the subscription 322 can be terminated. A "one time" subscription may also be employed to detect a change in a populated field to different data, after which the subscription 322 is terminated. According to yet another embodiment, the polling system 380 may be configured as a "fetch once" system such that a data field will be preemptively retrieved from the system and provided to the escrow controller 320. This configuration may be suitable for data items of the first data store 141 that are highly unlikely to change during the duration of the escrow controller's 320 operation and on-line session. For instance, any data representing an unchanging fact (i.e., "did an event occur at some point in the distant past?") would likely be appropriate to retrieve with this strategy. This strategy also involves fewer computing resources than strategies involving a subscription.
[0071] While it will be understood that various polling systems 380 may be utilized, for ease of explanation, reference is made to a polling system 380 generally, and one that utilizes a subscription 322 to changes of data in the first data store 141 as shown in FIG. 3.
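The three retrieval strategies discussed above ("subscribe", "subscribe until found" and "fetch once") differ mainly in when monitoring of a field stops. The following minimal sketch makes that difference concrete; the `FieldWatcher` class, the `Strategy` names and the dictionaries standing in for the first and second data stores are assumptions, and polling is simulated by explicit calls rather than by a scheduler or change events.

```python
from enum import Enum
from typing import Dict


class Strategy(Enum):
    SUBSCRIBE = "subscribe"                           # follow every change for the session
    SUBSCRIBE_UNTIL_FOUND = "subscribe_until_found"   # stop once the field is populated
    FETCH_ONCE = "fetch_once"                         # read a single time, then stop


class FieldWatcher:
    def __init__(self, field: str, strategy: Strategy):
        self.field = field
        self.strategy = strategy
        self.active = True
        self._last = None

    def poll(self, master: Dict[str, object], cache: Dict[str, object]) -> bool:
        """Copy the field into the cache if it changed; return True on an update."""
        if not self.active:
            return False
        value = master.get(self.field)
        changed = value != self._last
        if changed:
            self._last = value
            if value is None:
                cache.pop(self.field, None)   # field was cleared in the master store
            else:
                cache[self.field] = value
        if self.strategy is Strategy.FETCH_ONCE:
            self.active = False
        elif self.strategy is Strategy.SUBSCRIBE_UNTIL_FOUND and value is not None:
            self.active = False
        return changed


master, cache = {}, {}
watcher = FieldWatcher("zip", Strategy.SUBSCRIBE_UNTIL_FOUND)
watcher.poll(master, cache)        # field still empty, watcher stays active
master["zip"] = "94043"
watcher.poll(master, cache)        # value found and cached, watcher deactivates
master["zip"] = "10001"
watcher.poll(master, cache)        # ignored: subscription already terminated
print(cache, watcher.active)       # {'zip': '94043'} False
```

A `SUBSCRIBE` watcher in the same sketch would have picked up the second value as well, at the cost of remaining active for the whole session.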
[0072] Thus, embodiments that utilize a data retrieval service 380 in the form of a polling service (as illustrated in FIG. 3), which may take various forms as noted above, can be utilized in the event that the first data store 341 does not support native automated data feeds that indicate to the escrow controller 320 which fields have been changed or updated. Where such feeds are supported, rather than using a polling mechanism, the first data store 341 may be configured to automatically output a data feed in the form of a stream of information identifying changes to data to which the escrow controller 320 has subscribed.
[0073] In embodiments that utilize a data retrieval service 380 in the form of a polling service, there is one poller for every active escrow record 345, which in turn will exist for every active user and for every active model 360. The escrow controller 320 manages the lifecycle of a poller, which may be in the form of independent threads of execution across a large cluster of machines. This independence and statelessness of pollers allows for horizontal scaling of the machines, and the escrow controller 320 ensures that there is a single poller per unique active model for an on-line session.
[0074] The system flow diagram shown in FIG. 3 illustrates one example of a selective data retrieval service 380 in the form of a poller service and poller management and how the escrow controller 320 interacts with a polling service 380. One embodiment of a poller-based data retrieval service 380 includes a polling manager or service, which includes a collection of poller objects which, as noted above, work on behalf of a single active user model to gather data from the first data store 341, a collection of buckets, which are sets of poller objects executed for each run of a poller executor, which is activated (e.g., periodically, such as every second) to select a bucket of poller objects from the poller manager for execution in an asynchronous manner, and a timing mechanism to implement poller expiration such that a poller is not active beyond a certain time.
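A rough sketch of the bucket and expiration mechanics described above follows. The `PollerManager` class, its balancing rule (new pollers go to the smallest bucket) and the round-robin bucket selection are assumptions; the specification does not say how pollers are assigned to buckets. The sketch is synchronous and takes the current time as an argument, whereas the described executor runs buckets asynchronously on a timer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Poller:
    session_token: str      # identifies the on-line session
    model_id: str           # one poller per active model per session
    fields: List[str]       # fields of the first data store to monitor
    expires_at: float       # time after which the poller is no longer active


class PollerManager:
    def __init__(self, bucket_count: int):
        self._buckets: List[List[Poller]] = [[] for _ in range(bucket_count)]
        self._next = 0

    def add(self, poller: Poller) -> None:
        """Place the poller in the currently smallest bucket (assumed policy)."""
        min(self._buckets, key=len).append(poller)

    def next_bucket(self, now: float) -> List[Poller]:
        """Drop expired pollers from the next bucket and return the remainder."""
        bucket = self._buckets[self._next]
        bucket[:] = [p for p in bucket if p.expires_at > now]
        self._next = (self._next + 1) % len(self._buckets)
        return list(bucket)


manager = PollerManager(bucket_count=2)
p1 = Poller("tok-1", "income-ranker", ["wages"], expires_at=100.0)
p2 = Poller("tok-2", "income-ranker", ["wages"], expires_at=100.0)
p3 = Poller("tok-1", "deduction-ranker", ["owns_home"], expires_at=5.0)
for p in (p1, p2, p3):
    manager.add(p)

# Each executor run (e.g., once per second) takes one bucket.
first_run = manager.next_bucket(now=0.0)    # p1 and p3
second_run = manager.next_bucket(now=1.0)   # p2
third_run = manager.next_bucket(now=10.0)   # p1 only; p3 has expired
print(len(first_run), len(second_run), len(third_run))
```

Spreading pollers across buckets bounds the work done in any single executor run, and the expiration filter keeps a poller from outliving its configured lifetime.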
[0075] FIG. 4 is a system flow diagram reflecting system components and an algorithm for one manner in which a polling system 380 is executed and how the polling system 380 interacts with the escrow controller 320 ("Escrow Service" in FIG. 4). At 402, the escrow controller 320 sends a "create poller" request to a PollerCreation message queue 450, and at 404, this message is passed to a single poller application in the cluster 452, and the poller application 454 instantiates one poller process 456 and executes it according to instructions in the escrow controller's 320 request. At 406, at some later time, the escrow controller 320 wants to modify the behavior of a single existing poller process 456 or multiple poller processes 456 and for this purpose, sends a message to the "PollerNotification" topic 458. The message identifies the poller process 456 to modify together with instructions describing the nature of the modification. At 408, the modification message is broadcast to all poller applications 454 (represented by three arrows), and the poller application 454 that is managing the poller process 456 with the message's identifier reacts to the message by modifying the poller process 456 as appropriate. Other poller applications 454 can ignore the message. At some later time, the escrow controller 320 wants to remove the poller process 456 from service, and for this purpose, submits a message to the same "PollerNotification" topic 458. The message identifies the poller process 456 to remove, and this removal message is broadcast to all of the poller applications 454. The poller application 454 managing the poller process 456 with the removal message's identifier responds to the removal message by removing the poller process 456 as appropriate, and other poller applications 454 can ignore the removal message.
[0076] At 410, at some later time, and following an update or change to data of an on-line session for which the poller has been activated, a poller process 456 instance detects the data change in the first data store 341 and sends a message containing the details of the detected data change to the "DataChangeNotification" queue 460, and at 412, the escrow controller 320 receives the message from the queue 460 and updates an escrow record 345 in the second data store 342 accordingly.
[0077] Referring again to FIG. 3, the system embodiment that is illustrated also utilizes a distributed and shared data store configuration (341 , 342, 343) such that each data store has its own purpose and can function independently of the other. With this system structure, the on-line tax return preparation application 311 can only access, and write to and read from, the first or master data store 341 , but cannot access the second data store 342 or the third data store 343. Instead, the second data store 342 and the third data store 343 are only accessible by the escrow controller 320.
[0078] With this system configuration, the first or master data store 341, or the runtime persistence data store, is the data store that includes electronic tax return data for numerous, if not all, users of the on-line tax return preparation application 311, for the current year as well as for prior years, and manages data that collectively forms a profile of a user such as user, product purchase history, tax information and electronic tax returns, and interaction events (e.g., pages visited). The first or master data store 341 is thus a very large, secure and robust data store given the amount and sensitive nature of the data stored in the first data store 341.
[0079] The second data store 342, in contrast to the first or master data store 341, retrieves only a very small fraction of the data of the first data store 341 and caches or temporarily stores the retrieved data for a specific on-line session - it is not necessary to store data retrieved from the first data store 341 when the user is offline since the second data store 342 is configured as a temporary data store or cache to which selected data is escrowed until it can be utilized for model 360 execution for a particular on-line session.
[0080] The third data store 343 is utilized for storing the results 361 of model 360 execution in the form of tax return topic rankings and for each result or tax return ranking generated, may include data such as an authid that identifies the user to which the tax return topic ranking 361 applies, an entity key that is globally unique among all generated decisions or tax return topic rankings 361 , a flag that is used to indicate whether or not tax return ranking 361 has been provided to or consumed by the online tax return preparation application 311 , and a decision or result document containing the content in the form of the tax return ranking 361 for the identified user and executed model 360.
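A record of the third data store 343 as described above might, purely as an illustration, take the following shape. The field names, the use of a random UUID as the entity key and the in-memory dictionary are assumptions of this sketch.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class DecisionRecord:
    """Hypothetical record for one generated tax return topic ranking."""
    authid: str      # identifies the user the ranking applies to
    document: dict   # decision or result document holding the ranking
    # Assumption: a random UUID stands in for the globally unique entity key.
    entity_key: str = field(default_factory=lambda: uuid.uuid4().hex)
    consumed: bool = False  # set once the ranking was provided to the application


class DecisionStore:
    """In-memory stand-in for the third data store."""

    def __init__(self):
        self._records = {}

    def put(self, record: DecisionRecord) -> str:
        self._records[record.entity_key] = record
        return record.entity_key

    def fetch_for_user(self, authid: str):
        """Return the newest record for a user and flag it as consumed."""
        matches = [r for r in self._records.values() if r.authid == authid]
        if not matches:
            return None
        record = matches[-1]  # dicts preserve insertion order
        record.consumed = True
        return record
```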
[0081] Having described various components of system embodiments and their functionality, computer-implemented methods and further system configurations are described, and how embodiments may operate in "automatic" mode and "semi-automatic" mode, with reference to FIGS. 5-10B.
[0082] Referring to FIG. 5, and with continuing reference to FIG. 3, in one embodiment, the escrow controller's 320 coordination of the execution of a model 360 is a "fully automatic" algorithm in the sense that the online tax return preparation application 311 that is consuming a generated tax return ranking 361 plays no role in the coordination of the timing of the production of the tax return topic ranking 361. Instead, the online tax return preparation application 311 operates as it normally does by incrementally adding, updating, deleting data in the first or master data store 341 , and the changes to data of a logged-in user in the first data store 341 serve as the triggers for the escrow controller 320 to analyze the second data store 342 data relative to the activated models 360 and specification files 362 indicating when such models 360 can be executed. In this manner, generation and management of escrow records 345 by the escrow controller 320 and eventual model 360 execution is a result of or side effect of a trigger in the form of changed data in the first data store 341.
[0083] With continuing reference to the system flow diagram of FIG. 5 and the description of an embodiment that operates in "automatic" mode, at 502, models 360 and specification files or escrow contracts 362 are loaded or published to the system via a publishing service 365, and the model 360 is available for execution by the model processor or execution service 370, and the escrow controller 320 receives a specification file 362 that specifies the conditions that must be satisfied before a call can be made to the model processor 370 to execute the model 360. This is shown by "get model documents" and "get escrow contracts" in FIG. 3 - the model processor or execute service 370 can access published models 360 but not a specification file or escrow contract 362, which is only accessible by the escrow controller 320. Thus, model 360 execution and how and when the model 360 is executed are separated from each other and independent of each other.
[0084] At 504, the escrow controller 320 calls the model processor 370, e.g., periodically, and the model processor 370 returns data about or identifying the models 360 that have been published and are activated or live and available for execution.
[0085] At 506, a user logs into the online tax return preparation application 311 with user credentials (such as user name and password), and a login service 550 of the online tax return preparation application 311 verifies the credentials. Assuming the credentials are correct, the login service 550 initiates an on-line user session for that user, and at 508, data indicating initiation of an on-line session is communicated from the login service 550 to the escrow controller 320. This is also illustrated in FIG. 3 as the arrow from the online tax return preparation application 311 to escrow controller 320 indicating "signal user session start/stop" (the beginning of an on-line session concerns the "start" portion of "start/stop").
[0086] At 510, knowing the model 360 data and the "when and how the model is executed" set forth in the specification files 362 that have been received, the escrow controller 320 utilizes a message queue 382 to communicate with the first data store 341, and at 512, pulls or retrieves available types of data of the model 360 from the first data store 341 to generate, complete or supplement an escrow record 345, which is stored in or written to the second data store 342. For these purposes, the escrow controller 320 retrieves data from the first data store 341 utilizing a polling service 380, a native automatic data feed of the first data store 341, or both types of selective data retrieval mechanisms.
[0087] According to one embodiment, in which a data retrieval service 380 in the form of a polling service is utilized, at 514, the escrow controller 320 calls the first data store 341, e.g., via a messaging queue 382, to initiate a subscription 322 for changes in data of the logged-in user or for the activated on-line session in the first data store 341. This is shown in FIG. 3 as "data subscription requests." These data changes may involve data that has been added, deleted or updated. The subscription request 322 is provided as a set of canonically named data fields (e.g., a set of XPaths or instructions for selecting nodes from an XML document). The escrow controller 320 may initiate a subscription 322 for any changes (new or added data, deleted data, updated data) for the specific types of data required for execution of the model 360 as specified by the specification file or escrow contract 362. The escrow controller 320 calls the polling service 380 through the message queue 382 to initiate a polling process that will poll for changes in the specified types of data in the first data store 341.
[0088] After the subscription 322 has been established, at 516, the escrow controller 320 calls the polling service 380 to initiate a polling process that monitors the first data store 341 for any changes concerning specified types of data associated with the activated on-line session. In another embodiment, which may be utilized instead of or in conjunction with a polling service, at 518, the first data store 341, by a native data feed component, sends a data feed, which may be a continuous data feed, indicating which specified data fields had changes, to the escrow controller 320.
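As a non-limiting illustration of a subscription expressed as a set of XPaths, the following sketch reads the subscribed fields from an XML document and reports which fields changed between two checks. The element names (`return`, `income`, `wages`, `filing`, `status`) and both function names are hypothetical, and only the limited XPath subset of Python's standard `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET


def snapshot(xml_text, xpaths):
    """Read the subscribed fields from one XML document (None if absent)."""
    root = ET.fromstring(xml_text)
    values = {}
    for path in xpaths:
        node = root.find(path)
        values[path] = None if node is None else node.text
    return values


def changed_fields(previous, current):
    """Return the subscribed fields whose value differs from the prior check."""
    return {p: v for p, v in current.items() if previous.get(p) != v}
```

A polling process could keep the previous snapshot for an on-line session and forward only the output of `changed_fields` to the escrow controller.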
[0089] Continuing with FIG. 5, at 520, the selective data retrieval service 380, e.g., using a polling service for ease of explanation not limitation, periodically checks for changes of specified data fields for the active on-line session such that, as polling progresses, each polling check is made to determine changes relative to a prior polling check. At 522, if no change is detected, the polling service 380 continues with periodic checks for changed data in the first data store 341, but when a change is detected by the polling service 380, the escrow controller 320 is notified of the change by the polling service 380 through the message queue 382. The escrow controller 320 writes the data update to the second data store 342 or cache.
[0090] At 524, as the poller 380 checks for data changes, and data changes are cached to the second data store 342, the escrow controller 320, based on the data pre-requisites or conditions set forth in the specification file or escrow contract 362, eventually determines that the data requirements of a model 360 applicable to the logged in user have been satisfied. This event triggers the escrow controller 320 to read the required model data from the second data store 342, communicate with the model processor or execution service 370, and call the model processor 370 to execute the model 360 with the provided data that has been retrieved from the first data store 341 and that was cached to the second data store 342.
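The determination that the data requirements of a model have been satisfied may be sketched as a simple check of the escrow record against the escrow contract. The contract layout (a `required` list), the field names and the callable standing in for the model processor are assumptions made for illustration.

```python
def missing_fields(contract, escrow_record):
    """List the required fields that the escrow record does not yet hold."""
    return sorted(f for f in contract["required"] if escrow_record.get(f) is None)


def maybe_execute(contract, escrow_record, execute_model):
    """Call the model only once every required field has been escrowed."""
    if missing_fields(contract, escrow_record):
        return None  # execution would be premature
    inputs = {f: escrow_record[f] for f in contract["required"]}
    return execute_model(inputs)


def toy_model(inputs):
    """Hypothetical model: rank donations first when the user has any."""
    if inputs["has_donations"]:
        return ["donations", "wages"]
    return ["wages", "donations"]
```

Each cached data change would be followed by a call to `maybe_execute`, so that model execution occurs as a side effect of the change that completes the record.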
[0091] With continuing reference to FIG. 5, at 526, as noted the result 361 generated by execution of the model 360 in the form of a tax return topic or menu option ranking is returned by the model processor 370 to the escrow controller 320, which locally stores the tax return topic result 361 , e.g., in the third data store 343, that is a dedicated data store for tax return topics generated by model 360 execution and that is not accessible by the on-line tax return preparation application 311. The generated tax return topic ranking 361 is stored in the third data store 343 until retrieved by the escrow controller 320 in response to a request by the online tax return preparation application 311.
[0092] Data that is analyzed relative to requirements of an escrow contract or specification file 362 is read from a single source, i.e., the second data store 342 or cache of the portion of the data in the first data store 341. In another embodiment, electronic tax return data is provided by multiple sources, and in one embodiment, this involves both the second data store 342 or cache and the online tax return preparation application 311 , e.g., the request provided by the online tax return preparation application 311.
[0093] More specifically, in one embodiment, all of the data that is required for execution of a model 360 is retrieved from the first data store 341 via a selective data retrieval system 380, i.e., no data for model execution is received from the online tax return preparation application 311 or request thereby. Thus, in this embodiment, the escrow controller 320, via communications through the message queue 382 with the data retrieval service 380, is able to asynchronously find all of the data elements required by a model 360, cache the data to the second data store 342, and issue a call to the execution service 370 to execute a model 360 utilizing the retrieved data and to generate a decision or result 361 in the form of a tax return topic ranking. This may be done prior to the online tax return preparation application 311 issuing a request for the ranking.
[0094] According to another embodiment, the online tax return preparation application 311 requests a ranking 361 from the escrow controller 320 ("get decision" in FIG. 3), and as part of this request or subsequent communication associated with the request, provides the required data elements to the escrow controller 320. The escrow controller 320 then generates or updates an escrow record 345, which may already include data fulfilling the requirements for model 360 execution per the specification file 362, and this triggers the escrow controller 320 to issue a call to the model processor or execution service 370 to execute the model 360 utilizing the data received from the online tax return preparation application 311 to generate a result 361 in the form of a tax return topic ranking, which is then returned by the escrow controller 320 to the on-line tax return preparation application 311 and may also be stored in the third data store 343 for future reference.
[0095] In a further embodiment, data that is utilized to fulfill conditions or predetermined model 360 criteria per the specification file 362 is received from multiple sources such that systems implement a "hybrid" type data retrieval system. In one embodiment, a first portion of the data required for model 360 execution is retrieved from the first data source 341, and a second portion of the data required for execution of that model 360 is received from the online tax return preparation application 311, e.g., as part of the request ("get decision" in FIG. 3). According to one embodiment, a majority of the required data is retrieved from the first data source 341, whereas a much smaller portion of data is supplied by the online tax return preparation application 311. Further, according to one embodiment, the data supplied by the online tax return preparation application 311 is the final piece of data required before a call can be made by the escrow controller 320 to the model processor 370 in order to execute the model 360 with the collected or retrieved data.
[0096] For example, referring to FIG. 6, at 602, the escrow controller 320 accesses the first data store 341 or runtime persistence via communications made to the data retrieval service 380 through the message queue 382 to retrieve or pull data elements for a model 360 as specified by the escrow contract 362 and for a user that is currently logged into the online tax return preparation application 311. The escrow contract 362 specifies which data is required for model 360 execution, which data is optional, and the location of such data in the second data store 342.
[0097] At 604, the escrow controller 320 retrieves or pulls the data from the second data store 342 or cache to update or complete an escrow record 345 that is stored in the second data store 342. This is repeated for each distinct data store that the escrow controller 320 accesses or manages for this purpose, and in the illustrated embodiment, there is a single data store, i.e., the second data store or cache 342. At 606, the escrow controller 320 calls the first data store 341 to initiate a subscription 322 for any changes (adds, updates, deletes) to only those data fields specified by the model 360 / escrow contract 362, and at 608, the escrow controller 320 calls the polling service 380 to initiate a polling process that will poll for data changes/additions on behalf of the user for the user's on-line session. At 610, in embodiments involving use of data feeds rather than a polling service, the data feed of the first data store 341 may send a continuous feed to the escrow controller 320 to notify the escrow controller 320 of changes that occurred to the data fields of the first data store 341 during a particular on-line session and involving a particular user. As noted above, for systems that do not provide for an automatic data feed, a polling service 380 may be utilized, and at 612, the polling service 380 polls the first data store 341, e.g., periodically. Each poll request is a request for changes to any data fields that have occurred since the last time the poll request was made.
[0098] At 614, when the polling service 380 detects a change, a notification of that change is sent to the escrow controller 320. Thus, according to one embodiment, at this stage, some, but not all, of the data required for model 360 execution has been collected. At 616, at a later time, the online tax return preparation application 311 issues a request ("get decision" in FIG. 3) to the escrow controller 320 for a tax return topic ranking 361. The request includes the remaining data elements that are required for model 360 execution.
[0099] At 618, the escrow controller 320 utilizes a combination of the data elements from the request received from the online tax return preparation application 311 and the data of the escrow record 345 of the second data store 342 such that requirements for model 360 execution per the specification file or escrow contract 362 have now been fulfilled. This triggers the escrow controller 320 to call the model processor 370 to execute the associated model 360 ("decision computation" in FIG. 3), resulting in generation of a result 361 in the form of a tax return topic ranking. This ranking 361 is provided to the online tax return preparation application 311 in response to the request and may be stored to the third data store 343 by the escrow controller 320 for future reference.
[00100] It will be understood that data requirements for a model 360 may be fulfilled in different ways during an on-line session, which may involve various updates to the first data store 341 and various requests made by the online tax return preparation application 311. For example, for an on-line session for a particular user, a first request for a tax return topic ranking 361 may involve a model 360 that is executed with only data retrieved from the first data store 341, whereas during the same on-line session, a second request for a tax return topic ranking 361 may involve a model 360 that is executed with data from both the first data store 341 and data of or derived from the request made by the online tax return preparation application 311.
[00101] The system is configured such that the current or most recent tax return ranking 361 applicable to an interview screen 112 or tax return topic ranking 361 is stored, but the resulting rankings that may have been generated by execution of prior versions of the same model 360 may also be stored in the third data store 343, e.g., for reference to allow developers to see how different model 360 versions perform. Thus, for a first interview screen 112-1, the third data store 343 may store a current ranking of tax return topics that is based on the most recent execution of the model 360 (version 3), but also store prior versions (version 2, version 1) of the model 360 and/or resulting tax return rankings 361 in the third data store 343 (but not in at least the first data store 341). This may be done for each logged-in user or each on-line session, but unlike the data in the second data store 342, which is temporary data cached from the first data store 341 for a particular on-line session, the rankings / models in the third data store 343 can be stored on a permanent basis, or with persistence, or for a longer duration of time, since the second data store 342 data is based on the actual data of a temporary on-line session.
[00102] Thus, referring again to FIG. 5, at 528, when the online tax return preparation application 311 determines that a ranking 361 generated by execution of a model 360 may be required (e.g., given a first interview screen 112-1, it is known that current interview screen 112c containing rankable tax return topics 361 will be displayed or is a possible interview screen 112p that can be displayed given defined menu options), the on-line tax return preparation application 311 issues a request for the topic ranking 361 ("get decision" in FIG. 3). In response to receiving the request, the escrow controller 320 may be configured to first determine whether a tax return topic ranking 361 has already been generated and is already contained in the third data store 343 (indicated by "fetch" in the escrow controller 320 and "get decisions for immediate consumption" in FIG. 3). If so, then the generated tax return topic ranking 361 can be immediately read from the third data store 343 and provided to the online tax return preparation application 311 in response to the request. A flag can be set in the third data store 343 to indicate that the generated ranking was utilized or provided to the online tax return preparation application 311. If no topic ranking is stored in the third data store 343, then the escrow controller 320 is configured to determine whether a model 360 can be executed based on the specification file 362 indicating the data conditions for execution of a model 360 that would generate the requested topic ranking 361 and where such data can be retrieved from the first data store 341. If the data conditions have been satisfied for the model 360, the escrow controller 320 issues a call ("decision computation" in FIG. 3) to the model processor 370 together with the associated data, and the model processor 370 executes the model utilizing the received data to generate a result 361 in the form of a tax return topic ranking, which is communicated to the escrow controller 320 which, in turn, stores the generated ranking 361 to the third data store 343 and provides the determined ranking 361 to the online tax return preparation application 311 in response to the request. If there is no ranking stored in the third data store 343, and the data in the third data store 343 for the user's online session does not satisfy the requirements set forth by the specification file 362 such that model 360 execution is premature, the escrow controller 320 can respond to the online tax return preparation application 311 indicating that no other tax return topic ranking 361 is available, in response to which the online tax return preparation application 311 can use the current, original or default ranking. Thus, a current ranking 361 may replace a default or original topic ranking, thus resulting in an interface transformation by modifying one interview screen 112 / user experience into a different interview screen 112 / user experience. A current ranking 361 may also replace or modify a previously generated ranking 361. Or, if the escrow controller 320 does not respond to the request, thus resulting in timing out of the request, the online tax return preparation application 311 may continue to utilize the current, original or default ranking for the interview screen 112.
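The order of checks performed in response to a "get decision" request can be summarized in a short sketch. The dictionaries standing in for the data stores, the `required` field list and the returned source labels are assumptions of this illustration, not part of the described system.

```python
def get_decision(user, stored, escrow, required, execute_model):
    """Return (ranking, source); a None ranking means 'keep the default'."""
    # 1. A ranking generated earlier is returned immediately.
    if user in stored:
        return stored[user], "stored"
    # 2. Otherwise execute the model, but only if its data conditions are met.
    record = escrow.get(user, {})
    if all(record.get(f) is not None for f in required):
        ranking = execute_model({f: record[f] for f in required})
        stored[user] = ranking  # keep the result for later requests
        return ranking, "computed"
    # 3. Execution would be premature; the caller uses its default ranking.
    return None, "unavailable"
```

A caller that receives `None`, or that times out waiting for a response, simply continues with the current, original or default ranking.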
[00103] At 530, the user eventually logs out from the tax return preparation application 311, thus terminating the on-line session, and for this purpose, the online tax return preparation application 311 may issue a call to the logout service 552 to terminate the on-line session and at 532, notify the escrow controller 320 regarding the terminated session (indicated by "signal user session... end" in FIG. 3). With the on-line session terminated, the escrow controller 320 is now able to delete the escrow record 345 from the second data store 342, thus clearing the data from the prior on-line session. This ensures that when or if the user logs in again, the current tax return data for that particular user can be read from the first data store 341 and that additional processing is not performed on data that is not involved in a currently active on-line session. In contrast to the second data store 342, the rankings and data of executed models 360 (and versions thereof) may be maintained in the third data store 343 for future reference or for use during a future on-line session.
[00104] While certain embodiments involving "automated" ranking system component interactions and data retrieval have been described, other embodiments involve "semi-automatic" interaction, meaning that the on-line tax return preparation application 311 is involved in when data is retrieved from the first data store 341 such that the online tax return preparation application 311 acts as a "trigger" that would normally be performed by a listener on a data feed. In "semi-automated" embodiments, the online tax return preparation application 311, at specific points in time during preparation of an electronic tax return, sends a "heartbeat" message to the escrow controller 320. In response, the escrow controller 320 initiates a re-scan of the first data store 341 for purposes of determining whether an escrow record 345 should be updated for that user / online session and, based on the escrow record 345 content relative to the specification file or escrow contract 362 applicable to a model 360, whether the escrow controller 320 can proceed with executing any models 360 for which data is read and for which data requirements have been fulfilled.
[00105] One manner in which these types of "heartbeat" or application 311 trigger embodiments may be implemented is described with reference to FIG. 7, various aspects of which have been described above, and various details of which are not repeated.
[00106] At 702, as described above with reference to FIGS. 3-6, a model 360 is published 365 and pushed to the model processor or execution service 370, and at 704, the escrow controller 320 periodically calls the model processor 370, in response to which the model processor 370 returns data about published and live models 360. At 706, at various appropriate points during the electronic tax return preparation process (e.g., when the user has reached a certain current interview screen 112c of a pre-determined sequence of screens 112, or it is possible, given current navigation, that the user may proceed to a certain navigation screen 112p), the online tax return preparation application 311 calls the escrow controller 320 and transmits a "heartbeat" or "check status" message 750 to the escrow controller 320. This "heartbeat" message 750 identifies the user (e.g., by authId or user name) and their credentials (e.g., ticket or password). At 708, in response to the heartbeat message 750, the escrow controller 320 accesses the second data store 342 to retrieve whatever data elements for a model 360 are available for that particular user or specific on-line session, and at 710, the escrow controller 320, through the message queue 382, calls the data retrieval service 380 to initiate a polling process. The polling process can periodically check the first data store 341 for data changes/additions made during the on-line session. At 712, in response to the escrow controller's 320 call, the polling service 380 checks the first data store 341 at some regular interval, and each poll request is a request for changes to any data fields that have occurred since the last time the poll request was made. At 714, when the polling service 380 detects a change, the polling service 380, through the message queue 382, transmits a message to the escrow controller 320 to notify the escrow controller 320 of the change. The changed data is cached or temporarily stored to the second data store 342. At 716, when the escrow controller 320 determines that the escrow record 345 stored in the second data store 342, including the changes from the first data store 341, fulfills the input requirements of the decision model 360 and that the model 360 can be executed per the specification file 362, the escrow controller 320 issues a call to the model processor 370 together with the requisite data read from the escrow record 345 in the second data store 342 to execute the model 360, and the generated result 361 in the form of a tax return ranking is provided by the model processor 370 to the escrow controller 320 and stored to the third data store 343 at 718. At 720, when the online tax return application 311 needs or may need the generated ranking 361 (e.g., based on an interview screen 112 to be displayed next or that may be possibly displayed given a current interview screen 112 or sequence of questions to be presented), the online tax return preparation application 311 issues a request for the tax return topic ranking 361 to the escrow controller 320, which reads the previously generated tax return topic ranking 361 for that user from the third data store 343, and responds to the request by providing the tax return topic ranking 361 to the online tax return preparation application 311. At 722, eventually, heartbeat calls 750 or checks to the escrow controller 320 for the active on-line session / user are terminated, or will not be issued after a pre-determined amount of time, in response to which the escrow controller 320 can terminate the polling service 380 and delete the escrow record 345 from the second data store 342 since the on-line session has been terminated.
[00107] One manner in which an escrow controller 320 may be configured for embodiments, including embodiments involving heartbeat message 750 applications, is described with reference to FIG. 8. One embodiment of an escrow controller 320 includes or utilizes a foreground thread pool 850 and a background thread pool 852 that are utilized to process heartbeat requests 750 and interface with the escrow records 345 in the second data store 342. A foreground thread pool 850 is a pool of processes that works to receive the incoming heartbeat requests 750, or a "request for future work" from the online tax return preparation application 311, and forwards the necessary instructions to the background thread pool 852. Foreground refers to work that is done while the calling application waits for the response, whereas background refers to what will be done at some time later, without the calling application needing to wait. Thus, the application 311 that sends the heartbeat signal 750 can receive a rapid response while "heavier" work is scheduled to be done later, such as initializing a cache or escrow record 342 and initializing a poller 380 (in the case of the very first heartbeat signal 750 received for a user) or keeping alive an existing cache or escrow record 342 and an existing poller 380 (in the case of a subsequent heartbeat signal 750 received for a user).
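The division between foreground and background work may be illustrated with Python's standard `concurrent.futures.ThreadPoolExecutor`. In this sketch only the background pool is modeled explicitly and the calling thread plays the role of the foreground; the class name, the per-user heartbeat counter standing in for an escrow record, and the `drain` helper are assumptions made for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class HeartbeatHandler:
    """Replies to heartbeats at once and defers the heavier work."""

    def __init__(self, background_workers=2):
        self.background = ThreadPoolExecutor(max_workers=background_workers)
        self.lock = threading.Lock()
        self.escrow_records = {}  # user id -> number of heartbeats processed
        self.pending = []

    def handle(self, user_id):
        """Foreground path: schedule background work, then answer immediately."""
        self.pending.append(self.background.submit(self._refresh, user_id))
        return "success"

    def _refresh(self, user_id):
        # Background path: create the record on the first heartbeat,
        # keep it alive on later ones.
        with self.lock:
            self.escrow_records[user_id] = self.escrow_records.get(user_id, 0) + 1

    def drain(self):
        """Wait for all scheduled background work (useful in tests)."""
        for future in self.pending:
            future.result()
        self.pending.clear()
```

The caller receives "success" without waiting for record creation or poller initialization, which mirrors the rapid response described above.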
[00108] Pools include a collection of executable elements, such as computer-executable elements, e.g., threads. A thread is a component of a process and is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system. Multiple threads can exist within the same process and may execute concurrently and share resources. On a single processor, multithreading is generally implemented by time slicing or multitasking, and the CPU switches between different software threads. On a multiprocessor or multi-core system, multiple threads can be executed in parallel or at the same time, with every processor or core executing a separate thread simultaneously. The size of the pool is usually the number of concurrent users.
[00109] The background thread pool 852 is a pool of threads that performs asynchronous or background tasks consisting of creation and modification messages to a Java Message Service (JMS) broker, creation and modification of each unique escrow record 345, and decision engine or escrow controller 320 execution requests. A thread pool is a pre-initialized collection of Java threads, and threads are small independent executable 'processes' capable of doing work without coordinating with another process. The size of the background thread pool 852 is the ratio of average task completion time of the foreground thread pool 850 divided by the average task completion time of the background thread pool 852.
[00110] In FIG. 8, MemCache refers to the cache or second data store 342 utilized by the escrow service or escrow controller 320. The second data store 342 includes escrow records 345 and data about user activity such as unique user identifiers and timestamp data of activity. An evictor thread 854 manages dormant escrow records 345 and the proper sizing of the background thread pool 852. Dormant records 345 are records for which the system has not received a heartbeat signal 750 in some pre-determined amount of time. When an escrow record 345 is dormant, it is deleted, along with its worker, so that data created when the on-line session was initiated is "cleaned up" at the end of the user's session.
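The dormancy check performed by the evictor can be sketched as follows. The function name, the record layout and the `timeout_seconds` parameter are assumptions made for illustration; the source only states that a record is dormant when no heartbeat has arrived within some pre-determined time.

```python
def evict_dormant(records, pollers, now, timeout_seconds):
    """Delete records, and their pollers, that have had no recent heartbeat.

    records: dict mapping user_id -> {"last_heartbeat": float}
    pollers: set of user_ids that currently have a poller
    Returns the list of evicted user_ids.
    """
    dormant = [
        user_id
        for user_id, record in records.items()
        if now - record["last_heartbeat"] > timeout_seconds
    ]
    for user_id in dormant:
        del records[user_id]        # remove the escrow record
        pollers.discard(user_id)    # terminate the associated poller
    return dormant
```

In a running service this function would be called periodically from a dedicated thread, which is also where any resizing of the background pool would be triggered.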
[00111] With continuing reference to FIG. 8, in embodiments involving heartbeat signals 750 sent by the online tax return preparation application 311, at 802, during an active on-line session, the online tax return preparation application 311 sends a "heartbeat" or "check" message 750 to the escrow controller 320 to notify the escrow controller 320 that the on-line session is still active, and upon receiving the heartbeat message 750, the foreground thread pool 850 of the escrow controller 320 quickly responds to the online tax return preparation application 311 with a "success" indicator. At 804, further instructions are forwarded by the foreground thread pool 850 to the background thread pool 852 for asynchronous processing.
[00112] At 806, the background thread pool 852 checks MemCache 856 (the cache or second data store 342) to determine whether a unique escrow record 345 has been created, and if not, then one is created and stored to the MemCache 856 / second data store 342. At 808, upon each escrow record 345 creation, the background thread pool 852 sends instructions to Java Message Service (JMS) brokers to create a poller 380.
[00113] At 810, JMS brokers respond to the background thread pool with further instructions regarding data that has been requested. Resources that may be utilized for this purpose include, for example, Amazon Web Services (AWS) / Simple Queue Service (SQS) 858 available from Amazon Web Services, Inc. The Amazon SQS Java Messaging Library is a JMS interface to Amazon SQS that enables leveraging of Amazon SQS in applications that use JMS, and the interface allows use of Amazon SQS as the JMS provider.
[00114] At 812, whenever an escrow record 345 is completed, the background thread pool 852 calls the model processor or execution service 370 (also referred to as decision engine in FIG. 8) to generate a result 361 in the form of a tax return topic ranking. At 814, the evictor thread 854 loops through the escrow records 345 and actively monitors the second data store / cache 342 for any dormant escrow records 345, and at 816, if a dormant escrow record is detected, the evictor thread 854 initiates termination of the poller 380 and the escrow record 345, and resizing of the background thread pool 852. This is done in order to recover resources created when a user's session was initiated, rather than allowing resource consumption to increase over time.
[00115] FIG. 9 generally illustrates certain components of a computing device 900 that may be utilized, or that system components may include, for execution of various computer instructions according to embodiments. For example, the computing device may include a memory 910, program instructions 912, a processor or controller 920 to execute instructions 912, a network or communications interface 930, e.g., for communications with a network or interconnect 940 between such components. The memory 910 may be or include one or more of cache, RAM, ROM, SRAM, DRAM, RDRAM, EEPROM and other types of volatile or non-volatile memory capable of storing data. The processor unit 920 may be or include multiple processors, a single threaded processor, a multi-threaded processor, a multi-core processor, or other type of processor capable of processing data. Depending on the particular system component (e.g., whether the component is a computer or a hand held mobile communications device), the interconnect 940 may include a system bus, LDT, PCI, ISA, or other types of buses, and the communications or network interface may, for example, be an Ethernet interface, a Frame Relay interface, or other interface. The network interface 930 may be configured to enable a system component to communicate with other system components across a network which may be a wireless or various other networks. It should be noted that one or more components of computing device 900 may be located remotely and accessed via a network. Accordingly, the system configuration provided in FIG. 9 is provided to generally illustrate how embodiments may be configured and implemented, and it will be understood that embodiments may also involve communications through one or more networks between a user computer and a computer hosting system embodiments of on-line or cloud based tax return preparation applications.
[00116] Method embodiments or certain steps thereof, some of which may be loaded on certain system components, computers or servers, and others of which may be loaded and executed on other system components, computers or servers, may also be embodied in, or readable from, a non-transitory, tangible medium or computer-readable medium or carrier, e.g., one or more of the fixed and/or removable data storage devices and/or data communications devices connected to a computer. Carriers may be, for example, magnetic storage medium, optical storage medium and magneto-optical storage medium. Examples of carriers include, but are not limited to, a floppy diskette, a memory stick or a flash drive, CD-R, CD-RW, CD-ROM, DVD-R, DVD-RW, or other carrier now known or later developed capable of storing data. The processor 920 performs steps or executes program instructions 912 within memory 910 and/or embodied on the carrier to implement method embodiments.
[00117] Although particular embodiments have been shown and described with reference to generating a ranking of tax return topics or menu options presented by an online tax return preparation application during preparation of an electronic tax return, it should be understood that the above discussion is not intended to limit the scope of these embodiments and that embodiments may be utilized in various other ways.
[00118] For example, given the capabilities of escrow-based personalization of platform embodiments, embodiments can be combined with a very specific kind of decision model that allows for enhanced and new A/B testing and targeted segmentation. Targeted segmentation is a capability offered by a specific kind of model that involves the model choosing, from a finite set of possible user experiences, the one user experience that should be delivered based on the state of a user's profile. Rather than starting with the premise that there is one optimal experience for everyone (as in traditional A/B testing), targeted segmentation with embodiments is based on the premise that there are multiple variations of a user experience, each user experience being optimal for a sub-segment of the population.
[00119] This targeted segmentation model has the general property that it will assign a user to an experience if either (1) the experience has been proven to be optimal or (2) there is not enough data gathered to come to a firm conclusion. The targeted segmentation model will not assign a user to an experience if that user experience has been proven to be sub-optimal. With these types of systems, multiple variations of a user experience can quickly and safely be pushed into production, and certain variations will prove themselves to be successful as a result of increased on-line traffic, whereas other user experiences will prove themselves to be less desirable and less relevant.
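The assignment property stated above reduces to a simple eligibility filter. In the sketch below, the `Evidence` enumeration and the function name are hypothetical, and how a given experience comes to be classified as proven optimal or sub-optimal for a segment is outside its scope.

```python
from enum import Enum


class Evidence(Enum):
    """Hypothetical evidence states for an experience within a user segment."""
    PROVEN_OPTIMAL = "proven_optimal"
    INSUFFICIENT_DATA = "insufficient_data"
    PROVEN_SUBOPTIMAL = "proven_suboptimal"


def eligible_experiences(evidence_by_experience):
    """Return the experiences a user in this segment may be assigned to.

    An experience stays eligible if it is proven optimal or if there is not
    yet enough data; it is excluded only once proven sub-optimal.
    """
    return [
        name
        for name, evidence in evidence_by_experience.items()
        if evidence is not Evidence.PROVEN_SUBOPTIMAL
    ]
```

Keeping under-sampled experiences eligible is what allows new variations to gather traffic safely while known poor performers are withheld.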
[00120] In these user experience applications, a distinction is made between qualification for and assignment of a user experience. Embodiments distinguish between components responsible for producing a qualification and those responsible for producing an assignment, and this enables models 360 to be run, and decisions or results in the form of a ranking 361 to be produced, in the background without coordination with any direct user activity. The nature of this design means that the models are run and results are produced typically well before they are actually needed by the requesting application 311. In other words, since the application 311 that will require the ranking 361 has not yet called to retrieve the ranking 361 and the user has not yet been affected by it, it may be the case that the resulting ranking 361 will never be consumed. Thus, by making a distinction between early qualification and just-in-time assignment, embodiments address shortcomings associated with situations in which users are assigned to an experiment and potentially never reach the portion of the application 311 that is relevant, thus providing improvements by addressing problems involving: 1. in the case of a set of experiments that are mutually exclusive with each other, the risk of wasting traffic or performing unnecessary communications when a user is assigned to an experiment, which disqualifies the user from participating in anything else, but no useful data is ever collected from that user because the user never actually participates in the assigned experiment; 2. difficulty in distinguishing users that were assigned to an experiment from those that were assigned and actually affected by the experiment; and 3. difficulty for the escrow controller in knowing whether or not a better assignment could be produced when potentially more data becomes available at some later phase.
[00121] FIGS. 10A-B are system flow diagrams illustrating how user experiences may be processed according to embodiments, i.e., how a personalized experience flow would be implemented, and involve an experience management system that manages an experience set and produces an experience assignment by considering a set of experience qualifications and performing any additional last-minute checks that validate criteria (e.g., mutual exclusion rules). An experience set is defined as including elements including or involving a traffic allocation percentage, a set of named variations that are members of the set, a set of other Experience Sets with which this one is mutually exclusive, a simple predicate that further determines the experiment eligibility of a participant, and a reference to the [00122] Experiment Qualification Mode
[00123] Referring to FIG. 10A, for implementation of a successful personalized experience interaction, at 1002, the application 311 (such as the online tax return preparation application, or more generally any "application") makes a call to Jabba 1050 or another external system that assigns users to one of many equivalent experiences that can benefit from escrow-based ranking embodiments, with a unique identifier of the user and any valid JSON object as inputs to the experimentation mode, and at 1004, Jabba 1050 receives the request. Based on the established rules, Jabba 1050 determines which test and experience to call or serve. If the chosen experience is Experience Decision Engine 1052 (rather than "Baseline" or the current best performing experience from a set of equivalent experiences), then Jabba 1050 makes a request to Decision Engine 370:
[00124] At 1006, the decision engine 370 consumes the request and responds back to Jabba 1050 with a list of "buckets" with probabilities that sum 1054 to 1.0:
[00125] At 1008, Jabba 1050 then "flips a coin" 1060 on the options 261 a-c that the decision engine 370 has provided to return an ordered list 1062 of options.
[00126] Referring to FIG. 10B, after the ordered list 1062 of options is obtained, Jabba 1050 then applies business rules to determine the final experience 361f ("f" referring to "final") to serve back to the application 311. The business rules are rules that define or constrain some aspect of business and always resolve to either true or false. An example of a business rule might be: given experiences A, B, and C, new users are not allowed to see experience B, and thus experience C should be shown to the user even though experience B might have higher priority on the Jabba Result List.
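One plausible reading of the "coin flip" followed by business rules is weighted sampling without replacement, followed by a filter that returns the first option every rule allows. The sketch below assumes that reading; the function names, the bucket representation and the rule signature are hypothetical and are not taken from Jabba itself.

```python
import random


def order_by_coin_flip(buckets, rng):
    """Turn {experience: probability} into an ordered list.

    Draws experiences one at a time, without replacement, with a chance
    proportional to each remaining probability.
    """
    remaining = dict(buckets)
    ordered = []
    while remaining:
        names = list(remaining)
        weights = [remaining[name] for name in names]
        if sum(weights) <= 0:
            # Only zero-probability options are left: keep their given order.
            ordered.extend(names)
            break
        choice = rng.choices(names, weights=weights, k=1)[0]
        ordered.append(choice)
        del remaining[choice]
    return ordered


def apply_business_rules(ordered, user, rules):
    """Return the first experience for which every rule resolves to True."""
    for experience in ordered:
        if all(rule(experience, user) for rule in rules):
            return experience
    return None  # no experience passed; the caller falls back to its default


def no_b_for_new_users(experience, user):
    """Example rule from the text: new users may not see experience B."""
    return not (user["is_new"] and experience == "B")
```

With the ordered list `["B", "C", "A"]` and a new user, the rule above skips B and serves C, matching the example in the text.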
[00127] While a successful personalized experience interaction is described above, it may be the case that there is an error while trying to process the request made from Jabba 1050 to the decision engine (involving 1004 and 1006 as described above). An example of a decision error message follows:
[00128] As a result of this situation, Jabba 1050 swallows the error message and responds so as to let the application team choose which experience to serve to the user.
[00129] It may also be the case that the decision engine 370 is unreachable, in which case Jabba 1050 also responds so as to permit the application team to choose which experience to serve to the user. It may also be the case that Jabba 1050 is unreachable. This may occur when the application 311, for whatever reason, cannot communicate with Jabba 1050, and in this situation, the application team can decide which experience to serve to the user.
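From the application's side, all three failure cases lead to the same outcome: the application serves an experience of its own choosing. A minimal sketch of that fallback follows, assuming a hypothetical `request_assignment` callable that raises on failure and a hypothetical `AssignmentError` type; neither is part of any described interface.

```python
class AssignmentError(Exception):
    """Hypothetical error raised when an assignment cannot be produced."""


def choose_experience(request_assignment, default_experience):
    """Ask the assignment service; fall back to the application's default.

    request_assignment: zero-argument callable that returns an experience
    name, returns None when no assignment was made, or raises on failure.
    """
    try:
        experience = request_assignment()
    except (AssignmentError, OSError):
        # Decision error, or the service is unreachable (ConnectionError and
        # TimeoutError are both subclasses of OSError).
        return default_experience
    return experience if experience is not None else default_experience
```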
[00130] Thus, while embodiments and variations of the many aspects of the invention have been disclosed and described herein, such disclosure is provided for purposes of explanation and illustration only. Thus, various changes and modifications may be made without departing from the scope of the claims.
[00131] Further, where methods and steps described above with reference to various system-flow diagrams indicate certain events occurring in certain order, those of ordinary skill in the art having the benefit of this disclosure would recognize that the ordering of certain steps may be modified and that such modifications are in accordance with the variations of the invention. Additionally, certain of the steps may be performed concurrently in a parallel process as well as performed sequentially. Thus, the methods shown in various system-flow diagrams are not intended to be limited to a particular sequential order, unless otherwise stated or required.
[00132] Accordingly, embodiments are intended to exemplify alternatives, modifications, and equivalents that may fall within the scope of the claims.
APPENDIX 1 (Hosting in Cloud)
This section discusses the hosting design of the major components that make up the Platform. Since this is hosted in an AWS VPC with connectivity back to Intuit's collocated facilities, there are certain requirements and guidelines around securing the data and components.
Tech Stack Inventories
AWS Services
INTERNET TRAFFIC ESTIMATION
Trinity Beaconing
Assuming 2k TPS and each request being about 3k bytes, traffic = 6 MB/s. With headroom, we want to have capacity for about 10 MB/s.
SQS/SNS and Other AWS Services
Calculations:
For Escrow Heartbeat, we have about 25k TPS, each of which results in a packet of about 1 kB (rounding up) being sent to SNS, so traffic = 25 MB/s outbound.
For each SNS message we send out, we get about 'n' times the data back, where n is the number of polling instances running. The estimated number of poller service instances required, e.g., for 250-300k concurrent users, is ~8 instances as an example. Thus, inbound traffic for polling instances will be 8 x 25 MB/s = 200 MB/s. Outbound traffic from poller instances consists essentially of read calls, which are very small in request size and negligible for bandwidth calculations.
CTDS Data Calls
As an estimate, with 300k concurrent users and querying on behalf of these users every minute, 300,000/60s = 5000 requests per second. If each request results in a response of about 0.1 MB (~100 KB) of data across different documents (an estimate; this response size should then be measured), the result is about 500 MB/s of inbound traffic.
New Relic Monitoring Traffic
Observed traffic during perf tests is ~100 MB/s for the Apache and native server monitoring.
App Dynamics Monitoring Traffic
Estimate of at least 200 MB/s for monitoring 2 PUB, 6 EXE, 6 ECW, and 8 POLL = 22 servers at full capacity.
Consolidated Internet Traffic
This traffic estimation is at full peak, when ECW/EXE combined traffic is about 2k TPS.
From our performance test, each server can handle ~50 MB/s consistently with bursts of 100-110 MB/s. An autoscaling group of a max of ~10 servers gives internet bandwidth of 10*50 = 500 MB/s consistently with peaks of 1 GB/s.
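The estimates in this appendix follow from multiplying a request rate by a payload size. The short script below reproduces that arithmetic using the figures quoted above; decimal megabytes (1 MB = 1,000,000 bytes) are an assumption, chosen because they match the rounded numbers in the text.

```python
MB = 1_000_000  # decimal megabyte in bytes (assumed to match the text)


def mb_per_second(requests_per_second, bytes_per_request):
    """Bandwidth in MB/s for a given request rate and payload size."""
    return requests_per_second * bytes_per_request / MB


# Trinity beaconing: 2k TPS at about 3k bytes per request.
beaconing = mb_per_second(2_000, 3_000)

# Escrow heartbeat: 25k TPS at about 1 kB per packet sent to SNS.
heartbeat_outbound = mb_per_second(25_000, 1_000)

# Each SNS message comes back once per poller instance (8 in the example).
poller_instances = 8
polling_inbound = poller_instances * heartbeat_outbound

# CTDS data calls: 300k users queried once per minute, ~100 KB per response.
ctds_requests_per_second = 300_000 / 60
ctds_inbound = mb_per_second(ctds_requests_per_second, 100_000)

# Capacity: ~10 servers at ~50 MB/s sustained, ~100 MB/s burst.
servers = 10
sustained_capacity = servers * 50
burst_capacity = servers * 100

print(beaconing, heartbeat_outbound, polling_inbound, ctds_inbound)
print(sustained_capacity, burst_capacity)
```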
APPENDIX 2 (DE Escrow Worker Specification)
This document lists the specification for the Escrow and Polling components of Decision Engine. The document also lists the interactions between the ECW, POLL and DAS.
Escrow Service:
An Escrow Service is a background service which provides an alternative way to invoke the Decision Engine Execute Service without having to provide all the data required to execute a model. The Escrow service can use pollers to proactively fetch data from the required sources so that the calling application only needs to provide the data that is not already available somewhere.
Use Cases
The main use cases are as follows, and this is what each of DE's components will do at a high level:
Escrow Service will use the AuthId of the user to resolve the Entity Key (EK) for the user's current tax year filing data. The resolved EK will be stored in the Customer Data Cache.
Escrow Service will pass along the AuthId, EK, the Model Requirements and the baseline data state (which is initially empty) to the Poller.
The Poller will talk to specified sources to fetch the actual data. Whenever the Poller detects a change in the data compared to the baseline data, it sends a Data Change event to the JMS along with the details of the new data values. After successful transmission to the JMS, the Poller holds the current state of the data as the new baseline.
Whenever the Escrow Service figures out that it has enough data to execute a model, it will execute the model and pass the data to DAS for storage. The insights document will be created if required and the node specific to this decision key will be added or updated (depending on if this is the first time it is being added) whenever the decision is generated. If the decision node is created up front, there is a risk that the app will consume the empty node and mark it as consumed, so the decision node must only be created/updated after the decision has been generated.
We will have 2 types of models from the point of view of mutability of decisions. In one case, the decisions are immutable once they have been consumed and may not be updated. The other type is where the decision might be updatable based on new data we know about the user. For the class of models where the decision is immutable, once Escrow detects that the decision has been consumed, it can store this data in CDC so that subsequent heartbeats for this user don't try to create a new Escrow worker or poller for a user/model combination when the original decision has already been consumed.
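The Poller's baseline comparison from the use cases above can be sketched as a single polling step. The function name, the event shape and the `fetch`/`publish` callables are hypothetical stand-ins for the data sources and the JMS; the one behavior taken from the text is that the baseline advances only after the change event has been transmitted successfully.

```python
def poll_once(fetch, baseline, publish):
    """Run one polling step and return the baseline to use next time.

    fetch:    zero-argument callable returning the current {field: value}
    baseline: the {field: value} state last transmitted successfully
    publish:  callable that sends an event and raises if transmission fails
    """
    current = fetch()
    changed = {
        field: value
        for field, value in current.items()
        if field not in baseline or baseline[field] != value
    }
    removed = [field for field in baseline if field not in current]
    if not changed and not removed:
        return baseline  # nothing new, so no event is sent
    publish({"type": "DataChange", "values": changed, "removed": removed})
    # Reached only if publish did not raise: adopt the new baseline.
    return dict(current)
```

If `publish` raises, the exception propagates and the caller keeps the old baseline, so the same change is detected and retried on the next poll.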
When reading the sequence diagram below, when a call to DAS is made, read AuthId as {AuthId.Ticket} as parameters to DAS.
First Time a User Logs In
Sequence Diagram Escrow Worker/Record created for a user/profile scoped writeback for
Sequence Diagram
Model in requirements contains 1 or more tax requirements that require EK to be resolved before DAS queries can be made
Sequence Diagram
Error Handling
Since the execution of the model can be potentially asynchronous and the invocation from a client depends on the success of our data fetching pollers, we can have scenarios where the client provides us all the data they are supposed to, but we cannot execute the model. At runtime, if the client invokes the Escrow Service but we are missing any of the required fields as listed in the config document, the service will respond back with an error. If the missing fields are pieces of data that the client was supposed to provide, then an HTTP 400 bad request is returned. If there is a required field which was supposed to be fetched from another source but has not yet been fetched (either because it's not available or the poller has not successfully fetched it and given it back to the Escrow Service), then the service needs to respond back with a 500 internal server error with an error message which explicitly mentions the reason for the error and lists the required fields which are missing.
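The two error cases can be sketched as a single validation step. The function name and the shape of `required_sources` (field name mapped to either `"client"` or another source) are hypothetical, and giving the 400 case precedence when both kinds of field are missing is an assumption, since the text does not say which error wins.

```python
def validate_request(required_sources, client_data, fetched_data):
    """Return (status_code, message) for an Escrow Service invocation.

    required_sources: {field: "client"} for fields the caller must supply,
                      or any other source name for fields a poller fetches
    client_data:      fields supplied by the calling client
    fetched_data:     fields the pollers have delivered so far
    """
    missing_client = sorted(
        field
        for field, source in required_sources.items()
        if source == "client" and field not in client_data
    )
    if missing_client:
        # The client failed to provide data it was supposed to provide.
        return 400, "missing client-provided fields: " + ", ".join(missing_client)

    missing_fetched = sorted(
        field
        for field, source in required_sources.items()
        if source != "client" and field not in fetched_data
    )
    if missing_fetched:
        # Data owed by another source has not been fetched yet.
        return 500, "required fields not yet fetched: " + ", ".join(missing_fetched)

    return 200, "ok"
```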
Escrow Worker Config Document
Each escrow worker has a config document that defines its behavior: how it gets its input, the model to invoke, and how it communicates the data to the consumers of the decision.
Escrow Config Document Management
The Escrow config document exposes the contract of the Escrow Service for the model that it's exposing. It is related to the model's major version and needs to change along with it. It also has a life cycle of its own and can change on its own even for the same major version of the model. A representation of the relationships is shown below:
Note that the Escrow Config is an entirely separately managed document from the Model. Importantly, an Escrow Config is independently versioned and published. This allows the isolation of the changes to a Model from the changes to an Escrow Config and vice versa. Consider the following use case:
Version 1.3 of a Model document requires a zipCode and a useragent. Version 1.1 of the Escrow Config is specified to retrieve the zipCode from the 'taxReturn' provider and the useragent from the 'local' provider, meaning that the useragent value is exposed to the calling client as a requirement for executing the model. At some point in the future, the useragent value is persisted to a document and retrievable through the triageTaxProfile document. In this scenario, a new version of the Escrow Config is produced that moves the useragent from the 'local' provider to the 'triageTaxProfile' provider. This materially affects the way that a client would interface with the Escrow Worker; thus, the Escrow Config's version is bumped to 2.0 while the version of the Model remains at 1.3.
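The use case above can be made concrete with a small sketch that derives the client-facing contract from a config. The dictionary layout used here is hypothetical and is not the actual Escrow Config schema; only the field names, provider names and version numbers come from the example.

```python
def client_required_fields(config):
    """Fields the calling client must supply: those mapped to 'local'."""
    return {
        field
        for field, provider in config["providers"].items()
        if provider == "local"
    }


def changes_client_contract(old_config, new_config):
    """True when the set of client-supplied fields differs between configs."""
    return client_required_fields(old_config) != client_required_fields(new_config)


config_v1_1 = {
    "version": "1.1",
    "modelVersion": "1.3",
    "providers": {"zipCode": "taxReturn", "useragent": "local"},
}
config_v2_0 = {
    "version": "2.0",
    "modelVersion": "1.3",
    "providers": {"zipCode": "taxReturn", "useragent": "triageTaxProfile"},
}
```

Here `changes_client_contract(config_v1_1, config_v2_0)` is true because the client no longer has to send `useragent`, which is the kind of change the text treats as warranting a new major version of the config while the model version stays at 1.3.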
DAS Interactions
DE will talk to DAS through 2 components: the Escrow Worker and the Poller.
Escrow will talk to DAS once to "resolve" the EK. It will get all the returns and figure out which is the current tax year and the corresponding EK that must be used in the URI to get the data entered for this tax year's returns.
Escrow will pass the EK to the Poller to use for its subsequent calls to get the actual data. Escrow will make read and update calls to DAS to store the Decisions generated by the EXE service.
Primary use cases with DAS are described below.
Poller Requesting Data from DAS
When the poller is requesting data from DAS, it wants to fetch multiple fields. We do not want to overload DAS with too many requests, so we'll use a feature that allows us to request more than 1 field in the same call.
The response will have nodes with the UUID passed by the caller to help it reconcile the fields in the request to nodes in the response.
ECW Worker Updating Decision
When the Escrow Worker is updating a decision, it needs to check that the decision has not already been consumed by the client (this is for use cases where the decision cannot be updated, such as an experiment allocation decision). This check is vulnerable to a race condition: the Escrow Worker checks whether the decision has been consumed, gets the response false, and then computes the decision and writes the update to DAS, but right after the Escrow Worker checked the status of the consumed flag, the client read the field and updated the flag to true. In this case, the Escrow Worker will end up updating a decision that has been consumed, changing data that the client has already seen.
To avoid this case, DAS will be providing us a way to send a processing instruction as part of the request, under which the update is applied only if the flag has not been set. The request may be based on, for example:
The response on success can be based on, for example:
Client Consuming Decision and updating decision
When the client is consuming a decision, it needs to set a flag in the document to mark that it has consumed the decision. The client can do this in 1 call to make it an atomic operation. DAS supports a feature where the caller can send across an array of instructions for processing which has multiple GETs and PUTs. The request may be based on, for example:
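Independently of the DAS request format, the two guarantees described in this section (update only if unconsumed, and read-and-mark as one atomic step) can be illustrated with a minimal in-memory sketch. The class and method names are hypothetical, and a lock stands in for the processing instructions that DAS would apply on the server side.

```python
import threading


class DecisionStore:
    """In-memory stand-in for the decision node stored per decision key."""

    def __init__(self):
        self._lock = threading.Lock()
        self._decisions = {}  # key -> {"value": object, "consumed": bool}

    def update_if_not_consumed(self, key, value):
        """Worker side: write the decision unless it was already consumed.

        The check and the write happen under one lock, so a client cannot
        consume the decision between them.
        """
        with self._lock:
            node = self._decisions.get(key)
            if node is not None and node["consumed"]:
                return False
            self._decisions[key] = {"value": value, "consumed": False}
            return True

    def consume(self, key):
        """Client side: read the decision and mark it consumed in one step."""
        with self._lock:
            node = self._decisions.get(key)
            if node is None:
                return None
            node["consumed"] = True
            return node["value"]
```

Once `consume` has returned a value for a key, every later `update_if_not_consumed` call for that key returns `False`, so the client never has a decision changed underneath it.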

Claims

What is claimed is:
1. A computerized tax return preparation system, comprising:
an on-line tax return preparation application accessible by respective user computing devices executing respective browsers to prepare respective electronic tax returns of respective users, the on-line tax return preparation application being configured to write respective electronic tax return data of respective users to a first data store; and
a tax return topic ranking system in communication with the on-line tax return preparation application and the first data store, the tax return topic ranking system being operable independently of the on-line tax return preparation application and configured to retrieve from the first data store specified types of electronic tax return data of respective users logged into the tax return preparation application, create respective escrow records for respective users including respective electronic tax return data of respective specified types of respective logged-in users, store respective escrow records to a second data store, and when the specified electronic tax return data of at least one escrow record satisfies pre-determined criteria, execute a model utilizing the specified electronic tax return data of the at least one escrow record to generate a tax return topic ranking and provide the generated tax return topic ranking to the online tax return preparation application,
the on-line tax return preparation application being further configured to generate an interview screen including tax return topics structured according to the generated tax return topic ranking, wherein the interview screen is presented through a screen of a computing device of the logged in user associated with the at least one escrow record.
2. The system of claim 1 , the tax return topic ranking system further comprising a third data store different from the first data store and the second data store, the tax return topic ranking system being configured to store the generated tax return topic ranking to the third data store before the generated tax return topic ranking is provided to the on-line tax return preparation application.
3. The system of claim 2, wherein the on-line tax return preparation application can write data to and read data from the first data store, but the online tax return preparation application cannot write data to or read data from the second data store or the third data store.
4. The system of claim 1 , wherein the on-line tax return preparation application can write data to and read data from the first data store, but the online tax return preparation application cannot write data to or read data from the second data store.
5. The system of claim 1 , the model being a predictive model selected from the group consisting of logistic regression; naive bayes; k-means classification; K-means clustering; other clustering techniques; k-nearest neighbor; neural networks; decision trees; random forests; boosted trees; k-nn classification; kd trees; generalized linear models and support vector machines.
6. The system of claim 1 , wherein multiple models are available for execution for an individual logged-in user.
7. The system of claim 1 , where multiple models are published or activated for execution by the tax return topic ranking system, and the second data store includes respective escrow records generated for respective models for each logged in user.
8. The system of claim 1, the tax return topic ranking system being configured to receive from the online tax return preparation application data indicating that a user has logged into the online tax return preparation application, wherein the tax return topic ranking system is activated for a user in response to the data indicating that the user has logged on to the online tax return preparation application.
9. The system of claim 1 , the tax return topic ranking system being configured to receive from the online tax return preparation application data indicating that a user has logged off from the online tax return preparation application, wherein the tax return topic ranking system is terminated for that user in response to the data indicating that the user has logged off from the online tax return preparation application.
10. The system of claim 1 , the tax return topic ranking system being configured to receive a request for a tax return topic ranking from the on-line tax return preparation application during preparation of the electronic tax return for a logged-in user.
11. The system of claim 10, in response to the request, the tax return topic ranking system being configured to:
determine identification data of the logged in user;
access a third data store different from the first data store and the second data store to determine whether a tax return topic ranking was previously generated for the identified logged-in user.
12. The system of claim 11 , when a tax return topic ranking is not stored in the third data store for the identified user, the tax return ranking system being configured to determine whether escrow record data of the identified user in the second data store satisfies pre-determined criteria.
13. The system of claim 10, when the identified user's escrow record data in the second data store satisfies pre-determined criteria for a model, the tax return topic ranking system being configured to execute the model utilizing the identified user's escrow record data to generate a tax return topic ranking, and provide the generated tax return topic ranking to the on-line tax return preparation application in response to the request.
14. The system of claim 10, when the identified user's escrow record data in the second data store does not satisfy pre-determined criteria for a model, the tax return topic ranking system being configured to wait for the escrow record to be updated such that the pre-determined criteria is satisfied, and the corresponding model can be executed.
15. The system of claim 14, the tax return topic ranking system being configured to respond to the request by notifying the online tax return preparation application that no generated tax return topic ranking is available.
16. The system of claim 11 , the tax return topic ranking system being configured to retrieve a previously determined tax return topic ranking that was previously determined by execution of a model in response to the request by the on-line tax return preparation application and stored in the third data store, the previously determined tax return topic ranking being provided to the online tax return preparation application in response to the request.
17. The system of claim 1 , wherein a generated tax return topic ranking is presented to at least one logged-in user but not all logged-in users of the on-line tax return preparation application.
18. The system of claim 1 , wherein multiple users are logged in to the tax return preparation application, and respective escrow records are generated based on respective portions of user data in the first data store and stored to the second data store.
19. The system of claim 18, wherein multiple models are activated or published for execution by the tax return topic ranking system, and wherein, for each logged-in user, an escrow record is generated for each user-model combination and includes respective specified types of electronic tax return data associated with respective models.
20. The system of claim 1 , wherein the tax return topic ranking system is configured to execute different models based on respective escrow records of respective logged-in users.
21. The system of claim 20, wherein a first model is executed for a first escrow record of a first user to generate a first tax return topic ranking, and a second model different from the first model is executed for a second escrow record of a second user to generate a second tax return topic ranking different from the first tax return topic ranking.
22. The system of claim 1, wherein the escrow record data satisfying the predetermined criteria for model execution is electronic tax return data stored in a single source, the single source comprising the second data store.
23. The system of claim 1, wherein the escrow record data satisfying the predetermined criteria for model execution is electronic tax return data from multiple sources, at least one of the multiple sources comprising the second data store.
24. The system of claim 1, at least one of the multiple sources comprising the online tax return preparation application, wherein data of a request for a tax return topic ranking transmitted by the on-line tax return preparation application is stored to the second data store and satisfies at least a portion of the pre-determined criteria.
25. The system of claim 1, the tax return topic ranking system being further configured to monitor the first data store for changes in the pre-determined types of electronic tax return data of the logged-in user.
26. The system of claim 25, the tax return topic ranking system being configured to modify a logged-in user's escrow record in the second data store based at least in part upon a change of the logged-in user's data in the first data store.
27. The system of claim 25, wherein an escrow record is modified by deleting electronic tax return data from the escrow record based on the electronic tax return data being deleted from the first data store, replacing electronic tax return data in the escrow record with different electronic tax return data retrieved from the first data store, or adding electronic tax return data retrieved from the first data store to the escrow record.
28. The system of claim 25, the tax return topic ranking system comprising a polling service configured to periodically check the first data store for changes in the specified types of electronic tax return data.
29. The system of claim 26, the tax return topic ranking system being configured to determine whether electronic tax return data of a modified escrow record satisfies the pre-determined criteria, and when the electronic tax return data of the modified escrow record satisfies the pre-determined criteria, re-execute the model utilizing the updated electronic tax return data of the modified escrow record, re-execution of the model generating a different tax return topic ranking.
30. The system of claim 26, the tax return topic ranking system being configured to determine whether electronic tax return data of a modified escrow record satisfies second pre-determined criteria associated with a second model different from a first model, and when the electronic tax return data of the modified escrow record satisfies the second pre-determined criteria, execute a different, second model utilizing the electronic tax return data of the modified escrow record, execution of the second model generating a different, second tax return topic ranking.
31. The system of claim 1, the tax return topic ranking system being configured to retrieve a portion of the electronic tax return data of a logged-in user that is stored in the first data store for an escrow record to be generated and stored in the second data store.
32. The system of claim 1, the tax return topic ranking system being configured to execute the model asynchronously relative to the on-line tax return preparation application writing electronic tax return data to the first data store.
33. A computerized tax return topic ranking system operable with an on-line tax return preparation application that is accessible by respective user computing devices executing respective browsers to prepare respective electronic tax returns of respective users, the on-line tax return preparation application being configured to write respective electronic tax return data of respective users to a first data store, the computerized tax return topic ranking system comprising:
an escrow controller in communication with the on-line tax return preparation application, the escrow controller being configured to receive data from the online tax return preparation application indicating that a user has initiated an online session with the online tax return preparation application to prepare an electronic tax return;
a model specification file processed by the escrow controller to determine which types of electronic tax return data are required in order to trigger execution of a model;
a selective data retrieval service in communication with the escrow controller and the first data store utilized by the online tax return preparation application;
a second data store in communication with the escrow controller, the selective data retrieval service being configured to retrieve the logged in user's electronic tax return data of types identified by the model specification file and provide the retrieved electronic tax return data to the escrow controller, the escrow controller being configured to generate an escrow record for the logged in user, the escrow record including the retrieved electronic tax return data;
a model processor, wherein the escrow controller is configured to read an escrow record from the second data store, and when the escrow record includes types of electronic tax return data identified by the model specification file to trigger the model, call the model processor to execute a model identified in the model specification file, wherein a result of execution of a model is generation of a ranking of tax return topics that is provided to the online tax return preparation application; and
a third data store, wherein the escrow controller is further configured to receive the tax return topic ranking from the model processor, store the tax return topic ranking to the third data store, and when the online tax return preparation application issues a request to the escrow controller for a tax return topic ranking, provide the tax return topic ranking stored in the third data store to the online tax return preparation application in response to the request.
34. The system of claim 33, the selective data retrieval service comprising a polling service.
35. The system of claim 33, the selective data retrieval service comprising a native element of the first data store that notifies the escrow controller of changes in specified types of electronic tax return data for logged-in users via a continuous feed.
36. The system of claim 33, the selective data retrieval service being further configured to monitor the first data store for changes in the user's electronic tax return data in the first data store and notify the escrow controller of detected changes, the escrow controller modifying an escrow record for a user stored in the second data store to reflect electronic tax return data changes for the user in the first data store.
37. A computerized tax return preparation system, comprising:
an on-line tax return preparation application that is accessible by respective user computing devices executing respective browsers to prepare respective electronic tax returns of respective users, the on-line tax return preparation application being configured to write respective electronic tax return data of respective users to a first data store;
an escrow controller in communication with the on-line tax return preparation application, the escrow controller being configured to receive data from the online tax return preparation application indicating that a user has initiated an online session with the online tax return preparation application to prepare an electronic tax return;
a model specification file processed by the escrow controller to determine which types of electronic tax return data are required in order to trigger execution of a model;
a selective data retrieval service in communication with the escrow controller and the first data store utilized by the online tax return preparation application;
a second data store in communication with the escrow controller, the selective data retrieval service being configured to retrieve the logged in user's electronic tax return data of types identified by the model specification file and provide the retrieved electronic tax return data to the escrow controller, the escrow controller being configured to generate an escrow record for the logged in user, the escrow record including the retrieved electronic tax return data;
a model processor, wherein the escrow controller is configured to read an escrow record from the second data store, and when the escrow record includes types of electronic tax return data identified by the model specification file to trigger the model, call the model processor to execute a model identified in the model specification file, wherein a result of execution of a model is generation of a ranking of tax return topics that is provided to the online tax return preparation application; and
a third data store, wherein the escrow controller is further configured to receive the tax return topic ranking from the model processor, store the tax return topic ranking to the third data store, and when the online tax return preparation application issues a request to the escrow controller for a tax return topic ranking, provide the tax return topic ranking stored in the third data store to the online tax return preparation application in response to the request.
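
The data flow recited in claims 33 and 37 (selective retrieval into an escrow record, a criteria check against the model specification, model execution, and serving the stored ranking on request) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the applicant's implementation: the names ModelSpec, EscrowController, sync, request_ranking and toy_model are hypothetical, the three data stores are modeled as in-memory dictionaries, and the "model" is a trivial rule rather than a trained predictive model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ModelSpec:
    """Hypothetical stand-in for the model specification file."""
    name: str
    required_types: List[str]                      # data types that trigger the model
    run: Callable[[Dict[str, object]], List[str]]  # returns a topic ranking


@dataclass
class EscrowController:
    """Hypothetical controller; the three stores are plain dicts keyed by user id."""
    spec: ModelSpec
    first_store: Dict[str, Dict[str, object]]      # written by the tax application
    second_store: Dict[str, Dict[str, object]] = field(default_factory=dict)  # escrow records
    third_store: Dict[str, List[str]] = field(default_factory=dict)           # rankings

    def sync(self, user_id: str) -> None:
        """Selective retrieval: copy only the spec's data types into the escrow record."""
        user_data = self.first_store.get(user_id, {})
        # Rebuilding the record reflects additions, replacements and deletions.
        record = {k: user_data[k] for k in self.spec.required_types if k in user_data}
        self.second_store[user_id] = record
        # Execute the model only when every required data type is present.
        if all(k in record for k in self.spec.required_types):
            self.third_store[user_id] = self.spec.run(record)

    def request_ranking(self, user_id: str) -> Optional[List[str]]:
        """Serve a stored ranking, or None when no ranking has been generated."""
        return self.third_store.get(user_id)


def toy_model(record: Dict[str, object]) -> List[str]:
    """Hypothetical rule: rank 'mortgage interest' first for homeowners."""
    topics = ["charitable donations", "mortgage interest", "student loans"]
    if record["owns_home"]:
        # Stable sort: the matching topic moves to the front, others keep their order.
        topics.sort(key=lambda t: t != "mortgage interest")
    return topics


if __name__ == "__main__":
    first_store = {"u1": {"filing_status": "single", "name": "A"}}
    spec = ModelSpec("toy", ["filing_status", "owns_home"], toy_model)
    controller = EscrowController(spec, first_store)

    controller.sync("u1")
    print(controller.request_ranking("u1"))   # None: criteria not yet satisfied

    first_store["u1"]["owns_home"] = True     # the application writes more data
    controller.sync("u1")
    print(controller.request_ranking("u1"))
    # ['mortgage interest', 'charitable donations', 'student loans']
```

In this sketch the sync call plays the role of the polling or feed-driven update of claims 34 to 36, and it runs independently of request_ranking, so a request simply reads whatever ranking has already been stored. Note that the sketch keeps a previously stored ranking if required data is later deleted; the claims do not say how that case is handled.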
EP15907552.2A 2015-10-30 2015-10-30 Escrow personalization system Withdrawn EP3369069A4 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2015/058504 WO2017074468A1 (en) 2015-10-30 2015-10-30 Escrow personalization system

Publications (2)

Publication Number Publication Date
EP3369069A1 (en) 2018-09-05
EP3369069A4 (en) 2019-04-03

Family

ID=58630911

Family Applications (1)

Application Number Title Priority Date Filing Date
EP15907552.2A Withdrawn EP3369069A4 (en) 2015-10-30 2015-10-30 Escrow personalization system

Country Status (4)

Country Link
EP (1) EP3369069A4 (en)
AU (1) AU2015412785A1 (en)
CA (1) CA2994232C (en)
WO (1) WO2017074468A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113077242B (en) * 2021-04-30 2024-03-01 重庆市能源投资集团科技有限责任公司 Science and technology project reporting management system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002007710A (en) * 2000-06-26 2002-01-11 Miyabe Shinji System and method for processing final tax return work
US7627504B2 (en) * 2002-10-31 2009-12-01 Thomson Reuters (Tax and Accounting) Services, Inc. Information processing system for determining tax information
US7636886B2 (en) * 2003-04-24 2009-12-22 Sureprep Llc System and method for grouping and organizing pages of an electronic document into pre-defined categories
US7539635B1 (en) * 2003-12-29 2009-05-26 H&R Block Tax Services, Llc. System and method for generating a personalized tax advice document
US8204805B2 (en) * 2010-10-28 2012-06-19 Intuit Inc. Instant tax return preparation

Also Published As

Publication number Publication date
CA2994232A1 (en) 2017-05-04
AU2015412785A1 (en) 2018-02-15
EP3369069A4 (en) 2019-04-03
CA2994232C (en) 2021-02-16
WO2017074468A1 (en) 2017-05-04

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20180327

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
RIN1 Information on inventor provided before grant (corrected)

Inventor name: BAKER, TRISTAN COOPER

A4 Supplementary search report drawn up and despatched

Effective date: 20190306

RIC1 Information provided on ipc code assigned before grant

Ipc: G06Q 50/26 20120101ALI20190227BHEP

Ipc: G06Q 40/02 20120101ALI20190227BHEP

Ipc: G06Q 40/00 20120101AFI20190227BHEP

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20191023

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Effective date: 20200826