GB2593890A - Automated determining of data targets in a website to enable actions in web browsers relating to the data targets

Publication number
GB2593890A
Authority
GB
United Kingdom
Prior art keywords
data
website
event
code
page
Legal status
Pending
Application number
GB2005037.3A
Other versions
GB202005037D0 (en
Inventor
Samuel Phillips Paul
Current Assignee
Cloudiq Ltd
Original Assignee
Cloudiq Ltd
Application filed by Cloudiq Ltd
Priority to GB2005037.3A
Publication of GB202005037D0
Priority to PCT/EP2021/058978 (WO2021204828A1)
Publication of GB2593890A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9577 Optimising the visualization of content, e.g. distillation of HTML documents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0269 Targeted advertisements based on user profile or attribute
    • G06Q30/0271 Personalized advertisement

Abstract

A method comprises automatically determining (900), by a configuration system, identifiers of data targets in a website. Each data target is one of a plurality of predetermined data target types. The configuration system then generates (902) computer program code including the data target identifiers. The generated code is for execution by a web browser on a user device to enable the web browser to perform at least one predetermined action relating to the identified data targets. The configuration system then sends (906) the code to the web browser. Generating the code may comprise including, for each of the identified data targets, an identifier of a predetermined event, wherein the at least one action comprises detecting the identified events occurring in relation to the respective data targets, each event corresponding to a predetermined user action. The data target types may include first data target types being types of webpage. The first data target types may comprise one or more of a product page, checkout page, payment confirmation page, cart page, login page and category page. The data target types may comprise second data target types such as a product, product image, buttons, product price, user display name and the like.

Description

AUTOMATED DETERMINING OF DATA TARGETS IN A WEBSITE TO ENABLE ACTIONS IN WEB BROWSERS RELATING TO THE DATA TARGETS
Field of the Invention
The invention relates to a method of automatically determining identifiers of data targets in a website and generating computer program code including the identifiers, where the code can be provided to web browsers to enable the browsers to perform at least one action relating to the data targets when a respective user is navigating the website. The invention also relates to a related computer program product.
Background of the Invention
It is often wanted to obtain information about user actions on a website. In order for such information to be obtained, code of webpages of the website may be provided with an HTML tag. When a user causes a web browser to load one of the tagged webpages, the browser processes the HTML tag to cause the browser to download a preconfigured JavaScript file. The browser then executes the downloaded file to result in the browser determining when predetermined events relating to the website that are specified in the file occur. Alternative methods of causing a browser to download such a JavaScript file are known in the art.
For the JavaScript file to be executable to result in the browser doing this, the JavaScript file specifies specific events relating to specific elements in the DOM (Document Object Model) of the website. For example, the JavaScript file may specify an event for an element in the form of a selection event by a cursor (i.e. an "onclick" event in JavaScript) on that element.
Execution of the JavaScript file can thus result in the browser detecting when a user has selected that element.
The JavaScript file may also be configured so that the browser, after detecting that one of the specified events has occurred, determines available wanted information and sends the information to a server. Thus, the JavaScript file may also specify the information that the browser is to capture after a particular event is detected. Such information may include, for example, time of detection of the event, URL of the relevant webpage, and information identifying other predetermined elements on the webpage.
A server may thus receive a stream of information from a web browser relating to events on the website. The information may be indicative of a user's navigation through the website.
Such information may be particularly useful, for example, where a user of a website adds products to an online shopping cart but leaves the website without purchasing the products, since the identity of the products may be derivable from the information and marketing of those products to the user may be possible. Such information may also be useful for website optimisation.
Configuration of the JavaScript file so that the browser is able to determine when predetermined events have occurred is conventionally laborious. Conventionally, elements for which events are to be detected are manually identified in a process involving manually inspecting an element to find an identifier of the element and then manually configuring the JavaScript file to identify the elements and the events relating to the identified elements. For example, a JavaScript file can be configured for a website on which a product can be bought to specify that the browser is to detect when a corresponding add-to-basket element on a webpage is selected.
Where the JavaScript file is to be configured to specify information that is to be captured and sent to a server, this is again conventionally done manually in a process involving inspecting an element about which information is wanted and specifying the element in the file. For example, elements showing price, product title, product description and product image may all be manually identified and identifiers of these elements specified in the file.
It is desirable to reduce the labour involved in configuring the JavaScript code. Also, when changes are made to the website, for example a new product is added for sale, the JavaScript file may no longer function correctly and may have to be manually reconfigured, which is also laborious. It is an object of the present invention to address these problems.
Summary of the Invention
According to a first aspect of the present invention, there is provided a method, comprising: automatically determining, by a configuration system, identifiers of data targets in a website, each data target being one of a plurality of predetermined data target types; generating computer program code including at least some of the data target identifiers, the generated code being suitable for execution by web browsers to enable the respective web browser to perform at least one predetermined action relating to the identified data targets; and sending the code to the web browser.
Thus, the code can be configured for a website automatically. This advantageously saves on labour, thereby reducing cost. Human error in the code is also avoided.
According to a second aspect of the present invention, there is provided a computer-implemented method comprising: automatically generating, for a website, computer program code including event processing instructions; sending the code to browsers running at user devices, the code being such that the browsers execute the code to configure event detectors using the event processing instructions and to cause sending of event data to a server further to the browser detecting an event due to one of the event detectors; receiving the event data and determining, based at least on the event data, that at least one stored rule is breached; and, further to the determining that the at least one rule is breached, automatically generating the code again. Thus, the method serves in monitoring of the correctness of the event processing instructions and/or in verification of the code after the code has been generated. The generation of the code, initially or again, may be in accordance with the method of the first aspect.
According to a third aspect of the present invention, there is provided a method comprising: inputting code of a webpage that includes at least one data target into a predetermined model for a data target type, wherein the model is configured to determine an identifier for at least one data target of the data target type and to generate an output indicative of the identifier; configuring event processing instructions to include the identifier and a predetermined event, wherein execution of the event processing instructions by a browser results in at least one of: an event detector configured to detect at least one user action relating to the data target identified by the identifier at a browser; and data targets to be captured following detection of an event.
According to other aspects of the present invention, there is provided data processing apparatus comprising: processing means; and memory means having computer program code stored thereon, where the processing means and the memory means with the computer program code are configured to cause the apparatus to perform the method of any one of the first, second and third aspects.
According to yet other aspects of the present invention, there is provided a computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out the method of any one of the first, second and third aspects.
According to yet other aspects of the present invention, there is provided a computer-readable storage medium comprising instructions which, when executed by a computer, cause the computer to carry out the method of any one of the first, second and third aspects.
Brief Description of Figures
Embodiments of the present invention will now be described in the following, by way of example only, with reference to the accompanying Figures in which:
Figure 1 is a block diagram of a system environment and components in which embodiments of the invention may be implemented;
Figure 2 is a flowchart indicating steps by which the configuration system determines to begin configuring a script;
Figure 3 is a table containing reference tags that may appear in code of a website that reference particular elements;
Figure 4 is a flowchart indicating outline steps in determining element identifiers for which it is wanted to receive event data on predetermined events using at least one of a number of possible processes;
Figure 5 is a flowchart indicating steps in one of the processes indicated in Figure 4;
Figure 6 is a flowchart indicating steps in another of the processes indicated in Figure 4;
Figure 7 is a flowchart indicating steps in yet another of the processes indicated in Figure 4;
Figure 8 is a flowchart indicating steps in monitoring of proper operation of the code;
Figure 9 is a flow diagram indicating general steps in accordance with embodiments of the invention; and
Figure 10 is a block diagram indicating components of a server on which embodiments of the invention may be implemented.
Detailed Description of Embodiments
A website with which embodiments of the invention may be used may comprise one or multiple webpages, with the or each webpage containing one or more content items. Each content item relates to a specific product offered for sale, or to a category of products such that the user can conveniently navigate the website. Where a webpage is primarily directed to a single content item, that page is referred to herein as a "product page". A content item on a product page includes product description data that has several parts. The parts include one, greater than one or all of: a product title, an image of the product, a product description and a price of the product. The product description data is not limited to these and in some embodiments may include other information. A webpage directed to a plurality of content items is referred to herein as a "category page". On a category page, one or more of the parts of each content item is typically selectable to navigate to a corresponding product page. Some websites do not include a category page.
In addition, such a website includes selectable buttons enabling a product to be added to an online shopping cart and removed from the online cart ("add-to-cart" and "remove-from-cart" buttons). An add-to-cart button is present on product pages to enable adding of the corresponding product to the online shopping cart and a remove-from-cart button may be present on product pages of some websites to enable removal of the corresponding product, although the remove-from-cart button may alternatively be absent and this functionality otherwise provided on the website. An add-to-cart button and remove-from-cart button for a product may also be present in association with a respective product on a category page.
Websites offering products for sale also have pages other than category and product pages; for example, they may have any one or more of: a login page, a registration page, a cart page, a checkout page, a payment confirmation page and a wish list page. The cart page is where the contents of the online cart are displayed. The checkout page is where payment and address details are input. The payment confirmation page is where payment is confirmed. The website registration page is where users can register with the site.
In order to gain insight into a user's navigation through a website, events have to be detected, and then information that is wanted further to these events having occurred has to be captured.
Such information is referred to herein as "event data". In order to detect occurrence of the events at a web browser, the browser has to be configured with event detectors. Each event detector is configured to detect a particular type of event relating to a particular data target.
The term "data target type" is used herein to refer to a predetermined type of data target. The data target types include, for example, data targets that are elements on a webpage of a website that relate to display of, respectively, different types of product description data (that is, product title, product image, product price, et cetera, are different data target types), the add-to-cart and remove-from-cart buttons, online forms in which text can be entered by a user, and elements that relate to display of a user's name. Such data target types are referred to as "second" data target types.
The term "data target type" is also used herein to refer to predetermined types of webpage of the website ("first data target types"). For example, the webpage types may include: product page, category page, cart page, checkout page, payment confirmation page, login page, website registration page and wish list page. The first data target types may include one, greater than one or all of these and are not limited to these. One or more other webpage types may be defined, additionally or alternatively.
An event detector may be configured to detect any one of the following types of events: a user selection event on a particular data target of the second type; a page load event relating to a particular data target of the first type; a mouse over, mouse out or like event relating to a particular data target of the second type; a text input event relating to a particular online form (or field thereof). An event detector is not limited to being configured to detect one of these events. Other events may be detected.
In order to configure the event detectors at a web browser a script is sent to the web browser in a configuration file. The script comprises event processing instructions. The web browser is configured to execute the event processing instructions to set the event detectors. The event processing instructions are also configured to be executable by the browser to cause capture of event data, which is sent to a server further to triggering of any of the event detectors.
Embodiments of the invention relate to automated generation of this configuration file for a website. Referring to Figure 9, in embodiments identifiers of data targets of each of the data target types in a website are automatically determined at step 900 and stored in association with the corresponding data target type. At step 902, a script including the event processing instructions is then generated, where the event processing instructions are generated based at least on the determined data target identifiers and predetermined rules. At step 904, a configuration file is created including the script. At step 906, when a user begins to navigate the website, the configuration file is sent to the respective web browser.
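By way of illustration only, the flow of steps 900 to 906 might be sketched as follows. The function names, the shape of the identifier map, the CSS selectors, the website name and the `window.__eventInstructions` variable are all assumptions made for this example and are not taken from the embodiments; step 900 is hard-coded here rather than derived from a real website.

```javascript
// Hypothetical sketch of steps 900 to 906. All names and selectors are illustrative.

// Step 900: determine identifiers of data targets, keyed by data target type.
// A real system would analyse the website; here the result is hard-coded.
function determineIdentifiers(website) {
  return {
    addToCartButton: "#add-to-cart",
    productTitle: "h1.product-title",
    productPrice: "span.price",
  };
}

// Step 902: generate event processing instructions from identifiers and rules.
function generateScript(identifiers, rules) {
  const instructions = rules
    // Skip rules whose data target type was not found in the website.
    .filter((rule) => identifiers[rule.targetType] !== undefined)
    .map((rule) => ({
      selector: identifiers[rule.targetType],
      event: rule.event,
      capture: rule.capture.map((type) => identifiers[type]).filter(Boolean),
    }));
  // The script is emitted as JavaScript source text for the browser to execute.
  return "window.__eventInstructions = " + JSON.stringify(instructions) + ";";
}

// Step 904: wrap the script in a configuration file record.
function createConfigurationFile(website, script) {
  return { website, contentType: "application/javascript", body: script };
}

const rules = [
  { targetType: "addToCartButton", event: "click", capture: ["productTitle", "productPrice"] },
  { targetType: "removeFromCartButton", event: "click", capture: [] },
];

const identifiers = determineIdentifiers("shop.example");
const configFile = createConfigurationFile("shop.example", generateScript(identifiers, rules));
// Step 906 would send configFile.body to the browser in response to its request.
console.log(configFile.body);
```

In this sketch the rule for the remove-from-cart button produces no instruction because no identifier was determined for that data target type.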
The predetermined rules associate identifiers of data targets of particular data target types with one or more events. Also, the rules may combine data target types of first and second data target types and associate the combination with one or more events, for example such that a data target of a second data target type on a webpage of a first data target type is associated with one or more events.
The rules may also specify the event data that is to be captured after a particular event has been detected at the browser. The event data may be specified using some or all of the determined identifiers of data targets on the web page on which the particular event has been detected. According to the rules, the event data to be captured may be dependent on the particular event, the particular data target type of the data target, or the particular combination of first and second data target types.
Alternatively, the event processing instructions may include one or more event data capture instructions, each associated with a respective plurality of data targets. For example, the event data capture instruction may be to capture information pertaining to all of the data targets of the first type on the webpage on which the event is triggered.
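A rule set of the kind described above might, as one possibility, be held as a simple table. The following is a minimal sketch under stated assumptions: the field names (`elementType`, `pageType`, `events`, `capture`) and the type names are hypothetical, with `elementType` standing for a second data target type and `pageType` for a first data target type.

```javascript
// Hypothetical rule table. Type names and field names are illustrative only.
const rules = [
  // A second (element) data target type on any webpage type.
  { elementType: "formField", pageType: null, events: ["blur"],
    capture: ["enteredText", "pageUrl"] },
  // A combination: an add-to-cart button on a product page.
  { elementType: "addToCartButton", pageType: "productPage", events: ["click"],
    capture: ["productTitle", "productPrice", "productImage", "pageUrl"] },
  // A first (webpage) data target type on its own.
  { elementType: null, pageType: "cartPage", events: ["load"],
    capture: ["productTitle", "pageUrl"] },
];

// Return the rules that apply to a data target. For a webpage-level data
// target, pass null as elementType. A rule whose pageType is null applies
// on every webpage type.
function matchRules(elementType, pageType) {
  return rules.filter(
    (rule) =>
      rule.elementType === elementType &&
      (rule.pageType === null || rule.pageType === pageType)
  );
}

console.log(matchRules("addToCartButton", "productPage").map((r) => r.events));
```

Keeping the events and the event data to be captured in the same record reflects the point that what is captured may depend on the event, the data target type, or the combination of both types.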
A plurality of different automated processes may be performed to determine identifiers of the data targets of each data target type in the website, each process being performed in turn until the identifiers for the data targets are determined. In addition, in some embodiments, the operation of the browser when configured with the event detectors is monitored to determine if the script remains properly configured for the website, which may change for example if changes are made to the website. If the script is no longer properly configured for the site, the script and thus the configuration file can be reconfigured by running the automated processes again.
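The idea of performing each process in turn until the identifiers are determined might be sketched as below. The two processes shown are stand-ins invented for this example (one looks for microdata `itemprop` attributes, the other for a conventional `id` value); the list of required types and all function names are likewise assumptions.

```javascript
// Hypothetical sketch: try each identification process in turn.
// Each process takes page source and returns a map of
// data target type -> identifier for the types it could resolve.
const REQUIRED_TYPES = ["productTitle", "productPrice", "addToCartButton"];

// Process 1 (illustrative): look for known reference tags in the page source.
function byReferenceTags(html) {
  const found = {};
  if (html.includes('itemprop="name"')) found.productTitle = '[itemprop="name"]';
  if (html.includes('itemprop="price"')) found.productPrice = '[itemprop="price"]';
  return found;
}

// Process 2 (illustrative): a simple heuristic on a common id attribute.
function byHeuristics(html) {
  const found = {};
  if (html.includes('id="add-to-cart"')) found.addToCartButton = "#add-to-cart";
  return found;
}

function resolveIdentifiers(html, processes) {
  const identifiers = {};
  for (const process of processes) {
    const missing = REQUIRED_TYPES.filter((type) => !(type in identifiers));
    if (missing.length === 0) break; // everything resolved, stop early
    const result = process(html);
    for (const type of missing) {
      if (result[type] !== undefined) identifiers[type] = result[type];
    }
  }
  return identifiers;
}

const html =
  '<h1 itemprop="name">Mug</h1><span itemprop="price">9.99</span>' +
  '<button id="add-to-cart">Add</button>';
console.log(resolveIdentifiers(html, [byReferenceTags, byHeuristics]));
```

Later processes only fill in the data target types that earlier processes left unresolved, so cheaper processes can be placed first.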
Referring to Figure 1, in an embodiment a configuration server 101 hosting a configuration system 100, a hosting server 103 hosting a website hosting system 104, and a plurality of user devices 106 are all operatively connected to a communications network 108. As will be appreciated, although only three user devices 106 are indicated, in practice the number would be much greater. The configuration system 100 is configured to handle all functionality ascribed to it herein, explicitly or implicitly, including generating of the configuration file for a website hosted by the website hosting system 104, provision of the configuration file to user devices 106, and processing of event data received from a browser at each of the user devices 106. In variant embodiments, the functionality of the configuration system 100 may be distributed across multiple servers.
The system 100 includes software applications and data stores that may be provided in accordance with embodiments. The software applications include a configuration module 112 and data processing module 118, and the data stores include a schema store 114, a website data store 115, a rules store 116, a machine learning (ML) models store 117, a user data store 122 and a configuration file store 119. The configuration module 112 is configured to provide the functionality relating to creating of a configuration file ascribed to the configuration system 100 herein. The data processing module 118 is configured to provide functionality relating to handling and storing received event data and marketing. The configuration module 112 and the data processing module 118 are configured to read and/or write to and/or from the data stores, as required to carry out the steps described in the following.
One configuration file is typically generated for each website. Although one website is indicated in Figure 1, in practice the configuration system 100 may generate configuration files for a plurality of websites and each configuration file is stored in the configuration file store 119. In variant embodiments, variant configuration files may be generated and stored for different websites where optimisation is required for different browsers or operating systems.
The hosting system 104 is configured to host a website. Each user device 106 has a client application in the form of a web browser 123 stored on it, configured to execute the configuration file. In the embodiments, the script is JavaScript code and the configuration file is a JavaScript file. The web browser 123 includes a JavaScript engine configured to execute the JavaScript file. In variant embodiments, the script may be written in a programming language other than JavaScript where the web browser 123 is configured to execute such a programming language, or a programming language that compiles to JavaScript where the web browser 123 is configured with a suitable compiler to compile to JavaScript, as would be apparent to a skilled person.
The configuration system 100 and the hosting system 104 are each configured for communication with the user devices 106 via the communications network 108 using HTTP/S in a conventional way. The term "request" as used in relation to the embodiments described below should be considered to mean an HTTP/S request, although embodiments of the invention are not limited to such.
In a conventional way, the web browser 123 on each user device is configured to send requests for webpages to the hosting system 104, typically in response to an input from a user. In response to a request for a webpage from a user device 106, the hosting system 104 is configured to send webpage data to the respective user device 106. The web browser 123 running on the respective user device 106 receives the webpage data and executes the webpage data to display a webpage in a viewport of the browser, which a user of the device 106 can view and with which the user can interact.
The rules store 116 stores the rules. As already mentioned, each rule associates data targets of one or more types with a particular one or more events. The configuration module 112 is configured to generate the event processing instructions based on the rules and the data target identifiers. Each rule includes or references predetermined template code for use in generating the event processing instructions. One or more data target identifiers are passed into the respective template code.
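Passing data target identifiers into template code might, for example, look like the following sketch. The template text, the `{{...}}` placeholder syntax and the `report` function that the generated code calls are assumptions made for illustration; the embodiments do not specify a template format.

```javascript
// Hypothetical template for a rule. {{selector}} and {{event}} are placeholders,
// and `report` is an assumed function available where the generated code runs.
const LISTENER_TEMPLATE =
  "document.querySelectorAll({{selector}}).forEach(function (el) {\n" +
  "  el.addEventListener({{event}}, function () { report({{event}}, {{selector}}); });\n" +
  "});";

// Substitute values into a template. Values are JSON-encoded so that they
// appear in the generated source as correctly quoted string literals.
function renderTemplate(template, values) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, key) => {
    if (!(key in values)) throw new Error("No value for placeholder " + key);
    return JSON.stringify(values[key]);
  });
}

const code = renderTemplate(LISTENER_TEMPLATE, { selector: "#add-to-cart", event: "click" });
console.log(code);
```

Encoding each value with `JSON.stringify` rather than concatenating it raw is a design choice of this sketch, so that an identifier containing quotes cannot break the generated script.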
The rules may also associate data target types and events with event data to be captured.
The event data that is to be captured may be dependent on the data target types of the identified data targets and/or the particular one or more events that are to be detected.
The configuration module 112 is configured to first determine, for a website, identifiers of data targets for each of the predetermined data target types. The configuration module 112 is configured to generate the script to include the event processing instructions, using stored template script. Examples of rules and event processing instructions are as follows:
a) An example rule may associate data targets of the add-to-basket type and a user selection event (an "onclick" event). The generated event detection instructions are such that a web browser will process them to set event detectors to detect user selection of add-to-basket buttons.
b) An example rule may associate data targets of the online form element type and an event being that the user has ceased entering text into the form (an "onblur" event). The event detection instructions are such that a web browser will set an event detector to detect the user ceasing entering text into forms.
c) An example rule may associate every data target of the first data target type and an event configured to trigger each time that one of the webpages is downloaded (an "onload" event). The event detection instructions are such that a web browser will set event detectors to detect new page downloads.
Embodiments of the invention are not limited to these example rules. The rules may be configured to result in event detectors being set by the browser that detect other kinds of event, for example "mouseover", "mouseout" or the like events in JavaScript. The event detectors may be implemented as event listeners in JavaScript. Thus the event processing instructions are configured so that, when executed by the web browser, event listeners are set.
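A minimal sketch of how executed event processing instructions might set event listeners is given below. It assumes that each instruction is a `{ selector, event }` pair and that `root` behaves like the browser's `document` object; the function name, the instruction shape and the selectors are hypothetical.

```javascript
// Hypothetical browser-side sketch. `root` is expected to behave like
// `document` (it must offer querySelectorAll); `instructions` and `onEvent`
// are illustrative names.
function installEventDetectors(root, instructions, onEvent) {
  let count = 0;
  for (const { selector, event } of instructions) {
    for (const element of root.querySelectorAll(selector)) {
      // Each event detector is an ordinary event listener.
      element.addEventListener(event, () => onEvent({ selector, event }));
      count += 1;
    }
  }
  return count; // number of listeners set
}

const instructions = [
  { selector: "#add-to-cart", event: "click" }, // user selection of a button
  { selector: "form input", event: "blur" },    // user ceasing to enter text
];

// In a browser this might be called as:
//   installEventDetectors(document, instructions, sendToServer);
// where sendToServer is an assumed callback that forwards the event data.
```

Taking `root` as a parameter, instead of referring to `document` directly, is a choice made here so that the function can be exercised outside a browser.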
As already mentioned, the rules may associate particular event data to be captured with a particular event to be set and the event processing instructions may specify this. The event data to be captured is not limited to data targets of the second type on the respective webpage, but may include other data, such as time and page URL. Examples of how the rules may configure an event processing instruction to include event capture instructions associated with the above examples a) to c) are given below.
As to a), where an event detector is to be configured to detect user selection of an add-to-basket button on product pages, a rule may specify that on detection of such an event the corresponding product description data (namely, product title, product description, image link, product price, etc) is to be captured. The event data to be captured is not limited to this. Other event data may be captured, for example page URL.
As to b), where the event detector is to be configured to detect the user ceasing entering text into forms, a rule may also specify that text entered into the form is captured. Again, the event data to be captured is not limited to this. Other event data may be captured, for example page URL.
As to c), where an event detector is to be configured to detect new page downloads, on detection of such an event, the corresponding rule also specifies that the URL of the new page is to be captured. In some embodiments, different rules may be specified for different web page types. For example, where the new page is a product page, the product description data for the product on that page is also captured. On the cart page, any product description data displayed that identifies products listed to be bought is captured.
In some embodiments, where a data target for which an event detector is to be set is of the second type, the rule may specify the event data to be captured with reference to the first data target type of the webpage on which the data target is located. For example, certain event data may be captured for all product pages.
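Capture of event data that depends on the webpage type might be sketched as follows. The table contents, the field names and the `readValue` callback (which in a browser could read an element's text) are assumptions for illustration only.

```javascript
// Hypothetical capture rules keyed by webpage type. Names are illustrative.
const CAPTURE_BY_PAGE_TYPE = {
  productPage: ["productTitle", "productDescription", "productImage", "productPrice"],
  cartPage: ["productTitle", "productPrice"],
};

// Build the event data for a detected event.
//   identifiers: data target type -> identifier (e.g. a CSS selector)
//   readValue:   function returning the current value for an identifier,
//                e.g. (sel) => document.querySelector(sel).textContent
function captureEventData(event, pageType, pageUrl, identifiers, readValue) {
  // Time and page URL are captured for every event in this sketch.
  const data = { event, pageUrl, time: new Date().toISOString() };
  for (const type of CAPTURE_BY_PAGE_TYPE[pageType] || []) {
    const identifier = identifiers[type];
    if (identifier !== undefined) data[type] = readValue(identifier);
  }
  return data;
}
```

A page type with no entry in the table, such as a login page in this sketch, yields only the event name, page URL and time.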
Captured event data may, according to some embodiments, be sent to the configuration server 101 for processing by the data processing module 118. Such event data can be stored in the user data store 122 and indicates the navigation history of the user through the website. For example, it can be determined from the received event data when a user has added a product to the online cart but has neither completed a purchase nor removed the product. In this case, where an email address of the user has been captured after the user has input the address into a form, an email can be automatically generated from an email template populated with received product description data, and sent to the user device 106. Thus, the user can be marketed to.
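Determining from received event data that a product was added to the cart but neither removed nor bought could, as one possibility, be sketched as below. The event type names, the record fields and the email wording are hypothetical and chosen only for this example.

```javascript
// Hypothetical sketch: find products added to the cart but neither removed
// nor purchased. Event names and record fields are illustrative.
function abandonedProducts(events) {
  const inCart = new Map(); // productId -> product description data
  for (const e of events) {
    if (e.type === "addToCart") inCart.set(e.productId, e.product);
    else if (e.type === "removeFromCart") inCart.delete(e.productId);
    else if (e.type === "paymentConfirmed") inCart.clear();
  }
  return [...inCart.values()];
}

// Populate a simple email template with the received product description data.
function renderReminderEmail(emailAddress, products) {
  const lines = products.map((p) => "- " + p.title + " (" + p.price + ")");
  return { to: emailAddress, subject: "Items left in your cart", body: lines.join("\n") };
}

const events = [
  { type: "addToCart", productId: "p1", product: { title: "Mug", price: "9.99" } },
  { type: "addToCart", productId: "p2", product: { title: "Teapot", price: "24.00" } },
  { type: "removeFromCart", productId: "p2" },
];
console.log(renderReminderEmail("user@example.com", abandonedProducts(events)));
```

The events are assumed to arrive in the order in which they occurred, since the result depends on replaying them in sequence.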
Event data stored in the user data store 122 can be used for other purposes. For example, the functionality may include web analytics, campaign analytics, audience measurement, user personalisation, A/B testing, ad servers, behavioural retargeting and conversion tracking.
In some embodiments, the event data are not sent to the configuration server 101 but may be sent to an alternative server. The product description data can also be used at the web browser to populate a pop-up or pop-under. For example, the configuration file may be configured to cause the browser to determine when the user is likely to leave the website (in a way disclosed in US8806327 or US8645212 for example) and to cause a pop-up to be displayed within an iframe.
Operation of the system is now described with reference to Figure 2. First, at step 200, the website owner adds a snippet to each webpage of the website from which information on user actions is wanted. The snippet includes a URL for requesting the configuration file. This step may be done manually.
At step 202, the web browser running on a user device 106 sends a webpage request to the hosting server 102 in a conventional way, the webpage request being for webpage data of one of the webpages of the website. This typically occurs through the user entering the URL or the user selecting a link to the website from another website. In response to receiving the request, the hosting system 102 sends the webpage data to the user device 106 at step 204, also in a conventional way. The browser on the user device 106 then causes at step 206 the webpage to be displayed in the viewport of the web browser. The snippet in the webpage data causes the browser to send a request for the configuration file to the configuration system 100.
The configuration system 100 receives the request resulting from the snippet. The request includes at least an IP address identifying the user device and a URL of the website in a header of the request, as well as typically a cookie. The configuration system 100 determines if the request is the first request received from a web browser on any user device 106 for any page of the website at step 207. If so, the configuration system 100 initiates the process of generating the configuration file at step 210. This would typically occur soon after webpages of the website had the snippet added. If the request is not the first request received, but no configuration file is determined to be available at step 208, it is implicit that the configuration process is ongoing. If the configuration file is available, the configuration system 100 retrieves it and sends the file to the browser at step 209.
As already mentioned, the configuration system 100 is configured to identify data targets of the website relating to which information is wanted. Many e-commerce websites use a third party content management system, for example provided under the brand names Magento™, Big Commerce™ and Shopify™. Using such a content management system, the owner of the website can create and manage content items on the website.
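The three outcomes of handling a configuration file request (steps 207 to 210) can be sketched as a small decision routine. The class and method names below are hypothetical, and the in-memory storage is an assumption made purely for illustration.

```python
from enum import Enum


class Action(Enum):
    START_GENERATION = "first request: start generating the file (step 210)"
    GENERATION_ONGOING = "no file yet: configuration process is ongoing"
    SEND_FILE = "file available: send it to the browser (step 209)"


class ConfigurationRegistry:
    """Tracks which websites have been seen and which have a configuration file."""

    def __init__(self):
        self._seen_sites = set()
        self._files = {}

    def store_file(self, site, config_file):
        """Record a finished configuration file for a website."""
        self._files[site] = config_file

    def handle_request(self, site):
        """Decide what to do with a request originating from the given website."""
        if site not in self._seen_sites:
            self._seen_sites.add(site)
            return Action.START_GENERATION
        if site not in self._files:
            return Action.GENERATION_ONGOING
        return Action.SEND_FILE
```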
In websites that use such a content management system, each part of the product description data is typically tagged with reference tags in a way that is standardised for the particular content management system and defined by a respective schema. For example, product description data for a content item stored in the content management system includes a reference tag for each of the various parts of product description data. Different content management systems may use different schemas defining different reference tags for the same part of product description data, as well as information displayed in other elements, such as add-to-cart and remove-from-cart buttons and online forms. Thus, such reference tags correspond to particular data target types. Examples of reference tags used in websites using example content management systems are indicated in the second and third column in Figure 3. In addition, such schemas identify types of webpage, for example whether the webpage is any one of the first data target types mentioned above. The reference tags may be in metadata.
Alternatively, a schema may be used as a standard by multiple content management systems.
Also, parts of product description data, add-to-cart and remove-from-cart buttons and online forms may be tagged with reference tags according to one or more third party schemas to allow third party platforms, for example social media platforms, to understand the data on the website. For example, OpenGraph enables Facebook™ to identify and pull product description data from a website. In addition, the Data Layer may include such reference tags. Corresponding schemas may also be stored in the schema store 114. Embodiments of the invention are not limited to use with such reference tags. A website may include other reference tags associated with the parts of the product description data, the add-to-cart and remove-from-cart buttons and online forms, for example.
The schema store 114 stores the schemas used by a plurality of predetermined content management systems and schemas of any other third party platforms, and includes a mapping between the reference tags in the schemas and predetermined identifiers of the second data target types used by the configuration system 100.
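A minimal sketch of such a mapping, using OpenGraph `<meta>` tags as the reference tags, is given below. The `og:title`, `og:image` and `og:description` properties are standard OpenGraph properties; the internal data target type identifiers on the right-hand side of the mapping, and the function and class names, are assumptions made for the example.

```python
from html.parser import HTMLParser

# Hypothetical schema: reference tag -> internal identifier of a second data target type.
OPEN_GRAPH_SCHEMA = {
    "og:title": "product_title",
    "og:image": "product_image",
    "og:description": "product_description",
}


class MetaTagCollector(HTMLParser):
    """Collects the property/content pairs of <meta> elements in a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property")
        content = attributes.get("content")
        if name and content is not None:
            self.properties[name] = content


def map_reference_tags(html, schema):
    """Return {data target type identifier: value} for reference tags found in the page."""
    collector = MetaTagCollector()
    collector.feed(html)
    return {
        schema[name]: value
        for name, value in collector.properties.items()
        if name in schema
    }
```

Meta tags that are not listed in the schema, such as a viewport declaration, are simply ignored.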
The possible processes by which data targets of the predetermined data target types are determined for a website are now described with reference to Figure 4.
The configuration system 100 is configured to enable the website owner or an administrator to input, for a website that uses a content management system for which a schema is stored in the schema store 114, an identifier of the website (e.g. the URL), and to associate the identifier with the corresponding stored schema. This is described in greater detail with reference to Figure 5. Thus, first the configuration system 100 determines whether, for a website, a stored schema has been associated at step 300. If so, the data targets are determined using the associated schema at step 301.
If the URL of the website has not been associated with a particular schema, the configuration server 100 attempts to determine whether the website nevertheless has an associated schema, as indicated at steps 302 and 304. At step 302 (described in greater detail in Figure 6), the configuration module 112 determines if the website uses a content management system for which a schema is stored. If there is an associated schema, the data targets are determined using the associated schema at step 303.
If the website uses a content management system for which a schema is not stored or cannot be identified, or if the website does not use a content management system, the configuration server 100 attempts to determine the reference tags using OpenGraph or from the Data Layer at step 304. Embodiments of the invention are not limited to such third party schemas. If there is an associated schema, the data targets are determined using the associated schema at step 305.
Where a schema is stored, the first and/or second data targets in the website can be identified by reference to their reference tags. A copy of each page of the website is downloaded to this end.
If there is no identifiable schema, the configuration system 100 then initiates a data collection phase. In this, the configuration server 100 collects URLs of webpages of the website, obtains a copy of each of the webpages, and applies preconfigured classification models to code of each webpage to identify data targets of each data target type (step 306) for each webpage. This is described in greater detail with reference to Figure 7. The data targets are then determined using the models at step 307.
Finally, if data targets cannot be identified for any website using the above processes, elements can be manually associated with particular data target types (step 308), as may be done conventionally.
Also, in some embodiments, a combination of the processes may be used to identify all of the data targets. For example, reference tags may be determined using both OpenGraph and the Data Layer. By way of another example, first data targets may be determined using OpenGraph and/or the Data Layer, but second data targets may be determined using the classification models.
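The ordered fall-back between the processes of Figure 4 can be sketched as a simple chain in which each process is tried until one yields data targets. The function names and the stub processes are hypothetical; each stub stands in for one of the processes described above and would be replaced by a real implementation.

```python
def determine_data_targets(website, processes):
    """Try each (name, process) pair in order and return the first non-empty result.

    Each process takes a website identifier and returns a dict of data targets,
    or an empty dict if it cannot determine any.
    """
    for name, process in processes:
        targets = process(website)
        if targets:
            return name, targets
    # No automated process succeeded: fall back to manual association (step 308).
    return "manual", {}


# Stub processes, in the order described with reference to Figure 4.
def by_associated_schema(website):        # steps 300 and 301
    return {}


def by_detected_cms_schema(website):      # steps 302 and 303
    return {}


def by_third_party_tags(website):         # steps 304 and 305
    return {"product_title": "og:title"}


def by_classification_models(website):    # steps 306 and 307
    return {"add_to_cart_button": "#add"}


PROCESSES = [
    ("associated schema", by_associated_schema),
    ("content management system schema", by_detected_cms_schema),
    ("third party schema", by_third_party_tags),
    ("classification models", by_classification_models),
]
```

With these stubs, `determine_data_targets("shop.example", PROCESSES)` stops at the third process. Combining processes, as in the examples above, would amount to merging the results of several entries rather than returning the first.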
With reference to Figure 5, the configuration system 100 is configured to receive input of an identifier of the content management system that the website uses, where the content management system is one for which a schema is stored, together with an identifier of the website (e.g. the URL). The configuration system 100 stores the content management system identifier in association with the website identifier.
In response to receiving the request sent from the browser at step 208, the configuration server 100 determines at step 500 if the website identifier (i.e. website domain name) extracted from the received request is stored in association with any of the previously stored schemas. If so, the configuration server 100 proceeds to generate the configuration file at step 502. Otherwise, the determining of the schema is attempted using another of the processes (step 504).
Referring to Figure 6, the configuration system 100 determines automatically if the website uses one of the content management systems for which a schema is stored. In a preferred embodiment, the content management system is identified using third party content management system identification software. For example, a company, BuiltWith (builtwith.com), provides technology enabling identification of common content management systems by crawling the website, identifying predetermined markers, and matching those markers against predetermined markers of the particular content management system.
Alternatively, the configuration system 100 may be configured to do this. Thus, in a step 600 an identifier of the content management system is determined, if the website is based on a content management system. If a content management system is identified, the configuration server 100 determines if a corresponding schema is stored at step 602. Where a schema for the content management system is stored in the schema store 114, the configuration server proceeds to configure the file at step 604. If no content management system is identified, or a content management system is identified but no schema is stored, the configuration system proceeds to step 608.
If no content management system is identified, the configuration system 100 then determines if reference tags are specified in the website in the Data Layer or by OpenGraph. If so, the corresponding stored schema may be used.
Embodiments of the invention are not limited to the possible processes being performed in any particular order.
Referring to Figure 7, a webpage identification phase is begun. When a request is received at the configuration system 100 from a user device 106, the URL of the webpage from which the request originated is extracted from the header of the request and stored in the website data store 115 at the configuration system 100 at step 700.
At step 702, the configuration system 100 determines if a sufficient number of requests have been received so that all the URLs of the webpages of the website can be deemed to have been identified and stored. This is done by determining that at least one criterion has been met indicative of all webpages having been identified. The criterion may be that in the requests most recently sent by user devices for pages of the website, no new webpages are identified.
For example, if the most recently sent proportion of all requests for webpages of the website, for example 10%, have not resulted in any new webpages being identified, the webpage identification phase may be ended. Alternatively, if in the most recently received predetermined number of requests for webpages of the website, for example 500, no new webpages have been identified, the webpage identification phase may be ended. Other statistical techniques may alternatively be used. If the criterion is not met, the configuration system 100 continues to receive requests and store URLs of webpages of the website until the criterion is met.
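Both example criteria can be sketched with a counter of requests received since the last previously unseen URL. The class and method names are hypothetical; the default values of 500 requests and 10% are the examples given above.

```python
class PageDiscovery:
    """Tracks webpage URLs seen in requests and decides when discovery may end."""

    def __init__(self, window=500):
        self.window = window            # number of recent requests with no new URL
        self.urls = set()
        self.total_requests = 0
        self.requests_since_new = 0

    def record(self, url):
        """Store the URL extracted from one request (step 700)."""
        self.total_requests += 1
        if url in self.urls:
            self.requests_since_new += 1
        else:
            self.urls.add(url)
            self.requests_since_new = 0

    def complete(self):
        """Count criterion: the last `window` requests identified no new webpage."""
        return self.requests_since_new >= self.window

    def complete_by_fraction(self, fraction=0.10):
        """Proportion criterion: the most recent `fraction` of all requests
        identified no new webpage."""
        return (
            self.total_requests > 0
            and self.requests_since_new >= fraction * self.total_requests
        )
```

The proportion criterion can be met after very few requests, so in practice it would presumably be combined with a minimum number of requests; that refinement is not part of this sketch.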
Provided the criterion is met, at step 704 the configuration system 100 obtains a copy of the webpage data (the HTML) of each webpage identified in the webpage identifier store for the website. The configuration server 100 does this by sending a request for each identified webpage to the hosting server 104, which responds by sending the respective webpage data. The configuration server 100 then cleans the code at step 706.
The configuration server 100 then, at step 708, identifies data target identifiers of the data target types in the website code using a respective classification model for each of the data target types. For example, a classification model is provided for each data target of the second type, that is, for example, product title, product image, an "add to cart" button, a "remove from cart" button, a name of a user displayed on the page, and a field for capturing an email address of the user. Each model outputs a list of data target identifiers of the respective second data target type. Different models are used to identify whether each webpage is one of the first data target types, in which case the output of each model is whether the page is the corresponding type. Such models for determining data target identifiers of the first and second types are known in the art.
In the above described process, after the data target identifiers have been determined, the system 100 stores the data target identifiers, each in association with an identifier of its data target type, in the website data store 115. Where the data target identifier is one of the second types, the data target identifier may be in the form of an element identifier. The system 100 may also store each data target identifier of the second type in association with an identifier of the first data target type of the webpage on which the corresponding data target identifier is found. Thus, the stored information identifying the data targets and data target types enables the system 100 to apply the rules.
Alternatively, as indicated at step 504, the data targets can be identified manually in a conventional way.
After the configuration file is configured, a verification phase is begun, in which verification rules are applied to check that the file is properly configured. With reference to Figure 8, the configuration module 112 applies at step 800 the verification rules to event data received from user devices. If any of the verification rules is breached (step 802), the configuration file is no longer properly configured and the configuration process is run again (804).
In one example verification rule, prices received at the data processing module 118 are each checked, using regular expressions, to determine if the price is a real number within predetermined bounds. The rule is breached if the number is outside the bounds.
In another example verification rule, a ratio is determined of the number of products viewed against the number of products purchased. If the ratio is not above a minimum and below a maximum, the rule is breached.
In another example verification rule, a check is performed on whether an identifier of an image of a product does indeed reference an image file. If it does not, the rule is breached.
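The three example verification rules can be sketched as follows. The regular expressions, the numeric bounds and the function names are assumptions chosen for illustration; in particular, checking the file extension is only one possible way of deciding whether an identifier references an image file, and a non-numeric price is treated here as a breach.

```python
import re

# Hypothetical patterns: a plain decimal price, and a reference ending in an image extension.
PRICE_PATTERN = re.compile(r"\d+(\.\d{1,2})?")
IMAGE_PATTERN = re.compile(r".+\.(png|jpe?g|gif|webp)(\?.*)?", re.IGNORECASE)


def price_rule_breached(price_text, low=0.01, high=100000.0):
    """Breached if the text is not a real number or lies outside [low, high]."""
    text = price_text.strip()
    if not PRICE_PATTERN.fullmatch(text):
        return True
    return not (low <= float(text) <= high)


def ratio_rule_breached(viewed, purchased, minimum=1.0, maximum=1000.0):
    """Breached unless minimum < viewed / purchased < maximum.

    With no purchases the ratio is treated as infinite, which breaches the rule.
    """
    ratio = viewed / purchased if purchased else float("inf")
    return not (minimum < ratio < maximum)


def image_rule_breached(image_ref):
    """Breached if the reference does not look like an image file."""
    return IMAGE_PATTERN.fullmatch(image_ref) is None
```

If any of these functions returns `True` for received event data, the configuration process would be run again, as at step 804.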
After the verification phase, the file is monitored using the same and/or different rules. The file may no longer be properly configured if, for example, the owner of the website makes changes to the website. Figure 8 refers to steps performed both in the verification phase and during monitoring.
If a change is made on a website, for example if a new product is added, the configuration file is no longer properly configured and the configuration process has to be run again. The configuration module 112 and the data processing module 118, where implemented as separate modules as indicated in Figure 1, are coupled to enable such verification and/or monitoring by the configuration module 112.
An example of operation of the system is now described, after a user has accessed a home page of a website, the user's browser has received the configuration file, and the browser has executed the file to set event detectors in accordance with event detector instructions in the file. The event detector instructions are as described in a) to c) above. The user navigates to a category page by selecting a link to that page. The titles of products are selectable, and the user selects one of the titles to navigate to a product page for that product. As understood from c), an event detector detects that a new page has loaded, captures the URL of the new product page and product description data from the new page, and sends this to the data processing server together with a timestamp.
The user then selects the add-to-cart button and, as understood from a), an event detector detects selection of the button and the product information is again captured and sent to the data processing server.
The user then goes to the cart page and an event detector in accordance with c) is then triggered. The user then selects to proceed to a payment page, and an event detector in accordance with c) is again triggered. The user then enters an email address in a form, which is required to proceed with payment. When the user leaves the form, an event detector in accordance with b) is triggered and the text input to the form is captured. Thus, where the user then abandons the payment process, there is sufficient information at the data processing server to market the product to the user via email.
The term "products" is used herein with reference to items advertised for sale on the website. "Services" could equally be sold.
The configuration server 100 may comprise a system that may include a processor 1001 or processing circuit operatively coupled to a bus (not shown) or other communication component for communicating information between components. The system also includes a random access memory (RAM) or other dynamic storage device, operatively coupled to the bus for storing information, and instructions to be executed by the processor. The system may further include a read only memory (ROM) or other static storage device operatively coupled to the bus, for storing static information and instructions for the processor. A data storage device 1004, such as a solid state device, magnetic disk or optical disk, is coupled to the bus for persistently storing information and instructions. The server 100 also includes a communications interface 1002, communicatively coupling the respective server to the internet. The functionality ascribed to the configuration system herein may be provided by computer programs stored in the main memory and executable by the processor or the processing circuit. Alternatively, the functionality may be implemented in dedicated hardware or a mix of hardware and software.
Each user device may be a smartphone, watch, laptop, tablet, or other user device, including wearable devices, configured for wireless communication via the internet, where the user device can run a browser. Each user device includes user input means, for example a mouse and keyboard, and a display. The user input means and the display may be combined as a touchscreen. Various forms of user input means are well known. Such user devices also each include a communications interface (wired or wireless), data storage and memory (ROM and RAM), and a processor for performing the functional operations of the device based on data stored in the data storage and/or memory.
Unless otherwise stated, all features of all embodiments of the invention described herein are combinable with all other features, in any desired combination. Various modifications may be made to the above described embodiments within the scope of the invention.
When used in this description and claims, the terms "comprises" and "comprising" and variations thereof mean that the specified features, steps or components are included. The terms are not to be interpreted to exclude the presence of other features, steps or components.
The features disclosed in the foregoing description, or the following claims, or the accompanying drawings, expressed in their specific forms or in terms of a means for performing the disclosed function, or a method or process for attaining the disclosed result, as appropriate, may, separately, or in any combination of such features, be utilised for realising the invention.

Claims (20)

  1. A method, comprising: automatically determining, by a configuration system, identifiers of data targets in a website, each data target being one of a plurality of predetermined data target types; generating computer program code including at least some of the data target identifiers, the generated code being suitable for execution by web browsers to enable the respective web browser to perform at least one predetermined action relating to the included data targets; and sending the code to a web browser.
  2. The method of claim 1, wherein the data target types comprise first data target types being types of webpage.
  3. The method of claim 2, wherein the first data target types comprise one or more of: product page; checkout page; payment confirmation page; cart page; login page; category page.
  4. The method of any one of the preceding claims, wherein the data target types comprise second data target types being identifiers of elements displaying information of particular types.
  5. The method of claim 4, wherein the information of a particular type is a respective one of: a product title; a product image; a button selectable to add an item to an online shopping cart; a button selectable to remove an item from the cart; an online form; a product price; a product description; a user's displayed name.
  6. The method of any one of the preceding claims, wherein the generating the code comprises also including an identifier of one or more predetermined events in the code, the or each predetermined event being associated with one or more of the included data targets, wherein the at least one action comprises detecting occurring of the identified events in relation to the associated one or more data targets.
  7. The method of claim 6, wherein each event corresponds to a predetermined user action relating to the associated one or more data targets.
  8. The method of claim 6 or claim 7, wherein the generating the code including the one or more data target identifiers and the one or more event identifiers comprises creating a plurality of event processing instructions, wherein each event processing instruction includes at least one of the included data target identifiers and the associated event identifier, and including the event processing instructions in the code, such that the web browser can process the code to set an event detector based on each event processing instruction to detect occurring of the identified event in relation to the respective data targets.
  9. The method of claim 8, wherein the creating the event processing instructions is performed based on predetermined stored rules associating data target types and predetermined event identifiers.
  10. The method of claim 9, wherein at least one of the rules associates data targets of a second type on a page of a first type with one of the event identifiers.
  11. The method of any one of the preceding claims, wherein the determining comprises: attempting to determine the identifiers of the data targets in the website using one of at least two predetermined processes; if the data target identifiers cannot be determined using the one process, determining the identifiers of the data targets using another of the processes.
  12. The method of claim 11, further comprising: storing, at the configuration system, a plurality of schemas each identifying a plurality of references used in website code that indicate data targets in a website and each correspond to a data target type, wherein the determining the identifiers of the data targets comprises: determining that one of the schemas is used in the website; and determining the data target identifiers using the references of the determined schema in the website code.
  13. The method of claim 12, wherein each stored schema is associated with a third party technology in the form of one or more of: a respective content management system; a standardised schema used by content management systems; a data layer schema; a third party schema enabling integration with third party technologies.
  14. The method of claim 13, wherein the determining that one of the schemas is used in the website comprises: crawling the website to determine a plurality of characteristics of the website; determining based on the determined characteristics if the website uses the third party technology; if so, determining to use the corresponding stored schema.
  15. The method of any one of claims, wherein one of the at least two processes comprises: receiving requests from a plurality of user devices, wherein each of the requests identifies a page of the predetermined website; obtaining a copy of the webpage code of each page; inputting the code of each webpage to classification models, wherein each is configured to receive input of the webpage code and to output: identifiers of data targets of one or more second data target types associated with the model; or a first data target type for the webpage.
  16. The method of claim 14, further comprising: further to the receiving of the requests from the plurality of user devices, determining if at least one criterion is met indicating that all relevant pages of the website have been identified; if the criterion is not met, receiving further of the requests, and then again determining if the at least one criterion is met.
  17. The method of any one of the preceding claims, comprising: before the determining, by the configuration system, of identifiers of the data targets, determining that the code is to be configured for the website.
  18. A data processing apparatus comprising: processing means; memory means having computer program code stored thereon, where the processing means and the memory means with the computer program code are configured to cause the apparatus to perform the method of any one of the preceding claims.
  19. A computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out the method of any one of claims 1 to 17.
  20. A computer-readable storage medium comprising instructions which, when executed by a computer, cause the computer to carry out the method of any one of claims 1 to 17.
GB2005037.3A 2020-04-06 2020-04-06 Automated determining of data targets in a website to enable actions in web browsers relating to the data targets Pending GB2593890A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB2005037.3A GB2593890A (en) 2020-04-06 2020-04-06 Automated determining of data targets in a website to enable actions in web browsers relating to the data targets
PCT/EP2021/058978 WO2021204828A1 (en) 2020-04-06 2021-04-06 Automated determining of data targets in a website to enable actions in web browsers relating to the data targets

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB2005037.3A GB2593890A (en) 2020-04-06 2020-04-06 Automated determining of data targets in a website to enable actions in web browsers relating to the data targets

Publications (2)

Publication Number Publication Date
GB202005037D0 GB202005037D0 (en) 2020-05-20
GB2593890A true GB2593890A (en) 2021-10-13

Family

ID=70768983

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2005037.3A Pending GB2593890A (en) 2020-04-06 2020-04-06 Automated determining of data targets in a website to enable actions in web browsers relating to the data targets

Country Status (2)

Country Link
GB (1) GB2593890A (en)
WO (1) WO2021204828A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230186327A1 (en) * 2021-10-01 2023-06-15 GetEmails LLC User identification and activity monitoring service

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8806327B2 (en) 2005-08-15 2014-08-12 Iii Holdings 1, Llc System and method for displaying unrequested information within a web browser
WO2011127049A1 (en) * 2010-04-07 2011-10-13 Liveperson, Inc. System and method for dynamically enabling customized web content and applications
US8645212B2 (en) 2012-04-30 2014-02-04 Bounce Exchange Llc Detection of exit behavior of an internet user
US10491694B2 (en) * 2013-03-15 2019-11-26 Oath Inc. Method and system for measuring user engagement using click/skip in content stream using a probability model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
None *

Also Published As

Publication number Publication date
GB202005037D0 (en) 2020-05-20
WO2021204828A1 (en) 2021-10-14
