WO2023169165A1 - Access data processing method and apparatus, electronic device, and computer-readable medium - Google Patents

Access data processing method and apparatus, electronic device, and computer-readable medium

Info

Publication number
WO2023169165A1
Authority
WO
WIPO (PCT)
Prior art keywords
page
access
application
track
terminal device
Prior art date
Application number
PCT/CN2023/076143
Other languages
English (en)
Chinese (zh)
Inventor
吴轶伦
Original Assignee
北京京东拓先科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京京东拓先科技有限公司
Publication of WO2023169165A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 - Protecting data
    • G06F21/604 - Tools and structures for managing or administering access control systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/903 - Querying
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/906 - Clustering; Classification

Definitions

  • The present disclosure relates to the field of computer technology, specifically to data analysis and related technical fields, and in particular to access data processing methods and apparatuses, electronic devices, computer-readable media, and computer program products.
  • Embodiments of the present disclosure provide access data processing methods and devices, electronic devices, computer-readable media, and computer program products.
  • an access data processing method includes: collecting, through buried point (event tracking) data of an application, access logs of different terminal devices accessing pages of the application; based on the access logs, obtaining the access track of each terminal device to the pages of the application in at least one visit within at least one preset time period; performing aggregate statistics on terminal devices with the same access track to obtain aggregation clusters, each corresponding to an access track and terminal device information; and optimizing the pages of the application based on the access track and terminal device information corresponding to each aggregation cluster to obtain an optimized application.
  • the above method further includes: receiving a query track, where the query track includes at least one page of the application; matching the query track with the access track corresponding to each aggregation cluster; and, in response to determining that the query track matches the access track corresponding to an aggregation cluster, obtaining and displaying the terminal device information corresponding to that aggregation cluster.
  • the above method further includes: using a Sankey diagram to display the terminal device information and access trajectories corresponding to each aggregation cluster.
  • the above method further includes: labeling all pages of the application; and in response to receiving the label of the page, performing statistics on access trajectories where the page is located to obtain access trajectory statistical results.
  • the above method also includes: marking the first page of the access track corresponding to each aggregation cluster as a landing page; and, in response to receiving a query condition that uses a page as the landing page, obtaining and displaying all access tracks that have that page as the landing page.
  • the above method also includes: marking the last page of the access track corresponding to each aggregation cluster as an exit page; and, in response to receiving a query condition that uses a page as the exit page, obtaining and displaying all access tracks that have that page as a landing page.
  • the pages of the above-mentioned application include: at least one landing page, and the access track of each aggregation cluster is provided with a bounce node adjacent to the last page of the access track.
  • optimizing the application pages based on the access track and terminal device information corresponding to each aggregation cluster to obtain the optimized application includes: for at least one page serving as a landing page, when the next node after the landing page in the access track is the bounce node, calculating the ratio of the number of terminal devices at the bounce node to the number of terminal devices for which the landing page is the first node of the access track; and, in response to the ratio being greater than the average exit rate of all landing pages, optimizing the landing page to obtain the optimized application.
  • optimizing the application pages based on the access track and terminal device information corresponding to each aggregation cluster to obtain the optimized application includes: querying a preset access track from the access tracks corresponding to all aggregation clusters; calculating the conversion rate of each page in the preset access track based on the terminal device information of each page in the preset access track; and, in response to the conversion rate of a page in the preset access track being less than the conversion rates of the other pages in the preset access track, optimizing that page to obtain the optimized application.
  • the access track of each aggregation cluster is provided with a bounce node adjacent to the last page of the access track, and optimizing the application pages based on the access track and terminal device information corresponding to each aggregation cluster to obtain the optimized application includes: traversing the access tracks corresponding to all aggregation clusters and calculating, over all access tracks, the number of terminal devices at the page immediately before the bounce node; and, in response to that number being greater than a preset threshold, optimizing the page before the bounce node to obtain the optimized application.
  • an access data processing device includes: a collection unit configured to collect access logs of pages of the application accessed by different terminal devices through buried point data of the application;
  • the acquisition unit is configured to obtain, based on the access log, access trajectories of each terminal device to the page of the application in at least one visit in at least one preset time period;
  • the aggregation unit is configured to perform an aggregation operation on terminal devices with the same access trajectories.
  • the optimization unit is configured to optimize application pages based on access trajectories and terminal device information corresponding to each aggregation cluster to obtain an optimized application.
  • the above device further includes: a receiving unit configured to receive a query track, where the query track includes at least one page of the application; a matching unit configured to match the query track with the access track corresponding to each aggregation cluster; and an obtaining unit configured to obtain and display the terminal device information corresponding to an aggregation cluster in response to determining that the query track matches the access track corresponding to that aggregation cluster.
  • the above apparatus further includes: a display unit configured to use a Sankey diagram to display the terminal device information and access trajectories corresponding to each aggregation cluster.
  • the above-mentioned device further includes: a page labeling unit configured to label all pages of the application; and a statistics unit configured to, in response to receiving the label of a page, perform statistics on the access tracks in which the page is located to obtain access track statistical results.
  • the above device further includes: a landing annotation unit configured to mark the first page of the access track corresponding to each aggregation cluster as a landing page; and a landing query unit configured to, in response to receiving a query condition that uses a page as the landing page, obtain and display all access tracks that have that page as the landing page.
  • the above device further includes: an exit labeling unit configured to mark the last page of the access track corresponding to each aggregation cluster as an exit page; and an exit query unit configured to, in response to receiving a query condition that uses a page as the exit page, obtain and display all access tracks that have that page as a landing page.
  • the pages of the above-mentioned application include: at least one landing page, and the access track of each aggregation cluster is provided with a bounce node adjacent to the last page of the access track.
  • the above-mentioned optimization unit includes: a computing module configured to, for at least one page serving as a landing page, when the next node after the landing page in the access track is the bounce node, calculate the ratio of the number of terminal devices at the bounce node to the number of terminal devices for which the landing page is the first node of the access track; and a page optimization module configured to, in response to the ratio being greater than the average exit rate of all landing pages, optimize the landing page to obtain an optimized application.
  • the above-mentioned optimization unit includes: a query module configured to query a preset access track from the access tracks corresponding to all aggregation clusters; a conversion module configured to calculate the conversion rate of each page in the preset access track based on the terminal device information of each page of the preset access track; and an application optimization module configured to, in response to the conversion rate of a page in the preset access track being less than the conversion rates of the other pages in the preset access track, optimize that page to obtain the optimized application.
  • the access track of each aggregation cluster is provided with an exit (jump-out) node adjacent to the last page of the access track.
  • the above-mentioned optimization unit includes: a traversal module configured to traverse the access tracks corresponding to all aggregation clusters and calculate, over all access tracks, the number of terminal devices at the page immediately before the exit node; and a node optimization module configured to, in response to that number being greater than a preset threshold, optimize the page before the exit node to obtain an optimized application.
  • an electronic device includes: one or more processors; and a storage device with one or more programs stored thereon, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described in any of the above implementations.
  • a computer-readable medium is provided, with a computer program stored thereon. When the program is executed by a processor, the method described in any of the above implementations is implemented.
  • a computer program product is provided, including a computer program. When executed by a processor, the computer program implements the method described in any of the above implementations.
  • Figure 1 is an exemplary system architecture diagram in which an embodiment of the present disclosure may be applied;
  • Figure 2 is a flow chart of an embodiment of an access data processing method according to the present disclosure;
  • Figure 3 is a flow chart of another embodiment of an access data processing method according to the present disclosure;
  • Figure 4 is a schematic diagram using a Sankey diagram to display access trajectories according to the present disclosure;
  • Figure 5 is a schematic structural diagram of an embodiment of an access data processing device according to the present disclosure;
  • Figure 6 is a schematic structural diagram of an electronic device suitable for implementing embodiments of the present disclosure.
  • FIG. 1 illustrates an exemplary system architecture 100 to which the access data processing method of the present disclosure may be applied.
  • the system architecture 100 may include terminal devices 101, 102, 103, a network 104 and a server 105.
  • the network 104 is a medium used to provide communication links between the terminal devices 101, 102, 103 and the server 105.
  • Network 104 may include various connection types and may typically include wireless communication links and the like.
  • the terminal devices 101, 102, 103 interact with the server 105 through the network 104 to receive or send messages, etc.
  • Various communication client applications can be installed on the terminal devices 101, 102, and 103, such as instant messaging tools, email clients, etc.
  • The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they may be devices with communication and control functions, and these devices can communicate with the server 105. When they are software, they can be installed in the above-mentioned devices and may be implemented as multiple pieces of software or software modules (for example, software or software modules used to access applications) or as a single piece of software or software module. No specific limitation is imposed here.
  • the server 105 may be a server that provides various services, such as an application server that provides support for applications on the terminal devices 101, 102, and 103.
  • the application server can analyze and process the relevant information of each terminal in the network, and feed back the processing results (such as optimized application installation programs, etc.) to the terminal device.
  • The server can be hardware or software.
  • When the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers or as a single server.
  • When the server is software, it can be implemented as multiple pieces of software or software modules (for example, software or software modules used to provide distributed services), or as a single piece of software or software module. No specific limitation is imposed here.
  • the access data processing method provided by the embodiment of the present disclosure is generally executed by the server 105.
  • the access data processing method includes the following steps:
  • Step 201: Collect, through the buried point (event tracking) data of the application, access logs of different terminal devices accessing pages of the application.
  • The buried point data of the application is obtained by burying points in (instrumenting) the pages of the application.
  • Burying points is a common data collection method for page analysis. Specifically, it refers to injecting one or more scripts into the source code corresponding to each page of the application.
  • These scripts are used to obtain the behavior data corresponding to the operation events of a terminal device operating the page. The behavior data from multiple different time periods is combined to obtain the access log of the terminal device.
  • the buried point data of the application includes the behavior data of at least one terminal device operating the application.
  • The terminal device information in the buried point data can be used to determine the behavior data of each of the at least one terminal device.
  • Arranging the behavior data of a terminal device by timestamp yields the access log of that terminal device.
  • Access logs can be obtained through multiple channels, for example, by using tracking tools to obtain, in real time, the tracking data of applications installed on terminal devices (such as the terminal devices 101, 102, and 103 shown in Figure 1), or by obtaining from a database the buried point data collected after points were buried in the application.
  • the access log may be a browsing log of the terminal device browsing the application page.
  • The tracking data reported to the execution subject by the tracking tool or script may include: the reporting site, the unique identifier of the page, the number that identifies the same visit, the device number, the request time period, etc. Each time the terminal device browses, the tracking tool or script generates a piece of tracking data and stores it.
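As an illustration of how reported tracking records could be assembled into per-device access logs, here is a minimal Python sketch. The `TrackingRecord` class, its field names, the integer timestamp, and `build_access_logs` are hypothetical names chosen for this example; the disclosure only lists the kinds of fields a record may contain.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackingRecord:
    # Hypothetical field names mirroring the fields listed in the text.
    site: str        # reporting site
    page_id: str     # unique identifier of the page
    visit_id: str    # number identifying one visit
    device_id: str   # device number
    timestamp: int   # request time (assumed: seconds since the epoch)


def build_access_logs(records):
    """Group records by device and order each group by timestamp."""
    logs = defaultdict(list)
    for record in records:
        logs[record.device_id].append(record)
    for device_records in logs.values():
        device_records.sort(key=lambda r: r.timestamp)
    return dict(logs)


records = [
    TrackingRecord("app", "search", "v1", "A", 30),
    TrackingRecord("app", "home", "v1", "A", 10),
    TrackingRecord("app", "home", "v7", "B", 12),
]
logs = build_access_logs(records)
```

Keeping the visit number on every record means a later step can split each device's log into individual visits without re-reading the raw data.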
  • the page of the application can be a page opened through a browser, or a native page of the application, or a page embedded in the application that needs to be invoked; the access log is the data generated by multiple device terminals in different visits.
  • a visit refers to a device terminal accessing an application.
  • the user opening the application through the device terminal is the beginning of a visit.
  • Step 202 Based on the access log, obtain the access track of each terminal device to the page of the application in at least one visit in at least one preset time period.
  • The access log is a collection of behavior data of multiple terminal devices operating on the application pages.
  • The access log can be used to obtain the access track of each terminal device operating the application pages during a visit.
  • the application has multiple pages, and the access track is a record of the user accessing the application through the terminal device and operating each page in the application.
  • For example, the application includes a home page, an activity page, a search page, and a details page.
  • If the user, through the device terminal, browses the home page, activity page, home page, and search page of the application in sequence, the access track corresponding to the device terminal is home page -> activity page -> home page -> search page.
  • The access track includes multiple nodes, each node corresponding to a page of the application. Any two nodes among the multiple nodes can be the same or different.
  • The access track above contains the home page twice, that is, two nodes in the access track correspond to the same page.
  • the preset time period can be set according to application development requirements.
  • the preset time period is one day, or half a day, etc.
  • Daily scheduled tasks can be used to calculate the access track of each terminal device to the application's page in at least one visit through the basic traffic data obtained.
  • the calculation task is executed at 3 a.m. every day.
  • The access track of each terminal device to the application pages in one visit in the preset time period refers to the pages viewed by the same terminal device in the same visit within the preset time period, arranged in sequence.
  • Obtaining the access track of each terminal device to the application pages in at least one visit in at least one preset time period includes: taking the preset time period as the cycle, sorting the pages browsed by the same user in the same visit by timestamp from earliest to latest, performing adjacent deduplication according to the unique identifier of the page (when two adjacent nodes in the access track are the same, one of them is removed), and obtaining the access track of the terminal device in the preset time period.
  • an exit identifier can be added as a unified APP exit after the last page in the access track of all terminal devices, that is, exit is the exit node of the access track.
  • the access track of terminal A is home page -> activity page -> home page -> search page -> exit.
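The sorting, adjacent deduplication, and exit-node steps described above can be sketched as follows. This is a minimal illustration, assuming page views arrive as `(timestamp, page_id)` pairs for a single visit; the function name `build_access_track` and the `EXIT` label are hypothetical.

```python
from itertools import groupby

EXIT = "exit"  # hypothetical label for the unified exit node


def build_access_track(page_views):
    """page_views: iterable of (timestamp, page_id) pairs from one visit."""
    # Sort by timestamp, earliest first, and keep only the page ids.
    ordered = [page for _, page in sorted(page_views, key=lambda v: v[0])]
    # Adjacent deduplication: collapse each run of the same page into one node.
    deduped = [page for page, _ in groupby(ordered)]
    # Append the exit node after the last page.
    return tuple(deduped) + (EXIT,)


views = [(3, "home"), (1, "home"), (2, "activity"), (4, "home"), (5, "search")]
track = build_access_track(views)
```

Only adjacent repeats are collapsed, so the home page still appears twice in the resulting track, once before and once after the activity page. Returning a tuple makes the track hashable, which is convenient for grouping devices by track later.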
  • Step 203 Aggregate statistics on terminal devices with the same access trajectories to obtain aggregation clusters corresponding to access trajectories and terminal device information.
  • The same access track corresponds to at least one terminal device.
  • By aggregating the terminal devices that share an access track, the aggregation cluster corresponding to that access track is obtained.
  • The resulting aggregation cluster includes the terminal device information of at least one terminal device.
  • the terminal device information is information related to the terminal device.
  • the terminal device information includes: a unique identifier of the terminal device, a device number of the terminal device, and the number of terminal devices.
  • the number of terminal devices is the sum of the number of at least one terminal with the same access trajectory.
  • For example, on a given day the access track of terminal A is home page -> activity page -> home page -> search page -> exit, and on that day only the access track of terminal B is exactly the same as that of terminal A, that is, the access track of terminal B is also home page -> activity page -> home page -> search page -> exit.
  • The number of terminal devices on this access track that day is therefore 2.
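The aggregation step can be sketched as a simple group-by on the access track. This is an illustrative sketch that assumes, for brevity, one visit per device in the time period; `aggregate_by_track` and the dictionary layout of a cluster are hypothetical.

```python
from collections import defaultdict


def aggregate_by_track(device_tracks):
    """device_tracks: mapping of device id -> access track (tuple of nodes)."""
    clusters = defaultdict(list)
    for device_id, track in device_tracks.items():
        clusters[track].append(device_id)
    # Each cluster keeps the device ids and the number of devices.
    return {
        track: {"devices": sorted(ids), "count": len(ids)}
        for track, ids in clusters.items()
    }


tracks = {
    "A": ("home", "activity", "home", "search", "exit"),
    "B": ("home", "activity", "home", "search", "exit"),
    "C": ("details", "exit"),
}
clusters = aggregate_by_track(tracks)
```

With the two terminals from the example above, the cluster for home page -> activity page -> home page -> search page -> exit holds devices A and B with a count of 2.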
  • The access track is a sequence of pages or nodes arranged in forward chronological order.
  • The access track clearly identifies the pages accessed from the first step to the Nth step.
  • the aggregation table of the aggregation cluster is shown in Table 1.
  • The first to fourth steps are the accessed pages of the access track or the set nodes (such as exit nodes).
  • The accessed pages include the activity page, details page, search page, home page, and list page. These pages depend on the application: different applications contain different pages, and the page names differ accordingly.
  • Step 204 Based on the access trajectories and terminal device information corresponding to each aggregation cluster, optimize the application page to obtain an optimized application.
  • The application may have one or more pages. Optimizing the application pages based on the access track and terminal device information corresponding to each aggregation cluster may mean optimizing one page of the application or optimizing multiple pages. Page optimization can include deleting a page, rearranging pages, modifying the content of a page, and other optimization methods.
  • An aggregation cluster corresponds to an access track in one visit.
  • An aggregation cluster may have terminal device information of one or more terminal devices, and the one or more terminal devices correspond to the same access track in a visit.
  • The pages of the above application include at least one landing page, and the access track of each aggregation cluster is provided with an exit (jump-out) node adjacent to the last page of the access track.
  • Optimizing the application pages based on the access track and terminal device information corresponding to each aggregation cluster to obtain the optimized application includes: for at least one page serving as a landing page, when the next node after the landing page in the access track is the exit node, calculating the ratio of the number of terminal devices at the exit node to the number of terminal devices for which the landing page is the first node of the access track; and, in response to the ratio being greater than the average exit rate of all landing pages, optimizing the landing page to obtain the optimized application.
  • The landing page refers to the page that a visitor is linked to after seeing, somewhere outside the application, a specific activity with a clear theme (for example, attractive discount information published through emails, social media, or advertisements) and clicking on it.
  • The landing page can be any page in the application; for example, the landing page may be the details page or the home page of the application.
  • the exit node is the last node in all access trajectories. Through the exit node, it can be determined that the terminal device has finished accessing the application.
  • Exit rate: the number of terminal devices for which the node following the landing page is the exit node, as a proportion of the number of terminal devices for which the landing page is the first page of the access track.
  • The average exit rate is a value obtained by counting the traffic of all landing pages. Specifically, the average exit rate has three calculation methods: weighted average, arithmetic average, and geometric average:
  • Weighted average: the ratio whose numerator is the sum, over all access tracks in the application, of the number of terminal devices whose landing page is followed by the exit node, and whose denominator is the sum of the number of terminal devices that have a landing page as the first page of the access track.
  • Geometric mean: the exit rates of all landing pages are multiplied together and the Nth root of the product is taken, where N is the number of landing pages.
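To make the exit rate and its averages concrete, the following Python sketch computes them from aggregated track counts. It is only an illustration: the input layout (access track tuple -> device count), the function names, and the `EXIT` label are assumptions, every track is assumed non-empty, and the arithmetic average is taken in its usual sense (sum of exit rates divided by N).

```python
import math
from collections import defaultdict

EXIT = "exit"  # hypothetical label for the exit node


def landing_exit_stats(cluster_counts):
    """Return {landing_page: (bounced_devices, landed_devices)}."""
    stats = defaultdict(lambda: [0, 0])
    for track, count in cluster_counts.items():
        landing = track[0]
        stats[landing][1] += count  # devices that landed on this page
        if len(track) > 1 and track[1] == EXIT:
            stats[landing][0] += count  # devices that left right after landing
    return {page: tuple(pair) for page, pair in stats.items()}


def average_exit_rates(stats):
    rates = [bounced / landed for bounced, landed in stats.values()]
    n = len(rates)
    total_bounced = sum(b for b, _ in stats.values())
    total_landed = sum(l for _, l in stats.values())
    return {
        "weighted": total_bounced / total_landed,
        "arithmetic": sum(rates) / n,
        "geometric": math.prod(rates) ** (1 / n),
    }


counts = {
    ("home", "exit"): 10,
    ("home", "search", "exit"): 30,
    ("details", "exit"): 20,
    ("details", "home", "exit"): 40,
}
stats = landing_exit_stats(counts)
averages = average_exit_rates(stats)
# Landing pages whose exit rate exceeds the (weighted) average are flagged.
flagged = [p for p, (b, l) in stats.items() if b / l > averages["weighted"]]
```

In this sample the home page has an exit rate of 10/40 and the details page 20/60, so only the details page exceeds the weighted average of 30/100 and is flagged for optimization.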
  • There is not just one landing page; an application can have many landing pages.
  • A landing page is defined as the first page of the application in an access track.
  • In this way, the access situation of the landing pages can be effectively analyzed, providing a reliable basis for improving them.
  • Optimizing the application pages based on the access track and terminal device information corresponding to each aggregation cluster to obtain the optimized application includes: querying a preset access track from the access tracks corresponding to all aggregation clusters; calculating the conversion rate of each page in the preset access track based on the terminal device information of each page of the preset access track; and, in response to the conversion rate of a page in the preset access track being less than the conversion rates of the other pages in the preset access track, optimizing that page to obtain the optimized application.
  • the conversion rate is related to the number of terminal device visits to each page. For example, if a visit track is from the search page to the product details page, then the conversion rate of the search page is: the ratio with the number of terminal devices on the search page as the denominator and the number of terminal devices from the search page to the product details page as the numerator.
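  • When the conversion rate of a page is less than the conversion rates of the other pages in the preset access track, that page is a page to be optimized, and removing or changing it should be considered.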
  • the conversion rate of each page of the application is calculated to ensure the optimization effect of the core pages in the application and improve the efficiency of application optimization.
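One possible reading of the per-page conversion rate is sketched below: for each consecutive pair of pages in the preset track, the denominator counts devices whose track visits the page and the numerator counts devices whose track moves directly from that page to the next one. The function names and the input layout (access track tuple -> device count) are hypothetical, and other interpretations of the conversion rate are possible.

```python
def contains_step(track, page, next_page):
    """True if the track moves directly from page to next_page somewhere."""
    return any(a == page and b == next_page for a, b in zip(track, track[1:]))


def conversion_rates(cluster_counts, preset_track):
    rates = {}
    for page, next_page in zip(preset_track, preset_track[1:]):
        visited = sum(c for t, c in cluster_counts.items() if page in t)
        converted = sum(
            c for t, c in cluster_counts.items() if contains_step(t, page, next_page)
        )
        rates[page] = converted / visited if visited else 0.0
    return rates


counts = {
    ("home", "search", "details", "exit"): 30,
    ("home", "search", "exit"): 50,
    ("home", "exit"): 20,
}
rates = conversion_rates(counts, ("home", "search", "details"))
weakest = min(rates, key=rates.get)  # candidate page to optimize
```

Here 80 of 100 devices move from the home page to the search page, but only 30 of 80 continue from the search page to the details page, so the search page is the candidate for optimization.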
  • The access track of each aggregation cluster is provided with a jump-out node adjacent to the last page of the access track. Optimizing the application pages based on the access track and terminal device information corresponding to each aggregation cluster to obtain the optimized application then includes: traversing the access tracks corresponding to all aggregation clusters and calculating, over all access tracks, the number of terminal devices at the page immediately before the jump-out node; and, in response to that number being greater than a preset threshold, optimizing the page before the jump-out node to obtain the optimized application.
  • the traffic of all jump pages (the number of terminal devices) is counted.
  • the jump page with the highest traffic is the high-frequency jump page.
  • the high-frequency jumping pages in the application are obtained, which provides a reliable basis for optimizing the application's pages.
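The traversal that finds high-frequency exit pages can be sketched as a counter over the page immediately before the exit node. This is a minimal illustration; the names `exit_page_traffic` and `pages_over_threshold`, the `EXIT` label, and the sample threshold are assumptions.

```python
from collections import Counter

EXIT = "exit"  # hypothetical label for the exit node


def exit_page_traffic(cluster_counts):
    """Count devices per page that directly precedes the exit node."""
    traffic = Counter()
    for track, count in cluster_counts.items():
        if len(track) >= 2 and track[-1] == EXIT:
            traffic[track[-2]] += count
    return traffic


def pages_over_threshold(traffic, threshold):
    return sorted(page for page, n in traffic.items() if n > threshold)


counts = {
    ("home", "search", "exit"): 50,
    ("home", "exit"): 20,
    ("details", "search", "exit"): 15,
}
traffic = exit_page_traffic(counts)
to_optimize = pages_over_threshold(traffic, threshold=40)
```

In this sample the search page precedes the exit node for 65 devices across two clusters, so it is both the highest-traffic exit page and the only one above the threshold of 40.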
  • the access data processing method provided by the embodiments of the present disclosure can view the path distribution of users when using products through cluster information without configuration, and supports global observation of users' hot pages and main access trajectories in products.
  • users can be grouped according to the terminal device information and access trajectories in the cluster, supporting the comparison of the differences in behavioral paths of different types of users.
  • By marking landing pages in the access tracks, different landing pages can also be filtered as starting points to view users' subsequent path distribution. The method can thus not only show the global user path distribution with zero configuration but also, to a certain extent, meet the needs of configurable analysis.
  • The access data processing method first collects, through the buried point data of the application, the access logs of different terminal devices accessing pages of the application; second, based on the access logs, obtains the access track of each terminal device to the application pages in at least one visit in at least one preset time period; third, performs aggregate statistics on the terminal devices with the same access track to obtain aggregation clusters, each corresponding to an access track and terminal device information; and finally, based on the access track and terminal device information corresponding to each aggregation cluster, optimizes the application pages to obtain an optimized application. Therefore, by clustering terminal devices according to their access tracks to the application pages in one visit, all terminal devices with the same access track can be determined, which provides an effective basis for optimizing the application pages, improves the efficiency of application optimization, and improves the user experience.
  • The above-mentioned access data processing method also includes: receiving a query track, where the query track includes at least one page of the application; matching the query track with the access track corresponding to each aggregation cluster; and, in response to determining that the query track matches the access track corresponding to an aggregation cluster, obtaining and displaying the terminal device information corresponding to that aggregation cluster.
  • the query track sent by the developer can be received.
  • The query track can be the same as the access track corresponding to an aggregation cluster, or it can be different from the access tracks corresponding to the aggregation clusters.
  • When the query track is the same as an access track, it is determined that the query track matches the access track corresponding to that aggregation cluster. Accordingly, the terminal device information of the aggregation cluster corresponding to the query track, such as the number of terminal devices, can be obtained, so that the data access traffic of the query track can be analyzed.
  • the information of the terminals of the aggregation cluster is queried through the query trajectory, which provides a reliable query basis for the data access status of the application page and ensures the reliability of the application access data analysis.
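Because a match is defined as the query track being the same as a cluster's access track, the lookup can be sketched as an exact-match dictionary access. The cluster layout and the name `find_cluster` are hypothetical, and this sketch assumes the query already includes the exit node if the stored tracks do.

```python
def find_cluster(clusters, query_track):
    """Return the device info of the cluster whose track equals query_track.

    clusters: mapping of access track (tuple) -> terminal device information.
    Returns None when no cluster has exactly this access track.
    """
    return clusters.get(tuple(query_track))


clusters = {
    ("home", "search", "exit"): {"devices": ["A", "B"], "count": 2},
    ("details", "exit"): {"devices": ["C"], "count": 1},
}
hit = find_cluster(clusters, ["home", "search", "exit"])
miss = find_cluster(clusters, ["home", "details", "exit"])
```

Keying the clusters by the track itself keeps each query a constant-time lookup regardless of how many clusters exist.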
  • The above access data processing method also includes: labeling all pages of the application; and, in response to receiving the label of a page, performing statistics on the access tracks in which the page is located to obtain access track statistical results.
  • All pages of the application are tagged, and the generated access tracks carry the tags of the pages they contain.
  • When performing the statistics, the access tracks and the aggregation clusters in which the page is located can be determined, and then the number of access tracks, the number of aggregation clusters, and other information can be obtained.
  • the access trajectory statistics results include: access trajectory name, access trajectory number, aggregation cluster corresponding to the access trajectory, number of terminal devices corresponding to the access trajectory, etc.
  • tags can be text, symbols, codes, etc.
  • the tag of the received page can uniquely indicate the page of the application.
  • Independent querying of a page can be supported, as well as statistics on the access trajectories of the received page, thereby realizing an interleaved query method with access trajectories as horizontal queries and pages as vertical queries.
  • Each page is tagged, and each user path is tagged according to the pages it includes.
  • After entering the unique identifier of a page of interest, the user can view the distribution of the access trajectories passing through that page and can also query access trajectory statistics such as the number of terminal devices in an aggregation cluster, which improves the reliability of page retrieval.
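  • A page-tag query of this kind can be sketched as a filter over the cluster trajectories. The tag values (`P01`, ...), the `PAGE_TAGS` mapping, and the shape of the returned statistics are assumptions for illustration; the text only requires that a tag uniquely identifies a page.

```python
# Hypothetical tag -> page mapping; each tag uniquely identifies one page.
PAGE_TAGS = {"P01": "home", "P02": "search", "P03": "details"}

# Hypothetical aggregation clusters: access trajectory -> set of device ids.
CLUSTERS = {
    ("home", "search", "details"): {"dev-1", "dev-2"},
    ("details", "search"): {"dev-3"},
}


def trajectory_stats_for_tag(clusters, page_tags, tag):
    """Collect statistics over every access trajectory that contains the tagged page."""
    page = page_tags[tag]  # raises KeyError for an unknown tag
    matching = {t: d for t, d in clusters.items() if page in t}
    all_devices = set().union(*matching.values())
    return {
        "page": page,
        "trajectory_count": len(matching),  # one aggregation cluster per trajectory
        "device_count": len(all_devices),
        "trajectories": sorted(matching),
    }


print(trajectory_stats_for_tag(CLUSTERS, PAGE_TAGS, "P02"))
```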
  • The above method further includes: marking the first page of the access trajectory corresponding to each aggregation cluster as a landing page; and, in response to receiving a query condition that uses a page as the landing page, obtaining and displaying all access trajectories in which this page is the landing page.
  • Query conditions for the access trajectories can be added. For example, when an access trajectory includes, in order, a details page and a search page, the details page is the landing page.
  • When the query condition is that the details page is used as the landing page, all access trajectories with the details page as the landing page are queried and displayed.
  • the Sankey diagram can be used to display and query the access trajectories.
  • The first page of the access trajectory corresponding to each aggregation cluster is marked as a landing page. When a query condition using a page as the landing page is received, all access trajectories in which this page is the landing page are obtained and displayed, which improves the richness of access trajectory queries.
  • The above method also includes: marking the last page of the access trajectory corresponding to each aggregation cluster as an exit page; and, in response to receiving a query condition that uses a page as the exit page, obtaining and displaying all access trajectories in which this page is the exit page.
  • The last page of every access trajectory is marked as an exit page, and query conditions for the access trajectories can be added. For example, when an access trajectory includes, in order, a details page and a search page, the search page is the exit page.
  • When the query condition is that the search page is used as the exit page, the access trajectories that use the search page as the exit page are queried from all access trajectories and displayed. It should be noted that a Sankey diagram can be used to display the access trajectories.
  • The last page of the access trajectory corresponding to each aggregation cluster is marked as an exit page. When a query condition using a page as the exit page is received, all access trajectories in which this page is the exit page are obtained and displayed, which improves the richness of access trajectory queries.
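  • The landing-page and exit-page queries described above reduce to checking the first or last element of each cluster trajectory. The sketch below assumes trajectories are stored as non-empty tuples of page names; the function names and sample clusters are hypothetical.

```python
# Hypothetical aggregation clusters: access trajectory -> set of device ids.
CLUSTERS = {
    ("details", "search"): {"dev-1"},
    ("details", "cart", "order"): {"dev-2", "dev-3"},
    ("home", "search"): {"dev-4"},
}


def trajectories_with_landing(clusters, page):
    """Access trajectories whose first page (the landing page) is `page`."""
    return sorted(t for t in clusters if t and t[0] == page)


def trajectories_with_exit(clusters, page):
    """Access trajectories whose last page (the exit page) is `page`."""
    return sorted(t for t in clusters if t and t[-1] == page)


print(trajectories_with_landing(CLUSTERS, "details"))
print(trajectories_with_exit(CLUSTERS, "search"))
```

  • With the example from the text, the trajectory consisting of a details page followed by a search page is returned both by the landing query for the details page and by the exit query for the search page.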
  • FIG. 3 shows a process 300 of another embodiment of the access data processing method provided by the present disclosure.
  • the access data processing method may include the following steps:
  • Step 301 Collect access logs of pages of the application accessed by different terminal devices through the embedded data of the application.
  • Step 302 Based on the access log, obtain the access track of each terminal device to the page of the application in at least one visit in at least one preset time period.
  • Step 303 Perform aggregation statistics on terminal devices with the same access trajectories to obtain aggregation clusters corresponding to access trajectories and terminal device information.
  • Step 304 Use a Sankey diagram to display the terminal device information and access trajectories corresponding to each aggregation cluster.
  • A Sankey diagram, also called a Sankey energy distribution diagram or Sankey energy balance diagram, is a specific type of flow diagram in which the width of each branch corresponds to the size of the data flow.
  • the Sankey diagram can show the access trajectory distribution of an application without configuration.
  • When the mouse hovers over a page, the number of terminal devices passing through the page is shown; when the mouse hovers between any two steps of the access trajectory, the number of terminal devices that go from one page to the other is shown.
  • FIG. 4 is a schematic diagram showing access trajectories using a Sankey diagram.
  • S, T, W, V, M, N, and U represent different pages in the application.
  • terminal device information such as the number of terminal devices, ID, etc.
  • Step 305 Based on the access trajectories and terminal device information corresponding to each aggregation cluster, optimize the application page to obtain an optimized application.
  • The access data processing method provided in this embodiment uses a Sankey diagram to display the terminal device information and access trajectories before optimizing the application pages, which represents the aggregation clusters visually and provides a vivid trajectory display for application improvement.
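  • The data behind such a Sankey diagram can be derived from the aggregation clusters by summing device counts per page and per page-to-page transition at each step. The sketch below only computes these node and link weights and leaves rendering to whichever charting tool is used. The page names reuse the letters from FIG. 4, but the flows and device counts are invented for illustration, and `sankey_data` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical aggregation clusters: access trajectory -> set of device ids.
CLUSTERS = {
    ("S", "T", "W"): {"a", "b"},
    ("S", "T", "V"): {"c"},
    ("S", "M"): {"d"},
}


def sankey_data(clusters):
    """Return (node_counts, link_counts).

    node_counts[(step, page)] is the number of devices passing through
    `page` at position `step`; link_counts[(step, src, dst)] is the number
    of devices going from `src` at `step` to `dst` at `step + 1`.
    """
    node_counts = Counter()
    link_counts = Counter()
    for trajectory, devices in clusters.items():
        n = len(devices)
        for step, page in enumerate(trajectory):
            node_counts[(step, page)] += n
        for step, (src, dst) in enumerate(zip(trajectory, trajectory[1:])):
            link_counts[(step, src, dst)] += n
    return node_counts, link_counts


nodes, links = sankey_data(CLUSTERS)
print(dict(nodes))
print(dict(links))
```

  • The node counts correspond to the value shown when hovering over a page, and the link counts to the value shown when hovering between two steps.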
  • the present disclosure provides an embodiment of an access data processing device.
  • the device embodiment corresponds to the method embodiment shown in Figure 2.
  • the device can be specifically applied in various electronic devices.
  • an embodiment of the present disclosure provides an access data processing device 500.
  • the device 500 includes: a collection unit 501, an acquisition unit 502, an aggregation unit 503, and an optimization unit 504.
  • the above-mentioned collection unit 501 may be configured to collect access logs of pages of the application accessed by different terminal devices through buried point data of the application.
  • the above-mentioned obtaining unit 502 may be configured to obtain, based on the access log, the access track of each terminal device to the page of the application in at least one visit in at least one preset time period.
  • the above-mentioned aggregation unit 503 may be configured to perform aggregation statistics on terminal devices with the same access trajectories, and obtain aggregation clusters corresponding to access trajectories and terminal device information.
  • the above-mentioned optimization unit 504 may be configured to optimize application pages based on access trajectories and terminal device information corresponding to each aggregation cluster to obtain an optimized application.
  • The specific processing of the collection unit 501, the acquisition unit 502, the aggregation unit 503, and the optimization unit 504, and the technical effects they bring, can be found in the descriptions of step 201, step 202, step 203, and step 204 in the embodiment corresponding to Figure 2, respectively.
  • the above-mentioned device 500 further includes: a receiving unit (not shown in the figure), a matching unit (not shown in the figure), and an obtaining unit (not shown in the figure).
  • the above-mentioned receiving unit may be configured to receive a query track, where the query track includes at least one page of the application.
  • the above-mentioned matching unit may be configured to match the query trajectories with the access trajectories corresponding to each aggregation cluster.
  • the above obtaining unit may be configured to obtain and display the terminal device information corresponding to the aggregation cluster in response to determining that the query trajectory matches the access trajectory corresponding to the aggregation cluster.
  • the above-mentioned device 500 further includes: a display unit (not shown in the figure).
  • the above display unit may be configured to use a Sankey diagram to display the terminal device information and access trajectories corresponding to each aggregation cluster.
  • the above-mentioned device 500 also includes: a page annotation unit (not shown in the figure) and a statistics unit (not shown in the figure).
  • the above-mentioned page annotation unit can be configured to tag all pages of the application.
  • the above statistics unit may be configured to, in response to receiving the tag of the page, perform statistics on the access track where the page is located, and obtain access track statistics results.
  • the above-mentioned device 500 also includes: a landing annotation unit (not shown in the figure) and a landing query unit (not shown in the figure).
  • the above-mentioned landing annotation unit may be configured to perform landing page annotation on the first page of the access track corresponding to each aggregation cluster.
  • the landing query unit is configured to, in response to receiving a query condition using a page as a landing page, obtain and display all access tracks of the page as a landing page.
  • the above-mentioned device 500 also includes: an exit annotation unit (not shown in the figure) and an exit query unit (not shown in the figure).
  • the above-mentioned exit labeling unit can be configured to perform exit page labeling on the last page of the access track corresponding to each aggregation cluster.
  • the above-mentioned exit query unit may be configured to, in response to receiving a query condition that uses a page as an exit page, obtain and display all access trajectories in which this page is the exit page.
  • the pages of the above-mentioned application include: at least one landing page, and the access track of each aggregation cluster is provided with a bounce node adjacent to the last page of the access track.
  • the above-mentioned optimization unit 504 includes: a calculation module (not shown in the figure) and a page optimization module (not shown in the figure).
  • The above-mentioned calculation module can be configured to, for at least one page used as a landing page, when the next node after the landing page in an access trajectory is a bounce node, calculate the ratio of the number of terminal devices at the bounce node to the number of terminal devices for which the landing page is the first node of the access trajectory.
  • The above page optimization module can be configured to optimize the landing page in response to the ratio being greater than the average exit rate of all landing pages, to obtain an optimized application.
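  • The ratio and comparison performed by the calculation module and the page optimization module can be sketched as follows. Several details are assumptions: the bounce node is represented by a sentinel string `BOUNCE` appended to every trajectory, the "average exit rate of all landing pages" is taken as the simple mean of the per-landing-page ratios, and the sample clusters are invented.

```python
from collections import Counter

BOUNCE = "<bounce>"  # hypothetical marker for the bounce node closing each trajectory

# Hypothetical aggregation clusters: access trajectory -> set of device ids.
CLUSTERS = {
    ("home", BOUNCE): {"d1", "d2", "d3"},
    ("home", "search", BOUNCE): {"d4"},
    ("details", BOUNCE): {"d5"},
    ("details", "cart", BOUNCE): {"d6", "d7", "d8"},
}


def landing_bounce_rates(clusters):
    """For each landing page: devices that bounce right after it / devices landing on it."""
    landed = Counter()
    bounced = Counter()
    for trajectory, devices in clusters.items():
        landing = trajectory[0]
        landed[landing] += len(devices)
        if len(trajectory) >= 2 and trajectory[1] == BOUNCE:
            bounced[landing] += len(devices)
    return {page: bounced[page] / landed[page] for page in landed if landed[page]}


def landing_pages_to_optimize(rates):
    """Landing pages whose ratio exceeds the mean ratio over all landing pages."""
    if not rates:
        return []
    average = sum(rates.values()) / len(rates)
    return sorted(page for page, rate in rates.items() if rate > average)


rates = landing_bounce_rates(CLUSTERS)
print(rates)
print(landing_pages_to_optimize(rates))
```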
  • the above-mentioned optimization unit 504 includes: a query module (not shown in the figure), a conversion module (not shown in the figure), and an application optimization module (not shown in the figure).
  • the above query module can be configured to query the preset access trajectories from the access trajectories corresponding to all aggregation clusters.
  • the above-mentioned conversion module may be configured to calculate the conversion rate of each page in the preset access track based on the terminal device information of each page in the preset access track.
  • the above-mentioned application optimization module may be configured to optimize the page in response to the conversion rate of a page in the preset access track being less than the conversion rate of pages other than the page in the preset access track to obtain an optimized application.
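  • The text does not define how the conversion rate of a page is computed. One plausible reading, used in the sketch below purely as an assumption, is the fraction of devices that reach a page of the preset trajectory and then continue to the next page of that trajectory. The preset trajectory, cluster data, and function names are hypothetical, and the preset trajectory is assumed not to repeat a page.

```python
# Hypothetical aggregation clusters: access trajectory -> set of device ids.
CLUSTERS = {
    ("home",): {"a1", "a2"},
    ("home", "search"): {"b1", "b2", "b3", "b4"},
    ("home", "search", "details"): {"c1"},
    ("home", "search", "details", "order"): {"e1", "e2", "e3"},
}
PRESET = ("home", "search", "details", "order")


def reach_counts(clusters, preset):
    """Number of devices whose trajectory follows the preset trajectory up to each step."""
    counts = []
    for i in range(1, len(preset) + 1):
        prefix = tuple(preset[:i])
        counts.append(sum(len(d) for t, d in clusters.items() if t[:i] == prefix))
    return counts


def conversion_rates(clusters, preset):
    """Per page: devices continuing to the next preset page / devices reaching the page."""
    counts = reach_counts(clusters, preset)
    return {
        preset[i]: (counts[i + 1] / counts[i] if counts[i] else 0.0)
        for i in range(len(preset) - 1)
    }


def weakest_page(rates):
    """Page whose conversion rate is lower than that of every other page, if any."""
    page = min(rates, key=rates.get)
    others = [r for p, r in rates.items() if p != page]
    return page if all(rates[page] < r for r in others) else None


rates = conversion_rates(CLUSTERS, PRESET)
print(rates)
print(weakest_page(rates))
```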
  • the access trajectory of each aggregation cluster is provided with a bounce node adjacent to the last page of the access trajectory.
  • the above-mentioned optimization unit 504 includes: a traversal module (not shown in the figure) and a node optimization module (not shown in the figure).
  • The above traversal module can be configured to traverse the access trajectories corresponding to all aggregation clusters and count, across all access trajectories, the number of terminal devices that bounce from the page preceding the bounce node.
  • The above-mentioned node optimization module may be configured to, in response to the number of terminal devices bouncing from the page preceding the bounce node being greater than a preset threshold, optimize that page to obtain an optimized application.
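  • The traversal and threshold check can be sketched as a single pass over the clusters that attributes each cluster's device count to the page immediately before the bounce node. The sentinel `BOUNCE`, the threshold value, and the sample clusters are assumptions for illustration.

```python
from collections import Counter

BOUNCE = "<bounce>"  # hypothetical marker for the bounce node closing each trajectory

# Hypothetical aggregation clusters: access trajectory -> set of device ids.
CLUSTERS = {
    ("home", "search", BOUNCE): {"d1", "d2", "d3"},
    ("details", "search", BOUNCE): {"d4", "d5"},
    ("home", "details", BOUNCE): {"d6"},
}


def bounce_counts(clusters):
    """Count, per page, the devices whose trajectory ends at that page before bouncing."""
    counts = Counter()
    for trajectory, devices in clusters.items():
        if len(trajectory) >= 2 and trajectory[-1] == BOUNCE:
            counts[trajectory[-2]] += len(devices)
    return counts


def pages_over_threshold(counts, threshold):
    """Pages whose bounce count is strictly greater than the preset threshold."""
    return sorted(page for page, n in counts.items() if n > threshold)


counts = bounce_counts(CLUSTERS)
print(dict(counts))
print(pages_over_threshold(counts, threshold=4))
```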
  • The collection unit 501 first collects, through the embedded (buried point) data of the application, the access logs of different terminal devices accessing pages of the application; secondly, the acquisition unit 502 obtains, based on the access logs, the access trajectory of each terminal device to the pages of the application in at least one visit within at least one preset time period; thirdly, the aggregation unit 503 performs aggregate statistics on the terminal devices with the same access trajectory to obtain aggregation clusters corresponding to access trajectories and terminal device information; finally, the optimization unit 504 optimizes the pages of the application based on the access trajectory and terminal device information corresponding to each aggregation cluster to obtain an optimized application. Therefore, clustering the terminal devices based on their access trajectories to the application pages in one visit identifies all terminal devices with the same access trajectory, which provides an effective basis for optimizing the application pages, improves the efficiency of application optimization, and improves the user experience.
  • Referring to FIG. 6, a schematic structural diagram of an electronic device 600 suitable for implementing embodiments of the present disclosure is shown.
  • The electronic device 600 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 601, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603.
  • The RAM 603 also stores various programs and data required for the operation of the electronic device 600.
  • the processing device 601, ROM 602 and RAM 603 are connected to each other via a bus 604.
  • An input/output (I/O) interface 605 is also connected to bus 604.
  • The following devices can be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, etc.; output devices 607 including, for example, a liquid crystal display (LCD), speakers, vibrators, etc.; a storage device 608 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 609.
  • Communication device 609 may allow electronic device 600 to communicate wirelessly or wiredly with other devices to exchange data.
  • Although FIG. 6 illustrates the electronic device 600 with various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in Figure 6 may represent one device, or may represent multiple devices as needed.
  • embodiments of the present disclosure include a computer program product including a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via communication device 609, or from storage device 608, or from ROM 602.
  • When the computer program is executed by the processing device 601, the above-described functions defined in the method of the embodiment of the present disclosure are performed.
  • Computer-readable medium in the embodiments of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • Computer-readable storage media may be, for example, but not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard drive, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code contained on a computer-readable medium can be transmitted using any appropriate medium, including but not limited to: wires, optical cables, RF (Radio Frequency), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned server; it may also exist separately without being assembled into the server.
  • The above-mentioned computer-readable medium carries one or more programs. When the one or more programs are executed by the server, the server is caused to: collect, through the buried point data of the application, the access logs of different terminal devices accessing pages of the application; obtain, based on the access logs, the access trajectory of each terminal device to the pages of the application in at least one visit within at least one preset time period; perform aggregate statistics on the terminal devices with the same access trajectory to obtain aggregation clusters corresponding to access trajectories and terminal device information; and optimize the pages of the application based on the access trajectory and terminal device information corresponding to each aggregation cluster to obtain an optimized application.
  • Computer program code for performing operations of embodiments of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as "C" or similar programming languages.
  • The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in the flowchart or block diagram may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown one after another may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending on the functionality involved.
  • each block of the block diagram and/or flowchart illustration, and combinations of blocks in the block diagram and/or flowchart illustration, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or can be implemented using a combination of special-purpose hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in software or hardware.
  • The described units can also be provided in a processor, which may be described, for example, as: a processor including a collection unit, an acquisition unit, an aggregation unit, and an optimization unit.
  • The names of these units do not, under certain circumstances, constitute a limitation on the units themselves.
  • For example, the collection unit can also be described as "a unit configured to collect, through the embedded data of the application, the access logs of pages of the application accessed by different terminal devices".

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Bioethics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Automation & Control Theory (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention relates to the technical fields of data analysis and the like, and concerns an access data processing method and apparatus. A specific embodiment of the method comprises: collecting, by means of buried point data of an application, access logs of different terminal devices accessing a page of the application; acquiring, on the basis of the access logs, an access trajectory of each terminal device to the page of the application in at least one access within at least one preset period; performing aggregation statistics on the terminal devices having the same access trajectory to obtain aggregation clusters corresponding to the access trajectory and terminal device information; and optimizing the page of the application on the basis of the access trajectory and the terminal device information corresponding to each aggregation cluster to obtain the optimized application.
PCT/CN2023/076143 2022-03-10 2023-02-15 Access data processing method and apparatus, electronic device, and computer-readable medium WO2023169165A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210231228.6 2022-03-10
CN202210231228.6A CN114595473A (zh) 2022-03-10 2022-03-10 Access data processing method and apparatus, electronic device, and computer-readable medium

Publications (1)

Publication Number Publication Date
WO2023169165A1 (fr) 2023-09-14

Family

ID=81808906

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/076143 WO2023169165A1 (fr) 2022-03-10 2023-02-15 Access data processing method and apparatus, electronic device, and computer-readable medium

Country Status (2)

Country Link
CN (1) CN114595473A (fr)
WO (1) WO2023169165A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114595473A (zh) * 2022-03-10 2022-06-07 北京京东拓先科技有限公司 Access data processing method and apparatus, electronic device, and computer-readable medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040133671A1 (en) * 2003-01-08 2004-07-08 David Taniguchi Click stream analysis
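CN103823904A (zh) * 2014-03-19 2014-05-28 广东绿瘦健康信息咨询有限公司 Web page browsing path optimization method and system
CN106909567A (zh) * 2015-12-23 2017-06-30 北京国双科技有限公司 Data processing method and apparatus
CN109242164A (zh) * 2018-08-22 2019-01-18 中国平安人寿保险股份有限公司 Method and apparatus for optimizing a product path, computer storage medium, and electronic device
CN113590974A (zh) * 2021-09-29 2021-11-02 北京每日优鲜电子商务有限公司 Recommendation page configuration method and apparatus, electronic device, and computer-readable medium
CN114595473A (zh) * 2022-03-10 2022-06-07 北京京东拓先科技有限公司 Access data processing method and apparatus, electronic device, and computer-readable medium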

Also Published As

Publication number Publication date
CN114595473A (zh) 2022-06-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23765743

Country of ref document: EP

Kind code of ref document: A1