WO2023169165A1

WO2023169165A1 - Access data processing method and apparatus, electronic device, and computer readable medium

Info

Publication number: WO2023169165A1
Application number: PCT/CN2023/076143
Authority: WO
Inventors: 吴轶伦
Original assignee: 北京京东拓先科技有限公司
Priority date: 2022-03-10
Filing date: 2023-02-15
Publication date: 2023-09-14
Also published as: CN114595473A

Abstract

The present invention relates to the technical fields of data analysis and the like, and disclosed are an access data processing method and apparatus. A specific embodiment of the method comprises: collecting, by means of buried point data of an application, access logs of different terminal devices accessing a page of the application; on the basis of the access logs, acquiring an access trajectory of each terminal device to the page of the application in at least one access in at least one preset period of time; performing aggregation statistics on the terminal devices having the same access trajectory to obtain aggregation clusters corresponding to the access trajectory and terminal device information; and optimizing the page of the application on the basis of the access trajectory and terminal device information corresponding to each aggregation cluster to obtain the optimized application.

Description

Access data processing methods and devices, electronic equipment, computer-readable media

Cross-references to related applications

This patent application claims priority to the Chinese patent application submitted on March 10, 2022, with the application number 202210231228.6 and the invention title "Access data processing method and device, electronic equipment, computer-readable medium". The full text of that application is incorporated by reference into this application.

Technical field

The present disclosure relates to the field of computer technology, specifically to data analysis and other technical fields, and in particular to access data processing methods and devices, electronic equipment, computer-readable media and computer program products.

Background

For web pages or applications, a large number of users visit every day and click on every part of the page. If the user's behavior track on the page can be accurately obtained, it will be very helpful for improving Internet products and making user operations more convenient.

Summary

Embodiments of the present disclosure provide access data processing methods and devices, electronic devices, computer-readable media, and computer program products.

In one or more embodiments of the present disclosure, an access data processing method is provided. The method includes: collecting, through buried point (tracking) data of an application, access logs of different terminal devices accessing pages of the application; based on the access logs, obtaining the access track of each terminal device to the pages of the application in at least one visit in at least one preset time period; performing aggregate statistics on terminal devices with the same access track to obtain aggregation clusters, each corresponding to an access track and terminal device information; and optimizing the pages of the application based on the access track and terminal device information corresponding to each aggregation cluster to obtain an optimized application.

In some embodiments, the above method further includes: receiving a query track, where the query track includes at least one page of the application; matching the query track against the access track corresponding to each aggregation cluster; and, in response to determining that the query track matches the access track corresponding to an aggregation cluster, obtaining and displaying the terminal device information corresponding to that aggregation cluster.
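As a non-limiting illustration, the query-track matching can be sketched as follows. The disclosure does not define what counts as a match; this sketch assumes that a query track matches an access track when it appears in it as a run of consecutive pages. The names `track_matches`, `devices_for_query`, and the sample data are hypothetical.

```python
def track_matches(query, track):
    """Return True if `query` occurs in `track` as a run of consecutive pages.

    Assumption: "matching" means contiguous containment; an exact-equality
    rule would be an equally plausible reading of the text.
    """
    query, track = tuple(query), tuple(track)
    n = len(query)
    return any(track[i:i + n] == query for i in range(len(track) - n + 1))


def devices_for_query(query, clusters):
    """clusters: dict mapping an access track (tuple) to a list of device ids."""
    matched = set()
    for track, devices in clusters.items():
        if track_matches(query, track):
            matched.update(devices)
    return sorted(matched)


# Hypothetical aggregation clusters.
clusters = {
    ("home", "search", "detail", "exit"): ["A", "B"],
    ("home", "activity", "exit"): ["C"],
    ("search", "detail", "exit"): ["D"],
}
result = devices_for_query(("search", "detail"), clusters)
```

Here `result` collects the devices of every cluster whose access track contains the search page immediately followed by the details page.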

In some embodiments, the above method further includes: using a Sankey diagram to display the terminal device information and access trajectories corresponding to each aggregation cluster.
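As a non-limiting illustration, the data behind such a Sankey diagram can be derived from the aggregation clusters by summing, for each step, the number of terminal devices moving from one node to the next. The function name `sankey_links`, the `(step, source, target)` key layout, and the sample counts are assumptions made for this sketch; rendering is left to whichever charting tool is used.

```python
from collections import Counter


def sankey_links(track_counts):
    """Return {(step, source, target): device_count} for adjacent nodes.

    track_counts: dict mapping an access track (tuple of nodes) to the
    number of terminal devices in its aggregation cluster.
    """
    links = Counter()
    for track, count in track_counts.items():
        pairs = zip(track, track[1:])
        for step, (source, target) in enumerate(pairs, start=1):
            links[(step, source, target)] += count
    return links


# Hypothetical clusters and device counts.
track_counts = {
    ("home", "search", "exit"): 3,
    ("home", "activity", "exit"): 2,
    ("home", "search", "detail", "exit"): 1,
}
links = sankey_links(track_counts)
```

Keeping the step index in the key lets the same page appear in different columns of the diagram, which matters because one access track can visit a page more than once.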

In some embodiments, the above method further includes: labeling all pages of the application; and in response to receiving the label of the page, performing statistics on access trajectories where the page is located to obtain access trajectory statistical results.

In some embodiments, the above method also includes: marking the first page of the access track corresponding to each aggregation cluster as a landing page; and, in response to receiving a query condition that uses a page as the landing page, obtaining and displaying all access tracks in which that page is the landing page.

In some embodiments, the above method also includes: marking the last page of the access track corresponding to each aggregation cluster as an exit page; and, in response to receiving a query condition that uses a page as the exit page, obtaining and displaying all access tracks in which that page is the exit page.
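As a non-limiting illustration, the landing-page and exit-page queries reduce to filtering access tracks by their first or last page. The function names and sample tracks below are hypothetical, and the tracks are assumed here to contain pages only (no appended exit node).

```python
def tracks_by_landing_page(tracks, page):
    """Access tracks whose first page (the landing page) is `page`."""
    return [t for t in tracks if t and t[0] == page]


def tracks_by_exit_page(tracks, page):
    """Access tracks whose last page (the exit page) is `page`."""
    return [t for t in tracks if t and t[-1] == page]


# Hypothetical access tracks, one per aggregation cluster.
tracks = [
    ("home", "search", "detail"),
    ("detail", "home"),
    ("home", "activity", "home"),
]
landing_home = tracks_by_landing_page(tracks, "home")
exit_home = tracks_by_exit_page(tracks, "home")
```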

In some embodiments, the pages of the above-mentioned application include at least one landing page, and the access track of each aggregation cluster is provided with an exit node adjacent to the last page of the access track. The above-mentioned optimizing of the pages of the application based on the access track and terminal device information corresponding to each aggregation cluster to obtain an optimized application includes: for at least one page serving as a landing page, when the node following the landing page in the access track is the exit node, calculating the ratio of the number of terminal devices at the exit node to the number of terminal devices for which the landing page is the first node of the access track; and, in response to the ratio being greater than the average exit rate of all landing pages, optimizing the landing page to obtain an optimized application.

In some embodiments, the above-mentioned optimizing of the pages of the application based on the access track and terminal device information corresponding to each aggregation cluster to obtain an optimized application includes: querying a preset access track from the access tracks corresponding to all aggregation clusters; calculating the conversion rate of each page in the preset access track based on the terminal device information of each page in the preset access track; and, in response to the conversion rate of a page in the preset access track being less than the conversion rates of the other pages in the preset access track, optimizing that page to obtain an optimized application.

In some embodiments, the access track of each aggregation cluster is provided with an exit node adjacent to the last page of the access track, and the above-mentioned optimizing of the pages of the application based on the access track and terminal device information corresponding to each aggregation cluster to obtain an optimized application includes: traversing the access tracks corresponding to all aggregation clusters and calculating, across all access tracks, the number of terminal devices of the page before the exit node; and, in response to the number of terminal devices of the page before the exit node being greater than a preset threshold, optimizing the page before the exit node to obtain an optimized application.

In one or more embodiments of the present disclosure, an access data processing device is provided. The device includes: a collection unit configured to collect, through buried point data of an application, access logs of different terminal devices accessing pages of the application; an acquisition unit configured to obtain, based on the access logs, the access track of each terminal device to the pages of the application in at least one visit in at least one preset time period; an aggregation unit configured to perform aggregate statistics on terminal devices with the same access track to obtain aggregation clusters, each corresponding to an access track and terminal device information; and an optimization unit configured to optimize the pages of the application based on the access track and terminal device information corresponding to each aggregation cluster to obtain an optimized application.

In some embodiments, the above device further includes: a receiving unit configured to receive a query track, where the query track includes at least one page of the application; a matching unit configured to match the query track against the access track corresponding to each aggregation cluster; and an obtaining unit configured to obtain and display the terminal device information corresponding to an aggregation cluster in response to determining that the query track matches the access track corresponding to that aggregation cluster.

In some embodiments, the above apparatus further includes: a display unit configured to use a Sankey diagram to display the terminal device information and access trajectories corresponding to each aggregation cluster.

In some embodiments, the above-mentioned device further includes: a page labeling unit configured to label all pages of the application; and a statistics unit configured to, in response to receiving the label of a page, perform statistics on the access tracks in which the page is located to obtain access track statistics.

In some embodiments, the above device further includes: a landing labeling unit configured to mark the first page of the access track corresponding to each aggregation cluster as a landing page; and a landing query unit configured to, in response to receiving a query condition that uses a page as the landing page, obtain and display all access tracks in which that page is the landing page.

In some embodiments, the above device further includes: an exit labeling unit configured to mark the last page of the access track corresponding to each aggregation cluster as an exit page; and an exit query unit configured to, in response to receiving a query condition that uses a page as the exit page, obtain and display all access tracks in which that page is the exit page.

In some embodiments, the pages of the above-mentioned application include at least one landing page, and the access track of each aggregation cluster is provided with an exit node adjacent to the last page of the access track. The above-mentioned optimization unit includes: a computing module configured to, for at least one page serving as a landing page, when the node following the landing page in the access track is the exit node, calculate the ratio of the number of terminal devices at the exit node to the number of terminal devices for which the landing page is the first node of the access track; and a page optimization module configured to, in response to the ratio being greater than the average exit rate of all landing pages, optimize the landing page to obtain an optimized application.

In some embodiments, the above-mentioned optimization unit includes: a query module configured to query a preset access track from the access tracks corresponding to all aggregation clusters; a conversion module configured to calculate the conversion rate of each page in the preset access track based on the terminal device information of each page in the preset access track; and an application optimization module configured to, in response to the conversion rate of a page in the preset access track being less than the conversion rates of the other pages in the preset access track, optimize that page to obtain an optimized application.

In some embodiments, the access track of each aggregation cluster is provided with an exit node adjacent to the last page of the access track. The above-mentioned optimization unit includes: a traversal module configured to traverse the access tracks corresponding to all aggregation clusters and calculate, across all access tracks, the number of terminal devices of the page before the exit node; and a node optimization module configured to, in response to the number of terminal devices of the page before the exit node being greater than a preset threshold, optimize the page before the exit node to obtain an optimized application.

In one or more embodiments of the present disclosure, an electronic device is provided. The electronic device includes: one or more processors; and a storage device with one or more programs stored thereon. When the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method described in any of the above implementations.

In one or more embodiments of the present disclosure, a computer-readable medium is provided, with a computer program stored thereon. When the program is executed by a processor, the method described in any of the above implementations is implemented.

In one or more embodiments of the present disclosure, a computer program product is provided, including a computer program. When executed by a processor, the computer program implements the method described in any of the above implementations.

Description of the drawings

Other features, objects and advantages of the present disclosure will become more apparent upon reading the detailed description of the non-limiting embodiments with reference to the following drawings:

Figure 1 is an exemplary system architecture diagram in which an embodiment of the present disclosure may be applied;

Figure 2 is a flow chart of an embodiment of an access data processing method according to the present disclosure;

Figure 3 is a flow chart of another embodiment of an access data processing method according to the present disclosure;

Figure 4 is a schematic diagram using a Sankey diagram to display access trajectories according to the present disclosure;

Figure 5 is a schematic structural diagram of an embodiment of an access data processing device according to the present disclosure;

FIG. 6 is a schematic structural diagram of an electronic device suitable for implementing embodiments of the present disclosure.

Detailed description

The present disclosure will be further described in detail below in conjunction with the accompanying drawings and examples. It can be understood that the specific embodiments described here are only used to explain the relevant invention, but not to limit the invention. It should also be noted that, for convenience of description, only the parts related to the invention are shown in the drawings.

It should be noted that, as long as there is no conflict, the embodiments and features in the embodiments of the present disclosure can be combined with each other. The present disclosure will be described in detail below in conjunction with embodiments with reference to the accompanying drawings.

FIG. 1 illustrates an exemplary system architecture 100 to which the access data processing method of the present disclosure may be applied.

As shown in Figure 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 is a medium used to provide communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types and may typically include wireless communication links and the like.

The terminal devices 101, 102, 103 interact with the server 105 through the network 104 to receive or send messages, etc. Various communication client applications can be installed on the terminal devices 101, 102, and 103, such as instant messaging tools, email clients, etc.

The terminal devices 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, and 103 are hardware, they may be devices with communication and control functions, and such devices can communicate with the server 105. When the terminal devices 101, 102, and 103 are software, they can be installed in the above-mentioned devices and may be implemented as multiple pieces of software or software modules (such as software or software modules used to access applications), or as a single piece of software or software module. There are no specific limitations here.

The server 105 may be a server that provides various services, such as an application server that provides support for applications on the terminal devices 101, 102, and 103. The application server can analyze and process the relevant information of each terminal in the network, and feed back the processing results (such as optimized application installation programs, etc.) to the terminal device.

It should be noted that the server can be hardware or software. When the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it can be implemented as multiple software or software modules (for example, software or software modules used to provide distributed services), or it can be implemented as a single software or software module. There are no specific limitations here.

It should be noted that the access data processing method provided by the embodiment of the present disclosure is generally executed by the server 105.

As shown in Figure 2, a process 200 of an embodiment of an access data processing method according to the present disclosure is shown. The access data processing method includes the following steps:

Step 201: Collect access logs of pages of the application accessed by different terminal devices through the embedded data of the application.

In this embodiment, the buried point data of the application is obtained by instrumenting the pages of the application. Buried point instrumentation (event tracking) is a common data collection method for page analysis. Specifically, it refers to injecting one or more script snippets into the source code corresponding to each page of the application to obtain the behavior data corresponding to the operation events of a terminal device operating the page. The behavior data from multiple different time periods is combined to obtain the access log of the terminal device.

In this embodiment, the buried point data of the application includes the behavior data of at least one terminal device operating the application. The terminal device information in the buried point data can be used to determine which behavior data belongs to each of the at least one terminal device. Arranging the behavior data of a terminal device in timestamp order yields the access log of that terminal device.

The execution subject on which the access data processing method runs (such as the server 105 shown in Figure 1) can obtain access logs through multiple channels, for example, by using a tracking tool to obtain, in real time, the buried point data of applications installed on terminal devices (such as the terminal devices 101, 102, and 103 shown in Figure 1), or by obtaining from a database buried point data that was collected after the application was instrumented in advance.

In this embodiment, the access log may be a browsing log of the terminal device browsing the pages of the application. When the user browses a page of the application through the terminal device, the buried point data reported to the execution subject by the tracking tool or script may include: the reporting site, the unique identifier of the page, the number that identifies the same visit, the device number, the request time period, etc. Every time the terminal device browses a page, the tracking tool or script generates a piece of buried point data and stores it.

In this embodiment, a page of the application can be a page opened through a browser, a native page of the application, or a page embedded in the application that needs to be invoked. The access log is a collection of behavior data generated by multiple terminal devices operating on pages of the application in different visits. A visit refers to one access of a terminal device to the application: it begins when the user opens the application through the terminal device and ends when the user closes the application, or when the application has been running in the background for a set time (for example, 30 min) without any operation. If a user visits the application multiple times within a preset time period, the execution subject records multiple visits.
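As a non-limiting illustration, one reported record and the assembly of per-device access logs can be sketched as follows. The field names, the use of a numeric timestamp in seconds, and the function `build_access_logs` are assumptions made for this sketch; the disclosure only lists the kinds of information a record may carry.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackingRecord:
    site: str         # reporting site
    page_id: str      # unique identifier of the page
    visit_id: str     # number identifying the same visit
    device_id: str    # device number
    timestamp: float  # request time in seconds (assumed representation)


def build_access_logs(records):
    """Group records by device and order each group by timestamp."""
    logs = defaultdict(list)
    for record in records:
        logs[record.device_id].append(record)
    for device_records in logs.values():
        device_records.sort(key=lambda r: r.timestamp)
    return dict(logs)


# Hypothetical records, deliberately out of order.
records = [
    TrackingRecord("app", "search", "v1", "A", 30.0),
    TrackingRecord("app", "home", "v1", "A", 10.0),
    TrackingRecord("app", "home", "v7", "B", 20.0),
]
access_logs = build_access_logs(records)
```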
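As a non-limiting illustration, visits can be separated from a device's event timestamps by treating a period of inactivity longer than the set time as the end of a visit. This sketch assumes that only event timestamps are available (an explicit "application closed" event is not modeled); the function name and sample values are hypothetical.

```python
def split_into_visits(timestamps, timeout_seconds=30 * 60):
    """Split a sorted list of event timestamps (seconds) into visits.

    A new visit starts whenever the gap since the previous event
    exceeds `timeout_seconds` (30 min by default).
    """
    visits = []
    for ts in timestamps:
        if visits and ts - visits[-1][-1] <= timeout_seconds:
            visits[-1].append(ts)
        else:
            visits.append([ts])
    return visits


# Hypothetical events: three close together, then a 31-minute pause.
events = [0, 60, 600, 600 + 31 * 60, 600 + 32 * 60]
visits = split_into_visits(events)
```

When the reported data already carries a number identifying the visit, grouping by that number would replace this step.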

Step 202: Based on the access log, obtain the access track of each terminal device to the page of the application in at least one visit in at least one preset time period.

In this embodiment, the access log is a collection of behavior data of multiple terminal devices operating on the pages of the application. For any one of the multiple terminal devices, the access log can be used to obtain the access track of that terminal device operating the pages of the application during a visit.

In this embodiment, the application has multiple pages, and the access track is a record of the user accessing the application through the terminal device and operating each page in the application. For example, the application includes a home page, an activity page, a search page, and a details page. If the user, through the terminal device, browses the home page, activity page, home page, and search page of the application in sequence, the access track corresponding to the terminal device is home page - activity page - home page - search page.

In this embodiment, the access track includes multiple nodes, each node corresponding to a page of the application. Any two of the multiple nodes can be the same or different. For example, the access track above contains the home page twice; that is, two nodes in the access track correspond to the same page.

In this embodiment, the preset time period can be set according to application development requirements. For example, the preset time period is one day, or half a day, etc. A daily scheduled task can be used to calculate, from the basic traffic data obtained, the access track of each terminal device to the pages of the application in at least one visit. For example, the calculation task is executed at 3 a.m. every day. The access track of a terminal device to the pages of the application in one visit in the preset time period refers to the sequence of pages viewed by the same terminal device in the same visit in the preset time period. The above-mentioned obtaining of the access track of each terminal device to the pages of the application in at least one visit in at least one preset time period includes: taking the preset time period as the cycle, sorting the pages browsed by the same user in the same visit by timestamp from first to last, and performing adjacent deduplication according to the unique identifier of the page (when two adjacent nodes in the access track are the same, one of them is removed) to obtain the access track of the terminal device in the preset time period.

Optionally, an exit identifier can be added as a unified APP exit after the last page in the access track of all terminal devices, that is, exit is the exit node of the access track. For example, the access track of terminal A is home page -> activity page -> home page -> search page -> exit.
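As a non-limiting illustration, the sorting, adjacent deduplication, and optional exit node described above can be sketched as follows. The function name `build_access_track`, the `(timestamp, page_id)` input layout, and the sample page views are assumptions made for this sketch.

```python
from itertools import groupby

EXIT = "exit"  # identifier of the unified exit node


def build_access_track(page_views, add_exit=True):
    """Build one access track from the page views of one device in one visit.

    page_views: iterable of (timestamp, page_id) pairs, in any order.
    """
    # Sort by timestamp from first to last.
    ordered = [page for _, page in sorted(page_views, key=lambda pv: pv[0])]
    # Adjacent deduplication: collapse runs of the same page into one node.
    track = [page for page, _ in groupby(ordered)]
    if add_exit:
        track.append(EXIT)
    return tuple(track)


# Hypothetical page views; the home page is reported twice in a row.
views = [(3, "activity"), (1, "home"), (2, "home"), (4, "home"), (5, "search")]
track = build_access_track(views)
```

Only adjacent repeats are removed, so the later return to the home page is kept as its own node, consistent with the home page - activity page - home page - search page example.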

Step 203: Aggregate statistics on terminal devices with the same access trajectories to obtain aggregation clusters corresponding to access trajectories and terminal device information.

In this embodiment, the same access track corresponds to at least one terminal device. When terminal devices with the same access track are aggregated together, an aggregation cluster corresponding to the access track is obtained. The obtained aggregation cluster can include the terminal device information of at least one terminal device.

In this embodiment, the terminal device information is information related to the terminal device. For example, the terminal device information includes: a unique identifier of the terminal device, a device number of the terminal device, and the number of terminal devices. It should be noted that the number of terminal devices is the total number of terminal devices with the same access track. For example, the access track of terminal A is home page -> activity page -> home page -> search page -> exit, and on that day only terminal B has exactly the same access track as terminal A, that is, the access track of terminal B is also home page -> activity page -> home page -> search page -> exit. The number of terminal devices on this access track that day is then 2.

In this embodiment, the access track consists of pages or nodes arranged in forward chronological order. The access track clearly identifies the pages accessed from the first step to the Nth step. Specifically, the aggregation table of the aggregation clusters is shown in Table 1.
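As a non-limiting illustration, the aggregation step can be sketched as grouping device identifiers by identical access track and recording the device count of each group. The function name `aggregate_by_track` and the dictionary layout of a cluster are assumptions made for this sketch.

```python
from collections import defaultdict


def aggregate_by_track(device_tracks):
    """Group devices that share exactly the same access track.

    device_tracks: iterable of (device_id, track) pairs, where track is a
    sequence of nodes. Returns {track: {"devices": [...], "count": n}}.
    """
    devices_by_track = defaultdict(set)
    for device_id, track in device_tracks:
        devices_by_track[tuple(track)].add(device_id)
    return {
        track: {"devices": sorted(devices), "count": len(devices)}
        for track, devices in devices_by_track.items()
    }


# Terminals A and B share a track, as in the example above; C is hypothetical.
shared = ("home", "activity", "home", "search", "exit")
clusters = aggregate_by_track(
    [("A", shared), ("B", shared), ("C", ("home", "exit"))]
)
```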

Table 1

In Table 1, the first to fourth steps are the accessed pages of the access track, or set nodes (such as exit nodes). The accessed pages include: activity page, details page, search page, home page, and list page. These pages are specific to the application: different applications contain different pages, and the names of the pages differ accordingly.

Step 204: Based on the access trajectories and terminal device information corresponding to each aggregation cluster, optimize the application page to obtain an optimized application.

In this embodiment, the application may have one or more pages. Optimizing the pages of the application based on the access track and terminal device information corresponding to each aggregation cluster may mean optimizing one page of the application or optimizing multiple pages. The optimization of a page can include: deleting the page, rearranging the page, modifying the content of the page, and other optimization methods.

In this embodiment, an aggregation cluster corresponds to an access track in one visit. An aggregation cluster may have the terminal device information of one or more terminal devices, and the one or more terminal devices correspond to the same access track in a visit.

In some optional implementations of this embodiment, the pages of the above application include at least one landing page, and the access track of each aggregation cluster is provided with an exit node adjacent to the last page of the access track. The above-mentioned optimizing of the pages of the application based on the access track and terminal device information corresponding to each aggregation cluster to obtain an optimized application includes: for at least one page serving as a landing page, when the node following the landing page in the access track is the exit node, calculating the ratio of the number of terminal devices at the exit node to the number of terminal devices for which the landing page is the first node of the access track; and, in response to the ratio being greater than the average exit rate of all landing pages, optimizing the landing page to obtain an optimized application.

In this embodiment, the landing page refers to the first page of the application that a visitor is linked to after clicking on content seen somewhere outside the application, such as a specific activity with a clear theme or attractive discount information published through emails, social media, or advertisements. The landing page can be any page in the application; for example, the landing page is the details page in the application, or the home page in the application.

In this optional implementation, the exit node is the last node in all access trajectories. Through the exit node, it can be determined that the terminal device has finished accessing the application.

In this optional implementation, the exit rate of a landing page is the number of terminal devices for which the node following the landing page is the exit node, as a proportion of the number of terminal devices for which the landing page is the first page of the access track. The average exit rate is a value calculated from the traffic statistics of all landing pages. Specifically, the average exit rate can be calculated in three ways: weighted average, arithmetic average, and geometric average:

Weighted average: The sum, over all access tracks in the application, of the number of terminal devices whose landing page is immediately followed by the exit node is used as the numerator, and the sum of the number of terminal devices for which a landing page is the first page of the access track is used as the denominator; the weighted average is the resulting ratio.

Arithmetic average: The exit rates of all landing pages are directly added up and divided by the number of landing pages.

Geometric mean: The exit rates of all landing pages are multiplied together and the Nth root of the product is taken, where N is equal to the number of landing pages.

In this embodiment, there is not only one landing page; there can be many landing pages. A landing page is defined as the first page of the application reached in a visit.

That is, the average exit rate is the mean value of the ratio of the number of terminal devices for which the exit node follows the landing page to the number of terminal devices for which the landing page is the first node of the access track, taken over all access tracks in the application.

In this optional implementation, by calculating the ratio of the number of terminal devices at the exit node to the number of terminal devices for which the landing page is the first node of the access track, the access situation of the landing page can be effectively analyzed, providing a reliable basis for improving the landing page.
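As a non-limiting illustration, the per-landing-page exit rate and the three kinds of average exit rate can be sketched as follows. The sketch assumes a non-empty mapping from access track to device count in which every track ends with the exit node; the function name, the choice of the weighted average as the comparison baseline, and the sample counts are hypothetical.

```python
import math
from collections import Counter

EXIT = "exit"


def landing_exit_stats(track_counts):
    """Return (exit rate per landing page, the three average exit rates)."""
    landings, exits = Counter(), Counter()
    for track, count in track_counts.items():
        landing = track[0]            # first node of the access track
        landings[landing] += count
        if len(track) > 1 and track[1] == EXIT:
            exits[landing] += count   # exit node directly follows the landing page
    rates = {page: exits[page] / landings[page] for page in landings}
    n = len(rates)
    averages = {
        "weighted": sum(exits.values()) / sum(landings.values()),
        "arithmetic": sum(rates.values()) / n,
        # Geometric mean: Nth root of the product of the N exit rates.
        "geometric": math.prod(rates.values()) ** (1 / n),
    }
    return rates, averages


# Hypothetical aggregation clusters: access track -> number of terminal devices.
track_counts = {
    ("home", EXIT): 20,
    ("home", "search", EXIT): 80,
    ("detail", EXIT): 30,
    ("detail", "home", EXIT): 30,
}
rates, averages = landing_exit_stats(track_counts)
to_optimize = [page for page, rate in rates.items() if rate > averages["weighted"]]
```

With these sample counts the home page has an exit rate of 20/100 and the details page 30/60, so only the details page exceeds the weighted average of 50/160 and is selected for optimization. Note that the geometric mean becomes zero as soon as any landing page has an exit rate of zero.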

In some optional implementations of this embodiment, the above-mentioned optimizing of the pages of the application based on the access track and terminal device information corresponding to each aggregation cluster to obtain an optimized application includes: querying a preset access track from the access tracks corresponding to all aggregation clusters; calculating the conversion rate of each page in the preset access track based on the terminal device information of each page in the preset access track; and, in response to the conversion rate of a page in the preset access track being less than the conversion rates of the other pages in the preset access track, optimizing that page to obtain an optimized application.

In this optional implementation, the conversion rate is related to the number of terminal device visits to each page. For example, if a visit track is from the search page to the product details page, then the conversion rate of the search page is: the ratio with the number of terminal devices on the search page as the denominator and the number of terminal devices from the search page to the product details page as the numerator.

In this optional implementation, attention is focused on the conversion rate of each page in a core process. When the conversion rate of a certain page in the entire access track is lower than the conversion rates of the other pages, that page is a page to be optimized, and removing or changing it should be considered.

In this optional implementation method, the conversion rate of each page of the application is calculated to ensure the optimization effect of the core pages in the application and improve the efficiency of application optimization.
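As a non-limiting illustration, the conversion rate of each page in a preset access track can be sketched as follows. The disclosure does not fix how devices are counted, so this sketch assumes that a cluster's devices count as "on a page" when the page appears anywhere in the cluster's track, and as "converted" when the page is immediately followed by the next page of the preset track. It also assumes the preset track does not repeat a page. All names and sample counts are hypothetical.

```python
def contains_step(track, page, next_page):
    """True if `page` is immediately followed by `next_page` in `track`."""
    return any(a == page and b == next_page for a, b in zip(track, track[1:]))


def conversion_rates(track_counts, preset_track):
    """Conversion rate of every page of the preset track except the last."""
    rates = {}
    for page, next_page in zip(preset_track, preset_track[1:]):
        reached = sum(c for t, c in track_counts.items() if page in t)
        converted = sum(
            c for t, c in track_counts.items() if contains_step(t, page, next_page)
        )
        rates[page] = converted / reached if reached else 0.0
    return rates


# Hypothetical aggregation clusters: access track -> number of terminal devices.
track_counts = {
    ("home", "search", "detail", "exit"): 30,
    ("home", "search", "exit"): 30,
    ("home", "exit"): 40,
}
rates = conversion_rates(track_counts, ("home", "search", "detail"))
weakest = min(rates, key=rates.get)  # candidate page to optimize
```

With these sample counts, 60 of 100 devices move from the home page to the search page and 30 of 60 move from the search page to the details page, so the search page has the lower conversion rate.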

In some optional implementations of this embodiment, the access track of each aggregation cluster is provided with a jump-out node adjacent to the last page of the access track. The above is based on the access track and terminal device information corresponding to each aggregation cluster. , optimize the application pages to obtain the optimized application, including: traversing the access trajectories corresponding to all aggregation clusters, calculating the number of terminal devices in all access trajectories that jump out of the page before the node; responding to the terminal devices that jump out of the page before the node If the number of devices is greater than the preset threshold, the page before jumping out of the node will be optimized to obtain an optimized application.

In this optional implementation, the pages preceding the bounce node (bounce pages) in all access trajectories are traversed, and the traffic (number of terminal devices) of every bounce page is counted. The bounce pages with the highest traffic are the high-frequency bounce pages. Optimizing the high-frequency bounce pages, with attention to why users are lost on those pages and to the direction of improvement, can effectively optimize the application.

In this optional implementation, counting the number of terminal devices on the page preceding the bounce node across all access trajectories identifies the high-frequency bounce pages in the application, which provides a reliable basis for optimizing the application's pages.
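
As an illustration of the bounce-page count, the following sketch assumes that every trajectory ends with a sentinel bounce node and that clusters map a trajectory to a device count. The sentinel value `BOUNCE`, the function names, and the threshold of 50 are assumptions chosen for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple

BOUNCE = "<bounce>"  # hypothetical marker for the bounce node


def bounce_page_traffic(clusters: Dict[Tuple[str, ...], int]) -> Counter:
    """Sum, per page, the devices that left the application from that page."""
    traffic: Counter = Counter()
    for trajectory, device_count in clusters.items():
        # The bounce page is the page immediately before the bounce node.
        if len(trajectory) >= 2 and trajectory[-1] == BOUNCE:
            traffic[trajectory[-2]] += device_count
    return traffic


def pages_over_threshold(traffic: Counter, threshold: int) -> List[str]:
    """Pages whose bounce traffic is greater than the preset threshold."""
    return sorted(page for page, n in traffic.items() if n > threshold)


clusters = {
    ("home", "search", BOUNCE): 30,
    ("search", BOUNCE): 25,
    ("home", "detail", BOUNCE): 10,
}
traffic = bounce_page_traffic(clusters)
print(traffic.most_common())              # bounce pages ordered by traffic
print(pages_over_threshold(traffic, 50))  # pages to optimize
```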

The access data processing method provided by the embodiments of the present disclosure can view the path distribution of users when using products through cluster information without configuration, and supports global observation of users' hot pages and main access trajectories in products. At the same time, users can be grouped according to the terminal device information and access trajectories in the cluster, supporting the comparison of the differences in behavioral paths of different types of users. When there are landing pages in the access track, you can also filter different landing pages as a starting point to view the user's subsequent path distribution. It can not only view the global user path distribution with zero configuration, but also meet the needs of configurable analysis to a certain extent.

The access data processing method provided by the embodiments of the present disclosure first collects, through the embedded data of the application, the access logs of different terminal devices accessing the pages of the application; secondly, based on the access logs, obtains the access trajectory of each terminal device to the pages of the application in at least one visit in at least one preset time period; thirdly, performs aggregation statistics on the terminal devices with the same access trajectory to obtain aggregation clusters corresponding to the access trajectory and terminal device information; finally, optimizes the pages of the application based on the access trajectory and terminal device information corresponding to each aggregation cluster to obtain the optimized application. Therefore, clustering the terminal devices based on their access trajectories to the application pages in one visit determines all terminal devices with the same access trajectory, which provides an effective basis for optimizing the application pages, improves the efficiency of application optimization, and improves the user experience.
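
The first three steps summarized above (collect logs, derive trajectories, aggregate identical trajectories) can be sketched as below. The log record layout `(device_id, timestamp, page)` is an assumption, and for brevity the sketch treats all records of a device as a single visit within one preset time period; splitting into multiple visits or periods is omitted.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

LogRecord = Tuple[str, int, str]  # (device_id, timestamp, page), assumed layout


def build_trajectories(log: List[LogRecord]) -> Dict[str, Tuple[str, ...]]:
    """Order each device's page views by time to obtain its access trajectory."""
    events = defaultdict(list)
    for device_id, timestamp, page in log:
        events[device_id].append((timestamp, page))
    return {
        device_id: tuple(page for _, page in sorted(views))
        for device_id, views in events.items()
    }


def aggregate(trajectories: Dict[str, Tuple[str, ...]]) -> Dict[Tuple[str, ...], List[str]]:
    """Group devices that share exactly the same trajectory into one cluster."""
    clusters = defaultdict(list)
    for device_id, trajectory in trajectories.items():
        clusters[trajectory].append(device_id)
    return {trajectory: sorted(ids) for trajectory, ids in clusters.items()}


log = [
    ("A", 1, "home"), ("B", 5, "home"), ("A", 2, "search"),
    ("C", 3, "search"), ("B", 7, "search"), ("C", 4, "detail"),
]
clusters = aggregate(build_trajectories(log))
for trajectory, device_ids in clusters.items():
    print(trajectory, len(device_ids), device_ids)
```

Each cluster here carries both the trajectory and the terminal device information (the IDs, from which the device count follows).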

In order to better analyze access trajectories, in another embodiment of the present disclosure, the above access data processing method further includes: receiving a query trajectory, where the query trajectory includes at least one page of the application; matching the query trajectory against the access trajectory corresponding to each aggregation cluster; and, in response to determining that the query trajectory matches the access trajectory corresponding to an aggregation cluster, obtaining and displaying the terminal device information corresponding to that aggregation cluster.

In this embodiment, in order to better query the access trajectories, a query trajectory sent by a developer can be received. The query trajectory may be the same as the access trajectory corresponding to an aggregation cluster, or it may differ from the access trajectories corresponding to the aggregation clusters. When the query trajectory is the same as an access trajectory, it is determined that the query trajectory matches the access trajectory corresponding to that aggregation cluster. The terminal device information of the matching aggregation cluster, such as the number of terminal devices, can then be obtained, so that the access traffic of the query trajectory can be analyzed.

In this embodiment, the information of the terminals of the aggregation cluster is queried through the query trajectory, which provides a reliable query basis for the data access status of the application page and ensures the reliability of the application access data analysis.
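
A minimal sketch of the query-trajectory match follows. It assumes that clusters are keyed by the trajectory tuple and hold the list of device IDs, and that "matching" means exact equality, as described above. The function name `lookup` and the returned field names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def lookup(clusters: Dict[Tuple[str, ...], List[str]],
           query: Sequence[str]) -> Optional[dict]:
    """Return the terminal device information of the cluster whose trajectory
    equals the query trajectory, or None when no cluster matches."""
    key = tuple(query)
    device_ids = clusters.get(key)
    if device_ids is None:
        return None
    return {"trajectory": key, "device_count": len(device_ids), "device_ids": device_ids}


clusters = {
    ("home", "search"): ["A", "B"],
    ("search", "detail"): ["C"],
}
print(lookup(clusters, ["home", "search"]))  # matching cluster
print(lookup(clusters, ["home", "detail"]))  # no cluster with this trajectory
```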

In order to collect statistics on access to application pages, in another embodiment of the present disclosure, the above access data processing method further includes: tagging all pages of the application; and, in response to receiving the tag of a page, performing statistics on the access trajectories in which the page is located to obtain access trajectory statistics results.

In this embodiment, all pages of the application are tagged, so the generated access trajectories also carry tags. When statistics are compiled on the access trajectories of any one of the pages, the access trajectories and the aggregation clusters in which the page is located can be determined, and information such as the number of access trajectories and the number of aggregation clusters can then be obtained.

In this embodiment, the access trajectory statistics results include: the access trajectory name, the number of access trajectories, the aggregation cluster corresponding to each access trajectory, the number of terminal devices corresponding to each access trajectory, and so on.

In this embodiment, tags can be text, symbols, codes, etc. The tag of a received page uniquely identifies that page of the application. Setting tags for the pages supports querying a page independently as well as compiling statistics on the access trajectories of the received page, thereby realizing an interleaved query method with access trajectories as horizontal queries and pages as vertical queries.

In this embodiment, each page is tagged, and each user path is tagged according to the pages it includes. In the front-end configuration, after entering the unique identifier of a page of interest, the user can view the distribution of the access trajectories that pass through that page and can also query access trajectory statistics such as the number of terminal devices in the aggregation clusters, which improves the reliability of page retrieval.
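
The tag-based query can be illustrated with the sketch below. It assumes a lookup table from tag to page and clusters that map a trajectory to a device count; the tag values (`P01` and so on), the function name, and the result fields are made up for the example.

```python
from typing import Dict, Tuple


def stats_for_tag(clusters: Dict[Tuple[str, ...], int],
                  tag_to_page: Dict[str, str],
                  tag: str) -> dict:
    """Collect statistics over every access trajectory that contains the tagged page."""
    page = tag_to_page[tag]  # a tag uniquely identifies one page
    hits = {traj: n for traj, n in clusters.items() if page in traj}
    return {
        "page": page,
        "trajectory_count": len(hits),      # also the number of clusters
        "device_count": sum(hits.values()),
        "trajectories": sorted(hits),
    }


tag_to_page = {"P01": "home", "P02": "search", "P03": "detail"}
clusters = {
    ("home", "search"): 12,
    ("search", "detail"): 7,
    ("home", "detail"): 4,
}
result = stats_for_tag(clusters, tag_to_page, "P02")
print(result)
```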

In order to collect statistics on the landing pages of the application, in another embodiment of the present disclosure, the above method further includes: marking the first page of the access trajectory corresponding to each aggregation cluster as a landing page; and, in response to receiving a query condition that uses a page as the landing page, obtaining and displaying all access trajectories in which that page is the landing page.

In this embodiment, marking the first page of every access trajectory as a landing page adds a query condition for the access trajectories. For example, when an access trajectory consists of the details page followed by the search page, the details page is the landing page; when the query condition is that the details page is the landing page, all access trajectories with the details page as the landing page are queried and displayed. It should be noted that a Sankey diagram can be used to display the queried access trajectories.

In this embodiment, the first page of the access trajectory corresponding to each aggregation cluster is marked as a landing page. When a query condition that uses a page as the landing page is received, all access trajectories in which that page is the landing page are obtained and displayed, which enriches the access trajectory queries.

In order to collect statistics on the exit pages of the application, in another embodiment of the present disclosure, the above method further includes: marking the last page of the access trajectory corresponding to each aggregation cluster as an exit page; and, in response to receiving a query condition that uses a page as the exit page, obtaining and displaying all access trajectories in which that page is the exit page.

In this embodiment, marking the last page of every access trajectory as an exit page adds a query condition for the access trajectories. For example, when an access trajectory consists of the details page followed by the search page, the search page is the exit page; when the query condition is that the search page is the exit page, all access trajectories with the search page as the exit page are queried and displayed. It should be noted that a Sankey diagram can be used to display the access trajectories.

In this embodiment, the last page of the access trajectory corresponding to each aggregation cluster is marked as an exit page. When a query condition that uses a page as the exit page is received, all access trajectories in which that page is the exit page are obtained and displayed, which enriches the access trajectory queries.
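
The landing-page and exit-page queries of the two embodiments above reduce to filtering on the first or the last page of each trajectory. The sketch below assumes clusters that map a trajectory to a device count and no separate bounce node at the end of the trajectory; the function names are hypothetical.

```python
from typing import Dict, Tuple

Clusters = Dict[Tuple[str, ...], int]


def with_landing_page(clusters: Clusters, page: str) -> Clusters:
    """Trajectories whose first page (the landing page) is the given page."""
    return {t: n for t, n in clusters.items() if t and t[0] == page}


def with_exit_page(clusters: Clusters, page: str) -> Clusters:
    """Trajectories whose last page (the exit page) is the given page."""
    return {t: n for t, n in clusters.items() if t and t[-1] == page}


clusters = {
    ("detail", "search"): 8,
    ("detail", "cart"): 5,
    ("home", "search"): 11,
}
print(with_landing_page(clusters, "detail"))
print(with_exit_page(clusters, "search"))
```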

Please refer to FIG. 3, which shows a process 300 of another embodiment of the access data processing method provided by the present disclosure. The access data processing method may include the following steps:

Step 301: Collect access logs of pages of the application accessed by different terminal devices through the embedded data of the application.

Step 302: Based on the access log, obtain the access track of each terminal device to the page of the application in at least one visit in at least one preset time period.

Step 303: Perform aggregation statistics on terminal devices with the same access trajectories to obtain aggregation clusters corresponding to access trajectories and terminal device information.

Step 304: Use a Sankey diagram to display the terminal device information and access trajectories corresponding to each aggregation cluster.

Sankey diagram: Sankey energy distribution diagram, also called Sankey energy balance diagram. It is a specific type of flowchart in which the width of the extended branches corresponds to the size of the data flow.

Specifically, you can use visualization tools to call application layer data and drag and drop fields to complete the construction of the Sankey diagram. Drag the first, second, and Nth (N>1) step fields in each access track to the dimension in sequence, set the indicator to the user number field, and then set the sorting conditions (for example, descending order of user number).

Without any conditional filtering, the Sankey diagram shows the access trajectory distribution of an application with zero configuration. When the mouse hovers over a step, the number of terminal devices passing through that page is shown; when the mouse hovers between any two steps of an access trajectory, the number of terminal devices that go from the one page to the other is shown.

Figure 4 is a schematic diagram showing access trajectories using a Sankey diagram. In Figure 4, S, T, W, V, M, N, and U represent different pages in the application. Correspondingly, terminal device information (such as the number of terminal devices, IDs, etc.) can also be displayed at the corresponding positions of the Sankey diagram.

Step 305: Based on the access trajectories and terminal device information corresponding to each aggregation cluster, optimize the application page to obtain an optimized application.

It should be understood that the operations and features in steps 301 to 303 and step 305 correspond respectively to the operations and features in steps 201 to 204. Therefore, the above descriptions of the operations and features in steps 201 to 204 also apply to steps 301 to 303 and step 305, and are not repeated here.

The access data processing method provided in this embodiment uses a Sankey diagram to display the terminal device information and access trajectories before optimizing the application page, which represents the aggregation clusters visually and provides a vivid trajectory display for improving the application.
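
The node and link values that a Sankey diagram of this kind displays can be derived from the aggregation clusters as sketched below. This is only one possible data preparation under assumed conventions: a node is a `(step index, page)` pair, a link joins two consecutive steps, and both are weighted by the number of terminal devices. The page names S, T, W, and V are borrowed from Figure 4, but the counts are made up, and rendering with a visualization tool is left out.

```python
from collections import Counter
from typing import Dict, Tuple

Node = Tuple[int, str]  # (step index, page)


def sankey_data(clusters: Dict[Tuple[str, ...], int]):
    """Compute device counts per node and per link between consecutive steps."""
    nodes: Counter = Counter()
    links: Counter = Counter()
    for trajectory, device_count in clusters.items():
        for step, page in enumerate(trajectory):
            nodes[(step, page)] += device_count
            if step + 1 < len(trajectory):
                target = (step + 1, trajectory[step + 1])
                links[((step, page), target)] += device_count
    return nodes, links


clusters = {
    ("S", "T", "W"): 5,
    ("S", "T"): 3,
    ("S", "V"): 2,
}
nodes, links = sankey_data(clusters)
print(nodes[(0, "S")])              # devices passing through S at step 1
print(links[((0, "S"), (1, "T"))])  # devices going from S to T
```

The node weight corresponds to the value shown when hovering over a step, and the link weight to the value shown when hovering between two steps.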

With further reference to Figure 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an access data processing device. The device embodiment corresponds to the method embodiment shown in Figure 2. The device can be specifically applied in various electronic devices.

As shown in Figure 5, an embodiment of the present disclosure provides an access data processing device 500. The device 500 includes: a collection unit 501, an acquisition unit 502, an aggregation unit 503, and an optimization unit 504. The above-mentioned collection unit 501 may be configured to collect access logs of pages of the application accessed by different terminal devices through buried point data of the application. The above-mentioned obtaining unit 502 may be configured to obtain, based on the access log, the access track of each terminal device to the page of the application in at least one visit in at least one preset time period. The above-mentioned aggregation unit 503 may be configured to perform aggregation statistics on terminal devices with the same access trajectories, and obtain aggregation clusters corresponding to access trajectories and terminal device information. The above-mentioned optimization unit 504 may be configured to optimize application pages based on access trajectories and terminal device information corresponding to each aggregation cluster to obtain an optimized application.

In this embodiment, for the specific processing of the collection unit 501, the acquisition unit 502, the aggregation unit 503, and the optimization unit 504 in the access data processing device 500, and the technical effects they bring, reference may be made to steps 201, 202, 203, and 204, respectively, in the embodiment corresponding to Figure 2.

In some embodiments, the above-mentioned device 500 further includes: a receiving unit (not shown in the figure), a matching unit (not shown in the figure), and an obtaining unit (not shown in the figure). Wherein, the above-mentioned receiving unit may be configured to receive a query track, where the query track includes at least one page of the application. The above-mentioned matching unit may be configured to match the query trajectories with the access trajectories corresponding to each aggregation cluster. The above obtaining unit may be configured to obtain and display the terminal device information corresponding to the aggregation cluster in response to determining that the query trajectory matches the access trajectory corresponding to the aggregation cluster.

In some embodiments, the above-mentioned device 500 further includes: a display unit (not shown in the figure). The above display unit may be configured to use a Sankey diagram to display the terminal device information and access trajectories corresponding to each aggregation cluster.

In some embodiments, the above-mentioned device 500 also includes: a page annotation unit (not shown in the figure) and a statistics unit (not shown in the figure). The above-mentioned page annotation unit may be configured to tag all pages of the application. The above statistics unit may be configured to, in response to receiving the tag of a page, perform statistics on the access trajectories in which the page is located and obtain access trajectory statistics results.

In some embodiments, the above-mentioned device 500 also includes: a landing annotation unit (not shown in the figure) and a landing query unit (not shown in the figure). Wherein, the above-mentioned landing annotation unit may be configured to perform landing page annotation on the first page of the access track corresponding to each aggregation cluster. The landing query unit is configured to, in response to receiving a query condition using a page as a landing page, obtain and display all access tracks of the page as a landing page.

In some embodiments, the above-mentioned device 500 also includes: an exit annotation unit (not shown in the figure) and an exit query unit (not shown in the figure). The above-mentioned exit annotation unit may be configured to mark the last page of the access trajectory corresponding to each aggregation cluster as an exit page. The above-mentioned exit query unit may be configured to, in response to receiving a query condition that uses a page as an exit page, obtain and display all access trajectories in which that page is the exit page.

In some embodiments, the pages of the above-mentioned application include at least one landing page, and the access trajectory of each aggregation cluster is provided with a bounce node adjacent to the last page of the access trajectory. The above-mentioned optimization unit 504 includes: a calculation module (not shown in the figure) and a page optimization module (not shown in the figure). The above-mentioned calculation module may be configured to, for at least one page serving as a landing page, when the node following the landing page in the access trajectory is the bounce node, calculate the ratio of the number of terminal devices at the bounce node to the number of terminal devices at the landing page as the first node of the access trajectory. The above page optimization module may be configured to optimize the landing page, in response to the ratio being greater than the average exit rate of all landing pages, to obtain an optimized application.
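
The ratio computed by the calculation module can be illustrated as follows. The sketch assumes a sentinel bounce node at the end of every trajectory and clusters that map a trajectory to a device count. It also assumes that the "average exit rate of all landing pages" is the unweighted mean of the per-page ratios, which the text does not specify; `BOUNCE` and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

BOUNCE = "<bounce>"  # hypothetical marker for the bounce node


def landing_exit_rates(clusters: Dict[Tuple[str, ...], int]) -> Dict[str, float]:
    """Per landing page: devices that bounced right after landing / devices that landed."""
    landed: Counter = Counter()
    bounced: Counter = Counter()
    for trajectory, device_count in clusters.items():
        if not trajectory or trajectory[0] == BOUNCE:
            continue
        landed[trajectory[0]] += device_count
        # The node directly after the landing page is the bounce node.
        if len(trajectory) >= 2 and trajectory[1] == BOUNCE:
            bounced[trajectory[0]] += device_count
    return {page: bounced[page] / landed[page] for page in landed}


def landing_pages_to_optimize(rates: Dict[str, float]) -> List[str]:
    """Landing pages whose ratio exceeds the (unweighted) average exit rate."""
    if not rates:
        return []
    average = sum(rates.values()) / len(rates)
    return sorted(page for page, rate in rates.items() if rate > average)


clusters = {
    ("home", BOUNCE): 10,
    ("home", "search", BOUNCE): 30,
    ("promo", BOUNCE): 30,
    ("promo", "detail", BOUNCE): 10,
}
rates = landing_exit_rates(clusters)
print(rates, landing_pages_to_optimize(rates))
```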

In some embodiments, the above-mentioned optimization unit 504 includes: a query module (not shown in the figure), a conversion module (not shown in the figure), and an application optimization module (not shown in the figure). The above query module can be configured to query the preset access trajectories from the access trajectories corresponding to all aggregation clusters. The above-mentioned conversion module may be configured to calculate the conversion rate of each page in the preset access track based on the terminal device information of each page in the preset access track. The above-mentioned application optimization module may be configured to optimize the page in response to the conversion rate of a page in the preset access track being less than the conversion rate of pages other than the page in the preset access track to obtain an optimized application.

In some embodiments, the access trajectory of each aggregation cluster is provided with a bounce node adjacent to the last page of the access trajectory. The above-mentioned optimization unit 504 includes: a traversal module (not shown in the figure) and a node optimization module (not shown in the figure). The above traversal module may be configured to traverse the access trajectories corresponding to all aggregation clusters and calculate, over all access trajectories, the number of terminal devices on the page preceding the bounce node. The above-mentioned node optimization module may be configured to, in response to the number of terminal devices on the page preceding the bounce node being greater than a preset threshold, optimize the page preceding the bounce node to obtain an optimized application.

In the access data processing device provided by the embodiments of the present disclosure, the collection unit 501 first collects, through the embedded data of the application, the access logs of different terminal devices accessing the pages of the application; secondly, the acquisition unit 502 obtains, based on the access logs, the access trajectory of each terminal device to the pages of the application in at least one visit in at least one preset time period; thirdly, the aggregation unit 503 performs aggregation statistics on the terminal devices with the same access trajectory to obtain aggregation clusters corresponding to the access trajectory and terminal device information; finally, the optimization unit 504 optimizes the pages of the application based on the access trajectory and terminal device information corresponding to each aggregation cluster to obtain an optimized application. Therefore, clustering the terminal devices based on their access trajectories to the application pages in one visit determines all terminal devices with the same access trajectory, which provides an effective basis for optimizing the application pages, improves the efficiency of application optimization, and improves the user experience.

Referring now to FIG. 6, a schematic structural diagram of an electronic device 600 suitable for implementing embodiments of the present disclosure is shown.

As shown in FIG. 6, the electronic device 600 may include a processing device (e.g., central processing unit, graphics processor, etc.) 601, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the electronic device 600 are also stored. The processing device 601, ROM 602 and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

Generally, the following devices can be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, etc.; output devices 607 including, for example, a liquid crystal display (LCD), speakers, vibrators, etc.; a storage device 608 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 609. Communication device 609 may allow electronic device 600 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 6 illustrates electronic device 600 with various means, it should be understood that implementation or availability of all illustrated means is not required. More or fewer means may alternatively be implemented or provided. Each block shown in FIG. 6 may represent one device, or may represent multiple devices as needed.

In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product including a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart. In such embodiments, the computer program may be downloaded and installed from the network via communication device 609, or from storage device 608, or from ROM 602. When the computer program is executed by the processing device 601, the above-described functions defined in the method of the embodiment of the present disclosure are performed.

It should be noted that the computer-readable medium in the embodiments of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard drive, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In embodiments of the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device. In embodiments of the present disclosure, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium can be transmitted using any appropriate medium, including but not limited to: wires, optical cables, RF (Radio Frequency), etc., or any suitable combination of the above.

The above-mentioned computer-readable medium may be included in the above-mentioned server; it may also exist separately without being assembled into the server. The above-mentioned computer-readable medium carries one or more programs. When the one or more programs are executed by the server, the server: collects, through the buried point data of the application, the access logs of different terminal devices accessing the pages of the application; based on the access logs, obtains the access trajectory of each terminal device to the pages of the application in at least one visit in at least one preset time period; performs aggregation statistics on the terminal devices with the same access trajectory to obtain aggregation clusters corresponding to the access trajectory and terminal device information; and, based on the access trajectory and terminal device information corresponding to each aggregation cluster, optimizes the pages of the application to obtain an optimized application.

Computer program code for performing operations of embodiments of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages, such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as "C" or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In situations involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products in accordance with various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagram may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown one after another may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending on the functionality involved. It will also be noted that each block of the block diagram and/or flowchart illustration, and combinations of blocks in the block diagram and/or flowchart illustration, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or can be implemented using a combination of special-purpose hardware and computer instructions.

The units involved in the embodiments of the present disclosure may be implemented in software or hardware. The described units can also be provided in a processor. For example, it can be described as: a processor including a collection unit, an acquisition unit, an aggregation unit, and an optimization unit. The names of these units do not, under certain circumstances, constitute a limitation on the units themselves. For example, the collection unit can also be described as a unit "configured to collect the access logs of the pages of the application accessed by different terminal devices through the embedded data of the application".

The above description is only a description of the preferred embodiments of the present disclosure and of the technical principles applied. Persons skilled in the art should understand that the scope of the invention involved in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the embodiments of the present disclosure.

Claims

An access data processing method, wherein the method includes:

Collect the access logs of pages of the application accessed by different terminal devices through the embedded data of the application;

Based on the access log, obtain the access track of each terminal device to the page of the application in at least one visit in at least one preset time period;

Perform aggregation statistics on terminal devices with the same access trajectories to obtain aggregation clusters corresponding to the access trajectories and terminal device information;

Based on the access trajectories and terminal device information corresponding to each aggregation cluster, the page of the application is optimized to obtain an optimized application.
The method of claim 1, further comprising:

Receive a query track, the query track including at least one page of the application;

Match the query trajectories with the access trajectories corresponding to each aggregation cluster;

In response to determining that the query trace matches the access trace corresponding to the aggregation cluster, terminal device information corresponding to the aggregation cluster is obtained and displayed.
The method of claim 1, further comprising:

A Sankey diagram is used to display the terminal device information and access trajectories corresponding to each aggregation cluster.
The method of claim 1, further comprising:

Label all pages of the application;

In response to receiving the tag of the page, statistics are performed on the access track where the page is located, and the access track statistics result is obtained.
The method of claim 1, further comprising:

Mark the landing page for the first page of the access track corresponding to each aggregation cluster;

In response to receiving a query condition that uses a page as the landing page, obtain and display all access tracks in which this page serves as the landing page.
The method of claim 1, further comprising:

Mark the exit page for the last page of the access track corresponding to each aggregation cluster;

In response to receiving a query condition that uses a page as an exit page, obtain and display all access tracks in which this page serves as the exit page.
The method according to any one of claims 1 to 6, wherein the pages of the application include at least one landing page, the access track of each aggregation cluster is provided with a bounce node adjacent to the last page of the access track, and optimizing the page of the application based on the access trajectory and terminal device information corresponding to each aggregation cluster to obtain an optimized application includes:

For at least one page as a landing page, when the next node of the landing page in the access trajectory is a bounce node, calculate the ratio of the number of terminal devices of the bounce node to the number of terminal devices of the landing page at the first node of the access trajectory;

In response to the ratio being greater than the average exit rate of all landing pages, the landing page is optimized to obtain an optimized application.
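The ratio test above can be sketched as follows. The `BOUNCE` marker, the function names, and the reading of "average exit rate of all landing pages" as the unweighted mean of the per-page ratios are assumptions, not details stated in the claim.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

BOUNCE = "<bounce>"  # hypothetical marker for the bounce node
Clusters = Dict[Tuple[str, ...], List[str]]


def landing_bounce_rates(clusters: Clusters) -> Dict[str, float]:
    """Per landing page: devices that bounced right after it / devices that landed on it."""
    landed = defaultdict(int)
    bounced = defaultdict(int)
    for trajectory, devices in clusters.items():
        if not trajectory or not devices:
            continue
        landing = trajectory[0]
        landed[landing] += len(devices)
        if len(trajectory) > 1 and trajectory[1] == BOUNCE:
            bounced[landing] += len(devices)
    return {page: bounced[page] / landed[page] for page in landed}


def pages_to_optimize(clusters: Clusters) -> List[str]:
    """Landing pages whose ratio exceeds the (assumed unweighted) average ratio."""
    rates = landing_bounce_rates(clusters)
    if not rates:
        return []
    average = sum(rates.values()) / len(rates)
    return sorted(page for page, rate in rates.items() if rate > average)


clusters: Clusters = {
    ("home", BOUNCE): ["d1", "d2", "d3"],
    ("home", "detail", BOUNCE): ["d4"],
    ("promo", BOUNCE): ["d5"],
    ("promo", "detail", BOUNCE): ["d6", "d7", "d8"],
}
rates = landing_bounce_rates(clusters)
candidates = pages_to_optimize(clusters)
```

In this sample, three of the four devices landing on `home` bounce immediately, against one of four for `promo`, so only `home` exceeds the average.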
The method according to any one of claims 1 to 6, wherein the page of the application is optimized based on the access trajectories and terminal device information corresponding to each aggregation cluster to obtain an optimized application, including:

Query the preset access trajectories from the access trajectories corresponding to all aggregated clusters;

Calculate the conversion rate of each page in the preset access trajectory based on the terminal device information of each page in the preset access trajectory;

In response to the conversion rate of a page in the preset access track being less than the conversion rate of pages other than the page in the preset access track, the page is optimized to obtain an optimized application.
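A minimal sketch of the conversion-rate comparison is given below. It assumes that the conversion rate of a page is the number of devices reaching the next page in the preset trajectory divided by the number of devices reaching the page itself; the claim does not define the rate, and the function names and sample counts are hypothetical.

```python
from typing import Dict, Optional, Sequence


def step_conversion_rates(
    pages: Sequence[str], device_counts: Sequence[int]
) -> Dict[str, float]:
    """Conversion rate of each page except the last one in the preset trajectory.

    device_counts[i] is the number of terminal devices that reached pages[i].
    """
    rates: Dict[str, float] = {}
    for i in range(len(pages) - 1):
        current, following = device_counts[i], device_counts[i + 1]
        rates[pages[i]] = following / current if current else 0.0
    return rates


def weakest_page(rates: Dict[str, float]) -> Optional[str]:
    """Page with the lowest conversion rate, or None if there are no rates."""
    if not rates:
        return None
    return min(rates, key=rates.get)


pages = ["home", "detail", "cart", "pay"]
device_counts = [100, 60, 15, 12]  # illustrative numbers only
rates = step_conversion_rates(pages, device_counts)
page_to_optimize = weakest_page(rates)
```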
The method according to any one of claims 1 to 6, wherein the access trajectory of each aggregation cluster is provided with a bounce node adjacent to the last page of the access trajectory, and optimizing the page of the application based on the access trajectory and terminal device information corresponding to each aggregation cluster to obtain the optimized application includes:

Traverse the access trajectories corresponding to all aggregation clusters, and calculate, across all access trajectories, the number of terminal devices at the page preceding the bounce node;

In response to the number of terminal devices at the page preceding the bounce node being greater than a preset threshold, the page preceding the bounce node is optimized to obtain an optimized application.
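The threshold check can be sketched as a count of devices per page that immediately precedes the bounce node. The `BOUNCE` marker, the function names, and the sample threshold are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

BOUNCE = "<bounce>"  # hypothetical marker for the bounce node
Clusters = Dict[Tuple[str, ...], List[str]]


def exit_counts(clusters: Clusters) -> Counter:
    """Number of devices per page that immediately precedes the bounce node."""
    counts: Counter = Counter()
    for trajectory, devices in clusters.items():
        if len(trajectory) >= 2 and trajectory[-1] == BOUNCE:
            counts[trajectory[-2]] += len(devices)
    return counts


def pages_over_threshold(clusters: Clusters, threshold: int) -> List[str]:
    """Pages whose pre-bounce device count is greater than the preset threshold."""
    counts = exit_counts(clusters)
    return sorted(page for page, n in counts.items() if n > threshold)


clusters: Clusters = {
    ("home", "detail", BOUNCE): ["d1", "d2", "d3"],
    ("home", "search", "detail", BOUNCE): ["d4", "d5"],
    ("home", BOUNCE): ["d6"],
}
counts = exit_counts(clusters)
flagged = pages_over_threshold(clusters, threshold=3)
```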
An access data processing device, wherein the device includes:

The collection unit is configured to collect, by means of buried point data of the application, the access logs of different terminal devices accessing the pages of the application;

The acquisition unit is configured to acquire, based on the access log, the access track of each terminal device to the page of the application in at least one visit in at least one preset time period;

The aggregation unit is configured to perform aggregation statistics on terminal devices with the same access trajectory, and obtain an aggregation cluster corresponding to the access trajectory and terminal device information;

The optimization unit is configured to optimize the page of the application based on the access trajectories and terminal device information corresponding to each aggregation cluster to obtain an optimized application.
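The acquisition and aggregation units can be sketched together as below. The log record layout `(device_id, timestamp, page)`, the function names, and the simplification to a single visit per device are assumptions; the claims allow several visits across several preset time periods.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

LogRecord = Tuple[str, int, str]  # hypothetical layout: (device_id, timestamp, page)


def build_trajectories(logs: Iterable[LogRecord]) -> Dict[str, Tuple[str, ...]]:
    """Order each device's page views by timestamp to obtain its access trajectory."""
    per_device: Dict[str, List[str]] = defaultdict(list)
    for device_id, _timestamp, page in sorted(logs, key=lambda r: (r[0], r[1])):
        per_device[device_id].append(page)
    return {device: tuple(pages) for device, pages in per_device.items()}


def aggregate(trajectories: Dict[str, Tuple[str, ...]]) -> Dict[Tuple[str, ...], List[str]]:
    """Group devices that share an identical trajectory into one aggregation cluster."""
    clusters: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for device_id, trajectory in trajectories.items():
        clusters[trajectory].append(device_id)
    return dict(clusters)


logs = [
    ("dev-1", 1, "home"), ("dev-2", 1, "home"), ("dev-1", 2, "detail"),
    ("dev-2", 3, "detail"), ("dev-3", 1, "home"), ("dev-3", 2, "cart"),
]
clusters = aggregate(build_trajectories(logs))
```

Keying clusters by the trajectory tuple keeps later lookups, such as trajectory queries or bounce statistics, to a single dictionary access.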
An electronic device including:

one or more processors;

A storage device on which one or more programs are stored;

Wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method as described in any one of claims 1-9.
A computer-readable medium with a computer program stored thereon, wherein when the program is executed by a processor, the method according to any one of claims 1-9 is implemented.
A computer program product comprising a computer program, wherein the computer program implements the method of any one of claims 1-9 when executed by a processor.