WO2021147220A1 - Page access duration collection method, apparatus, medium, and electronic device

Info

Publication number
WO2021147220A1
Authority
WO
WIPO (PCT)
Prior art keywords
event
point
time
target
page
Prior art date
Application number
PCT/CN2020/093584
Other languages
English (en)
French (fr)
Inventor
江彬
胡娟
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021147220A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Definitions

  • This application relates to the field of electronic information technology, and in particular to a method, device, medium, and electronic equipment for collecting page access time.
  • Page access time collection is the process of collecting the length of time the user stays on the Internet page during the visit.
  • The inventor realized that, when smart devices such as mobile phones and tablets are used to access Internet pages, existing duration collection schemes generally set an artificial maximum access duration to avoid overloading the server. This makes the duration statistics inaccurate, and the reported data falls well short of the actual access duration.
  • The purpose of this application is to provide a scheme for collecting page access duration that solves the problem of low accuracy and reliability in page access duration collection.
  • A method for collecting page access duration, including: obtaining event tracking point ("buried point") configuration information corresponding to the page code of a target page, and a first historical access record of a target visitor to the target page; obtaining a second historical access record of the target visitor to a page associated with the target page; extracting an event tracking point configuration feature of the target page from the event tracking point configuration information; extracting an event tracking point trigger feature for a predetermined time period from the first historical access record and the second historical access record; inputting the event tracking point configuration feature and the event tracking point trigger feature into a duration collection point prediction model to obtain the visitor's duration collection tracking point information, which includes the predicted event tracking points that will be triggered when the visitor visits the target page; and associating duration collection code with a target event tracking point in the page code, so that when the visitor visits the target page, in response to the target event tracking point being triggered, the duration collection code associated with that point reports duration data, the target event tracking point being an event tracking point in the visitor's duration collection tracking point information.
  • A page access duration collection device, which includes: a first acquisition module for obtaining event tracking point configuration information corresponding to the page code of a target page, and a first historical visit record of a target visitor to the target page; a second acquisition module for obtaining a second historical visit record of the target visitor to a page associated with the target page; a first extraction module for extracting an event tracking point configuration feature of the target page from the event tracking point configuration information; a second extraction module for extracting an event tracking point trigger feature for a predetermined time period from the first historical visit record and the second historical visit record; an analysis module for inputting the event tracking point configuration feature and the event tracking point trigger feature into a duration collection point prediction model to obtain the visitor's duration collection tracking point information, which includes the predicted event tracking points that will be triggered when the visitor visits the target page; and a collection module for associating duration collection code with a target event tracking point in the page code, so that when the visitor visits the target page, in response to the target event tracking point being triggered, the associated duration collection code reports duration data.
  • a computer-readable storage medium having program instructions stored thereon, wherein the program instructions implement the method described in any one of the above when the program instructions are executed by a processor.
  • An electronic device, which includes: a processor; and a memory configured to store program instructions of the processor; wherein the processor is configured, by executing the program instructions, to: obtain event tracking point configuration information corresponding to the page code of a target page, and a first historical visit record of a target visitor to the target page; obtain a second historical visit record of the target visitor to a page associated with the target page; extract an event tracking point configuration feature of the target page from the event tracking point configuration information; extract an event tracking point trigger feature for a predetermined time period from the first historical visit record and the second historical visit record; input the event tracking point configuration feature and the event tracking point trigger feature into a duration collection point prediction model to obtain the visitor's duration collection tracking point information, which includes the predicted event tracking points that will be triggered when the visitor visits the target page; and associate duration collection code with a target event tracking point in the page code, so that when the visitor visits the target page, in response to the target event tracking point being triggered, the associated duration collection code reports duration data.
  • This application enables segmented reporting of the access duration, ensuring that duration data is collected in a variety of situations and that access duration data is not lost. It also avoids the excessive code coupling that would result from adding duration collection code to every tracking point, which keeps duration data collection reliable.
  • Fig. 1 schematically shows a flow chart of a method for collecting page access duration.
  • Fig. 2 schematically shows an example diagram of an application scenario of a method for collecting page access duration.
  • Fig. 3 schematically shows a flow chart of a method for triggering feature acquisition.
  • Fig. 4 schematically shows a block diagram of a device for collecting page access duration.
  • Fig. 5 schematically shows an example block diagram of an electronic device for implementing the above-mentioned method for collecting page access duration.
  • Fig. 6 schematically shows a computer-readable storage medium for implementing the above-mentioned method for collecting page access duration.
  • A method for collecting page access duration is first provided.
  • The method can run on a server, a server cluster, a cloud server, or the like. Those skilled in the art can also run the method on other platforms as required; this exemplary embodiment places no particular limit on this.
  • the page access time collection method provided in the embodiments of this application is applicable to the field of big data.
  • A data warehouse is built from the collected page access durations, so that the access behavior of users or employees can be analyzed and used to support related decisions. The specifics can be determined by the actual application scenario and are not limited here.
  • the method for collecting the page access duration may include the following steps S110-S160.
  • Step S110: obtain the event tracking point configuration information corresponding to the page code of the target page, and the first historical visit record of the target visitor to the target page.
  • Step S120: obtain a second historical visit record of the target visitor to the page associated with the target page.
  • Step S130: extract the event tracking point configuration feature of the target page from the event tracking point configuration information.
  • Step S140: extract the event tracking point trigger feature for a predetermined time period from the first historical visit record and the second historical visit record.
  • Step S150: input the event tracking point configuration feature and the event tracking point trigger feature into the duration collection point prediction model to obtain the visitor's duration collection tracking point information, which includes the predicted event tracking points that will be triggered when the visitor visits the target page.
  • Step S160: associate duration collection code with the target event tracking point in the page code, so that when the visitor visits the target page, in response to the target event tracking point being triggered, the duration collection code associated with that point reports duration data. The target event tracking point is an event tracking point in the visitor's duration collection tracking point information.
  • In this method, the event tracking point configuration feature of the visited page and the event tracking point trigger feature of the visit records are input into the duration collection point prediction model, which predicts duration collection tracking point information reflecting the visitor's visit habits. Duration collection code is then added, on a per-visitor basis, to the event tracking points in the page code that correspond to this information, and that code reports the duration data. This enables segmented reporting of the access duration, ensures that duration data is collected in a variety of situations, and prevents the loss of access duration data. It also avoids the excessive code coupling that would result from adding duration collection code to every tracking point, which keeps duration data collection reliable.
  • step S110 the event buried point configuration information corresponding to the page code of the target page and the first historical visit record of the target visitor to the target page are obtained.
  • The server 201 obtains the event tracking point configuration information corresponding to the page code of the target page on the user terminal 202, and the first historical visit record to the target page of the target visitor (for example, the user terminal 202 itself, or the visitor corresponding to a certain user account). In the subsequent steps, the server 201 processes the configuration information and the first historical visit record so that the access duration of the target page can be collected accurately.
  • the server 201 can be any device with processing capability, such as a computer, a microprocessor, etc., which is not specifically limited here; the user terminal 202 can be any terminal with an Internet page access function.
  • Event tracking point configuration information is the data, pre-configured in the page code of the target page, that governs the reporting tracking points for click, swipe, and other events, for example, the association between each configured event tracking point and its triggering event, and the position of the tracking point.
  • the target visitor may be a user who visits the target page (for example, a visitor corresponding to a certain user account) or a terminal that visits the target page (for example, the terminal where a certain visited page is located).
  • The first historical visit record is the long-term operation data generated when the target visitor visited the target page in the past, such as the number of times and the times at which each event was triggered. Historical visit records reflect the visitor's visit habits. For example: users continue to churn as the page level deepens, and the churn rate is highest on the first few pages; button clicks are affected by the page level; some static images still invite clicks, such as treasure chests, gift boxes, and other everyday objects associated with surprises; and the average stay time on the first and last screens of a page is longer than on the middle screens.
  • step S120 a second historical visit record of the target visitor to the associated page of the target page is obtained.
  • The associated page of the target page may be another page reached by a jump after the target page is accessed, or a page that has a predetermined association or a regular access relationship with the target page (for example, a page that is very likely to be visited whenever the target page is visited).
  • The second historical visit record is the long-term operation data generated when the target visitor visited the associated page in the past, such as data on the patterns with which various events were triggered.
  • Obtaining the second historical visit record captures hidden characteristics of how the visitor accesses the target page, which further improves the accuracy of the analysis of the visitor's access patterns for the target page in the subsequent steps.
  • step S130 the event location configuration feature of the target page is extracted from the event location configuration information.
  • The event tracking point configuration feature can be a machine learning feature extracted from the event tracking point configuration information, such as a feature vector of that information. It can also be the configuration feature obtained after comparing the information with preset standard data and excluding abnormal data (configuration information that is useless or redundant for analyzing duration collection tracking points in the subsequent steps), for example, by excluding the few event tracking points that are triggered too frequently while the page is browsed.
  • Extracting the event tracking point configuration feature of the target page from the event tracking point configuration information includes: inputting the event tracking point configuration information into a preset configuration feature extraction model to obtain the event tracking point configuration feature of the target page.
  • The preset configuration feature extraction model is a machine learning model trained in advance on a large amount of event tracking point configuration information. Following its training strategy, it automatically analyzes the current event tracking point configuration information and outputs the event tracking point configuration feature of the target page.
  • The training method for the configuration feature extraction model includes: obtaining a sample set of event tracking point configuration information, where each sample includes event tracking point configuration information and a calibrated event tracking point configuration feature; inputting the data of each sample into the configuration feature extraction model to obtain the predicted event tracking point configuration feature output by the model; if, for any sample, the predicted feature is inconsistent with the feature calibrated in advance for that sample, adjusting the coefficients of the model until the two are consistent; and ending the training when, for every sample, the predicted feature is consistent with the feature calibrated in advance for that sample.
  • The sample set includes multiple samples, each consisting of event tracking point configuration information and a standard event tracking point configuration feature that experts have calibrated as useful for analyzing duration collection tracking point information. This helps ensure the accuracy of model training.
  • step S140 the event buried point trigger feature for a predetermined period of time is extracted from the first historical visit record and the second historical visit record.
  • the predetermined time period may be set according to requirements, for example, a periodic time period during which a certain event lasts or a full time period.
  • The first historical visit record and the second historical visit record contain the visit information for the target page, that is, the visitor's visit patterns (for example, the visitor's patterns during a specific event, or across the full record). These patterns are clearly reflected in the way the visitor triggers the event tracking points on the page.
  • The event tracking point trigger feature can be a machine learning feature extracted from the visit records, such as a feature vector of that information. It can also be the trigger feature obtained after comparing the records with preset standard data and excluding abnormal data (operation information that is useless or redundant for analyzing duration collection tracking points in the subsequent steps), for example, by removing the few event tracking points that are triggered too frequently while the page is browsed.
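  • The exclusion of overly frequent event tracking (buried) points can be sketched as follows. This is a minimal Python sketch under stated assumptions: trigger records are modeled as `(point_id, timestamp)` pairs, and "too frequent" is taken to mean a trigger count above a hypothetical `max_triggers` cutoff. Neither the record layout nor the cutoff comes from the application.

```python
from collections import Counter

def drop_overfrequent_points(triggers, max_triggers):
    """Keep only triggers whose tracking point fired at most max_triggers times.

    triggers: list of (point_id, timestamp) pairs (hypothetical layout).
    max_triggers: hypothetical cutoff above which a point is treated as noise.
    """
    counts = Counter(point_id for point_id, _ in triggers)
    return [(p, t) for p, t in triggers if counts[p] <= max_triggers]

# "scroll" fires three times and is dropped; "buy_button" fires once and is kept.
triggers = [("scroll", 1), ("scroll", 2), ("scroll", 3), ("buy_button", 4)]
filtered = drop_overfrequent_points(triggers, max_triggers=2)
```

  • A count-based cutoff is only one possible reading of "abnormal data"; a comparison against preset standard data, as the text describes, could use any other criterion.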
  • Extracting the event tracking point trigger feature for a predetermined time period from the first historical visit record and the second historical visit record includes: step S310, extracting the event tracking point trigger information for the predetermined time period from the first historical visit record and the second historical visit record; and step S320, extracting the event tracking point trigger feature from that trigger information.
  • The trigger information for the predetermined time period (which can be a specified period, or the period of a certain access event such as a marketing campaign) is extracted first, and the trigger feature is then extracted from that information.
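  • Step S310 amounts to selecting, from both visit records, the triggers that fall inside the predetermined time period. A minimal Python sketch, assuming each record is a list of dictionaries with hypothetical `point` and `time` keys and that the period is half-open (`start` inclusive, `end` exclusive):

```python
from datetime import datetime

def triggers_in_period(first_record, second_record, start, end):
    """Merge both visit records and keep triggers with start <= time < end."""
    merged = first_record + second_record
    return sorted(
        (r for r in merged if start <= r["time"] < end),
        key=lambda r: r["time"],
    )

# Hypothetical records for the target page and its associated page.
first = [
    {"point": "banner", "time": datetime(2020, 1, 5, 10, 0)},
    {"point": "banner", "time": datetime(2020, 2, 1, 9, 0)},
]
second = [{"point": "coupon", "time": datetime(2020, 1, 6, 12, 0)}]
selected = triggers_in_period(
    first, second, datetime(2020, 1, 1), datetime(2020, 2, 1)
)
```

  • The selected triggers would then be passed to the trigger feature extraction of step S320.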
  • Extracting the event tracking point trigger feature from the event tracking point trigger information for the predetermined time period includes: inputting that trigger information into a preset trigger feature extraction model to obtain the event tracking point trigger feature.
  • The preset trigger feature extraction model is a machine learning model trained in advance on a large amount of event tracking point trigger information. Following its training strategy, it automatically analyzes the current event tracking point trigger information and outputs the event tracking point trigger feature for the target page.
  • The training method for the trigger feature extraction model includes: obtaining a sample set of event tracking point trigger information, where each sample includes event tracking point trigger information and a calibrated event tracking point trigger feature; inputting the data of each sample into the trigger feature extraction model to obtain the predicted event tracking point trigger feature output by the model; if, for any sample, the predicted feature is inconsistent with the feature calibrated in advance for that sample, adjusting the coefficients of the model until the two are consistent; and ending the training when, for every sample, the predicted feature is consistent with the feature calibrated in advance for that sample.
  • The event tracking point trigger information sample set includes multiple samples, each consisting of event tracking point trigger information and a standard event tracking point trigger feature that experts have calibrated as useful for analyzing duration collection tracking point information. This helps ensure the accuracy of model training.
  • The event tracking point configuration information sample set, the event tracking point trigger information sample set, the first historical access record, and the second historical access record described above may be stored in a database in advance, or stored in a blockchain, which allows the data to be shared between different platforms and protects it from tampering.
  • When these data are obtained from the blockchain, this can be implemented by invoking smart contracts, which is not elaborated here.
  • Blockchain is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of that information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • Determining the predetermined time period includes: obtaining a target time period for page access duration collection on the target page, the target time period indicating the duration of a specific page access event; obtaining the event features of the specific page access event during that period; determining, from the first historical visit record and the second historical visit record, the duration of each historical event whose historical event features have a similarity to those event features higher than a predetermined threshold; and determining the duration of the historical event as the predetermined time period.
  • The target time period indicates the duration of a specific page access event, for example, the period of a future marketing campaign during which durations need to be monitored and collected.
  • In this way, duration collection tracking points can be analyzed in a way that is tailored to a specific time period.
  • Obtaining the event features of the specific page access event during its duration yields the features of the access event in the target time period, for example, the rules of the campaign or its name.
  • Determining, from the first historical visit record and the second historical visit record, the duration of each historical event whose features exceed the similarity threshold locates the durations of the historical events whose activity rules are similar to those of the event in the target time period.
  • the predetermined threshold is set according to demand.
  • the duration of the historical event may be determined as the predetermined time period.
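  • The threshold-based selection of historical event periods can be sketched as follows. This is a minimal Python sketch: the application does not say how similarity is measured, so cosine similarity over numeric feature vectors is an assumption, as are the `features`, `start`, and `end` keys and the sample values.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def similar_event_periods(target_features, historical_events, threshold):
    """Return the (start, end) periods of historical events similar to the target."""
    return [
        (e["start"], e["end"])
        for e in historical_events
        if cosine_similarity(target_features, e["features"]) > threshold
    ]

# Hypothetical historical events with made-up feature vectors.
history = [
    {"features": [1.0, 0.0, 1.0], "start": "2019-11-01", "end": "2019-11-11"},
    {"features": [0.0, 1.0, 0.0], "start": "2019-12-01", "end": "2019-12-05"},
]
periods = similar_event_periods([1.0, 0.0, 0.9], history, threshold=0.8)
```

  • Each returned period is a candidate for the predetermined time period from which trigger information is then extracted.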
  • In step S150, the event tracking point configuration feature and the event tracking point trigger feature are input into the duration collection point prediction model to obtain the visitor's duration collection tracking point information, which includes the predicted event tracking points that will be triggered when the visitor visits the target page.
  • The duration collection point prediction model is a machine learning model trained in advance on a large number of event tracking point configuration features and trigger features. Following its training strategy, it automatically analyzes the current configuration feature and trigger feature and outputs the visitor's duration collection tracking point information. The corresponding duration collection tracking points can then be set according to that information.
  • The training method for the duration collection point prediction model includes: obtaining a feature sample set, where each sample includes an event tracking point configuration feature, an event tracking point trigger feature, and calibrated duration collection tracking point information; inputting the data of each sample into the prediction model to obtain the predicted duration collection tracking point information output by the model; if, for any sample, the predicted information is inconsistent with the information calibrated in advance for that sample, adjusting the coefficients of the model until the two are consistent; and ending the training when, for every sample, the predicted information is consistent with the information calibrated in advance for that sample.
  • the feature sample set includes multiple samples, and each sample includes event buried point configuration feature, event buried point trigger feature, and time length collection buried point information calibrated by experts. This can ensure the accuracy of model training.
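  • The "adjust the coefficients until every prediction matches its calibrated label" procedure can be illustrated with a perceptron. This is a sketch only: the application does not name a model type, so the perceptron, the binary label (1 meaning the tracking point should carry duration collection code), the two-element feature vectors, and the `max_epochs` safeguard are all assumptions.

```python
def predict(weights, bias, x):
    """Return 1 if the linear score is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_until_consistent(samples, max_epochs=100, lr=1.0):
    """Adjust weights until every sample is predicted as calibrated.

    samples: list of (feature_vector, label) pairs with label in {0, 1}.
    Returns (weights, bias, converged); converged is False if max_epochs
    passes over the data still leave an inconsistent sample.
    """
    weights, bias = [0.0] * len(samples[0][0]), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in samples:
            pred = predict(weights, bias, x)
            if pred != label:
                # Inconsistent with the calibrated label: adjust coefficients.
                mistakes += 1
                step = lr * (label - pred)
                weights = [w + step * xi for w, xi in zip(weights, x)]
                bias += step
        if mistakes == 0:
            return weights, bias, True   # all samples consistent: training ends
    return weights, bias, False

# Hypothetical calibrated samples: the label follows the first feature.
samples = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 1.0], 1), ([0.0, 0.0], 0)]
weights, bias, converged = train_until_consistent(samples)
```

  • A perceptron only reaches full consistency on linearly separable data, which is why the sketch caps the number of passes; a production model would more likely stop on a validation metric than on exact agreement with every sample.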
  • In step S160, duration collection code is associated with the target event tracking point in the page code, so that when the visitor visits the target page, in response to the target event tracking point being triggered, the duration collection code associated with that point reports duration data. The target event tracking point is an event tracking point in the visitor's duration collection tracking point information.
  • Because duration collection code is added only to the event tracking points named in the duration collection tracking point information, the duration is reported in segments as the visitor triggers those points, and the segmentation follows the visitor's access habits. Tying the duration collection code to the visitor's access habits keeps duration collection accurate while avoiding excessive code coupling as far as possible.
  • Reporting duration data through the duration collection code associated with the target event tracking point, in response to that point being triggered, includes: when the target event tracking point is the one associated with the first duration collection code in the page code, the first duration collection code obtains and reports the difference between the trigger time of the target event tracking point and the time at which the user entered the target page; when the target event tracking point is not the one associated with the first duration collection code, the associated duration collection code obtains and reports the difference between the trigger time of the target event tracking point and the reporting time of the preceding duration collection code.
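  • The two reporting cases above reduce to a running difference over trigger times. A minimal Python sketch, assuming each report is sent at the moment its tracking point is triggered (so the previous report time equals the previous trigger time) and that times are plain numbers in seconds; the function name is hypothetical.

```python
def segment_durations(page_enter_time, trigger_times):
    """Return the duration each collection code would report, in trigger order.

    The first segment is measured from page entry; every later segment is
    measured from the previous report time.
    """
    durations = []
    previous = page_enter_time
    for t in trigger_times:
        durations.append(t - previous)
        previous = t
    return durations

# Page entered at t=100; tracking points triggered at t=112, 130, and 175.
segments = segment_durations(page_enter_time=100, trigger_times=[112, 130, 175])
```

  • The segments sum to the time between page entry and the last trigger, so no interval is counted twice or skipped.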
  • There are two main existing schemes for duration data collection: one reports the duration of the previous page when the user enters the next page, and the other uses heartbeat reporting.
  • The first scheme loses data when the report is not sent in time and does not handle abnormal exits. The second scheme places too much load on the server, so it does not suit high-traffic scenarios, and heartbeats cannot be kept up for the entire page visit. Generally, an artificial maximum access duration is set to avoid overloading the server, but this makes the duration statistics inaccurate, and the reported data falls well short of the actual situation.
  • When T+1 data analysis is used, page visits that span multiple days are hard to handle. Because the end point of the visit is not known, the duration accrued on the first day may be counted in the data of the second, third, or fourth day, and when the end point of the page visit is never captured, the duration data is lost. If an end point is set artificially (the earlier approach), a cross-day visit produces two end records for the same page, the artificial end point and the actual end point, which causes duplicate statistics.
  • To address this, the present scheme reports the duration in segments, in combination with event tracking points.
  • Real-time accumulation of active time: events such as mouse, keyboard, and touch input are monitored to accumulate the user's active time in real time. While the page is not visible but has not been exited (for example, the mobile page has been moved to the background, or on a PC the user is browsing another web page without closing the browser), the elapsed time is not added to the accumulated duration. Filtering out this invalid time brings the data closer to the real access duration.
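  • Excluding the time during which the page is not visible can be sketched with a small counter. This Python sketch models only visibility changes, not the mouse, keyboard, and touch monitoring the text also mentions; the class and method names are hypothetical, and times are plain numbers in seconds.

```python
class ActiveTimeCounter:
    """Accumulates time only while the page is visible (hypothetical helper)."""

    def __init__(self):
        self.total = 0.0
        self._visible_since = None  # None while the page is hidden

    def on_visible(self, now):
        if self._visible_since is None:
            self._visible_since = now

    def on_hidden(self, now):
        if self._visible_since is not None:
            self.total += now - self._visible_since
            self._visible_since = None

    def read(self, now):
        """Active time so far, including the currently visible stretch."""
        running = now - self._visible_since if self._visible_since is not None else 0.0
        return self.total + running

counter = ActiveTimeCounter()
counter.on_visible(0)    # page shown
counter.on_hidden(30)    # page moved to the background
counter.on_visible(90)   # page back in the foreground
active = counter.read(100)
```

  • In a browser, the visibility transitions would typically come from page visibility events; here they are called by hand to keep the sketch self-contained.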
  • Segmented duration reporting: this scheme uses the tracking points of the page as trigger points to split the access duration into multiple segments. When the user triggers an event tracking point or another event that reports to the server, the access duration accumulated over the preceding period is reported to the server together with the event tracking data; the duration counter is then cleared, and the next segment accumulates from zero. After the user enters the next page, the accumulated duration of the last segment of the previous page is reported.
  • A visit to a single page is thus divided into multiple segments for reporting, and each segment acts as an end point. This reduces the error caused by the page duration not being reported in time, without placing excessive load on the server.
  • An H5 page already reports a piece of data immediately each time an event is triggered. This scheme places the duration report inside the data structure of the event tracking report, so the server still processes a single piece of data and takes on little additional load.
  • Segmented reporting can ensure the successful collection of valid data to the greatest extent. Even if the access ends abnormally, the access time before the abnormality will be retained, and the entire data will not be under-reported or lost, which improves the accuracy of data collection. sex.
  • This scheme adopts the scheme of segmented statistics of page visit time, each short period of time can be regarded as an end point, decouples the dependence of visit time on the end point of the page, and solves the problem that the calculation of T+1 big data in related technologies cannot be timely on the same day Counting the cross-day visit time of the same page, or when there are artificial ending points and actual ending points at the same time, causing double counting problems.
  • the duration of each day is accumulated, and the duration statistics will not be cross-day and will not be double-calculated.
  • the device for collecting page access duration includes: a first obtaining module 410 for obtaining event buried point configuration information corresponding to the page code of the target page and the first historical visits of the target visitor to the target page Records; the second acquisition module 420 is used to acquire the second historical visit record of the target visitor to the associated page of the target page; the first extraction module 430 is used to extract the target page from the incident configuration information The second extraction module 440 is used to extract the event buried point trigger feature for a predetermined period of time from the first historical visit record and the second historical visit record; the analysis module 450 is used to combine the The event buried point configuration feature and the event buried point trigger feature input time length collection point prediction model to obtain the visitor’s time length collection point information, and the time length collection point information includes the predicted visitor’s visit to the The event embedding point that will be triggered when the target page is reached; the collection module 460 is used to embed the point associated duration collection code for the target event in the page code
  • although several modules or units of the device for performing actions are mentioned in the detailed description above, this division is not mandatory.
  • the features and functions of two or more modules or units described above may be embodied in one module or unit.
  • the features and functions of a module or unit described above can be further divided into multiple modules or units to be embodied.
  • the example embodiments described here can be implemented in software, or in software combined with the necessary hardware. The technical solution according to the embodiments of the present application can therefore be embodied as a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or portable hard disk) or on a network, and which includes several instructions that cause a computing device (such as a personal computer, a server, a mobile terminal, or a network device) to execute the method according to the embodiments of the present application.
  • an electronic device capable of implementing the above method is also provided.
  • the electronic device 500 according to this embodiment of the present application will be described below with reference to FIG. 5.
  • the electronic device 500 shown in FIG. 5 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present application.
  • the electronic device 500 is represented in the form of a general-purpose computing device.
  • the components of the electronic device 500 may include, but are not limited to: the aforementioned at least one processing unit 510, the aforementioned at least one storage unit 520, and a bus 530 connecting different system components (including the storage unit 520 and the processing unit 510).
  • the storage unit stores program code that can be executed by the processing unit 510, so that the processing unit 510 executes the steps of the various exemplary embodiments described in the "Exemplary Method" section of this specification.
  • the processing unit 510 may perform the steps shown in FIG. 1.
  • the storage unit 520 may include a readable storage medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 5201 and/or a cache storage unit 5202, and may further include a read-only storage unit (ROM) 5203.
  • the storage unit 520 may also include a program/utility tool 5204 having a set of (at least one) program module 5205.
  • program module 5205 includes but is not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.
  • the bus 530 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
  • the electronic device 500 can also communicate with one or more external devices 700 (such as keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable a user to interact with the electronic device 500, and/or with any device (such as a router or modem) that enables the electronic device 500 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 550, and the device may include a display unit 540 connected to the input/output (I/O) interface 550.
  • the electronic device 500 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through the network adapter 560.
  • the network adapter 560 communicates with other modules of the electronic device 500 through the bus 530.
  • other hardware and/or software modules can be used in conjunction with the electronic device 500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
  • the example embodiments described here can be implemented in software, or in software combined with the necessary hardware. The technical solution according to the embodiments of the present application can therefore be embodied as a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or portable hard disk) or on a network, and which includes several instructions that cause a computing device (such as a personal computer, a server, a terminal device, or a network device) to execute the method according to the embodiments of the present application.
  • a computer-readable storage medium is also provided, on which a program product capable of implementing the above method of this specification is stored.
  • the computer-readable storage medium may be non-volatile or volatile.
  • each aspect of the present application can also be implemented in the form of a program product, which includes program code.
  • when the program product runs on a terminal device, the program code causes the terminal device to execute the steps of the various exemplary embodiments of the present application described in the "Exemplary Method" section of this specification.
  • a program product 600 for implementing the above method according to an embodiment of the present application is described. It may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer.
  • the program product of this application is not limited to this.
  • the readable storage medium can be any tangible medium that contains or stores a program, and the program can be used by or in combination with an instruction execution system, apparatus, or device.
  • the program product may adopt any combination of one or more readable storage media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • the computer-readable signal medium may include a data signal propagated in baseband or as a part of a carrier wave, and readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the readable signal medium may also be any readable medium other than a readable storage medium, and that readable medium may send, propagate, or transmit a program for use by or in combination with the instruction execution system, apparatus, or device.
  • the program code contained on the readable storage medium can be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the above.
  • the program code used to perform the operations of the present application can be written in any combination of one or more programming languages.
  • the programming languages include object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the client computing device, partly on the client device, as an independent software package, partly on the client computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • the remote computing device can be connected to the client computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (for example, via the Internet using an Internet service provider).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

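A method, apparatus, medium, and electronic device for collecting page visit duration, belonging to the field of electronic information technology. The method includes: obtaining event tracking point configuration information corresponding to the page code of a target page and a first historical visit record of the target page (S110); obtaining a second historical visit record of pages associated with the target page (S120); extracting event tracking point configuration features from the event tracking point configuration information (S130); extracting event tracking point trigger features for a predetermined time period from the first historical visit record and the second historical visit record (S140); inputting the event tracking point configuration features and the event tracking point trigger features into a duration collection point prediction model to obtain the visitor's duration collection point information (S150); and associating duration collection code with the event tracking points in the page code that correspond to the duration collection point information, so that the duration collection code associated with the target event tracking point reports duration data (S160). The method can improve the accuracy and reliability of page visit duration collection.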

Description

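Method, apparatus, medium, and electronic device for collecting page visit duration
This application claims priority to Chinese patent application No. 2020100669406, entitled "Method, apparatus, medium, and electronic device for collecting page visit duration", filed with the China Patent Office on January 20, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of electronic information technology, and in particular to a method, apparatus, medium, and electronic device for collecting page visit duration.
Background
Page visit duration collection is the process of collecting how long a user stays on an Internet page during a visit. The inventors realized that, when Internet pages are accessed from smart devices such as mobile phones and tablets, existing duration collection schemes generally set a maximum visit duration manually to avoid overloading the server, but this makes the duration statistics inaccurate, and the reported data falls well short of the actual figures.
It should be noted that the information disclosed in the Background section above is intended only to enhance understanding of the background of this application, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Technical Problem
The purpose of this application is to provide a page visit duration collection scheme that addresses the low accuracy and reliability of page visit duration collection.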
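Technical Solution
According to one aspect of this application, a page visit duration collection method is provided, including: obtaining event tracking point configuration information corresponding to the page code of a target page and a first historical visit record of a target visitor for the target page; obtaining a second historical visit record of the target visitor for pages associated with the target page; extracting event tracking point configuration features of the target page from the event tracking point configuration information; extracting event tracking point trigger features for a predetermined time period from the first historical visit record and the second historical visit record; inputting the event tracking point configuration features and the event tracking point trigger features into a duration collection point prediction model to obtain duration collection point information of the visitor, the duration collection point information including the event tracking points that the visitor is predicted to trigger when visiting the target page; and associating duration collection code with a target event tracking point in the page code, so that when the visitor visits the target page, the duration collection code associated with the target event tracking point reports duration data in response to the triggering of the target event tracking point, the target event tracking point corresponding to an event tracking point in the duration collection point information of the visitor.
According to one aspect of this application, a page visit duration collection apparatus is provided, including: a first obtaining module, configured to obtain event tracking point configuration information corresponding to the page code of a target page and a first historical visit record of a target visitor for the target page; a second obtaining module, configured to obtain a second historical visit record of the target visitor for pages associated with the target page; a first extraction module, configured to extract event tracking point configuration features of the target page from the event tracking point configuration information; a second extraction module, configured to extract event tracking point trigger features for a predetermined time period from the first historical visit record and the second historical visit record; an analysis module, configured to input the event tracking point configuration features and the event tracking point trigger features into a duration collection point prediction model to obtain duration collection point information of the visitor, the duration collection point information including the event tracking points that the visitor is predicted to trigger when visiting the target page; and a collection module, configured to associate duration collection code with a target event tracking point in the page code, so that when the visitor visits the target page, the duration collection code associated with the target event tracking point reports duration data in response to the triggering of the target event tracking point, the target event tracking point corresponding to an event tracking point in the duration collection point information of the visitor.
According to one aspect of this application, a computer-readable storage medium is provided, on which program instructions are stored, wherein the program instructions, when executed by a processor, implement any of the methods described above.
According to one aspect of this application, an electronic device is provided, including: a processor; and a memory for storing program instructions of the processor; wherein the processor is configured to perform, by executing the program instructions: obtaining event tracking point configuration information corresponding to the page code of a target page and a first historical visit record of a target visitor for the target page; obtaining a second historical visit record of the target visitor for pages associated with the target page; extracting event tracking point configuration features of the target page from the event tracking point configuration information; extracting event tracking point trigger features for a predetermined time period from the first historical visit record and the second historical visit record; inputting the event tracking point configuration features and the event tracking point trigger features into a duration collection point prediction model to obtain duration collection point information of the visitor, the duration collection point information including the event tracking points that the visitor is predicted to trigger when visiting the target page; and associating duration collection code with a target event tracking point in the page code, so that when the visitor visits the target page, the duration collection code associated with the target event tracking point reports duration data in response to the triggering of the target event tracking point, the target event tracking point corresponding to an event tracking point in the duration collection point information of the visitor.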
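Beneficial Effects
This application enables segmented reporting of visit duration, ensures that duration data is collected under all circumstances, and avoids loss of visit duration data. It also avoids the excessive code coupling that would arise if duration collection code were associated with every tracking point, which keeps duration data collection reliable.
Brief Description of the Drawings
The accompanying drawings are incorporated into and form part of this specification, illustrate embodiments consistent with this application, and together with the specification serve to explain its principles. Obviously, the drawings described below are only some embodiments of this application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 schematically shows a flowchart of a page visit duration collection method.
FIG. 2 schematically shows an example application scenario of a page visit duration collection method.
FIG. 3 schematically shows a flowchart of a method for obtaining trigger features.
FIG. 4 schematically shows a block diagram of a page visit duration collection apparatus.
FIG. 5 schematically shows an example block diagram of an electronic device for implementing the above page visit duration collection method.
FIG. 6 schematically shows a computer-readable storage medium for implementing the above page visit duration collection method.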
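Embodiments of the Invention
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments can, however, be implemented in many forms and should not be construed as limited to the examples set forth here; rather, these embodiments are provided so that this application will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a full understanding of the embodiments of this application. Those skilled in the art will recognize, however, that the technical solutions of this application may be practiced with one or more of the specific details omitted, or with other methods, components, apparatuses, steps, and so on. In other cases, well-known technical solutions are not shown or described in detail, to avoid obscuring aspects of this application.
In addition, the drawings are only schematic illustrations of this application and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, so repeated descriptions of them are omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different network and/or processor devices and/or microcontroller devices.
This example embodiment first provides a page visit duration collection method. The method may run on a server, a server cluster, a cloud server, or the like; those skilled in the art may of course run the method of this application on other platforms as needed, and this exemplary embodiment places no special limitation on this.
The page visit duration collection method provided by the embodiments of this application is applicable to the big data field, for example building a data warehouse from collected page visit durations so that the visit behavior of users or employees can be analyzed according to their page visit durations to support related decisions. The specifics can be determined by the actual application scenario and are not limited here.
Referring to FIG. 1, the page visit duration collection method may include the following steps S110 to S160.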
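Step S110: obtain event tracking point configuration information corresponding to the page code of a target page and a first historical visit record of a target visitor for the target page.
Step S120: obtain a second historical visit record of the target visitor for pages associated with the target page.
Step S130: extract event tracking point configuration features of the target page from the event tracking point configuration information.
Step S140: extract event tracking point trigger features for a predetermined time period from the first historical visit record and the second historical visit record.
Step S150: input the event tracking point configuration features and the event tracking point trigger features into a duration collection point prediction model to obtain duration collection point information of the visitor, the duration collection point information including the event tracking points that the visitor is predicted to trigger when visiting the target page.
Step S160: associate duration collection code with a target event tracking point in the page code, so that when the visitor visits the target page, the duration collection code associated with the target event tracking point reports duration data in response to the triggering of the target event tracking point, the target event tracking point corresponding to an event tracking point in the duration collection point information of the visitor.
In this way, by analyzing the visitor's visit records and inputting the event tracking point configuration features of the visited page and the event tracking point trigger features of the visit records into the duration collection point prediction model, duration collection point information that reflects the visitor's visiting habits is predicted. Duration collection code is then associated, on a per-visitor basis, with the event tracking points in the page code that correspond to the duration collection point information, and duration data is reported through it. This enables segmented reporting of visit duration, ensures that duration data is collected under all circumstances, and avoids loss of visit duration data; it also avoids the excessive code coupling that would arise if duration collection code were associated with every tracking point, which keeps duration data collection reliable.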
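Each step of the page visit duration collection method in this example embodiment is explained in detail below with reference to the drawings.
In step S110, event tracking point configuration information corresponding to the page code of the target page and the first historical visit record of the target visitor for the target page are obtained.
In this example embodiment, referring to FIG. 2, the server 201 obtains the event tracking point configuration information corresponding to the page code of the target page of the user terminal 202, and the first historical visit record of the target visitor (for example, the user terminal 202 or a visitor corresponding to a user account) for the target page. In subsequent steps, the server 201 can then process the event tracking point configuration information and the first historical visit record in order to collect the visit duration of the target page accurately. The server 201 may be any device with processing capability, such as a computer or a microprocessor, and is not specially limited here; the user terminal 202 may be any terminal capable of accessing Internet pages.
Event tracking point configuration information is information about how the data-reporting tracking points for events such as clicks and swipes are preconfigured in the page code of the target page, for example the association between each configured event tracking point and its triggering event, and the location of each tracking point.
The target visitor may be a user who visits the target page (for example, a visitor corresponding to a user account) or a terminal that visits the target page (for example, the terminal on which the page is viewed).
The first historical visit record is the long-term operation data generated when the target visitor visited the target page in the past, such as the number of times and the times at which each event was triggered. Historical visit records can reflect a visitor's visiting habits. For example: users drop off steadily as the page level deepens, with the highest drop-off rate in the first few pages; the click volume of a button is affected by the page level; some images, even though they are not animated, invite clicks, such as treasure chests, gift boxes, and other elements that bring a surprise when opened in real life; and the average dwell time on the first and last screens of a page is longer than on the screens in between.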
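In step S120, the second historical visit record of the target visitor for the pages associated with the target page is obtained.
In this example embodiment, a page associated with the target page may be another page reached by a jump after the target page is visited, or a page that has a predetermined association or a regular visit pattern with the target page (another page that must be, or is very likely to be, visited once the target page has been visited).
The second historical visit record is the long-term operation data from the target visitor's past visits to the associated pages, such as data on the patterns in which events are triggered.
In this way, the second historical visit record of the target visitor for the associated pages is obtained. It can capture implicit characteristics of how the visitor visits the target page, which further ensures the accuracy of the analysis of the visitor's visit patterns for the target page in subsequent steps.
In step S130, event tracking point configuration features of the target page are extracted from the event tracking point configuration information.
In this example embodiment, the event tracking point configuration features may be machine learning features extracted from the event tracking point configuration information, such as a feature vector of the information. They may also be the features that remain after abnormal data (configuration information that is useless or redundant for analyzing duration collection points in subsequent steps) has been excluded by comparison with preset standard data, for example by removing some of the event tracking points that are operated too frequently while the page is browsed.
In this way, data features in a standard form for analyzing duration collection points are obtained, which ensures the efficiency and accuracy of the analysis.
In one embodiment, extracting the event tracking point configuration features of the target page from the event tracking point configuration information includes: inputting the event tracking point configuration information into a preset configuration feature extraction model to obtain the event tracking point configuration features of the target page. The preset configuration feature extraction model is a pre-trained machine learning model which, having been trained on a large amount of event tracking point configuration information, automatically analyzes the current event tracking point configuration information according to the model's training strategy and outputs event tracking point configuration features of the target page that meet the requirements.
In one embodiment, the training method of the configuration feature extraction model includes: obtaining a sample set of event tracking point configuration information, in which each sample includes event tracking point configuration information and calibrated event tracking point configuration features; inputting the data of each sample into the configuration feature extraction model to obtain the predicted event tracking point configuration features output by the model; if, for any sample, the predicted event tracking point configuration features are inconsistent with the features calibrated in advance for that sample, adjusting the coefficients of the configuration feature extraction model until they are consistent; and ending the training when, for all samples, the predicted event tracking point configuration features are consistent with the features calibrated in advance.
The sample set of event tracking point configuration information includes multiple samples, each of which includes event tracking point configuration information and standard event tracking point configuration features that experts have calibrated as useful for analyzing duration collection point information. This helps ensure the accuracy of model training.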
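In step S140, event tracking point trigger features for a predetermined time period are extracted from the first historical visit record and the second historical visit record.
In this example embodiment, the predetermined time period may be set as required, for example the period over which a particular event lasts, or the full time range of the records.
The first and second historical visit records contain visit information for the target page, and therefore can contain the visitor's visit patterns (for example, the visitor's patterns during a particular event, or the patterns across the full records). These patterns show up clearly in the way the visitor triggers the page's event tracking points.
The event tracking point trigger features may be machine learning features extracted from the visit records, such as a feature vector of the information. They may also be the features that remain after abnormal data (operation information that is useless or redundant for analyzing duration collection points in subsequent steps) has been excluded by comparison with preset standard data, for example by removing some of the event tracking points that are operated too frequently while the page is browsed.
In this way, data features in a standard form for analyzing duration collection points are obtained, which ensures the efficiency and accuracy of the analysis.
In one embodiment, referring to FIG. 3, extracting event tracking point trigger features for a predetermined time period from the first historical visit record and the second historical visit record includes: step S310, extracting event tracking point trigger information for the predetermined time period from the first historical visit record and the second historical visit record; and step S320, extracting event tracking point trigger features from the event tracking point trigger information of the predetermined time period.
The event tracking point trigger information for the predetermined time period (which may be a specified time period or the time period of a particular visit event, such as a marketing campaign) is extracted first, and the event tracking point trigger features are then extracted from that information.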
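Steps S310 and S320 can be sketched as a time-window filter followed by a simple per-point count. This is only an illustration: the record layout (an event point ID plus a trigger timestamp) and the use of trigger counts as the feature are assumptions made here, not details given in this application, which leaves feature extraction to a trained model.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, List, Tuple

# Hypothetical record layout: (event_point_id, trigger_time).
TriggerRecord = Tuple[str, datetime]


def trigger_info_in_period(
    records: List[TriggerRecord], start: datetime, end: datetime
) -> List[TriggerRecord]:
    """S310 (sketch): keep only triggers inside the half-open window [start, end)."""
    return [r for r in records if start <= r[1] < end]


def trigger_counts(records: List[TriggerRecord]) -> Dict[str, int]:
    """S320 (stand-in): count triggers per event point as a toy feature."""
    return dict(Counter(event_id for event_id, _ in records))


# First record: target page; second record: associated pages (made-up data).
first_record = [("buy_button", datetime(2020, 1, 5, 10, 0)),
                ("banner", datetime(2020, 2, 1, 9, 30))]
second_record = [("buy_button", datetime(2020, 1, 6, 11, 15))]

window = trigger_info_in_period(
    first_record + second_record, datetime(2020, 1, 1), datetime(2020, 2, 1)
)
features = trigger_counts(window)
```

The window is half-open so that adjacent periods never count the same trigger twice.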
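In one embodiment, extracting event tracking point trigger features from the event tracking point trigger information of the predetermined time period includes: inputting the event tracking point trigger information of the predetermined time period into a preset trigger feature extraction model to obtain the event tracking point trigger features.
The preset trigger feature extraction model is a pre-trained machine learning model which, having been trained on a large amount of event tracking point trigger information, automatically analyzes the current event tracking point trigger information according to the model's training strategy and outputs event tracking point trigger features of the target page that meet the requirements.
In one embodiment, the training method of the trigger feature extraction model includes: obtaining a sample set of event tracking point trigger information, in which each sample includes event tracking point trigger information and calibrated event tracking point trigger features; inputting the data of each sample into the trigger feature extraction model to obtain the predicted event tracking point trigger features output by the model; if, for any sample, the predicted event tracking point trigger features are inconsistent with the features calibrated in advance for that sample, adjusting the coefficients of the trigger feature extraction model until they are consistent; and ending the training when, for all samples, the predicted event tracking point trigger features are consistent with the features calibrated in advance.
The sample set of event tracking point trigger information includes multiple samples, each of which includes event tracking point trigger information and standard event tracking point trigger features that experts have calibrated as useful for analyzing duration collection point information. This helps ensure the accuracy of model training.
The sample set of event tracking point configuration information, the sample set of event tracking point trigger information, the first historical visit record, and the second historical visit record may be stored in a database in advance, or stored in a blockchain so that the data can be shared across platforms and protected against tampering. When they are obtained from the blockchain, this may be done by invoking a smart contract, which is not described further here.
A blockchain is a new application model of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, in which each block contains the information of a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, and an application service layer.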
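In one embodiment, determining the predetermined time period includes: obtaining a target time period for page visit duration collection on the target page, the target time period indicating the duration of a specific page visit event; obtaining the event features of the specific page visit event within that duration; determining, according to the event features, from the first historical visit record and the second historical visit record, the duration of a historical event whose historical event features have a similarity to the event features higher than a predetermined threshold; and determining the duration of the historical event as the predetermined time period.
The target time period for page visit duration collection is obtained; it indicates the duration of a specific page visit event, for example the period over which a future marketing campaign needs to be monitored and its durations collected. This allows duration collection points to be analyzed individually for a specific time period.
Obtaining the event features of the specific page visit event within its duration yields the features of the visit event in the target time period, such as the campaign rules or the campaign name.
Then, according to the event features, the durations of historical events whose features have a similarity to those event features higher than the predetermined threshold are determined from the first and second historical visit records. This finds, in the two records, the durations of the historical events whose activity patterns are similar to those of the target time period. The predetermined threshold is set as required.
The duration of the historical event can then be determined as the predetermined time period.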
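The period selection described above can be illustrated with a small sketch. The application does not say how similarity between event features is measured; Jaccard similarity over sets of feature tags is used here purely as an assumed stand-in, and the tag names, dates, and threshold are made up.

```python
from typing import List, Set, Tuple

# Hypothetical layout: (feature tags, (period_start, period_end)).
Period = Tuple[str, str]
HistoricalEvent = Tuple[Set[str], Period]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two tag sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_periods(
    target: Set[str], history: List[HistoricalEvent], threshold: float
) -> List[Period]:
    """Return the periods of historical events whose similarity exceeds the threshold."""
    return [period for tags, period in history if jaccard(target, tags) > threshold]


target_features = {"campaign", "coupon", "new_user"}
history = [
    ({"campaign", "coupon", "lottery"}, ("2019-11-01", "2019-11-11")),
    ({"maintenance"}, ("2019-12-01", "2019-12-02")),
]
periods = similar_periods(target_features, history, threshold=0.4)
```

Here the first historical event shares two of four distinct tags with the target (similarity 0.5), so only its period is selected.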
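In step S150, the event tracking point configuration features and the event tracking point trigger features are input into the duration collection point prediction model to obtain duration collection point information of the visitor, the duration collection point information including the event tracking points that the visitor is predicted to trigger when visiting the target page.
In this example embodiment, the duration collection point prediction model is a pre-trained machine learning model which, having been trained on a large number of event tracking point configuration features and event tracking point trigger features, automatically analyzes the current configuration and trigger features according to the model's training strategy and outputs duration collection point information of the visitor that meets the requirements. Based on the visitor's duration collection point information, duration collection code can then be set at the corresponding tracking points to collect durations.
In one embodiment, the training method of the duration collection point prediction model includes: obtaining a feature sample set, in which each sample includes event tracking point configuration features, event tracking point trigger features, and calibrated duration collection point information; inputting the data of each sample into the duration collection point prediction model to obtain the predicted duration collection point information output by the model; if, for any sample, the predicted duration collection point information is inconsistent with the duration collection point information calibrated in advance for that sample, adjusting the coefficients of the machine learning model until they are consistent; and ending the training when, for all samples, the predicted duration collection point information is consistent with the duration collection point information calibrated in advance.
The feature sample set includes multiple samples, each of which includes event tracking point configuration features, event tracking point trigger features, and duration collection point information calibrated by experts. This helps ensure the accuracy of model training.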
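The "adjust the coefficients until every sample is consistent" loop can be illustrated with a minimal sketch. The application does not name a model type; a single-layer perceptron with a binary label per event tracking point (1 means the point should carry duration collection code) is an assumption chosen here only because its training rule matches that description. The feature meanings and sample values are hypothetical.

```python
from typing import List, Tuple

# Hypothetical sample: ([config_feature, trigger_feature], calibrated_label).
Sample = Tuple[List[float], int]


def predict(weights: List[float], bias: float, x: List[float]) -> int:
    """Predict 1 when the weighted sum is positive, otherwise 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_until_consistent(
    samples: List[Sample], max_epochs: int = 1000
) -> Tuple[List[float], float]:
    """Adjust coefficients on every mismatch; stop once all samples match."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mismatches = 0
        for x, label in samples:
            error = label - predict(weights, bias, x)
            if error != 0:
                mismatches += 1
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
        if mismatches == 0:  # every prediction agrees with its calibrated label
            return weights, bias
    raise RuntimeError("samples were not fitted within max_epochs")


# Toy data: label is 1 only when the point is both configured and historically triggered.
samples = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 0), ([1.0, 1.0], 1)]
weights, bias = train_until_consistent(samples)
```

Requiring a perfect fit on the sample set, as the description does, only terminates when the samples are separable by the chosen model, which is why the sketch caps the number of epochs.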
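In step S160, duration collection code is associated with a target event tracking point in the page code, so that when the visitor visits the target page, the duration collection code associated with the target event tracking point reports duration data in response to the triggering of the target event tracking point, the target event tracking point corresponding to an event tracking point in the duration collection point information of the visitor.
In this example embodiment, duration collection code is associated with the event tracking points in the page code that correspond to the duration collection point information. When the visitor visits the target page, the duration collection code associated with a target event tracking point reports duration data in response to the triggering of that tracking point. Durations can thus be reported in segments, and the segmentation follows the visitor's visiting habits; that is, the association of duration collection code is tied to the visitor's habits, which keeps duration collection accurate while avoiding excessive code coupling as far as possible.
In one embodiment, reporting duration data through the duration collection code associated with the target event tracking point in response to the triggering of the target event tracking point includes: when the target event tracking point is the tracking point in the page code with which the first duration collection code is associated, the first duration collection code obtains and reports the difference between the trigger time of the target event tracking point and the moment the user entered the target page; when the target event tracking point is not the tracking point with which the first duration collection code is associated, the duration collection code associated with the target event tracking point obtains and reports the difference between the trigger time of the target event tracking point and the reporting moment of the preceding duration collection code.
Take the visit flow of page A as an example. When event tracking point 1 is triggered, the interval from entering the page to event tracking point 1 is reported; when event tracking point 2 is triggered, the interval between event tracking points 1 and 2 is reported; when event tracking point 3 is triggered, the interval between event tracking points 2 and 3 is reported; finally, after page B is entered, the last segment of page A is reported, that is, the interval from event tracking point 3 to the end of the page. The total visit duration of page A is the sum of the four segments. Segmented reporting effectively mitigates missed reports and duplicate counting in business scenarios.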
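The page A example can be traced with a minimal sketch of the segment arithmetic. The class name and the use of plain integer seconds are assumptions for illustration; in a real page the timestamps would come from the client clock.

```python
class SegmentedDurationReporter:
    """Reports the time elapsed since page entry or since the previous report."""

    def __init__(self, page_enter_time: int) -> None:
        # Reference point for the next segment; starts at page entry.
        self._last_mark = page_enter_time

    def report(self, trigger_time: int) -> int:
        """Return the segment ending at trigger_time and start a new segment."""
        segment = trigger_time - self._last_mark
        self._last_mark = trigger_time
        return segment


# Page A is entered at t=0; tracking points 1 to 3 fire at t=5, 12, 20;
# page B is entered at t=26, which closes the last segment of page A.
reporter = SegmentedDurationReporter(page_enter_time=0)
segments = [reporter.report(t) for t in (5, 12, 20, 26)]
total = sum(segments)
```

If the visit had ended abnormally after the third tracking point, the first three segments would already have been reported and only the last one would be missing.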
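In the related art, there are two main ways to implement duration data collection: one reports the duration of the first visited page upon entering the second page, and the other uses heartbeat reporting.
The first approach loses data when the report is not sent in time and cannot handle abnormal exits. The second approach puts too much load on the server, which makes it unsuitable for high-traffic scenarios, and heartbeats cannot be kept up for the entire page visit. A maximum visit duration is therefore usually set manually to avoid overloading the server, but this makes the duration statistics inaccurate, and the reported data falls well short of the actual figures.
Previously we reported each record whole, that is, the next page reported the duration of the previous page, which made the statistics heavily dependent on the end point of the page visit. This approach works very poorly for single-page applications: it loses the duration data of many one-off visitors, and it makes the visit duration of abnormal exits hard to count.
When T+1 is used for data analysis, page durations that span multiple days are difficult to handle. Because the end point of the visit is unclear, that day's visit duration may be counted as data for the second, third, or even fourth day, and when the page end point is never captured the duration data is lost. If the end point is set manually (the previous approach), a cross-day visit splits one page ending into two records, the manual end point and the actual end point, which causes duplicate counting.
To address these pain points in duration statistics, this solution proposes segmented duration reporting combined with event tracking points.
Real-time accumulation of active time: mouse, keyboard, touch, and similar events are monitored to accumulate the user's active time in real time. When the page is not visible but has not been exited (for example, the mobile page is moved to the background, or the user browses other web pages on a PC without closing the browser), that time is not added to the accumulated duration. Filtering out invalid time brings the data closer to the real figures.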
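The exclusion of invisible time can be sketched as a small accumulator. This is an illustration under assumptions: the method names are hypothetical, timestamps are passed in explicitly as integer seconds, and only visibility changes are modeled (in a browser these would typically be driven by the Page Visibility API's `visibilitychange` event); input-event monitoring is left out.

```python
from typing import Optional


class ActiveTimeAccumulator:
    """Accumulates time only while the page is visible."""

    def __init__(self, now: int) -> None:
        self._visible_since: Optional[int] = now  # None while the page is hidden
        self._accumulated = 0

    def on_hidden(self, now: int) -> None:
        """Page moved to the background: bank the visible time so far."""
        if self._visible_since is not None:
            self._accumulated += now - self._visible_since
            self._visible_since = None

    def on_visible(self, now: int) -> None:
        """Page returned to the foreground: resume counting."""
        if self._visible_since is None:
            self._visible_since = now

    def flush(self, now: int) -> int:
        """Return the active time accumulated so far and reset the counter."""
        total = self._accumulated
        if self._visible_since is not None:
            total += now - self._visible_since
            self._visible_since = now
        self._accumulated = 0
        return total


acc = ActiveTimeAccumulator(now=0)
acc.on_hidden(now=10)    # 10 s visible, then the page goes to the background
acc.on_visible(now=40)   # 30 s in the background are not counted
first = acc.flush(now=45)
second = acc.flush(now=50)
```

Each `flush` corresponds to one reported segment, so the counter restarts from zero afterwards.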
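Segmented duration reporting: this solution uses the page's tracking points as trigger points to split the visit duration into multiple segments. If the user triggers an event tracking point or another event that reports to the server, the visit duration accumulated over the preceding segment is reported to the server together with the event tracking point; the duration counter is then cleared, and the next segment is accumulated from zero. After the user enters the next page, the last accumulated segment of the previous page is reported.
The visit to a single page is split into multiple segments for reporting, and each segment acts as an end point, which reduces the error caused by late reporting of page duration without overloading the server.
Retaining visit duration in abnormal situations: traditional H5 page duration statistics depend heavily on the page end node. If the end is not captured (for example, the browser exits abnormally), the correct visit duration cannot be calculated, and this part of the data is usually lost. This solution decouples duration statistics from the page end node and accumulates duration in real time: every additional second the user stays adds one second of duration, so the user's real active time can be preserved even if the browser exits abnormally.
Reporting along with tracking points: H5 reports a piece of data immediately each time an event is triggered. This solution places the duration inside the data structure of the event tracking point report, so the server still processes a single record and its load does not increase significantly.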
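Placing the duration inside the event report can be pictured as adding one field to a payload that would be sent anyway. The application does not define the payload format, so the JSON layout and the field names `event_id`, `data`, and `duration_ms` below are hypothetical.

```python
import json
from typing import Any, Dict


def build_event_report(event_id: str, event_data: Dict[str, Any], segment_ms: int) -> str:
    """Serialize one event report that also carries the accumulated segment."""
    report = {
        "event_id": event_id,       # which tracking point fired
        "data": event_data,         # the event's own business fields
        "duration_ms": segment_ms,  # time accumulated since the previous report
    }
    return json.dumps(report, sort_keys=True)


body = build_event_report("buy_button", {"sku": "A1"}, 5300)
parsed = json.loads(body)
```

The server receives one record per event either way; the duration adds a field rather than an extra request.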
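Segmented reporting ensures, as far as possible, that valid data is collected successfully. Even if a visit ends abnormally, the visit duration before the abnormality is retained and the record is not missed or lost as a whole, which improves the accuracy of data collection.
Compatible with duration statistics for single pages: the valid duration of the page is reported in time without placing an excessive burden on the server. In addition, no maximum visit duration has to be set manually, so the overall statistics are closer to actual visit behavior.
This solution collects page visit duration in segments. Each short segment can be treated as an end point, which decouples visit duration from the page end point and solves two problems in the related art: T+1 big-data computation cannot count the cross-day visit duration of a single page on the same day, and double counting occurs when a manual end point and an actual end point both exist. Building on the existing solution, the duration is accumulated per day, so the duration statistics neither span days nor double count.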
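The per-day accumulation can be sketched as grouping reported segments by the calendar day of their report time. The tuple layout and the sample values are assumptions for illustration; the point is that a visit spanning midnight contributes to each day only the segments reported on that day.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Tuple

# Hypothetical layout: (report_time, segment_seconds).
Segment = Tuple[datetime, int]


def daily_totals(segments: List[Segment]) -> Dict[str, int]:
    """Sum segment durations per calendar day of the report time."""
    totals: Dict[str, int] = defaultdict(int)
    for reported_at, seconds in segments:
        totals[reported_at.date().isoformat()] += seconds
    return dict(totals)


# One visit to the same page that runs across midnight.
visit = [
    (datetime(2020, 1, 20, 23, 50), 300),
    (datetime(2020, 1, 20, 23, 58), 480),
    (datetime(2020, 1, 21, 0, 7), 540),
]
totals = daily_totals(visit)
```

Because each segment is counted exactly once, on the day it was reported, no manual end point is needed and nothing is counted twice.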
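This application also provides a page visit duration collection apparatus. Referring to FIG. 4, the page visit duration collection apparatus includes: a first obtaining module 410, configured to obtain event tracking point configuration information corresponding to the page code of a target page and a first historical visit record of a target visitor for the target page; a second obtaining module 420, configured to obtain a second historical visit record of the target visitor for pages associated with the target page; a first extraction module 430, configured to extract event tracking point configuration features of the target page from the event tracking point configuration information; a second extraction module 440, configured to extract event tracking point trigger features for a predetermined time period from the first historical visit record and the second historical visit record; an analysis module 450, configured to input the event tracking point configuration features and the event tracking point trigger features into a duration collection point prediction model to obtain duration collection point information of the visitor, the duration collection point information including the event tracking points that the visitor is predicted to trigger when visiting the target page; and a collection module 460, configured to associate duration collection code with a target event tracking point in the page code, so that when the visitor visits the target page, the duration collection code associated with the target event tracking point reports duration data in response to the triggering of the target event tracking point, the target event tracking point corresponding to an event tracking point in the duration collection point information of the visitor.
The specific details of each module in the above page visit duration collection apparatus have been described in detail in the corresponding page visit duration collection method, and are not repeated here.
It should be noted that although several modules or units of the device for performing actions are mentioned in the detailed description above, this division is not mandatory. In fact, according to the embodiments of this application, the features and functions of two or more modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
In addition, although the steps of the method in this application are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in that order, or that all of the steps shown must be performed, to achieve the desired result. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one, and/or one step may be split into multiple steps.
From the description of the embodiments above, those skilled in the art will readily understand that the example embodiments described here can be implemented in software, or in software combined with the necessary hardware. The technical solution according to the embodiments of this application can therefore be embodied as a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or portable hard disk) or on a network, and which includes several instructions that cause a computing device (such as a personal computer, a server, a mobile terminal, or a network device) to execute the method according to the embodiments of this application.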
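In an exemplary embodiment of this application, an electronic device capable of implementing the above method is also provided.
Those skilled in the art will understand that aspects of this application may be implemented as a system, a method, or a program product. Accordingly, aspects of this application may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, and the like), or an embodiment combining hardware and software aspects, which may collectively be referred to here as a "circuit", "module", or "system".
The electronic device 500 according to this embodiment of this application is described below with reference to FIG. 5. The electronic device 500 shown in FIG. 5 is only an example and should not impose any limitation on the function or scope of use of the embodiments of this application.
As shown in FIG. 5, the electronic device 500 takes the form of a general-purpose computing device. The components of the electronic device 500 may include, but are not limited to: the at least one processing unit 510, the at least one storage unit 520, and a bus 530 connecting the different system components (including the storage unit 520 and the processing unit 510).
The storage unit stores program code that can be executed by the processing unit 510, so that the processing unit 510 executes the steps of the various exemplary embodiments of this application described in the "Exemplary Method" section of this specification. For example, the processing unit 510 may perform the steps shown in FIG. 1.
The storage unit 520 may include a readable storage medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 5201 and/or a cache storage unit 5202, and may further include a read-only storage unit (ROM) 5203.
The storage unit 520 may also include a program/utility 5204 having a set of (at least one) program modules 5205. Such program modules 5205 include but are not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 530 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 500 may also communicate with one or more external devices 700 (such as keyboards, pointing devices, or Bluetooth devices), with one or more devices that enable a user to interact with the electronic device 500, and/or with any device (such as a router or modem) that enables the electronic device 500 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 550, and the device may include a display unit 540 connected to the input/output (I/O) interface 550. The electronic device 500 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 560. As shown in the figure, the network adapter 560 communicates with the other modules of the electronic device 500 through the bus 530. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
From the description of the embodiments above, those skilled in the art will readily understand that the example embodiments described here can be implemented in software, or in software combined with the necessary hardware. The technical solution according to the embodiments of this application can therefore be embodied as a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or portable hard disk) or on a network, and which includes several instructions that cause a computing device (such as a personal computer, a server, a terminal device, or a network device) to execute the method according to the embodiments of this application.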
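In an exemplary embodiment of this application, a computer-readable storage medium is also provided, on which a program product capable of implementing the above method of this specification is stored. The computer-readable storage medium may be non-volatile or volatile. In some possible implementations, aspects of this application may also be implemented in the form of a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to execute the steps of the various exemplary embodiments of this application described in the "Exemplary Method" section of this specification.
Referring to FIG. 6, a program product 600 for implementing the above method according to an embodiment of this application is described. It may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer. The program product of this application is not limited to this, however. In this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium, and that readable medium may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
The program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.
The program code for performing the operations of this application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the client computing device, partly on the client device, as an independent software package, partly on the client computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the client computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, via the Internet using an Internet service provider).
In addition, the above drawings are only schematic illustrations of the processing included in the method according to the exemplary embodiments of this application, and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the chronological order of these processes. It is also easy to understand that these processes may be performed, for example, synchronously or asynchronously in multiple modules.
Other embodiments of this application will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed here. This application is intended to cover any variations, uses, or adaptations of this application that follow its general principles and include common knowledge or customary technical means in the art not disclosed in this application. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of this application indicated by the claims.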

Claims (20)

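  1. A page visit duration collection method, comprising:
    obtaining event tracking point configuration information corresponding to the page code of a target page and a first historical visit record of a target visitor for the target page;
    obtaining a second historical visit record of the target visitor for pages associated with the target page;
    extracting event tracking point configuration features of the target page from the event tracking point configuration information;
    extracting event tracking point trigger features for a predetermined time period from the first historical visit record and the second historical visit record;
    inputting the event tracking point configuration features and the event tracking point trigger features into a duration collection point prediction model to obtain duration collection point information of the visitor, the duration collection point information including the event tracking points that the visitor is predicted to trigger when visiting the target page; and
    associating duration collection code with a target event tracking point in the page code, so that when the visitor visits the target page, the duration collection code associated with the target event tracking point reports duration data in response to the triggering of the target event tracking point, the target event tracking point corresponding to an event tracking point in the duration collection point information of the visitor.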
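  2. The method according to claim 1, wherein extracting event tracking point trigger features for a predetermined time period from the first historical visit record and the second historical visit record comprises:
    extracting event tracking point trigger information for the predetermined time period from the first historical visit record and the second historical visit record; and
    extracting event tracking point trigger features from the event tracking point trigger information of the predetermined time period.
  3. The method according to claim 2, wherein extracting event tracking point trigger features from the event tracking point trigger information of the predetermined time period comprises:
    inputting the event tracking point trigger information of the predetermined time period into a preset trigger feature extraction model to obtain the event tracking point trigger features.
  4. The method according to claim 1, wherein determining the predetermined time period comprises:
    obtaining a target time period for page visit duration collection on the target page, the target time period indicating the duration of a specific page visit event;
    obtaining the event features of the specific page visit event within the duration of the specific page visit event;
    determining, according to the event features, from the first historical visit record and the second historical visit record, the duration of a historical event whose historical event features have a similarity to the event features higher than a predetermined threshold; and
    determining the duration of the historical event as the predetermined time period.
  5. The method according to claim 1, wherein extracting event tracking point configuration features of the target page from the event tracking point configuration information comprises:
    inputting the event tracking point configuration information into a preset configuration feature extraction model to obtain the event tracking point configuration features of the target page.
  6. The method according to claim 1, wherein reporting duration data through the duration collection code associated with the target event tracking point in response to the triggering of the target event tracking point comprises:
    when the target event tracking point is the tracking point in the page code with which the first duration collection code is associated, obtaining and reporting, by the first duration collection code, the difference between the trigger time of the target event tracking point and the moment the user entered the target page; and
    when the target event tracking point is not the tracking point in the page code with which the first duration collection code is associated, obtaining and reporting, by the duration collection code associated with the target event tracking point, the difference between the trigger time of the target event tracking point and the reporting moment of the preceding duration collection code.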
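  7. The method according to claim 1, wherein the training method of the duration collection point prediction model comprises:
    obtaining a feature sample set, in which each sample includes event tracking point configuration features, event tracking point trigger features, and calibrated duration collection point information;
    inputting the data of each sample into the duration collection point prediction model to obtain the predicted duration collection point information output by the duration collection point prediction model;
    if, for any sample, the predicted duration collection point information obtained after the data of the sample is input into the duration collection point prediction model is inconsistent with the duration collection point information calibrated in advance for the sample, adjusting the coefficients of the machine learning model until they are consistent; and
    ending the training when, for all samples, the predicted duration collection point information obtained after the data of the samples is input into the duration collection point prediction model is consistent with the duration collection point information calibrated in advance for the samples.
  8. The method according to claim 3, wherein the training method of the preset trigger feature extraction model comprises:
    obtaining a sample set of event tracking point trigger information, in which each sample includes event tracking point trigger information and calibrated event tracking point trigger features;
    inputting the data of each sample into a trigger feature extraction model to obtain the predicted event tracking point trigger features output by the trigger feature extraction model;
    if, for any sample, the predicted event tracking point trigger features obtained after the data of the sample is input into the trigger feature extraction model are inconsistent with the event tracking point trigger features calibrated in advance for the sample, adjusting the coefficients of the trigger feature extraction model until they are consistent; and
    ending the training, to obtain the preset trigger feature extraction model, when, for all samples, the predicted event tracking point trigger features obtained after the data of the samples is input into the trigger feature extraction model are consistent with the event tracking point trigger features calibrated in advance for the samples.
  9. The method according to claim 5, wherein the training method of the preset configuration feature extraction model comprises:
    obtaining a sample set of event tracking point configuration information, in which each sample includes event tracking point configuration information and calibrated event tracking point configuration features;
    inputting the data of each sample into a configuration feature extraction model to obtain the predicted event tracking point configuration features output by the configuration feature extraction model;
    if, for any sample, the predicted event tracking point configuration features obtained after the data of the sample is input into the configuration feature extraction model are inconsistent with the event tracking point configuration features calibrated in advance for the sample, adjusting the coefficients of the configuration feature extraction model until they are consistent; and
    ending the training, to obtain the preset configuration feature extraction model, when, for all samples, the predicted event tracking point configuration features obtained after the data of the samples is input into the configuration feature extraction model are consistent with the event tracking point configuration features calibrated in advance for the samples.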
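  10. A page visit duration collection apparatus, comprising:
    a first obtaining module, configured to obtain event tracking point configuration information corresponding to the page code of a target page and a first historical visit record of a target visitor for the target page;
    a second obtaining module, configured to obtain a second historical visit record of the target visitor for pages associated with the target page;
    a first extraction module, configured to extract event tracking point configuration features of the target page from the event tracking point configuration information;
    a second extraction module, configured to extract event tracking point trigger features for a predetermined time period from the first historical visit record and the second historical visit record;
    an analysis module, configured to input the event tracking point configuration features and the event tracking point trigger features into a duration collection point prediction model to obtain duration collection point information of the visitor, the duration collection point information including the event tracking points that the visitor is predicted to trigger when visiting the target page; and
    a collection module, configured to associate duration collection code with a target event tracking point in the page code, so that when the visitor visits the target page, the duration collection code associated with the target event tracking point reports duration data in response to the triggering of the target event tracking point, the target event tracking point corresponding to an event tracking point in the duration collection point information of the visitor.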
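  11. An electronic device, comprising:
    a processor; and
    a memory for storing program instructions of the processor; wherein the processor is configured to perform, by executing the program instructions:
    obtaining event tracking point configuration information corresponding to the page code of a target page and a first historical visit record of a target visitor for the target page;
    obtaining a second historical visit record of the target visitor for pages associated with the target page;
    extracting event tracking point configuration features of the target page from the event tracking point configuration information;
    extracting event tracking point trigger features for a predetermined time period from the first historical visit record and the second historical visit record;
    inputting the event tracking point configuration features and the event tracking point trigger features into a duration collection point prediction model to obtain duration collection point information of the visitor, the duration collection point information including the event tracking points that the visitor is predicted to trigger when visiting the target page; and
    associating duration collection code with a target event tracking point in the page code, so that when the visitor visits the target page, the duration collection code associated with the target event tracking point reports duration data in response to the triggering of the target event tracking point, the target event tracking point corresponding to an event tracking point in the duration collection point information of the visitor.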
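  12. The electronic device according to claim 11, wherein the processor is configured to perform, by executing the program instructions:
    extracting event tracking point trigger information for the predetermined time period from the first historical visit record and the second historical visit record; and
    extracting event tracking point trigger features from the event tracking point trigger information of the predetermined time period.
  13. The electronic device according to claim 12, wherein the processor is configured to perform, by executing the program instructions:
    inputting the event tracking point trigger information of the predetermined time period into a preset trigger feature extraction model to obtain the event tracking point trigger features.
  14. The electronic device according to claim 11, wherein the processor is configured to perform, by executing the program instructions:
    obtaining a target time period for page visit duration collection on the target page, the target time period indicating the duration of a specific page visit event;
    obtaining the event features of the specific page visit event within the duration of the specific page visit event;
    determining, according to the event features, from the first historical visit record and the second historical visit record, the duration of a historical event whose historical event features have a similarity to the event features higher than a predetermined threshold; and
    determining the duration of the historical event as the predetermined time period.
  15. The electronic device according to claim 11, wherein the processor is configured to perform, by executing the program instructions:
    inputting the event tracking point configuration information into a preset configuration feature extraction model to obtain the event tracking point configuration features of the target page.
  16. The electronic device according to claim 11, wherein the processor is configured to perform, by executing the program instructions:
    when the target event tracking point is the tracking point in the page code with which the first duration collection code is associated, obtaining and reporting, by the first duration collection code, the difference between the trigger time of the target event tracking point and the moment the user entered the target page; and
    when the target event tracking point is not the tracking point in the page code with which the first duration collection code is associated, obtaining and reporting, by the duration collection code associated with the target event tracking point, the difference between the trigger time of the target event tracking point and the reporting moment of the preceding duration collection code.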
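  17. The electronic device according to claim 11, wherein the processor is configured to perform, by executing the program instructions:
    obtaining a feature sample set, in which each sample includes event tracking point configuration features, event tracking point trigger features, and calibrated duration collection point information;
    inputting the data of each sample into the duration collection point prediction model to obtain the predicted duration collection point information output by the duration collection point prediction model;
    if, for any sample, the predicted duration collection point information obtained after the data of the sample is input into the duration collection point prediction model is inconsistent with the duration collection point information calibrated in advance for the sample, adjusting the coefficients of the machine learning model until they are consistent; and
    ending the training when, for all samples, the predicted duration collection point information obtained after the data of the samples is input into the duration collection point prediction model is consistent with the duration collection point information calibrated in advance for the samples.
  18. The electronic device according to claim 13, wherein the processor is configured to perform, by executing the program instructions:
    obtaining a sample set of event tracking point trigger information, in which each sample includes event tracking point trigger information and calibrated event tracking point trigger features;
    inputting the data of each sample into a trigger feature extraction model to obtain the predicted event tracking point trigger features output by the trigger feature extraction model;
    if, for any sample, the predicted event tracking point trigger features obtained after the data of the sample is input into the trigger feature extraction model are inconsistent with the event tracking point trigger features calibrated in advance for the sample, adjusting the coefficients of the trigger feature extraction model until they are consistent; and
    ending the training, to obtain the preset trigger feature extraction model, when, for all samples, the predicted event tracking point trigger features obtained after the data of the samples is input into the trigger feature extraction model are consistent with the event tracking point trigger features calibrated in advance for the samples.
  19. The electronic device according to claim 15, wherein the processor is configured to perform, by executing the program instructions:
    obtaining a sample set of event tracking point configuration information, in which each sample includes event tracking point configuration information and calibrated event tracking point configuration features;
    inputting the data of each sample into a configuration feature extraction model to obtain the predicted event tracking point configuration features output by the configuration feature extraction model;
    if, for any sample, the predicted event tracking point configuration features obtained after the data of the sample is input into the configuration feature extraction model are inconsistent with the event tracking point configuration features calibrated in advance for the sample, adjusting the coefficients of the configuration feature extraction model until they are consistent; and
    ending the training, to obtain the preset configuration feature extraction model, when, for all samples, the predicted event tracking point configuration features obtained after the data of the samples is input into the configuration feature extraction model are consistent with the event tracking point configuration features calibrated in advance for the samples.
  20. A computer-readable storage medium on which program instructions are stored, wherein the program instructions, when executed by a processor, implement the method according to any one of claims 1 to 9.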
PCT/CN2020/093584 2020-01-20 2020-05-30 Method, apparatus, medium, and electronic device for collecting page access duration WO2021147220A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010066940.6A CN111241453B (zh) 2020-01-20 2020-01-20 Method, apparatus, medium, and electronic device for collecting page access duration
CN202010066940.6 2020-01-20

Publications (1)

Publication Number Publication Date
WO2021147220A1 (zh) 2021-07-29

Family

ID=70871895

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/093584 WO2021147220A1 (zh) 2020-01-20 2020-05-30 Method, apparatus, medium, and electronic device for collecting page access duration

Country Status (2)

Country Link
CN (1) CN111241453B (zh)
WO (1) WO2021147220A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
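CN113849390A (zh) * 2021-10-08 2021-12-28 珠海格力电器股份有限公司 Behavior data acquisition method and apparatus, electronic device, and storage medium
CN116522000A (zh) * 2023-05-23 2023-08-01 上海任意门科技有限公司 Method and apparatus for training a recommendation model for recommending content to users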

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
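CN113448832B (zh) * 2020-06-18 2024-03-12 北京新氧科技有限公司 Control exposure detection method and application running monitoring system
CN112311629B (zh) * 2020-10-30 2022-04-26 广州华多网络科技有限公司 Data processing method, apparatus, server, and computer-readable storage medium
CN112435047A (zh) * 2020-10-30 2021-03-02 四川新网银行股份有限公司 Marketing outbound-call data recommendation method based on tracking point data
CN112732546B (zh) * 2021-01-28 2024-06-04 腾讯科技(深圳)有限公司 Application-based usage duration processing method, apparatus, device, and storage medium
CN113590985B (zh) * 2021-09-29 2022-01-04 北京每日优鲜电子商务有限公司 Page jump configuration method, apparatus, electronic device, and computer-readable medium
CN113849391A (zh) * 2021-10-08 2021-12-28 珠海格力电器股份有限公司 Method and apparatus for determining program performance, electronic device, and storage medium
CN116662638B (zh) * 2022-09-06 2024-04-12 荣耀终端有限公司 Data collection method and related apparatus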

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7669136B1 (en) * 2008-11-17 2010-02-23 International Business Machines Corporation Intelligent analysis based self-scheduling browser reminder
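CN104850409A (zh) * 2015-06-05 2015-08-19 北京京东尚科信息技术有限公司 Method for counting web page dwell time
CN108337281A (zh) * 2017-01-19 2018-07-27 北京京东尚科信息技术有限公司 Method and system for calculating page browsing duration
CN108491315A (zh) * 2018-03-16 2018-09-04 五八有限公司 Method and apparatus for counting page dwell duration, and computer-readable storage medium
CN110322343A (zh) * 2019-07-02 2019-10-11 上海上湖信息技术有限公司 User full-lifecycle credit prediction method, apparatus, and computer device
CN110633205A (zh) * 2019-06-20 2019-12-31 北京无限光场科技有限公司 Tracking point event detection method, apparatus, terminal device, and medium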

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
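CN104216921B (zh) * 2013-06-05 2019-06-04 腾讯科技(深圳)有限公司 Method, apparatus, and system for prompting the addition of quick links in a browser
CN106294406B (zh) * 2015-05-22 2020-04-17 阿里巴巴集团控股有限公司 Method and device for processing application access data
CN107995266A (zh) * 2017-11-22 2018-05-04 平安科技(深圳)有限公司 Tracking point data processing method, apparatus, computer device, and storage medium
CN108921400A (zh) * 2018-06-14 2018-11-30 万翼科技有限公司 Real estate information statistics method, server, and storage medium
CN113127771A (zh) * 2019-05-30 2021-07-16 北京腾云天下科技有限公司 Application tracking point method, apparatus, computing device, and system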

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7669136B1 (en) * 2008-11-17 2010-02-23 International Business Machines Corporation Intelligent analysis based self-scheduling browser reminder
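CN104850409A (zh) * 2015-06-05 2015-08-19 北京京东尚科信息技术有限公司 Method for counting web page dwell time
CN108337281A (zh) * 2017-01-19 2018-07-27 北京京东尚科信息技术有限公司 Method and system for calculating page browsing duration
CN108491315A (zh) * 2018-03-16 2018-09-04 五八有限公司 Method and apparatus for counting page dwell duration, and computer-readable storage medium
CN110633205A (zh) * 2019-06-20 2019-12-31 北京无限光场科技有限公司 Tracking point event detection method, apparatus, terminal device, and medium
CN110322343A (zh) * 2019-07-02 2019-10-11 上海上湖信息技术有限公司 User full-lifecycle credit prediction method, apparatus, and computer device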

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
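CN113849390A (zh) * 2021-10-08 2021-12-28 珠海格力电器股份有限公司 Behavior data acquisition method and apparatus, electronic device, and storage medium
CN116522000A (zh) * 2023-05-23 2023-08-01 上海任意门科技有限公司 Method and apparatus for training a recommendation model for recommending content to users
CN116522000B (zh) * 2023-05-23 2024-01-23 上海任意门科技有限公司 Method and apparatus for training a recommendation model for recommending content to users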

Also Published As

Publication number Publication date
CN111241453B (zh) 2023-09-08
CN111241453A (zh) 2020-06-05

Similar Documents

Publication Publication Date Title
WO2021147220A1 (zh) Method, apparatus, medium, and electronic device for collecting page access duration
US20080033991A1 (en) Prediction of future performance of a dbms
EP3979080A1 (en) Methods and systems for predicting time of server failure using server logs and time-series data
US20160188182A1 (en) Predicting user navigation events
CN104731690B (zh) Adaptive metric collection, storage, and alert thresholds
US20190311114A1 (en) Man-machine identification method and device for captcha
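CN109522190B (zh) Abnormal user behavior identification method and apparatus, electronic device, and storage medium
WO2020168756A1 (zh) Cluster log feature extraction method, apparatus, device, and storage medium
CN111405030B (zh) Message push method, apparatus, electronic device, and storage medium
WO2020164272A1 (zh) Method and apparatus for identifying internet-access devices, storage medium, and computer device
CN113903389A (zh) Slow disk detection method, apparatus, and computer readable and writable storage medium
WO2020232902A1 (zh) Abnormal object identification method, apparatus, computing device, and storage medium
CN114022711A (zh) Industrial identification data cache processing method and apparatus, medium, and electronic device
WO2021174881A1 (zh) Combined prediction method for multi-dimensional information, apparatus, computer device, and medium
WO2020232903A1 (zh) Method and apparatus for dynamically adjusting monitoring tasks, computing device, and storage medium
CN116954976A (zh) Grayscale service fault handling method, apparatus, computer device, and storage medium
CN108768742B (zh) Network construction method and apparatus, electronic device, and storage medium
CN112070564B (zh) Advertisement pulling method, apparatus, system, and electronic device
CN117597679A (zh) Making decisions on placing data in a multi-tenant cache
CN114266352A (zh) Model training result optimization method, apparatus, storage medium, and device
CN114650252B (zh) Enterprise service bus-based routing method, apparatus, and computer device
CN112700884B (zh) Method for determining epidemic prevention and control effectiveness, apparatus, electronic device, and medium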
US20230412452A1 (en) Detecting network anomalies by correlating multiple information sources
US10282178B2 (en) Dynamic determination of instrumentation code based on client experience
CN115714718A (zh) Memory-based log early-warning method, system, computer device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20916086

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20916086

Country of ref document: EP

Kind code of ref document: A1