CN117149597A - User behavior analysis system, method, storage medium and computing device

Publication number
CN117149597A
Authority
CN
China
Prior art keywords
user behavior
analysis
user
log
different
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311040804.XA
Other languages
Chinese (zh)
Inventor
徐国兴
高海钊
冯晓梦
王伟
张浩男
蔡晓雨
孙福
杨文瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shuidi Technology Group Co ltd
Original Assignee
Beijing Shuidi Technology Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shuidi Technology Group Co ltd
Priority to CN202311040804.XA
Publication of CN117149597A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3438 Monitoring of user actions
    • G06F11/3452 Performance evaluation by statistical analysis
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/113 Details of archiving
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G06F16/182 Distributed file systems

Abstract

The invention discloses a user behavior analysis system, a method, a storage medium and a computing device. The system comprises: a log server, used for receiving user behavior logs reported in real time by different embedded point objects based on preset embedded point parameters, the log server being further configured to classify those user behavior logs according to the different embedded point objects and to store the classified user behavior logs into a time sequence database; and a second analysis subsystem, used for inputting the user behavior logs in the time sequence database into a second analysis model, the second analysis model analyzing the user behavior based on the user behavior logs according to the correspondingly configured analysis rule. Embodiments of the invention can associate the user behavior logs corresponding to different embedded point objects and, from whether the user behavior analysis succeeds or fails, help technical research and development personnel quickly verify whether embedded point collection succeeded and whether the reported embedded point parameters are abnormal.

Description

User behavior analysis system, method, storage medium and computing device
This application is a divisional application of the Chinese patent application with application number 202010166628.4, entitled "User behavior analysis system, method, storage medium and computing device", filed on 11/03/2020.
Technical Field
The invention relates to the technical field of data analysis, in particular to a user behavior analysis system, a user behavior analysis method, a storage medium and computing equipment.
Background
Analyzing user behavior is effective for both product iteration and anomaly diagnosis. Because a large number of similar analyses are repeated every day, the analysis is usually turned into a product in order to improve efficiency. Mature user behavior analysis systems are already available on the market, such as Analysys (易观) and GrowingIO, but connecting to a third-party system carries a data security risk and also requires a data acquisition scheme, so the cost of access is high.
The event analysis model commonly used in current user behavior analysis systems is built on two core entities, Event and User. Embedded points (also called buried points or tracking points) are collected through an SDK (software development kit), and whenever a function of a product is used, the two types of data can take part in specific analyses and queries either separately or jointly. For convenience, client (e.g., H5, applet, APP) and back-end embedded points are typically provided to the user together, but not distinguishing the embedded point types increases the user's cost. In addition, current user behavior analysis systems usually identify a user with a single unique distinct id, but the user identifications collected on different clients sometimes cannot be linked together, so user behavior analysis cannot be completed effectively and smoothly.
Disclosure of Invention
In view of the above, the embodiments of the present invention provide a system, a method, a storage medium, and a computing device for analyzing user behavior, which can achieve convenient and more comprehensive analysis of user behavior, and help to improve analysis efficiency of user behavior data.
According to an aspect of an embodiment of the present invention, there is provided a user behavior analysis system including:
the log server is used for receiving user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters and storing the user behavior logs into the distributed file system;
the distributed file system is used for analyzing the user identification from the user behavior logs stored in the preset time period, correlating the user behavior logs corresponding to different user identifications belonging to the same user and storing the correlated user behavior logs into the time sequence database;
the first analysis subsystem is used for inputting the user behavior log in the time sequence database into a first analysis model, and analyzing the user behavior by the first analysis model based on the user behavior log according to the correspondingly configured analysis rule.
Optionally, the different buried point objects include: at least one of a server, a front end H5, an applet, an APP, and a front end advertisement;
the preset buried point parameters comprise: at least one of the different attribute fields and the attribute screening conditions.
Optionally, the system further comprises a distributed publish-subscribe message system and a log collection system, wherein,
the log server is also used for publishing the user behavior logs which are reported in real time by the received different embedded point objects based on the preset embedded point parameters to the distributed publishing and subscribing message system;
the distributed publishing and subscribing message system is used for receiving user behavior logs from different embedded point objects published by the log server;
the log collection system is used for subscribing the user behavior log from the distributed publishing and subscribing message system and storing the subscribed user behavior log to the distributed file system.
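As a rough illustration of the publish/subscribe flow described above, the following Python sketch uses an in-memory queue in place of a real distributed messaging system and a dictionary in place of the distributed file system. All names (`InMemoryBroker`, `LogServer`, `LogCollector`, the `user_behavior` topic, the `/logs/<date>` layout) are hypothetical and are not taken from the patent.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Stand-in for the distributed publish-subscribe message system."""

    def __init__(self):
        self._topics = defaultdict(deque)

    def publish(self, topic, message):
        self._topics[topic].append(message)

    def poll(self, topic):
        """Return and remove all pending messages for a topic."""
        queue = self._topics[topic]
        messages = list(queue)
        queue.clear()
        return messages


class LogServer:
    """Receives logs reported by embedded point objects and publishes them."""

    def __init__(self, broker, topic="user_behavior"):
        self.broker = broker
        self.topic = topic

    def receive(self, log):
        self.broker.publish(self.topic, log)


class LogCollector:
    """Subscribes to the topic and stores logs in a file-system stand-in."""

    def __init__(self, broker, topic="user_behavior"):
        self.broker = broker
        self.topic = topic
        self.storage = defaultdict(list)  # hypothetical: path -> stored logs

    def collect(self):
        for log in self.broker.poll(self.topic):
            # Partition by day so an offline job can read one period at a time.
            self.storage[f"/logs/{log['date']}"].append(log)


broker = InMemoryBroker()
server = LogServer(broker)
collector = LogCollector(broker)
server.receive({"date": "2020-03-11", "source": "H5", "event": "page_view"})
server.receive({"date": "2020-03-11", "source": "APP", "event": "click"})
collector.collect()
```

Decoupling the log server from storage in this way lets the receiving side stay fast while the collector writes at its own pace; a production deployment would replace both stand-ins with real services.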
Optionally, the user identifier includes a unique identifier generated by user registration and a necessary identifier generated by different embedded point objects, and the distributed file system is further configured to:
establishing association between the unique identification and the necessary identification of the user analyzed in the same user behavior log, and combining the unique identification and the necessary identification to generate a unified identification;
and after associating the user behavior logs corresponding to the unique identification and the necessary identifications for which the association has been established, storing the associated user behavior logs into a time sequence database through a database management tool, wherein the associated user behavior logs correspond to the unified identification.
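The identifier association above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the field names `user_id` (unique identification from registration) and `device_id` (necessary identification generated by an embedded point object) are hypothetical, and the sketch simply reuses the unique identification as the unified identification, whereas the text only says the two are combined.

```python
def build_id_mapping(logs):
    """Map each necessary identifier to the unique identifier seen alongside it."""
    mapping = {}
    for log in logs:
        unique_id = log.get("user_id")
        necessary_id = log.get("device_id")
        if unique_id and necessary_id:
            # First association wins; a real system needs a conflict policy.
            mapping.setdefault(necessary_id, unique_id)
    return mapping


def attach_unified_id(logs, mapping):
    """Return copies of the logs with a `unified_id` field added."""
    result = []
    for log in logs:
        unique_id = log.get("user_id")
        necessary_id = log.get("device_id")
        # Fall back to the necessary identifier for users never seen logged in.
        unified_id = unique_id or mapping.get(necessary_id) or necessary_id
        result.append({**log, "unified_id": unified_id})
    return result


logs = [
    {"device_id": "h5-cookie-1", "event": "view_page"},  # anonymous visit
    {"device_id": "h5-cookie-1", "user_id": "u42", "event": "login"},
    {"device_id": "app-dev-9", "user_id": "u42", "event": "pay"},
    {"device_id": "app-dev-7", "event": "view_page"},  # never logged in
]
unified = attach_unified_id(logs, build_id_mapping(logs))
```

Because the mapping is built over all logs of the period before any log is rewritten, the anonymous H5 visit is attributed to the same user as the later login, which is why this step suits offline processing.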
Optionally, the first analysis model includes a funnel model, and the first analysis subsystem is further configured to:
receiving analysis rules which are configured by service personnel for the funnel model and comprise funnel conversion periods, funnel steps, funnel events corresponding to each step and funnel event screening conditions;
and analyzing the user behavior log according to a corresponding analysis rule by utilizing the funnel model through a funnel function of the time sequence database so as to analyze and obtain the user conversion rate of funnel events corresponding to different funnel steps executed by a user in a funnel conversion period.
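The text relies on a funnel function built into the time sequence database. The sketch below does not use that function; it is a plain-Python illustration of what a funnel computation yields, assuming events arrive as `(user, timestamp_seconds, event_name)` tuples and that the funnel conversion period is a window measured from a user's first-step event. The function names are hypothetical, and screening conditions are omitted.

```python
from collections import defaultdict


def funnel_levels(events, steps, window):
    """Return {user: deepest funnel step reached within `window` seconds}."""
    by_user = defaultdict(list)
    for user, ts, name in events:
        by_user[user].append((ts, name))

    levels = {}
    for user, rows in by_user.items():
        rows.sort()
        best = 0
        for i, (start_ts, name) in enumerate(rows):
            if name != steps[0]:
                continue  # a funnel attempt must start with the first step
            level = 1
            for ts, later in rows[i + 1:]:
                if ts - start_ts > window:
                    break
                if level < len(steps) and later == steps[level]:
                    level += 1
            best = max(best, level)
        levels[user] = best
    return levels


def conversion_rates(levels, step_count):
    """Share of users who entered the funnel and reached each step."""
    reached = [sum(1 for lv in levels.values() if lv >= k)
               for k in range(1, step_count + 1)]
    return [r / reached[0] if reached[0] else 0.0 for r in reached]


events = [
    ("a", 0, "view"), ("a", 10, "cart"), ("a", 20, "pay"),
    ("b", 0, "view"), ("b", 5, "cart"),
    ("c", 0, "view"), ("c", 100, "cart"),  # cart falls outside the window
]
levels = funnel_levels(events, ["view", "cart", "pay"], window=60)
rates = conversion_rates(levels, 3)
```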
Optionally, the first analysis model includes a retention model, and the first analysis subsystem is further configured to:
receiving analysis rules which are configured by service personnel for the retention model and comprise retention periods, initial behavior events, follow-up behavior events and screening conditions corresponding to the events;
and analyzing the user behavior log according to a corresponding analysis rule by using the retention model through a retention function of the time sequence database so as to obtain the user retention rate of the follow-up event continuously executed after the user executes the initial event in the retention period.
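A retention calculation of this kind can be illustrated in plain Python, without the database's retention function. The sketch assumes events are `(user, day_index, event_name)` tuples with integer day numbers, that a user's day 0 is the first day they performed the initial behavior event, and that screening conditions have already been applied; `retention_rates` is a hypothetical name.

```python
from collections import defaultdict


def retention_rates(events, initial_event, follow_event, period_days):
    """Return {day_offset: share of the cohort retained on that day}."""
    first_day = {}
    follow_days = defaultdict(set)
    for user, day, name in events:
        if name == initial_event:
            first_day[user] = min(day, first_day.get(user, day))
        if name == follow_event:
            follow_days[user].add(day)

    cohort = len(first_day)
    rates = {}
    for offset in range(1, period_days + 1):
        retained = sum(1 for user, start in first_day.items()
                       if start + offset in follow_days[user])
        rates[offset] = retained / cohort if cohort else 0.0
    return rates


events = [
    ("a", 0, "register"), ("a", 1, "login"), ("a", 3, "login"),
    ("b", 0, "register"), ("b", 2, "login"),
    ("c", 1, "register"), ("c", 2, "login"),
    ("d", 0, "register"),
]
rates = retention_rates(events, "register", "login", period_days=3)
```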
According to another aspect of the embodiment of the present invention, there is also provided a user behavior analysis system, including:
the log server is used for receiving user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters;
the log server is further configured to classify the user behavior logs reported by the different embedded point objects in real time according to the different embedded point objects, and store the classified user behavior logs into a time sequence database;
and the second analysis subsystem is used for inputting the user behavior log in the time sequence database into a second analysis model, and analyzing the user behavior by the second analysis model based on the user behavior log according to the correspondingly configured analysis rule.
Optionally, the different buried point objects include: at least one of a server, a front end H5, an applet, an APP, and a front end advertisement;
the preset buried point parameters comprise: at least one of the different attribute fields and the attribute screening conditions.
Optionally, the system further comprises a distributed publish-subscribe messaging system, wherein,
the log server is also used for publishing the user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters to the distributed publishing and subscribing message system;
the distributed publishing and subscribing message system is used for receiving user behavior logs from different embedded point objects published by the log server;
the time sequence database is further used for storing the user behavior log subscribed from the distributed publishing and subscribing message system.
Optionally, the second analysis model includes an event analysis model, and the second analysis subsystem is further configured to:
receiving analysis rules configured by service personnel and comprising at least two specific events capable of forming a combined event, indexes and screening conditions corresponding to each specific event, combined indexes of the combined event and operation modes of the combined indexes;
respectively carrying out aggregation calculation on at least two specific events meeting the corresponding screening conditions based on the user behavior log by utilizing the event analysis model to obtain a calculation result meeting the corresponding specific event index;
and calculating the calculation result conforming to the corresponding specific event index by using the event analysis model according to the calculation mode of the combination index to obtain the result conforming to the combination index.
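The combined-index calculation above can be sketched as two per-event aggregations followed by one arithmetic operation. The rule layout (`event`, `metric`, `field`, `filter`) and the metric names below are assumptions made for illustration; the text does not specify how analysis rules are encoded.

```python
import operator

# Hypothetical metric names mapped to aggregation functions.
AGGREGATORS = {
    "count": lambda logs, field: len(logs),
    "sum": lambda logs, field: sum(log[field] for log in logs),
    "unique_users": lambda logs, field: len({log["user"] for log in logs}),
}
OPERATIONS = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}


def event_metric(logs, rule):
    """Aggregate one specific event after applying its screening condition."""
    matched = [log for log in logs
               if log["event"] == rule["event"] and rule["filter"](log)]
    return AGGREGATORS[rule["metric"]](matched, rule.get("field"))


def combined_metric(logs, left_rule, right_rule, op):
    """Combine two per-event metrics; division raises if the right side is 0."""
    return OPERATIONS[op](event_metric(logs, left_rule),
                          event_metric(logs, right_rule))


def on_app(log):
    return log.get("platform") == "APP"


logs = [
    {"user": "a", "event": "pay", "amount": 30, "platform": "APP"},
    {"user": "a", "event": "pay", "amount": 10, "platform": "APP"},
    {"user": "b", "event": "pay", "amount": 20, "platform": "APP"},
    {"user": "c", "event": "pay", "amount": 99, "platform": "H5"},
    {"user": "c", "event": "view", "platform": "APP"},
]
revenue = {"event": "pay", "metric": "sum", "field": "amount", "filter": on_app}
payers = {"event": "pay", "metric": "unique_users", "filter": on_app}
revenue_per_payer = combined_metric(logs, revenue, payers, "/")
```

Here the combined index is average revenue per paying APP user: the total paid amount divided by the number of distinct paying users.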
Optionally, the second analysis model includes a path model, and the second analysis subsystem is further configured to:
receiving analysis rules configured by service personnel and containing specific participation events and corresponding screening conditions, target events and path conversion periods;
screening user behavior logs corresponding to users completing the target event based on the input user behavior logs by utilizing the path model;
further screening user behavior logs of the participation events meeting the corresponding screening conditions in a path conversion period from the screened user behavior logs by using the path model, and analyzing different user behavior paths based on the screened user behavior logs;
and analyzing the user conversion rate between different user behavior paths by using the path model.
Optionally, the second analysis subsystem is further configured to:
sequencing behavior nodes in different user behavior paths according to a time sequence by using the path model, and removing repeated behavior nodes in the user behavior paths;
counting the number of user behavior paths corresponding to different user behavior paths after removing the repeated behavior nodes by using the path model;
and analyzing the user conversion rate of different user behavior paths based on the counted number of the user behavior paths by using the path model.
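The path steps above (keep users who completed the target event, order nodes by time, remove repeated nodes, count identical paths) can be sketched as follows. Several points are assumptions: the path conversion period is treated as a window ending at the user's first target event, "repeated nodes" is read as consecutive repeats of the same event, and the conversion rate of a path is taken to be its share of all counted paths. Function names are hypothetical.

```python
from collections import Counter, defaultdict


def user_paths(events, target_event, participating, period):
    """Build one de-duplicated, time-ordered path per user who hit the target."""
    by_user = defaultdict(list)
    for user, ts, name in events:
        if name == target_event or name in participating:
            by_user[user].append((ts, name))

    paths = {}
    for user, rows in by_user.items():
        rows.sort()
        target_ts = next((ts for ts, name in rows if name == target_event), None)
        if target_ts is None:
            continue  # only users who completed the target event are kept
        names = [name for ts, name in rows
                 if target_ts - period <= ts <= target_ts]
        names = names[:names.index(target_event) + 1]  # cut at the target
        # Collapse consecutive repeats of the same behavior node.
        path = [n for i, n in enumerate(names) if i == 0 or n != names[i - 1]]
        paths[user] = tuple(path)
    return paths


def path_shares(paths):
    """Count identical paths and return each path's share of all paths."""
    counts = Counter(paths.values())
    total = sum(counts.values())
    return {path: count / total for path, count in counts.items()}


events = [
    ("a", 0, "home"), ("a", 5, "home"), ("a", 10, "search"), ("a", 20, "pay"),
    ("b", 0, "home"), ("b", 10, "search"), ("b", 15, "search"), ("b", 30, "pay"),
    ("c", 0, "home"), ("c", 10, "pay"),
    ("d", 0, "home"), ("d", 10, "search"),  # never pays, so excluded
]
paths = user_paths(events, "pay", {"home", "search"}, period=60)
shares = path_shares(paths)
```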
According to another aspect of the embodiment of the present invention, there is also provided a user behavior analysis method, comprising:
receiving and storing user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters;
analyzing user identifications from user behavior logs stored in a preset time period, correlating the user behavior logs corresponding to different user identifications belonging to the same user, and storing the correlated user behavior logs into a time sequence database;
and inputting the user behavior log in the time sequence database into a first analysis model, and analyzing the user behavior by the first analysis model based on the user behavior log according to the correspondingly configured analysis rule.
Optionally, the different buried point objects include: at least one of a server, a front end H5, an applet, an APP, and a front end advertisement;
the preset buried point parameters comprise: at least one of the different attribute fields and the attribute screening conditions.
Optionally, receiving and storing user behavior logs reported in real time by different embedded point objects based on preset embedded point parameters, including:
receiving user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters;
user behavior logs from different embedded point objects are stored in a publish/subscribe mode.
Optionally, the user identifier includes a unique identifier generated by user registration and a necessary identifier generated by different embedded point objects, and the associating the user behavior logs corresponding to different user identifiers belonging to the same user and storing the user behavior logs in a time sequence database includes:
establishing association between the unique identification and the necessary identification of the user analyzed in the same user behavior log, and combining the unique identification and the necessary identification to generate a unified identification;
and after associating the user behavior logs corresponding to the unique identification and the necessary identifications for which the association has been established, storing the associated user behavior logs into a time sequence database through a database management tool, wherein the associated user behavior logs correspond to the unified identification.
Optionally, the first analysis model includes a funnel model, and the first analysis model analyzes the user behavior according to the analysis rule configured correspondingly based on the user behavior log, including:
receiving analysis rules which are configured by service personnel for the funnel model and comprise funnel conversion periods, funnel steps, funnel events corresponding to each step and funnel event screening conditions;
and analyzing the user behavior log according to a corresponding analysis rule by utilizing the funnel model through a funnel function of the time sequence database so as to analyze and obtain the user conversion rate of funnel events corresponding to different funnel steps executed by a user in a funnel conversion period.
Optionally, the first analysis model includes a retention model, and the first analysis model analyzes the user behavior according to the analysis rule configured correspondingly based on the user behavior log, including:
receiving analysis rules which are configured by service personnel for the retention model and comprise retention periods, initial behavior events, follow-up behavior events and screening conditions corresponding to the events;
and analyzing the user behavior log according to a corresponding analysis rule by utilizing the retention model through a built-in retention function of the time sequence database so as to obtain the user retention rate of the follow-up event continuously executed after the user executes the initial event in the retention period.
According to still another aspect of the embodiment of the present invention, there is further provided a user behavior analysis method, including:
receiving user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters;
classifying the user behavior logs reported by the different embedded point objects in real time according to the different embedded point objects, and storing the classified user behavior logs into a time sequence database;
and inputting the user behavior log in the time sequence database into a second analysis model, and analyzing the user behavior by the second analysis model based on the user behavior log according to the correspondingly configured analysis rule.
Optionally, the different buried point objects include: at least one of a server, a front end H5, an applet, an APP, and a front end advertisement;
the preset buried point parameters comprise: at least one of the different attribute fields and the attribute screening conditions.
Optionally, classifying the user behavior logs reported by the different embedded point objects in real time according to the different embedded point objects, and storing the classified user behavior logs in a time sequence database, including:
classifying the user behavior logs reported by the different embedded point objects in real time according to the different embedded point objects;
and storing the classified user behavior logs into a time sequence database through a database management tool by adopting a publish/subscribe mode.
Optionally, the second analysis model includes an event analysis model, and the second analysis model analyzes the user behavior according to the analysis rule configured correspondingly based on the user behavior log, including:
receiving analysis rules configured by service personnel and comprising at least two specific events capable of forming a combined event, indexes and screening conditions corresponding to each specific event, combined indexes of the combined event and operation modes of the combined indexes;
respectively carrying out aggregation calculation on at least two specific events meeting the corresponding screening conditions based on the user behavior log by utilizing the event analysis model to obtain a calculation result meeting the corresponding specific event index;
and calculating the calculation result conforming to the corresponding specific event index by using the event analysis model according to the calculation mode of the combination index to obtain the result conforming to the combination index.
Optionally, the second analysis model includes a path model, and the second analysis model analyzes the user behavior according to the analysis rule configured correspondingly based on the user behavior log, including:
receiving analysis rules configured by service personnel and containing specific participation events and corresponding screening conditions, target events and path conversion periods;
screening user behavior logs corresponding to users completing the target event based on the input user behavior logs by utilizing the path model;
further screening user behavior logs of the participation events meeting the corresponding screening conditions in a path conversion period from the screened user behavior logs by using the path model, and analyzing different user behavior paths based on the screened user behavior logs;
and analyzing the user conversion rate between different user behavior paths by using the path model.
Optionally, the analyzing the user conversion rate between different user behavior paths by using the path model includes:
sequencing behavior nodes in different user behavior paths according to a time sequence by using the path model, and removing repeated behavior nodes in the user behavior paths;
counting the number of user behavior paths corresponding to different user behavior paths after removing the repeated behavior nodes by using the path model;
and analyzing the user conversion rate of different user behavior paths based on the counted number of the user behavior paths by using the path model.
According to still another aspect of the embodiment of the present invention, there is further provided an integrated user behavior analysis system, including: the user behavior analysis system of any of the embodiments above.
According to yet another aspect of embodiments of the present invention, there is also provided a computer readable storage medium storing computer program code which, when run on a computing device, causes the computing device to perform the user behavior analysis method of any of the embodiments above.
According to yet another aspect of an embodiment of the present invention, there is also provided a computing device including: a processor; a memory storing computer program code; the computer program code, when executed by the processor, causes the computing device to perform the user behavior analysis method of any of the embodiments above.
In the user behavior analysis system provided by the embodiment of the invention, the log server receives user behavior logs reported in real time by different embedded point objects based on preset embedded point parameters and stores them in the distributed file system, after which the offline user behavior logs stored in the distributed file system can be analyzed. By linking the different user identifications of the same user, the user behavior logs corresponding to those identifications are associated, so that the user behavior can be effectively restored and analyzed conveniently and comprehensively. Further, storing the user behavior logs in the time sequence database also helps to improve the efficiency with which the first analysis model analyzes user behavior data.
The foregoing is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the description, and so that the above and other objects, features and advantages of the invention become more readily apparent, specific embodiments of the invention are set forth below.
The above, as well as additional objectives, advantages, and features of the present invention will become apparent to those skilled in the art from the following detailed description of a specific embodiment of the present invention when read in conjunction with the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
The invention may be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic diagram showing a user behavior analysis system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram showing a user behavior analysis system according to another embodiment of the present invention;
FIG. 3 is a schematic diagram showing the structure of a user behavior analysis system according to still another embodiment of the present invention;
FIG. 4 is a schematic diagram showing the structure of a user behavior analysis system according to still another embodiment of the present invention;
FIG. 5 is a schematic diagram showing the structure of an integrated user behavior analysis system according to an embodiment of the present invention;
FIG. 6 is a flow chart of a user behavior analysis method according to an embodiment of the invention;
FIG. 7 is a flow chart of a user behavior analysis method according to another embodiment of the present invention.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless it is specifically stated otherwise.
Meanwhile, it should be understood that, for convenience of description, the sizes of the respective parts shown in the drawings are not drawn to scale.
The following description of at least one exemplary embodiment is merely exemplary in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, the techniques, methods, and apparatus should be considered part of the specification.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
Embodiments of the invention are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the computer system/server include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, small computer systems, mainframe computer systems, and distributed cloud computing technology environments that include any of the foregoing, and the like.
A computer system/server may be described in the general context of computer-system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc., that perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment in which tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computing system storage media including memory storage devices.
To solve the above technical problems, an embodiment of the present invention provides a user behavior analysis system; FIG. 1 shows a schematic structural diagram of a user behavior analysis system according to an embodiment of the present invention. Referring to FIG. 1, the user behavior analysis system includes a log server log_server, a distributed file system HDFS (Hadoop Distributed File System), a time sequence database ClickHouse, and a first analysis subsystem.
The log server is used for receiving user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters and storing the user behavior logs into the distributed file system.
The distributed file system is used for analyzing the user identification from the user behavior logs stored in the preset time period, and storing the user behavior logs corresponding to different user identifications belonging to the same user in the time sequence database after being associated.
The user behavior log stored in the preset time period of the embodiment is an offline log. In the embodiment of the invention, the user behavior logs corresponding to different user identifications belonging to the same user are associated, namely, the user behavior logs from different embedded point objects are integrated.
The first analysis subsystem is used for inputting the user behavior log in the time sequence database into the first analysis model, and the first analysis model analyzes the user behavior according to the configured analysis rule based on the user behavior log.
According to the embodiment of the invention, the user behavior logs stored in the distributed file system are analyzed (that is, the offline logs are analyzed), and by linking the different user identifiers of the same user, the user behavior logs corresponding to those identifiers can be associated, so that the user behavior can be effectively reconstructed and comprehensively analyzed. Further, the analysis model is usually configured and viewed by service personnel, and the time-series database has characteristics such as those of a DBMS (database management system), efficient data compression, multi-core parallel processing, multi-server distributed processing, and a vectorized engine, so the calculation of a complex analysis model involving a plurality of events can generally be completed within 10 s. Storing the user behavior logs in the time-series database therefore also helps improve the efficiency with which the first analysis model analyzes the user behavior data.
According to the embodiment of the invention, the embedded points are performed on different embedded point objects in advance, and the log server can receive user behavior logs reported by the different embedded point objects in real time. Different embedded objects may include a server and a front end, the front end including at least one of front end H5, applet, APP, front end advertisement, and the like.
When burying points on different buried point objects, corresponding buried point parameters need to be preset. The preset buried point parameters in the embodiment of the invention can comprise at least one of different attribute fields and attribute screening conditions. The different attribute fields may include a buried point name field, a user identification field, a device identification field, a timestamp field, a buried point page name, and the like. When user behavior generated on a buried point object meets the attribute screening conditions, the buried point object is triggered to report the corresponding user behavior log, and the screening conditions can be used as dimensions for describing object features. The attribute screening conditions can be divided by type into common attributes and extended attributes. The common attributes are mainly used for default buried point data collection and are usually integrated in a big data SDK (software development kit). The extended attributes are mainly used for custom buried point data collection of specific behaviors.
In the embodiment of the present invention, the buried point parameters of different buried point types are maintained in different metadata tables. For example, referring to Table 1, the front-end (e.g. H5, applet, APP, front-end advertisement) custom buried point parameters are maintained in the table custom_ext_list_2 and can be identified by biz, app_terminal, and event. The server-side buried point parameters are maintained in the table server_parameter_list_2 and can be identified by biz and event. The front-end custom buried point parameters and the server-side buried point parameters are cleaned once per hour. The preset buried point parameters are maintained in the table dim_preset_ext, can be identified by biz, app_terminal, and operation, and need manual maintenance. The preset buried point parameters are the default buried points of different buried point objects, such as the attribute screening conditions of the common attributes.
TABLE 1
Data table type | Table name | Storage database | Purpose
Attribute metadata | custom_ext_list_2 | MySQL | Stores front-end extension parameters
Attribute metadata | server_parameter_list_2 | MySQL | Stores server-side extension parameters
Attribute metadata | dim_preset_ext | MySQL | Stores preset buried point extension parameters
Referring to fig. 2, in an alternative embodiment of the present invention, the user behavior analysis system shown in fig. 1 further includes a distributed publish-subscribe message system kafka and a log collection system flume.
The log server is also used for publishing the user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters to the distributed publishing and subscribing message system.
The distributed publish-subscribe messaging system is used to receive user behavior logs from different embedded objects published by the log server.
The log collection system is used for subscribing the user behavior log from the distributed publishing and subscribing message system and storing the subscribed user behavior log to the distributed file system.
In an alternative embodiment of the present invention, the distributed publish-subscribe message system kafka of the embodiment of the present invention includes a front-end publish-subscribe message system kafka_log and a server-end publish-subscribe message system kafka_biz. The server-side publishing and subscribing message system is mainly used for receiving user behavior logs from the server-side published by the log server. The front-end publish-subscribe messaging system is for receiving user behavior logs from the front-end published by the log server. In addition, the log collection system of this embodiment may subscribe to the user behavior log from the distributed publish-subscribe message system through log service service_log.
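The split between the two message systems can be illustrated with a minimal sketch. It does not use a real Kafka client: the `publish` function only groups logs by topic, and the `source` field and its values are hypothetical names introduced for illustration. Only the topic names kafka_log and kafka_biz come from the description above.

```python
from collections import defaultdict

# Topic names follow the description; the source labels are assumptions.
FRONT_END_SOURCES = {"h5", "applet", "app", "ad"}
FRONT_END_TOPIC = "kafka_log"
SERVER_TOPIC = "kafka_biz"


def topic_for(log):
    """Pick the topic for one user behavior log from its buried point object."""
    source = log["source"]
    if source in FRONT_END_SOURCES:
        return FRONT_END_TOPIC
    if source == "server":
        return SERVER_TOPIC
    raise ValueError(f"unknown buried point object: {source!r}")


def publish(logs):
    """Group logs by topic; a stand-in for a real message producer."""
    topics = defaultdict(list)
    for log in logs:
        topics[topic_for(log)].append(log)
    return dict(topics)
```

A downstream collector would then read each topic separately, which keeps front-end and server-side logs apart until they are written to storage.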
In an alternative embodiment of the invention, the user identification includes a unique identifier generated by user registration and a necessary identifier generated by different buried point objects. The unique identifier may be generated when the user registers an account, such as when the user registers a water drop funding account. When a user accesses water drop funding through different media (H5 pages, applets, APP, and the like), a necessary identifier is also generated, such as an open_id generated by the applet and a device_id generated by the APP. When the user is logged in to the registered account, the unique identifier can be obtained from the collected user behavior logs. When the user is not logged in and accesses the related pages through different media, the unique identifier cannot be obtained from the collected user behavior logs and only the necessary identifier is available, so the association between different user behavior logs cannot be effectively established through the unique identifier.
The distributed file system of the embodiment of the invention first links the user identifiers of the same user and then associates the user behavior logs of that user. If the user logs in to the registered account and accesses the relevant pages through different media, both the unique identifier and the necessary identifier can be obtained from the generated user behavior log. Therefore, the embodiment of the invention first establishes an association between the unique identifier and the necessary identifier parsed from the same user behavior log, and combines them to generate a unified identifier. The user behavior logs corresponding to the associated unique identifier and necessary identifier are then associated with one another and stored in the time-series database through the database management tool Waterdrop, with the associated user behavior logs corresponding to the unified identifier.
The user identifier linking process and the user behavior log association process are described below, taking the contents shown in Table 2 as an example. Referring to Table 2, the log table sdm_user_action_d stores key fields of the user behavior logs from the front end H5, the log table sdm_service_track_d stores key fields of the user behavior logs from the server side, the log table sdm_crt_d stores key fields of the user behavior logs from the front-end advertisement, the log table sdm_weapp_action_d stores key fields of the user behavior logs from the applet, and the log table sdm_app_traffic_d stores key fields of the user behavior logs from the APP. The key fields here contain the unique identifier and the necessary identifier of the user.
Taking the user behavior log from the front end H5 as an example, if the unique identifier user_id and the necessary identifier self_tag of the user are obtained from that log, a unified identifier user_tag can be generated from user_id and self_tag and used as the user identifier of the corresponding user. Further, the user behavior logs corresponding to user_id and self_tag are associated, a correspondence is established between the associated logs and the user identifier user_tag, and the result is stored in the time-series database through the database management tool. It should be noted that the unified identifier generated for any user is a universally unique identifier (uuid).
TABLE 2
Log table | User identification | Cleaning logic | ck unified identification
sdm_user_action_d | user_tag | user_id+self_tag | uuid
sdm_service_track_d | user_id | user_id | uuid
sdm_crt_d | user_tag | user_id+self_tag | uuid
sdm_weapp_action_d | open_tag | open_id+self_tag | uuid
sdm_app_traffic_d | user_tag | user_id+device_id | uuid
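One way to picture the identifier linking in Table 2 is the sketch below. It is not the method of the embodiment: it swaps in a union-find structure so that identifiers seen together in one log end up in the same group, and it then assigns one uuid per group. The class name `IdentityLinker` and the `(kind, value)` tuple layout are hypothetical.

```python
import uuid


class IdentityLinker:
    """Group identifiers that appear in the same log and give each group a uuid."""

    def __init__(self):
        self._parent = {}
        self._unified = {}

    def _find(self, ident):
        # Register unseen identifiers as their own group, then walk to the root.
        self._parent.setdefault(ident, ident)
        while self._parent[ident] != ident:
            self._parent[ident] = self._parent[self._parent[ident]]
            ident = self._parent[ident]
        return ident

    def link(self, *idents):
        """Record that all given identifiers were parsed from one log."""
        present = [i for i in idents if i is not None]
        for other in present[1:]:
            a, b = self._find(present[0]), self._find(other)
            if a != b:
                self._parent[b] = a
                # Keep an already issued uuid when two groups merge.
                kept = self._unified.pop(b, None)
                if kept is not None:
                    self._unified.setdefault(a, kept)

    def unified_id(self, ident):
        """Return the uuid shared by every identifier linked to this one."""
        root = self._find(ident)
        if root not in self._unified:
            self._unified[root] = str(uuid.uuid4())
        return self._unified[root]
```

For example, after `link(("user_id", "u1"), ("self_tag", "s1"))` and `link(("user_id", "u1"), ("device_id", "d9"))`, the H5 and APP logs of that user resolve to the same unified identifier.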
In an embodiment of the present invention, the log tables corresponding to different embedded point objects may be mapped into one log table, that is, key fields in the user behavior log from different embedded point objects are mapped into one log table, so as to facilitate the management of the log table.
In an embodiment of the present invention, the first analysis model includes a funnel model, which can analyze the user behavior state and the user conversion rate from the start point to the end point according to the user behavior log.
When the user behavior logs in the time-series database are input into the funnel model to analyze user behavior, the first analysis subsystem first receives the analysis rules configured by service personnel for the funnel model, which comprise the funnel conversion period, the funnel steps, the funnel event corresponding to each step, and the screening conditions of the funnel events. The funnel model then analyzes the user behavior logs according to these analysis rules, using the funnel function windowFunnel built into the time-series database, to obtain the user conversion rate of the funnel events corresponding to the different funnel steps executed by users within the funnel conversion period.
In the funnel function windowFunnel(window)(Timestamp, cond1, cond2, cond3, ...), window is the conversion period, Timestamp is the timestamp, and cond1, cond2, cond3 are the funnel steps of the conversion funnel. The timestamp is the total number of seconds elapsed from 00:00:00 on January 1, 1970 Greenwich Mean Time (08:00:00 on January 1, 1970 Beijing time) to the present, and uniquely identifies a moment in time. Timestamp supports the date and datetime data types, which are accurate to the second; the datetime data type is backed by Uint32, so Timestamp does not exceed 2^31 - 1 = 2147483647 (10 digits). For example, the timestamp of the date 2019-08-21 12:53:00 converted to milliseconds is approximately 1566319953000, which far exceeds 2^31 - 1. However, other commonly used databases keep time to millisecond precision, so the date and datetime data types cannot meet the precision requirement.
To improve the precision of the data analyzed by the funnel function, in an embodiment of the present invention the starting time for calculating the timestamp can be changed from 1970-01-01 08:00:00 to 2019-01-01 08:00:00. A millisecond-precision timestamp in 2019 then has 11 digits, and its first 10 digits are kept so that it fits in Uint32. In this way the embodiment of the present invention raises the time precision to 0.01 s, and the result is very close to the one calculated in milliseconds, with an agreement exceeding 99%.
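The epoch shift can be checked with a short calculation. The sketch below is an illustration under stated assumptions, not the embodiment's code: it takes the new starting time as 2019-01-01 08:00:00 Beijing time, drops the last digit of the millisecond count by integer division by 10, and checks the result against the UInt32 maximum of 2**32 - 1. The function name `compact_timestamp` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

BEIJING = timezone(timedelta(hours=8))
# Shifted starting time described in the text (assumed to be Beijing time).
NEW_EPOCH = datetime(2019, 1, 1, 8, 0, 0, tzinfo=BEIJING)
UINT32_MAX = 2**32 - 1


def compact_timestamp(moment):
    """Return hundredths of a second elapsed since NEW_EPOCH."""
    millis = (moment - NEW_EPOCH) // timedelta(milliseconds=1)
    centis = millis // 10  # dropping the last digit leaves 0.01 s resolution
    if not 0 <= centis <= UINT32_MAX:
        raise ValueError("timestamp does not fit in UInt32")
    return centis


example = datetime(2019, 8, 21, 12, 53, 0, tzinfo=BEIJING)
print(compact_timestamp(example))  # 2006238000
```

For the example date, the millisecond count since the new starting time is 20062380000 (11 digits), and the stored value 2006238000 has 10 digits.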
The configuration process of the funnel model is described in detail below in a specific embodiment. The configuration process of the funnel model may comprise at least steps 1.1 to 1.5.
Step 1.1, defining a funnel name according to the service requirement analyzed by service personnel. For example, the funnel name is defined as "conversion funnel".
Step 1.2, configuring a funnel conversion period. The funnel conversion period here is the conversion time between adjacent funnel steps.
Step 1.3, creating funnel steps, selecting funnel events for each funnel step. For example, the funnel creating step comprises four steps, and funnel events selected by each step are crowd funding page access events, crowd funding donation events, insurance page access events and insurance order events respectively. Since the individual events of a single step are not sufficient for funnel analysis, the funnel step typically comprises at least two steps.
In this embodiment, before the funnel event is selected, a service line and a buried object type may also be selected, for example, the selected service line is a water drop service line, the buried object type is a service end buried point, and the configured user behavior log corresponding to the funnel event is from the service end water drop service line.
Step 1.4, respectively configuring screening conditions for the funnel events corresponding to the steps. For example, the screening condition configured for the crowd funding page access event is that the user stays on the crowd funding page for more than 1 minute. For another example, the screening condition configured for the crowd funding donation event is that the amount donated by the user is greater than 10 yuan.
Step 1.5, defining events for the funnel events of each funnel step. Here, the event definition refers to naming each funnel event separately, for example, the four funnel events are named as "crowd fund page access", "crowd fund donation", "insurance page access", "insurance order" in sequence.
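Steps 1.1 to 1.5 can be pictured as a small configuration object evaluated against per-user logs. The sketch below is a simplified pure-Python imitation of a window funnel and not the database function itself: it assumes the window is measured from the first-step event, and the event names, the `ts`, `duration_s`, and `amount` fields, and the `FUNNEL` layout are all hypothetical.

```python
def window_funnel(window_seconds, events, steps):
    """Return how many consecutive funnel steps one user completed in the window."""
    ordered = sorted(events, key=lambda e: e["ts"])
    best = 0
    for i, first in enumerate(ordered):
        if not steps[0](first):
            continue
        level = 1
        for event in ordered[i + 1:]:
            if event["ts"] - first["ts"] > window_seconds:
                break  # later events fall outside the conversion period
            if level < len(steps) and steps[level](event):
                level += 1
        best = max(best, level)
    return best


# Hypothetical two-step funnel with the screening conditions from step 1.4.
FUNNEL = {
    "name": "conversion funnel",
    "window_seconds": 3600,
    "steps": [
        lambda e: e["event"] == "crowdfunding_page_view" and e.get("duration_s", 0) > 60,
        lambda e: e["event"] == "crowdfunding_donation" and e.get("amount", 0) > 10,
    ],
}


def users_per_step(funnel, logs_by_user):
    """Count the users who reached each funnel step."""
    reached = [0] * len(funnel["steps"])
    for events in logs_by_user.values():
        level = window_funnel(funnel["window_seconds"], events, funnel["steps"])
        for step in range(level):
            reached[step] += 1
    return reached
```

The conversion rate of a step is then the count for that step divided by the count for the first step.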
In an alternative embodiment of the present invention, if a funnel step is added according to an actual service requirement, the funnel step may be further added on the basis of the funnel step configured previously, and a funnel event is selected for the added funnel step, and a corresponding screening condition is configured and stored, where the added funnel step is not specifically limited.
In another alternative embodiment of the present invention, the service personnel may further configure the grouped view dimension and the occurrence time range of the funnel event corresponding to the first funnel step. For example, business personnel configure the view dimension as a new user dimension or an old user dimension, so that they can view the analysis results of the funnel model from the different dimensions. For another example, the occurrence time range of the crowd funding page access event in the first funnel step is from 11 pm to 30 pm, so that the funnel model screens the user behavior logs corresponding to crowd funding page access events that occur within that time range. Of course, other rules may be configured for the funnel model, which is not particularly limited in the embodiments of the present invention.
After the rules are configured for the funnel model, the configured rules are saved. After service personnel trigger a query operation, the user behavior analysis result produced by the funnel model according to the configured rules can be queried directly. In an embodiment of the present invention, if the funnel model cannot find the user behavior logs of a funnel event during actual user behavior analysis, the earlier buried point layout may be insufficient, so buried points may be added to the different buried point objects.
In another embodiment of the present invention, the first analysis model may include a retention model. The retention model can be used to analyze user participation and activity by examining how many of the users who performed an initial behavior go on to perform a subsequent behavior, which serves as an important index for measuring the value of the product to users.
When the user behavior logs in the time-series database are input into the retention model to analyze user behavior, the first analysis subsystem first receives the analysis rules configured by service personnel, which comprise the retention period, the initial behavior event, the subsequent behavior event, and the screening conditions corresponding to each event. The retention model then analyzes the user behavior logs according to these analysis rules, using the retention function retention(cond1, cond2, ...) built into the time-series database, to obtain the user retention rate, that is, the share of users who execute the initial event and go on to execute the subsequent event within the retention period. In the retention function, cond1 represents the initial behavior and cond2 represents the subsequent behavior; a plurality of other subsequent behaviors may also be set, which is not limited in this embodiment.
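The behavior of a retention-style function can be imitated in a few lines. The sketch below is a simplified illustration rather than the database function: for one user it reports whether the initial condition was met and, for each subsequent condition, whether both it and the initial condition were met. The retention period is assumed to be encoded inside the conditions, and the `day` and `event` fields are hypothetical.

```python
def retention(events, conditions):
    """Return one 0/1 flag per condition for a single user's events."""
    met = [any(cond(e) for e in events) for cond in conditions]
    # A subsequent condition only counts when the initial one was also met.
    return [int(met[0])] + [int(met[0] and m) for m in met[1:]]


def retention_rate(logs_by_user, initial, follow_up):
    """Share of users with the initial behavior who also show the follow-up."""
    flags = [retention(events, [initial, follow_up]) for events in logs_by_user.values()]
    started = sum(f[0] for f in flags)
    retained = sum(f[1] for f in flags)
    return retained / started if started else 0.0
```

With an initial condition such as "registered on day 0" and a follow-up such as "logged in on day 1 to 7", `retention_rate` gives the retention rate over a seven-day retention period.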
The following describes the configuration process of the retention model in detail in a specific embodiment. The configuration process of the retention model at least comprises the steps 2.1 to 2.4.
Step 2.1, defining a retention name according to the service requirement analyzed by the service personnel. For example, the defined retention name is "water drop funding new user retention".
Step 2.2, configuring a retention period. The retention period may represent the time that elapses between the user's initial behavior and the subsequent behavior.
Step 2.3, creating an initial behavior and a subsequent behavior, and selecting an initial event and a subsequent event for them respectively, that is, selecting a specific event for each of the initial behavior and the subsequent behavior.
For example, the initial behavior created is a registration behavior, and the specific event selected for it is a "register water drop funding account event"; the subsequent behavior created is a login behavior, and the specific event selected for it is a "log in to water drop funding client event".
In an embodiment of the present invention, before a specific event is selected, a service line and a buried point object type may also be selected, for example, the selected service line is a water drop service line, the buried point object type is a service end buried point, and the configured user behavior log corresponding to the specific event is from the service end water drop service line.
Step 2.4, respectively configuring screening conditions for the specific event corresponding to the initial behavior and the specific event corresponding to the subsequent behavior. For example, the screening condition configured for the "register water drop funding account event" is that the registering user is over 40 years old. For another example, the screening condition configured for the "log in to water drop funding client event" is that the user logs in to the account for the second time, which is not particularly limited in the embodiment of the present invention.
In an alternative embodiment of the present invention, the service personnel may also configure the grouped view dimension and the occurrence time range of the initial event. For example, business personnel configure the view dimension as a new user dimension or an old user dimension, so that they can view the analysis results of the retention model from the different dimensions. For another example, the occurrence time range of the initial event is configured to be 7 to 8 am, so that the retention model screens the user behavior logs of initial events that occur between 7 and 8 am. Of course, other rules may be configured for the retention model, which is not specifically limited in the embodiments of the present invention.
The configuration rules are saved after the configuration rules are configured for the retention model. After the business personnel trigger the inquiry operation, the user behavior analysis result of the retention model according to the configuration rule analysis can be directly inquired. In an embodiment of the present invention, if the retention model cannot effectively find the user behavior logs of the initial event and the subsequent event in the actual user behavior analysis process, it may be that the previous buried point layout is insufficient, so that buried points may be added to different buried point objects.
In an optional embodiment, a visual display interface can be provided for the user, the business personnel can conduct rule configuration on different analysis models through the visual display interface, and the query results can be displayed on the visual display interface, so that the configuration efficiency of the business personnel can be improved, and the business personnel can conveniently check the analysis results in time.
The embodiment of the invention abstracts the data language into business meanings that are easier to understand, and after receiving the rules selected and configured by business personnel, generates SQL (Structured Query Language) for querying in the background so as to query the analysis result of the user behavior. This helps improve the autonomous analysis capability of business personnel, supports product and operations staff in evaluating functions, and allows conversion rates to be analyzed under any rules the business personnel configure. It further improves the analysis efficiency of business analysts, reduces the analysis cost, relieves the data extraction pressure, and lets analysts concentrate on the analysis itself.
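How configured rules might be turned into a background query can be sketched as follows. This is an assumption-laden illustration, not the generated SQL of the embodiment: the function name `build_funnel_sql`, the table name, and the column names are hypothetical, and the condition strings are taken as trusted input produced by the configuration interface rather than by end users.

```python
def build_funnel_sql(table, user_column, time_column, window_seconds, step_conditions):
    """Assemble a funnel query from configured steps (inputs are trusted strings)."""
    if len(step_conditions) < 2:
        raise ValueError("a funnel needs at least two steps")
    conditions = ", ".join(step_conditions)
    return (
        "SELECT level, count() AS users FROM ("
        f"SELECT {user_column}, "
        f"windowFunnel({int(window_seconds)})({time_column}, {conditions}) AS level "
        f"FROM {table} GROUP BY {user_column}"
        ") GROUP BY level ORDER BY level"
    )


sql = build_funnel_sql(
    table="user_behavior_log",  # hypothetical table name
    user_column="user_tag",
    time_column="ts",
    window_seconds=3600,
    step_conditions=["event = 'page_view'", "event = 'donate'"],
)
print(sql)
```

The inner query computes the deepest funnel level per user, and the outer query counts users per level, from which step-to-step conversion rates follow.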
In addition, the embodiment of the invention can provide output interface enabling service besides analyzing the user behavior, and can identify high-value users by continuously training the user behavior data and matching with machine learning and AI technology.
Based on the same inventive concept, the embodiment of the invention also provides another user behavior analysis system, referring to fig. 3, the user behavior analysis system comprises a log server log_server, a time sequence database clickhouse and a second analysis subsystem.
The log server is used for receiving user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters, classifying the user behavior logs reported by the different embedded point objects in real time according to the different embedded point objects, and storing the classified user behavior logs into the time sequence database.
And the second analysis subsystem is used for inputting the user behavior log in the time sequence database into a second analysis model, and analyzing the user behavior by the second analysis model based on the user behavior log according to the correspondingly configured analysis rule.
According to the embodiment of the invention, the user behavior logs reported by different buried point objects in real time are analyzed, so that technical research and development personnel can quickly verify, from whether the user behavior analysis succeeds, whether the buried point collection is successful, and can also check whether the buried point parameters are reported abnormally. Furthermore, the time-series database has characteristics such as being a true column-oriented DBMS, efficient data compression, multi-core parallel processing, multi-server distributed processing, and a vectorized engine, and can generally complete the calculation of a complex analysis model involving a plurality of events within 10 s. Storing the user behavior logs in the time-series database therefore further improves the efficiency with which the second analysis model analyzes the user behavior data.
In the embodiment of the invention, when the log server classifies the user behavior logs reported by different buried point objects in real time according to the buried point objects and stores them in the time-series database, the user behavior logs from different buried point objects can be stored in different log tables created in advance in the time-series database. For example, referring to Table 3 below, 5 log tables are created in the time-series database: the log table ods_user_action_d_cluster stores H5 real-time logs, the log table ods_service_track_d_cluster stores server-side real-time logs, the log table ods_weapp_action_d_cluster stores applet real-time logs, the log table ods_app_traffic_d_cluster stores APP real-time logs, and the log table ods_crt_d_cluster stores advertisement real-time logs.
TABLE 3
In the embodiment of the invention, different embedded point objects can comprise a service end and a front end, wherein the front end comprises a front end H5, an applet, an APP, a front end advertisement and the like. The preset buried point parameters comprise at least one of different attribute fields and attribute screening conditions. The attribute fields may include a buried point name field, a user identification field, a device identification field, a timestamp field, a buried point page name, and the like. For a specific description of the predetermined buried point parameters, reference is made to the above embodiments.
Referring to FIG. 4, in an alternative embodiment of the present invention, the user behavior analysis system shown in FIG. 3 further comprises a distributed publish-subscribe message system kafka.
The log server is also used for publishing the user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters to the distributed publishing and subscribing message system.
The distributed publish-subscribe messaging system is used for receiving the user behavior logs from different buried point objects published by the log server. The time-series database subscribes to the user behavior logs from the distributed publish-subscribe messaging system and stores them.
In an alternative embodiment of the present invention, the distributed publish-subscribe message system in the embodiment of the present invention includes a server-side publish-subscribe message system and a front-end publish-subscribe message system. The server-side publish-subscribe message system kafka_biz is mainly used for receiving the user behavior logs from the server side published by the log server. The front-end publish-subscribe message system kafka_log is used for receiving the user behavior logs from the front end published by the log server.
In one embodiment of the invention, the second analysis model comprises an event analysis model. The event analysis model may analyze user behavior through queries that apply index statistics, attribute grouping, condition screening, and similar functions to events.
After the real-time user behavior log in the time sequence database is input into the event analysis model, the event analysis model can firstly receive analysis rules which are configured by service personnel and contain at least two specific events capable of forming the combined event, indexes and screening conditions corresponding to the specific events, combined indexes of the combined event and operation modes of the combined indexes. Of course, other rules may be configured for the event analysis model, which is not specifically limited in the embodiments of the present invention.
Then, the event analysis model analyzes the user behavior log according to the analysis rule. Specifically, first, the event analysis model respectively performs aggregation calculation on at least two specific events meeting the corresponding screening conditions to obtain a calculation result meeting the corresponding specific event index. The manner of the aggregation calculation here includes count, sum, avg, max, min and the like. And then, calculating the result conforming to the corresponding specific event index according to the operation mode of the combination index to obtain the result conforming to the combination index of the combination event. The operation method includes an operation method such as addition, subtraction, multiplication, and division.
The configuration process of the event analysis model is described in detail below in a specific embodiment. The configuration process for the event analysis model may include at least steps 3.1 to 3.4.
Step 3.1, selecting an event according to the business requirement analyzed by business personnel, and selecting a specific event to be analyzed based on the selected event. For example, the event "donation amount event" is selected, and the specific event "order event" is selected.
In this embodiment, before a specific event is selected, a service line and a buried point object type may also be selected, for example, the selected service line is a water drop service line, the buried point object type is an APP end buried point, and the configured user behavior log corresponding to the specific event is from the water drop service line at the APP end.
Step 3.2, configuring screening conditions and indexes for the selected specific event.
For example, the screening condition configured for the order event is "the order amount exceeds 10 yuan", and the configured index is "the number of people", that is, the number of people whose order amount exceeds 10 yuan needs to be analyzed.
Step 3.3, creating a combined index and selecting the operation mode of the combined index. For example, a combined index named "order rate" is created, and its operation mode is division.
Step 3.4, selecting another specific event, configuring corresponding screening conditions and indexes for it, and combining it with the previously selected specific event to obtain a combined event that meets the combined index.
For example, the selection of other specific events is "user access event", the screening condition configured for it is "user access time is greater than 3 minutes", and the configured index is "number of people", that is, the number of people who need to analyze user access time is greater than 3 minutes. The corresponding order rate can be calculated by dividing the number of people with the order amount exceeding 10 yuan by the number of people with the user access time greater than 3 minutes.
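The order rate example in steps 3.1 to 3.4 can be worked through in code. The sketch below is illustrative only: the log layout with `user`, `event`, `amount`, and `duration_s` fields and the function names are hypothetical, the "number of people" index is taken to mean distinct users, and division by zero is reported as `None`.

```python
import operator

# Operation modes for the combined index named in the text.
OPERATIONS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
}


def count_users(logs, event, predicate):
    """'Number of people' index: distinct users with a matching event."""
    return len({log["user"] for log in logs if log["event"] == event and predicate(log)})


def combined_index(first, second, operation):
    """Combine two event indexes; returns None when dividing by zero."""
    if operation == "divide" and second == 0:
        return None
    return OPERATIONS[operation](first, second)


logs = [
    {"user": "u1", "event": "order", "amount": 20},
    {"user": "u1", "event": "order", "amount": 50},
    {"user": "u2", "event": "order", "amount": 15},
    {"user": "u3", "event": "order", "amount": 5},
    {"user": "u1", "event": "visit", "duration_s": 200},
    {"user": "u2", "event": "visit", "duration_s": 400},
    {"user": "u3", "event": "visit", "duration_s": 190},
    {"user": "u4", "event": "visit", "duration_s": 600},
    {"user": "u5", "event": "visit", "duration_s": 30},
]

orders = count_users(logs, "order", lambda log: log["amount"] > 10)
visits = count_users(logs, "visit", lambda log: log["duration_s"] > 180)
print(combined_index(orders, visits, "divide"))  # 0.5
```

Here two users placed orders above 10 yuan and four users stayed longer than 3 minutes, so the order rate is 0.5.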
In an optional embodiment of the present invention, the service personnel may further add other indexes, and in the embodiment of the present invention, the number of the added indexes is not specifically limited, but the number of the added indexes is not more than 5 in general, so as to ensure that the event analysis model effectively analyzes the user behavior.
In an alternative embodiment of the present invention, the business person may also configure the grouped view dimension and the occurrence time range of the specific event. For example, the business person configures the view dimension as an order channel dimension, such as channel A and channel B, so as to view the analysis results of the event analysis model from the different dimensions. For another example, the occurrence time range of the specific event is configured to be 8 to 10 pm, so that the event analysis model screens the user behavior logs corresponding to specific events that occur between 8 and 10 pm.
After the event analysis model is configured with the rules, the configuration rules are saved. After the business personnel trigger the inquiry operation, the user behavior analysis result of the event analysis model according to the configuration rule can be directly inquired. In an embodiment of the present invention, if the event analysis model cannot effectively find the user behavior log of the specific participating event in the actual user behavior analysis process, it may be that the previous buried point layout is insufficient, so buried points may be added to different buried point objects.
In an embodiment of the present invention, the second analysis model may include a path model, which may analyze a path distribution of a user when using a product.
After the real-time user behavior log in the time sequence database is input into the path model, the path model firstly receives analysis rules which are configured by service personnel and comprise specific participation events, corresponding screening conditions, target events and path conversion periods. Of course, other rules may be configured for the path model, which is not specifically limited in the embodiments of the present invention.
Then, the path model screens, from the input user behavior logs, the logs of the users who completed the target event. From these logs it further screens the logs in which a participation event satisfying the corresponding screening conditions was completed within the path conversion period, and derives the different user behavior paths from the screened logs.
And finally, analyzing the user conversion rate among different user behavior paths by using a path model.
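The two screening passes described above can be sketched as follows. This is an illustration under stated assumptions, not the path model's actual code: the row fields (`user_id`, `event`, `ts`) are hypothetical, the conversion period is measured backwards from each user's first target event, and per-event screening conditions are omitted for brevity.

```python
from datetime import datetime, timedelta


def screen_paths(logs, target, participation, period):
    """Keep, per user, the participation events that occur within `period` before
    that user's first target event; users who never reach the target are dropped."""
    by_user = {}
    for row in sorted(logs, key=lambda r: r["ts"]):
        by_user.setdefault(row["user_id"], []).append(row)

    paths = {}
    for user, rows in by_user.items():
        target_ts = next((r["ts"] for r in rows if r["event"] == target), None)
        if target_ts is None:
            continue  # first pass: the user did not complete the target event
        # Second pass: participation events inside the path conversion period.
        paths[user] = [
            r["event"]
            for r in rows
            if r["event"] in participation and target_ts - period <= r["ts"] <= target_ts
        ]
    return paths


t0 = datetime(2020, 3, 11, 10, 0)
sample = [
    {"user_id": 1, "event": "browse", "ts": t0},
    {"user_id": 1, "event": "click", "ts": t0 + timedelta(minutes=5)},
    {"user_id": 1, "event": "order", "ts": t0 + timedelta(minutes=10)},
    {"user_id": 2, "event": "browse", "ts": t0},
    {"user_id": 3, "event": "click", "ts": t0 - timedelta(hours=2)},
    {"user_id": 3, "event": "order", "ts": t0},
]
paths = screen_paths(sample, "order", {"browse", "click"}, timedelta(hours=1))
```

In this sample, user 2 is dropped for never ordering, and user 3 keeps an empty path because the click happened outside the one-hour period.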
In an embodiment of the present invention, each node in the user behavior path has a correspondingly defined name. For an explicitly specified participation event, the name of the node may be determined from the name of the participation event. For data requiring analysis of all branches of a participation event, the name of the node may be determined as the participation event name plus the sub-attribute. The path model of the embodiment of the invention can screen the participation event through the following codes:
In an alternative embodiment of the present invention, the specific process by which the path model analyzes the user conversion rate includes the following:
First, the path model sorts the behavior nodes in the different user behavior paths in chronological order and removes the repeated behavior nodes in the user behavior paths.
For example, ordering behavior nodes in the user behavior paths corresponding to different user identifications 32269239, 71880126, 62848335, respectively, results in behavior nodes in table 4. In an alternative embodiment of the invention, the arraySort function may be used for behavior node ordering.
Table 4
User identification    Behavior node
32269239    ['contribute','contribute','case_order','case_order','contribute','contribute']
71880126    ['contribute','case_order','case_order','paySuccess','paySuccess','case_order']
62848335    ['message_send','message_send','message_send']
Because operations such as refreshing, repeating an operation, and turning pages within the same page are essentially the same user behavior and contribute little to the analysis of user behavior, repeated behaviors can be de-duplicated to obtain the real user behavior path. When the repeated behavior nodes in a user behavior path are removed, each behavior node can be numbered first, and the behavior nodes whose numbers are consecutive and equal are then removed, yielding the de-duplicated user behavior path.
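A minimal sketch of this de-duplication, applied to the three sequences of Table 4, is given below. It uses Python's `itertools.groupby` to collapse runs of adjacent equal nodes, which is a substitute for the numbering step described above and reaches the same kind of result; the function and variable names are assumptions.

```python
from itertools import groupby


def dedupe_consecutive(nodes):
    """Collapse each run of identical adjacent behavior nodes into a single node."""
    return [node for node, _run in groupby(nodes)]


# Behavior node sequences taken from Table 4.
table_4 = {
    32269239: ['contribute', 'contribute', 'case_order', 'case_order', 'contribute', 'contribute'],
    71880126: ['contribute', 'case_order', 'case_order', 'paySuccess', 'paySuccess', 'case_order'],
    62848335: ['message_send', 'message_send', 'message_send'],
}

deduped = {uid: dedupe_consecutive(path) for uid, path in table_4.items()}
# deduped[32269239] is ['contribute', 'case_order', 'contribute']
```

Only adjacent repeats are removed, so a node that reappears later in the path (such as the second 'contribute' above) is kept as a genuine step.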
For example, after numbering the behavior nodes in the user behavior paths corresponding to the different user identifications 32269239, 71880126, 153718105, the results after removing the behavior nodes corresponding to the consecutive and equal numbers are shown in table 5.
TABLE 5
Then, the number of users corresponding to each distinct user behavior path is counted after the repeated behavior nodes are removed.
For example, referring to table 6, the counted number of people for the user behavior path 'message_send', 'contribute', 'message_send' is 1690. The number of people for the user behavior path 'contribute', 'payNewStyle', 'click', 'case_order', 'contribute' is 1343. The number of people for the user behavior path 'contribute', 'case_order', 'paySuccess', 'message_send' is 1306.
Table 6
User path    Number of people
['message_send','contribute','message_send']    1690
['contribute','payNewStyle','click','case_order','contribute']    1343
['contribute','case_order','paySuccess','message_send']    1306
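Counting users per path, as in Table 6, can be sketched with a `Counter` keyed by the de-duplicated path. The user keys and the three sample paths below are invented for illustration and do not reproduce the figures of Table 6.

```python
from collections import Counter


def count_paths(deduped_paths):
    """Map each distinct path (stored as a tuple) to the number of users who followed it."""
    return Counter(tuple(path) for path in deduped_paths.values())


# Hypothetical de-duplicated paths, one per user.
deduped_paths = {
    "u1": ["message_send", "contribute", "message_send"],
    "u2": ["contribute", "case_order", "paySuccess", "message_send"],
    "u3": ["message_send", "contribute", "message_send"],
}

path_counts = count_paths(deduped_paths)
top_path, top_users = path_counts.most_common(1)[0]
```

Paths are converted to tuples because lists are not hashable and cannot serve as dictionary keys.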
And finally, analyzing the user conversion rate of different user behavior paths based on the counted number of the user behavior paths.
For example, if 100 users pass through node A, 80 users go from node A to node B, and 10 users go from node A through node B to node C, then the conversion rate of path AB is 80% and the conversion rate of path ABC is 10%.
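The arithmetic of this example can be written out as a short sketch: each path's user count is divided by the number of users at the starting node. The dictionary layout and the function name are assumptions.

```python
def conversion_rates(reach_counts, start):
    """Conversion rate of each path relative to the number of users at the start node."""
    base = reach_counts[start]
    return {path: n / base for path, n in reach_counts.items() if path != start}


# Counts from the example: 100 users at A, 80 on A->B, 10 on A->B->C.
reach = {("A",): 100, ("A", "B"): 80, ("A", "B", "C"): 10}
rates = conversion_rates(reach, ("A",))
```

Here `rates[("A", "B")]` corresponds to the 80% figure and `rates[("A", "B", "C")]` to the 10% figure; both are measured against node A rather than against the preceding step.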
In order to embody the configuration process of the path model, a detailed description will be given below with reference to a specific embodiment. The configuration process for the path model may include at least steps 4.1 to 4.4.
Step 4.1, defining a path name according to the service requirement to be analyzed by the service personnel. For example, the path name is defined as "access to order path".
Step 4.2, selecting the number of participation events and the corresponding specific participation events, and configuring corresponding screening conditions for the specific participation events. For example, 2 participation events are selected, the specific participation events are a "click element event" and a "browse page event", the screening condition of the click element event is "the number of times of clicking element", and the screening condition of the browse page event is "the browse page time is greater than 50s".
In this embodiment, before a specific participation event is selected, a service line and a buried point object type may also be selected. For example, if the selected service line is the water drop service line and the buried point object types are H5-end and APP-end buried points, the user behavior logs corresponding to the configured specific participation events come from the water drop service line at the H5 end and the APP end.
Step 4.3, selecting a target event and configuring corresponding screening conditions for the target event. For example, the selected target event is "order event", and the corresponding screening condition is "order amount is greater than 10 yuan".
Step 4.4, selecting a path conversion period. The path conversion period is the time between the user's participation event and the completion of the target event.
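One way to picture the rule produced by steps 4.1 to 4.4 is as a small immutable record. The class and field names below are hypothetical, the screening conditions are kept as display text only, and the one-day conversion period is an assumed value, since the embodiment does not state one.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class EventRule:
    event: str
    condition: str  # screening condition, stored as display text in this sketch


@dataclass(frozen=True)
class PathRule:
    name: str                     # step 4.1: path name
    participation: tuple          # step 4.2: participation events and conditions
    target: EventRule             # step 4.3: target event and condition
    conversion_period: timedelta  # step 4.4: path conversion period


rule = PathRule(
    name="access to order path",
    participation=(
        EventRule("click element event", "the number of times of clicking element"),
        EventRule("browse page event", "the browse page time is greater than 50s"),
    ),
    target=EventRule("order event", "order amount is greater than 10 yuan"),
    conversion_period=timedelta(days=1),  # assumed value for illustration
)
```

Freezing the dataclasses means a saved rule cannot be modified by accident between configuration and query.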
In an alternative embodiment of the present invention, the service personnel may also select user identifications and the occurrence time range of the target event; neither the number of selected user identifications nor the occurrence time range of the target event is specifically limited. After the rules are configured for the path model, the configuration rules are saved. After the business personnel trigger a query operation, the user behavior analysis result produced by the path model according to the configuration rules can be queried directly.
In an embodiment of the present invention, if the path model cannot effectively find the user behavior log of the specific participation event in the actual user behavior analysis process, it may be that the previous buried point layout is insufficient, so that it is also necessary to add buried points on different buried point objects.
In an optional embodiment, a visual display interface can be provided for the user, and the service personnel can conduct rule configuration on different analysis models through the visual display interface and display query results on the visual display interface, so that the configuration efficiency of the service personnel can be improved, and the service personnel can conveniently and timely check the analysis results.
The embodiment of the invention abstracts the data language into business meanings that are easier to understand and, after receiving the rules selected and configured by the business personnel, generates SQL (Structured Query Language) in the background to query the analysis result of the user behavior. This helps improve the autonomous analysis capability of the business personnel, supports function evaluation by product and operations staff, and supports conversion rate analysis under any rules configured by the business personnel. It further improves the analysis efficiency of business analysts, reduces the analysis cost, and relieves the pressure of data extraction so that analysts can concentrate on the analysis itself.
In addition to analyzing user behavior, the embodiment of the invention can provide an output interface as an enabling service, and can identify high-value users by continuously training on user behavior data in combination with machine learning and AI technology.
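How a configured rule might be turned into a background SQL query can be sketched as below. Everything specific here is an assumption: the table name `user_behavior_log`, the column names, the allow-lists, and the `%(name)s` placeholder style, which only some database drivers accept. Identifiers are checked against allow-lists and values are passed as parameters rather than pasted into the SQL text.

```python
TABLE = "user_behavior_log"                 # hypothetical table name
ALLOWED_COLUMNS = {"amount", "duration_s"}  # hypothetical filterable columns
ALLOWED_OPS = {">", ">=", "<", "<=", "="}


def build_count_query(event, column, op, value):
    """Build a 'number of people' query for one event and one screening condition."""
    if column not in ALLOWED_COLUMNS or op not in ALLOWED_OPS:
        raise ValueError(f"unsupported filter: {column!r} {op!r}")
    sql = (
        f"SELECT count(DISTINCT user_id) FROM {TABLE} "
        f"WHERE event = %(event)s AND {column} {op} %(value)s"
    )
    return sql, {"event": event, "value": value}


# Rule: order event, order amount greater than 10.
sql, params = build_count_query("order", "amount", ">", 10)
```

Rejecting unknown columns and operators keeps text typed in the configuration interface from reaching the query as raw SQL.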
Based on the same inventive concept, the embodiment of the present invention further provides an integrated user behavior analysis system, which may include the user behavior analysis system including the first analysis model and the user behavior analysis system including the second analysis model in the above embodiments. Referring to fig. 5, the integrated user behavior analysis system of the embodiment of the present invention includes a log server log_server, a distributed file system HDFS (Hadoop Distributed File System), a time sequence database clickhouse, a first analysis subsystem and a second analysis subsystem. To save server and database resources, the log server log_server and the time sequence database clickhouse in this embodiment simultaneously implement the functions of the log server log_server and the time sequence database clickhouse in both user behavior analysis systems of the above embodiments.
Therefore, the embodiment of the invention comprises the event analysis model, the funnel model, the retention model and the path model, and by combining the embodiment, the funnel model and the retention model can analyze the user behaviors according to the offline logs, and the event analysis model and the path model can analyze the user behaviors according to the real-time logs, so that the comprehensive user behavior analysis system can analyze the offline logs and the real-time logs in parallel according to different analysis scenes by utilizing the corresponding analysis models, and further analyze the user behaviors from multiple angles.
Based on the same inventive concept, the embodiment of the invention further provides a user behavior analysis method, referring to fig. 6, where the user behavior analysis method at least includes steps S602 to S606.
Step S602, receiving and storing user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters.
Step S604, analyzing the user identification from the user behavior logs stored in the preset time period, and storing the user behavior logs corresponding to different user identifications belonging to the same user in a time sequence database after associating the user behavior logs.
Step S606, the user behavior log in the time sequence database is input into a first analysis model, and the first analysis model analyzes the user behavior based on the user behavior log according to the correspondingly configured analysis rules.
In one embodiment of the present invention, the different buried objects include: at least one of a server, a front end H5, an applet, an APP, and a front end advertisement; the preset buried point parameters comprise: at least one of the different attribute fields and the attribute screening conditions.
Referring to step S602 above, in an alternative embodiment of the present invention, when receiving and storing user behavior logs reported in real time by different embedded point objects based on preset embedded point parameters, the user behavior logs reported in real time by different embedded point objects based on preset embedded point parameters may be received and stored by adopting a publish/subscribe mode.
Referring to step S604 above, in an alternative embodiment of the present invention, the user identification includes a unique identification generated by user registration and a necessary identification generated by a different buried point object. When the user behavior logs corresponding to different user identifications belonging to the same user are associated and stored in the time sequence database, the unique identification and the necessary identification parsed from the same user behavior log can first be associated and combined to generate a unified identification. Then, the user behavior logs corresponding to the associated unique identification and necessary identification are associated with one another and stored into the time sequence database through a database management tool, wherein the associated user behavior logs correspond to the unified identification.
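The association of identifications can be illustrated with a union-find sketch: every pair of identifications seen in the same log is merged into one group, and each group's root serves as the unified identification. Union-find is a technique chosen here for illustration, not one named in the embodiment, and the identifier strings are invented.

```python
def unify_ids(pairs):
    """Merge (unique_id, necessary_id) pairs seen in the same log and return a
    mapping from every identification to the root of its group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for unique_id, necessary_id in pairs:
        root_a, root_b = find(unique_id), find(necessary_id)
        if root_a != root_b:
            parent[root_b] = root_a  # the registration-side root is kept
    return {x: find(x) for x in list(parent)}


# Hypothetical identifications parsed from three logs.
pairs = [
    ("uid:1001", "device:abc"),
    ("uid:1001", "openid:xyz"),
    ("uid:2002", "device:def"),
]
unified = unify_ids(pairs)
```

Logs carrying `device:abc` and logs carrying `openid:xyz` then map to the same unified identification and can be stored as one user's behavior.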
Referring to step S606, in an embodiment of the present invention, the first analysis model includes a funnel model. When the first analysis model analyzes the user behavior based on the user behavior log according to the correspondingly configured analysis rules, the analysis rules configured by the business personnel for the funnel model are first received; these rules include a funnel conversion period, funnel steps, and the funnel event and funnel event screening conditions corresponding to each step. Then, the funnel model analyzes the user behavior log according to the corresponding analysis rules through the built-in funnel function of the time sequence database, so as to obtain the user conversion rate of the funnel events corresponding to the different funnel steps executed by users within the funnel conversion period.
Referring to step S606, in another embodiment of the present invention, the first analysis model includes a retention model. When the first analysis model analyzes the user behavior based on the user behavior log according to the correspondingly configured analysis rules, the analysis rules configured by the service personnel for the retention model are first received; these rules include a retention period, an initial behavior event, a subsequent behavior event, and the screening conditions corresponding to each event. Then, the retention model analyzes the user behavior log according to the corresponding analysis rules through the built-in retention function of the time sequence database, so as to obtain the user retention rate, that is, the proportion of users who go on to execute the subsequent behavior event within the retention period after executing the initial behavior event.
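The idea behind the funnel analysis can be sketched in plain Python. ClickHouse, the time sequence database named in this document, offers a `windowFunnel` aggregate function for this kind of query; the sketch below does not call it and only imitates the idea under simplifying assumptions: timestamps are plain seconds, screening conditions are omitted, and the event names are invented.

```python
def funnel_level(events, steps, window):
    """Deepest funnel step reached by one user. `events` is a time-sorted list of
    (timestamp, name); every later step must occur within `window` of step 1."""
    best = 0
    for i, (start_ts, name) in enumerate(events):
        if name != steps[0]:
            continue
        level = 1
        for ts, ev in events[i + 1:]:
            if ts - start_ts > window:
                break
            if level < len(steps) and ev == steps[level]:
                level += 1
        best = max(best, level)
    return best


def funnel_conversion(events_by_user, steps, window):
    """Share of step-1 users who reach each funnel step."""
    levels = [funnel_level(sorted(ev), steps, window) for ev in events_by_user.values()]
    reached = [sum(1 for lv in levels if lv >= k) for k in range(1, len(steps) + 1)]
    return [n / reached[0] if reached[0] else 0.0 for n in reached]


users = {
    "u1": [(0, "view"), (60, "cart"), (120, "pay")],
    "u2": [(0, "view"), (30, "cart")],
    "u3": [(0, "view"), (5000, "cart")],  # second step falls outside the window
    "u4": [(0, "cart")],                  # never enters the funnel
}
conversion = funnel_conversion(users, ["view", "cart", "pay"], window=3600)
```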
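A retention rate of this kind can be sketched as the share of users who perform the subsequent behavior event within the retention period after their first initial behavior event. The sketch is independent of any database function; timestamps are expressed in days, and the event names and sample data are assumptions.

```python
def retention_rate(events_by_user, initial, follow_up, period):
    """Fraction of users with an initial event who also perform the follow-up
    event within `period` after their first initial event."""
    started = retained = 0
    for events in events_by_user.values():
        starts = [ts for ts, name in events if name == initial]
        if not starts:
            continue  # the user never performed the initial behavior event
        started += 1
        first = min(starts)
        if any(name == follow_up and first < ts <= first + period for ts, name in events):
            retained += 1
    return retained / started if started else 0.0


# Hypothetical events as (day, name) pairs.
users = {
    "u1": [(0, "register"), (1, "login")],
    "u2": [(0, "register"), (10, "login")],  # follow-up falls outside the period
    "u3": [(2, "login")],                     # no initial event
}
rate = retention_rate(users, "register", "login", period=7)
```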
Based on the same inventive concept, the embodiment of the present invention further provides another method for analyzing user behavior, referring to fig. 7, where the method for analyzing user behavior at least includes steps S702 to S706.
Step S702, receiving user behavior logs reported in real time by different embedded point objects based on preset embedded point parameters.
Step S704, classifying the user behavior logs reported by the different embedded point objects in real time according to the different embedded point objects, and storing the classified user behavior logs into a time sequence database.
Step S706, the user behavior log in the time sequence database is input to a second analysis model, and the second analysis model analyzes the user behavior based on the user behavior log according to the correspondingly configured analysis rule.
In one embodiment of the present invention, the different buried objects include: at least one of a server, a front end H5, an applet, an APP, and a front end advertisement; the preset buried point parameters comprise: at least one of the different attribute fields and the attribute screening conditions.
Referring to step S704, in an embodiment of the present invention, when the user behavior logs reported in real time by different buried point objects are classified and stored in the time sequence database, the logs may first be classified according to the different buried point objects. The classified user behavior logs are then stored into the time sequence database through a database management tool in a publish/subscribe mode.
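The classification step can be pictured as grouping incoming logs by the buried point object that reported them, with each group standing in for one topic of a publish/subscribe system. The `source` field, the fallback label `unknown` and the sample rows are assumptions for illustration.

```python
from collections import defaultdict


def classify_by_source(logs):
    """Group log rows by the buried point object that reported them."""
    topics = defaultdict(list)
    for row in logs:
        topics[row.get("source", "unknown")].append(row)
    return dict(topics)


# Hypothetical log rows from different buried point objects.
logs = [
    {"source": "H5", "event": "click"},
    {"source": "APP", "event": "browse"},
    {"source": "H5", "event": "order"},
    {"event": "ping"},  # row without a source field
]
topics = classify_by_source(logs)
```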
Referring to step S706 above, in an embodiment of the present invention, the second analysis model includes an event analysis model, and when the second analysis model analyzes the user behavior according to the analysis rules configured correspondingly based on the user behavior log, the second analysis model receives the analysis rules configured by the service personnel and including at least two specific events capable of forming the combined event, the index and the screening condition corresponding to each specific event, the combined index of the combined event, and the operation mode of the combined index. And then, respectively carrying out aggregation calculation on at least two specific events meeting the corresponding screening conditions based on the user behavior log by utilizing an event analysis model to obtain a calculation result meeting the corresponding specific event index. And finally, calculating the calculation result conforming to the corresponding specific event index by using the event analysis model according to the calculation mode of the combination index to obtain the result conforming to the combination index.
Referring to step S706 above, in another embodiment of the present invention, the second analysis model includes a path model, and when the second analysis model analyzes the user behavior according to the analysis rules configured correspondingly based on the user behavior log, firstly, the analysis rules configured by the service personnel and including specific participation events, corresponding screening conditions, target events, and path conversion periods are received. And then, screening the user behavior log corresponding to the user completing the target event based on the input user behavior log by using the path model. And further screening user behavior logs of the participation events meeting the corresponding screening conditions from the screened user behavior logs by using a path model, and analyzing different user behavior paths based on the screened user behavior logs. And finally, analyzing the user conversion rate among different user behavior paths by using a path model.
In an alternative embodiment of the present invention, the process of analyzing the user conversion rate between different user behavior paths using the path model includes: firstly, sequencing behavior nodes in different user behavior paths according to a time sequence by using a path model, and removing repeated behavior nodes in the user behavior paths; then, using a path model to count the number of user behavior paths corresponding to different user behavior paths after the repeated behavior nodes are removed; and finally, analyzing the user conversion rate of different user behavior paths based on the counted number of the user behavior paths by using the path model.
Based on the same inventive concept, embodiments of the present invention also provide a computer readable storage medium storing computer program code which, when run on a computing device, causes the computing device to perform the user behavior analysis method in any of the above embodiments.
Based on the same inventive concept, an embodiment of the present invention further provides a computing device, including: a processor; a memory storing computer program code; the computer program code, when executed by a processor, causes the computing device to perform the user behavior analysis method of any of the embodiments above.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. The system embodiments are described relatively briefly because they essentially correspond to the method embodiments, and reference should be made to the description of the method embodiments for the relevant points.
The method and system of the present invention may be implemented in a number of ways. For example, the methods and systems of the present invention may be implemented by software, hardware, firmware, or any combination of software, hardware, firmware. The above-described sequence of steps for the method is for illustration only, and the steps of the method of the present invention are not limited to the sequence specifically described above unless specifically stated otherwise. Furthermore, in some embodiments, the present invention may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present invention. Thus, the present invention also covers a recording medium storing a program for executing the method according to the present invention.
The description of the present invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.

Claims (10)

1. A user behavior analysis system, comprising:
the log server is used for receiving user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters;
the log server is further configured to classify the user behavior logs reported by the different embedded point objects in real time according to the different embedded point objects, and store the classified user behavior logs into a time sequence database;
and the second analysis subsystem is used for inputting the user behavior log in the time sequence database into a second analysis model, and analyzing the user behavior by the second analysis model based on the user behavior log according to the correspondingly configured analysis rule.
2. The system of claim 1, wherein
the different buried point objects include: at least one of a server, a front end H5, an applet, an APP, and a front end advertisement;
the preset buried point parameters comprise: at least one of the different attribute fields and the attribute screening conditions.
3. The system of claim 1 or 2, further comprising a distributed publish-subscribe messaging system,
the log server is also used for publishing the user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters to the distributed publishing and subscribing message system;
the distributed publishing and subscribing message system is used for receiving user behavior logs from different embedded point objects published by the log server;
the time sequence database is further used for storing the user behavior log subscribed from the distributed publishing and subscribing message system.
4. The system of claim 1 or 2, wherein the second analysis model comprises an event analysis model, the second analysis subsystem further configured to:
receiving analysis rules configured by service personnel and comprising at least two specific events capable of forming a combined event, indexes and screening conditions corresponding to each specific event, combined indexes of the combined event and operation modes of the combined indexes;
Respectively carrying out aggregation calculation on at least two specific events meeting the corresponding screening conditions based on the user behavior log by utilizing the event analysis model to obtain a calculation result meeting the corresponding specific event index;
and calculating the calculation result conforming to the corresponding specific event index by using the event analysis model according to the calculation mode of the combination index to obtain the result conforming to the combination index.
5. The system of claim 1 or 2, wherein the second analysis model comprises a path model, the second analysis subsystem further configured to:
receiving analysis rules configured by service personnel and containing specific participation events and corresponding screening conditions, target events and path conversion periods;
screening user behavior logs corresponding to users completing the target event based on the input user behavior logs by utilizing the path model;
further screening user behavior logs of the participation events meeting the corresponding screening conditions in a path conversion period from the screened user behavior logs by using the path model, and analyzing different user behavior paths based on the screened user behavior logs;
and analyzing the user conversion rate between different user behavior paths by using the path model.
6. The system of claim 5, wherein the second analysis subsystem is further configured to:
sequencing behavior nodes in different user behavior paths according to a time sequence by using the path model, and removing repeated behavior nodes in the user behavior paths;
counting the number of user behavior paths corresponding to different user behavior paths after removing the repeated behavior nodes by using the path model;
and analyzing the user conversion rate of different user behavior paths based on the counted number of the user behavior paths by using the path model.
7. A method of user behavior analysis, comprising:
receiving user behavior logs reported by different embedded point objects in real time based on preset embedded point parameters;
classifying the user behavior logs reported by the different embedded point objects in real time according to the different embedded point objects, and storing the classified user behavior logs into a time sequence database;
and inputting the user behavior log in the time sequence database into a second analysis model, and analyzing the user behavior by the second analysis model based on the user behavior log according to the correspondingly configured analysis rule.
8. The method of claim 7, wherein
the different buried point objects include: at least one of a server, a front end H5, an applet, an APP, and a front end advertisement;
the preset buried point parameters comprise: at least one of the different attribute fields and the attribute screening conditions.
9. A computer readable storage medium storing computer program code which, when run on a computing device, causes the computing device to perform the user behavior analysis method of claim 7 or 8.
10. A computing device, comprising: a processor; a memory storing computer program code; the computer program code, when executed by the processor, causes the computing device to perform the user behavior analysis method of claim 7 or 8.
CN202311040804.XA 2020-03-11 2020-03-11 User behavior analysis system, method, storage medium and computing device Pending CN117149597A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311040804.XA CN117149597A (en) 2020-03-11 2020-03-11 User behavior analysis system, method, storage medium and computing device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202311040804.XA CN117149597A (en) 2020-03-11 2020-03-11 User behavior analysis system, method, storage medium and computing device
CN202010166628.4A CN111488261A (en) 2020-03-11 2020-03-11 User behavior analysis system, method, storage medium and computing device

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202010166628.4A Division CN111488261A (en) 2020-03-11 2020-03-11 User behavior analysis system, method, storage medium and computing device

Publications (1)

Publication Number Publication Date
CN117149597A true CN117149597A (en) 2023-12-01

Family

ID=71794277

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202311040804.XA Pending CN117149597A (en) 2020-03-11 2020-03-11 User behavior analysis system, method, storage medium and computing device
CN202010166628.4A Pending CN111488261A (en) 2020-03-11 2020-03-11 User behavior analysis system, method, storage medium and computing device

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202010166628.4A Pending CN111488261A (en) 2020-03-11 2020-03-11 User behavior analysis system, method, storage medium and computing device

Country Status (1)

Country Link
CN (2) CN117149597A (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112231272B (en) * 2020-09-30 2021-06-29 深圳市神州通在线科技有限公司 Information processing method based on remote online office and computer equipment
CN112422445A (en) * 2020-10-10 2021-02-26 四川新网银行股份有限公司 Kafka-based real-time acquisition, calculation and storage method for buried point data
CN112561565A (en) * 2020-11-27 2021-03-26 四川新网银行股份有限公司 User demand identification method based on behavior log
CN112486789A (en) * 2020-11-30 2021-03-12 建信金融科技有限责任公司 Log analysis system, method and device
CN112527558A (en) * 2020-12-08 2021-03-19 广东小天才科技有限公司 Method, system and terminal equipment for analyzing crash of subsystem
CN114741412B (en) * 2021-01-07 2024-04-16 厦门美柚股份有限公司 User behavior self-help analysis system
CN113360817B (en) * 2021-01-26 2023-10-24 上海喜马拉雅科技有限公司 User operation analysis method, device, server and storage medium
CN113010484A (en) * 2021-03-12 2021-06-22 维沃移动通信有限公司 Log file management method and device
CN113176980B (en) * 2021-05-25 2023-09-12 医声医事(北京)科技有限公司 Dynamic construction method and system of flow hopper
CN113282608A (en) * 2021-06-10 2021-08-20 湖南力唯中天科技发展有限公司 Intelligent traffic data analysis and storage method based on column database
CN113609188A (en) * 2021-07-23 2021-11-05 恩亿科(北京)数据科技有限公司 Funnel analysis method, system, device and medium for user behavior steps
CN113642047A (en) * 2021-08-13 2021-11-12 上海哔哩哔哩科技有限公司 Buried point data verification method and system
CN114168430A (en) * 2022-01-06 2022-03-11 携程旅游网络技术(上海)有限公司 Front-end abnormal alarm configuration method, system, equipment and storage medium
CN114820079B (en) * 2022-05-20 2023-04-18 百度在线网络技术(北京)有限公司 Crowd determination method, device, equipment and medium
CN115054925B (en) * 2022-06-29 2023-06-09 上海益世界信息技术集团有限公司 Method, device, server and storage medium for determining lost user
CN115277409A (en) * 2022-07-20 2022-11-01 杭州米络星科技(集团)有限公司 Method and device for collecting and reporting buried point data in real time, acquisition system and terminal
CN116541262B (en) * 2023-07-07 2024-03-01 腾讯科技(深圳)有限公司 Data processing method, device, equipment and readable storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107515915B (en) * 2017-08-18 2020-02-18 晶赞广告(上海)有限公司 User identification association method based on user behavior data
CN109800225A (en) * 2018-12-24 2019-05-24 北京奇艺世纪科技有限公司 Method, device, server and computer-readable storage medium for acquiring operational indicators
CN110347906A (en) * 2019-05-20 2019-10-18 拉扎斯网络科技(上海)有限公司 User behavior display method and device, electronic equipment and storage medium
CN110675194A (en) * 2019-09-29 2020-01-10 北京思维造物信息科技股份有限公司 Funnel analysis method, device, equipment and readable medium
CN110674231A (en) * 2019-10-09 2020-01-10 上海智子信息科技股份有限公司 Data lake-oriented user ID integration method and system

Also Published As

Publication number Publication date
CN111488261A (en) 2020-08-04

Similar Documents

Publication Publication Date Title
CN117149597A (en) User behavior analysis system, method, storage medium and computing device
CN108416620B (en) Portrait data intelligent social advertisement putting platform based on big data
US10452625B2 (en) Data lineage analysis
CN111581054B (en) Log embedded point service analysis alarm system and method based on ELK
CN109615129B (en) Real estate customer transaction probability prediction method, server and computer storage medium
EP3000029B1 (en) Apparatus and method for pipelined event processing in a distributed environment
US20120130940A1 (en) Real-time analytics of streaming data
CN109254901B (en) Index monitoring method and system
CN107247811B (en) SQL statement performance optimization method and device based on Oracle database
CN106933906B (en) Data multi-dimensional query method and device
CN113064866A (en) Power business data integration system
US20140337274A1 (en) System and method for analyzing big data in a network environment
CN108694448A (en) PHM platforms
CN111310052A (en) User portrait construction method and device and computer readable storage medium
CN113468019A (en) Hbase-based index monitoring method, device, equipment and storage medium
CN111598700A (en) Financial wind control system and method
CN114116872A (en) Data processing method and device, electronic equipment and computer readable storage medium
CN113553341A (en) Multidimensional data analysis method, multidimensional data analysis device, multidimensional data analysis equipment and computer readable storage medium
CN103268329B (en) Plasma panel manufacturing process data digging system
CN112631889A (en) Portrayal method, device and equipment for application system and readable storage medium
WO2020010531A1 (en) Fault detection method and device
CN116302867A (en) Behavior data analysis method, apparatus, computer device, medium, and program product
CN113986656A (en) Power grid data safety monitoring system based on data center
CN111930815A (en) Method and system for constructing enterprise portrait based on industry attribute and business attribute
CN117057425B (en) Rule-type knowledge analysis method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination