CN114238360A - User behavior analysis system - Google Patents

Publication number
CN114238360A
Authority
CN
China
Prior art keywords
user
behavior
field
session
event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111602485.8A
Other languages
Chinese (zh)
Inventor
郝明
邹武
魏国富
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information and Data Security Solutions Co Ltd
Original Assignee
Information and Data Security Solutions Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information and Data Security Solutions Co Ltd
Priority to CN202111602485.8A
Publication of CN114238360A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2308 Concurrency control
    • G06F16/2315 Optimistic concurrency control
    • G06F16/2322 Optimistic concurrency control using timestamps
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a user behavior analysis system comprising a user behavior session module, a user retention analysis module, a user behavior matching module and a user funnel analysis module. The user behavior session module generates a user behavior session data table; the user retention analysis module generates a user retention rate daily table; the user behavior matching module generates a user behavior matching data table; the user funnel analysis module matches event chains in sequence based on a given sliding window, calculates the number of conversion steps within the window and the number of conversions at each stage, and generates a user funnel conversion analysis table. The invention solves the problems that the prior art is limited by the physical memory of a single machine, is complex to develop, and has difficulty responding in real time.

Description

User behavior analysis system
Technical Field
The invention relates to the field of user behavior analysis, in particular to a user behavior analysis system.
Background
In recent years, with the rapid development of global internet technology, the amount of user behavior log data has grown geometrically. The product form of user behavior analysis software has changed several times along with the underlying technology: traditional database technology is too heavyweight, while big data technology represented by the Hadoop ecosystem has problems with performance, stability and real-time response. Analysis software now places increasingly demanding requirements on the real-time performance of the underlying OLAP technology.
User behavior analysis examines the behavior users generate on a product and the data behind that behavior; by constructing user behavior models and user portraits, it informs product decisions, enables refined operation and guides business growth. One way to implement a user behavior analysis algorithm is to write a dedicated processing program: according to the intent of the algorithm, the program reads the relevant data from a database and processes it within the program framework. Such programs can generally run only on a single node, cannot process data in parallel, are limited by the memory of the node on which they run, and usually cannot handle large amounts of data. Another way is to use a big data tool such as Hive or Spark; for example, the document "the teipeng, big data based user behavior analysis system, the ningbo university institute of information science and engineering, 2021" discloses a user behavior data analysis system designed on a distributed cluster platform architecture using Spark. This big data approach requires writing a series of array processing functions and aggregation functions to make up for the basic algorithm functions that the big data tool lacks. The development is complex and depends on the skill of the developer, so performance and stability vary. In summary, the prior art is limited by single-machine physical memory, is complex to develop, and has difficulty responding in real time.
Disclosure of Invention
The technical problem to be solved by the invention is that user behavior analysis systems in the prior art are limited by the physical memory of a single machine, are complex to develop, and have difficulty achieving real-time response in processing speed.
The invention solves the technical problems through the following technical means: a user behavior analysis system analyzes user behavior based on a native multi-parameter aggregation function and a high-order function in a ClickHouse distributed database, and comprises a user behavior session module, a user retention analysis module, a user behavior matching module and a user funnel analysis module;
the user behavior session module is used for extracting a user identification field, a behavior identification field and a behavior time field from an original access log table, generating a session id, a session start timestamp field, a session end timestamp field and a user behavior chain field, sorting the session data by the user identification field and the session id field, outputting the sorted session data to form an ordered user behavior session data set, and storing the ordered user behavior session data set as a user behavior session data table;
the user retention analysis module is used for calculating the retention rate of the first day and the retention rate within a preset time limit based on a given date parameter and generating a user retention rate daily table;
the user behavior matching module is used for screening out a user data set which accords with a corresponding mode based on a given behavior mode parameter with a sequence requirement or a behavior mode parameter without a sequence requirement, and generating a user behavior matching data table;
and the user funnel analysis module is used for matching the event chains in sequence based on a given sliding window, calculating the number of conversion steps in the window event and the conversion number of each stage, and generating a user funnel conversion analysis table.
The invention analyzes user behavior based on the native multi-parameter aggregation functions and higher-order functions of the ClickHouse distributed database. As a high-performance OLAP distributed database, ClickHouse has clear advantages for real-time computation over massive data and is well suited to serve as the underlying data warehouse of analysis software. The processing is not limited by the memory of a single machine, and there is no need to write a series of array processing functions and aggregation functions to make up for missing basic algorithm functions of big data tools, so development is relatively simple. This solves the problems that the prior art is limited by single-machine physical memory, is complex to develop, and has difficulty responding in real time.
Further, the user behavior session module extracts a user identification field, a behavior identification field, and a behavior time field from the original access log table to generate a session id, including:
extracting a user identification field, a behavior identification field and a behavior time field from an original access log table, wherein the behavior time field is converted into a timestamp format, and the extracted data sets are sorted from small to large according to the numerical values of the behavior time field;
aggregating the data set with GROUP BY on the user identification field, and aggregating the corresponding behavior identification fields and behavior time fields into a behavior identification array field and a behavior timestamp array field, respectively, by using a groupArray function;
calculating the difference between adjacent elements in the behavior timestamp array field by using an arrayDifference function to generate the user behavior interval array field;
comparing each behavior interval in the behavior interval array field with a preset session time threshold by using an arrayMap function, where an interval less than or equal to the threshold indicates behavior records of the same session of a user and an interval greater than the threshold indicates two separate sessions; returning the result array as a new session identification array field, in which an element value of 1 denotes a new session and an element value of 0 denotes the old session, and setting the first element of the new session identification array field to 1;
expanding the behavior identification array field, the behavior timestamp array field and the new session identification array field into rows by using ARRAY JOIN, so that the single-row record of each user is expanded into multiple rows according to the number of array elements, and generating a user identification field, a behavior timestamp field and a new session identification field after expansion;
obtaining the array subscripts of the new session identification array field by using an arrayEnumerate function, expanding the result of the arrayEnumerate function into rows by using an arrayJoin function to generate a session index session_index, cutting the array in each row by using arraySlice(is_new_session_array, 1, session_index), where the three input parameters of arraySlice are the new session identification array field, the cut offset and the cut length respectively, summing the elements of the cut array by using the arraySum function, and taking the sum as the session id.
Further, the user behavior session module generates a start timestamp field of the session, an end timestamp field of the session, and a user behavior chain field, including:
performing a LEFT JOIN between the behavior identification field and the behavior information dimension table, and converting the behavior identification field into the behavior name field;
sorting the data set by the user identification field and the behavior timestamp field, and performing a GROUP BY aggregation on the sorted data set by the user identification field and the session id field;
after aggregation, taking the minimum behavior timestamp in each session as the start timestamp field of the session and the maximum behavior timestamp as the end timestamp field of the session, aggregating the behavior name fields between the start timestamp and the end timestamp into a behavior name array by using the groupArray function, and concatenating the elements of the array by using the arrayStringConcat function to generate the user behavior chain field.
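The aggregation just described can be sketched in Python as follows. This is a minimal illustration only: the row layout, the function name build_sessions and the "->" separator are assumptions, and the code merely emulates what GROUP BY with min, max, groupArray and arrayStringConcat would produce in ClickHouse.

```python
from collections import defaultdict


def build_sessions(rows, separator="->"):
    """rows: (user_id, session_id, action_timestamp, action_name) tuples."""
    grouped = defaultdict(list)
    # Sort by user and timestamp so each chain is built in chronological order.
    for user_id, session_id, ts, name in sorted(rows, key=lambda r: (r[0], r[2])):
        grouped[(user_id, session_id)].append((ts, name))

    table = []
    for (user_id, session_id), events in sorted(grouped.items()):
        table.append({
            "user_id": user_id,
            "session_id": session_id,
            "start_ts": min(ts for ts, _ in events),  # session start timestamp
            "end_ts": max(ts for ts, _ in events),    # session end timestamp
            # groupArray + arrayStringConcat: join the names into a behavior chain
            "chain": separator.join(name for _, name in events),
        })
    return table


# Hypothetical sample rows, deliberately out of order.
rows = [
    ("user1", 1, 160, "search"),
    ("user1", 1, 100, "login"),
    ("user1", 2, 5000, "login"),
    ("user2", 1, 50, "view"),
]
sessions = build_sessions(rows)
for session in sessions:
    print(session)
```

Each output record corresponds to one row of the user behavior session data table: one session per user, with its time span and ordered behavior chain.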
Further, the construction process of the original access log table is as follows: the log data of the system where the analysis object is located is stored in a ClickHouse database, and then an original access log table is generated, wherein the original access log table at least comprises a user identification field, a behavior identification field and a behavior time field.
Further, the user retention analysis module is further configured to:
extracting a user identification field and a behavior time field from an original access log table, and generating a user access date field visit_day by applying the toDate function to the behavior time field;
based on a given date parameter first_day, generating the dates of the 1st, 2nd, ..., Nth day by using the INTERVAL operator of ClickHouse for subsequent calculation, where day 0 is first_day;
setting equals(visit_day, first_day) as the first judgment condition parameter of the retention function by using the equals function, and so on, with the judgment condition parameter for day N being equals(visit_day, first_day + INTERVAL N DAY); the result of the first judgment condition parameter is treated as retained by default, and the results of all calculated judgment condition parameters are stored in the result array of the retention function;
aggregating the result array of the retention function with GROUP BY on the user identification field, and counting the users' visits on each day from day 0 to day N;
the retention rate on day N is calculated as the visit volume on day N divided by the visit volume on day 0; the retention rates from day 0 to day N are calculated in this way and stored as a user retention rate daily table.
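The retention calculation above can be illustrated with a small Python sketch. It assumes the documented behavior of ClickHouse's retention function (the i-th result is 1 only when both the first condition and the i-th condition hold for the user); the function names and the sample visit data are hypothetical.

```python
from datetime import date, timedelta


def retention_flags(visit_days, first_day, n_days):
    """Per-user flags for day 0..N, mirroring retention(cond_0, ..., cond_N)."""
    base = first_day in visit_days  # first condition: visited on first_day
    return [
        int(base and (first_day + timedelta(days=offset)) in visit_days)
        for offset in range(n_days + 1)
    ]


def retention_rates(visits_by_user, first_day, n_days):
    """Day-N rate = users retained on day N / users present on day 0."""
    totals = [0] * (n_days + 1)
    for visit_days in visits_by_user.values():
        for offset, flag in enumerate(retention_flags(visit_days, first_day, n_days)):
            totals[offset] += flag
    if totals[0] == 0:
        return [0.0] * (n_days + 1)
    return [total / totals[0] for total in totals]


# Hypothetical sample: the set of visit dates (visit_day) for each user.
first_day = date(2021, 12, 1)
visits_by_user = {
    "u1": {date(2021, 12, 1), date(2021, 12, 2), date(2021, 12, 3)},
    "u2": {date(2021, 12, 1), date(2021, 12, 3)},
    "u3": {date(2021, 12, 2)},  # not present on day 0, so never counted
    "u4": {date(2021, 12, 1)},
}
rates = retention_rates(visits_by_user, first_day, n_days=2)
print(rates)
```

In this sample three users are present on day 0, one returns on day 1 and two return on day 2, so the rates are 1, 1/3 and 2/3.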
Further, the user behavior matching module is further configured to process user behavior matching with a sequence requirement, where the user behavior matching with a sequence requirement includes:
configuring a user behavior event chain;
configuring a mode character string corresponding to the behavior of a user;
extracting a user identification field, a behavior identification field and a behavior time field from an original access log table, and performing GROUP BY aggregation operation through the user identification field;
after the GROUP BY aggregation operation on the user identification field, the behavior identification fields of each user are sorted in the order of the behavior time field; the sorted result is converted into a mode character string, which is input into the sequenceMatch function to judge whether the user behavior event chain satisfies the input mode, and a user behavior matching result field is returned; the number of user behavior event chains satisfying the input mode is counted, and a user behavior statistical result field is generated;
and generating a user behavior matching data table by combining the results of the user behavior matching result field and the user behavior statistical result field.
Further, the configuring the user behavior event chain comprises: and converting one or more actions of the user into a judgment expression of a single event, judging whether the event meets the judgment expression when the user triggers the event, and triggering the event in the user behavior event chain if the event meets the judgment expression.
Further, the configuring the pattern character string corresponding to the behavior of the user includes:
if the user triggers event A and then event B in sequence, the behavior is represented as (;
if the user triggers event A, followed by some actions of no interest, followed by event B, the behavior is written as (;
if the user triggers event A twice in succession, followed by event B, the behavior is represented as (;
for requirements on the time between events: a B event occurring at least m seconds after an A event is written as (; occurrence of an a event a B event write (; a B event occurring immediately Q seconds after an A event is written as (.
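For orientation, the pattern syntax documented for ClickHouse's sequenceMatch function uses (?N) for the N-th condition, .* for any number of intervening events, and (?t operator value) for a time constraint in seconds between two events. The following Python sketch is a simplified emulation covering only (?N) and .*; the function names, the sample events, the unanchored search and the encoding of at most one condition per event are assumptions made for illustration, and time constraints are not handled.

```python
import re

_TOKEN = re.compile(r"\(\?(\d+)\)|\.\*")


def to_regex(pattern):
    """Translates a pattern made only of (?N) and .* tokens into a Python regex."""
    parts, pos = [], 0
    while pos < len(pattern):
        match = _TOKEN.match(pattern, pos)
        if match is None:
            raise ValueError(f"unsupported token at position {pos}")
        if match.group(1):
            # Condition N is encoded as the N-th capital letter.
            parts.append(chr(ord("A") + int(match.group(1)) - 1))
        else:
            parts.append(".*")
        pos = match.end()
    return "".join(parts)


def sequence_match(pattern, events, conditions):
    """events: (timestamp, action_id) pairs; conditions: predicates on action_id."""
    encoded = []
    for _, action_id in sorted(events):  # chronological order
        for index, condition in enumerate(conditions):
            if condition(action_id):
                encoded.append(chr(ord("A") + index))
                break  # assumption: an event satisfies at most one condition
        # Events that satisfy no condition are skipped entirely.
    return re.search(to_regex(pattern), "".join(encoded)) is not None


# Hypothetical sample: one user's event chain and three event conditions.
events = [(1, "view"), (2, "search"), (3, "pay")]
is_view = lambda action: action == "view"
is_pay = lambda action: action == "pay"
is_search = lambda action: action == "search"

print(sequence_match("(?1)(?2)", events, [is_view, is_pay]))
print(sequence_match("(?1)(?2)", events, [is_view, is_pay, is_search]))
print(sequence_match("(?1).*(?2)", events, [is_view, is_pay, is_search]))
```

The first and second calls differ only in whether "search" is declared as an event: when it is not, it is ignored and "pay" counts as directly following "view"; when it is, the direct-succession pattern no longer matches and the .* form is needed.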
Further, the user behavior matching module is further configured to process unordered user behavior matching, where the unordered user behavior matching includes:
extracting a user identification field, a behavior identification field and a behavior time field from an original access log table to form an initial user behavior data set;
inquiring an initial user behavior data set, inquiring whether each user event needing to be matched occurs, combining user data with the same user event needing to be matched together to form a sub-inquiry data set, wherein each sub-inquiry data set has a unique event index field;
connecting each sub-query data set through UNION ALL to form an event set;
carrying out GROUP BY aggregation operation on the event set according to the user identification to generate a new data set;
the event indexes in the new data set are aggregated and de-duplicated through a groupUniqArray function to form an event index de-duplication aggregation array;
and comparing the event index de-duplication aggregation array with the matching condition by using a hasAll function, screening the event index de-duplication aggregation array meeting the matching condition, and generating a user behavior matching data table.
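The unordered matching steps above can be sketched in Python as follows. This is a minimal illustration under assumptions: the function name, the row layout and the sample data are hypothetical, a Python set stands in for the array produced by groupUniqArray, and the subset test stands in for hasAll.

```python
def users_with_all_events(log_rows, required_actions):
    """log_rows: (user_id, action_id, action_time) tuples."""
    # Each event to be matched gets a unique event index, as in the sub-queries.
    index_of = {action: idx for idx, action in enumerate(required_actions, start=1)}

    seen = {}
    for user_id, action_id, _ in log_rows:
        if action_id in index_of:
            # De-duplicated aggregation of event indexes per user.
            seen.setdefault(user_id, set()).add(index_of[action_id])

    required = set(index_of.values())
    # A user matches when every required event index is present, in any order.
    return sorted(user for user, indexes in seen.items() if required <= indexes)


# Hypothetical sample log.
log_rows = [
    ("u1", "login", 1),
    ("u1", "pay", 2),
    ("u2", "login", 3),
    ("u3", "pay", 4),
    ("u3", "login", 5),
    ("u3", "pay", 6),
]
matched = users_with_all_events(log_rows, ["login", "pay"])
print(matched)  # ['u1', 'u3']
```

User u3 matches even though "pay" precedes "login", which is the difference from the sequence-based matching.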
Further, the user funnel analysis module is further configured to: extracting a user identification field, a behavior identification field and a behavior time field from an original access log table to form a data set;
sorting the behavior identification fields of each user in the data set in ascending order of the behavior time field;
calculating, by using the windowFunnel function, the maximum number of consecutively triggered conditions in the condition chain for each user within a given sliding time window, and generating a user longest trigger chain count field;
processing the user longest trigger chain count field by using an arrayWithConstant function to generate an array of the specified length, and expanding the array into rows to generate a stage index value field;
performing a GROUP BY aggregation operation on the data set by the stage index value field, calculating the total number of users in each stage by using a count function, sorting by the stage index value field, outputting a new data set comprising the stage index value and the total number of users in the stage, and storing the new data set as a user funnel conversion analysis table.
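A minimal Python sketch of the funnel steps above is given below. It is a simplified emulation, not the ClickHouse implementation: the function names, the step names and the sample data are hypothetical, the window is assumed to be measured in seconds from the first step of the chain, and a user who reaches level L is counted in every stage from 1 to L, which is what expanding an array of length L into rows produces.

```python
from collections import Counter


def window_funnel(events, steps, window):
    """Longest prefix of `steps` reached within `window` seconds of a first step.

    events: (timestamp, action_id) pairs for one user.
    """
    ordered = sorted(events)
    best = 0
    for position, (start_ts, action_id) in enumerate(ordered):
        if action_id != steps[0]:
            continue
        level = 1
        for ts, later_action in ordered[position + 1:]:
            if ts > start_ts + window:
                break  # outside the sliding window opened by this first step
            if level < len(steps) and later_action == steps[level]:
                level += 1
        best = max(best, level)
    return best


def funnel_table(events_by_user, steps, window):
    """Returns (stage_index, user_count) rows sorted by stage index."""
    counts = Counter()
    for events in events_by_user.values():
        level = window_funnel(events, steps, window)
        # Array of length `level` expanded into rows: one row per stage reached.
        for stage_index in range(1, level + 1):
            counts[stage_index] += 1
    return sorted(counts.items())


# Hypothetical sample: three funnel steps and a one-hour window.
steps = ["view", "cart", "pay"]
events_by_user = {
    "u1": [(0, "view"), (100, "cart"), (200, "pay")],
    "u2": [(0, "view"), (100, "cart"), (5000, "pay")],  # pay falls outside the window
    "u3": [(0, "view")],
    "u4": [(0, "cart")],  # never triggers the first step
}
table = funnel_table(events_by_user, steps, window=3600)
print(table)  # [(1, 3), (2, 2), (3, 1)]
```

Each row of the result gives a stage index and the number of users who reached at least that stage, from which stage-to-stage conversion can be read directly.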
The invention has the advantages that:
(1) The invention analyzes user behavior based on the native multi-parameter aggregation functions and higher-order functions of the ClickHouse distributed database. As a high-performance OLAP distributed database, ClickHouse has clear advantages for real-time computation over massive data and is well suited to serve as the underlying data warehouse of analysis software. The processing is not limited by the memory of a single machine, and there is no need to write a series of array processing functions and aggregation functions to make up for missing basic algorithm functions of big data tools, so development is relatively simple. This solves the problems that the prior art is limited by single-machine physical memory, is complex to develop, and has difficulty responding in real time.
(2) The invention sets a user behavior session module to generate a user behavior session data table for subsequent service inquiry and calculation, guides product decision, realizes refined operation and guides service growth.
(3) The invention sets a user retention analysis module, calculates the retention rate of the first day and the preset time limit based on the given date parameter, and generates a user retention rate daily table which is used for analyzing the use and activity conditions of the user in the product, guiding the product decision, realizing the refined operation and guiding the service growth.
(4) The invention sets a user behavior matching module, screens out a user data set which accords with a corresponding mode based on a given behavior mode parameter with a sequence requirement or a behavior mode parameter without a sequence requirement, generates a user behavior matching data table, judges whether a user has performed a series of given operations, guides product decision, realizes refined operation and guides service growth.
(5) The invention sets a user funnel analysis module, based on a given sliding window, matches event chains in sequence, calculates the number of conversion steps in window events and the conversion number of each stage, generates a user funnel conversion analysis table, guides product decision, realizes refined operation and guides service growth.
Drawings
FIG. 1 is a block diagram of a user behavior analysis system according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a user behavior session module of a user behavior analysis system according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a user retention analysis module of a user behavior analysis system according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a user behavior matching module of the user behavior analysis system according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating a user funnel analysis module of a user behavior analysis system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in FIG. 1, a user behavior analysis system analyzes user behavior based on native multi-parameter aggregation functions and higher-order functions in the ClickHouse distributed database, and includes a user behavior session module, a user retention analysis module, a user behavior matching module, and a user funnel analysis module;
the user behavior session module is used for extracting a user identification field, a behavior identification field and a behavior time field from an original access log table, generating a session id, a session start timestamp field, a session end timestamp field and a user behavior chain field, sorting the session data by the user identification field and the session id field, outputting the sorted session data to form an ordered user behavior session data set, and storing the ordered user behavior session data set as a user behavior session data table;
the user retention analysis module is used for calculating the retention rate of the first day and the retention rate within a preset time limit based on a given date parameter and generating a user retention rate daily table;
the user behavior matching module is used for screening out a user data set which accords with a corresponding mode based on a given behavior mode parameter with a sequence requirement or a behavior mode parameter without a sequence requirement, and generating a user behavior matching data table;
and the user funnel analysis module is used for matching the event chains in sequence based on a given sliding window, calculating the number of conversion steps in the window event and the conversion number of each stage, and generating a user funnel conversion analysis table. The operation of each module is described in detail below.
1. User behavior session module
The user behavior session module aggregates scattered user behavior records through an algorithm to identify a series of reasonable user behavior sessions, and links the behaviors of a user within the same session in chronological order. Based on a given session time threshold parameter (session_gap), the module processes the data of the original access log table through a series of ClickHouse SQL statements and generates a user behavior session data table. As shown in FIG. 2, the specific process is as follows:
(1) A user identification (user_id) field, a behavior identification (action_id) field and a behavior time (action_time) field are extracted from the original access log table. The behavior time field needs to be converted into a timestamp format, forming a behavior timestamp field. The extracted data set is sorted in ascending order of the behavior timestamp field.
(2) The data set is aggregated with GROUP BY on the user identification field, and the corresponding behavior identification field and behavior timestamp field are aggregated into a behavior identification array field and a behavior timestamp array field, respectively, by using the groupArray function. Each array field can hold zero or more elements, and the behavior identification data and behavior timestamp data correspond one to one.
(3) The behavior timestamp array field is processed using the arrayDifference function. The function calculates the difference between adjacent elements in the array, i.e. the number of seconds between two adjacent behaviors of a user, and the generated result is the user behavior interval array field.
(4) The behavior interval array field is processed using the arrayMap function. arrayMap is a higher-order function: it takes a lambda function as one input argument and the behavior interval array field to be processed as the other. arrayMap calls the lambda function on each element of the array and returns the results as an array. The calculation compares each behavior interval with a preset session time threshold: an interval less than or equal to the threshold indicates behavior records of the same session of the user, and an interval greater than the threshold indicates two separate sessions. The returned result array is the new session identification array field, in which an element value of 1 denotes a new session and an element value of 0 denotes the old session.
(5) The first element of the new session identification array field then needs to be handled. The first element generated by the above method is 0, but in reality the first behavior always starts a new session, so it should be set to 1. To do this, the first element is deleted using the arrayPopFront function and the value 1 is then inserted at the front using arrayPushFront.
(6) The data set at this time includes a user identification (user_id) field, a behavior identification array (action_id_array) field, a behavior timestamp array (action_timestamp_array) field, and a new session identification array (is_new_session_array) field. The elements of the behavior identification array field, the behavior timestamp array field and the new session identification array field correspond one to one in timestamp order.
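Steps (3) to (5) can be illustrated with a short Python sketch. It assumes timestamps in seconds and a hypothetical session_gap of 1800; the helper names are illustrative and merely emulate the ClickHouse functions arrayDifference, arrayMap, arrayPopFront and arrayPushFront.

```python
def array_difference(values):
    """Emulates arrayDifference: 0 for the first element, then a[i] - a[i-1]."""
    if not values:
        return []
    return [0] + [curr - prev for prev, curr in zip(values, values[1:])]


def new_session_flags(timestamps, session_gap):
    """Returns 1 where a behavior starts a new session, otherwise 0."""
    gaps = array_difference(timestamps)
    # arrayMap with a lambda: a gap above the threshold opens a new session.
    flags = [1 if gap > session_gap else 0 for gap in gaps]
    if flags:
        # arrayPopFront, then arrayPushFront with 1: the first behavior
        # of a user always opens a session.
        flags = [1] + flags[1:]
    return flags


# Hypothetical sample: one user's behavior timestamps in seconds, already sorted.
action_timestamp_array = [0, 60, 120, 4000, 4030, 9000]
is_new_session_array = new_session_flags(action_timestamp_array, session_gap=1800)
print(is_new_session_array)  # [1, 0, 0, 1, 0, 1]
```

With these sample timestamps the intervals are 0, 60, 60, 3880, 30 and 4970 seconds, so the fourth and sixth behaviors start new sessions.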
(7) The elements of each array are then expanded into rows. ARRAY JOIN is applied to the behavior identification array field, the behavior timestamp array field and the new session identification array field, so that the single-row record of each user is expanded into multiple rows according to the number of array elements. The expanded data set comprises a user identification field, a behavior timestamp field and a new session identification field.
(8) A session index (session_index) field is then generated. Generating the session index field requires the arrayEnumerate function together with ARRAY JOIN. arrayEnumerate returns the array subscripts, starting from 1 (similar to a row number function in a database). Specifically:
Assume the raw data shown in Table 1:
TABLE 1 Raw data
user_id | is_new_session_array
user1 | [1,0,0,1,0,1]
The session index (session_index) is generated by applying the arrayEnumerate function to is_new_session_array and then expanding the result into rows using arrayJoin.
Here is_new_session_array is the new session identification array; for example, [1,0,0,1,0,1] indicates that elements 1, 4 and 6 start new sessions (elements 1, 2 and 3 form one session, elements 4 and 5 another, and element 6 a third).
The arrayEnumerate function returns the array subscripts, i.e. the position of each element in the array; for the array [1,0,0,1,0,1], arrayEnumerate([1,0,0,1,0,1]) returns [1,2,3,4,5,6], as shown in Table 2.
TABLE 2 arrayEnumerate processing results
user_id | is_new_session_array | arrayEnumerate result
user1 | [1,0,0,1,0,1] | [1,2,3,4,5,6]
The result of the arrayEnumerate function is then expanded into rows by arrayJoin, so that one row of user data becomes 6 rows; the expanded value can be regarded as the session index, as shown in Table 3:
TABLE 3ARRAYENUMERATE FUNCTION PROCESSING RESULT ROW-ROW
user_id is_new_session_array Session index
user1 [1,0,0,1,0,1] 1
user1 [1,0,0,1,0,1] 2
user1 [1,0,0,1,0,1] 3
user1 [1,0,0,1,0,1] 4
user1 [1,0,0,1,0,1] 5
user1 [1,0,0,1,0,1] 6
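The step from Table 1 to Table 3 can be illustrated with a minimal Python sketch. It only mimics the behavior described above for arrayEnumerate and arrayJoin; the helper names array_enumerate and array_join_rows are hypothetical and are not ClickHouse APIs.

```python
def array_enumerate(arr):
    """Mimic arrayEnumerate as described above: 1-based positions of the elements."""
    return list(range(1, len(arr) + 1))


def array_join_rows(user_id, flags):
    """Mimic arrayJoin over arrayEnumerate(flags): one output row per position."""
    return [(user_id, flags, index) for index in array_enumerate(flags)]


# Sample data taken from Table 1.
is_new_session_array = [1, 0, 0, 1, 0, 1]
rows = array_join_rows("user1", is_new_session_array)
for row in rows:
    print(row)  # (user_id, is_new_session_array, session_index)
```

Each printed tuple corresponds to one row of Table 3.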
(9) The session id (session_id) is generated by operating on the new session identification array field. The new session identifier array is first cut with the arraySlice function, whose parameters are the new session identifier array, the offset 1 and the session index value. The returned array is then processed by the arraySum function to generate the session id. The generated session ids start at 1, and each new session id is the previous session id plus 1. Specifically:
For each row, the array is cut using the arraySlice(is_new_session_array, 1, session_index) function.
arraySlice cuts is_new_session_array; the latter two parameters are the cut offset and the cut length respectively. Here, an array of length session_index is cut starting from position 1, as shown in Table 4:
Table 4 Cutting is_new_session_array
user_id  is_new_session_array  session_index  arraySlice result
user1    [1,0,0,1,0,1]         1              [1]
user1    [1,0,0,1,0,1]         2              [1,0]
user1    [1,0,0,1,0,1]         3              [1,0,0]
user1    [1,0,0,1,0,1]         4              [1,0,0,1]
user1    [1,0,0,1,0,1]         5              [1,0,0,1,0]
user1    [1,0,0,1,0,1]         6              [1,0,0,1,0,1]
Then, the elements of the array cut out in the previous step are summed with arraySum, and the sum can be used as the session id, as shown in Table 5:
Table 5 Sum of the elements of each cut array
[Table 5 is provided as an image in the original publication: Figure BDA0003432262140000141]
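The slice-and-sum step can be sketched in Python as follows. This is a minimal illustration of the prefix-sum idea described above, not the ClickHouse implementation; array_slice and session_ids are hypothetical helper names, and array_slice only handles the positive 1-based offset used in this example.

```python
def array_slice(arr, offset, length):
    """Mimic arraySlice for a positive 1-based offset and a non-negative length."""
    start = offset - 1
    return arr[start:start + length]


def session_ids(is_new_session_array):
    """For each session_index, sum the first session_index flags to get the session id."""
    ids = []
    for session_index in range(1, len(is_new_session_array) + 1):
        cut = array_slice(is_new_session_array, 1, session_index)
        ids.append(sum(cut))  # plays the role of arraySum
    return ids


print(session_ids([1, 0, 0, 1, 0, 1]))  # [1, 1, 1, 2, 2, 3]
```

Because every new session contributes a 1 to the running sum, rows 1 to 3 receive session id 1, rows 4 and 5 receive session id 2, and row 6 receives session id 3.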
(10) Next, a left join (LEFT JOIN) is performed between the behavior identification field and the behavior information dimension table to look up and generate the corresponding behavior name (operation_name) field. Note: this operation is optional; it converts the behavior identification field into a behavior name field for subsequent use by querying the dimension table.
(11) The data set is then sorted by the user identification field and the behavior timestamp field, and the sorted data set is aggregated with GROUP BY on the user identification field and the session id field.
(12) After aggregation, the minimum behavior timestamp in each session is used as the session start timestamp field and the maximum behavior timestamp as the session end timestamp field. The behavior names are collected into a behavior name array with the groupArray function, and the elements of the array are joined with the arrayStringConcat function to generate the user behavior chain field.
(13) The session start timestamp field and the session end timestamp field can be converted into ordinary year-month-day hour-minute-second fields using a ClickHouse timestamp conversion function.
(14) Finally, the user identification (user_id) field, the session id (session_id) field, the session start time (session_start_time) field, the session end time (session_end_time) field, the user behavior chain (user_operation_chain) field and other fields are selected from the data set, sorted by the user identification field and the session id field, and output to form an ordered user behavior session data set.
(15) The data set is stored as a user behavior session data table for subsequent business queries and calculations.
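Steps (11) to (14) can be sketched in plain Python as below. This is an illustration under stated assumptions rather than the patented SQL: the function name build_sessions, the sample rows, the "->" separator and the use of UTC for timestamp conversion are all hypothetical choices that the text does not specify.

```python
from collections import defaultdict
from datetime import datetime, timezone


def build_sessions(rows, separator="->"):
    """Group (user_id, session_id, timestamp, operation_name) rows into sessions."""
    # Step (11): sort by user and behavior timestamp before grouping.
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    groups = defaultdict(list)
    for user_id, session_id, ts, name in ordered:
        groups[(user_id, session_id)].append((ts, name))

    def fmt(ts):
        # Step (13): convert a Unix timestamp to a readable date-time (UTC assumed).
        return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

    sessions = []
    # Step (14): output ordered by user_id and session_id.
    for (user_id, session_id), events in sorted(groups.items()):
        timestamps = [ts for ts, _ in events]
        sessions.append({
            "user_id": user_id,
            "session_id": session_id,
            "session_start_time": fmt(min(timestamps)),  # step (12): minimum timestamp
            "session_end_time": fmt(max(timestamps)),    # step (12): maximum timestamp
            # groupArray followed by arrayStringConcat, in time order.
            "user_operation_chain": separator.join(name for _, name in events),
        })
    return sessions


sample_rows = [
    ("user1", 1, 100, "login"),
    ("user1", 1, 160, "search"),
    ("user1", 2, 5000, "pay"),
    ("user1", 1, 130, "view"),
]
sessions = build_sessions(sample_rows)
for session in sessions:
    print(session)
```

Sorting before grouping matters here because the behavior chain is only meaningful when the names are concatenated in time order.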
2. User retention analysis module
User retention refers to users staying and remaining active on the product pages. Retention analysis can be used to analyze how users use the product and how active they are in it. Based on a given date parameter, the module uses ClickHouse SQL to calculate the retention rate of the first day (day 0), day 1, day 2, day 3, and so on through day 31, and generates a user retention rate daily table. As shown in FIG. 3, the specific process is as follows:
(1) A user identification (user_id) field and a behavior time (operation_time) field are extracted from the original access log table, and a user access date (visit_day) field is generated by applying the toDate function to the behavior time field.
(2) The data set at this point includes the user identification (user_id) field, the behavior time (operation_time) field and the user access date (visit_day) field.
(3) Based on the given date parameter first_day, date data for day 1, day 2, and so on up to day N is generated as follows: day 0 is first_day, day 1 is first_day + INTERVAL 1 DAY, day 2 is first_day + INTERVAL 2 DAY, and so on. ClickHouse evaluates these expressions and converts them into the correct dates for calculation.
(4) The retention status is calculated using the retention function. The function takes a set of conditions as parameters, each indicating whether an event satisfies a particular condition. The first condition serves as the baseline: each later result is set to 1 (True) only if both the first condition and that later condition are satisfied, and to 0 (False) otherwise. That is, the second result is True if the first and second conditions are both True, and False otherwise; the third result is True if the first and third conditions are both True; and so on. The function returns an array of 0 and 1 elements.
(5) The equals function determines whether its two input parameters are equal. Here, equals(visit_day, first_day) is used as the first condition of retention, the second condition is equals(visit_day, first_day + INTERVAL 1 DAY), and so on. If retention up to day 31 is required, the last condition is equals(visit_day, first_day + INTERVAL 31 DAY). The resulting array can be saved in the variable result.
(6) Because the retention function returns an array, the individual judgment values must be extracted from it. The arrayElement function can be used for this: arrayElement(result, 1) is the first element of the result array, and arrayElement(result, 2) is the second element.
(7) The calculation results are then aggregated with GROUP BY on the user identification, and the users' visits from day 0 to day 31 are counted by accumulating each element of the result array with the sum function. The results can be saved in the variables day0_retention through day31_retention respectively.
(8) The retention rate on day N (dayN_retention_ratio) is obtained by dividing dayN_retention by day0_retention. The value ranges from 0 to 1.
(9) The retention rate can then be multiplied by 100 and rounded to a specified number of decimal places with the round function, giving a percentage value ranging from 0 to 100.
(10) In this way, the retention rates for day 0 through day 31 are calculated based on the user identification. The resulting data set includes the fields from day 0 retention rate (day0_retention_ratio) to day 31 retention rate (day31_retention_ratio), counted from the given date.
(11) The data set is stored as a user retention rate daily table for subsequent business queries and calculations.
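The retention calculation in steps (3) to (9) can be sketched in Python as follows. It is a simplified emulation of the logic described above, not ClickHouse code; the function names, the sample visits and the choice of two decimal places are assumptions made for illustration.

```python
from datetime import date, timedelta


def retention(visit_days, first_day, n_days):
    """Per-user flags: element i is 1 only if the user visited on first_day
    and also on first_day + i days (the first condition is the baseline)."""
    visited = set(visit_days)
    base = first_day in visited
    return [int(base and (first_day + timedelta(days=i)) in visited)
            for i in range(n_days + 1)]


def retention_ratios(visits_by_user, first_day, n_days, digits=2):
    """Sum the per-user flags and express each day as a percentage of day 0."""
    totals = [0] * (n_days + 1)
    for days in visits_by_user.values():
        flags = retention(days, first_day, n_days)
        totals = [t + f for t, f in zip(totals, flags)]
    if totals[0] == 0:
        return [0.0] * (n_days + 1)  # avoid dividing by zero when nobody visited on day 0
    return [round(100 * t / totals[0], digits) for t in totals]


first_day = date(2021, 12, 1)
visits_by_user = {
    "user1": [first_day, first_day + timedelta(days=1), first_day + timedelta(days=2)],
    "user2": [first_day, first_day + timedelta(days=2)],
    "user3": [first_day + timedelta(days=1)],  # no visit on day 0, so never counted
    "user4": [first_day],
}
print(retention_ratios(visits_by_user, first_day, 2))  # [100.0, 33.33, 66.67]
```

In this sample, user3 visited on day 1 but not on day 0, so the baseline condition fails and the user contributes nothing to any day.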
3. User behavior matching module
The user behavior matching module determines whether a user has performed a given series of operations. A series of behaviors can be regarded as a combination of multiple events. The combination may or may not impose an order on the behaviors, and each case requires its own algorithm. Using ClickHouse SQL, the module screens out the user data set that matches a given behavior pattern parameter, either with or without an order requirement, and generates a user behavior matching data table. As shown in FIG. 4, the specific process is as follows:
3.1 Implementation of user behavior matching with an order requirement
(1) Configuration of the event chain.
a) According to the business requirements, one or more behaviors of the user are converted into the judgment expression of a single event, i.e., an expression that judges whether the given operation is satisfied.
b) Each event consists of one conditional judgment expression, whose result is 1 (true) or 0 (false).
c) An event chain consists of at least one and at most 32 events.
(2) Configuration of the pattern string (pattern).
a) For user behavior with an order requirement, for example the user triggers event A and then event B, the behavior can be expressed as (?1)(?2), where (?N) refers to the event at position N in the event chain.
b) The user behavior may be written as (?1)(?2).
c) Repeated behavior can also be matched. For example, if the user triggers event A twice in succession and then event B, the behavior can be expressed as (?1)(?2)(?3), where (?1) and (?2) correspond to event A and (?3) corresponds to event B.
d) If there is a time requirement between events, for example a B event occurs at least 1800 seconds after an A event, the behavior can be written as (?1)(?t>1800)(?2). Other time requirements, such as a B event occurring within 500 seconds after an A event, or a B event occurring 60 seconds after an A event, are written in the same form with the corresponding time condition.
(3) A user identification (user_id) field, a behavior identification (action_id) field and a behavior time (action_time) field are extracted from the original access log table, and a GROUP BY aggregation is performed on the user identification (user_id).
(4) The sequenceMatch function is used to process user behavior matching with an order requirement and to check whether an event chain satisfies the input pattern. The inputs of sequenceMatch are the pattern string described above, the behavior time (action_time), and one or more conditional judgment expressions of the user behavior event chain. The function returns the matching result aggregated by user identification (user_id), which is used as the user behavior matching result (user_operation_match) field.
(5) The sequenceCount function is used to process user behavior matching with an order requirement and to count the number of event chains that satisfy the input pattern. The inputs of sequenceCount are the pattern string described above, the behavior time (action_time), and one or more conditional judgment expressions of the user behavior event chain. The function returns the count aggregated by user identification (user_id), which is used as the user behavior statistical result (user_operation_count) field.
(6) The user data that meets the requirements is screened out by combining the user behavior matching result (user_operation_match) field and the user behavior statistical result (user_operation_count) field.
(7) The data set is stored as a user behavior matching table for subsequent business queries and calculations.
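The idea behind ordered matching and counting can be sketched in Python as below. This is a deliberately simplified stand-in, not an emulation of the sequenceMatch or sequenceCount functions: it does not parse pattern strings, it skips every event that does not advance the chain, and it only supports an optional "more than N seconds" gap between consecutive steps. The function name and the sample events are hypothetical.

```python
def count_ordered_matches(events, conditions, min_gaps=None):
    """Count non-overlapping in-order occurrences of an event chain.

    events:     list of (time_in_seconds, action_id) tuples
    conditions: one predicate per step of the chain
    min_gaps:   optional list; min_gaps[i] is the number of seconds that must be
                exceeded between step i and step i + 1 (None means no requirement)
    """
    if min_gaps is None:
        min_gaps = [None] * (len(conditions) - 1)
    count = 0
    step = 0
    previous_time = None
    for time, action in sorted(events):
        gap_ok = True
        if step > 0 and min_gaps[step - 1] is not None:
            gap_ok = (time - previous_time) > min_gaps[step - 1]
        if conditions[step](action) and gap_ok:
            previous_time = time
            step += 1
            if step == len(conditions):  # whole chain matched, start over
                count += 1
                step = 0
    return count


events = [(0, "A"), (100, "C"), (2000, "B"), (2100, "A"), (2200, "B")]
chain = [lambda a: a == "A", lambda a: a == "B"]

user_operation_count = count_ordered_matches(events, chain)
user_operation_match = int(user_operation_count > 0)
print(user_operation_match, user_operation_count)    # 1 2
print(count_ordered_matches(events, chain, [1800]))  # 1
```

With the 1800-second requirement, only the first A to B pair qualifies, because the second pair is just 100 seconds apart.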
3.2 Implementation of user behavior matching without an order requirement
(1) Fields such as the user identification (user_id) field, the behavior identification (action_id) field and the behavior time (action_time) field are extracted from the original access log table to form an initial user behavior data set.
(2) The initial user behavior data set is queried according to the business requirements, with one sub-query for each user event to be matched. An event may be one or more behaviors of a user. User data belonging to the same event to be matched is merged into one sub-query data set, and a unique event index (event_index) field is generated in each sub-query data set. For example, the index value of the first sub-query is 1, the index value of the second sub-query is 2, and so on.
(3) The sub-query data sets of the individual events are combined with UNION ALL to form an event set.
(4) A GROUP BY aggregation is performed on the event set from the previous step by user identification (user_id).
(5) The aggregated data is filtered with HAVING: the event index deduplicated aggregation array is compared with the matching condition, the arrays that satisfy the matching condition are screened out, and a user behavior matching data table is generated. The specific steps are as follows:
Assume the resulting data set is as shown in Table 6:
Table 6 Data set containing event indexes
User id  Event index (rule_index)
user1    1
user1    2
user1    1
user1    3
user2    1
user2    1
user2    2
user3    3
The data is aggregated by user id, and the event indexes are aggregated and deduplicated with the groupUniqArray function to form an event index deduplicated aggregation array, which is then matched with the hasAll function, as shown in Table 7.
Table 7 Event index deduplicated aggregation array
User id  Event index deduplicated aggregation array  Aggregated without deduplication (inconvenient for subsequent matching)
user1    [1,2,3]                                     [1,2,1,3]
user2    [1,2]                                       [1,1,2]
user3    [3]                                         [3]
The hasAll function checks whether one array is a subset of another. A return value of 1 indicates a match and 0 indicates a mismatch.
Here, the statement is:
HAVING hasAll(groupUniqArray(rule_index),[1,2,3])=1
That is, the event index deduplicated aggregation array is compared with [1,2,3] (the matching condition, meaning that event indexes 1, 2 and 3 must all be satisfied), as shown in Table 8.
Table 8 Comparison of the event index deduplicated aggregation array with the matching condition
User id  Event index deduplicated aggregation array  hasAll result
user1    [1,2,3]                                     1
user2    [1,2]                                       0
user3    [3]                                         0
The goal is to screen out the user ids that satisfy event indexes 1, 2 and 3 simultaneously, which here is user1.
(6) The data set is stored as a user behavior matching data table for subsequent business queries and calculations.
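The worked example in Tables 6 to 8 can be reproduced with a short Python sketch. A Python set stands in for groupUniqArray and a subset test stands in for hasAll; the function name match_users is hypothetical.

```python
from collections import defaultdict


def match_users(rows, required_indexes):
    """Return the user ids whose deduplicated event indexes contain every required index."""
    indexes_by_user = defaultdict(set)
    for user_id, rule_index in rows:
        indexes_by_user[user_id].add(rule_index)  # aggregate and deduplicate per user
    required = set(required_indexes)
    # The subset test plays the role of hasAll(...) = 1 in the HAVING clause.
    return sorted(user for user, indexes in indexes_by_user.items() if required <= indexes)


# Rows taken from Table 6.
table_6 = [
    ("user1", 1), ("user1", 2), ("user1", 1), ("user1", 3),
    ("user2", 1), ("user2", 1), ("user2", 2),
    ("user3", 3),
]
print(match_users(table_6, [1, 2, 3]))  # ['user1']
```

Because the order of events is irrelevant in this mode, comparing sets is sufficient and no timestamps are needed.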
4. Funnel analysis module
User funnel analysis, also called conversion analysis, is an algorithm that reflects the user conversion rate at each stage of a user behavior path from its starting point to its end point. Based on a given sliding window (sliding_time_window) parameter, the module matches event chains in order, uses ClickHouse SQL to calculate the number of conversion steps within the window and the number of conversions at each stage, and generates a user funnel conversion analysis table. As shown in FIG. 5, the specific process is as follows:
(1) Fields such as the user identification (user_id) field, the behavior identification (action_id) field and the behavior time (action_time) field are extracted from the original access log table, and the data set is sorted in ascending order by user identification (user_id) and behavior time (action_time).
(2) A GROUP BY aggregation is performed on the data set by the user identification (user_id) field.
(3) The windowFunnel function is used to calculate, for each user, the maximum number of consecutively triggered conditions of the chain within the given sliding time window (sliding_time_window), generating the user's longest trigger chain count (max_stage_active) field. Specifically:
a) The windowFunnel function searches for the event chain within the sliding time window and calculates the maximum number of events from the chain that occurred.
b) The function looks for the first condition in the chain and sets the event counter to 1; this is the moment the sliding window starts. If subsequent events from the chain occur in order within the window, the counter is incremented. If the sequence of events is interrupted, the counter is not incremented. If the data contains multiple event chains completed to different points, the function outputs the length of the longest chain.
c) The first parameter of the function is the size of the sliding window, which is the maximum interval between the first and the last event in the event chain.
d) The remaining parameters are the time field of the data set and the judgment expressions of the event chain. The expressions are a set of conditions, each indicating whether an event satisfies a specific condition. Each event consists of one conditional judgment expression, whose result is 1 (true) or 0 (false).
e) The windowFunnel function returns each user's longest trigger chain count, which becomes the max_stage_active field.
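The counting rule in a) to e) can be sketched in Python as follows. This is a simplified emulation of the described behavior for a single user, not the ClickHouse windowFunnel implementation; the function name window_funnel, the inclusive window boundary and the sample events are assumptions made for illustration.

```python
def window_funnel(window, events, conditions):
    """Return the longest in-order prefix of the event chain that fits in one window.

    window:     maximum number of seconds between the first and the last event
    events:     list of (time_in_seconds, action_id) tuples for one user
    conditions: one predicate per funnel step
    """
    ordered = sorted(events, key=lambda e: e[0])
    best = 0
    for i, (start_time, action) in enumerate(ordered):
        if not conditions[0](action):
            continue  # a window can only start at an event matching the first condition
        level = 1
        for time, later_action in ordered[i + 1:]:
            if time - start_time > window:
                break  # outside the sliding window
            if level < len(conditions) and conditions[level](later_action):
                level += 1
        best = max(best, level)
    return best


steps = [lambda a: a == "view", lambda a: a == "cart", lambda a: a == "pay"]
user_events = [(0, "view"), (30, "cart"), (500, "pay")]

print(window_funnel(100, user_events, steps))  # 2: "pay" falls outside the window
print(window_funnel(600, user_events, steps))  # 3: the whole chain fits
```

A user with no event matching the first condition gets 0, which means the user never entered the funnel.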
(4) The number of users at each stage of the funnel is generated as follows:
a) The user's longest trigger chain count (max_stage_active) field is processed with the arrayWithConstant function, which generates an array of a specified length for subsequent operations. (For example, with length 3 and value 1, the generated array is [1,1,1].)
b) The function takes the user's longest trigger chain count (max_stage_active) field as its first parameter and the value 1 as its second parameter. It returns an array whose length is the user's longest trigger chain count and whose elements are all 1.
c) The array generated by arrayWithConstant is processed with the arrayEnumerate function, which returns the array subscripts. For example, processing the array [1,1,1] returns the new array [1,2,3].
d) The array returned by arrayEnumerate is expanded into multiple rows (row-to-column conversion) with the arrayJoin function. The expanded field is the stage index value (stage_index) field of the funnel.
e) A GROUP BY aggregation is performed on the data set by the stage index value (stage_index) field.
f) The total number of users at each stage is then calculated per stage index value with the count function.
(5) The result is sorted by the stage index value (stage_index), and a data set containing the stage index value and the total number of users at that stage is output. This completes the calculation of the funnel stage indexes and the total number of users at each funnel stage.
(6) The data set is stored as a user funnel conversion analysis table for subsequent business queries and calculations.
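Step (4) can be sketched in Python as below. The helpers only mimic the described behavior of arrayWithConstant, arrayEnumerate and arrayJoin; their names and the sample max_stage_active values are hypothetical.

```python
from collections import Counter


def array_with_constant(length, value):
    """Mimic arrayWithConstant: an array of the given length filled with value."""
    return [value] * length


def array_enumerate(arr):
    """Mimic arrayEnumerate: 1-based positions of the elements."""
    return list(range(1, len(arr) + 1))


def funnel_stage_counts(max_stage_by_user):
    """Count, for every stage index, how many users reached at least that stage."""
    counts = Counter()
    for max_stage_active in max_stage_by_user.values():
        # The loop over the enumerated array plays the role of arrayJoin:
        # a user who reached stage 3 produces rows for stages 1, 2 and 3.
        for stage_index in array_enumerate(array_with_constant(max_stage_active, 1)):
            counts[stage_index] += 1
    return sorted(counts.items())  # ordered by stage_index


max_stage_by_user = {"user1": 3, "user2": 2, "user3": 1, "user4": 0}
print(funnel_stage_counts(max_stage_by_user))  # [(1, 3), (2, 2), (3, 1)]
```

Expanding each user into one row per reached stage is what makes the per-stage counts cumulative, so stage 1 always has at least as many users as stage 2.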
Through the above technical solution, user behavior is analyzed using the native multi-parameter aggregation functions and higher-order functions of the ClickHouse distributed database. As a high-performance OLAP distributed database, ClickHouse has clear advantages in real-time computation over massive data and is well suited to serve as the underlying data warehouse of analysis software. The solution is not limited by the memory of a single machine and does not require writing a series of array processing functions and aggregation functions to compensate for missing basic big-data algorithm functions, so the development process is relatively simple. It thereby solves the problems of the prior art, which is limited by single-machine physical memory, complex to develop, and difficult to make respond in real time.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A user behavior analysis system, characterized in that user behavior is analyzed based on native multi-parameter aggregation functions and higher-order functions in a ClickHouse distributed database, the user behavior analysis system comprising a user behavior session module, a user retention analysis module, a user behavior matching module and a user funnel analysis module;
the user behavior session module is used for extracting a user identification field, a behavior identification field and a behavior time field from an original access log table, generating a session id, a session start timestamp field, a session end timestamp field and a user behavior chain field, sorting by the user identification field and the session id field, outputting the sorted data to form an ordered user behavior session data set, and storing the ordered user behavior session data set as a user behavior session data table;
the user retention analysis module is used for calculating the retention rate of the first day and the retention rate within a preset time limit based on a given date parameter and generating a user retention rate daily table;
the user behavior matching module is used for screening out a user data set which accords with a corresponding pattern based on a given behavior pattern parameter with an order requirement or a behavior pattern parameter without an order requirement, and generating a user behavior matching data table;
and the user funnel analysis module is used for matching the event chains in order based on a given sliding window, calculating the number of conversion steps within the window and the conversion number of each stage, and generating a user funnel conversion analysis table.
2. The system according to claim 1, wherein the user behavior session module extracting a user identification field, a behavior identification field and a behavior time field from an original access log table to generate a session id comprises:
extracting a user identification field, a behavior identification field and a behavior time field from an original access log table, wherein the behavior time field is converted into a timestamp format, and the extracted data set is sorted in ascending order of the behavior time field;
performing GROUP BY aggregation on the data set according to the user identification field, and aggregating the corresponding behavior identification fields and behavior time fields into a behavior identification array field and a behavior timestamp array field respectively BY using a groupArray function;
calculating the difference between adjacent elements in the behavior timestamp array field by using an arrayDifference function to generate a user behavior interval array field;
comparing each behavior interval in the behavior interval array field with a preset session time threshold by using an arrayMap function, wherein an interval smaller than or equal to the session time threshold indicates behavior records of the same session of the user and an interval larger than the session time threshold indicates two different sessions, and returning the result array as a new session identification array field, wherein an element value of 1 represents a new session, an element value of 0 represents an old session, and the first element of the new session identification array field is set to 1;
performing row-to-column expansion on the behavior identification array field, the behavior timestamp array field and the new session identification array field by using ARRAY JOIN, expanding the single-row record of each user into multiple rows according to the number of elements of the array, and generating a user identification field, a behavior timestamp field and a new session identification field after expansion;
applying an arrayEnumerate function to the new session identification field to return the array subscripts, performing row-to-column conversion on the result of the arrayEnumerate function by using an arrayJoin function to generate a session index session_index, cutting each row of data by using the arraySlice(is_new_session_array, 1, session_index) function, wherein the three input parameters of arraySlice are the new session identification field, the cut offset and the cut length respectively, summing the elements of the cut array by using the arraySum function, and taking the sum as the session id.
3. The system of claim 2, wherein the user behavior session module generating a session start timestamp field, a session end timestamp field and a user behavior chain field comprises:
performing a left join operation between the behavior identification field and the behavior name field of the behavior information dimension table, and converting the behavior identification field into the behavior name field;
sorting the data set according to the user identification field and the behavior timestamp field, and performing a GROUP BY aggregation operation on the sorted data set according to the user identification field and the session id field;
after aggregation, using the minimum behavior timestamp in each session as the session start timestamp field and the maximum behavior timestamp as the session end timestamp field, processing the behavior name fields between the start timestamp and the end timestamp into a behavior name array through a groupArray function, and then joining the elements of the array through an arrayStringConcat function to generate the user behavior chain field.
4. The system according to claim 1, wherein the original access log table is constructed by: the log data of the system where the analysis object is located is stored in a ClickHouse database, and then an original access log table is generated, wherein the original access log table at least comprises a user identification field, a behavior identification field and a behavior time field.
5. The system according to claim 1, wherein the user retention analysis module is further configured to:
extract a user identification field and a behavior time field from an original access log table, and generate a user access date field visit_day by applying a toDate function to the behavior time field;
based on a given date parameter first_day, generate date data for day 1, day 2, and so on up to day N by using the INTERVAL operator of ClickHouse for subsequent calculation, wherein day 0 is first_day;
by using the equals function, set equals(visit_day, first_day) as the first judgment condition parameter of the retention function, and so on, the N-th judgment condition parameter being equals(visit_day, first_day + INTERVAL N DAY), wherein the result of the first judgment condition parameter is retained by default, and the results of all calculated judgment condition parameters are stored in the result array of the retention function;
perform GROUP BY aggregation on the result array of the retention function through the user identification field, and count the access conditions of the users from day 0 to day N respectively;
determine the retention rate on day N by dividing the visit volume on day N by the visit volume on day 0; the retention rates from day 0 to day N are calculated in this way and stored as a user retention rate daily table.
6. The system according to claim 1, wherein the user behavior matching module is further configured to process user behavior matching with an order requirement, and the user behavior matching with an order requirement comprises:
configuring a user behavior event chain;
configuring a pattern string corresponding to the behavior of a user;
extracting a user identification field, a behavior identification field and a behavior time field from an original access log table, and performing a GROUP BY aggregation operation through the user identification field;
after the GROUP BY aggregation operation on the user identification field, sorting each user's behavior identification fields in the order of the behavior time field, converting the sorted result into a pattern string, inputting the pattern string into a sequenceMatch function, judging whether a user behavior event chain satisfies the input pattern and returning a user behavior matching result field, and counting the number of user behavior event chains that satisfy the input pattern to generate a user behavior statistical result field;
and generating a user behavior matching data table by combining the results of the user behavior matching result field and the user behavior statistical result field.
7. The system according to claim 6, wherein the configuring the user behavior event chain comprises: converting one or more behaviors of the user into the judgment expression of a single event, judging whether an event satisfies the judgment expression when the user triggers the event, and, if it does, triggering the event in the user behavior event chain.
8. The system according to claim 7, wherein the configuring of the pattern string corresponding to the behavior of the user comprises:
if the user has triggered event A and then event B, the above behavior is represented as (;
the case where the user has triggered event A, followed by some irrelevant behaviors, followed by event B, is written as (;
if the user has triggered event A twice in succession and then event B, the above behavior is represented as (;
where there is a time requirement between events: a B event occurring at least m seconds after an A event is written as (; a B event occurring within a given time after an A event is written as (; a B event occurring Q seconds after an A event is written as (.
9. The system according to claim 1, wherein the user behavior matching module is further configured to process user behavior matching without an order requirement, and the user behavior matching without an order requirement comprises:
extracting a user identification field, a behavior identification field and a behavior time field from an original access log table to form an initial user behavior data set;
querying the initial user behavior data set to determine whether each user event to be matched has occurred, and merging the user data belonging to the same user event to be matched into a sub-query data set, wherein each sub-query data set has a unique event index field;
combining the sub-query data sets through UNION ALL to form an event set;
performing a GROUP BY aggregation operation on the event set according to the user identification to generate a new data set;
aggregating and deduplicating the event indexes in the new data set through a groupUniqArray function to form an event index deduplicated aggregation array;
and comparing the event index deduplicated aggregation array with the matching condition by using a hasAll function, screening out the event index deduplicated aggregation arrays that satisfy the matching condition, and generating a user behavior matching data table.
10. The user behavior analysis system of claim 1, wherein the user funnel analysis module is further configured to perform: extracting a user identification field, a behavior identification field and a behavior time field from an original access log table to form a data set;
sorting, within the data set, the behavior identification fields of each user in ascending order of the behavior time field;
calculating, using the windowFunnel function, the maximum length of the chain of consecutively triggered conditions for each user within a given sliding time window, and generating a user longest trigger chain count field;
processing the user longest trigger chain count field with the arrayWithConstant function to generate an array of the specified length, and performing row-to-column conversion on the array to generate a stage index value field;
performing a GROUP BY aggregation operation on the data set according to the stage index value field, calculating the total number of users in each stage using the count function, sorting by the stage index value field, outputting a new data set comprising the stage index value and the total number of users in the stage, and storing the new data set as a user funnel conversion analysis table.
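The funnel computation of claim 10 can be sketched in Python as follows. This is a simplified emulation under stated assumptions, not the claimed implementation: longest_chain only approximates the default behavior of windowFunnel (it ignores the function's optional modes and tie-breaking rules), and the names longest_chain, funnel_table, the row layout, and the sample steps are hypothetical. The inner range loop plays the role of building an array of length equal to the chain count and expanding it into one row per stage index.

```python
from collections import Counter, defaultdict


def longest_chain(events, steps, window):
    """Simplified stand-in for windowFunnel: longest prefix of `steps`
    a user completed in order, with every step falling within `window`
    time units of the chain's first step.

    events: list of (behavior_time, behavior_id); steps must be distinct.
    """
    step_index = {step: i for i, step in enumerate(steps)}
    # chain_start[i] holds the start time of the latest chain reaching step i.
    chain_start = [None] * len(steps)
    for ts, behavior in sorted(events, key=lambda e: e[0]):
        i = step_index.get(behavior)
        if i is None:
            continue
        if i == 0:
            chain_start[0] = ts
        elif chain_start[i - 1] is not None and ts - chain_start[i - 1] <= window:
            chain_start[i] = chain_start[i - 1]
    reached = [i + 1 for i, start in enumerate(chain_start) if start is not None]
    return max(reached, default=0)


def funnel_table(log_rows, steps, window):
    """Return [(stage_index, user_count), ...] sorted by stage index."""
    per_user = defaultdict(list)
    for user_id, behavior_id, behavior_time in log_rows:
        per_user[user_id].append((behavior_time, behavior_id))

    stage_counts = Counter()
    for events in per_user.values():
        level = longest_chain(events, steps, window)
        # An array of length `level` expanded to one row per stage index,
        # then counted per stage (the GROUP BY + count step).
        for stage in range(1, level + 1):
            stage_counts[stage] += 1
    return sorted(stage_counts.items())


rows = [
    ("u1", "view", 0), ("u1", "cart", 10), ("u1", "pay", 20),
    ("u2", "view", 0), ("u2", "cart", 100),
    ("u3", "cart", 5), ("u3", "view", 10), ("u3", "cart", 30),
    ("u4", "pay", 1),
]
table = funnel_table(rows, ["view", "cart", "pay"], window=60)
```

Expanding each user's chain length into stages 1 through that length means a user who reached stage 3 is also counted in stages 1 and 2, so the per-stage counts are non-increasing, as expected of a conversion funnel.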
CN202111602485.8A 2021-12-24 2021-12-24 User behavior analysis system Pending CN114238360A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111602485.8A CN114238360A (en) 2021-12-24 2021-12-24 User behavior analysis system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111602485.8A CN114238360A (en) 2021-12-24 2021-12-24 User behavior analysis system

Publications (1)

Publication Number Publication Date
CN114238360A true CN114238360A (en) 2022-03-25

Family

ID=80762799

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111602485.8A Pending CN114238360A (en) 2021-12-24 2021-12-24 User behavior analysis system

Country Status (1)

Country Link
CN (1) CN114238360A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116070206A (en) * 2023-03-28 2023-05-05 上海观安信息技术股份有限公司 Abnormal behavior detection method, system, electronic equipment and storage medium
CN116501778A (en) * 2023-05-16 2023-07-28 湖北省珍岛数字智能科技有限公司 Real-time user behavior data analysis method based on ClickHouse

Similar Documents

Publication Publication Date Title
CN102662952B (en) Chinese text parallel data mining method based on hierarchy
CN114238360A (en) User behavior analysis system
CN101606149B (en) Apparatus and method for categorical filtering of data
CN102306176B (en) On-line analytical processing (OLAP) keyword query method based on intrinsic characteristic of data warehouse
CN103927398A (en) Microblog hype group discovering method based on maximum frequent item set mining
CN103605651A (en) Data processing showing method based on on-line analytical processing (OLAP) multi-dimensional analysis
WO2008157456A1 (en) Multidimensional analysis tool for high dimensional data
US9275015B2 (en) System and method for performing analysis on information, such as social media
CN107832333B (en) Method and system for constructing user network data fingerprint based on distributed processing and DPI data
CN110389950B (en) Rapid running big data cleaning method
JP5588811B2 (en) Data analysis support system and method
CN111740884A (en) Log processing method, electronic equipment, server and storage medium
CN101944116B (en) Complex multi-dimensional hierarchical connection and aggregation method for data warehouse
CN101916281B (en) Concurrent computational system and non-repetition counting method
CN114579409A (en) Alarm method, device, equipment and storage medium
CN111639060A (en) Thermal power plant time sequence data processing method, device, equipment and medium
CN102799616A (en) Outlier point detection method in large-scale social network
CN105426392A (en) Collaborative filtering recommendation method and system
CN114022051A (en) Index fluctuation analysis method, storage medium and electronic equipment
CN111125045B (en) Lightweight ETL processing platform
CN106919566A (en) A kind of query statistic method and system based on mass data
CN106941419B (en) visual analysis method and system for network architecture and network communication mode
US20160078071A1 (en) Large scale offline retrieval of machine operational information
CN110191005B (en) Alarm log processing method and system
Singh et al. Knowledge based retrieval scheme from big data for aviation industry

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination