CA3184895A1 - User behavior data writing method and device, computer equipment and storage medium - Google Patents

User behavior data writing method and device, computer equipment and storage medium

Info

Publication number
CA3184895A1
Authority
CA
Canada
Prior art keywords
target
user behavior
behavior data
hudi
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CA3184895A
Other languages
French (fr)
Inventor
Dong FAN
Cheng Li
Qian Sun
Wuyuan Fang
Jinzhong Wang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
10353744 Canada Ltd
Original Assignee
10353744 Canada Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 10353744 Canada Ltd
Publication of CA3184895A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/275 Synchronous replication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computing Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a user behavior data writing method, apparatus, computer device, and storage medium. The method comprises: obtaining multi-dimensional user behavior data, wherein the user behavior data carries a data source; creating a target Hudi table matching the data source;
aggregating the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension;
and synchronizing the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size. The method writes user behavior data into a Hudi table and, by relying on Hudi storage, solves the problem of insufficient memory caused by Flink storing intermediate state for large windows or large data volumes, thereby avoiding the introduction of other components as storage media.

Description

USER BEHAVIOR DATA WRITING METHOD AND DEVICE, COMPUTER
EQUIPMENT AND STORAGE MEDIUM
Technical Field
[0001] The present disclosure relates to the field of computer technology, and in particular to a user behavior data writing method, apparatus, computer device, and storage medium.
Background
[0002] At present, when the industry performs real-time window analysis in various scenarios, it usually uses the rolling (tumbling) window or sliding window built into Flink. After the window size and the sliding step are set, the desired result can be calculated in real time according to the preset dimension. However, such statistics require the statistical dimension, the window size, and the step size to be set in advance, so flexibility is poor. Moreover, since Flink computes in memory, the window size and the state result set should not be too large. When the data volume is too large, storage in a third-party component is required, and introducing additional components not only complicates the architecture but also significantly reduces performance.
Summary of the Invention
[0003] Based on this, it is necessary to provide a method, apparatus, computer device, and storage medium that address the above technical problem by writing user behavior data into a Hudi table and, by relying on Hudi storage, solving the problem of insufficient memory caused by Flink storing intermediate state for large windows or large data volumes, thereby avoiding the introduction of other components as storage media.
[0004] A user behavior data writing method comprises:
[0005] Obtaining multi-dimensional user behavior data, wherein the user behavior data carries a data source;

Date Reçue/Date Received 2023-02-21
[0006] Creating a target Hudi table matching the data source;
[0007] Aggregating the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension;
[0008] Synchronizing the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size.
[0009] In an embodiment, obtaining the multi-dimensional user behavior data carrying the data source comprises: obtaining the multi-dimensional user behavior data from a log file corresponding to a service system, wherein the user behavior data is obtained by the service system through dimensional filtering of collected original data.
[0010] In an embodiment, creating the target Hudi table matching the data source comprises: creating a MySQL table corresponding to the data source; and creating a corresponding target Hudi table based on the MySQL table.
[0011] In an embodiment, after creating the target Hudi table matching the data source, the method further comprises: obtaining an initial submission frequency corresponding to the target Hudi table; and setting the initial submission frequency to the preset window size to obtain the target submission frequency.
[0012] In an embodiment, aggregating the user behavior data according to the preset window size to obtain the target service data corresponding to the target dimension comprises: obtaining the preset window size; when the current window size reaches the preset window size, filtering the multi-dimensional user behavior data to obtain intermediate user behavior data corresponding to the target dimension; and aggregating the intermediate user behavior data to obtain the aggregated target service data corresponding to the target dimension.

[0013] In an embodiment, synchronizing the target service data to the target Hudi table according to the target submission frequency comprises: obtaining the last submission time; determining a target submission time according to the last submission time and the target submission frequency; and, when the current submission time reaches the target submission time, synchronizing the target service data and the corresponding target submission time to the target Hudi table.
[0014] In an embodiment, the user behavior data writing method further comprises: sending a query request to the target Hudi table, wherein the query request carries dimension information to be queried and submission time information to be queried; and searching the target Hudi table, according to the query request, for service data matching the dimension information and the submission time information to be queried.
[0015] A user behavior data writing apparatus, wherein the apparatus comprises:
[0016] An obtaining module configured to obtain multi-dimensional user behavior data, wherein the user behavior data carries a data source;
[0017] A creating module configured to create a target Hudi table matching the data source;
[0018] An aggregating module configured to aggregate the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension;
[0019] A synchronizing module configured to synchronize the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size.
[0020] A computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer program:

[0021] Obtaining multi-dimensional user behavior data, wherein the user behavior data carries a data source;
[0022] Creating a target Hudi table matching the data source;
[0023] Aggregating the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension;
[0024] Synchronizing the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size.
[0025] A computer readable storage medium storing a computer program, wherein the following steps are implemented when the computer program is executed by a processor:
[0026] Obtaining multi-dimensional user behavior data, wherein the user behavior data carries a data source;
[0027] Creating a target Hudi table matching the data source;
[0028] Aggregating the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension;
[0029] Synchronizing the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size.
[0030] The above user behavior data writing method, apparatus, computer device, and storage medium obtain multi-dimensional user behavior data, wherein the user behavior data carries a data source; create a target Hudi table matching the data source; aggregate the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension; and synchronize the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size. In this way, user behavior data is written into a Hudi table and, by relying on Hudi storage, the problem of insufficient memory caused by Flink storing intermediate state for large windows or large data volumes is solved, which avoids introducing other components as storage media.
Brief Description of the Drawings
[0031] Figure 1 is an application environment diagram of a user behavior data writing method in an embodiment;
[0032] Figure 2 is a flowchart of a user behavior data writing method in an embodiment;
[0033] Figure 3 is a structural diagram of a user behavior data writing apparatus in an embodiment;
[0034] Figure 4 is an internal structural diagram of a computer device in an embodiment;
[0035] Figure 5 is an internal structural diagram of a computer device in another embodiment.
Detailed Description
[0036] In order to make the purposes, technical solutions, and advantages of the application clearer, the present disclosure is explained in further detail below with reference to particular embodiments and the drawings. It shall be appreciated that these descriptions are only illustrative and are not intended to limit the scope of the disclosure.
[0037] The user behavior data writing method provided in this application can be applied to the application environment shown in Figure 1, in which a terminal 102 communicates with a server 104 through a network. The terminal 102 can be, but is not limited to, a personal computer, laptop, smartphone, tablet, or portable wearable device. The server 104 can be implemented as an independent server or as a server cluster composed of a plurality of servers.
[0038] Specifically, the terminal 102 collects multi-dimensional user behavior data carrying a data source and sends it to the server 104 through the network. The server 104 obtains the multi-dimensional user behavior data, creates a target Hudi table matching the data source, aggregates the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension, and synchronizes the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size.
[0039] In another embodiment, the server 104 obtains multi-dimensional user behavior data carrying a data source, creates a target Hudi table matching the data source, aggregates the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension, and synchronizes the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size.
[0040] In an embodiment, as shown in Figure 2, a user behavior data writing method is provided. Taking the method as applied to the terminal or server in Figure 1 as an example, it includes the following steps:
[0041] Step 202, obtaining multi-dimensional user behavior data, wherein the user behavior data carries a data source.
[0042] Multi-dimensional user behavior data is user behavior data across multiple dimensions, that is, data related to user behavior in each dimension. Each dimension can be determined according to actual service application scenarios, actual service requirements, or actual product requirements. Examples include user behavior data for a certain category of commodity, user behavior data for a certain region, and user behavior data for purchases made through a certain channel.
[0043] The user behavior data carries a data source, which is the source of the user behavior data. For example, if the user behavior data is originally stored in a MySQL table, the data source is that MySQL table.
[0044] In an embodiment, step 202 includes: obtaining the multi-dimensional user behavior data from a log file corresponding to a service system, wherein the user behavior data is obtained by the service system through dimensional filtering of collected original data.
[0045] The service system here is a system related to the actual service that stores user behavior data in various dimensions, and user behavior data in various dimensions can be collected from it. Specifically, the service system can include a MySQL service library, which can include a MySQL table for storing user behavior data of each dimension. Alternatively, the service system uses a service log to record user behavior data of each dimension; in that case, all the data recorded in the log file of the service system is obtained and treated as the original data, and the original data is then filtered to obtain user behavior data in a plurality of dimensions. Filtering the original data collected by the service system can consist of obtaining the original data corresponding to each dimension, analyzing the data type of the original data, and selecting the data related to user behavior to obtain the user behavior data corresponding to each dimension.
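The log-based filtering described in [0045] can be sketched as follows. This is only a minimal illustration in Python, not the disclosed implementation: it assumes a hypothetical log format of one JSON object per line with `type` and `dimension` fields, and the set of record types treated as user behavior is likewise an assumption.

```python
import json
from collections import defaultdict

# Hypothetical record types treated as "user behavior" (an assumption for illustration).
BEHAVIOR_TYPES = {"click", "order", "view"}


def filter_user_behavior(log_lines, data_source):
    """Group user-behavior records by dimension and tag each with its data source."""
    by_dimension = defaultdict(list)
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed log lines
        if not isinstance(record, dict):
            continue
        if record.get("type") not in BEHAVIOR_TYPES:
            continue  # original data that is not related to user behavior
        dimension = record.get("dimension")
        if dimension is None:
            continue
        by_dimension[dimension].append({**record, "data_source": data_source})
    return dict(by_dimension)


raw_log = [
    '{"type": "order", "dimension": "channel", "value": "app", "qty": 2}',
    '{"type": "heartbeat", "dimension": "channel"}',
    '{"type": "view", "dimension": "region", "value": "east"}',
    "not json",
]
behavior = filter_user_behavior(raw_log, data_source="mysql.orders")
```

Here the heartbeat record and the malformed line are dropped, and the remaining records are grouped under the `channel` and `region` dimensions, each carrying the (hypothetical) data source name.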
[0046] Step 204, creating a target Hudi table matching the data source.
[0047] The target Hudi table is an Apache Hudi table matching the data source. After the multi-dimensional user behavior data is obtained, the target Hudi table matching the data source is created; at this point the target Hudi table is still empty, with no data written. In an embodiment, step 204 includes: creating a MySQL table corresponding to the data source; and creating a corresponding target Hudi table based on the MySQL table.
[0048] To create the target Hudi table, a MySQL table matching the data source can first be created; at this point no data is written to the MySQL table. The corresponding target Hudi table is then created based on the created MySQL table. The MySQL table can be created with a SQL statement, and the target Hudi table can be created based on the MySQL table.
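One way to picture "creating a Hudi table based on a MySQL table" is to derive both table definitions from a single column specification. The Python sketch below only builds the two DDL strings; it does not connect to any system. The column names, table names, and path are hypothetical, and the `WITH` options follow the Hudi Flink SQL connector as commonly documented (`connector`, `path`, `table.type`), which should be checked against the Hudi version in use. The disclosure itself does not state how the Hudi table is derived.

```python
# Hypothetical shared column specification: (name, MySQL type, Flink SQL type).
COLUMNS = [
    ("id", "VARCHAR(64)", "STRING"),
    ("dimension", "VARCHAR(64)", "STRING"),
    ("metric_value", "BIGINT", "BIGINT"),
    ("commit_time", "TIMESTAMP(3)", "TIMESTAMP(3)"),
]


def mysql_ddl(table, columns, key="id"):
    """Build a MySQL CREATE TABLE statement from the shared column spec."""
    cols = ",\n  ".join(f"{name} {mysql_type}" for name, mysql_type, _ in columns)
    return f"CREATE TABLE {table} (\n  {cols},\n  PRIMARY KEY ({key})\n)"


def hudi_ddl(table, columns, path, key="id"):
    """Build a Flink SQL CREATE TABLE statement backed by the Hudi connector."""
    cols = ",\n  ".join(f"{name} {flink_type}" for name, _, flink_type in columns)
    return (
        f"CREATE TABLE {table} (\n  {cols},\n  PRIMARY KEY ({key}) NOT ENFORCED\n)"
        f" WITH (\n  'connector' = 'hudi',\n  'path' = '{path}',\n"
        f"  'table.type' = 'MERGE_ON_READ'\n)"
    )


mysql_sql = mysql_ddl("user_behavior", COLUMNS)
hudi_sql = hudi_ddl("user_behavior_hudi", COLUMNS, "file:///tmp/hudi/user_behavior")
```

Keeping one column specification means the two tables cannot drift apart in schema, which is the property the MySQL-then-Hudi ordering appears to be after.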
[0049] In an embodiment, after step 204, the method includes: obtaining an initial submission frequency corresponding to the target Hudi table; and setting the initial submission frequency to the preset window size to obtain the target submission frequency.
[0050] Specifically, after the target Hudi table is created, its initial submission frequency is set to the preset window size. The preset window size is the statistical window size set in advance for counting user behavior data and can be the size of a time rolling window. Keeping the submission frequency of the target Hudi table consistent with the preset window size helps user behavior data to be written smoothly.
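The alignment of the submission frequency with the window size amounts to overwriting one configuration value with another. A minimal Python sketch, assuming a hypothetical `HudiTableConfig` record that holds the submission (commit) interval in seconds:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class HudiTableConfig:
    """Hypothetical per-table settings; not a real Hudi class."""
    table_name: str
    commit_interval_s: int  # submission frequency, in seconds


def align_commit_interval(config, window_size_s):
    """Return a copy of the config whose commit interval equals the window size."""
    if window_size_s <= 0:
        raise ValueError("window size must be positive")
    return replace(config, commit_interval_s=window_size_s)


initial = HudiTableConfig("user_behavior_hudi", commit_interval_s=60)
target = align_commit_interval(initial, window_size_s=5)
```

The initial configuration is left untouched and a new one is returned, so the initial and target submission frequencies remain distinguishable, as in [0049].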
[0051] Step 206, aggregating the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension.
[0052] Specifically, before writing to the target Hudi table, a lightweight aggregation operation can be performed on the user behavior data according to the preset window size. The aggregation operation can be a statistical calculation over the user behavior data that yields target service data in the target dimension. For example, the commodity order quantity under the target channel can be calculated from the data recorded in the user behavior data.
[0053] In an embodiment, step 206 includes: obtaining the preset window size; when the current window size reaches the preset window size, filtering the multi-dimensional user behavior data to obtain intermediate user behavior data corresponding to the target dimension; and aggregating the intermediate user behavior data to obtain the aggregated target service data corresponding to the target dimension.
[0054] Specifically, to perform the lightweight aggregation operation on the user behavior data according to the preset window size, the window size set in advance is first obtained. The preset window size can be set according to actual service requirements, actual product requirements, or actual application scenarios, and can be the size of a time rolling window. The current window size is then obtained and compared with the preset window size. When the current window size reaches the preset window size, the intermediate user behavior data corresponding to the target dimension is filtered out of the multi-dimensional user behavior data. The target dimension is selected from the plurality of dimensions, and the selection can be made according to actual service requirements, actual product requirements, or actual application scenarios.
[0055] Furthermore, a statistical calculation is performed on the filtered intermediate user behavior data in the target dimension to obtain the aggregated target service data in the target dimension. Specifically, the data related to the target service is obtained from the user behavior data under the target dimension, and a statistical calculation is performed on that data to obtain the target service data under the target dimension.
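The filter-then-aggregate logic of step 206 can be illustrated with a small batch example. The Python sketch below is an assumption-laden stand-in for a streaming rolling window: events carry a hypothetical `ts` timestamp in seconds, each event is assigned to the window that contains it, and a `qty` field is summed per window and dimension value.

```python
from collections import defaultdict


def window_start(ts, window_size_s):
    """Start of the rolling (tumbling) window that contains timestamp ts, in seconds."""
    return ts - (ts % window_size_s)


def aggregate_by_window(events, target_dimension, window_size_s):
    """Sum `qty` per (window start, dimension value) for the target dimension only."""
    totals = defaultdict(int)
    for event in events:
        if event["dimension"] != target_dimension:
            continue  # keep only intermediate data for the target dimension
        key = (window_start(event["ts"], window_size_s), event["value"])
        totals[key] += event["qty"]
    return dict(totals)


events = [
    {"ts": 0, "dimension": "channel", "value": "app", "qty": 2},
    {"ts": 3, "dimension": "channel", "value": "app", "qty": 1},
    {"ts": 4, "dimension": "region", "value": "east", "qty": 7},
    {"ts": 6, "dimension": "channel", "value": "web", "qty": 4},
]
result = aggregate_by_window(events, "channel", window_size_s=5)
# result == {(0, "app"): 3, (5, "web"): 4}
```

With a 5-second window, the two `app` orders fall into the window starting at 0 and are summed, the `region` event is filtered out, and the `web` order lands in the window starting at 5.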
[0056] Step 208, synchronizing the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size.
[0057] Specifically, after the target submission frequency of the target Hudi table is set to be the same as the preset window size, the target service data can be written into the target Hudi table according to the target submission frequency. This keeps statistics and writing consistent: once the target service data has been computed, it can be written into the target Hudi table promptly. For example, if the preset window size is a 5-second time rolling window, the target submission frequency is also 5 seconds; when the target submission frequency is reached, the calculated target service data can be written into the target Hudi table.
[0058] In an embodiment, step 208 includes: obtaining the last submission time; determining a target submission time according to the last submission time and the target submission frequency; and, when the current submission time reaches the target submission time, synchronizing the target service data and the corresponding target submission time to the target Hudi table.
[0059] Specifically, the last submission time of the target Hudi table is obtained from the configuration file corresponding to the target Hudi table, and the next submission time is calculated from the last submission time and the target submission frequency. The next submission time is the target submission time. The current submission time is then compared with the target submission time; if the current submission time has reached the target submission time, the target service data and the corresponding target submission time can be written into the target Hudi table.
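The timing check in [0058] and [0059] can be sketched as below. In this Python illustration a plain list stands in for the target Hudi table, times are integer seconds, and all names are hypothetical; it shows only the "last submission time plus frequency" comparison, not a real Hudi commit.

```python
def next_commit_time(last_commit_time, commit_interval_s):
    """Target submission time = last submission time + submission frequency."""
    return last_commit_time + commit_interval_s


def maybe_commit(table, pending, last_commit_time, commit_interval_s, now):
    """Write pending rows, stamped with the target time, once that time is reached.

    Returns the new last submission time (unchanged if nothing was written).
    """
    target_time = next_commit_time(last_commit_time, commit_interval_s)
    if now < target_time:
        return last_commit_time  # too early: keep buffering
    for row in pending:
        table.append({**row, "commit_time": target_time})
    pending.clear()
    return target_time


table = []  # stand-in for the target Hudi table
pending = [{"dimension": "channel", "value": "app", "order_qty": 3}]

early = maybe_commit(table, pending, last_commit_time=100, commit_interval_s=5, now=103)
rows_after_early = len(table)  # nothing written yet
late = maybe_commit(table, pending, last_commit_time=early, commit_interval_s=5, now=105)
```

The first call happens before the target time of 105 and writes nothing; the second call reaches the target time, so the row is written together with its submission time, which is what later makes time-based queries possible.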
[0060] In an embodiment, after step 208, the method includes: sending a query request to the target Hudi table, wherein the query request carries dimension information to be queried and submission time information to be queried; and searching the target Hudi table, according to the query request, for service data matching the dimension information and the submission time information to be queried.
[0061] Specifically, after the target service data in the user behavior data is written into the target Hudi table, it can be queried from the target Hudi table. A query request is sent to the target Hudi table, carrying the dimension information to be queried and the submission time information to be queried. In other words, as long as the dimension information and submission time information are specified, matching service data can be found in the target Hudi table for subsequent statistical analysis.
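A query that fixes the dimension information and a submission time range can be pictured as a simple filter over stored rows. The Python sketch below again uses a list as a stand-in for the target Hudi table; the row layout and the half-open time range are assumptions made for illustration.

```python
def query(table, dimension, value, commit_time_from, commit_time_to):
    """Return rows for one dimension value with commit_time in [from, to)."""
    return [
        row
        for row in table
        if row["dimension"] == dimension
        and row["value"] == value
        and commit_time_from <= row["commit_time"] < commit_time_to
    ]


table = [
    {"dimension": "channel", "value": "app", "order_qty": 3, "commit_time": 5},
    {"dimension": "channel", "value": "app", "order_qty": 2, "commit_time": 10},
    {"dimension": "channel", "value": "web", "order_qty": 4, "commit_time": 10},
    {"dimension": "region", "value": "east", "order_qty": 7, "commit_time": 10},
]
matches = query(table, "channel", "app", commit_time_from=5, commit_time_to=15)
total = sum(row["order_qty"] for row in matches)  # 3 + 2 = 5
```

Because each row keeps its submission time, results for a longer period can be obtained by summing several stored windows at query time rather than by holding one large window in memory.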
[0062] The above user behavior data writing method obtains multi-dimensional user behavior data, wherein the user behavior data carries a data source; creates a target Hudi table matching the data source; aggregates the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension; and synchronizes the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size. In this way, user behavior data is written into a Hudi table and, by relying on Hudi storage, the problem of insufficient memory caused by Flink storing intermediate state for large windows or large data volumes is solved, which avoids introducing other components as storage media.
[0063] In the prior art, if a user analyzes data based on different dimensions, a plurality of tasks must be created and a plurality of state result sets must be computed, which consumes considerable computing and storage resources. In the present application, as long as the dimension information and submission time information are specified, the matching service data can be found in the target Hudi table, without creating a plurality of tasks that occupy a large amount of computing and storage resources.
[0064] It should be noted that although the steps in the above flowchart are shown in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict restriction on the order of these steps, and they can be performed in other orders. In addition, at least some of the steps in the drawings can include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same time but can be executed at different times, and they are not necessarily executed sequentially but can be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
[0065] In an embodiment, as shown in Figure 3, a user behavior data writing apparatus comprises an obtaining module 302, a creating module 304, an aggregating module 306, and a synchronizing module 308, wherein:
[0066] The obtaining module 302 is configured to obtain multi-dimensional user behavior data, wherein the user behavior data carries a data source.
[0067] The creating module 304 is configured to create a target Hudi table matching the data source.
[0068] The aggregating module 306 is configured to aggregate the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension.
[0069] The synchronizing module 308 is configured to synchronize the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size.
[0070] In an embodiment, the obtaining module 302 obtains the multi-dimensional user behavior data from a log file corresponding to a service system, wherein the user behavior data is obtained by the service system through dimensional filtering of collected original data.
[0071] In an embodiment, the creating module 304 creates a MySQL table corresponding to the data source and creates a corresponding target Hudi table based on the MySQL table.
[0072] In an embodiment, the user behavior data writing apparatus 300 obtains an initial submission frequency corresponding to the target Hudi table and sets the initial submission frequency to the preset window size to obtain the target submission frequency.
[0073] In an embodiment, the aggregating module 306 obtains the preset window size; when the current window size reaches the preset window size, it filters the multi-dimensional user behavior data to obtain intermediate user behavior data corresponding to the target dimension, and aggregates the intermediate user behavior data to obtain the aggregated target service data corresponding to the target dimension.
[0074] In an embodiment, the synchronizing module 308 obtains the last submission time and determines a target submission time according to the last submission time and the target submission frequency; when the current submission time reaches the target submission time, it synchronizes the target service data and the corresponding target submission time to the target Hudi table.
[0075] In an embodiment, the user behavior data writing apparatus 300 sends a query request to the target Hudi table, wherein the query request carries dimension information to be queried and submission time information to be queried, and searches the target Hudi table, according to the query request, for service data matching the dimension information and the submission time information to be queried.
[0076] For the specific limitations of the user behavior data writing apparatus, refer to the limitations of the user behavior data writing method above, which are not repeated here. Each module of the above apparatus can be implemented fully or partly in software, hardware, or a combination thereof. The modules can be embedded in, or independent of, the processor of the computer device, and can be stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
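As a software-only reading of the module structure, the four modules can be modeled as callables wired together by a small coordinating class. The Python sketch below is hypothetical throughout: the class name, the function signatures, and the dictionary used as a table stand-in are not taken from the disclosure, and the stand-in aggregation ignores time for brevity.

```python
class BehaviorWriter:
    """Wires four hypothetical callables together, mirroring modules 302 to 308."""

    def __init__(self, obtain, create, aggregate, synchronize):
        self.obtain = obtain
        self.create = create
        self.aggregate = aggregate
        self.synchronize = synchronize

    def run(self, window_size_s):
        data, source = self.obtain()
        table = self.create(source)
        service_data = self.aggregate(data, window_size_s)
        # The submission frequency is set equal to the window size.
        self.synchronize(table, service_data, commit_interval_s=window_size_s)
        return table


def obtain():
    return [{"channel": "app", "qty": 2}, {"channel": "app", "qty": 1}], "mysql.orders"


def create(source):
    return {"source": source, "rows": [], "commit_interval_s": None}


def aggregate(data, window_size_s):
    # Stand-in aggregation: treats all records as one window and sums qty per channel.
    totals = {}
    for record in data:
        totals[record["channel"]] = totals.get(record["channel"], 0) + record["qty"]
    return totals


def synchronize(table, service_data, commit_interval_s):
    table["commit_interval_s"] = commit_interval_s
    table["rows"].extend(sorted(service_data.items()))


writer = BehaviorWriter(obtain, create, aggregate, synchronize)
out = writer.run(window_size_s=5)
```

Passing the modules in as callables keeps each one replaceable, which matches the statement that the modules can be realized in software, hardware, or a combination.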
[0077] In an embodiment, a computer device is provided. The computer device can be a server, and its internal structure can be as shown in Figure 4. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities.
The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program is executed by the processor to implement a user behavior data writing method.
[0078] In an embodiment, a computer device is provided. The computer device can be a server, and its internal structure can be as shown in Figure 5. The computer device includes a processor, a memory, a network interface, a display screen, and an input apparatus connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program is executed by the processor to implement a user behavior data writing method. The display screen of the computer device can be a liquid crystal display screen or an electronic ink display screen. The input apparatus of the computer device can be a touch layer covering the display screen; keys, a trackball, or a touchpad set on the housing of the computer device; or an external keyboard, touchpad, or mouse.
[0079] Those skilled in the art will understand that the structures shown in Figure 4 and Figure 5 are only partial structural diagrams related to the solution of this application and do not limit the computer device to which the solution is applied. A specific computer device can include more or fewer components than shown in the figures, combine some components, or have a different arrangement of components.
[0080] In an embodiment, a computer device is provided, including a memory, a processor and a computer program stored in the memory and runnable on the processor, wherein the processor achieves the following steps when executing the computer program:
obtaining multi-dimensional user behavior data, wherein the user behavior data carries a data source;
creating a target Hudi table matching the data source;
aggregating the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension; and
synchronizing the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size.
[0081] In an embodiment, the processor performs the following steps when executing the computer program: obtaining the multi-dimensional user behavior data from a log file corresponding to a service system, wherein the user behavior data is obtained by the service system through dimensional filtering of collected original data.
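As a minimal sketch of this step, assume the service system writes one JSON object per log line. The field names (`user_id`, `event`, `page`, `ts`), the retained dimension set and the function name are hypothetical; the description does not specify a log format. The sketch combines the dimensional filtering with the reading of the log so that each record ends up carrying its data source.

```python
import json
from typing import Iterable, Iterator

# Hypothetical dimensions that survive filtering; all other fields are dropped.
KEPT_DIMENSIONS = ("user_id", "event", "page", "ts")


def read_behavior_records(lines: Iterable[str], source: str) -> Iterator[dict]:
    """Parse JSON log lines and keep only the configured dimensions.

    Each yielded record carries its data source. Blank or malformed
    lines are skipped rather than raising.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            raw = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(raw, dict):
            continue
        record = {key: raw[key] for key in KEPT_DIMENSIONS if key in raw}
        record["source"] = source
        yield record
```

Because the function is a generator, it can be fed a file object directly and processes the log one line at a time.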
[0082] In an embodiment, the processor performs the following steps when executing the computer program: creating a MySql table corresponding to the data source, and creating a corresponding target Hudi table based on the MySql table.
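The description does not state how the Hudi table is derived from the MySql table. One possible approach, sketched below under the assumption that the Hudi table is declared through Flink SQL with the Hudi connector, is to generate the Hudi DDL from the MySql column list. The type mapping is deliberately simplified, and the function name, table name, columns and storage path are hypothetical.

```python
from typing import List, Tuple

# Simplified, hypothetical mapping from MySQL column types to Flink SQL types.
MYSQL_TO_FLINK = {
    "BIGINT": "BIGINT",
    "INT": "INT",
    "VARCHAR": "STRING",
    "DATETIME": "TIMESTAMP(3)",
}


def hudi_ddl_from_mysql(
    table: str, columns: List[Tuple[str, str]], key: str, path: str
) -> str:
    """Build a Flink SQL CREATE TABLE statement for a Hudi table.

    `columns` is a list of (name, mysql_type) pairs describing the MySql table.
    An unmapped MySQL type raises KeyError.
    """
    defs = []
    for name, mysql_type in columns:
        base = mysql_type.split("(")[0].strip().upper()  # VARCHAR(64) -> VARCHAR
        defs.append(f"  `{name}` {MYSQL_TO_FLINK[base]}")
    defs.append(f"  PRIMARY KEY (`{key}`) NOT ENFORCED")
    body = ",\n".join(defs)
    return (
        f"CREATE TABLE `{table}` (\n{body}\n) WITH (\n"
        f"  'connector' = 'hudi',\n"
        f"  'path' = '{path}',\n"
        f"  'table.type' = 'MERGE_ON_READ'\n)"
    )
```

The choice of `MERGE_ON_READ` here is an assumption made for illustration; the embodiment does not name a table type.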
[0083] In an embodiment, the processor performs the following steps when executing the computer program: obtaining an initial submission frequency corresponding to the target Hudi table, and setting the initial submission frequency to the preset window size to obtain the target submission frequency.
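One way to read this constraint is that each submission then carries exactly one completed aggregation window. The sketch below is an illustrative interpretation rather than a statement from the description: it counts how many tumbling windows close between consecutive submissions, with all times in seconds. The function name and parameters are hypothetical.

```python
def windows_closed_per_commit(window_size: int, commit_interval: int, horizon: int) -> list:
    """Count how many tumbling windows close between consecutive commits.

    Windows close at multiples of `window_size`; commits fire at multiples
    of `commit_interval`. Both are considered up to and including `horizon`.
    """
    counts = []
    previous_commit = 0
    for commit_time in range(commit_interval, horizon + 1, commit_interval):
        closed = sum(
            1
            for end in range(window_size, horizon + 1, window_size)
            if previous_commit < end <= commit_time
        )
        counts.append(closed)
        previous_commit = commit_time
    return counts
```

With a 60-second window and a 60-second submission interval, every submission sees exactly one closed window. With a 60-second window and a 90-second interval, submissions alternate between one and two closed windows, so the written data no longer lines up one-to-one with the windows.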
[0084] In an embodiment, the processor performs the following steps when executing the computer program: obtaining a preset window size; when the current window size reaches the preset window size, obtaining intermediate user behavior data corresponding to the target dimension by filtering the multi-dimensional user behavior data; and aggregating the intermediate user behavior data corresponding to the target dimension to obtain target service data corresponding to the aggregated target dimension.
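A minimal sketch of the filter-then-aggregate step for a single tumbling window follows. It assumes each record is a dictionary with a numeric `ts` field and that the aggregation is a simple count per value of the target dimension; both are assumptions, since the description does not fix the aggregation function. The window bounds are passed in explicitly rather than tracked by a streaming engine.

```python
from collections import defaultdict


def aggregate_window(records, target_dimension, window_start, window_size):
    """Count records per value of `target_dimension` inside one tumbling window.

    Only records whose `ts` falls in [window_start, window_start + window_size)
    and that actually carry the target dimension are aggregated.
    """
    window_end = window_start + window_size
    counts = defaultdict(int)
    for record in records:
        if not (window_start <= record["ts"] < window_end):
            continue  # outside the current window
        if target_dimension not in record:
            continue  # filtered out: no value for the target dimension
        counts[record[target_dimension]] += 1
    return dict(counts)
```

For example, aggregating on a hypothetical `page` dimension yields one count per page for the window, which corresponds to the target service data in the embodiment.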
[0085] In an embodiment, the processor performs the following steps when executing the computer program: obtaining the last submission time; determining a target submission time according to the last submission time and the target submission frequency; and, when the current submission time reaches the target submission time, synchronizing the target service data and the corresponding target submission time to the target Hudi table.
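The timing logic can be sketched as follows, assuming times are integer seconds and that the target submission time is the last submission time plus the target submission frequency. The class name and its fields are hypothetical, and an in-memory list stands in for the target Hudi table.

```python
from dataclasses import dataclass, field


@dataclass
class CommitScheduler:
    """Decide when aggregated data should be written to the target table."""

    frequency: int        # target submission frequency, in seconds
    last_commit: int = 0  # last submission time, in seconds
    committed: list = field(default_factory=list)  # stand-in for the Hudi table

    def target_time(self) -> int:
        """Target submission time derived from the last one and the frequency."""
        return self.last_commit + self.frequency

    def maybe_commit(self, now: int, service_data: dict) -> bool:
        """Write `service_data` with its submission time once `now` reaches the target."""
        target = self.target_time()
        if now < target:
            return False
        self.committed.append({"commit_time": target, "data": service_data})
        self.last_commit = target
        return True
```

Storing the submission time alongside the data is what later allows service data to be looked up by submission time.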
[0086] In an embodiment, the processor performs the following steps when executing the computer program: sending a query request to the target Hudi table, wherein the query request carries dimension information to be queried and submission time information to be queried; and searching the target Hudi table, according to the query request, for service data matching the dimension information to be queried and the submission time information to be queried.
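The lookup can be illustrated with a small in-memory stand-in for the table. The row layout (`dimension`, `commit_time`, `value`) and the function name are hypothetical; a real deployment would issue the equivalent filter through a query engine that reads the Hudi table.

```python
def query_service_data(table, dimension, commit_time):
    """Return rows matching both the dimension and the submission time.

    `table` is a list of dicts with `dimension`, `commit_time` and `value`
    keys, standing in for rows read from the target Hudi table.
    """
    return [
        row
        for row in table
        if row["dimension"] == dimension and row["commit_time"] == commit_time
    ]
```

Both conditions must hold, so a row for the right dimension written at a different submission time is not returned.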
[0087] In an embodiment, a computer readable storage medium is provided, the medium storing a computer program, wherein a processor performs the following steps when executing the computer program: obtaining multi-dimensional user behavior data, wherein the user behavior data carries a data source; creating a target Hudi table matching the data source; aggregating the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension; and synchronizing the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size.
[0088] In an embodiment, the processor performs the following steps when executing the computer program: obtaining the multi-dimensional user behavior data from a log file corresponding to a service system, wherein the user behavior data is obtained by the service system through dimensional filtering of collected original data.
[0089] In an embodiment, the processor performs the following steps when executing the computer program: creating a MySql table corresponding to the data source, and creating a corresponding target Hudi table based on the MySql table.
[0090] In an embodiment, the processor performs the following steps when executing the computer program: obtaining an initial submission frequency corresponding to the target Hudi table, and setting the initial submission frequency to the preset window size to obtain the target submission frequency.
[0091] In an embodiment, the processor performs the following steps when executing the computer program: obtaining a preset window size; when the current window size reaches the preset window size, obtaining intermediate user behavior data corresponding to the target dimension by filtering the multi-dimensional user behavior data; and aggregating the intermediate user behavior data corresponding to the target dimension to obtain target service data corresponding to the aggregated target dimension.
[0092] In an embodiment, the processor performs the following steps when executing the computer program: obtaining the last submission time; determining a target submission time according to the last submission time and the target submission frequency; and, when the current submission time reaches the target submission time, synchronizing the target service data and the corresponding target submission time to the target Hudi table.
[0093] In an embodiment, the processor performs the following steps when executing the computer program: sending a query request to the target Hudi table, wherein the query request carries dimension information to be queried and submission time information to be queried; and searching the target Hudi table, according to the query request, for service data matching the dimension information to be queried and the submission time information to be queried.
[0094] Those skilled in the art can understand that all or part of the procedures of the above-mentioned methods can be performed by computer program instructions through related hardware; the computer program can be stored in a non-volatile computer readable storage medium and, when executed, can include the procedures of the embodiments of the above-mentioned methods. Any reference to the memory, the storage, the database, or other media used in the embodiments provided in the current application can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.
[0095] The technical features of the above-mentioned embodiments can be combined arbitrarily. For conciseness, not all possible combinations of the technical features in the above-mentioned embodiments are described. However, as long as there is no conflict in a combination of these technical features, the combination shall be considered within the scope of this description.
[0096] The above-mentioned embodiments are only several embodiments of this disclosure; their description is relatively specific and detailed, but it shall not be understood as a limitation of the scope of the invention patent. Evidently, those of ordinary skill in the art can make various modifications and variations to the disclosure without departing from the spirit and scope of the disclosure. Therefore, the appended claims are intended to be construed as encompassing the described embodiments and all modifications and variations falling within the scope of the disclosure.

Date Recue/Date Received 2023-02-21

Claims (10)

Claims:
1. A user behavior data writing method, comprising:
obtaining multi-dimensional user behavior data, wherein the user behavior data carries a data source;
creating a target Hudi table matching the data source;
aggregating the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension; and
synchronizing the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size.
2. The method according to claim 1, wherein obtaining the multi-dimensional user behavior data, the user behavior data carrying a data source, comprises:
obtaining the multi-dimensional user behavior data from a log file corresponding to a service system, wherein the user behavior data is obtained by the service system through dimensional filtering of collected original data.
3. The method according to claim 1, wherein creating the target Hudi table matching the data source comprises:
creating a MySql table corresponding to the data source; and
creating a corresponding target Hudi table based on the MySql table.
4. The method according to claim 1, further comprising, after creating the target Hudi table matching the data source:
obtaining an initial submission frequency corresponding to the target Hudi table; and
setting the initial submission frequency to the preset window size to obtain the target submission frequency.
5. The method according to claim 1, wherein aggregating the user behavior data according to the preset window size to obtain the target service data corresponding to the target dimension comprises:
obtaining the preset window size;
when the current window size reaches the preset window size, obtaining intermediate user behavior data corresponding to the target dimension by filtering the multi-dimensional user behavior data; and
aggregating the intermediate user behavior data corresponding to the target dimension to obtain target service data corresponding to the aggregated target dimension.
6. The method according to claim 1, wherein synchronizing the target service data to the target Hudi table according to the target submission frequency, the target submission frequency being the same as the preset window size, comprises:
obtaining the last submission time;
determining a target submission time according to the last submission time and the target submission frequency; and
when the current submission time reaches the target submission time, synchronizing the target service data and the corresponding target submission time to the target Hudi table.
7. The method according to claim 6, further comprising:
sending a query request to the target Hudi table, wherein the query request carries dimension information to be queried and submission time information to be queried; and
searching the target Hudi table, according to the query request, for service data matching the dimension information to be queried and the submission time information to be queried.
8. A user behavior data writing apparatus, comprising:
an obtaining module configured to obtain multi-dimensional user behavior data, wherein the user behavior data carries a data source;
a creating module configured to create a target Hudi table matching the data source;
an aggregating module configured to aggregate the user behavior data according to a preset window size to obtain target service data corresponding to a target dimension; and
a synchronizing module configured to synchronize the target service data to the target Hudi table according to a target submission frequency, wherein the target submission frequency is the same as the preset window size.
9. A computer device, including a memory, a processor and a computer program stored in the memory and runnable on the processor, wherein the processor achieves the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer readable storage medium storing a computer program, wherein the steps of the method of any one of claims 1 to 7 are achieved when the computer program is executed by a processor.

CA3184895A 2021-12-29 2022-12-23 User behavior data writing method and device, computer equipment and storage medium Pending CA3184895A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111641860.X 2021-12-29
CN202111641860.XA CN114461726A (en) 2021-12-29 2021-12-29 User behavior data writing method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CA3184895A1 (en) 2023-06-29

Family

ID=81408226

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3184895A Pending CA3184895A1 (en) 2021-12-29 2022-12-23 User behavior data writing method and device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN114461726A (en)
CA (1) CA3184895A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116126976B (en) * 2023-04-06 2023-07-04 之江实验室 Data synchronization method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN114461726A (en) 2022-05-10

Similar Documents

Publication Publication Date Title
US11106486B2 (en) Techniques to manage virtual classes for statistical tests
CN110765157A (en) Data query method and device, computer equipment and storage medium
CN111723079A (en) Data migration method and device, computer equipment and storage medium
CA3128540C (en) Cache system hotspot data access method, apparatus, computer device and storage medium
US20230144100A1 (en) Method and apparatus for managing and controlling resource, device and storage medium
CN107402863B (en) Method and equipment for processing logs of service system through log system
CA3184895A1 (en) User behavior data writing method and device, computer equipment and storage medium
CA3148489C (en) Method of and device for assessing data query time consumption, computer equipment and storage medium
CN111061758A (en) Data storage method, device and storage medium
CA3157818A1 (en) Method, apparatus, computer device, and storage medium for fusing multi-system multi-store orders
WO2023160398A1 (en) Data processing method and system
CN106649210B (en) Data conversion method and device
WO2020010492A1 (en) Data processing method for numerical control system, computer device, and storage medium
CN109542962B (en) Data processing method, data processing device, computer equipment and storage medium
US9473572B2 (en) Selecting a target server for a workload with a lowest adjusted cost based on component values
JP2018525728A (en) A distributed machine learning analysis framework for analyzing streaming datasets from computer environments
CN110266555A (en) Method for analyzing web site service request
US9405786B2 (en) System and method for database flow management
CN113835953A (en) Statistical method and device of job information, computer equipment and storage medium
WO2022001626A1 (en) Time series data injection method, time series data query method and database system
CN112667682A (en) Data processing method, data processing device, computer equipment and storage medium
CN113268483A (en) Request processing method and device, electronic equipment and storage medium
US20190138931A1 (en) Apparatus and method of introducing probability and uncertainty via order statistics to unsupervised data classification via clustering
WO2019019698A1 (en) Case processing system and method, server and storage medium
US10423578B2 (en) Management of contextual information for data

Legal Events

Date Code Title Description
EEER Examination request

Effective date: 20230919
