CN114296944A - Data processing method, data processing device, electronic device, program product, and storage medium - Google Patents


Publication number
CN114296944A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210003204.5A
Other languages
Chinese (zh)
Inventor
钟子宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd


Abstract

The application provides a data processing method, a data processing device, an electronic device, a computer program product and a computer readable storage medium; the method is applied to the map field, and comprises the following steps: acquiring a first behavior object set of a first time period, and acquiring behavior objects in a target time period, wherein the target time period is adjacent to and behind the first time period; performing descending sorting processing based on object identification on the behavior objects in the target time period to obtain a first descending result; acquiring a first target behavior object of which the object identification is larger than all behavior objects in the first behavior object set from the first descending result; and merging the first target behavior objects which are sequenced according to the first descending result into the first behavior object set to obtain a second behavior object set of a second time period. Through the application, the data processing efficiency can be improved.

Description

Data processing method, data processing device, electronic device, program product, and storage medium
Technical Field
The present application relates to the technology of internet of vehicles and big data calculation, and in particular, to a data processing method, apparatus, electronic device, computer program product, and computer-readable storage medium.
Background
With the development of computer technology, the variety and number of applications keep increasing. An object that uses an application is called a behavior object. A behavior object generates corresponding object behavior data when it performs access or browsing behavior. A computer can acquire all object behavior data up to different time nodes and perform full processing on the acquired data to obtain the behavior object sets accumulated up to those time nodes; these accumulated behavior object sets represent, in real time, how the application is being accessed.
However, the volume of object behavior data grows over time. When the behavior object sets accumulated up to different time nodes are determined in the above manner, the computer has to process a very large volume of object behavior data, so each full processing run takes a long time, and a full run must be executed for every time node, which limits data processing efficiency.
Disclosure of Invention
Embodiments of the present application provide a data processing method, an apparatus, an electronic device, a computer program product, and a computer-readable storage medium, which can improve data processing efficiency and improve data processing resource utilization.
The technical scheme of the embodiment of the application is realized as follows:
an embodiment of the present application provides a data processing method, including:
acquiring a first behavior object set of a first time period, and acquiring behavior objects in a target time period, wherein the target time period is adjacent to and behind the first time period;
performing descending sorting processing based on object identification on the behavior objects in the target time period to obtain a first descending result;
acquiring a first target behavior object of which the object identification is larger than all behavior objects in the first behavior object set from the first descending result;
merging the first target behavior objects which are sequenced according to the first descending result into the first behavior object set to obtain a second behavior object set of a second time period, wherein the second time period comprises the first time period and the target time period.
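The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the claimed implementation: it assumes object identifiers are plain integers, and the function name `merge_target_period` and the sample identifiers are hypothetical.

```python
from typing import Iterable, List


def merge_target_period(first_set: List[int], target_ids: Iterable[int]) -> List[int]:
    """Build the second behavior object set from the first set and a target period."""
    # Deduplicate the target period's identifiers and sort them in descending order
    # (the "first descending result").
    descending = sorted(set(target_ids), reverse=True)
    # Keep only identifiers larger than every identifier already in the first set.
    current_max = max(first_set) if first_set else float("-inf")
    new_objects = [oid for oid in descending if oid > current_max]
    # Merge the new objects in the order given by the descending result.
    return first_set + new_objects


second_set = merge_target_period([101, 102, 105], [103, 107, 105, 106, 107])
print(second_set)  # [101, 102, 105, 107, 106]
```

In this sketch the identifier 103 is not merged because it is not larger than the current maximum 105. Read literally, the comparison step therefore seems to presume that previously unseen objects carry larger identifiers than objects already in the set; that reading is an inference, not something the text states.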
An embodiment of the present application provides a data processing apparatus, including:
an obtaining module, configured to obtain a first behavior object set of a first time period, and obtain a behavior object in a target time period, where the target time period is adjacent to and after the first time period;
the sorting module is used for performing descending sorting processing based on object identification on the behavior objects in the target time period to obtain a first descending result;
a comparison module, configured to obtain, from the first descending result, a first target behavioral object whose object identifier is greater than all behavioral objects in the first behavioral object set;
and a merging module, configured to merge the first target behavior objects sorted according to the first descending result into the first behavior object set to obtain a second behavior object set of a second time period, where the second time period includes the first time period and the target time period.
In the foregoing scheme, when the first time period is an initial time period with an initial time as a starting point, the obtaining module is further configured to: acquiring object behavior data of the initial moment of the first time period; performing object duplicate removal processing on the object behavior data at the initial time to obtain an initial behavior object set at the initial time; determining a first behavior object set of the first time period based on an initial behavior object set of the initial time instant.
In the foregoing solution, the obtaining module is further configured to: performing descending order sorting processing on the behavior objects in the initial behavior object set based on the object identifiers to obtain a first maximum object identifier of the initial behavior object set; performing descending sorting processing on the behavior objects of the initial behavior object set based on the object identifiers to obtain a second descending result; acquiring a second target behavior object of which the object identifier is larger than the first maximum object identifier from the second descending result; and merging the second target behavior object into the initial behavior object set to obtain a first behavior object set of the first time period.
In the foregoing solution, the obtaining module is further configured to: and sequentially inserting the second target behavior objects into the last bit of the descending ordering result of the behavior objects in the initial behavior object set according to the sequence represented by the second descending ordering result to obtain the first behavior object set.
In the foregoing solution, the obtaining module is further configured to: and executing the following processing for each behavior object in the second descending result in sequence according to the sequence represented by the second descending result: comparing the object identifier of the behavior object in the second descending result with the first maximum object identifier; when the comparison result represents that the object identifier of the behavior object in the second descending result is larger than the first maximum object identifier, determining the behavior object in the second descending result as the second target behavior object; and when the comparison result represents that the object identifier of the behavior object in the second descending result is not larger than the first maximum object identifier, determining to stop executing the comparison processing.
In the foregoing solution, the obtaining module is further configured to: acquiring object behavior data in the target time period; and carrying out object duplicate removal processing on the object behavior data in the target time period to obtain the behavior object in the target time period.
In the foregoing solution, the comparing module is further configured to: acquiring a second maximum object identifier of the first behavior object set; and acquiring a first target behavior object with the object identifier larger than the second maximum object identifier from the first descending result.
In the foregoing solution, the comparing module is further configured to: and executing the following processing for each behavior object in the first descending result in sequence according to the sequence represented by the first descending result: comparing the object identifier of the behavior object in the first descending result with the second maximum object identifier; when the comparison result represents that the object identifier of the behavior object in the first descending result is larger than the second maximum object identifier, determining the behavior object in the first descending result as the first target behavior object; and when the comparison result represents that the object identifier of the behavior object in the first descending result is not larger than the second maximum object identifier, determining to stop executing the comparison processing.
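The comparison loop described above stops at the first object whose identifier is not larger than the maximum, which is valid because the input is already in descending order. A minimal sketch, assuming integer identifiers and a hypothetical function name `select_new_objects`:

```python
from typing import List


def select_new_objects(descending_ids: List[int], max_id: int) -> List[int]:
    """Scan a descending result and stop at the first identifier not above max_id."""
    selected: List[int] = []
    for oid in descending_ids:
        if oid > max_id:
            # Larger than the current maximum: this is a target behavior object.
            selected.append(oid)
        else:
            # Every later identifier is smaller still, so the comparison can stop.
            break
    return selected


print(select_new_objects([9, 8, 5, 3], max_id=5))  # [9, 8]
```

Because of the early stop, the number of comparisons is bounded by the number of selected objects plus one rather than by the size of the descending result.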
In the above solution, the apparatus further comprises: a cumulative statistics module to: acquiring the accumulated access volume and the accumulated object quantity of the first time period, and acquiring the accumulated access volume of the target time period; superposing the number of the first target behavior objects to the accumulated object number of the first time period to obtain the accumulated object number of the second time period; and overlapping the accumulated access amount of the target time period to the accumulated access amount of the first time period to obtain the accumulated access amount of the second time period.
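The cumulative statistics update reduces to two additions. The sketch below assumes the counts are kept as integers; the class `CumulativeStats` and the function `advance` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CumulativeStats:
    total_objects: int  # accumulated object quantity
    total_visits: int   # accumulated access volume


def advance(stats: CumulativeStats, new_object_count: int, period_visits: int) -> CumulativeStats:
    """Return the statistics of the second time period from those of the first."""
    return CumulativeStats(
        # Add the number of first target behavior objects to the object count.
        total_objects=stats.total_objects + new_object_count,
        # Add the target period's access volume to the accumulated access volume.
        total_visits=stats.total_visits + period_visits,
    )


print(advance(CumulativeStats(total_objects=3, total_visits=40), 2, 15))
```

Neither update revisits the data of the first time period, which is the point of the incremental scheme.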
In the foregoing scheme, when the first time period is an initial time period with an initial time as a starting point, the cumulative statistics module is further configured to: acquiring object behavior data of the initial moment of the first time period; performing object duplicate removal processing on the object behavior data at the initial time to obtain an initial behavior object set at the initial time; mapping and reducing the behavior objects in the initial behavior object set to obtain initial accumulated visit amount and initial accumulated object number at the initial moment; and determining the accumulated visit amount and the accumulated object number of the first time period based on the initial accumulated visit amount and the initial accumulated object number.
In the foregoing solution, the cumulative statistics module is further configured to: performing descending order sorting processing on the behavior objects in the initial behavior object set based on the object identifiers to obtain a first maximum object identifier of the initial behavior object set; performing descending sorting processing on the behavior objects of the initial behavior object set based on the object identifiers to obtain a second descending result; acquiring a second target behavior object of which the object identifier is larger than the first maximum object identifier from the second descending result; superposing the number of the second target behavior objects to the initial accumulated object number to obtain the accumulated object number of the first time period; and overlapping the accumulated access volume of the first time period to the initial accumulated access volume to obtain the accumulated access volume of the first time period.
An embodiment of the present application provides an electronic device, including:
a memory for storing executable instructions;
and the processor is used for realizing the data processing method provided by the embodiment of the application when the processor executes the executable instructions stored in the memory.
The embodiment of the application provides a computer-readable storage medium, which stores executable instructions and is used for realizing the data processing method provided by the embodiment of the application when being executed by a processor.
The embodiment of the application has the following beneficial effects:
According to the embodiments of the present application, a first behavior object set of a first time period and the behavior objects in a target time period are acquired; the behavior objects in the target time period are sorted in descending order by object identifier to obtain a first descending result; first target behavior objects whose object identifiers are larger than those of all behavior objects in the first behavior object set are acquired from the first descending result; and the first target behavior objects, ordered according to the first descending result, are merged into the first behavior object set to obtain a second behavior object set of a second time period, where the second time period includes the first time period and the target time period. When the second behavior object set is obtained, big data statistical processing over the full data of the second time period is no longer needed, which improves computational efficiency and reduces the computing resources occupied. Moreover, because only the behavior objects in the target time period need to be sorted in descending order, the computing resources that would otherwise be spent sorting the full set of behavior objects in every time period are saved.
Drawings
Fig. 1 is a flowchart illustrating the mapping and reduction processing of object behavior data provided in the related art;
Fig. 2 is a flowchart illustrating data statistics of object behavior data provided in the related art;
Fig. 3 is a block diagram of a data processing system according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application;
Figs. 5A-5C are schematic flowcharts of data processing methods provided by embodiments of the present application;
Fig. 6 is a schematic flowchart of data statistics of object behavior data according to an embodiment of the present application.
Detailed Description
In order to make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be considered as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
In the following description, the terms "first/second/third" are used only to distinguish similar objects and do not denote a particular order. Where permitted, the specific order or sequence of "first/second/third" may be interchanged, so that the embodiments of the application described herein can be practiced in an order other than that shown or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before the embodiments of the present application are described in further detail, the terms and expressions used in the embodiments are explained; the following explanations apply to them.
1) Deduplication statistics: repeated access records of the same behavior object in a sequence are counted only once.
2) Aggregation operation: an aggregate function performs a calculation on a set of values and returns a single value; aggregate functions ignore null values.
3) Full table: all of the latest state data is stored every day.
4) Incremental table: incremental data is the data newly added since the last export. An incremental table records only the data added each time rather than the full data; it reports only changes, and nothing needs to be reported when there is no change.
5) Data skew: in a data set processed in parallel, one part holds significantly more data than the other parts, so that the processing speed of that part becomes the bottleneck of processing the whole data set.
6) Number of visitors (UV, Unique Visitor): the total number of people who access a product within one statistical period; obtaining the number of visitors requires deduplication.
7) Cumulative number of visitors (TUV): the total number of people who access a product across all statistical periods; obtaining the cumulative number of visitors requires deduplication.
8) Page views (PV, Page View): the total number of pages browsed within one statistical period; obtaining the page views does not require deduplication.
9) Cumulative page views (TPV, Total Page View): the total number of pages browsed across all statistical periods; obtaining the cumulative page views does not require deduplication.
10) Access object behavior data set (U): the object behavior data set formed by deduplicating the people who access a product within one statistical period.
11) Total access object behavior data set (TU, Total vision User): the object behavior data set formed by deduplicating the people who access a product across all statistical periods.
12) Sorting and insertion (interpolation): a user set or data sequence is first sorted from largest to smallest (or from smallest to largest), and a position in the sorted order is then selected for insertion.
13) Intelligent Vehicle Infrastructure Cooperative System (IVICS), called the vehicle-road cooperative system for short: a development direction of Intelligent Transportation Systems (ITS). The vehicle-road cooperative system uses advanced wireless communication, new-generation internet, and other technologies to implement dynamic, real-time vehicle-to-vehicle and vehicle-to-road information exchange in all directions. On the basis of full-time, dynamic traffic information acquisition and fusion, it carries out active vehicle safety control and cooperative road management, achieves effective cooperation among people, vehicles, and roads, ensures traffic safety, and improves traffic efficiency, thereby forming a safe, efficient, and environmentally friendly road traffic system.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of the mapping and reduction processing of object behavior data provided in the related art. Distributed computation in the related art adopts a distributed iteration mode: an initial time is selected as point 0, and all historical data before the initial time is aggregated (e.g., summed) in a distributed manner to serve as the initial values of the aggregation. For example, one month after a function goes online is taken as the initial time. The MapReduce platform in the related art counts the number of deduplicated users in that month as the initial value $TUV_0$, and counts the total page views of all users in that month as the initial value $TPV_0$. All historical object behavior data before the initial time is taken as the full object behavior data of the initial time and sorted in ascending order of object identifier to obtain the initial full user set $TU_0$.
Referring to Fig. 2, Fig. 2 is a flowchart illustrating data statistics of object behavior data provided in the related art. For the data at time $T_1$, the object behavior data at time $T_1$ is deduplicated by object identifier to obtain the object behavior data $U_1$ at time $T_1$, which is matched against the initial full user set $TU_0$ to obtain the set of newly added users. The set of newly added users is added to the initial full user set $TU_0$ to obtain the full user set $TU_1$ at time $T_1$. At the same time, the number of newly added users $UV_1^{new}$ at time $T_1$ is calculated to obtain the cumulative number of users at time $T_1$, $TUV_1 = TUV_0 + UV_1^{new}$, and the number of newly added visits $PV_1^{new}$ at time $T_1$ is calculated to obtain the cumulative access volume at time $T_1$, $TPV_1 = TPV_0 + PV_1^{new}$. By analogy, for time $T_n$ ($n \geq 1$), the full user set $TU_n$, the cumulative number of users $TUV_n = TUV_{n-1} + UV_n^{new}$, and the cumulative access volume $TPV_n = TPV_{n-1} + PV_n^{new}$ are obtained, where $TUV_{n-1}$ represents the cumulative number of users at time $T_{n-1}$ and $TPV_{n-1}$ represents the cumulative access volume at time $T_{n-1}$.
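The related-art iteration described above can be illustrated with a small in-memory sketch. It is not the MapReduce job itself: a Python set difference stands in for the distributed matching step, each record is assumed to be one visit tagged with an integer object identifier, and the function name `related_art_step` is hypothetical.

```python
from typing import List, Set, Tuple


def related_art_step(
    full_set: Set[int], tuv: int, tpv: int, period_records: List[int]
) -> Tuple[Set[int], int, int]:
    """Advance the full user set, cumulative users, and cumulative visits by one period."""
    # Deduplicate the period's records by object identifier.
    period_users = set(period_records)
    # Match against the previous full user set to find the newly added users.
    new_users = period_users - full_set
    # Cumulative users grow by the newly added users; visits grow by every record.
    return full_set | new_users, tuv + len(new_users), tpv + len(period_records)


full_set, tuv, tpv = related_art_step({1, 2}, tuv=2, tpv=10, period_records=[2, 3, 3, 4])
print(sorted(full_set), tuv, tpv)  # [1, 2, 3, 4] 4 14
```

The matching step touches the whole previous full user set in every period, which is the cost the text attributes to this approach.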
In the related art, when the distributed calculation scheme calculates TUV and TPV in each statistical period, a Left Join operation must be performed between the user table U of the current statistical period and the full table TU of the previous statistical period. However, when the user table U of the current statistical period is large, the Left Join table operation still suffers from data skew, which limits data processing efficiency. In addition, in the related art, the newly added UV and newly added PV in each statistical period still have to be calculated in MapReduce mode, and counting them requires a Left Join operation based on the MapReduce framework to match the newly added user set NU. Therefore, MapReduce still needs to be invoked in every statistical period, which consumes a large amount of data processing resources.
Embodiments of the present application provide a data processing method, an apparatus, an electronic device, a computer program product, and a computer-readable storage medium, which can improve data processing efficiency and improve data processing resource utilization, and an exemplary application of the electronic device provided in the embodiments of the present application is described below. In the following, an exemplary application will be explained when the device is implemented as a server.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a data processing system according to an embodiment of the present application, in order to support an electronic map application, a terminal 400 is connected to a server 200 through a network 300, where the network 300 may be a wide area network or a local area network, or a combination of the two, and data transmission is implemented using a wireless link.
In an application scenario, the function of the electronic map is implemented based on the server 200 and the terminal 400. While a user accesses the electronic map using the terminal 400, in response to the terminal 400 receiving a behavior operation of an object, the terminal 400 acquires behavior object data of a first time period and behavior object data within a target time period and sends the data to the server 200. The server 200 performs the data processing method provided by the embodiments of the present application on the received data to obtain a behavior object set of a second time period, where the second time period includes the first time period and the target time period, and stores the behavior object set of the second time period in the database 500.
In other embodiments, when the data processing method provided in this embodiment is implemented by a terminal alone, in the application scenario described above, in a process that a user accesses an electronic map using the terminal, in response to the terminal 400 receiving a behavior operation of an object, the terminal acquires behavior object data in a first time period and behavior object data in a target time period, implements the data processing method provided in this embodiment on the acquired data, obtains a behavior object set in a second time period, where the second time period includes the first time period and the target time period, and stores the behavior object set in the second time period in a database.
In some embodiments, the server 200 may be an independent physical server, may also be a server cluster or a distributed system formed by a plurality of physical servers, and may also be a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, and a big data and artificial intelligence platform. The terminal 400 may include, but is not limited to, a mobile phone, a computer, an intelligent voice interaction device, an intelligent appliance, a vehicle-mounted terminal, and the like, and the terminal 400 may be provided with a client, for example, but not limited to, an instant messaging client, a learning client, a game client, a map client, a client of a road toll system, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the embodiment of the present application is not limited.
Next, a structure of an electronic device for implementing the data processing method provided in the embodiment of the present application is described, and as described above, the electronic device provided in the embodiment of the present application may be the server 200 in fig. 3. Referring to fig. 4, fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application, and the server 200 shown in fig. 4 includes: at least one processor 210, memory 250, at least one network interface 220. The various components in server 200 are coupled together by a bus system 240. It is understood that the bus system 240 is used to enable communications among the components. The bus system 240 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 240 in fig. 4.
The Processor 210 may be an integrated circuit chip having Signal processing capabilities, such as a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like, wherein the general purpose Processor may be a microprocessor or any conventional Processor, or the like.
The memory 250 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 250 optionally includes one or more storage devices physically located remotely from processor 210.
The memory 250 includes volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read Only Memory (ROM), and the volatile memory may be a Random Access Memory (RAM). The memory 250 described in embodiments herein is intended to comprise any suitable type of memory.
In some embodiments, memory 250 is capable of storing data, examples of which include programs, modules, and data structures, or a subset or superset thereof, to support various operations, as exemplified below.
An operating system 251, including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, and a driver layer; a network communication module 252 for reaching other computing devices via one or more (wired or wireless) network interfaces 220, exemplary network interfaces 220 including Bluetooth, Wireless Fidelity (WiFi), and Universal Serial Bus (USB), among others.
In some embodiments, the data processing apparatus provided in the embodiments of the present application may be implemented in software, and fig. 4 shows a data processing apparatus 255 stored in the memory 250, which may be software in the form of programs and plug-ins, and includes the following software modules: an obtaining module 2551, a sorting module 2552, a comparing module 2553, a merging module 2554 and a cumulative statistics module 2555, which are logical, so that they can be arbitrarily combined or further split according to the implemented functions, which will be described below.
In some embodiments, the terminal or the server may implement the data processing method provided by the embodiments of the present application by running a computer program. For example, the computer program may be a native program or a software module in an operating system; it may be a native application (APP), i.e., a program that must be installed in an operating system to run, such as a map APP; it may be an applet, i.e., a program that only needs to be downloaded into a browser environment to run; or it may be an applet that can be embedded into any APP. In general, the computer program described above may be any form of application, module, or plug-in.
Next, a description will be given of a data processing method provided in the embodiment of the present application, and in actual implementation, the data processing method provided in the embodiment of the present application may be implemented by the terminal 400 shown in fig. 3, the in-vehicle terminal mounted with the electronic map, or the server 200, or may be implemented by the terminal 400, the in-vehicle terminal mounted with the electronic map, or the server 200 in cooperation.
The following description will be given taking as an example a case where the terminal 400 separately executes the data processing method provided in the embodiment of the present application. Referring to fig. 5A, fig. 5A is a schematic flowchart of a data processing method provided in an embodiment of the present application, and will be described with reference to the steps shown in fig. 5A.
In step 101, a first behavior object set of a first time period is obtained, and behavior objects within a target time period are obtained.
As an example, the target time period is adjacent to and after the first time period.
In some embodiments, when the first time period is an initial time period starting from an initial time, the step 101 of acquiring the first behavior object set of the first time period may be implemented by the following technical solutions: acquiring object behavior data of an initial moment of a first time period; carrying out object duplicate removal processing on the object behavior data at the initial moment to obtain an initial behavior object set at the initial moment; a first set of behavioral objects for a first time period is determined based on an initial set of behavioral objects at an initial time instance.
As an example, the object behavior data at time $T_0$ is taken as the initial input, where $T_0$ is the initial time of the whole data processing flow; for example, 00:00 on the 30th day after the function goes online is taken as the initial time. A distributed computing approach is used to aggregate all historical data before the initial time: on a MapReduce platform, mapping and reduction processing count the number of all users within the month, which is taken as the initial accumulated object number $TUV_0$, and the sum of the browsing volumes of all users within the month, which is taken as the initial accumulated access volume $TPV_0$. All object behavior data before the initial time is taken as the full object behavior data at the initial time and is sorted from small to large by object identifier to obtain the full initial behavior object set $TU_0$.
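A minimal single-machine sketch of this initialization, assuming each access record is reduced to its object identifier. The function name `build_initial_state` and the sample data are hypothetical, and `collections.Counter` merely stands in for the distributed mapping and reduction steps. The sketch stores the identifiers in descending order, the order that the later comparison steps rely on; that choice is an assumption of the sketch.

```python
from collections import Counter


def build_initial_state(access_records):
    """Aggregate all records seen up to the initial time T0.

    access_records: iterable of object identifiers, one entry per access.
    Returns (TU0, TUV0, TPV0).
    """
    # "Map": emit one count per record; "reduce": sum the counts per identifier.
    visits_per_object = Counter(access_records)
    tu0 = sorted(visits_per_object, reverse=True)  # deduplicated, descending
    tuv0 = len(tu0)                                # initial accumulated object number
    tpv0 = sum(visits_per_object.values())         # initial accumulated access volume
    return tu0, tuv0, tpv0


tu0, tuv0, tpv0 = build_initial_state([3, 1, 3, 2, 1, 3])
print(tu0, tuv0, tpv0)
```

This full aggregation and sort is the only step in the flow that touches all historical data.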
In some embodiments, determining the first behavior object set of the first time period based on the initial behavior object set at the initial time may be implemented as follows: performing descending sorting based on object identifier on the behavior objects in the initial behavior object set to obtain a first maximum object identifier of the initial behavior object set; performing descending sorting based on object identifier on the behavior objects in the first time period to obtain a second descending result; acquiring, from the second descending result, a second target behavior object whose object identifier is greater than the first maximum object identifier; and merging the second target behavior object into the initial behavior object set to obtain the first behavior object set of the first time period.
In some embodiments, merging the second target behavior object into the initial behavior object set to obtain the first behavior object set of the first time period may be implemented as follows: sorting the behavior objects in the initial behavior object set in descending order; and sequentially inserting the second target behavior objects, in the order represented by the second descending result, at the last position of the descending ordering result of the behavior objects in the initial behavior object set to obtain the first behavior object set.
As an example, the second target behavior objects merged into the initial behavior object set are obtained from the second descending result. If there are multiple second target behavior objects, their order is therefore given by the second descending result. The second target behavior objects are inserted directly and sequentially, in the order represented by the second descending result, at the last position of the descending ordering result of the behavior objects in the initial behavior object set, which is equivalent to splicing the two ordering results in sequence. The behavior objects in the first behavior object set thus carry an order attribute based on the object identifier.
In some embodiments, the obtaining of the second target behavior object whose object identifier is greater than the first maximum object identifier from the second descending result may be implemented by the following technical solutions: and executing the following processing for each behavior object in the second descending result in turn according to the sequence represented by the second descending result: comparing the object identifier of the behavior object in the second descending result with the first maximum object identifier; when the comparison result represents that the object identification of the behavior object in the second descending result is larger than the first maximum object identification, determining the behavior object in the second descending result as a second target behavior object; and when the comparison result represents that the object identifier of the behavior object in the second descending result is not larger than the first maximum object identifier, determining to stop executing the comparison processing.
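The comparison loop described above can be sketched as follows, assuming object identifiers are comparable integers. The function name `select_target_objects` and the sample values are hypothetical. Because the input is already in descending order, the first identifier that is not greater than the maximum ends the scan.

```python
def select_target_objects(descending_ids, max_known_id):
    """Collect identifiers greater than max_known_id from a descending list."""
    targets = []
    for object_id in descending_ids:
        if object_id <= max_known_id:
            break  # all remaining identifiers are smaller still, so stop comparing
        targets.append(object_id)
    return targets


second_descending_result = sorted({5, 9, 2, 7}, reverse=True)  # [9, 7, 5, 2]
targets = select_target_objects(second_descending_result, max_known_id=6)
print(targets)
```

The early stop means the number of comparisons is one more than the number of selected objects, rather than the size of the whole period.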
As an example, the behavior objects in the initial behavior object set $TU_0$ are sorted in descending order by object identifier to obtain the first maximum object identifier $u_0^{\max}$ of the initial behavior object set, and $u_0^{\max}$ is used as the comparison parameter. The behavior objects $U_1$ in the first time period are then input. Since the first time period is the initial time period starting from the initial time, it can be regarded as the first single cycle. $U_1$ is sorted in descending order by object identifier, and the object identifier $u_1$ of each behavior object in $U_1$ is compared, in descending order, with the first maximum object identifier $u_0^{\max}$. When $u_1 > u_0^{\max}$, the behavior object corresponding to $u_1$ is taken as a second target behavior object; the second target behavior objects form the newly added behavior object set $U_1^{new}$ of the first single cycle. When the comparison result indicates that the object identifier of a behavior object in the second descending result is not greater than the first maximum object identifier, the comparison processing stops, that is, no further second target behavior objects are obtained. The second target behavior objects are inserted, in the order represented by the second descending result, at the last position of the descending ordering result of the behavior objects in the initial behavior object set $TU_0$ to obtain the first behavior object set, so the behavior objects in the first behavior object set are also arranged in descending order of object identifier. That is, a Union operation merges the behavior objects corresponding to the selected $u_1$ values, arranged in descending order, with $TU_0$: $TU_1 = TU_0 \cup U_1^{new}$, resulting in the first behavior object set $TU_1$. The first behavior object set $TU_1$ is the full user set as of the first time, which is the end time of the first time period.
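The merge step can be sketched as a plain list concatenation, assuming both inputs are already in descending order. The function name `merge_into_full_set` and the values are hypothetical. The source speaks of inserting at the last position; this sketch places the newly added block ahead of the existing identifiers, because every new identifier is larger than the existing maximum and that placement keeps the combined list in descending order. The placement is an assumption of the sketch.

```python
def merge_into_full_set(full_set_desc, new_objects_desc):
    """Splice two descending lists without re-sorting the full set."""
    if full_set_desc and new_objects_desc:
        # Precondition from the selection step: the smallest new identifier
        # must still exceed the largest identifier already in the full set.
        if new_objects_desc[-1] <= full_set_desc[0]:
            raise ValueError("new identifiers must exceed the current maximum")
    return new_objects_desc + full_set_desc


tu0 = [6, 4, 1]      # full set as of the initial time, descending
u1_new = [9, 7]      # newly added objects of the first single cycle, descending
tu1 = merge_into_full_set(tu0, u1_new)
print(tu1)
```

Since the result is again in descending order, its first element can serve directly as the comparison parameter for the next cycle.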
In some embodiments, the obtaining of the behavior object in the target time period may be implemented by the following technical solutions: acquiring object behavior data in a target time period; and carrying out object duplicate removal processing on the object behavior data in the target time period to obtain the behavior object in the target time period.
As an example, when the first time period is from January 1, 2021 to January 2, 2021 and the target time period is from January 2, 2021 to January 3, 2021, the object behavior data from January 2, 2021 to January 3, 2021 is acquired. Object deduplication is then performed on the object behavior data in the target time period (for example, 5000 access records): since the 5000 access records were produced by 1000 behavior objects, some of which accessed multiple times, object deduplication is required to obtain the behavior objects in the target time period.
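A scaled-down sketch of the deduplication step, assuming each access record is a pair of object identifier and accessed page. The record layout, the function name `behavior_objects_in_period`, and the values are hypothetical.

```python
def behavior_objects_in_period(access_records):
    """Reduce access records to the distinct objects that produced them."""
    return {object_id for object_id, _page in access_records}


records = [
    (101, "/map"),
    (102, "/route"),
    (101, "/search"),
    (103, "/map"),
    (102, "/map"),
]
objects = behavior_objects_in_period(records)
print(len(records), len(objects))
```

Five access records collapse to three behavior objects here, in the same way that 5000 records collapse to 1000 objects in the example above.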
In step 102, the behavior objects in the target time period are subjected to descending order sorting processing based on the object identifiers, so as to obtain a first descending result.
As an example, the target time period is the time period between two time nodes. For example, when data statistics are performed, data is updated and stored at a fixed time interval, and the target time period is $T_n - T_{n-1}$. Illustratively, the first time period $T_{n-1}$ is the time period from the initial time to the first time (the previous data update and storage), the second time period $T_n$ is the time period from the initial time to the second time (the current data update and storage), and the target time period is the time period between the previous data update and storage and the current data update and storage.
In step 103, a first target behavior object with an object identification larger than all behavior objects in the first behavior object set is obtained from the first descending result.
In step 104, the first target behavior objects sorted according to the first descending result are merged into the first behavior object set to obtain a second behavior object set of the second time period.
As an example, the second time period includes the first time period and the target time period.
In some embodiments, merging the first target behavior objects sorted according to the first descending result into the first behavior object set in step 104 may be implemented as follows: sequentially inserting the first target behavior objects, in the order represented by the first descending result, at the last position of the descending ordering result of the behavior objects in the first behavior object set.
As an example, when merging the first target behavior object into the first behavior object set, the first target behavior object is obtained from the first descending result, and therefore, if the number of the first target behavior objects is multiple, the order of the multiple first target behavior objects may be represented by the first descending result, and the behavior objects in the first behavior object set carry order attributes based on the object identifier, and therefore, the behavior objects in the second behavior object set in the second time period also carry order attributes based on the object identifier, and the first target behavior objects are directly and sequentially inserted into the last position of the descending ordering result of the behavior objects in the first behavior object set according to the order represented by the first descending result, which is equivalent to sequentially splicing two ordering results.
As an example, the process of generating the second behavior object set of the second time period $T_n$ based on the first behavior object set of the first time period $T_{n-1}$ can be traced back to the process of generating the behavior object set of the first time period ($T_1$) based on the data at the initial time, and of generating the behavior object set of the second time period ($T_2$) based on the behavior object set of the first time period ($T_1$). When the behavior object set of the first time period (the initial time period starting from the initial time) is generated based on the data at the initial time, sorting must be performed, so the behavior objects in the behavior object set of the first time period ($T_1$) carry an order attribute based on the object identifier. When the behavior object set of the second time period ($T_2$) is generated based on the behavior object set of the first time period ($T_1$), the first descending result of the behavior objects of the target time period is introduced, so the behavior objects in the generated behavior object set of the second time period also carry an order attribute based on the object identifier, and so on. Each time a new target time period is introduced, only the behavior objects in the target time period need to be sorted; the behavior objects in the first time period do not.
Referring to fig. 5B, fig. 5B is a schematic flowchart of a data processing method provided in the embodiment of the present application, and in step 103, a first target behavior object whose object identifier is greater than all behavior objects in the first behavior object set is obtained from the first descending result, which may be implemented by steps 1031 to 1032 in fig. 5B.
In step 1031, the second maximum object identifier of the first behavior object set is obtained.
As an example, when a first time period corresponding to a first behavior object set is an initial time period, behavior objects in the first behavior object set need to be sorted in a descending order to obtain a second maximum object identifier of the first behavior object set, if the first time period corresponding to the first behavior object set further includes a time period after the initial time period, and the first behavior object set carries an order attribute based on the object identifiers, the second maximum object identifier of the first behavior object set may be directly obtained, and if the first behavior object set does not carry the order attribute based on the object identifiers, the behavior objects of the first behavior object set need to be sorted in a descending order based on the object identifiers to obtain the second maximum object identifier of the first behavior object set.
In step 1032, a first target behavior object whose object identifier is greater than the second maximum object identifier is obtained from the first descending result.
In some embodiments, the obtaining of the first target behavior object whose object identifier is greater than the second maximum object identifier from the first descending result in step 1032 may be implemented by the following technical solution: and executing the following processing for each behavior object in the first descending result in turn according to the sequence represented by the first descending result: comparing the object identifier of the behavior object in the first descending result with the second maximum object identifier; when the comparison result represents that the object identification of the behavior object in the first descending result is larger than the second maximum object identification, determining the behavior object in the first descending result as a first target behavior object; and when the comparison result represents that the object identifier of the behavior object in the first descending result is not larger than the second maximum object identifier, determining to stop executing the comparison processing.
As an example, based on the first behavior object set $TU_{n-1}$ of the first time period $T_{n-1}$, which is arranged in descending order, the second maximum object identifier $u_{n-1}^{\max}$ of the first behavior object set is obtained and used as the comparison parameter. The behavior object set $U_n$ of the target time period is then input, and the behavior objects in $U_n$ are sorted in descending order by object identifier. The object identifier $u_n$ of each behavior object in $U_n$ is compared, in descending order, with the second maximum object identifier $u_{n-1}^{\max}$. When $u_n > u_{n-1}^{\max}$, the behavior object corresponding to $u_n$ is taken as a first target behavior object; the first target behavior objects form the newly added behavior object set $U_n^{new}$ of the target time period. When the comparison result indicates that the object identifier of a behavior object in the first descending result is not greater than the second maximum object identifier, the comparison processing stops, that is, no further first target behavior objects are obtained. The first target behavior objects are inserted, in the order represented by the first descending result, at the last position of the descending ordering result of the behavior objects in the first behavior object set $TU_{n-1}$ to obtain the second behavior object set, so the behavior objects in the second behavior object set are also arranged in descending order of object identifier. That is, a Union operation merges the behavior objects corresponding to the selected $u_n$ values, arranged in descending order, with $TU_{n-1}$: $TU_n = TU_{n-1} \cup U_n^{new}$, thereby obtaining the second behavior object set $TU_n$, which is the full user set from the initial time to the nth time.
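One period of this procedure can be sketched with `itertools.takewhile`, assuming the previous full set is a non-empty list in descending order so that its maximum is simply the first element. The function name `advance_period` and the values are hypothetical, and the new block is placed at the front of the list so that the result stays in descending order (an assumption of the sketch).

```python
from itertools import takewhile


def advance_period(tu_prev, period_ids):
    """Return (TU_n, U_n_new) from TU_{n-1} and the raw identifiers of period n.

    tu_prev must be non-empty and in descending order.
    """
    max_prev = tu_prev[0]  # maximum identifier, read from the head in O(1)
    first_descending_result = sorted(set(period_ids), reverse=True)
    # takewhile stops at the first identifier that is not greater than max_prev.
    u_new = list(takewhile(lambda uid: uid > max_prev, first_descending_result))
    return u_new + tu_prev, u_new


tu_n, u_n_new = advance_period([9, 7, 4], [12, 7, 10, 3, 12])
print(u_n_new, tu_n)
```

Only the identifiers of the current period are sorted; the previous full set is reused as is.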
In some embodiments, referring to fig. 5C, fig. 5C is a schematic flowchart of a data processing method provided in this embodiment, and after the first target behavior object whose object identifier is larger than all behavior objects in the first behavior object set is obtained from the first descending result in step 103, steps 105 to 107 in fig. 5C may be executed.
In step 105, the accumulated access volume and the accumulated object number of the first time period are acquired, and the accumulated access volume of the target time period is acquired.
In some embodiments, when the first time period is an initial time period starting from an initial time, the step 105 of obtaining the accumulated visit volume and the accumulated number of objects of the first time period may be implemented by the following technical solutions: acquiring object behavior data of an initial moment of a first time period; carrying out object duplicate removal processing on the object behavior data at the initial moment to obtain an initial behavior object set at the initial moment; mapping and reducing the behavior objects in the initial behavior object set to obtain initial accumulated access amount and initial accumulated object number at the initial moment; based on the initial accumulated amount of access and the initial accumulated number of objects, an accumulated amount of access and an accumulated number of objects for the first time period are determined.
In some embodiments, determining the accumulated access volume and the accumulated object number of the first time period based on the initial accumulated access volume and the initial accumulated object number may be implemented as follows: performing descending sorting based on object identifier on the behavior objects in the initial behavior object set to obtain a first maximum object identifier of the initial behavior object set; performing descending sorting based on object identifier on the behavior objects in the first time period to obtain a second descending result; acquiring, from the second descending result, a second target behavior object whose object identifier is greater than the first maximum object identifier; adding the number of second target behavior objects to the initial accumulated object number to obtain the accumulated object number of the first time period; and adding the access volume within the first time period to the initial accumulated access volume to obtain the accumulated access volume of the first time period.
As an example, the number of behavior objects $UV_1^{new}$ of the first time period and the accumulated object number $TUV_0$ at the initial time $T_0$ are input, where $TUV_0$ is the accumulated number of behavior objects in the time period from the first access behavior to the initial time. The number $UV_1^{new}$ of second target behavior objects from the initial time to the 1st time (the first time period) is calculated; this is the number of newly added users from the initial time to the 1st time. The accumulated number of users as of the 1st time is then $TUV_1 = TUV_0 + UV_1^{new}$, where "as of the 1st time" denotes the period from the initial time to the 1st time. Likewise, the access volume $PV_1^{new}$ of the first time period and the accumulated access count $TPV_0$ at the initial time $T_0$ are input, where $TPV_0$ is the accumulated number of accesses by behavior objects in the time period from the first access behavior to the initial time. The access volume $PV_1^{new}$ from the initial time to the 1st time (the first time period) is calculated; this is the number of newly added user accesses from the initial time to the 1st time. According to the iteration $n = n + 1$, the accumulated access volume as of the 1st time is $TPV_1 = TPV_0 + PV_1^{new}$.
In step 106, the number of the first target behavior objects is added to the accumulated object number in the first time period, so as to obtain the accumulated object number in the second time period.
As an example, the number of behavior objects $UV_n^{new}$ of the target time period and the accumulated object number $TUV_{n-1}$ of the first time period $T_{n-1}$ are input, where $TUV_{n-1}$ is the accumulated number of behavior objects in the time period from the initial time to the (n-1)th time. The number $UV_n^{new}$ of first target behavior objects from the (n-1)th time to the nth time (the target time period) is calculated; this is the number of newly added users from the (n-1)th time to the nth time. According to the iteration $n = n + 1$, the accumulated number of users as of the nth time is $TUV_n = TUV_{n-1} + UV_n^{new}$. "As of the nth time" denotes the second time period from the initial time to the nth time, and "as of the (n-1)th time" denotes the first time period from the initial time to the (n-1)th time.
In step 107, the accumulated access volume of the target time period is added to the accumulated access volume of the first time period, and the accumulated access volume of the second time period is obtained.
As an example, the access volume $PV_n^{new}$ of the target time period and the accumulated access count $TPV_{n-1}$ of the first time period $T_{n-1}$ are input, where $TPV_{n-1}$ is the accumulated number of accesses by behavior objects in the time period from the initial time to the (n-1)th time. The access volume $PV_n^{new}$ from the (n-1)th time to the nth time (the target time period) is calculated; this is the number of newly added user accesses from the (n-1)th time to the nth time. According to the iteration $n = n + 1$, the accumulated access volume as of the nth time is $TPV_n = TPV_{n-1} + PV_n^{new}$. "As of the nth time" denotes the second time period from the initial time to the nth time, and "as of the (n-1)th time" denotes the first time period from the initial time to the (n-1)th time.
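The two accumulation steps reduce to a pair of additions per period. A minimal sketch, in which the function name `update_cumulative` and all numbers are hypothetical and the per-period increments are taken as given inputs:

```python
def update_cumulative(tuv_prev, tpv_prev, uv_new, pv_new):
    """Carry the accumulated object number and access volume forward one period."""
    tuv_n = tuv_prev + uv_new  # accumulated object number after the period
    tpv_n = tpv_prev + pv_new  # accumulated access volume after the period
    return tuv_n, tpv_n


tuv, tpv = 1000, 5000                       # totals as of the initial time
for uv_new, pv_new in [(20, 300), (5, 120)]:  # increments of two later periods
    tuv, tpv = update_cumulative(tuv, tpv, uv_new, pv_new)
print(tuv, tpv)
```

Each update uses only the previous totals and the increments of the current period, so no historical records need to be read again.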
Next, an exemplary application of the embodiment of the present application in a practical application scenario will be described.
In some embodiments, the function of the electronic map is implemented based on a server and a terminal. While a user accesses the electronic map using the terminal, in response to the terminal receiving a behavior operation of an object, the terminal acquires the behavior object data of a first time period and the behavior object data within a target time period and sends them to the server, so that the server performs the data processing method provided by the embodiment of the application on the received data to obtain a behavior object set of a second time period, where the second time period includes the first time period and the target time period, and stores the behavior object set of the second time period in a database.
The data processing method provided by the embodiment of the application is applied to big data calculation optimization for Internet of Vehicles products. Log data of the Internet of Vehicles product and of each of its modules is obtained, and the relevant object behavior data (details of each user's behavior in the product, such as click behavior data, picture browsing, page browsing, and recharging operations) is extracted and stored in a database (such as MySQL, Oracle, HDFS, HBase, or Hive).
The overall process of the data processing method provided by the embodiment of the application can be divided into the following six stages: (1) initial data input stage; (2) stage of obtaining the initial user set, the initial browsing count, and the initial number of visiting users; (3) stage of obtaining the newly added user set of the $T_1$ single cycle; (4) stage of calculating the accumulated visitor number TUV and the accumulated visit volume TPV as of time $T_1$; (5) stage of obtaining the newly added user set of the $T_n$ cycle; (6) stage of calculating the accumulated visitor number TUV and the accumulated visit volume TPV as of time $T_n$.
Referring to fig. 6, fig. 6 is a schematic flowchart of data statistics of object behavior data provided in an embodiment of the present application, and the following describes in detail a data processing procedure of 6 stages with reference to fig. 6.
In the first stage, the object behavior data at time $T_0$ is taken as the initial input, where $T_0$ is the initial time of the entire data processing flow.
In the second stage, the initial object behavior data is input, deduplicated, and sorted in descending order by the object identifier Uuse to obtain the initial full user set $TU_0$; the initial accumulated object number $TUV_0$ and the initial accumulated access count $TPV_0$ are calculated in a distributed MapReduce manner.
In the third stage, the initial full user set $TU_0$ sorted in descending order in the second stage is input, and the object identifier ranked first, $u_0^{\max}$, is selected as a parameter. The user set $U_1$ of the $T_1$ single cycle is input and sorted in descending order by object identifier, and the object identifier $u_1$ of each behavior object in $U_1$ is compared with $u_0^{\max}$. If $u_1 > u_0^{\max}$, $u_1$ is added in descending order to $U_1^{new}$. A Union operation merges $U_1^{new}$ with $TU_0$, $TU_1 = TU_0 \cup U_1^{new}$, thereby obtaining the full user set $TU_1$ as of time $T_1$. At the same time, using the iteration $n = n + 1$, the number of users $UV_1^{new}$ and the number of accesses $PV_1^{new}$ of $U_1^{new}$ are counted.
In the fourth stage, the accumulated user number and the accumulated access count as of $T_1$ are calculated. The user number $UV_1^{new}$ and the access count $PV_1^{new}$ of the $T_1$ cycle from Step 3 and $TUV_0$ and $TPV_0$ from Step 2 are input. According to the iterative formula, the number of newly added users $UV_1^{new}$ at time $T_1$ gives the accumulated number of users at time $T_1$, $TUV_1 = TUV_0 + UV_1^{new}$, and the number of accesses by newly added users $PV_1^{new}$ at time $T_1$ gives the accumulated number of user accesses at time $T_1$, $TPV_1 = TPV_0 + PV_1^{new}$.
In the fifth stage, the newly added user set of the $T_n$ cycle is obtained. The full user set $TU_{n-1}$ as of cycle $n-1$, arranged in descending order, is input, and the object identifier ranked first, $u_{n-1}^{\max}$, is selected as a parameter. The single-cycle user set $U_n$ of the $T_n$ cycle is input and sorted in descending order, and each object identifier $u_n$ in $U_n$ is compared with $u_{n-1}^{\max}$. If $u_n > u_{n-1}^{\max}$, $u_n$ is added in descending order to $U_n^{new}$. At the same time, using the iteration $n = n + 1$, the number of users $UV_n^{new}$ and the number of accesses $PV_n^{new}$ of $U_n^{new}$ are counted. A Union operation merges $U_n^{new}$ with $TU_{n-1}$ to obtain the full user set $TU_n$ as of time $T_n$, i.e., $TU_n = TU_{n-1} \cup U_n^{new}$.
In the sixth stage, the number of users $UV_n^{new}$ and the number of accesses $PV_n^{new}$ of the single cycle (target time period) from the fifth stage are input, together with the accumulated object number $TUV_{n-1}$ and the accumulated access count $TPV_{n-1}$ of the first time period $T_{n-1}$, where $TUV_{n-1}$ is the accumulated number of behavior objects in the time period from the initial time to the (n-1)th time. The number of newly added users $UV_n^{new}$ from the (n-1)th time to the nth time (the target time period) is calculated, and the accumulated number of users at the nth time is obtained according to the iteration $n = n + 1$: $TUV_n = TUV_{n-1} + UV_n^{new}$. The number of accesses by newly added users $PV_n^{new}$ from the (n-1)th time to the nth time (the target time period) is calculated, and the accumulated number of user accesses at the nth time is obtained: $TPV_n = TPV_{n-1} + PV_n^{new}$.
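The six stages can be strung together in a small in-memory sketch. The class `CumulativeStats`, its method names, and the sample data are hypothetical. Two assumptions are made: each access record is reduced to its object identifier, and the access count of a period is taken as the number of all records in that period (the text can also be read as counting only accesses by newly added users).

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CumulativeStats:
    full_set: list = field(default_factory=list)  # full user set, descending identifiers
    tuv: int = 0                                   # accumulated object number
    tpv: int = 0                                   # accumulated access count

    @classmethod
    def from_history(cls, records):
        """Stages 1-2: one full aggregation and sort at the initial time."""
        visits = Counter(records)
        return cls(sorted(visits, reverse=True), len(visits), sum(visits.values()))

    def add_period(self, records):
        """Stages 3-6 for one period: sort only this period's identifiers."""
        max_known = self.full_set[0] if self.full_set else float("-inf")
        new_ids = []
        for object_id in sorted(set(records), reverse=True):
            if object_id <= max_known:
                break  # the remaining identifiers are not greater either
            new_ids.append(object_id)
        self.full_set = new_ids + self.full_set  # result stays in descending order
        self.tuv += len(new_ids)                 # add the newly added user number
        self.tpv += len(records)                 # add the access count of the period
        return new_ids


stats = CumulativeStats.from_history([1, 2, 2, 3])
new_in_t1 = stats.add_period([3, 4, 4, 5])
new_in_t2 = stats.add_period([5, 6, 1])
print(stats.full_set, stats.tuv, stats.tpv)
```

After the initial aggregation, each call to `add_period` works only on the records of the current period plus the head of the stored list.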
In the data processing method provided by the embodiment of the application, the full user table as of the end of the previous period and the user table of the current statistical period are first sorted in descending order by object identifier. The object identifier at the head of the full table is selected as the input parameter and compared with the descending-sorted object identifiers of the current statistical period; the object identifiers of the current statistical period that are greater than the input object identifier are selected to form the newly added object behavior data set of the current period. With this method, statistical calculation in MapReduce mode is needed only for the initial statistical period and need not be executed in other periods, which effectively improves calculation efficiency and reduces the occupation of computing resources.
In the data processing method provided by the embodiment of the application, when a new user set is obtained, the user table of the current statistical period and the full user table do not need to be adopted for Left Join operation matching, so that the calculation efficiency is effectively improved.
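The contrast with a join-based lookup can be illustrated as follows. Both function names and the data are hypothetical, and the set difference only mimics what a Left Join with a null filter would compute. The sketch assumes that any identifier not yet in the full table is greater than the table's current maximum, for example because identifiers are assigned in increasing order; the text does not state this condition explicitly.

```python
def new_users_by_anti_join(full_table, period_users):
    """Baseline: keep period users that have no match in the full table."""
    known = set(full_table)
    return sorted(set(period_users) - known, reverse=True)


def new_users_by_max_id(full_table_desc, period_users):
    """Threshold variant: compare only against the head of the descending table."""
    max_known = full_table_desc[0]
    selected = []
    for user_id in sorted(set(period_users), reverse=True):
        if user_id <= max_known:
            break
        selected.append(user_id)
    return selected


full_table = [8, 5, 3]            # descending, as of the previous period
period_users = [10, 5, 9, 3, 10]  # raw identifiers seen in the current period
print(new_users_by_anti_join(full_table, period_users))
print(new_users_by_max_id(full_table, period_users))
```

Under the stated assumption both functions return the same users, but the threshold variant never looks up individual rows of the full table. If an unseen identifier smaller than the current maximum could appear, the two results would differ.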
In the data processing method provided by the embodiment of the application, when the data at time $T_n$ is calculated iteratively, the accumulated object number $TUV_{n-1}$ and the accumulated access count $TPV_{n-1}$ as of cycle $T_{n-1}$ are added to the newly added user number and the newly added browsing count, which are counted over the users whose object identifiers in the current statistical period are greater than the input object identifier. Operations such as Left Join, Distinct, and Group By are not required, so the consumption of computing resources is effectively reduced and computing efficiency is improved.
In the data processing method provided by the embodiments of the present application, when the full user table is updated, the new users of the current statistical period are sorted in descending order of object identifier, and a Union operation then combines them with the full user table up to the end of the previous statistical period, which is already sorted in descending order of object identifier. The result is the full user table, sorted in descending order, up to the end of the current statistical period. The full set of users needs to be sorted only once, in the initial period (up to time T_0); in each later statistical period only the new users need to be sorted in descending order. This avoids the computing resources that would be consumed by sorting all users in every statistical period and improves computing efficiency.
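A hedged Python sketch of this update follows. The helper name `union_descending` is hypothetical, and placing the new block at the head of the table is an assumption that follows from the descending order; the source only states that a Union operation is used. Because every new identifier exceeds the current maximum, plain concatenation keeps the table sorted.

```python
def union_descending(new_desc, full_desc):
    """Concatenate two descending lists when every element of new_desc
    exceeds every element of full_desc; the result is still descending."""
    if new_desc and full_desc and new_desc[-1] <= full_desc[0]:
        raise ValueError("new identifiers must all exceed the current maximum")
    return new_desc + full_desc


full_table = [105, 104, 102, 101]          # hypothetical table up to T(n-1)
new_users = sorted([106, 107], reverse=True)  # only the new users are sorted
updated = union_descending(new_users, full_table)
print(updated)  # [107, 106, 105, 104, 102, 101]
```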
The data processing method provided by the embodiments of the present application optimizes data computation based on big data statistical techniques. In addition to the Internet of Vehicles, it can be used for big data statistical analysis in other Internet scenarios.
It is understood that, in the embodiments of the present application, when data related to user information and the like is involved and the embodiments are applied to specific products or technologies, user permission or consent must be obtained, and the collection, use, and processing of the related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions.
The following continues with an exemplary structure in which the data processing device 255 provided by the embodiments of the present application is implemented as software modules. In some embodiments, as shown in fig. 4, the software modules of the data processing device 255 stored in the memory 250 may include: an obtaining module 2551, configured to obtain a first behavior object set of a first time period, and obtain behavior objects in a target time period, where the target time period is adjacent to and after the first time period; a sorting module 2552, configured to sort the behavior objects in the target time period in descending order of object identifier to obtain a first descending result; a comparing module 2553, configured to obtain, from the first descending result, a first target behavior object whose object identifier is greater than those of all behavior objects in the first behavior object set; and a merging module 2554, configured to merge the first target behavior objects, sorted according to the first descending result, into the first behavior object set to obtain a second behavior object set of a second time period, where the second time period includes the first time period and the target time period.
In some embodiments, when the first time period is an initial time period starting from an initial time, the obtaining module 2551 is further configured to: acquiring object behavior data of an initial moment of a first time period; carrying out object duplicate removal processing on the object behavior data at the initial moment to obtain an initial behavior object set at the initial moment; a first set of behavioral objects for a first time period is determined based on an initial set of behavioral objects at an initial time instance.
In some embodiments, the obtaining module 2551 is further configured to: performing descending order sorting processing based on object identification on the behavior objects in the initial behavior object set to obtain a first maximum object identification of the initial behavior object set; performing descending sorting processing based on object identification on the behavior objects of the initial behavior object set to obtain a second descending result; acquiring a second target behavior object with the object identification larger than the first maximum object identification from the second descending result; and merging the second target behavior object into the initial behavior object set to obtain a first behavior object set of the first time period.
In some embodiments, the obtaining module 2551 is further configured to: and sequentially inserting the second target behavior objects into the last bit of the descending ordering result of the behavior objects in the initial behavior object set according to the sequence represented by the second descending ordering result to obtain the first behavior object set.
In some embodiments, the obtaining module 2551 is further configured to: and executing the following processing for each behavior object in the second descending result in turn according to the sequence represented by the second descending result: comparing the object identifier of the behavior object in the second descending result with the first maximum object identifier; when the comparison result represents that the object identification of the behavior object in the second descending result is larger than the first maximum object identification, determining the behavior object in the second descending result as a second target behavior object; and when the comparison result represents that the object identifier of the behavior object in the second descending result is not larger than the first maximum object identifier, determining to stop executing the comparison processing.
In some embodiments, the obtaining module 2551 is further configured to: acquiring object behavior data in a target time period; and carrying out object duplicate removal processing on the object behavior data in the target time period to obtain the behavior object in the target time period.
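As an illustration of the de-duplication step, the following Python sketch reduces raw behavior records to distinct behavior objects. The record layout (dictionaries with `object_id` and `page` fields) and the function name `dedupe_objects` are hypothetical and are not taken from the source.

```python
def dedupe_objects(records):
    """Return the distinct object identifiers in first-seen order."""
    # dict.fromkeys keeps the first occurrence of each key and preserves order.
    return list(dict.fromkeys(record["object_id"] for record in records))


# Hypothetical behavior records for the target time period.
records = [
    {"object_id": 7, "page": "home"},
    {"object_id": 3, "page": "map"},
    {"object_id": 7, "page": "route"},
]
behavior_objects = dedupe_objects(records)
print(behavior_objects)  # [7, 3]
```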
In some embodiments, the comparing module 2553 is further configured to: acquiring a second maximum object identifier of the first behavior object set; and acquiring a first target behavior object with the object identification larger than the second maximum object identification from the first descending result.
In some embodiments, the comparing module 2553 is further configured to: and executing the following processing for each behavior object in the first descending result in turn according to the sequence represented by the first descending result: comparing the object identifier of the behavior object in the first descending result with the second maximum object identifier; when the comparison result represents that the object identification of the behavior object in the first descending result is larger than the second maximum object identification, determining the behavior object in the first descending result as a first target behavior object; and when the comparison result represents that the object identifier of the behavior object in the first descending result is not larger than the second maximum object identifier, determining to stop executing the comparison processing.
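The sequential comparison with early termination can be sketched as below. The function name `scan_until_not_greater` and the comparison counter are hypothetical additions used only to make the early stop visible; the sketch assumes the input list is already sorted in descending order.

```python
def scan_until_not_greater(desc_ids, max_known):
    """Walk a descending list and collect identifiers greater than max_known,
    stopping at the first identifier that is not greater."""
    selected = []
    comparisons = 0
    for oid in desc_ids:
        comparisons += 1
        if oid > max_known:
            selected.append(oid)
        else:
            # Every later identifier is smaller still, so the scan can stop.
            break
    return selected, comparisons


first_descending_result = [107, 106, 105, 102]  # hypothetical sorted input
selected, comparisons = scan_until_not_greater(first_descending_result, 105)
print(selected, comparisons)  # [107, 106] 3
```

Because the list is in descending order, the scan performs one comparison more than the number of selected objects, regardless of how many older objects follow.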
In some embodiments, the apparatus further comprises a cumulative statistics module 2555, configured to: acquire the accumulated access amount and the accumulated object number of the first time period, and acquire the accumulated access amount of the target time period; add the number of the first target behavior objects to the accumulated object number of the first time period to obtain the accumulated object number of the second time period; and add the accumulated access amount of the target time period to the accumulated access amount of the first time period to obtain the accumulated access amount of the second time period.
In some embodiments, when the first time period is an initial time period starting from an initial time, the cumulative statistics module 2555 is further configured to: acquiring object behavior data of an initial moment of a first time period; carrying out object duplicate removal processing on the object behavior data at the initial moment to obtain an initial behavior object set at the initial moment; mapping and reducing the behavior objects in the initial behavior object set to obtain initial accumulated access amount and initial accumulated object number at the initial moment; based on the initial accumulated amount of access and the initial accumulated number of objects, an accumulated amount of access and an accumulated number of objects for the first time period are determined.
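The map and reduce processing for the initial moment can be imitated in memory as follows. This is a single-process stand-in for illustration, not a distributed MapReduce job; the record layout and the names `map_phase` and `reduce_phase` are hypothetical.

```python
from collections import defaultdict


def map_phase(records):
    """Emit one (object_id, 1) pair per behavior record."""
    for record in records:
        yield record["object_id"], 1


def reduce_phase(pairs):
    """Sum the emitted values per object identifier."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


# Hypothetical behavior records at the initial moment.
records = [{"object_id": 7}, {"object_id": 3}, {"object_id": 7}, {"object_id": 9}]
per_object = reduce_phase(map_phase(records))
initial_object_count = len(per_object)           # distinct objects
initial_access_count = sum(per_object.values())  # total accesses
print(initial_object_count, initial_access_count)  # 3 4
```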
In some embodiments, cumulative statistics module 2555 is further configured to: performing descending order sorting processing based on object identification on the behavior objects in the initial behavior object set to obtain a first maximum object identification of the initial behavior object set; performing descending sorting processing based on object identification on the behavior objects of the initial behavior object set to obtain a second descending result; acquiring a second target behavior object with the object identification larger than the first maximum object identification from the second descending result; superposing the number of the second target behavior objects to the initial accumulated object number to obtain the accumulated object number of the first time period; and overlapping the accumulated access volume of the first time period to the initial accumulated access volume to obtain the accumulated access volume of the first time period.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the data processing method described in the embodiment of the present application.
Embodiments of the present application provide a computer-readable storage medium storing executable instructions which, when executed by a processor, cause the processor to perform the data processing method provided by the embodiments of the present application, for example, the data processing method shown in fig. 5A to 5C.
In some embodiments, the computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.
In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may, but need not, correspond to files in a file system, and may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
To sum up, according to the embodiments of the present application, a first behavior object set of a first time period and the behavior objects in a target time period are obtained; the behavior objects in the target time period are sorted in descending order of object identifier to obtain a first descending result; first target behavior objects whose object identifiers are larger than those of all behavior objects in the first behavior object set are obtained from the first descending result; and the first target behavior objects are merged into the first behavior object set to obtain a second behavior object set of a second time period, where the second time period includes the first time period and the target time period. Obtaining the second behavior object set therefore no longer requires big data statistical processing over the full data of the second time period, which improves computing efficiency and reduces the occupation of computing resources. In addition, only the behavior objects in the target time period need to be sorted in descending order, which avoids the computing resources that would be consumed by sorting all behavior objects in every time period.
The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (15)

1. A method of data processing, the method comprising:
acquiring a first behavior object set of a first time period, and acquiring behavior objects in a target time period, wherein the target time period is adjacent to and behind the first time period;
performing descending sorting processing based on object identification on the behavior objects in the target time period to obtain a first descending result;
acquiring a first target behavior object of which the object identification is larger than all behavior objects in the first behavior object set from the first descending result;
merging the first target behavior objects which are sequenced according to the first descending result into the first behavior object set to obtain a second behavior object set of a second time period, wherein the second time period comprises the first time period and the target time period.
2. The method of claim 1, wherein when the first time period is an initial time period starting from an initial time, the obtaining a first set of behavioural objects for the first time period comprises:
acquiring object behavior data of the initial moment of the first time period;
performing object duplicate removal processing on the object behavior data at the initial time to obtain an initial behavior object set at the initial time;
determining a first behavior object set of the first time period based on an initial behavior object set of the initial time instant.
3. The method of claim 2, wherein determining the first set of behavioral objects for the first time period based on the initial set of behavioral objects for the initial time instance comprises:
performing descending order sorting processing on the behavior objects in the initial behavior object set based on the object identifiers to obtain a first maximum object identifier of the initial behavior object set;
performing descending sorting processing on the behavior objects of the initial behavior object set based on the object identifiers to obtain a second descending result;
acquiring a second target behavior object of which the object identifier is larger than the first maximum object identifier from the second descending result;
and merging the second target behavior object into the initial behavior object set to obtain a first behavior object set of the first time period.
4. The method of claim 3, wherein merging the second target behavior object into the initial set of behavior objects to obtain the first set of behavior objects for the first time period comprises:
and sequentially inserting the second target behavior objects into the last bit of the descending ordering result of the behavior objects in the initial behavior object set according to the sequence represented by the second descending ordering result to obtain the first behavior object set.
5. The method of claim 3, wherein the obtaining a second target behavior object with the object identifier greater than the first maximum object identifier from the second descending result comprises:
and executing the following processing for each behavior object in the second descending result in sequence according to the sequence represented by the second descending result:
comparing the object identifier of the behavior object in the second descending result with the first maximum object identifier;
when the comparison result represents that the object identifier of the behavior object in the second descending result is larger than the first maximum object identifier, determining the behavior object in the second descending result as the second target behavior object;
the method further comprises the following steps:
and when the comparison result represents that the object identifier of the behavior object in the second descending result is not larger than the first maximum object identifier, determining to stop executing the comparison processing.
6. The method of claim 1, wherein the obtaining behavior objects within the target time period comprises:
acquiring object behavior data in the target time period;
and carrying out object duplicate removal processing on the object behavior data in the target time period to obtain the behavior object in the target time period.
7. The method of claim 1, wherein the obtaining, from the first descending result, of the first target behavior object whose object identifier is larger than all behavior objects in the first behavior object set comprises:
acquiring a second maximum object identifier of the first behavior object set;
and acquiring a first target behavior object with the object identifier larger than the second maximum object identifier from the first descending result.
8. The method of claim 7, wherein obtaining the first target behavior object with the object identifier greater than the second maximum object identifier from the first descending result comprises:
and executing the following processing for each behavior object in the first descending result in sequence according to the sequence represented by the first descending result:
comparing the object identifier of the behavior object in the first descending result with the second maximum object identifier;
when the comparison result represents that the object identifier of the behavior object in the first descending result is larger than the second maximum object identifier, determining the behavior object in the first descending result as the first target behavior object;
the method further comprises the following steps:
and when the comparison result represents that the object identifier of the behavior object in the first descending result is not larger than the second maximum object identifier, determining to stop executing the comparison processing.
9. The method of claim 1, further comprising:
acquiring the accumulated access volume and the accumulated object quantity of the first time period, and acquiring the accumulated access volume of the target time period;
superposing the number of the first target behavior objects to the accumulated object number of the first time period to obtain the accumulated object number of the second time period;
and overlapping the accumulated access amount of the target time period to the accumulated access amount of the first time period to obtain the accumulated access amount of the second time period.
10. The method of claim 9, wherein when the first time period is an initial time period starting from an initial time, the obtaining of the accumulated access volume and the accumulated object quantity of the first time period comprises:
acquiring object behavior data of the initial moment of the first time period;
performing object duplicate removal processing on the object behavior data at the initial time to obtain an initial behavior object set at the initial time;
mapping and reducing the behavior objects in the initial behavior object set to obtain initial accumulated visit amount and initial accumulated object number at the initial moment;
and determining the accumulated visit amount and the accumulated object number of the first time period based on the initial accumulated visit amount and the initial accumulated object number.
11. The method of claim 10, wherein determining a cumulative visit amount and a cumulative object number for the first time period based on the initial cumulative visit amount and the initial cumulative object number comprises:
performing descending order sorting processing on the behavior objects in the initial behavior object set based on the object identifiers to obtain a first maximum object identifier of the initial behavior object set;
performing descending sorting processing on the behavior objects of the initial behavior object set based on the object identifiers to obtain a second descending result;
acquiring a second target behavior object of which the object identifier is larger than the first maximum object identifier from the second descending result;
superposing the number of the second target behavior objects to the initial accumulated object number to obtain the accumulated object number of the first time period;
and overlapping the accumulated access volume of the first time period to the initial accumulated access volume to obtain the accumulated access volume of the first time period.
12. A data processing apparatus, characterized in that the apparatus comprises:
an obtaining module, configured to obtain a first behavior object set of a first time period, and obtain a behavior object in a target time period, where the target time period is adjacent to and after the first time period;
the sorting module is used for performing descending sorting processing based on object identification on the behavior objects in the target time period to obtain a first descending result;
a comparison module, configured to obtain, from the first descending result, a first target behavioral object whose object identifier is greater than all behavioral objects in the first behavioral object set;
and a merging module, configured to merge the first target behavior objects sorted according to the first descending result into the first behavior object set to obtain a second behavior object set of a second time period, where the second time period includes the first time period and the target time period.
13. An electronic device, characterized in that the electronic device comprises:
a memory for storing executable instructions;
a processor for implementing the data processing method of any one of claims 1 to 11 when executing executable instructions stored in the memory.
14. A computer-readable storage medium storing executable instructions, wherein the executable instructions, when executed by a processor, implement the data processing method of any one of claims 1 to 11.
15. A computer program product comprising a computer program or instructions, characterized in that the computer program or instructions, when executed by a processor, implement the data processing method of any of claims 1 to 11.
CN202210003204.5A 2022-01-04 2022-01-04 Data processing method, data processing device, electronic device, program product, and storage medium Pending CN114296944A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210003204.5A CN114296944A (en) 2022-01-04 2022-01-04 Data processing method, data processing device, electronic device, program product, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210003204.5A CN114296944A (en) 2022-01-04 2022-01-04 Data processing method, data processing device, electronic device, program product, and storage medium

Publications (1)

Publication Number Publication Date
CN114296944A (en) 2022-04-08

Family

ID=80976368

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210003204.5A Pending CN114296944A (en) 2022-01-04 2022-01-04 Data processing method, data processing device, electronic device, program product, and storage medium

Country Status (1)

Country Link
CN (1) CN114296944A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115277843A (en) * 2022-06-30 2022-11-01 南斗六星系统集成有限公司 Method and system for merging vehicle network frequency division data
CN115277843B (en) * 2022-06-30 2024-01-26 南斗六星系统集成有限公司 Method and system for merging frequency division data of vehicle network

Similar Documents

Publication Publication Date Title
CN103546326B (en) Website traffic statistic method
CN107145556B (en) Universal distributed acquisition system
CN103782295A (en) Query explain plan in a distributed data management system
CN106411998A (en) Prediction method for UBI (Usage-Based Insurance) system based on internet of vehicles big data
CN111708641B (en) Memory management method, device, equipment and computer readable storage medium
CN104869009A (en) Website data statistics system and method
CN105302920A (en) Optimal management method and system for cloud storage data
CN105589917A (en) Method and device for analyzing log information of browser
CN106873952A (en) The data handling system and method and application apparatus of mobile terminal webpage development
CN106228263A (en) Materials stream informationization methods based on big data
CN111797243A (en) Knowledge graph data system construction method, system, terminal and readable storage medium
CN110390739A (en) A kind of vehicle data processing method and vehicle data processing system
CN108846564A (en) Server, the method for waiting and storage medium
CN103533043A (en) Charging method of cloud storage service based on REST (representational state transfer)
CN114296944A (en) Data processing method, data processing device, electronic device, program product, and storage medium
CN106097060A (en) A kind of university students is left unused Cycle Hire software screening method system and its implementation
CN111046041A (en) Data processing method and device, storage medium and processor
CN109586970B (en) Resource allocation method, device and system
CN103699653A (en) Method and device for clustering data
CN107679107B (en) Graph database-based power grid equipment reachability query method and system
CN113806466A (en) Path time query method and device, electronic equipment and readable storage medium
CN110399534B (en) Terminal performance report generation method, device, equipment and storage medium
CN113407807A (en) Query optimization method and device for search engine and electronic equipment
CN106126670A (en) Operation data sequence processing method and processing device
Li et al. Efficient path query processing over massive trajectories on the cloud

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination