CN110955802A - Data barreling method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN110955802A
Authority
CN
China
Prior art keywords
data
time period
scores
current time
structure corresponding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911100853.1A
Other languages
Chinese (zh)
Inventor
张付伟
洪庚伟
李羽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weimin Insurance Agency Co Ltd
Original Assignee
Weimin Insurance Agency Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Weimin Insurance Agency Co Ltd filed Critical Weimin Insurance Agency Co Ltd
Priority to CN201911100853.1A priority Critical patent/CN110955802A/en
Publication of CN110955802A publication Critical patent/CN110955802A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention relates to a data bucketing method, a data bucketing device, and electronic equipment. The method comprises the following steps: acquiring data and scores thereof; writing the data and the scores into a data structure corresponding to the current time period to obtain an ordered arrangement of the data; writing the data and the scores in advance into a data structure corresponding to the next time period, wherein the current time period and the next time period take a first time point as the time demarcation point; if the current time exceeds the first time point, switching the data structure corresponding to the next time period to be the data structure corresponding to the current time period, and acquiring a new data structure corresponding to the next time period; and bucketing the data according to the ordered arrangement. Because the data and the scores are written into both the data structure corresponding to the current time period and the data structure corresponding to the next time period, the data structure corresponding to the current time period contains the data of both the previous time period and the current time period, so that an accurate bucketing result can be obtained based on enough sample data.

Description

Data barreling method and device, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of data processing, in particular to a data bucket dividing method and device, electronic equipment and a storage medium.
Background
Data bucketing means dividing data into buckets according to the score corresponding to each piece of data, so that original data whose scores are unevenly distributed is spread evenly across a set number of buckets. Before the data is bucketed, the original data is sorted by score to obtain an ordered arrangement of the data, and the data is then bucketed based on that ordered arrangement.
In the process of implementing the invention, the inventor found that existing data bucketing methods cannot ensure that enough sample data is obtained, so that an effective ordered arrangement of the data cannot be obtained and the bucketing result is therefore inaccurate.
Disclosure of Invention
The embodiment of the invention aims to provide a data bucket dividing method, a data bucket dividing device, electronic equipment and a storage medium, which can perform data bucket dividing based on enough sample data so as to obtain an accurate bucket dividing result.
The embodiment of the invention provides a data bucket dividing method, which comprises the following steps:
acquiring data and scores thereof;
writing the data and the scores thereof into a data structure corresponding to the current time period to obtain ordered arrangement of the data in the data structure based on the scores, wherein the data structure corresponding to the current time period also comprises the data of the previous time period;
writing the data and the scores thereof into a data structure corresponding to a next time period in advance, wherein the current time period and the next time period use a first time point as a time demarcation point;
if the current time exceeds the first time point, switching the data structure corresponding to the next time period into the data structure corresponding to the current time period, and acquiring a new data structure corresponding to the next time period;
and carrying out data bucket separation on the data according to the data ordered arrangement.
In some embodiments, the obtaining data and scores thereof comprises:
acquiring data and scores thereof at an application layer;
wherein the data structure is an ordered data set of a database, and the writing the data and the scores thereof into the data structure comprises:
writing the data and the scores thereof into the ordered data set of the database through the application layer and database interface so as to obtain the ordered arrangement of the data.
In some embodiments, the writing the data and the scores thereof into the ordered data set of the database through the application layer and database interface comprises:
sequentially writing the data into the ordered data set based on a binary method according to the scores of the data.
In some embodiments, after the acquiring data and its scores, the method further comprises:
grouping the data according to data categories to obtain at least two grouped data of the data;
writing the data and the scores thereof into a data structure corresponding to the current time period to obtain the ordered arrangement of the data in the data structure based on the scores, including:
writing at least two grouped data of the data and corresponding scores thereof into at least two data structures corresponding to the current time period respectively so as to obtain at least two ordered data arrangements based on the scores respectively;
the pre-writing the data and the scores thereof into a data structure corresponding to the next time period comprises:
at least two grouped data of the data and corresponding scores thereof are respectively written into at least two data structures corresponding to the next time period in advance;
the data barreling of the data according to the data ordered arrangement comprises:
and performing data bucket separation on the grouped data according to the data ordered arrangement corresponding to the grouped data.
In some embodiments, prior to performing data bucketing, the method further comprises:
determining whether the number of data in the ordered arrangement is greater than or equal to a preset number threshold, and if so, performing data bucketing based on the ordered arrangement.
In some embodiments, the database is a redis database.
In some embodiments, the current time period is a preset time range including a current time, and the next time period is a next preset time range of the current time period.
The embodiment of the invention also provides a data bucket separating device, which comprises:
the data acquisition module is used for acquiring data and scores thereof;
the data sorting module is used for writing the data and the scores thereof into a data structure corresponding to the current time period so as to obtain the ordered arrangement of the data in the data structure based on the scores, wherein the data structure corresponding to the current time period also comprises the data of the previous time period;
writing the data and the scores thereof into a data structure corresponding to a next time period in advance, wherein the current time period and the next time period use a first time point as a time demarcation point;
if the current time exceeds the first time point, switching the data structure corresponding to the next time period into the data structure corresponding to the current time period, and acquiring a new data structure corresponding to the next time period;
and the data bucket dividing module is used for carrying out data bucket dividing on the data according to the data ordered arrangement.
An embodiment of the present invention further provides an electronic device, where the electronic device includes:
at least one processor; and
a memory communicatively coupled to the at least one processor, the memory storing instructions executable by the at least one processor to enable the at least one processor to perform the method described above.
Embodiments of the present invention also provide a non-transitory computer-readable storage medium, which stores computer-executable instructions, and when the computer-executable instructions are executed by an electronic device, the electronic device is caused to execute the above method.
According to the data bucket dividing method and device, the electronic equipment and the storage medium, data and the scores of the data are written into the data structure corresponding to the current time period and the data structure corresponding to the next time period respectively, and the data structure corresponding to the current time period further comprises data of the previous time period. Therefore, the data structure corresponding to the current time period simultaneously comprises the data of the previous time period and the data of the current time period, effective data ordered arrangement can be obtained based on enough sample data in the data structure of the current time period, data bucket sorting is carried out based on the effective data ordered arrangement, and accurate bucket sorting results can be obtained.
Drawings
One or more embodiments are illustrated by way of example in the corresponding figures of the accompanying drawings, in which like reference numerals refer to similar elements and which are not drawn to scale unless otherwise specified.
FIG. 1a is a schematic diagram of an application scenario of a data bucketing method and apparatus according to an embodiment of the present invention;
FIG. 1b is a schematic diagram of another application scenario of the data bucketing method and apparatus according to an embodiment of the present invention;
FIG. 2a is a schematic flow chart diagram illustrating one embodiment of a data bucketing method of the present invention;
FIG. 2b is a schematic flow chart diagram illustrating another embodiment of a data bucketing method of the present invention;
FIG. 3 is a schematic structural diagram of one embodiment of a data bucketing device of the present invention;
FIG. 4 is a schematic structural diagram of one embodiment of a data bucketing device of the present invention;
FIG. 5 is a schematic diagram of a hardware structure of an embodiment of the electronic device of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1a shows an application scenario of the data bucketing method and apparatus according to the embodiment of the present invention. The application scenario includes an electronic device 10, where the electronic device 10 may be any suitable device composed of electronic components such as integrated circuits, transistors, and electron tubes, for example, an electronic computer or an intelligent terminal (e.g., a smart phone or a tablet computer).
The data bucket dividing method of the embodiment of the invention firstly obtains data and scores thereof, and respectively writes the data and the scores thereof into a data structure corresponding to the current time period and a data structure corresponding to the next time period. And when the current time exceeds the demarcation point of the current time period and the next time period, switching the data structure corresponding to the next time period into the data structure corresponding to the current time period, and acquiring a new data structure corresponding to the next time period. And obtaining data ordered arrangement based on a data structure corresponding to the current time period, and performing barrel distribution on the data based on the data ordered arrangement.
The data structure corresponding to the current time period also comprises the data of the previous time period. Therefore, the data structure corresponding to the current time period simultaneously comprises the data of the previous time period and the data of the current time period, and the data structure corresponding to the current time period contains enough sample data, so that the scale requirement of the sample data can be met. Effective data ordered arrangement can be obtained based on the sample data, and then an accurate bucket dividing result is obtained.
And when the current time exceeds the demarcation point of the current time period and the next time period, the data structure corresponding to the next time period is switched to the data structure corresponding to the current time period, so that the data structure corresponding to the current time period can always contain sample data with relatively short time, and the timeliness requirement of the sample data can be met. The bucket dividing result obtained based on the sample data can provide more effective decision reference for the user.
In the embodiment shown in fig. 1a, the data bucketing method is performed by the electronic device 10, i.e., the steps of data acquisition, data sorting, and data bucketing are all performed on the electronic device 10. In other embodiments, the above steps may be performed on different electronic devices respectively. For example, in the embodiment shown in FIG. 1b, the steps of data acquisition and data binning are performed on the electronic device 10 and the step of data sorting is performed on the second electronic device 20.
The data bucketing method of the embodiment of the invention can be implemented by running application software alone or, in other embodiments, by application software together with a database. For example, the application layer obtains the data and its scores and then writes them into a data structure in the database (the data structure may be an ordered data set of the database) through the application layer and database interface (i.e., the database driver), and the ordered arrangement of the data is produced by the database. The database returns the ordered arrangement to the application layer, and the application layer obtains the data bucketing result according to the ordered arrangement.
In the embodiment shown in fig. 1a, the data bucketing method according to the embodiment of the present invention may be implemented by using application software, or may be implemented in the form of application software plus a database, that is, both the application software and the database are run on the electronic device 10. In the embodiment shown in fig. 1b, the application layer software may be run on the electronic device 10, while the database is run on the second electronic device 20.
When a redis database is used to sort the data, the data can be arranged using the ordered data set of the redis database, which gives high sorting efficiency and a fast response time. This takes less time than a sorting method based on a database index function, and is more economical and efficient than writing a sorting algorithm in the application software.
The data bucket dividing method of the embodiment of the invention can be applied to any suitable occasions where decision making needs to be carried out by utilizing classified data, and one of the application occasions of the embodiment of the invention is illustrated below.
For example, a user needs to obtain the age distribution of the people registered in a certain application software. In this case, each registrant constitutes a piece of data, and the age of each registrant can be used as the score of that data. The data bucketing method provided by the embodiment of the invention can be used to bucket the registrants by age. First, the registrant data and the corresponding ages are obtained; then the data and the corresponding ages are written into a data structure in a database to obtain an ordered arrangement of the registrant data based on age; and the registrant data is then bucketed according to the ordered arrangement. In some of these embodiments, the registrant data, sorted by age, may be divided evenly into a number of buckets (e.g., each bucket contains 10000 individuals), after which the age span of each bucket may be obtained. For example, the age span of the first bucket is 13-22, the second 23-25, the third 26-27, the fourth 28-30, the fifth 31-37, the sixth 38-50, and so on. Because every bucket holds the same number of registrants, a narrow age span indicates a densely populated age range. From the above information it can be seen that most of the registrants are young, particularly between 23 and 30 years old, and the user can make decisions on this basis.
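The age example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `age_spans`, the tiny sample list, and the bucket size of 2 are hypothetical and stand in for the 10000-registrant buckets mentioned in the text.

```python
from typing import List, Tuple


def age_spans(ages: List[int], bucket_size: int) -> List[Tuple[int, int]]:
    """Sort the ages, cut them into consecutive buckets of `bucket_size`
    records, and return the (youngest, oldest) age found in each bucket."""
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    ordered = sorted(ages)  # the ordered arrangement by score (age)
    spans = []
    for start in range(0, len(ordered), bucket_size):
        bucket = ordered[start:start + bucket_size]
        spans.append((bucket[0], bucket[-1]))
    return spans


# Hypothetical sample: twelve registrants, two per bucket.
ages = [30, 13, 25, 22, 27, 23, 37, 24, 26, 25, 31, 28]
spans = age_spans(ages, bucket_size=2)
```

With equal-sized buckets, a narrow span such as `(25, 25)` marks an age range where registrants are concentrated.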
Fig. 2a is a flowchart illustrating a data bucketing method according to an embodiment of the present invention, where the method may be applied to the electronic device 10 in fig. 1a or the electronic device 10 and the second electronic device 20 in fig. 1 b. As shown in fig. 2a, the method comprises:
101, data acquisition: acquiring data and scores thereof.
The score may be any data label that can provide a reference for decision making, such as age, gender, time, or preference score.
102, data sorting: writing the data and the scores thereof into a data structure corresponding to the current time period to obtain ordered arrangement of the data in the data structure based on the scores, wherein the data structure corresponding to the current time period also comprises the data of the previous time period; writing the data and the scores thereof into a data structure corresponding to a next time period in advance, wherein the current time period and the next time period use a first time point as a time demarcation point; and if the current time exceeds the first time point, switching the data structure corresponding to the next time period into the data structure corresponding to the current time period, and acquiring a new data structure corresponding to the next time period.
The current time period is a preset time range that includes the current time, and the next time period is the preset time range that follows the current time period. The preset time range may be any suitable length of time, such as 12 hours, 24 hours, or 48 hours. The first time point does not denote one specific time point; it denotes the demarcation point between any two adjacent time periods. If the preset time range is 24 hours, the current time period may be today and the next time period tomorrow; the demarcation point of each time period may be 0:00, or any other time of day, such as 12 pm or 1 pm.
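As a minimal sketch of how the current time period and the first time point could be computed, assume that periods are consecutive ranges of a fixed length counted from a reference instant. The function name `period_bounds` and the `origin` parameter are hypothetical and do not appear in the original text.

```python
from datetime import datetime, timedelta
from typing import Tuple


def period_bounds(now: datetime, period: timedelta,
                  origin: datetime) -> Tuple[datetime, datetime]:
    """Return (start, first_time_point) for the period that contains `now`.

    Periods are consecutive ranges of length `period` counted from `origin`;
    `first_time_point` is the instant at which the next period begins.
    """
    index = (now - origin) // period  # whole periods elapsed since origin
    start = origin + index * period
    return start, start + period


# Hypothetical setup: 24-hour periods with the demarcation point at 0:00.
origin = datetime(2019, 9, 20, 0, 0)
start, first_point = period_bounds(datetime(2019, 9, 21, 15, 30),
                                   timedelta(hours=24), origin)
```

Choosing a different `origin` (for example 12:00) moves the demarcation point without changing the rest of the logic.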
That is, the data acquired each time is written into two data structures simultaneously: the data structure corresponding to the current time period and the data structure corresponding to the next time period. When the time passes the demarcation point between the two time periods, the data structure corresponding to the next time period becomes the data structure corresponding to the current time period, and a new data structure is created (or another existing data structure is reused) as the data structure corresponding to the new next time period. New data continues to be written into both the data structure corresponding to the current time period and the data structure corresponding to the next time period, so the data structure corresponding to the current time period always contains the data of the previous time period and the data of the current time period. Using the data in this data structure for data bucketing therefore ensures both the timeliness and the scale of the sample data.
Specifically, the current time is compared with the first time point (i.e., the time demarcation point between the current time period and the next time period). If the current time exceeds the first time point, the data structure corresponding to the next time period is switched to be the data structure corresponding to the current time period, and a new data structure for the next time period is acquired. The data structure that previously corresponded to the current time period can be discarded, or can be left in the database without any further operation.
In the following description, the current time period is the current day, the next time period is the next day, and the demarcation point of each time period is 0:00. For example, the time periods are September 20, September 21, September 22, and so on. When the method starts to run, a first data structure and a second data structure are set to correspond to keytoday and keytomarrow respectively, and each piece of acquired data is written into both keytoday and keytomarrow. On September 20, data is written into the first data structure and the second data structure. When the time passes 0:00 and September 21 begins, the second data structure is switched to correspond to keytoday, a third data structure is set to correspond to keytomarrow, and data is then written into the second data structure and the third data structure. When the time passes 0:00 and September 22 begins, the third data structure is switched to correspond to keytoday, a fourth data structure is set to correspond to keytomarrow, and data is written into the third data structure and the fourth data structure. By analogy, each day's data is written into keytoday and keytomarrow at the same time, so the data in keytoday always includes today's data and yesterday's data. Performing data bucketing on the data in keytoday therefore ensures both the timeliness and the scale of the sample data.
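The keytoday/keytomarrow rotation described above can be sketched with two in-memory lists. This is an illustrative stand-in, not the database-backed embodiment: the class name `RollingSortedData` and its methods are hypothetical, and treating the demarcation instant itself as already belonging to the new period is an assumption.

```python
import bisect
from datetime import datetime, timedelta
from typing import List, Tuple


class RollingSortedData:
    """Keeps two score-ordered lists: one for the current time period and
    one that is filled in advance for the next time period."""

    def __init__(self, first_time_point: datetime, period: timedelta):
        self.first_time_point = first_time_point
        self.period = period
        self.today: List[Tuple[float, str]] = []     # plays the role of keytoday
        self.tomorrow: List[Tuple[float, str]] = []  # plays the role of keytomarrow

    def _rotate_if_needed(self, now: datetime) -> None:
        # Assumption: reaching the demarcation point counts as passing it.
        while now >= self.first_time_point:
            self.today = self.tomorrow  # next-period structure becomes current
            self.tomorrow = []          # fresh structure for the new next period
            self.first_time_point += self.period

    def write(self, item: str, score: float, now: datetime) -> None:
        """Write one record into both structures, keeping each sorted by score."""
        self._rotate_if_needed(now)
        for target in (self.today, self.tomorrow):
            bisect.insort(target, (score, item))

    def ordered(self) -> List[Tuple[float, str]]:
        """Ordered arrangement used for bucketing (current period only)."""
        return list(self.today)


store = RollingSortedData(datetime(2019, 9, 21), timedelta(days=1))
store.write("a", 30, datetime(2019, 9, 20, 10))  # September 20
store.write("b", 10, datetime(2019, 9, 21, 9))   # September 21
store.write("c", 20, datetime(2019, 9, 22, 8))   # September 22
```

After the third write, the current structure holds the records from September 21 and September 22, while the record from September 20 has aged out.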
In other embodiments, each data structure may instead be named after the system time of its time period. Taking the same example, the time periods are September 20, September 21, and September 22, and the demarcation point of each time period is 0:00. Data acquired on September 20 is written into the data structures named September 20 and September 21, data acquired on September 21 is written into the data structures named September 21 and September 22, and data acquired on September 22 is written into the data structures named September 22 and September 23. Bucketing is then performed on the data structure named after the current day; for example, if the current day is September 21, bucketing is performed on the data in the data structure named September 21. Compared with setting data structures to correspond to the current time period and the next time period and switching that correspondence each time a demarcation point passes, this approach is more efficient.
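A minimal sketch of the date-named variant follows. The key format (`prefix:YYYY-MM-DD`) and the function names are assumptions made for illustration; the original text only states that each data structure is named after its date.

```python
from datetime import date, timedelta
from typing import List


def keys_to_write(today: date, prefix: str = "bucket") -> List[str]:
    """Names of the two structures that a record arriving `today` is written to."""
    tomorrow = today + timedelta(days=1)
    return [f"{prefix}:{today.isoformat()}", f"{prefix}:{tomorrow.isoformat()}"]


def key_to_read(today: date, prefix: str = "bucket") -> str:
    """Name of the structure that bucketing reads on `today`."""
    return f"{prefix}:{today.isoformat()}"


write_keys = keys_to_write(date(2019, 9, 20))
read_key = key_to_read(date(2019, 9, 21))
```

Because the names are derived from the date, no explicit switching step is needed when a demarcation point passes: the structure that was "tomorrow" on September 20 is simply the one read on September 21.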
Data needs to be sorted before it is bucketed. In one embodiment, the data is sorted using the ordered data set function of a database, for example the sorted set structure of a redis database. In that case the data structure is an ordered data set in the redis database. After the application layer software obtains the data and the scores thereof, it writes the data into the ordered data set in the database through the application layer and database interface, so that the database's ordered data set sorts the data and yields the ordered arrangement. Specifically, in some embodiments, the application layer and database interface may sequentially write the data into the database based on a binary method according to the scores of the data, so as to obtain the ordered arrangement.
The database then transmits the ordered arrangement to the application layer through the application layer and database interface, and the application layer performs data bucketing according to it. Arranging the data with the ordered data set of the redis database gives high sorting efficiency and a fast response time.
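To make the idea of score-ordered insertion by a binary method concrete, the following is an in-memory stand-in for an ordered data set. It is a sketch only: the class `SortedSetSketch` is hypothetical, it is not how redis maintains its sorted sets internally, and it does not use any redis client API. It mirrors two behaviours of a sorted set: members are unique, and re-adding a member updates its score.

```python
import bisect
from typing import Dict, List, Tuple


class SortedSetSketch:
    """In-memory ordered set: unique members, ordered by ascending score."""

    def __init__(self) -> None:
        self._scores: Dict[str, float] = {}           # member -> current score
        self._ordered: List[Tuple[float, str]] = []   # (score, member), kept sorted

    def add(self, member: str, score: float) -> None:
        old = self._scores.get(member)
        if old is not None:
            # Binary search for the existing entry and remove it first.
            idx = bisect.bisect_left(self._ordered, (old, member))
            del self._ordered[idx]
        self._scores[member] = score
        # Binary search for the insertion position, then insert in place.
        bisect.insort(self._ordered, (score, member))

    def members(self) -> List[str]:
        """Members in ascending score order (the ordered arrangement)."""
        return [member for _, member in self._ordered]


s = SortedSetSketch()
s.add("u1", 42)
s.add("u2", 17)
s.add("u3", 30)
s.add("u1", 5)  # re-adding u1 updates its score and moves it to the front
```

Each insertion locates its position in O(log n) comparisons, so the collection never needs a full re-sort before bucketing.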
In other embodiments, to further ensure the scale of the sample data, before data bucketing is performed the data in the data structure corresponding to the current time period is checked to determine whether the number of data reaches a preset number threshold. If the number of data is greater than or equal to the preset number threshold, the number of data is considered large enough to meet the scale requirement of the sample, and data bucketing is performed on the data. If the number of data is smaller than the preset number threshold, the number is considered too small to meet the scale requirement, and samples need to be accumulated further; data bucketing is not performed at this time, but only after the number of data reaches the preset number threshold.
The preset number threshold may be set according to the requirements of different applications on the scale of the sample data, and may be any suitable number, which is not limited in the present invention.
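The threshold check can be sketched as a small gate placed in front of the bucketing step. The function name `bucket_when_ready` and the use of `None` to signal "keep accumulating" are illustrative assumptions.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def bucket_when_ready(ordered: Sequence[T], threshold: int,
                      do_bucketing: Callable[[Sequence[T]], R]) -> Optional[R]:
    """Run `do_bucketing` only if the sample count reaches `threshold`.

    Returns None when there are too few samples, which tells the caller
    to keep accumulating data and try again later.
    """
    if len(ordered) < threshold:
        return None
    return do_bucketing(ordered)


# Hypothetical bucketing step: split the ordered data into two halves.
halve = lambda xs: [list(xs[:len(xs) // 2]), list(xs[len(xs) // 2:])]
too_few = bucket_when_ready([1, 2, 3], threshold=4, do_bucketing=halve)
enough = bucket_when_ready([1, 2, 3, 4], threshold=4, do_bucketing=halve)
```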
103, data bucketing: bucketing the data according to the ordered arrangement of the data.
In the embodiment of the present invention, an even distribution method is adopted, that is, the data is distributed evenly into a certain number of buckets.
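A minimal sketch of even distribution into a fixed number of buckets is given below. The original text does not say how a remainder is handled when the data count is not divisible by the bucket count; giving the first few buckets one extra record is an assumption, and the function name `split_evenly` is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_evenly(ordered: Sequence[T], num_buckets: int) -> List[List[T]]:
    """Split already-ordered data into `num_buckets` consecutive buckets
    whose sizes differ by at most one."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    base, extra = divmod(len(ordered), num_buckets)
    buckets: List[List[T]] = []
    start = 0
    for i in range(num_buckets):
        size = base + (1 if i < extra else 0)  # first `extra` buckets get one more
        buckets.append(list(ordered[start:start + size]))
        start += size
    return buckets


buckets = split_evenly(list(range(1, 11)), 3)
```

The score span of each bucket can then be read from its first and last element, since the input is already ordered by score.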
The data and the scores thereof are respectively written into the data structure corresponding to the current time period and the data structure corresponding to the next time period, and the data structure corresponding to the current time period also comprises the data of the previous time period. Therefore, the data structure corresponding to the current time period simultaneously comprises the data of the previous time period and the data of the current time period, effective data ordered arrangement can be obtained based on enough sample data in the data structure of the current time period, data bucket sorting is carried out based on the effective data ordered arrangement, and accurate bucket sorting results can be obtained.
In some application scenarios, the data includes two or more types of data. For example, the application layer obtains users' preference scores for application software A and for application software B (the preference score may be any number within 1-100, or any number within 0-1, and serves as the score of the data described above). The application layer then passes sample data in which the data about A and the data about B are mixed to the database through the application layer and database interface. To obtain an accurate bucketing result for each type of data, the data needs to be classified before it is written into the database (that is, before data sorting).
Specifically, the data is grouped according to data category to obtain at least two grouped data of the data; the at least two grouped data and the corresponding scores thereof are then written into at least two data structures corresponding to the current time period respectively, to obtain at least two ordered arrangements based on the scores; and the at least two grouped data and the corresponding scores thereof are also written into at least two data structures corresponding to the next time period respectively. Each grouped data is then bucketed according to its own ordered arrangement.
Continuing the above example, after the sample data mixing A and B is obtained, data classification is performed first, dividing the sample data into sample data about A and sample data about B. The sample data about A and the sample data about B are then written into the database separately for sorting (for example, into a data structure A and a data structure B of the database), yielding an ordered arrangement about A and an ordered arrangement about B. The data of application software A is bucketed based on the ordered arrangement about A, and the data of application software B is bucketed based on the ordered arrangement about B. The data classification step may be performed at the application layer, or by the application layer and database interface.
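The classification step can be sketched as grouping mixed records by category and ordering each group by score independently. The record layout `(category, item, score)` and the function name `group_and_sort` are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str, float]  # (category, item, score)


def group_and_sort(records: Iterable[Record]) -> Dict[str, List[Tuple[float, str]]]:
    """Group mixed records by category, then order each group by score."""
    groups: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for category, item, score in records:
        groups[category].append((score, item))
    return {category: sorted(rows) for category, rows in groups.items()}


# Hypothetical mixed sample: preference scores for applications A and B.
mixed = [("A", "u1", 80), ("B", "u1", 35), ("A", "u2", 60), ("B", "u2", 90)]
ordered_by_group = group_and_sort(mixed)
```

Each group's ordered arrangement is then bucketed on its own, so the scores for A never influence the bucket boundaries for B.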
In the following, an embodiment of the present invention is explained in detail for the above application, and referring to fig. 2b, the data bucketing method includes:
201: the application layer obtains the data and its scores.
In this application example, the data includes the data of application software A and its scores, and the data of application software B and its scores. The scores are users' preference scores for application software A and for application software B, and can be obtained by a questionnaire survey of the registrants of the application software. For application software A, each registrant who scores application software A constitutes one piece of sample data for application software A, and that registrant's preference score for application software A constitutes the score of the sample data (and likewise for application software B).
202: and grouping the data according to the data category through an application layer and a database interface to obtain at least two grouped data of the data.
The application layer transmits the data and the scores thereof to the application layer and database interface, and the data are classified through the application layer and the database interface. In other embodiments, after the application layer obtains the data and the scores thereof, the application layer may also classify the data first and then transmit the classified packet data to the application layer and the database interface. In the above application example, after data classification is performed, packet data for the application software a and packet data for the application software B are obtained.
203: writing, through the application layer and database interface, the at least two groups of data and their corresponding scores into at least two data structures corresponding to the current time period in the database, so as to obtain at least two score-based ordered arrangements; and writing the at least two groups of data and their corresponding scores in advance into at least two data structures corresponding to the next time period. The current time period and the next time period take the first time point as their time demarcation point.
In the above application example, the grouped data for application software A and the grouped data for application software B are written, through the application layer and database interface, into two data structures corresponding to the current time period (for example, a data structure A1 and a data structure B1), and are also written in advance into two data structures corresponding to the next time period (for example, a data structure A2 and a data structure B2).
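The double write described above can be sketched as follows. This is a minimal in-memory illustration, not the database implementation: the class name `PeriodStore`, the integer period index, and the use of Python's `bisect.insort` to keep each list ordered by score are all assumptions made for the example.

```python
import bisect
from collections import defaultdict


class PeriodStore:
    """In-memory stand-in for per-category, per-period ordered data structures."""

    def __init__(self):
        # (category, period index) -> list of (score, item), kept sorted by score
        self.structures = defaultdict(list)

    def write(self, category, item, score, current_period):
        # Write into the current period's structure and, in advance,
        # into the next period's structure.
        for period in (current_period, current_period + 1):
            bisect.insort(self.structures[(category, period)], (score, item))


store = PeriodStore()
store.write("A", "u1", 7.0, current_period=1)  # lands in A1 and A2
store.write("A", "u2", 2.0, current_period=1)
store.write("B", "u1", 3.0, current_period=1)  # lands in B1 and B2
```

Here the keys `("A", 1)` and `("A", 2)` play the roles of data structures A1 and A2 in the example.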
204: determining whether the number of data items in the ordered arrangement is greater than or equal to a preset number threshold, and if so, performing data bucketing based on the ordered arrangement.
205: performing data bucketing on the grouped data according to the ordered arrangement corresponding to the grouped data.
In the application example, for application software A, it is determined whether the amount of data in its data structure corresponding to the current time period is greater than or equal to the preset number threshold. If so, data bucketing is performed based on that data; otherwise, data bucketing is performed once the data in the data structure corresponding to the current time period reaches the preset number threshold. The same determination is made for application software B.
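The threshold check in steps 204 and 205 can be sketched as below. The equal-frequency rule used to derive bucket boundaries is an assumption for illustration (this passage does not fix a specific bucketing rule), and the names `bucket_if_ready`, `min_count`, and `num_buckets` are hypothetical.

```python
def bucket_if_ready(ordered_scores, min_count, num_buckets):
    """Return bucket boundary scores, or None if there are too few samples.

    ordered_scores must already be sorted ascending (the ordered arrangement).
    """
    n = len(ordered_scores)
    if n < min_count:
        # Not enough data yet: wait until the threshold is reached.
        return None
    # Assumed rule: equal-frequency buckets, boundary i sits at rank i*n/num_buckets.
    return [ordered_scores[(i * n) // num_buckets] for i in range(1, num_buckets)]


too_few = bucket_if_ready([1, 2, 3], min_count=5, num_buckets=2)
boundaries = bucket_if_ready([1, 2, 3, 4, 5, 6, 7, 8], min_count=8, num_buckets=4)
```

Returning `None` below the threshold lets the caller retry after more data has been written, which mirrors the "wait until the preset number threshold is reached" branch.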
206: if the current time exceeds the first time point, switching the data structure corresponding to the next time period to be the data structure corresponding to the current time period, and acquiring a new data structure corresponding to the next time period.
The application layer obtains data and scores in a continuous or intermittent ongoing process. Each time the application layer obtains data and scores, it may write them into the database through the application layer and database interface; alternatively, it may write them after a certain amount has accumulated (for example, 100 pieces of data). The application layer and database interface then keeps writing data into both the data structure corresponding to the current time period and the data structure corresponding to the next time period until the current time reaches the first time point. At that point, the data structure corresponding to the next time period is switched to be the data structure corresponding to the current time period, and a new data structure corresponding to the next time period is acquired. Data acquired afterwards are written into the data structure corresponding to the new current time period and the data structure corresponding to the new next time period.
In the above application example, when the current time exceeds the first time point, data structure A2 and data structure B2, which corresponded to the next time period, are switched to be the data structures corresponding to the current time period, and the new data structures corresponding to the next time period for application software A and application software B are data structure A3 and data structure B3, respectively. Newly acquired grouped data for application software A continue to be written into A2 and A3, and newly acquired grouped data for application software B into B2 and B3; data structure A1 and data structure B1 may be discarded, and so on.
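The switch in step 206 can be sketched for a single category as follows. This is a minimal in-memory illustration under stated assumptions: time is a plain number, periods have a fixed length, and the class name `RollingStructures` and its attributes are hypothetical.

```python
import bisect


class RollingStructures:
    """Current/next ordered structures for one category, switched at each time point."""

    def __init__(self, first_time_point, period_length):
        self.boundary = first_time_point    # demarcation between current and next period
        self.period_length = period_length  # assumed fixed-length periods
        self.current = []                   # e.g. data structure A1
        self.next = []                      # e.g. data structure A2

    def _switch_if_needed(self, now):
        while now > self.boundary:
            # The next-period structure becomes the current one; the old
            # current structure is dropped and a fresh next structure is created.
            self.current = self.next
            self.next = []
            self.boundary += self.period_length

    def write(self, item, score, now):
        self._switch_if_needed(now)
        bisect.insort(self.current, (score, item))
        bisect.insort(self.next, (score, item))


r = RollingStructures(first_time_point=10, period_length=10)
r.write("u1", 5.0, now=3)
r.write("u2", 1.0, now=8)
r.write("u3", 9.0, now=12)  # past the first time point: triggers the switch
```

After the switch, `r.current` already holds the two samples written during the earlier period plus the new one, so the current structure is never empty at the start of a period.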
Accordingly, as shown in fig. 3, an embodiment of the present invention further provides a data bucketing apparatus, which can be applied to the electronic device 10 in fig. 1a, or to the electronic device 10 and the second electronic device 20 in fig. 1b. The data bucketing apparatus 300 includes:
a data acquisition module 301, configured to acquire data and scores thereof;
a data sorting module 302, configured to write the data and the scores thereof into a data structure corresponding to a current time period, so as to obtain a score-based data ordered arrangement of the data in the data structure, where the data structure corresponding to the current time period further includes data of a previous time period;
writing the data and the scores thereof into a data structure corresponding to a next time period in advance, wherein the current time period and the next time period use a first time point as a time demarcation point;
if the current time exceeds the first time point, switching the data structure corresponding to the next time period into the data structure corresponding to the current time period, and acquiring a new data structure corresponding to the next time period;
and a data bucket dividing module 303, configured to perform data bucket dividing on the data according to the data ordered arrangement.
In some embodiments, the data acquisition module 301 is specifically configured to:
acquiring data and scores thereof at an application layer;
the data sorting module 302 is specifically configured to:
and writing the data and the scores thereof into the data ordered set of the database through the application layer and the database interface so as to obtain the data ordered arrangement.
In some embodiments, the data sorting module 302 is specifically configured to:
and sequentially writing the data into the ordered set according to the scores of the data, using a binary (bisection) method to locate each insertion position.
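As an illustration of writing by a binary method, the sketch below inserts each item at the position found by binary search on the score, so the list stays ordered after every write. It assumes the "binary method" refers to binary search for the insertion point; the function name `binary_insert` is hypothetical and the list stands in for the database's ordered set.

```python
def binary_insert(ordered, item, score):
    """Insert (score, item) into a list sorted by score; return the insertion index."""
    lo, hi = 0, len(ordered)
    while lo < hi:
        mid = (lo + hi) // 2
        if ordered[mid][0] <= score:
            lo = mid + 1  # insertion point is to the right of mid
        else:
            hi = mid      # insertion point is at mid or to its left
    ordered.insert(lo, (score, item))
    return lo


ordered_set = []
for name, value in [("a", 5), ("b", 1), ("c", 3), ("d", 3)]:
    binary_insert(ordered_set, name, value)
```

Items with equal scores are placed after existing ones, so arrival order is preserved among ties.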
In some embodiments, as shown in fig. 4, the apparatus further comprises:
a grouping module 304, configured to group the data according to a data category to obtain at least two grouped data of the data;
the data sorting module 302 is specifically configured to: writing at least two grouped data of the data and corresponding scores thereof into at least two data structures corresponding to the current time period respectively so as to obtain at least two ordered data arrangements based on the scores respectively;
at least two grouped data of the data and corresponding scores thereof are respectively written into at least two data structures corresponding to the next time period in advance;
the data bucket dividing module 303 is specifically configured to: and performing data bucket separation on the grouped data according to the data ordered arrangement corresponding to the grouped data.
In some embodiments, data bucketing module 303 is further to:
determine whether the number of data items in the ordered arrangement is greater than or equal to a preset number threshold, and if so, perform data bucketing based on the ordered arrangement.
In some embodiments, the database is a redis database.
In some embodiments, the current time period is a preset time range including a current time, and the next time period is a next preset time range of the current time period.
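One way to picture these preset time ranges is as fixed-length windows. The sketch below assumes numeric timestamps and windows aligned to a hypothetical `origin`; the function name `period_bounds` is illustrative and none of these names come from the application.

```python
def period_bounds(now, period_length, origin=0):
    """Return ((start, end) of the current period, (start, end) of the next period)."""
    index = (now - origin) // period_length
    start = origin + index * period_length
    end = start + period_length  # this is the time demarcation point
    return (start, end), (end, end + period_length)


current_range, next_range = period_bounds(now=125, period_length=60)
```

The shared endpoint of the two ranges corresponds to the first time point that separates the current time period from the next.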
The data and their scores are written into both the data structure corresponding to the current time period and the data structure corresponding to the next time period, and the data structure corresponding to the current time period also contains the data of the previous time period. The data structure corresponding to the current time period therefore holds both the previous period's data and the current period's data, so a valid ordered arrangement can be obtained from a sufficient amount of sample data, and data bucketing based on that arrangement yields accurate bucketing results.
It should be noted that the above-mentioned apparatus can execute the method provided by the embodiments of the present application, and has corresponding functional modules and beneficial effects for executing the method. For technical details which are not described in detail in the device embodiments, reference is made to the methods provided in the embodiments of the present application.
Fig. 5 is a schematic diagram of the hardware structure of an electronic device (the electronic device 10 or the electronic device 20; the electronic device 10 is taken as the example). As shown in fig. 5, the electronic device 10 includes:
one or more processors 131 and a memory 132, with one processor 131 being an example in fig. 5.
The processor 131 and the memory 132 may be connected by a bus or other means, and fig. 5 illustrates the connection by a bus as an example.
The memory 132, which is a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as program instructions/modules (e.g., the data acquisition module 301 shown in fig. 3) corresponding to the data bucketing method in the embodiments of the present application. The processor 131 executes various functional applications and data processing of the electronic device, i.e., implementing the data bucketing method of the above method embodiment, by running the nonvolatile software programs, instructions and modules stored in the memory 132.
The memory 132 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device, and the like. Further, the memory 132 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, memory 132 may optionally include memory located remotely from processor 131, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more modules are stored in the memory 132 and, when executed by the one or more processors 131, perform the data bucketing method in any of the above-described method embodiments, for example, method steps 101 to 103 in fig. 2a and method steps 201 to 206 in fig. 2b described above, and realize the functions of modules 301 to 303 in fig. 3 and modules 301 to 304 in fig. 4.
The product can execute the method provided by the embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the methods provided in the embodiments of the present application.
Embodiments of the present application provide a non-transitory computer-readable storage medium storing computer-executable instructions which, when executed by one or more processors, such as the processor 131 in fig. 5, enable the one or more processors to perform the data bucketing method in any of the above method embodiments, for example, method steps 101 to 103 in fig. 2a and method steps 201 to 206 in fig. 2b described above, and to realize the functions of modules 301 to 303 in fig. 3 and modules 301 to 304 in fig. 4.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
Through the above description of the embodiments, those skilled in the art will clearly understand that the embodiments may be implemented by software plus a general hardware platform, and may also be implemented by hardware. It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware related to instructions of a computer program, which can be stored in a computer readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-only memory (ROM), a Random Access Memory (RAM), or the like.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; within the idea of the invention, also technical features in the above embodiments or in different embodiments may be combined, steps may be implemented in any order, and there are many other variations of the different aspects of the invention as described above, which are not provided in detail for the sake of brevity; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for data barreling, the method comprising:
acquiring data and scores thereof;
writing the data and the scores thereof into a data structure corresponding to the current time period to obtain ordered arrangement of the data in the data structure based on the scores, wherein the data structure corresponding to the current time period also comprises the data of the previous time period;
writing the data and the scores thereof into a data structure corresponding to a next time period in advance, wherein the current time period and the next time period use a first time point as a time demarcation point;
if the current time exceeds the first time point, switching the data structure corresponding to the next time period into the data structure corresponding to the current time period, and acquiring a new data structure corresponding to the next time period;
and performing data bucketing on the data according to the ordered arrangement.
2. The method of claim 1, wherein the obtaining data and its scores comprises:
acquiring data and scores thereof at an application layer;
wherein the data structure is an ordered set of the database, and writing the data and the scores thereof into the data structure comprises:
and writing the data and the scores thereof into the data ordered set of the database through the application layer and the database interface so as to obtain the data ordered arrangement.
3. The data bucketing method of claim 2, wherein said writing said data and its scores to a data ordered collection of a database through an application layer and database interface comprises:
and sequentially writing the data into the ordered data set based on a binary method according to the scores of the data.
4. The method of any of claims 1-3, wherein after said obtaining data and its scores, the method further comprises:
grouping the data according to data categories to obtain at least two grouped data of the data;
writing the data and the scores thereof into a data structure corresponding to the current time period to obtain the ordered arrangement of the data in the data structure based on the scores, including:
writing at least two grouped data of the data and corresponding scores thereof into at least two data structures corresponding to the current time period respectively so as to obtain at least two ordered data arrangements based on the scores respectively;
the pre-writing the data and the scores thereof into a data structure corresponding to the next time period comprises:
at least two grouped data of the data and corresponding scores thereof are respectively written into at least two data structures corresponding to the next time period in advance;
the data barreling of the data according to the data ordered arrangement comprises:
and performing data bucket separation on the grouped data according to the data ordered arrangement corresponding to the grouped data.
5. The method of any of claims 1-3, wherein prior to performing data bucketing, the method further comprises:
and determining whether the number of the data in the data ordered arrangement is greater than or equal to a preset number threshold, and if so, performing data barreling based on the data ordered arrangement.
6. A data bucketing method according to claim 2 or 3, characterised in that said database is a redis database.
7. The data bucketizing method according to any one of claims 1 to 3, wherein said current time period is a preset time range including a current time, and said next time period is a next preset time range of said current time period.
8. A data barreling apparatus, the apparatus comprising:
the data acquisition module is used for acquiring data and scores thereof;
the data sorting module is used for writing the data and the scores thereof into a data structure corresponding to the current time period so as to obtain the ordered arrangement of the data in the data structure based on the scores, wherein the data structure corresponding to the current time period also comprises the data of the previous time period;
writing the data and the scores thereof into a data structure corresponding to a next time period in advance, wherein the current time period and the next time period use a first time point as a time demarcation point;
if the current time exceeds the first time point, switching the data structure corresponding to the next time period into the data structure corresponding to the current time period, and acquiring a new data structure corresponding to the next time period;
and the data bucket dividing module is used for carrying out data bucket dividing on the data according to the data ordered arrangement.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor, the memory storing instructions executable by the at least one processor to enable the at least one processor to perform the method of any of claims 1-7.
10. A non-transitory computer-readable storage medium having stored thereon computer-executable instructions that, when executed by an electronic device, cause the electronic device to perform the method of any of claims 1-7.
CN201911100853.1A 2019-11-12 2019-11-12 Data barreling method and device, electronic equipment and storage medium Pending CN110955802A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911100853.1A CN110955802A (en) 2019-11-12 2019-11-12 Data barreling method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911100853.1A CN110955802A (en) 2019-11-12 2019-11-12 Data barreling method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110955802A true CN110955802A (en) 2020-04-03

Family

ID=69977282

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911100853.1A Pending CN110955802A (en) 2019-11-12 2019-11-12 Data barreling method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110955802A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117493422A (en) * 2023-12-29 2024-02-02 智者四海(北京)技术有限公司 Sampling method, sampling device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN109934619A (en) User's portrait tag modeling method, apparatus, electronic equipment and readable storage medium storing program for executing
CN106951925A (en) Data processing method, device, server and system
CN110765615B (en) Logistics simulation method, device and equipment
CN109816043B (en) Method and device for determining user identification model, electronic equipment and storage medium
CN108563681B (en) Content recommendation method and device, electronic equipment and system
CN107864405B (en) Viewing behavior type prediction method, device and computer readable medium
WO2019062079A1 (en) Tag library-based segmentation method for service objects, electronic device and storage medium
CN106326242A (en) Application pushing method and apparatus
CN112651635A (en) Risk identification method and device, electronic equipment and storage medium
CN110766438A (en) Method for analyzing user behaviors of power grid users through artificial intelligence
CN114240101A (en) Risk identification model verification method, device and equipment
CN110378739B (en) Data traffic matching method and device
CN114490786A (en) Data sorting method and device
CN110955802A (en) Data barreling method and device, electronic equipment and storage medium
CN112784159B (en) Content recommendation method and device, terminal equipment and computer readable storage medium
CN112560463B (en) Text multi-labeling method, device, equipment and storage medium
CN111882113B (en) Enterprise mobile banking user prediction method and device
CN111325255B (en) Specific crowd delineating method and device, electronic equipment and storage medium
CN111062736A (en) Model training and clue sequencing method, device and equipment
CN110399026B (en) Multi-source single-output reset method and device based on FPGA and related equipment
US20140164270A1 (en) Method, system and computer readable medium for recommending medium users
CN111831892A (en) Information recommendation method, information recommendation device, server and storage medium
CN107613086B (en) Contact person information processing method and device and mobile terminal
CN109491970A (en) Imperfect picture detection method, device and storage medium towards cloud storage
CN115470279A (en) Data source conversion method, device, equipment and medium based on enterprise data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination