CN111143415B - Data processing method, device and computer readable storage medium - Google Patents
- Publication number
- CN111143415B (application CN201911367948.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- message queue
- information
- line data
- analyzed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2462—Approximate or statistical queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/546—Message passing systems or structures, e.g. queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/04—Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Fuzzy Systems (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- Technology Law (AREA)
- General Business, Economics & Management (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The embodiment of the invention discloses a data processing method, device and medium. Acquired service line data of various types are recorded to a first message queue, and the valid data stream of each type of service line data in the first message queue is extracted. According to a preset window time, each valid data stream is counted using a sliding window to obtain data blocks to be analyzed; each data block to be analyzed is analyzed according to its corresponding business processing rule, and the analysis result is stored in a second message queue. Message queue buffering combined with sliding-window data reading allows real-time service line data to be processed directly and avoids unnecessary time spent on data reading, so that the value information in massive data can be mined more effectively. Because the analysis results are stored in the second message queue, the service party can obtain valuable data information directly by reading the second message queue.
Description
Technical Field
The present invention relates to the field of data technologies, and in particular, to a data processing method, apparatus, and computer readable storage medium.
Background
An e-commerce platform generates data stream information at every moment, including real-time information stream data such as user logins, the regions to which users belong, the quantity, amount and category of commodities sold by platform merchants, and buyers' commodity browsing and purchasing information. For e-commerce platforms that emphasize the value of data, especially real-time data, it is extremely important to sort out the relationships among the platform's real-time data, restructure the data information, and mine its potential value as fully as possible within the shortest possible time.
Many data processing frameworks on the market today use batch processing to handle historical data read from a database or data storage medium. This approach has several inherent drawbacks. The first disadvantage is that reading the data from the database or storage medium itself takes a certain amount of time, and in scenarios with high timeliness requirements the value of the data is reduced accordingly. The second disadvantage is that, by the nature of batch processing, one batch of data must be processed completely before the next batch begins, which not only leaves the computing engine underutilized but also causes unavoidable processing delay from switching between batches across the whole data pool. The third disadvantage is that massive data generated at a given moment may exceed the computing capacity of the computing framework, leading to untimely processing or even data loss and, in severe cases, system downtime.
It can be seen that how to effectively mine the value information in massive data is a problem to be solved by those skilled in the art.
Disclosure of Invention
The embodiment of the invention aims to provide a data processing method, a data processing device and a computer readable storage medium, which can effectively mine the value information of mass data.
In order to solve the above technical problems, an embodiment of the present invention provides a data processing method, including:
recording the acquired various service line data to a first message queue;
extracting effective data streams of various service line data in the first message queue;
according to the preset window time, counting each effective data stream by utilizing a sliding window to obtain a data block to be analyzed;
and analyzing each data block to be analyzed according to the corresponding business processing rule, and storing the obtained analysis result into a second message queue.
Optionally, the recording the acquired various service line data to the first message queue includes:
adding tag information to the acquired various service line data according to a preset classification rule;
and recording various business line data added with the tag information to a first message queue.
Optionally, the extracting the valid data flow of the various service line data in the first message queue includes:
sorting the target service line data according to the timestamp corresponding to the target service line data to obtain a data stream; wherein the target service line data is any one type of service line data among all the service line data;
extracting effective data flow in the data flow according to the data filtering rule corresponding to the target service line data; wherein, different tag information has data filtering rules corresponding to the tag information.
Optionally, the analyzing each data block to be analyzed according to the corresponding service processing rule, and storing the obtained analysis result in the second message queue includes:
when the data block to be analyzed is commodity transaction information, counting the sales quantity and sales amount of different commodity categories under different regions in the commodity transaction information according to preset region information and commodity category information;
and storing the commodity transaction information with the top N sales quantities and the commodity transaction information with the top N sales amounts into a second message queue according to the correspondence among the region, the commodity category, the sales quantity and the sales amount.
Optionally, the analyzing each data block to be analyzed according to the corresponding service processing rule, and storing the obtained analysis result in the second message queue includes:
when the data block to be analyzed is user browsing information, counting the click counts, in different pre-divided time periods, of advertisement information carrying the same tag information;
and storing the advertisement information to a second message queue according to the correspondence among the tag information, the time period and the click count.
Optionally, the analyzing each data block to be analyzed according to the corresponding service processing rule, and storing the obtained analysis result in the second message queue includes:
when the data block to be analyzed is user login information, counting the user login quantity in different time periods under different regions in the user login information according to preset region information;
and storing the user login information into a second message queue according to the corresponding relation of the region, the time period and the user login quantity.
The embodiment of the invention also provides a data processing device which comprises a recording unit, an extracting unit, a statistics unit and an analysis unit;
the recording unit is used for recording the acquired various service line data to a first message queue;
the extracting unit is used for extracting the effective data streams of various service line data in the first message queue;
the statistics unit is used for counting each effective data stream by utilizing a sliding window according to preset window time to obtain a data block to be analyzed;
the analysis unit is used for analyzing each data block to be analyzed according to the corresponding business processing rule, and storing the obtained analysis result into the second message queue.
Optionally, the recording unit is specifically configured to add tag information to each type of acquired service line data according to a preset classification rule; and recording various business line data added with the tag information to a first message queue.
Optionally, the extraction unit includes a sorting subunit and a filtering subunit;
the sorting subunit is configured to sort the target service line data according to the timestamp corresponding to the target service line data, so as to obtain a data stream; wherein the target service line data is any one type of service line data among all the service line data;
the filtering subunit is configured to extract an effective data stream from the data streams according to a data filtering rule corresponding to the target service line data; wherein, different tag information has data filtering rules corresponding to the tag information.
Optionally, the analysis unit includes a statistics subunit and a storage subunit;
the statistics subunit is used for counting sales quantity and sales amount of different commodity categories under different regions in the commodity transaction information according to preset region information and commodity category information when the data block to be analyzed is commodity transaction information;
the storage subunit is configured to store the commodity transaction information with the top N sales quantities and the commodity transaction information with the top N sales amounts to the second message queue according to the correspondence among the region, the commodity category, the sales quantity and the sales amount.
Optionally, the analysis unit includes a statistics subunit and a storage subunit;
the statistics subunit is used for counting the click counts, in different pre-divided time periods, of advertisement information carrying the same tag information when the data block to be analyzed is user browsing information;
and the storage subunit is used for storing the advertisement information to a second message queue according to the correspondence among the tag information, the time period and the click count.
Optionally, the analysis unit includes a statistics subunit and a storage subunit;
the statistics subunit is used for counting the number of user logins in different regions and different time periods in the user login information according to preset region information when the data block to be analyzed is the user login information;
the storage subunit is configured to store the user login information to a second message queue according to a corresponding relationship of a region, a time period and a user login number.
The embodiment of the invention also provides a data processing device, which comprises:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of any one of the data processing methods described above.
Embodiments of the present invention also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any one of the data processing methods described above.
According to the above technical solution, the acquired service line data of each type are recorded to a first message queue. The message queue helps the data computing engine buffer and deliver real-time information stream data, so that massive information streams produced in an instant or within a short time range can be processed properly and information loss caused by an information explosion is avoided. The valid data stream of each type of service line data in the first message queue is then extracted. According to the preset window time, each valid data stream is counted using a sliding window to obtain data blocks to be analyzed; each data block to be analyzed is analyzed according to its corresponding business processing rule, and the analysis result is stored in a second message queue. In this solution, service line data generated in real time by the online platform does not need to be written to a database before it is read. Message queue buffering and sliding-window data reading allow real-time service line data to be processed directly, avoiding unnecessary time spent on data reading, so that the value information in massive data can be mined more effectively. Because the analysis results are stored in the second message queue, the service party can obtain valuable data information directly by reading the second message queue.
Drawings
To describe the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a data processing method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the hardware structure of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without making any inventive effort are within the scope of the present invention.
In order to better understand the aspects of the present invention, the present invention will be described in further detail with reference to the accompanying drawings and detailed description.
Next, a data processing method provided by the embodiment of the present invention is described in detail. Fig. 1 is a flowchart of a data processing method according to an embodiment of the present invention, where the method includes:
S101: Record the acquired service line data of each type to a first message queue.
The service line data may come from user data, commodity data and the like generated by an online transaction platform. In the embodiment of the invention, a message queue is used to store the service line data, which helps the data computing engine buffer and deliver real-time information stream data, so that massive information streams produced in an instant or within a short time range can be processed properly and information loss caused by an information explosion is avoided.
The service line data contains different types of data. To facilitate analysis and management, the data types that the service line data may contain can be classified in advance. When service line data is acquired, tag information can be added to it according to a preset classification rule, and the service line data with the tag information added is then recorded to the first message queue.
The tag information is used to distinguish between different data types. The embodiment of the present invention does not limit the specific form of the tag information; for example, a combination of digits or letters may be used.
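The tagging and recording of S101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `kind` field, the tag values such as `T01`, and the use of an in-memory `deque` in place of a real message queue are all assumptions made for the example.

```python
from collections import deque

# Hypothetical classification rule: map a record's "kind" field to a tag.
# The tag values are illustrative; the description only says that tags may
# combine digits or letters.
CLASSIFICATION_RULE = {
    "transaction": "T01",
    "browse": "B01",
    "login": "L01",
}


def tag_record(record):
    """Return a copy of the record with tag information attached."""
    tag = CLASSIFICATION_RULE.get(record["kind"], "UNKNOWN")
    return {**record, "tag": tag}


def record_to_queue(records, queue):
    """Tag each record and append it to the first message queue."""
    for record in records:
        queue.append(tag_record(record))
    return queue


first_queue = deque()  # in-memory stand-in for a real message queue
record_to_queue(
    [
        {"kind": "login", "user": "u1", "ts": 3},
        {"kind": "transaction", "user": "u2", "ts": 1, "amount": 20.0},
    ],
    first_queue,
)
```

In a deployment the `deque` would presumably be replaced by a message queue client; the description does not name a specific product.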
S102: Extract the valid data stream of each type of service line data in the first message queue.
Service line data often contains regular data that has no analysis value. To improve analysis efficiency, this regular data can be filtered out so that only valid data with analysis value is extracted.
The embodiment of the invention is described taking any one type of service line data, namely the target service line data, as an example. In a specific implementation, the target service line data can be sorted according to its corresponding timestamp to obtain a data stream, and the valid data stream can then be extracted from that data stream according to the data filtering rule corresponding to the target service line data; wherein different tag information has its own corresponding data filtering rule.
Different types of data have different regular data. Because the embodiment of the invention uses tag information to distinguish different types of data, a corresponding filtering rule can be preset for each tag. A filtering rule may include the data content or data characteristics of the regular data, so that the regular data can be filtered out and the valid data stream obtained.
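The sort-then-filter step of S102 can be sketched as below. The filter predicates, the tag values and the record fields (`ts`, `amount`, `user`) are hypothetical; the description only states that each tag has its own filtering rule describing the regular data to drop.

```python
# Hypothetical filter rules keyed by tag: each returns True for records
# considered to have analysis value.
FILTER_RULES = {
    "T01": lambda r: r.get("amount", 0) > 0,     # drop zero-value transactions
    "L01": lambda r: r.get("user") is not None,  # drop anonymous login events
}


def extract_valid_stream(records, tag):
    """Sort one service line's records by timestamp, then drop regular data."""
    stream = sorted((r for r in records if r["tag"] == tag), key=lambda r: r["ts"])
    keep = FILTER_RULES.get(tag, lambda r: True)  # no rule: keep everything
    return [r for r in stream if keep(r)]


records = [
    {"tag": "T01", "ts": 5, "amount": 12.0},
    {"tag": "T01", "ts": 2, "amount": 0},
    {"tag": "T01", "ts": 1, "amount": 7.5},
    {"tag": "L01", "ts": 3, "user": "u1"},
]
valid = extract_valid_stream(records, "T01")
```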
S103: Count each valid data stream using a sliding window according to the preset window time to obtain data blocks to be analyzed.
The sliding window is a technique known from network transmission, where it improves throughput by allowing the sender to transmit additional packets before receiving any acknowledgement, while the receiver tells the sender how many packets can be transmitted at a time, which helps to avoid network congestion.
Because the valid data streams are large, the embodiment of the present invention processes each valid data stream with a sliding window to ensure that data processing proceeds in an orderly manner. Every valid data stream is processed in a similar way, so the following description takes the processing of one valid data stream as an example.
The preset window time can be regarded as the time span over which the sliding window collects data multiple times; its value is greater than the time corresponding to a single sliding window.
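One common way to apply a time-based sliding window to a timestamp-ordered stream is sketched below. The `window` and `step` parameters and the half-open interval convention are assumptions made for illustration; the description itself only specifies a preset window time and does not fix these details.

```python
def sliding_windows(stream, window, step):
    """Yield (start, block) pairs from a stream sorted by timestamp.

    Each block holds the records with start <= ts < start + window, and
    start advances by `step`, so consecutive blocks overlap when step < window.
    """
    if not stream:
        return
    start = stream[0]["ts"]
    last_ts = stream[-1]["ts"]
    while start <= last_ts:
        block = [r for r in stream if start <= r["ts"] < start + window]
        yield start, block
        start += step


stream = [{"ts": t} for t in (0, 1, 4, 5, 9)]
blocks = list(sliding_windows(stream, window=4, step=2))
```

With `window=4` and `step=2`, a record with timestamp 4 falls into both the block starting at 2 and the block starting at 4, which is the overlap that distinguishes a sliding window from fixed, non-overlapping batches.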
S104: Analyze each data block to be analyzed according to the corresponding business processing rule, and store the obtained analysis result in a second message queue.
In the preset window time, each effective data stream corresponds to one data block to be analyzed, and the corresponding business processing rules are different because the data types contained in the data blocks to be analyzed are different.
The data types corresponding to the data blocks to be analyzed can comprise commodity transaction information, user browsing information, user login information and the like.
When the data block to be analyzed is commodity transaction information, the sales quantity and sales amount of different commodity categories under different regions in the commodity transaction information can be counted according to preset region information and commodity category information.
The region refers to the area to which a commodity seller belongs. In practical applications, regions can be divided at different levels, such as the provincial, municipal or county level.
After the sales quantity and sales amount of the different commodity categories in the different regions have been counted, the commodity transaction information with the top N sales quantities and the commodity transaction information with the top N sales amounts can be stored in the second message queue according to the correspondence among the region, the commodity category, the sales quantity and the sales amount.
The value of N may be set according to actual requirements, and is not limited herein, for example, the value of N may be set to 10.
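The commodity transaction rule can be sketched as a group-by followed by two top-N selections. The field names, the sample regions and the plain list standing in for the second message queue are assumptions for the example.

```python
from collections import defaultdict


def top_n_sales(block, n):
    """Aggregate sales per (region, category), then return the top n rows
    by sales quantity and the top n rows by sales amount."""
    totals = defaultdict(lambda: {"quantity": 0, "amount": 0.0})
    for r in block:
        key = (r["region"], r["category"])
        totals[key]["quantity"] += r["quantity"]
        totals[key]["amount"] += r["amount"]
    rows = [{"region": k[0], "category": k[1], **v} for k, v in totals.items()]
    by_quantity = sorted(rows, key=lambda x: x["quantity"], reverse=True)[:n]
    by_amount = sorted(rows, key=lambda x: x["amount"], reverse=True)[:n]
    return by_quantity, by_amount


block = [
    {"region": "Zhejiang", "category": "books", "quantity": 3, "amount": 30.0},
    {"region": "Zhejiang", "category": "books", "quantity": 2, "amount": 20.0},
    {"region": "Hunan", "category": "phones", "quantity": 1, "amount": 900.0},
    {"region": "Hunan", "category": "books", "quantity": 4, "amount": 40.0},
]
second_queue = []  # stand-in for the second message queue
by_quantity, by_amount = top_n_sales(block, n=2)
second_queue.append({"top_by_quantity": by_quantity, "top_by_amount": by_amount})
```

The two rankings are kept separate because a category can lead by amount without leading by quantity, as the phones row does here.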
When the data block to be analyzed is user browsing information, the click counts, in different pre-divided time periods, of advertisement information carrying the same tag information can be counted; the advertisement information is then stored to the second message queue according to the correspondence among the tag information, the time period and the click count.
The user browsing information may include advertisement browsing information, commodity browsing information, and the like.
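Counting advertisement clicks per tag and time period can be sketched as follows. The `event` and `ad_tag` fields and the fixed-length periods derived by integer division are assumptions; the description only says the time periods are divided in advance.

```python
from collections import Counter


def period_of(ts, period_length):
    """Map a timestamp to the index of its pre-divided time period."""
    return ts // period_length


def count_ad_clicks(block, period_length):
    """Count clicks per (advertisement tag, time period)."""
    counts = Counter()
    for r in block:
        if r["event"] == "ad_click":
            counts[(r["ad_tag"], period_of(r["ts"], period_length))] += 1
    return counts


block = [
    {"event": "ad_click", "ad_tag": "A1", "ts": 5},
    {"event": "ad_click", "ad_tag": "A1", "ts": 50},
    {"event": "ad_click", "ad_tag": "A1", "ts": 70},
    {"event": "ad_click", "ad_tag": "A2", "ts": 10},
    {"event": "page_view", "ad_tag": "A2", "ts": 12},
]
clicks = count_ad_clicks(block, period_length=60)
# One message per (tag, period) pair, as it might be put on the second queue.
second_queue = [
    {"ad_tag": tag, "period": period, "clicks": n}
    for (tag, period), n in sorted(clicks.items())
]
```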
When the data block to be analyzed is user login information, the number of user logins in different time periods in different regions can be counted according to preset region information; the user login information is then stored to the second message queue according to the correspondence among the region, the time period and the number of user logins.
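The login rule can be sketched as a two-level count, region first and time period second. The field names and the fixed-length periods are again assumptions, and the sketch counts login events rather than distinct users.

```python
from collections import defaultdict


def count_logins(block, period_length):
    """Return {region: {period_index: login_count}} for one data block."""
    counts = defaultdict(lambda: defaultdict(int))
    for r in block:
        counts[r["region"]][r["ts"] // period_length] += 1
    # Convert to plain dicts so the result serializes cleanly.
    return {region: dict(periods) for region, periods in counts.items()}


block = [
    {"user": "u1", "region": "Hunan", "ts": 10},
    {"user": "u2", "region": "Hunan", "ts": 20},
    {"user": "u3", "region": "Hunan", "ts": 3700},
    {"user": "u4", "region": "Zhejiang", "ts": 15},
]
logins = count_logins(block, period_length=3600)
```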
In the embodiment of the invention, in order to facilitate distinguishing from the message queue storing the service line data, the message queue storing the service line data may be referred to as a first message queue, and the message queue storing the analysis result may be referred to as a second message queue.
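How a service party might read analysis results from the second message queue can be sketched with the standard library's thread-safe `queue.Queue`. The JSON serialization and the function names `publish_result` and `read_results` are illustrative choices, not part of the described method.

```python
import json
import queue

second_queue = queue.Queue()  # in-memory stand-in for the second message queue


def publish_result(result):
    """Analysis side: serialize a result and put it on the second queue."""
    second_queue.put(json.dumps(result))


def read_results():
    """Service side: drain whatever results are currently available."""
    results = []
    while True:
        try:
            results.append(json.loads(second_queue.get_nowait()))
        except queue.Empty:
            return results


publish_result({"region": "Hunan", "period": 0, "logins": 2})
publish_result({"region": "Zhejiang", "period": 1, "logins": 1})
received = read_results()
```

Keeping results on a queue separate from the raw data means the consumer never has to see unfiltered service line data.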
Fig. 2 is a schematic structural diagram of a data processing device according to an embodiment of the present invention, which includes a recording unit 21, an extracting unit 22, a statistics unit 23, and an analysis unit 24;
a recording unit 21, configured to record the acquired various service line data to a first message queue;
an extracting unit 22, configured to extract valid data flows of various service line data in the first message queue;
a statistics unit 23, configured to perform statistics on each valid data stream by using a sliding window according to a preset window time, so as to obtain a data block to be analyzed;
and the analysis unit 24 is configured to analyze each data block to be analyzed according to the corresponding service processing rule, and store the obtained analysis result in the second message queue.
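The four units of FIG. 2 can be pictured as methods of one class, as in the sketch below. The class name, the `keep` and `analyze` callables, the `window` and `step` parameters, and the in-memory queues are all hypothetical; the description defines the units functionally and does not prescribe this layout.

```python
from collections import deque


class DataProcessingDevice:
    def __init__(self, window, step, keep, analyze):
        self.first_queue = deque()   # stand-in for the first message queue
        self.second_queue = deque()  # stand-in for the second message queue
        self.window, self.step = window, step
        self.keep = keep        # hypothetical data filtering rule
        self.analyze = analyze  # hypothetical business processing rule

    def record(self, records):  # recording unit 21
        self.first_queue.extend(records)

    def extract(self):  # extracting unit 22
        stream = sorted(self.first_queue, key=lambda r: r["ts"])
        self.first_queue.clear()
        return [r for r in stream if self.keep(r)]

    def window_blocks(self, stream):  # statistics unit 23
        if not stream:
            return []
        blocks, start = [], stream[0]["ts"]
        while start <= stream[-1]["ts"]:
            blocks.append(
                [r for r in stream if start <= r["ts"] < start + self.window]
            )
            start += self.step
        return blocks

    def analyze_blocks(self, blocks):  # analysis unit 24
        for block in blocks:
            self.second_queue.append(self.analyze(block))

    def run(self, records):
        self.record(records)
        self.analyze_blocks(self.window_blocks(self.extract()))
        return list(self.second_queue)


device = DataProcessingDevice(
    window=4,
    step=2,
    keep=lambda r: r["amount"] > 0,
    analyze=lambda block: sum(r["amount"] for r in block),
)
results = device.run(
    [
        {"ts": 5, "amount": 3.0},
        {"ts": 0, "amount": 1.0},
        {"ts": 2, "amount": 0.0},
        {"ts": 1, "amount": 2.0},
    ]
)
```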
Optionally, the recording unit is specifically configured to add tag information to each type of acquired service line data according to a preset classification rule; and recording various business line data added with the tag information to a first message queue.
Optionally, the extraction unit comprises a sorting subunit and a filtering subunit;
the sorting subunit is used for sorting the target service line data according to the timestamp corresponding to the target service line data to obtain a data stream; wherein the target service line data is any one type of service line data among all the service line data;
the filtering subunit is used for extracting effective data streams in the data streams according to the data filtering rules corresponding to the target service line data; wherein, different tag information has data filtering rules corresponding to the tag information.
Optionally, the analysis unit includes a statistics subunit and a storage subunit;
the statistics subunit is used for counting the sales quantity and sales amount of different commodity categories under different regions in the commodity transaction information according to preset region information and commodity category information when the data block to be analyzed is commodity transaction information;
and the storage subunit is used for storing the commodity transaction information with the top N sales quantities and the commodity transaction information with the top N sales amounts into the second message queue according to the correspondence among the region, the commodity category, the sales quantity and the sales amount.
Optionally, the analysis unit includes a statistics subunit and a storage subunit;
the statistics subunit is used for counting the click counts, in different pre-divided time periods, of advertisement information carrying the same tag information when the data block to be analyzed is user browsing information;
and the storage subunit is used for storing the advertisement information to the second message queue according to the correspondence among the tag information, the time period and the click count.
Optionally, the analysis unit includes a statistics subunit and a storage subunit;
the statistics subunit is used for counting the user login quantity in different time periods under different regions in the user login information according to preset region information when the data block to be analyzed is the user login information;
and the storage subunit is used for storing the user login information into the second message queue according to the corresponding relation of the region, the time period and the user login quantity.
For a description of the features in the embodiment corresponding to Fig. 2, refer to the related description of the embodiment corresponding to Fig. 1, which is not repeated here.
Fig. 3 is a schematic diagram of the hardware structure of a data processing apparatus 30 according to an embodiment of the present invention, which includes:
a memory 31 for storing a computer program;
a processor 32 for executing a computer program to perform the steps of any of the data processing methods described above.
An embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of any of the data processing methods described above are implemented.
The data processing method, apparatus, and computer-readable storage medium provided by the embodiments of the present invention have been described in detail above. The embodiments are described progressively: each one focuses on its differences from the others, and identical or similar parts can be cross-referenced among them. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method. It should be noted that those skilled in the art can make various modifications and adaptations without departing from the principles of the invention, and such modifications and adaptations are intended to fall within the scope of the invention as defined in the following claims.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To illustrate this interchangeability of hardware and software clearly, the composition and steps of each example have been described above generally in terms of functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as departing from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Claims (10)
1. A data processing method, comprising:
recording acquired service line data of various types to a first message queue;
extracting effective data streams of the various service line data in the first message queue;
counting each effective data stream by using a sliding window according to a preset window time to obtain data blocks to be analyzed, wherein the sliding window is the number of data packets that, at each moment, the receiver tells the sender it is allowed to transmit;
and analyzing each data block to be analyzed according to its corresponding business processing rule, and storing the obtained analysis result into a second message queue.
2. The method of claim 1, wherein recording the acquired service line data of various types to the first message queue comprises:
adding tag information to the acquired service line data of various types according to a preset classification rule;
and recording the service line data to which the tag information has been added to the first message queue.
3. The method of claim 2, wherein extracting the effective data streams of the various service line data in the first message queue comprises:
sorting target service line data according to the timestamps corresponding to the target service line data to obtain a data stream, wherein the target service line data is any one type of service line data among all the service line data;
and extracting the effective data stream from the data stream according to the data filtering rule corresponding to the target service line data, wherein each piece of tag information has its own corresponding data filtering rule.
4. The method of claim 3, wherein analyzing each data block to be analyzed according to its corresponding business processing rule and storing the obtained analysis result into the second message queue comprises:
when the data block to be analyzed is commodity transaction information, counting the sales quantity and sales amount of different commodity categories in different regions in the commodity transaction information according to preset region information and commodity category information;
and storing the top N items of commodity transaction information by sales quantity and the top N items of commodity transaction information by sales amount into the second message queue according to the correspondence among the region, the commodity category, the sales quantity, and the sales amount.
5. The method of claim 3, wherein analyzing each data block to be analyzed according to its corresponding business processing rule and storing the obtained analysis result into the second message queue comprises:
when the data block to be analyzed is user browsing information, counting the number of clicks on advertisement information carrying the same tag information in each of a plurality of pre-divided time periods;
and storing the advertisement information into the second message queue according to the correspondence among the tag information, the time period, and the number of clicks.
6. The method of claim 3, wherein analyzing each data block to be analyzed according to its corresponding business processing rule and storing the obtained analysis result into the second message queue comprises:
when the data block to be analyzed is user login information, counting the number of user logins in different time periods in different regions in the user login information according to preset region information;
and storing the user login information into the second message queue according to the correspondence among the region, the time period, and the number of user logins.
7. A data processing apparatus, comprising a recording unit, an extracting unit, a statistics unit, and an analysis unit, wherein:
the recording unit is configured to record acquired service line data of various types to a first message queue;
the extracting unit is configured to extract effective data streams of the various service line data in the first message queue;
the statistics unit is configured to count each effective data stream by using a sliding window according to a preset window time to obtain data blocks to be analyzed, wherein the sliding window is the number of data packets that, at each moment, the receiver tells the sender it is allowed to transmit;
and the analysis unit is configured to analyze each data block to be analyzed according to its corresponding business processing rule and to store the obtained analysis result into a second message queue.
8. The apparatus of claim 7, wherein the recording unit is specifically configured to add tag information to the acquired service line data of various types according to a preset classification rule, and to record the service line data to which the tag information has been added to the first message queue.
9. A data processing apparatus, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the data processing method according to any one of claims 1 to 6.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the data processing method according to any one of claims 1 to 6.
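For illustration only, the tagging, timestamp ordering, and per-tag filtering recited in claims 2 and 3 might be sketched as below. The classification rule, the tag names, the filtering rules, and the record fields are hypothetical and are not taken from the claims; a `deque` stands in for the first message queue.

```python
from collections import deque


def classify(record):
    """Hypothetical classification rule: the record's source decides its tag."""
    tags = {"shop": "transaction", "ads": "browse", "auth": "login"}
    return tags.get(record["source"], "unknown")


# Hypothetical data filtering rules, one per piece of tag information.
FILTER_RULES = {
    "transaction": lambda r: r.get("amount", 0) > 0,
    "browse": lambda r: bool(r.get("ad_id")),
    "login": lambda r: r.get("success", False),
}


def record_to_queue(records, queue):
    """Add tag information to each record, then record it to the queue."""
    for r in records:
        queue.append({**r, "tag": classify(r)})


def extract_effective_stream(queue, tag):
    """Sort one type of service line data by timestamp, then apply its filter."""
    target = sorted((r for r in queue if r["tag"] == tag), key=lambda r: r["ts"])
    rule = FILTER_RULES[tag]
    return [r for r in target if rule(r)]


first_queue = deque()
record_to_queue(
    [
        {"source": "shop", "ts": 3, "amount": 12.0},
        {"source": "shop", "ts": 1, "amount": 0.0},  # rejected by the filter
        {"source": "auth", "ts": 2, "success": True},
        {"source": "shop", "ts": 2, "amount": 5.0},
    ],
    first_queue,
)
transactions = extract_effective_stream(first_queue, "transaction")
logins = extract_effective_stream(first_queue, "login")
```

Keeping the filtering rules in a table keyed by tag means a new service line only needs a new tag and a new rule, without changes to the extraction step.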
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911367948.XA CN111143415B (en) | 2019-12-26 | 2019-12-26 | Data processing method, device and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111143415A (en) | 2020-05-12 |
CN111143415B (en) | 2023-12-29 |
Family
ID=70520482
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911367948.XA Active CN111143415B (en) | 2019-12-26 | 2019-12-26 | Data processing method, device and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111143415B (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111901352B (en) * | 2020-07-30 | 2023-08-25 | 彩讯科技股份有限公司 | Method, device, server and storage medium for message distribution processing |
CN112035534A (en) * | 2020-09-18 | 2020-12-04 | 上海依图网络科技有限公司 | Real-time big data processing method and device and electronic equipment |
CN112506978A (en) * | 2020-12-15 | 2021-03-16 | 中国联合网络通信集团有限公司 | Big data real-time processing method, device and equipment |
CN112751726B (en) * | 2020-12-17 | 2022-09-09 | 北京达佳互联信息技术有限公司 | Data processing method and device, electronic equipment and storage medium |
CN112633904B (en) * | 2020-12-30 | 2024-04-30 | 中国平安财产保险股份有限公司 | Complaint behavior analysis method, apparatus, device and computer readable storage medium |
CN113360564A (en) * | 2021-07-12 | 2021-09-07 | 杭州安恒信息技术股份有限公司 | ETL-based data stream processing method, system, device and readable storage medium |
CN113626218A (en) * | 2021-07-30 | 2021-11-09 | 江苏苏宁物流有限公司 | Data processing method, data processing device, storage medium and computer equipment |
CN113609202B (en) * | 2021-08-11 | 2024-09-06 | 湖南快乐阳光互动娱乐传媒有限公司 | Data processing method and device |
CN113993001B (en) * | 2021-09-08 | 2024-04-12 | 四创电子股份有限公司 | Real-time stream analysis alarm method based on sliding data window |
CN116266183A (en) * | 2021-12-16 | 2023-06-20 | 中移(苏州)软件技术有限公司 | Data analysis method, device, equipment and computer storage medium |
CN118350058B (en) * | 2024-06-20 | 2024-08-27 | 江西省送变电工程有限公司 | Electric power material data management method and system |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104615777A (en) * | 2015-02-27 | 2015-05-13 | 浪潮集团有限公司 | Method and device for real-time data processing based on stream-oriented calculation engine |
CN105512297A (en) * | 2015-12-10 | 2016-04-20 | 中国测绘科学研究院 | Distributed stream-oriented computation based spatial data processing method and system |
CN105786941A (en) * | 2014-12-26 | 2016-07-20 | 中国移动通信集团上海有限公司 | Information mining method and device |
CN106156026A (en) * | 2015-03-24 | 2016-11-23 | 中国人民解放军国防科学技术大学 | A kind of method based on the data online anomaly of stream fictitious assets |
CN106528865A (en) * | 2016-12-02 | 2017-03-22 | 航天科工智慧产业发展有限公司 | Quick and accurate cleaning method of traffic big data |
WO2017092582A1 (en) * | 2015-12-01 | 2017-06-08 | 阿里巴巴集团控股有限公司 | Data processing method and apparatus |
WO2017185576A1 (en) * | 2016-04-25 | 2017-11-02 | 百度在线网络技术(北京)有限公司 | Multi-streaming data processing method, system, storage medium, and device |
CN108287905A (en) * | 2018-01-26 | 2018-07-17 | 华南理工大学 | A kind of extraction of network flow feature and storage method |
CN108874812A (en) * | 2017-05-10 | 2018-11-23 | 腾讯科技(北京)有限公司 | A kind of data processing method and server, computer storage medium |
CN108874834A (en) * | 2017-05-16 | 2018-11-23 | 北京嘀嘀无限科技发展有限公司 | A kind of data processing method, processing system and computer installation |
CN109471898A (en) * | 2018-12-19 | 2019-03-15 | 华迪计算机集团有限公司 | It is a kind of for data to be carried out with the method and system of shared distribution |
CN109905412A (en) * | 2019-04-28 | 2019-06-18 | 山东渔翁信息技术股份有限公司 | A kind of parallel encrypting and deciphering processing method of network data, device and medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7509650B2 (en) * | 2004-05-20 | 2009-03-24 | International Business Machines Corporation | Enhance browsing of messages in a message queue |
US9639895B2 (en) * | 2007-08-30 | 2017-05-02 | Chicago Mercantile Exchange, Inc. | Dynamic market data filtering |
US20130339473A1 (en) * | 2012-06-15 | 2013-12-19 | Zynga Inc. | Real time analytics via stream processing |
Non-Patent Citations (1)
Title |
---|
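Performance comparison and optimization schemes for message queues based on massive data; Liu Feng; E Haihong; Software (软件), Issue 10; full text *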
Also Published As
Publication number | Publication date |
---|---|
CN111143415A (en) | 2020-05-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111143415B (en) | Data processing method, device and computer readable storage medium | |
CN107168854B (en) | Internet advertisement abnormal click detection method, device, equipment and readable storage medium | |
CN111639138B (en) | Data processing method, device, equipment and storage medium | |
US8849798B2 (en) | Sampling analysis of search queries | |
US20210035126A1 (en) | Data processing method, system and computer device based on electronic payment behaviors | |
US20060098647A1 (en) | Monitoring and reporting enterprise data using a message-based data exchange | |
CN106815254B (en) | Data processing method and device | |
US8949315B2 (en) | System and method for generating web analytic reports | |
CN111311136A (en) | Wind control decision method, computer equipment and storage medium | |
CN110060087B (en) | Abnormal data detection method, device and server | |
CN106131083A (en) | A kind of attack message detection and take precautions against method and switch | |
CN111062799A (en) | Method and device for managing family client, electronic equipment and storage medium | |
CN106294676B (en) | A kind of data retrieval method of ecommerce government system | |
CN110675078A (en) | Marketing company risk diagnosis method, system, computer terminal and storage medium | |
CN112633842A (en) | Task pushing method, device and system | |
CN102982048A (en) | Method and device for assessing junk information mining rule | |
CN115357629A (en) | Processing method, system, electronic device and storage medium for financial data stream | |
CN116887340B (en) | Real-time pushing system for short message status report | |
CN114022051A (en) | Index fluctuation analysis method, storage medium and electronic equipment | |
CN113225325B (en) | IP (Internet protocol) blacklist determining method, device, equipment and storage medium | |
CN109919197B (en) | Random forest model training method and device | |
CN110866241A (en) | Evaluation model generation and equipment association method, device and storage medium | |
CN110032596A (en) | Traffic Anomaly user identification method and system | |
CN112560992B (en) | Method, device, electronic equipment and storage medium for optimizing picture classification model | |
CN116151670B (en) | Intelligent evaluation method, system and medium for marketing project quality of marketing business |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||