CN112134909A - Time sequence data processing method, device, system, server and readable storage medium - Google Patents
- Publication number
- CN112134909A (application CN201910549734.8A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/12—Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2474—Sequence data queries, e.g. querying versioned data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/60—Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
- H04L67/62—Establishing a time schedule for servicing the requests
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Mathematical Physics (AREA)
- Fuzzy Systems (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a time series data processing method, apparatus, system, server and readable storage medium. The method comprises: receiving a plurality of pieces of data to be processed sent by a plurality of devices, wherein the pieces of data to be processed respectively comprise time series data acquired by the devices and identification codes of the devices; according to grouping results of the plurality of devices, importing the pieces of data to be processed into corresponding task queues for sequential processing, and correspondingly obtaining a plurality of pieces of result data; and sequentially storing the result data into a first buffer queue. Based on the grouping results of the devices, the method can process massive time series data in parallel with high performance while preserving order, and thus makes use of multi-core operation. In addition, a single machine can handle the data of tens of thousands of devices, so that no complex computer cluster needs to be built and the deployment cost is significantly reduced.
Description
Technical Field
The invention relates to the field of data processing, in particular to a time sequence data processing method, a time sequence data processing device, a time sequence data processing system, a time sequence data processing server and a readable storage medium.
Background
With the development of Internet of Things (IoT) technology, both the volume of IoT data to be processed and the difficulty of processing it keep increasing. Currently, the data processing methods commonly used in the IoT field fall mainly into the following two types:
The first type: big data cluster computing methods based on typical stream/batch processing frameworks such as Hadoop, Spark and Storm. These are mainly suited to high-speed processing of massive data, offer high parallelism and high throughput, and can support data processing for thousands of IoT devices. However, such cluster frameworks place high demands on the resources they need to run. Because of the scale effect, the performance advantage appears only once the cluster reaches a certain size; for example, a cluster framework generally requires at least three servers. Such an architecture is difficult to deploy, complex to implement and costly;
The second type: simple data processing methods implemented by ordinary programming. These are mainly suited to household use or small device fleets, and support at most a few hundred devices. Because the design is simple, a general-purpose program can hardly guarantee both processing order and high performance, so this type is usually suitable only for data that is not time-sequence sensitive.
Time-sequence sensitivity means that, in a data processing task, the processing of data monitored at the current moment must depend on the processing result for the previous moment; that is, the data monitored at the previous moment must be processed before the data monitored at the current moment. In a conventional multi-thread model, two pieces of data may be assigned to different processing threads, so the data monitored at the current moment is often processed before the data monitored at the previous moment, producing a completely wrong result. For example, in a cold-chain logistics service, a temperature sensor monitors the temperature inside a truck compartment in real time. An early warning must be sent when the temperature is not lower than 20 ℃, but while the temperature remains continuously at or above 20 ℃, only one early warning needs to be sent. Assume the temperatures monitored at 5 consecutive moments are 17 ℃, 19 ℃, 20 ℃, 19 ℃ and 21 ℃; under the business rule, early warnings are normally triggered at the 3rd and 5th moments. However, if the data is not guaranteed to be processed in chronological order, for example if the data of the 4th moment is processed before the data of the 3rd moment, an early warning is triggered only when the data of the 3rd moment (20 ℃) is processed, so that transport staff are late in discovering the failure of the compartment's temperature control system.
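The cold-chain example can be replayed in a few lines. The following is a minimal sketch, not the patented method: the function name `warning_moments` and the rule "warn on the first reading of each continuous run at or above the threshold" are assumptions chosen to match the business rule described above.

```python
def warning_moments(readings, threshold=20):
    """Return the moments at which an early warning fires.

    readings: (moment, temperature) pairs in the order they are PROCESSED.
    A warning fires when a reading is at or above the threshold and the
    previously processed reading was below it, i.e. once per continuous run.
    """
    fired = []
    in_alarm = False  # state carried over from the previously processed reading
    for moment, temp in readings:
        if temp >= threshold and not in_alarm:
            fired.append(moment)
        in_alarm = temp >= threshold
    return fired


temps = [17, 19, 20, 19, 21]
in_order = list(enumerate(temps, start=1))
# Moment 4 is processed before moment 3, as may happen in a multi-thread race.
out_of_order = [in_order[0], in_order[1], in_order[3], in_order[2], in_order[4]]

print(warning_moments(in_order))      # [3, 5]
print(warning_moments(out_of_order))  # [3]
```

Processed in order, the rule warns at the 3rd and 5th moments; with the 3rd and 4th readings swapped, the second warning is lost because 20 ℃ and 21 ℃ appear to form one continuous run.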
Serial processing guarantees that data is processed in order, but it cannot exploit multi-core operation and wastes considerable resources. In addition, existing dedicated time series databases generally focus only on storing time series data and providing query services, and cannot directly perform the logic processing described above.
The above information disclosed in this background section is only for enhancement of understanding of the background of the invention and therefore it may contain information that does not constitute prior art that is already known to a person of ordinary skill in the art.
Disclosure of Invention
In view of the above, the present invention provides a method, an apparatus, a system, a server and a readable storage medium for processing time series data.
Additional features and advantages of the invention will be set forth in the detailed description which follows, or may be learned by practice of the invention.
According to an aspect of the present invention, there is provided a time series data processing method, including: receiving a plurality of pieces of data to be processed sent by a plurality of devices, wherein the plurality of pieces of data to be processed respectively comprise time sequence data acquired by the plurality of devices and identification codes of the plurality of devices; according to the grouping results of the plurality of devices, the plurality of pieces of data to be processed are respectively led into corresponding task queues to be processed in sequence, and a plurality of pieces of result data are correspondingly obtained; and sequentially storing the result data into a first buffer queue.
According to an embodiment of the present invention, the grouping result is obtained according to the following steps: and inputting the identification code into a pre-constructed hash function, and taking an output result of the hash function as the grouping result of the equipment corresponding to the identification code.
According to an embodiment of the present invention, importing the plurality of pieces of data to be processed into corresponding task queues for sequential processing according to the grouping results of the plurality of devices, and correspondingly obtaining the plurality of pieces of result data, includes: according to the grouping results, importing the time series data in the pieces of data to be processed into the task queues of corresponding thread pools, each thread pool comprising one processing thread; and in each thread pool, sequentially processing, by the processing thread, the time series data imported into the task queue of that thread pool, to correspondingly obtain the plurality of pieces of result data.
According to an embodiment of the invention, the method further comprises: generating a second buffer queue according to the first buffer queue, wherein the result data stored in the second buffer queue is identical to the result data currently stored in the first buffer queue; sending the result data in the second buffer queue to a database; and deleting the second buffer queue when the second buffer queue is empty.
According to an embodiment of the present invention, the method further comprises: receiving new result data obtained by the sequential processing and storing it in the first buffer queue.
According to another aspect of the present invention, there is provided a time-series data processing apparatus including: the data receiving module is used for receiving a plurality of pieces of data to be processed sent by a plurality of devices, wherein the plurality of pieces of data to be processed respectively comprise time sequence data acquired by the plurality of devices and identification codes of the plurality of devices; the data processing module is used for respectively importing the multiple pieces of data to be processed into corresponding task queues to be sequentially processed according to grouping results of the multiple pieces of equipment, and correspondingly obtaining multiple pieces of result data; and the data caching module is used for sequentially storing the result data into a first caching queue.
According to an embodiment of the invention, the apparatus further comprises: a queue generating module, configured to generate a second buffer queue according to the first buffer queue, wherein the result data stored in the second buffer queue is identical to the result data currently stored in the first buffer queue; a data sending module, configured to send the result data in the second buffer queue to a database; and a queue deleting module, configured to delete the second buffer queue when the second buffer queue is empty.
According to still another aspect of the present invention, there is provided a server including: a memory, a processor, and executable instructions stored in the memory and executable by the processor, wherein the processor, when executing the executable instructions, implements any one of the time series data processing methods described above.
According to still another aspect of the present invention, there is provided a time series data processing system including: a plurality of devices, the server described above, and a database; the server receives a plurality of pieces of data to be processed from the plurality of devices and sends a plurality of pieces of result data to the database.
According to yet another aspect of the present invention, there is provided a computer-readable storage medium having stored thereon computer-executable instructions that, when executed by a processor, implement any one of the above-described methods of time-series data processing.
According to the time series data processing method provided by the embodiments of the invention, massive time series data can be processed in parallel with high performance, based on the grouping results of the devices, while preserving order, so that multi-core operation is exploited. In addition, a single machine can handle the data of tens of thousands of devices, so that no complex computer cluster needs to be built and the deployment cost is significantly reduced.
In addition, according to some embodiments, the time sequence data processing method provided by the invention adopts a data caching technology and a data transmission technology which are independent of each other, and can continuously and stably realize batch transmission and accurate storage of mass time sequence result data.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
FIG. 1 is a flow chart illustrating a method of time series data processing according to an exemplary embodiment.
FIG. 2 is a flow chart illustrating another method of time series data processing according to an exemplary embodiment.
Fig. 3 is a flowchart illustrating yet another method of processing time series data according to an example embodiment.
FIG. 4 is a block diagram illustrating a time series data processing apparatus according to an exemplary embodiment.
Fig. 5 is a schematic diagram illustrating a configuration of a server according to an example embodiment.
FIG. 6 is a block diagram illustrating a time series data processing system according to an exemplary embodiment.
FIG. 7 is a schematic diagram illustrating a computer-readable storage medium in accordance with an example embodiment.
FIG. 8 is a schematic diagram illustrating a flow of time series data according to an example.
FIG. 9 is a schematic diagram illustrating scheduling and processing of time series data according to an example.
FIG. 10 is a schematic diagram illustrating storage of result data according to an example.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The drawings are merely schematic illustrations of the invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention may be practiced without one or more of the specific details, or with other methods, apparatus, steps, and so forth. In other instances, well-known structures, methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
As described above, the second type of data processing method can easily implement serial processing of time series data, but cannot exert the advantage of multi-core operation, and severely wastes computer resources; the first type of data processing method can process time sequence data in parallel, but the framework deployment difficulty is high, the implementation is complex and the cost is high.
Therefore, the invention provides a time series data processing method that combines the advantages of the two types of method. According to the time series data processing method provided by the embodiments of the invention, massive time series data can be processed in parallel with high performance, based on the grouping results of the devices, while preserving order, so that multi-core operation is exploited. In addition, a single machine can handle the data of tens of thousands of devices, so that no complex computer cluster needs to be built and the deployment cost is significantly reduced.
FIG. 8 is a schematic diagram illustrating a flow of time series data according to an example. As indicated by the arrows in FIG. 8, the time series data collected in real time by a terminal device is first received by the data receiver/scheduler module in the server and then forwarded, according to a configuration policy, to the corresponding processing unit of the data processor module in the server. After the data processor module performs logic processing on the time series data, the resulting data is temporarily stored in the data buffer module in the server; the data submitter module in the server fetches the result data from the data buffer module in batches and submits it to a database system for storage. Subsequently, the result data stored in the database system can be retrieved, exported, queried, tested, aggregated and so on through a data interface.
The overall scheme of the present invention is based on the flow of time series data shown in FIG. 8. With reference to FIG. 9 and FIG. 10, the complete flow can be divided into a front part and a back part, which embody the specific innovations of the overall scheme:
FIG. 9 is a schematic diagram illustrating scheduling and processing of time series data according to an example. The data receiver/scheduler module provides a configuration policy for the terminal devices: it groups the n devices and continuously and stably imports the time series data acquired by the same device at different moments into the same thread pool in the data processor module;
FIG. 10 is a schematic diagram illustrating storage of result data according to an example. Because the result data obtained by the data processor module is ultimately submitted to the database system through the data submitter module, the data buffer module, acting as a relay, can provide two data queues: a buffer queue and a temporary queue generated from the buffer queue. The two queues continuously and independently perform dynamic buffering and dynamic submission of the result data, respectively.
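The two-queue arrangement can be sketched as follows. This is an illustrative sketch under stated assumptions: the names `ResultBuffer`, `snapshot` and `submit_batch` are hypothetical, a plain list stands in for the database, and clearing the buffer queue when the temporary queue is generated is an assumption that the text above does not spell out.

```python
import threading
from collections import deque


class ResultBuffer:
    """Buffer queue that keeps accepting results while a snapshot is submitted."""

    def __init__(self):
        self._lock = threading.Lock()
        self._first = deque()  # the buffer queue

    def put(self, result):
        with self._lock:
            self._first.append(result)

    def snapshot(self):
        """Generate a temporary queue holding what the buffer queue holds now."""
        with self._lock:
            temp = deque(self._first)
            self._first.clear()  # assumption: snapshotted items leave the buffer
        return temp


def submit_batch(temp, database, batch_size=2):
    """Send the temporary queue to the database in batches until it is empty."""
    while temp:
        batch = [temp.popleft() for _ in range(min(batch_size, len(temp)))]
        database.extend(batch)  # stand-in for a bulk insert
    # temp is empty here and can simply be discarded (deleted)


buf = ResultBuffer()
for r in ("r1", "r2", "r3"):
    buf.put(r)
temp = buf.snapshot()
buf.put("r4")   # a new result arrives while temp is being submitted
database = []   # hypothetical stand-in for the database system
submit_batch(temp, database)
print(database)  # ['r1', 'r2', 'r3']
```

Because submission works only on the temporary queue, writers to the buffer queue are blocked only for the short time it takes to copy the snapshot, not for the duration of the database write.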
FIG. 1 is a flow chart illustrating a method of time series data processing according to an exemplary embodiment. The time series data processing method shown in FIG. 1 can be applied to a server in an IoT data processing scenario, for example to the server in FIG. 8 composed of the data receiver/scheduler, data processor, data buffer and data submitter modules.
Referring to fig. 1, a time series data processing method 10 includes:
in step S102, a plurality of pieces of data to be processed transmitted by a plurality of devices are received.
The data to be processed respectively comprise time sequence data collected by a plurality of devices and identification codes of the devices.
In an IoT data processing scenario, IoT devices such as physical sensors acquire time series data, in which the corresponding physical quantities change continuously over time, and this data awaits processing. The server receives the data to be processed sent by an IoT device; besides the time series data collected by the device, the data structure of each piece of data to be processed contains the device's fixed, unique identification code. Information can be exchanged between the IoT devices and the server over protocols such as MQTT (Message Queuing Telemetry Transport), HTTP (HyperText Transfer Protocol) and TCP (Transmission Control Protocol).
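One possible shape for a piece of data to be processed is sketched below. The class name `PendingData`, its field names and the JSON payload layout are all assumptions made for illustration; the source only states that each piece carries the time series data and the device's identification code.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingData:
    device_id: str    # fixed, unique identification code of the device
    timestamp: float  # acquisition time in seconds (hypothetical field)
    value: float      # sampled physical quantity, e.g. a temperature


def parse_payload(payload: str) -> PendingData:
    """Decode a JSON message body, however it was transported (MQTT, HTTP, TCP)."""
    obj = json.loads(payload)
    return PendingData(str(obj["device_id"]), float(obj["timestamp"]), float(obj["value"]))


sample = parse_payload('{"device_id": "S1", "timestamp": 1561000000, "value": 19.5}')
print(sample.device_id, sample.value)  # S1 19.5
```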
In step S104, according to the grouping result for the multiple devices, the multiple pieces of data to be processed are respectively imported into the corresponding task queues for sequential processing, and multiple pieces of result data are correspondingly obtained.
In some embodiments, the grouping result is obtained according to the following steps: and inputting the identification code into a pre-constructed hash function, and taking an output result of the hash function as a grouping result of the equipment corresponding to the identification code.
For example, assume that the server can process time series data collected by m IoT devices whose unique identification codes are S1, S2, S3, ..., Sm. A balanced and idempotent hash function addr = h(key) is constructed in advance:
a balanced hash function satisfies the condition that a key obtained by random sampling, or generated randomly according to the distribution of the key space, falls into any one of the n hash addresses with probability 1/n; an idempotent hash function satisfies the condition that repeated execution with the same parameter key yields the same result and does not change the state of the system. The identity function and the constant function are the two simplest idempotent functions.
The pre-constructed hash function addr = h(key) establishes a fixed correspondence between a keyword key and its storage location addr, so that each keyword corresponds to a unique storage location. The balanced and idempotent hash function addr = h(key) can therefore map distributed stream identifiers onto different hash addresses: for example, the identification codes of the m IoT devices are divided into n groups and mapped to n hash addresses, so that in subsequent processing the time series data from the same IoT device always enters the same thread pool of the server, and the number of IoT devices matched to each thread pool is approximately m/n.
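A hash function with these two properties can be approximated as follows. This is a sketch under assumptions: CRC32 modulo n is a stand-in chosen here, not the function used in the source, and the name `group_of` is hypothetical. Python's built-in `hash()` is avoided because, for strings, it is salted per process and therefore would not give the same grouping after a restart.

```python
import zlib
from collections import Counter


def group_of(device_id: str, n: int) -> int:
    """Map an identification code to one of n hash addresses (0 .. n-1).

    Deterministic: the same device_id and n always give the same address,
    and calling the function has no side effects.
    """
    return zlib.crc32(device_id.encode("utf-8")) % n


n = 4
ids = [f"S{i}" for i in range(1, 10001)]  # hypothetical identification codes
sizes = Counter(group_of(d, n) for d in ids)

# Each group size should be roughly m/n = 2500 if the hash is well balanced.
print(sorted(sizes.items()))
```

How close the group sizes come to m/n depends on the hash and on the actual distribution of identification codes, so the balance is worth measuring on real device IDs.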
In the embodiment of the present invention, the grouping results of the plurality of devices may be determined based on the hash function through one-time pre-assignment without repeatedly performing the hash function every time the time-series data is received and processed; or may be determined by repeatedly executing the hash function each time the time-series data is received and processed: since the pre-constructed hash function has the property of an idempotent function, repeated execution of the pre-constructed hash function can ensure that the grouping result of each time is kept unchanged; the structure of the hash function may also be adjusted at any time according to actual requirements to change the device grouping result for any one or more times, which is not limited in the present invention.
The number of groups n may be, for example, an empirical value chosen with reference to the number of CPU cores of the server. Assuming m = 6 and n = 3, one of the simplest groupings is: h(S1) = 0, h(S2) = 1, h(S3) = 2, h(S4) = 0, h(S5) = 1, h(S6) = 2. That is, the 6 IoT devices are evenly divided into 3 groups: devices S1 and S4 correspond to the first hash address, and time series data from S1 and S4 always enters the first thread pool of the server for processing; devices S2 and S5 correspond to the second hash address, and their time series data always enters the second thread pool; devices S3 and S6 correspond to the third hash address, and their time series data always enters the third thread pool. This grouping is only illustrative; the present invention is not limited by the number of devices, the number of groups, or the grouping result.
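The grouping in this example is reproduced by taking the device index modulo n. The sketch below assumes identification codes of the literal form "S<index>" with indices starting at 1, which is an illustrative convention rather than something the source prescribes.

```python
def h(device_id: str, n: int = 3) -> int:
    # Assumes codes like "S1", "S2", ...; the numeric index starts at 1.
    return (int(device_id[1:]) - 1) % n


groups = {}
for dev in ["S1", "S2", "S3", "S4", "S5", "S6"]:
    groups.setdefault(h(dev), []).append(dev)

print(groups)  # {0: ['S1', 'S4'], 1: ['S2', 'S5'], 2: ['S3', 'S6']}
```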
In step S106, the result data are sequentially stored in the first buffer queue.
According to the time series data processing method provided by the embodiments of the invention, massive time series data can be processed in parallel with high performance, based on the grouping results of the devices, while preserving order, so that multi-core operation is exploited. In addition, a single machine can handle the data of tens of thousands of devices, so that no complex computer cluster needs to be built and the deployment cost is significantly reduced.
It should be clearly understood that the present disclosure describes how to make and use particular examples, but the principles of the present disclosure are not limited to any details of these examples. Rather, these principles can be applied to many other embodiments based on the teachings of the present disclosure.
FIG. 2 is a flow chart illustrating another method of time series data processing according to an exemplary embodiment. The difference from the method 10 shown in fig. 1 is that the method 20 shown in fig. 2 further provides a method for sequentially processing the data to be processed according to the device grouping result to correspondingly obtain the result data, i.e. an embodiment of step S104 in the method 10. Similarly, the time-series data processing method shown in fig. 2 may also be applied to a server side in a data processing scenario of the internet of things, for example.
Referring to fig. 2, step S104 in the method 10 further includes:
in step S202, according to the grouping result, time series data in the multiple pieces of data to be processed are respectively imported into task queues of corresponding thread pools.
Wherein each thread pool comprises one processing thread.
In step S204, in each thread pool, the processing thread sequentially processes the time series data in the task queue imported into the thread pool, and a plurality of pieces of result data are correspondingly obtained.
In some embodiments, each thread pool in the server keeps only one processing thread. When one thread pool needs to receive multiple pieces of data to be processed sent by multiple Internet of Things devices, this single processing thread handles them in a queue, in the order in which the different devices acquired/sent the time series data, which ensures that the thread pool processes the time series data from the multiple devices in order.
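Steps S202 and S204 can be sketched as follows, assuming Python's standard `ThreadPoolExecutor` with `max_workers=1` as a stand-in for a thread pool holding a single processing thread. The names `N_GROUPS`, `group_of`, and `run_demo`, as well as the modulo grouping rule, are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

N_GROUPS = 3  # assumed number of groups / thread pools


def group_of(device_index: int) -> int:
    """Assumed grouping rule: device S_k belongs to group (k - 1) mod N_GROUPS."""
    return (device_index - 1) % N_GROUPS


def run_demo(messages):
    """messages: list of (device_index, sequence_number) in arrival order."""
    # One pool per group, each with exactly one worker thread.
    pools = [ThreadPoolExecutor(max_workers=1) for _ in range(N_GROUPS)]
    # Records per group, in the order in which they were actually processed.
    processed = {g: [] for g in range(N_GROUPS)}

    def process(group, device_index, seq):
        # Only the single worker of this group's pool touches this list.
        processed[group].append((f"S{device_index}", seq))

    for device_index, seq in messages:
        g = group_of(device_index)
        # Step S202: import the data into the task queue of its thread pool.
        pools[g].submit(process, g, device_index, seq)

    # Step S204: each worker drains its own task queue in submission order.
    for pool in pools:
        pool.shutdown(wait=True)
    return processed
```

With a single worker per pool, tasks submitted to the same pool run one after another in submission order, while different pools run concurrently. Calling `run_demo` with interleaved messages from S1 and S4 therefore returns them in arrival order within group 0.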
Fig. 3 is a flowchart illustrating yet another method of processing time series data according to an example embodiment. The difference from the method 10 of fig. 1 is that the method 30 of fig. 3 further provides a method of storing the resulting data to a database. Similarly, the time-series data processing method shown in fig. 3 may also be applied to a server side in a data processing scenario of the internet of things, for example.
Referring to fig. 3, the time-series data processing method 30 includes:
in step S302, a second buffer queue is generated according to the first buffer queue.
And the result data stored in the second cache queue is completely the same as the result data currently stored in the first cache queue.
Still taking the Internet of Things data processing scenario as an example: after the pieces of result data are obtained, the server stores them in sequence into a first cache queue, for example a cache queue named QUEUE-RESULT. While the result data are being stored into QUEUE-RESULT in sequence, the sending, receiving, and processing of time series data can continue dynamically.
When result data are stored into the database, the server keeps processing the time series data acquired/sent by the Internet of Things devices at different moments and keeps storing the obtained result data into QUEUE-RESULT. If, at the same time, the result data in QUEUE-RESULT were transmitted in batches and stored in the database directly from that same queue, the data could very easily become disordered or even lost. In the embodiment of the invention, the server side can generate the second cache queue according to the first cache queue, so that two different cache queues are handled at the same time on the server side, and the processes of caching result data and transmitting result data are independent and do not interfere with each other, thereby avoiding disorder and loss of data.
In some embodiments, the second cache queue may be generated from the first cache queue by copying the first cache queue in real time, or by renaming the first cache queue. Renaming the first cache queue, for example renaming QUEUE-RESULT to QUEUE-TEMP, only modifies the pointer to the head address of the array formed in real time by the result data, which ensures that the result data stored in the second cache queue QUEUE-TEMP are identical in structure and content to the result data currently stored in the first cache queue QUEUE-RESULT.
In step S304, the result data in the second buffer queue is sent to the database.
The database storing the mass result data can provide services such as data query, statistics and report generation of the internet of things.
In step S306, when the second buffer queue is empty, the second buffer queue is deleted.
In some embodiments, the method 30 further comprises: and receiving new result data obtained by sequential processing and storing the new result data into the first cache queue.
After the server side has transmitted all the result data in the second cache queue QUEUE-TEMP to the database and stored them there, that is, once QUEUE-TEMP becomes empty, the server side deletes QUEUE-TEMP immediately. Meanwhile, the current first cache queue QUEUE-RESULT stores the new result data obtained by sequential processing. By repeating steps S302 to S306 and generating a second cache queue in each cycle, batch transmission and accurate storage of massive time series result data can be achieved continuously and stably.
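One cycle of steps S302 to S306 can be sketched as follows, using the renaming variant. This is an in-memory illustration under stated assumptions: the class `ResultBuffer`, the function `flush`, the `batch_size` parameter, and the use of a plain list as the "database" are all hypothetical, and a real deployment might keep the queues in an external cache instead.

```python
import threading
from collections import deque


class ResultBuffer:
    """Holds the first cache queue (QUEUE-RESULT in the text)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queue_result = deque()

    def put(self, item):
        """Store one piece of result data into the first cache queue."""
        with self._lock:
            self._queue_result.append(item)

    def swap_out(self):
        """Step S302: hand the current contents over as QUEUE-TEMP.

        Only the reference is moved; new results go to a fresh QUEUE-RESULT.
        """
        with self._lock:
            queue_temp = self._queue_result
            self._queue_result = deque()
        return queue_temp


def flush(buffer, database, batch_size=2):
    """Run one S302-S306 cycle; `database` is a list standing in for a real store."""
    queue_temp = buffer.swap_out()  # S302: generate the second cache queue
    written = 0
    while queue_temp:  # S304: send result data in batches, oldest first
        count = min(batch_size, len(queue_temp))
        batch = [queue_temp.popleft() for _ in range(count)]
        database.extend(batch)
        written += len(batch)
    del queue_temp  # S306: drop the local reference to the now-empty queue
    return written
```

Because producers only ever append to the current first queue and the sender only ever drains the swapped-out queue, caching and transmission do not touch the same container at the same time.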
Those skilled in the art will appreciate that all or part of the steps of the above embodiments may be implemented as computer programs executed by a CPU. The computer program, when executed by the CPU, performs the functions defined by the method provided by the present invention. The program may be stored in a computer readable storage medium, which may be a read-only memory, a magnetic or optical disk, or the like.
Furthermore, it should be noted that the above-mentioned figures are only schematic illustrations of the processes involved in the method according to exemplary embodiments of the invention, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
The following are embodiments of the apparatus of the present invention that may be used to perform embodiments of the method of the present invention. For details which are not disclosed in the embodiments of the apparatus of the present invention, reference is made to the embodiments of the method of the present invention.
FIG. 4 is a block diagram illustrating a time series data processing apparatus according to an exemplary embodiment.
Referring to fig. 4, the time-series data processing apparatus 40 includes: a data receiving module 402, a data processing module 404, and a data caching module 406.
The data receiving module 402 is configured to receive multiple pieces of to-be-processed data sent by multiple devices.
The data to be processed respectively comprise time sequence data collected by a plurality of devices and identification codes of the devices.
The data processing module 404 is configured to, according to the grouping result of the multiple devices, respectively import multiple pieces of data to be processed into corresponding task queues for sequential processing, and correspondingly obtain multiple pieces of result data.
The data buffer module 406 is configured to store the result data into the first buffer queue in sequence.
In some embodiments, the data processing module 404 may further include a data import unit and a data processing unit. The data import unit is used for importing the time series data in the plurality of pieces of data to be processed into the task queues of the corresponding thread pools respectively according to the grouping result, wherein each thread pool comprises one processing thread. The data processing unit is used for sequentially processing, in each thread pool and through its processing thread, the time series data in the task queue imported into that thread pool, and correspondingly obtaining a plurality of pieces of result data.
In an embodiment, a virtual machine with a 4-core CPU and 16 GB of memory, running the data receiving module 402, the data processing module 404, and the data caching module 406, can process time series data sent by up to 12000 Internet of Things devices at a frequency of once every 30 seconds, where the processing logic includes complex logic such as correlation timing determination.
In some embodiments, the time series data processing apparatus 40 may further include: a queue generation module 408, a data transmission module 410, and a queue deletion module 412.
The queue generating module 408 is configured to generate a second buffer queue according to the first buffer queue. And the result data stored in the second buffer queue is completely the same as the result data currently stored in the first buffer queue.
The data sending module 410 is configured to send the result data in the second buffer queue to the database.
The queue deleting module 412 is configured to delete the second buffer queue when the second buffer queue is empty.
In some embodiments, the data buffer module 406 may further be configured to receive new result data obtained by sequential processing performed by the data processing module 404 and store the new result data in the first buffer queue.
According to the time series data processing apparatus provided by the embodiment of the invention, high-performance time series processing of large batches of data can be carried out in parallel based on the grouping result of the devices, taking advantage of multi-core operation. Meanwhile, a single machine can handle the data of tens of thousands of devices, so that a complex computer cluster is avoided and the deployment cost is significantly reduced.
In addition, according to some embodiments, the time sequence data processing device provided by the invention adopts a data caching technology and a data transmission technology which are independent of each other, and can continuously and stably realize batch transmission and accurate storage of mass time sequence result data.
It is noted that the block diagrams shown in the above figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Fig. 5 is a schematic diagram illustrating a configuration of a server according to an example embodiment. It should be noted that the server shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention. The server shown in fig. 5 may be applied to, for example, an internet of things data processing scenario.
As shown in fig. 5, the server 100 is in the form of a general-purpose computer device. The components of the server 100 include: at least one Central Processing Unit (CPU) 1001, which can perform various appropriate actions and processes according to program codes stored in a Read Only Memory (ROM) 1002 or program codes loaded from at least one storage unit 1008 into a Random Access Memory (RAM) 1003.
In particular, according to an embodiment of the present invention, the program code may be executed by the central processing unit 1001, such that the central processing unit 1001 performs the steps according to various exemplary embodiments of the present invention described in the above-mentioned method embodiment section of the present specification. For example, the central processing unit 1001 may perform the steps as shown in fig. 1, 2, 3.
In the RAM 1003, various programs and data necessary for the operation of the server 100 are also stored. The CPU 1001, ROM 1002, and RAM 1003 are connected to each other via a bus 1004. An input/output (I/O) interface 1005 is also connected to bus 1004.
The following components are connected to the I/O interface 1005: an input unit 1006 including a keyboard, a mouse, and the like; an output unit 1007 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage unit 1008 including a hard disk and the like; and a communication unit 1009 including a network interface card such as a LAN card, a modem, or the like. The communication unit 1009 performs communication processing via a network such as the Internet. A drive 1010 is also connected to the I/O interface 1005 as necessary. A removable medium 1011 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive 1010 as necessary, so that a computer program read out therefrom is installed into the storage unit 1008 as necessary.
FIG. 6 is a block diagram illustrating a time series data processing system in accordance with an exemplary embodiment. The time series data processing system shown in fig. 6 can be applied to the data processing scenario of the Internet of Things, for example.
Referring to fig. 6, a time series data processing system 60 includes: a plurality of devices 602, the server 100 described above, and a database 604.
The server 100 is configured to receive a plurality of pieces of data to be processed from a plurality of devices 602 and send a plurality of pieces of result data to the database 604.
FIG. 7 is a schematic diagram illustrating a computer-readable storage medium in accordance with an example embodiment.
Referring to fig. 7, a program product 700 configured to implement the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable medium carries one or more programs which, when executed by a device, cause the device to implement the functions shown in figs. 1, 2, and 3.
Exemplary embodiments of the present invention are specifically illustrated and described above. It is to be understood that the invention is not limited to the precise construction, arrangements, or instrumentalities described herein; on the contrary, the invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
Claims (10)
1. A method for processing time series data, comprising:
receiving a plurality of pieces of data to be processed sent by a plurality of devices, wherein the plurality of pieces of data to be processed respectively comprise time sequence data acquired by the plurality of devices and identification codes of the plurality of devices;
importing, according to the grouping results of the plurality of devices, the plurality of pieces of data to be processed into corresponding task queues respectively for sequential processing, and correspondingly obtaining a plurality of pieces of result data; and
sequentially storing the result data into a first buffer queue.
2. The method of claim 1, wherein the grouping result is obtained according to the following steps: inputting the identification code into a pre-constructed hash function, and taking an output result of the hash function as the grouping result of the device corresponding to the identification code.
3. The method of claim 1, wherein the step of respectively importing the multiple pieces of data to be processed into corresponding task queues for sequential processing according to the grouping result of the multiple pieces of equipment, and correspondingly obtaining multiple pieces of result data comprises:
according to the grouping result, the time sequence data in the multiple pieces of data to be processed are respectively led into the task queues of corresponding thread pools, and each thread pool comprises a processing thread; and
in each thread pool, the processing thread sequentially processes the time sequence data in the task queue imported into the thread pool, and the plurality of pieces of result data are correspondingly obtained.
4. The method according to any one of claims 1-3, further comprising:
generating a second cache queue according to the first cache queue, wherein the result data stored in the second cache queue is completely the same as the result data currently stored in the first cache queue;
sending the result data in the second cache queue to a database; and
deleting the second buffer queue when the second buffer queue is empty.
5. The method of claim 4, further comprising: receiving new result data obtained by the sequential processing and storing the new result data into the first cache queue.
6. A time-series data processing apparatus, comprising:
the data receiving module is used for receiving a plurality of pieces of data to be processed sent by a plurality of devices, wherein the plurality of pieces of data to be processed respectively comprise time sequence data acquired by the plurality of devices and identification codes of the plurality of devices;
the data processing module is used for respectively importing the multiple pieces of data to be processed into corresponding task queues to be sequentially processed according to grouping results of the multiple pieces of equipment, and correspondingly obtaining multiple pieces of result data; and
the data caching module is used for sequentially storing the result data into a first caching queue.
7. The apparatus of claim 6, further comprising:
a queue generating module, configured to generate a second buffer queue according to the first buffer queue, where the result data stored in the second buffer queue is completely the same as the result data currently stored in the first buffer queue;
the data sending module is used for sending the result data in the second cache queue to a database; and
the queue deleting module is used for deleting the second buffer queue when the second buffer queue is empty.
8. A server, comprising: memory, processor and executable instructions stored in the memory and executable in the processor, characterized in that the processor implements the method according to any of claims 1-5 when executing the executable instructions.
9. A time series data processing system, comprising: a plurality of devices, a server according to claim 8 and a database; the server receives a plurality of pieces of data to be processed from the plurality of devices and sends a plurality of pieces of result data to the database.
10. A computer-readable storage medium having computer-executable instructions stored thereon, wherein the executable instructions, when executed by a processor, implement the method of any of claims 1-5.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910549734.8A CN112134909B (en) | 2019-06-24 | 2019-06-24 | Time sequence data processing method, device, system, server and readable storage medium |
PCT/CN2020/084962 WO2020259017A1 (en) | 2019-06-24 | 2020-04-15 | Time sequence data processing method, apparatus and system, and server and readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112134909A (en) | 2020-12-25 |
CN112134909B (en) | 2022-04-19 |
Family
ID=73849297
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910549734.8A Active CN112134909B (en) | 2019-06-24 | 2019-06-24 | Time sequence data processing method, device, system, server and readable storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112134909B (en) |
WO (1) | WO2020259017A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113282604A (en) * | 2021-07-14 | 2021-08-20 | 北京远舢智能科技有限公司 | High-availability time sequence database cluster system realized based on message queue |
CN114553700A (en) * | 2022-02-24 | 2022-05-27 | 树根互联股份有限公司 | Equipment grouping method and device, computer equipment and storage medium |
CN116089414A (en) * | 2023-04-10 | 2023-05-09 | 之江实验室 | Time sequence database writing performance optimization method and device based on mass data scene |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113220780B (en) * | 2021-04-29 | 2023-12-05 | 北京字跳网络技术有限公司 | Data processing method, device, equipment and medium |
CN113377501A (en) * | 2021-06-08 | 2021-09-10 | 中国农业银行股份有限公司 | Data processing method, apparatus, device, medium, and program product |
CN114500673A (en) * | 2022-02-15 | 2022-05-13 | 江苏提米智能科技有限公司 | Encoding method and decoding method based on time series data transmission, device, electronic equipment and storage medium |
CN116192985B (en) * | 2023-02-08 | 2023-11-21 | 广东保伦电子股份有限公司 | Data transmission method, device and system |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101682565A (en) * | 2007-03-12 | 2010-03-24 | 思杰系统有限公司 | Systems and methods for dynamic bandwidth control by proxy |
US20100148940A1 (en) * | 1999-10-06 | 2010-06-17 | Gelvin David C | Apparatus for internetworked wireless integrated network sensors (wins) |
CN103109285A (en) * | 2010-08-31 | 2013-05-15 | 佳能株式会社 | Mechanism for autotuning mass data transfer from a sender to a receiver over parallel connections |
CN107391719A (en) * | 2017-07-31 | 2017-11-24 | 南京邮电大学 | Distributed stream data processing method and system in a kind of cloud environment |
CN108228730A (en) * | 2017-12-11 | 2018-06-29 | 深圳市买买提信息科技有限公司 | Data lead-in method, device, computer equipment and readable storage medium storing program for executing |
CN108459917A (en) * | 2018-03-15 | 2018-08-28 | 欧普照明股份有限公司 | A kind of message distribution member, message handling system and message distribution method |
CN109145051A (en) * | 2018-07-03 | 2019-01-04 | 阿里巴巴集团控股有限公司 | The data summarization method and device and electronic equipment of distributed data base |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9870385B2 (en) * | 2013-04-01 | 2018-01-16 | Hitachi, Ltd. | Computer system, data management method, and computer |
CN107193539B (en) * | 2016-03-14 | 2020-11-24 | 北京京东尚科信息技术有限公司 | Multithreading concurrent processing method and multithreading concurrent processing system |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113282604A (en) * | 2021-07-14 | 2021-08-20 | 北京远舢智能科技有限公司 | High-availability time sequence database cluster system realized based on message queue |
CN113282604B (en) * | 2021-07-14 | 2021-10-22 | 北京远舢智能科技有限公司 | High-availability time sequence database cluster system realized based on message queue |
CN114553700A (en) * | 2022-02-24 | 2022-05-27 | 树根互联股份有限公司 | Equipment grouping method and device, computer equipment and storage medium |
CN116089414A (en) * | 2023-04-10 | 2023-05-09 | 之江实验室 | Time sequence database writing performance optimization method and device based on mass data scene |
CN116089414B (en) * | 2023-04-10 | 2023-09-08 | 之江实验室 | Time sequence database writing performance optimization method and device based on mass data scene |
Also Published As
Publication number | Publication date |
---|---|
WO2020259017A1 (en) | 2020-12-30 |
CN112134909B (en) | 2022-04-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112134909B (en) | Time sequence data processing method, device, system, server and readable storage medium | |
EP3726411A1 (en) | Data desensitising method, server, terminal, and computer-readable storage medium | |
US9736034B2 (en) | System and method for small batching processing of usage requests | |
CN110096344A (en) | Task management method, system, server cluster and computer-readable medium | |
US20200285508A1 (en) | Method and Apparatus for Assigning Computing Task | |
CN110413384B (en) | Delay task processing method and device, storage medium and electronic equipment | |
US11188443B2 (en) | Method, apparatus and system for processing log data | |
CN112306719B (en) | Task scheduling method and device | |
CN108874524A (en) | Big data distributed task dispatching system | |
CN111045911B (en) | Performance test method, performance test device, storage medium and electronic equipment | |
CN109783562B (en) | Service processing method and device | |
CN110569252A (en) | Data processing system and method | |
CN114090366A (en) | Method, device and system for monitoring data | |
US20230037783A1 (en) | Resource scheduling method and related apparatus | |
CN111371585A (en) | Configuration method and device for CDN node | |
CN113127564B (en) | Parameter synchronization method and device | |
CN115221033A (en) | Interface protocol testing method and device, computer readable medium and electronic equipment | |
CN116842090A (en) | Accounting system, method, equipment and storage medium | |
CN111738721A (en) | Block chain transaction monitoring method and related device | |
CN116701020A (en) | Message delay processing method, device, equipment, medium and program product | |
CN113407551A (en) | Data consistency determining method, device, equipment and storage medium | |
CN111209263A (en) | Data storage method, device, equipment and storage medium | |
US10623523B2 (en) | Distributed communication and task handling to facilitate operations of application system | |
CN108805741B (en) | Fusion method, device and system of power quality data | |
CN113778700A (en) | Message processing method, system, medium and computer system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||