CN112134909A - Time sequence data processing method, device, system, server and readable storage medium - Google Patents

Time sequence data processing method, device, system, server and readable storage medium

Info

Publication number
CN112134909A
Authority
CN
China
Prior art keywords
data
result
pieces
queue
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910549734.8A
Other languages
Chinese (zh)
Other versions
CN112134909B (en)
Inventor
李元景
李荐民
戴俊娣
朱文涛
牛雄飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongfang Vision Technology Jiangsu Co ltd
Nuctech Co Ltd
Original Assignee
Tongfang Vision Technology Jiangsu Co ltd
Nuctech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongfang Vision Technology Jiangsu Co ltd, Nuctech Co Ltd
Priority to CN201910549734.8A (CN112134909B)
Priority to PCT/CN2020/084962 (WO2020259017A1)
Publication of CN112134909A
Application granted
Publication of CN112134909B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/12 Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements
    • H04L49/90 Buffering arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/60 Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
    • H04L67/62 Establishing a time schedule for servicing the requests

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Fuzzy Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a time series data processing method, apparatus, system, server and readable storage medium. The method comprises: receiving a plurality of pieces of data to be processed sent by a plurality of devices, wherein each piece of data to be processed comprises time series data acquired by one of the devices and the identification code of that device; according to the grouping results of the plurality of devices, importing the pieces of data to be processed into the corresponding task queues for sequential processing, and correspondingly obtaining a plurality of pieces of result data; and sequentially storing the result data in a first buffer queue. With this method, high-performance, order-preserving processing of massive data can be carried out in parallel on the basis of the device grouping results, exploiting the advantage of multi-core operation. In addition, a single machine can handle the data of tens of thousands of devices, which avoids building a complex computer cluster and significantly reduces deployment cost.

Description

Time sequence data processing method, device, system, server and readable storage medium
Technical Field
The invention relates to the field of data processing, and in particular to a time series data processing method, apparatus, system, server and readable storage medium.
Background
With the development of Internet of Things (IoT) technology, both the volume of IoT data to be processed and the difficulty of processing it keep growing. The data processing methods currently common in the IoT field fall mainly into the following two types:
The first type: big data cluster computing based on typical stream/batch processing frameworks such as Hadoop, Spark and Storm. It is mainly suited to high-speed processing of massive data, offers high parallelism and high throughput, and can support data processing for thousands of IoT devices. However, a big data cluster framework places high demands on the resources it needs to run. Because of the scale effect, its performance advantage only appears once the cluster reaches a certain size; for example, a cluster framework generally needs at least three servers. Such an architecture is difficult to deploy, complex to implement and costly.
The second type: simple data processing implemented by ordinary programming. It is mainly suited to household scenarios or small device populations, and supports at most a few hundred devices. Because the design is simple, an ordinary program can hardly guarantee both ordering and high-performance data processing, so this approach is usually only suitable for data that is not time series sensitive.
Time series sensitivity means that, in a data processing task, the processing of the data monitored at the current moment depends on the processing result for the previous moment; that is, the data monitored at the previous moment must be processed before the data monitored at the current moment. In a conventional multi-thread model, two pieces of data may be assigned to different processing threads, so data monitored at the current moment is often processed before data monitored at the previous moment, which produces a completely wrong result. For example, in a cold-chain logistics service, a temperature sensor monitors the temperature inside the vehicle compartment in real time. An early warning must be sent when the temperature is not lower than 20 ℃, but only one warning is needed for as long as the temperature stays continuously at or above 20 ℃. Suppose the temperatures monitored at 5 consecutive moments are 17 ℃, 19 ℃, 20 ℃, 19 ℃ and 21 ℃. Under the business rule, warnings are normally triggered at the 3rd and 5th moments. If the data are not processed in chronological order, for example if the data of the 4th moment is processed before the data of the 3rd moment, a warning is triggered only when the data of the 3rd moment (20 ℃) is processed and none is triggered for the 5th moment, so transport staff are late in discovering the failure of the compartment's temperature control system.
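The effect of processing order on the cold-chain rule can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `alerts`, the `(moment, temperature)` tuple layout and the "fire once per continuous run at or above the threshold" reading of the business rule are assumptions made for the example, not part of the patent text.

```python
def alerts(readings, threshold=20):
    """Return the moments at which a warning fires.

    readings: iterable of (moment, temperature) in the order they are PROCESSED.
    A warning fires when a reading is at or above the threshold and the
    previously processed reading was not (one warning per continuous run).
    """
    fired = []
    above = False  # state carried over from the previously processed reading
    for moment, temp in readings:
        if temp >= threshold and not above:
            fired.append(moment)
        above = temp >= threshold
    return fired


# Readings processed in chronological order.
in_order = [(1, 17), (2, 19), (3, 20), (4, 19), (5, 21)]
# Same readings, but moment 4 is processed before moment 3.
swapped = [(1, 17), (2, 19), (4, 19), (3, 20), (5, 21)]

print(alerts(in_order))  # [3, 5]
print(alerts(swapped))   # [3]
```

With the swapped order, the 21 ℃ reading is processed directly after the 20 ℃ reading, so it looks like a continuation of the same run and the second warning is lost.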
Serial processing guarantees that data are handled in order, but it cannot exploit multi-core operation and wastes considerable resources. In addition, existing dedicated time series databases generally focus only on storing time series data and serving queries, and cannot directly perform the logic processing described above.
The above information disclosed in this background section is only for enhancement of understanding of the background of the invention and therefore it may contain information that does not constitute prior art that is already known to a person of ordinary skill in the art.
Disclosure of Invention
In view of the above, the present invention provides a method, an apparatus, a system, a server and a readable storage medium for processing time series data.
Additional features and advantages of the invention will be set forth in the detailed description which follows, or may be learned by practice of the invention.
According to an aspect of the present invention, there is provided a time series data processing method, including: receiving a plurality of pieces of data to be processed sent by a plurality of devices, wherein each piece of data to be processed comprises time series data acquired by one of the devices and the identification code of that device; according to the grouping results of the plurality of devices, importing the pieces of data to be processed into the corresponding task queues for sequential processing, and correspondingly obtaining a plurality of pieces of result data; and sequentially storing the result data in a first buffer queue.
According to an embodiment of the present invention, the grouping result is obtained as follows: the identification code is input into a pre-constructed hash function, and the output of the hash function is taken as the grouping result of the device corresponding to the identification code.
According to an embodiment of the present invention, importing the pieces of data to be processed into the corresponding task queues for sequential processing according to the grouping results of the plurality of devices, and correspondingly obtaining the plurality of pieces of result data, includes: according to the grouping result, importing the time series data in the pieces of data to be processed into the task queues of the corresponding thread pools, each thread pool comprising one processing thread; and in each thread pool, sequentially processing, by the processing thread, the time series data imported into the task queue of that thread pool, to correspondingly obtain the plurality of pieces of result data.
According to an embodiment of the invention, the method further comprises: generating a second buffer queue according to the first buffer queue, wherein the result data stored in the second buffer queue is identical to the result data currently stored in the first buffer queue; sending the result data in the second buffer queue to a database; and deleting the second buffer queue when the second buffer queue is empty.
According to an embodiment of the present invention, the method further comprises: receiving new result data obtained by the sequential processing and storing it in the first buffer queue.
According to another aspect of the present invention, there is provided a time series data processing apparatus including: a data receiving module configured to receive a plurality of pieces of data to be processed sent by a plurality of devices, wherein each piece of data to be processed comprises time series data acquired by one of the devices and the identification code of that device; a data processing module configured to import the pieces of data to be processed into the corresponding task queues for sequential processing according to the grouping results of the plurality of devices, and to correspondingly obtain a plurality of pieces of result data; and a data caching module configured to sequentially store the result data in a first buffer queue.
According to an embodiment of the invention, the apparatus further comprises: a queue generating module configured to generate a second buffer queue according to the first buffer queue, wherein the result data stored in the second buffer queue is identical to the result data currently stored in the first buffer queue; a data sending module configured to send the result data in the second buffer queue to a database; and a queue deleting module configured to delete the second buffer queue when the second buffer queue is empty.
According to still another aspect of the present invention, there is provided a server including: a memory, a processor, and executable instructions stored in the memory and executable by the processor, wherein the processor, when executing the executable instructions, implements any one of the time series data processing methods described above.
According to still another aspect of the present invention, there is provided a time series data processing system including: a plurality of devices, the server described above, and a database; the server receives a plurality of pieces of data to be processed from the plurality of devices and sends a plurality of pieces of result data to the database.
According to yet another aspect of the present invention, there is provided a computer-readable storage medium having stored thereon computer-executable instructions that, when executed by a processor, implement any one of the above-described methods of time-series data processing.
With the time series data processing method provided by the embodiments of the invention, high-performance, order-preserving processing of massive data can be carried out in parallel on the basis of the device grouping results, exploiting the advantage of multi-core operation. In addition, a single machine can handle the data of tens of thousands of devices, which avoids building a complex computer cluster and significantly reduces deployment cost.
In addition, according to some embodiments, the time series data processing method provided by the invention uses data caching and data transmission mechanisms that are independent of each other, so that massive amounts of time series result data can be transmitted in batches and stored accurately in a continuous and stable manner.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
FIG. 1 is a flow chart illustrating a method of time series data processing according to an exemplary embodiment.
FIG. 2 is a flow chart illustrating another method of time series data processing according to an exemplary embodiment.
FIG. 3 is a flow chart illustrating yet another method of time series data processing according to an exemplary embodiment.
FIG. 4 is a block diagram illustrating a time series data processing apparatus according to an exemplary embodiment.
FIG. 5 is a schematic diagram illustrating the structure of a server according to an exemplary embodiment.
FIG. 6 is a block diagram illustrating a time series data processing system according to an exemplary embodiment.
FIG. 7 is a schematic diagram illustrating a computer-readable storage medium according to an exemplary embodiment.
FIG. 8 is a schematic diagram illustrating the flow of time series data according to an example.
FIG. 9 is a schematic diagram illustrating the scheduling and processing of time series data according to an example.
FIG. 10 is a schematic diagram illustrating the storage of result data according to an example.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The drawings are merely schematic illustrations of the invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention may be practiced without one or more of the specific details, or with other methods, apparatus, steps, and so forth. In other instances, well-known structures, methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
As described above, the second type of data processing method can easily implement serial processing of time series data, but cannot exert the advantage of multi-core operation, and severely wastes computer resources; the first type of data processing method can process time sequence data in parallel, but the framework deployment difficulty is high, the implementation is complex and the cost is high.
Therefore, the invention provides a time series data processing method that combines the advantages of the two approaches. With the method provided by the embodiments of the invention, high-performance, order-preserving processing of massive data can be carried out in parallel on the basis of the device grouping results, exploiting the advantage of multi-core operation. In addition, a single machine can handle the data of tens of thousands of devices, which avoids building a complex computer cluster and significantly reduces deployment cost.
FIG. 8 is a schematic diagram illustrating the flow of time series data according to an example. As indicated by the arrows in FIG. 8, the time series data collected in real time by a terminal device is first received by the data receiving/scheduling module in the server and then forwarded, according to the configuration policy, to the corresponding processing unit in the data processor module in the server. After the data processor module performs logic processing on the time series data, the resulting data is temporarily stored in the data buffer module in the server; the data submitter module in the server fetches the result data from the data buffer module in batches and submits it to a database system for storage. Subsequently, the result data stored in the database system can be retrieved, exported, queried, tested, aggregated and so on through a data interface.
The overall scheme of the present invention is based on the flow of time series data shown in FIG. 8. With reference to FIG. 9 and FIG. 10, the complete flow can be divided into a front part and a back part, each of which embodies a specific innovation of the overall scheme:
FIG. 9 is a schematic diagram illustrating the scheduling and processing of time series data according to an example. The data receiving/scheduling module provides a configuration policy for the terminal devices: it can group n devices and continuously and stably route the time series data acquired by the same device at different moments into the same thread pool in the data processor module;
FIG. 10 is a schematic diagram illustrating the storage of result data according to an example. Because the result data produced by the data processor module is ultimately submitted to the database system by the data submitter module, the data buffer module, acting as a "transfer station", can provide two data queues: a buffer queue and a temporary queue generated from the buffer queue. The two queues continuously and independently handle, respectively, the dynamic buffering and the dynamic submission of result data.
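The hand-off between the buffer queue and the temporary queue can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the class name `ResultBuffer`, the function `submit`, the batch size, and the list standing in for the database are all hypothetical. In particular, generating the temporary queue by swapping the buffer queue's contents out under a lock is an assumption of this sketch; the source only states that the temporary queue holds the same result data as the buffer queue at the moment it is generated.

```python
import threading
from collections import deque


class ResultBuffer:
    """Holds the first buffer queue; producers keep appending to it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._first = deque()

    def put(self, result):
        with self._lock:
            self._first.append(result)

    def snapshot(self):
        """Hand out the current contents as the second (temporary) queue.

        Assumption of this sketch: the first queue is replaced by a fresh,
        empty deque so that new results never mix with the snapshot.
        """
        with self._lock:
            second, self._first = self._first, deque()
        return second


def submit(second, database, batch_size=2):
    """Send the temporary queue to the database in batches until it is empty."""
    while second:
        size = min(batch_size, len(second))
        batch = [second.popleft() for _ in range(size)]
        database.extend(batch)  # stand-in for a real batch insert


buffer = ResultBuffer()
for r in ("r1", "r2", "r3"):
    buffer.put(r)

second = buffer.snapshot()   # second queue == contents of first queue right now
buffer.put("r4")             # buffering continues independently of submission

database = []
submit(second, database)
del second                   # the temporary queue is discarded once empty
print(database)              # ['r1', 'r2', 'r3']
```

Keeping submission on a separate queue means a slow database write does not block the processing threads that are still appending results.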
FIG. 1 is a flow chart illustrating a method of time series data processing according to an exemplary embodiment. The time-series data processing method shown in fig. 1 can be applied to a server side in a data processing scenario of the internet of things, for example, a server composed of a data receiving/scheduling device, a data processor, a data buffer, and a data submitter in fig. 8.
Referring to fig. 1, a time series data processing method 10 includes:
In step S102, a plurality of pieces of data to be processed sent by a plurality of devices are received.
Each piece of data to be processed comprises time series data collected by one of the devices and the identification code of that device.
In an IoT data processing scenario, IoT devices such as physical sensors collect time series data, that is, values of a physical quantity that change continuously over time, which then await processing. The server receives the data to be processed sent by an IoT device; in addition to the time series data collected by the device, the data structure contains a fixed, unique identification code corresponding to that device. The IoT device and the server can exchange information over protocols such as MQTT (Message Queuing Telemetry Transport), HTTP (HyperText Transfer Protocol) and TCP (Transmission Control Protocol).
In step S104, according to the grouping result for the multiple devices, the multiple pieces of data to be processed are respectively imported into the corresponding task queues for sequential processing, and multiple pieces of result data are correspondingly obtained.
In some embodiments, the grouping result is obtained as follows: the identification code is input into a pre-constructed hash function, and the output of the hash function is taken as the grouping result of the device corresponding to the identification code.
Here, assume that the server can process the time series data collected by m IoT devices whose unique identification codes are S1, S2, S3, ..., Sm. A balanced and idempotent hash function addr = h(key) is constructed in advance:
A balanced hash function satisfies the condition that a key obtained by random sampling, or generated randomly according to the distribution of the key space, falls into any one of the n hash addresses with probability 1/n. An idempotent hash function satisfies the condition that repeated execution with the same parameter key yields the same result and does not change the state of the system. The identity function and the constant function are the two simplest idempotent functions.
The pre-constructed hash function addr = h(key) establishes a definite correspondence between a keyword key and its storage location addr, so that each keyword corresponds to a unique storage location. A balanced and idempotent hash function addr = h(key) can therefore map identifiers onto different hash addresses. For example, the identification codes of the m IoT devices are divided into n groups and mapped to n hash addresses, so that in subsequent processing the time series data from the same IoT device always enters the same thread pool on the server, and the number of IoT devices matched to each thread pool is approximately m/n.
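A hash function with the two properties described above can be sketched in Python as follows. The choice of CRC32 is an assumption made for illustration only; the patent does not say which hash function is used, and CRC32 gives only approximately balanced groups. The function name `group_of` and the generated identification codes are likewise hypothetical.

```python
import zlib
from collections import Counter


def group_of(device_id: str, n: int) -> int:
    """Map a device identification code to one of n hash addresses (0 .. n-1).

    zlib.crc32 is deterministic across calls and across processes, unlike
    Python's built-in hash() for str, which is salted per process by default.
    The function reads no state and writes none, so repeated calls with the
    same key return the same address.
    """
    return zlib.crc32(device_id.encode("utf-8")) % n


n = 4
device_ids = [f"S{i}" for i in range(1, 10001)]  # hypothetical codes S1 .. S10000

# Group sizes; each should be in the vicinity of m/n = 2500.
sizes = Counter(group_of(d, n) for d in device_ids)
print(sorted(sizes.items()))
```

Because the mapping depends only on the identification code, every message from a given device lands in the same group no matter when it arrives.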
In the embodiments of the present invention, the grouping results of the plurality of devices may be determined once, by pre-assignment based on the hash function, without re-executing the hash function each time time series data is received and processed. Alternatively, they may be determined by executing the hash function each time time series data is received and processed: because the pre-constructed hash function is idempotent, repeated execution yields the same grouping result every time. The structure of the hash function may also be adjusted at any time according to actual requirements in order to change the device grouping result one or more times; the present invention is not limited in this respect.
The number of groups n may, for example, be an empirical value chosen with reference to the number of CPU cores on the server. Assuming m = 6 and n = 3, one of the simplest groupings is: h(S1) = 0, h(S2) = 1, h(S3) = 2, h(S4) = 0, h(S5) = 1, h(S6) = 2. That is, the 6 IoT devices are divided evenly into 3 groups: devices S1 and S4 correspond to the first hash address, and their time series data always enters the first thread pool on the server; devices S2 and S5 correspond to the second hash address, and their time series data always enters the second thread pool; devices S3 and S6 correspond to the third hash address, and their time series data always enters the third thread pool. This grouping is only an illustration; the present invention is not limited to this number of devices, number of groups or grouping result.
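The m = 6, n = 3 example can be reproduced with a toy hash function. The formula h(Sk) = (k - 1) mod n is an assumption chosen because it yields exactly the grouping listed above; the patent does not specify how h is computed. The `assignment` table illustrates the one-time pre-assignment option.

```python
from collections import defaultdict


def h(device_id: str, n: int = 3) -> int:
    """Toy hash for codes of the form 'S<k>': h(Sk) = (k - 1) mod n."""
    return (int(device_id[1:]) - 1) % n


devices = ["S1", "S2", "S3", "S4", "S5", "S6"]

# One-time pre-assignment: compute each device's group once and look it up later.
assignment = {d: h(d) for d in devices}

# Invert the table to list the devices served by each thread pool.
groups = defaultdict(list)
for d in devices:
    groups[assignment[d]].append(d)

print(dict(groups))  # {0: ['S1', 'S4'], 1: ['S2', 'S5'], 2: ['S3', 'S6']}
```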
In step S106, the result data are sequentially stored in the first buffer queue.
With the time series data processing method provided by the embodiments of the invention, high-performance, order-preserving processing of massive data can be carried out in parallel on the basis of the device grouping results, exploiting the advantage of multi-core operation. In addition, a single machine can handle the data of tens of thousands of devices, which avoids building a complex computer cluster and significantly reduces deployment cost.
It should be clearly understood that the present disclosure describes how to make and use particular examples, but the principles of the present disclosure are not limited to any details of these examples. Rather, these principles can be applied to many other embodiments based on the teachings of the present disclosure.
FIG. 2 is a flow chart illustrating another method of time series data processing according to an exemplary embodiment. The difference from the method 10 shown in fig. 1 is that the method 20 shown in fig. 2 further provides a method for sequentially processing the data to be processed according to the device grouping result to correspondingly obtain the result data, i.e. an embodiment of step S104 in the method 10. Similarly, the time-series data processing method shown in fig. 2 may also be applied to a server side in a data processing scenario of the internet of things, for example.
Referring to fig. 2, step S104 in the method 10 further includes:
in step S202, according to the grouping result, time series data in the multiple pieces of data to be processed are respectively imported into task queues of corresponding thread pools.
Wherein each thread pool comprises one processing thread.
In step S204, in each thread pool, the processing thread sequentially processes the time series data in the task queue imported into the thread pool, and a plurality of pieces of result data are correspondingly obtained.
In some embodiments, each thread pool on the server keeps exactly one processing thread. When a thread pool receives multiple pieces of data to be processed from multiple internet of things devices, that single thread processes them in queue order, following the order in which the devices acquired/sent the time series data. This guarantees that the thread pool handles time series data from multiple internet of things devices in order.
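A minimal sketch of steps S202 and S204, assuming Python's `concurrent.futures.ThreadPoolExecutor` as the thread pool and the same hypothetical suffix-based hash as the grouping function. With `max_workers=1`, each pool has a single worker that takes tasks from its internal queue in submission order (this is how CPython's executor behaves), so data from devices in the same group is processed in arrival order. The `process` function is a placeholder for the real processing logic.

```python
from concurrent.futures import ThreadPoolExecutor

N_GROUPS = 3  # assumed number of groups n


def group_of(device_id: str) -> int:
    """Hypothetical hash: numeric suffix of the identification code, modulo n."""
    return (int(device_id[1:]) - 1) % N_GROUPS


# One thread pool per group, each restricted to a single processing thread.
pools = [ThreadPoolExecutor(max_workers=1) for _ in range(N_GROUPS)]
processed = [[] for _ in range(N_GROUPS)]


def process(group: int, device_id: str, sample: int) -> None:
    # Placeholder processing step. Only the single thread of pool `group`
    # ever appends to processed[group], so no extra locking is needed here.
    processed[group].append((device_id, sample))


# Data to be processed, in arrival order: (identification code, sample).
incoming = [("S1", 1), ("S4", 1), ("S2", 1), ("S1", 2), ("S4", 2), ("S1", 3)]

# S202: import each piece of data into the task queue of its group's pool.
for device_id, sample in incoming:
    g = group_of(device_id)
    pools[g].submit(process, g, device_id, sample)

# S204 runs inside the pools; wait until every task queue has been drained.
for pool in pools:
    pool.shutdown(wait=True)

print(processed[0])  # [('S1', 1), ('S4', 1), ('S1', 2), ('S4', 2), ('S1', 3)]
```

The three pools run concurrently with each other, which is where the multi-core benefit comes from, while ordering is only enforced inside each group.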
Fig. 3 is a flowchart illustrating yet another method of processing time series data according to an example embodiment. The difference from the method 10 of fig. 1 is that the method 30 of fig. 3 further provides a method of storing the resulting data to a database. Similarly, the time-series data processing method shown in fig. 3 may also be applied to a server side in a data processing scenario of the internet of things, for example.
Referring to fig. 3, the time-series data processing method 30 includes:
in step S302, a second buffer queue is generated according to the first buffer queue.
The result data stored in the second cache queue is identical to the result data currently stored in the first cache queue.
Taking the internet of things data processing scenario as an example again: after a plurality of pieces of result data are obtained, the server sequentially stores the result data into a first cache queue, for example a cache queue named QUEUE-RESULT. While the result data is being stored sequentially in QUEUE-RESULT, the sending, receiving and processing of time series data can continue dynamically.
While result data is being stored into the database, the server keeps processing the time series data acquired/sent by the internet of things devices at different moments and keeps storing the obtained result data into QUEUE-RESULT. If the result data in QUEUE-RESULT were transmitted in batches to the database from that same queue at the same time, the data could very easily become disordered or even lost. In the embodiment of the invention, the server generates a second cache queue from the first cache queue, so that two different cache queues are handled on the server at the same time and the caching of result data and the transmission of result data are independent and do not interfere with each other, thereby avoiding disorder and loss of data.
In some embodiments, the second cache queue may be generated from the first cache queue by copying the first cache queue in real time, or by renaming the first cache queue. Renaming the first cache queue, for example renaming QUEUE-RESULT to QUEUE-TEMP, only modifies the pointer to the head address of the array formed in real time by the result data. This ensures that the result data stored in the second cache queue QUEUE-TEMP is identical in structure and content to the result data stored in the first cache queue QUEUE-RESULT at that moment.
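The renaming approach can be sketched as a reference swap. The class and method names below (`ResultBuffer`, `put`, `detach`) are hypothetical, and the lock is an assumption added so that producers and the detaching thread do not interleave; the text itself only describes changing the pointer.

```python
import threading
from collections import deque


class ResultBuffer:
    """Hypothetical holder for the first cache queue (QUEUE-RESULT)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queue = deque()  # plays the role of QUEUE-RESULT

    def put(self, result):
        """Store one piece of result data at the tail of QUEUE-RESULT."""
        with self._lock:
            self._queue.append(result)

    def detach(self):
        """'Rename' QUEUE-RESULT to QUEUE-TEMP.

        The current queue object is handed out as the second cache queue and
        a fresh, empty queue takes its place. No element is copied.
        """
        with self._lock:
            queue_temp, self._queue = self._queue, deque()
        return queue_temp


buffer = ResultBuffer()
for result in ("r1", "r2", "r3"):
    buffer.put(result)

queue_temp = buffer.detach()  # second cache queue (QUEUE-TEMP)
buffer.put("r4")              # new result data goes to the new first queue

print(list(queue_temp))       # ['r1', 'r2', 'r3']
```

Because only a reference changes hands, the cost of generating the second queue does not depend on how many results are buffered, which is the practical difference from the copy-based variant.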
In step S304, the result data in the second buffer queue is sent to the database.
The database storing the mass result data can provide services such as data query, statistics and report generation of the internet of things.
In step S306, when the second buffer queue is empty, the second buffer queue is deleted.
In some embodiments, the method 30 further comprises: receiving new result data obtained by sequential processing and storing the new result data into the first cache queue.
After the server has transmitted all the result data in the second cache queue QUEUE-TEMP to the database and stored it there, that is, once QUEUE-TEMP is empty, the server deletes QUEUE-TEMP immediately. Meanwhile, the current first cache queue QUEUE-RESULT stores the new result data obtained by sequential processing. Steps S302-S306 are repeated and a second cache queue is generated in each cycle, so that batch transmission and accurate storage of massive time series result data can be achieved continuously and stably.
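One cycle of steps S302-S306 might look like the sketch below. Everything concrete in it is an assumption for illustration: an in-memory SQLite database stands in for the database, the table and column names are invented, the batch size of 100 is arbitrary, and the sketch is single-threaded, so the locking a real server would need around the queue swap is omitted.

```python
import sqlite3
from collections import deque

BATCH_SIZE = 100  # assumed batch size for transmission

db = sqlite3.connect(":memory:")  # stand-in for the database
db.execute("CREATE TABLE results (device_id TEXT, ts INTEGER, value REAL)")

queue_result = deque()  # first cache queue (QUEUE-RESULT)


def flush_cycle() -> int:
    """Run steps S302-S306 once and return the number of rows written."""
    global queue_result
    # S302: generate the second cache queue by rebinding the name.
    queue_temp, queue_result = queue_result, deque()
    # S304: send the result data in the second cache queue to the database.
    sent = 0
    while queue_temp:
        size = min(BATCH_SIZE, len(queue_temp))
        batch = [queue_temp.popleft() for _ in range(size)]
        with db:  # commits the batch as one transaction
            db.executemany("INSERT INTO results VALUES (?, ?, ?)", batch)
        sent += len(batch)
    # S306: the second cache queue is now empty, so delete it.
    del queue_temp
    return sent


queue_result.extend([("S1", 0, 20.5), ("S4", 0, 19.8), ("S1", 30, 20.7)])
print(flush_cycle())  # 3
queue_result.append(("S4", 30, 19.9))  # new result data between cycles
print(flush_cycle())  # 1

rows = db.execute("SELECT device_id, ts FROM results ORDER BY rowid").fetchall()
print(rows)  # [('S1', 0), ('S4', 0), ('S1', 30), ('S4', 30)]
```

Results that arrive while a cycle is writing to the database are simply picked up by the next cycle, so the stored order follows the order in which results were cached.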
Those skilled in the art will appreciate that all or part of the steps implementing the above embodiments are implemented as computer programs executed by a CPU. The computer program, when executed by the CPU, performs the functions defined by the method provided by the present invention. The program may be stored in a computer readable storage medium, which may be a read-only memory, a magnetic or optical disk, or the like.
Furthermore, it should be noted that the above-mentioned figures are only schematic illustrations of the processes involved in the method according to exemplary embodiments of the invention, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
The following are embodiments of the apparatus of the present invention that may be used to perform embodiments of the method of the present invention. For details which are not disclosed in the embodiments of the apparatus of the present invention, reference is made to the embodiments of the method of the present invention.
FIG. 4 is a block diagram illustrating a time series data processing apparatus according to an exemplary embodiment.
Referring to fig. 4, the time-series data processing apparatus 40 includes: a data receiving module 402, a data processing module 404, and a data caching module 406.
The data receiving module 402 is configured to receive multiple pieces of to-be-processed data sent by multiple devices.
The data to be processed respectively comprise time sequence data collected by a plurality of devices and identification codes of the devices.
The data processing module 404 is configured to, according to the grouping result of the multiple devices, respectively import multiple pieces of data to be processed into corresponding task queues for sequential processing, and correspondingly obtain multiple pieces of result data.
The data buffer module 406 is configured to store the result data into the first buffer queue in sequence.
In some embodiments, the data processing module 404 may further include a data import unit and a data processing unit. The data import unit is used for importing the time series data in the plurality of pieces of data to be processed into the task queues of the corresponding thread pools according to the grouping result, wherein each thread pool comprises one processing thread. The data processing unit is used for sequentially processing, in each thread pool and through its processing thread, the time series data imported into that thread pool's task queue, and correspondingly obtaining a plurality of pieces of result data.
In one embodiment, a virtual machine with a 4-core CPU and 16 GB of memory, running the data receiving module 402, the data processing module 404, and the data caching module 406, can process time series data sent by up to 12000 internet of things devices, each sending once every 30 seconds, where the processing logic includes complex logic such as correlation timing determination.
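The load implied by these figures can be estimated with simple arithmetic. The estimate assumes that every device sends exactly one message per 30-second period and, as further assumptions not stated in the text, that the number of groups n equals the 4 CPU cores and that the hash function spreads the devices evenly over the groups:

```latex
\[
\text{aggregate rate} = \frac{12000\ \text{messages}}{30\ \text{s}} = 400\ \text{messages/s},
\qquad
\text{rate per processing thread} = \frac{400\ \text{messages/s}}{n} = \frac{400}{4} = 100\ \text{messages/s}
\]
```

Under these assumptions each processing thread has on average 10 ms available per message.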
In some embodiments, the time series data processing apparatus 40 may further include: a queue generation module 408, a data transmission module 410, and a queue deletion module 412.
The queue generating module 408 is configured to generate a second buffer queue according to the first buffer queue. The result data stored in the second buffer queue is identical to the result data currently stored in the first buffer queue.
The data sending module 410 is configured to send the result data in the second buffer queue to the database.
The queue deleting module 412 is configured to delete the second buffer queue when the second buffer queue is empty.
In some embodiments, the data buffer module 406 may further be configured to receive new result data obtained by sequential processing performed by the data processing module 404 and store the new result data in the first buffer queue.
According to the time series data processing device provided by the embodiment of the invention, high-performance time series processing of large batches of data can be carried out in parallel based on the device grouping result, exploiting the advantage of multi-core operation. Meanwhile, a single machine can handle the data of tens of thousands of devices, so that a complex computer cluster is avoided and the deployment cost is significantly reduced.
In addition, according to some embodiments, the time sequence data processing device provided by the invention adopts a data caching technology and a data transmission technology which are independent of each other, and can continuously and stably realize batch transmission and accurate storage of mass time sequence result data.
It is noted that the block diagrams shown in the above figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Fig. 5 is a schematic diagram illustrating a configuration of a server according to an example embodiment. It should be noted that the server shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention. The server shown in fig. 5 may be applied to, for example, an internet of things data processing scenario.
As shown in fig. 5, the server 100 is in the form of a general-purpose computer device. The components of the server 100 include: at least one Central Processing Unit (CPU) 1001, which can perform various appropriate actions and processes according to program codes stored in a Read Only Memory (ROM) 1002 or program codes loaded from at least one storage unit 1008 into a Random Access Memory (RAM) 1003.
In particular, according to an embodiment of the present invention, the program code may be executed by the central processing unit 1001, such that the central processing unit 1001 performs the steps according to various exemplary embodiments of the present invention described in the above-mentioned method embodiment section of the present specification. For example, the central processing unit 1001 may perform the steps as shown in fig. 1, 2, 3.
In the RAM 1003, various programs and data necessary for the operation of the server 100 are also stored. The CPU 1001, ROM 1002, and RAM 1003 are connected to each other via a bus 1004. An input/output (I/O) interface 1005 is also connected to bus 1004.
The following components are connected to the I/O interface 1005: an input unit 1006 including a keyboard, a mouse, and the like; an output unit 1007 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage unit 1008 including a hard disk and the like; and a communication unit 1009 including a network interface card such as a LAN card, a modem, or the like. The communication unit 1009 performs communication processing via a network such as the internet. A drive 1010 is also connected to the I/O interface 1005 as necessary. A removable medium 1011 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1010 as necessary, so that a computer program read out therefrom is installed into the storage unit 1008 as necessary.
FIG. 6 is a block diagram illustrating a time series data processing system in accordance with an exemplary embodiment. The time series data processing system shown in fig. 6 can be applied to the data processing scenario of the internet of things, for example.
Referring to fig. 6, a time series data processing system 60 includes: a plurality of devices 602, the server 100 described above, and a database 604.
The server 100 is configured to receive a plurality of pieces of data to be processed from a plurality of devices 602 and send a plurality of pieces of result data to the database 604.
FIG. 7 is a schematic diagram illustrating a computer-readable storage medium in accordance with an example embodiment.
Referring to fig. 7, a program product 700 configured to implement the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable medium carries one or more programs which, when executed by a device, cause the device to implement the functions as shown in fig. 1, 2, 3.
Exemplary embodiments of the present invention are specifically illustrated and described above. It is to be understood that the invention is not limited to the precise construction, arrangements, or instrumentalities described herein; on the contrary, the invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. A method for processing time series data, comprising:
receiving a plurality of pieces of data to be processed sent by a plurality of devices, wherein the plurality of pieces of data to be processed respectively comprise time sequence data acquired by the plurality of devices and identification codes of the plurality of devices;
importing, according to grouping results of the plurality of devices, the plurality of pieces of data to be processed into corresponding task queues for sequential processing, and correspondingly obtaining a plurality of pieces of result data; and
sequentially storing the result data into a first cache queue.
2. The method of claim 1, wherein the grouping result is obtained according to the following steps: inputting the identification code into a pre-constructed hash function, and taking an output result of the hash function as the grouping result of the device corresponding to the identification code.
3. The method of claim 1, wherein respectively importing the plurality of pieces of data to be processed into corresponding task queues for sequential processing according to the grouping results of the plurality of devices, and correspondingly obtaining a plurality of pieces of result data, comprises:
importing, according to the grouping results, the time sequence data in the plurality of pieces of data to be processed into the task queues of corresponding thread pools, wherein each thread pool comprises one processing thread; and
in each thread pool, sequentially processing, by the processing thread, the time sequence data imported into the task queue of the thread pool, and correspondingly obtaining the plurality of pieces of result data.
4. The method according to any one of claims 1-3, further comprising:
generating a second cache queue according to the first cache queue, wherein the result data stored in the second cache queue is completely the same as the result data currently stored in the first cache queue;
sending the result data in the second cache queue to a database; and
deleting the second cache queue when the second cache queue is empty.
5. The method of claim 4, further comprising: receiving new result data obtained by the sequential processing and storing the new result data into the first cache queue.
6. A time-series data processing apparatus, comprising:
a data receiving module, configured to receive a plurality of pieces of data to be processed sent by a plurality of devices, wherein the plurality of pieces of data to be processed respectively comprise time sequence data acquired by the plurality of devices and identification codes of the plurality of devices;
a data processing module, configured to respectively import the plurality of pieces of data to be processed into corresponding task queues for sequential processing according to grouping results of the plurality of devices, and correspondingly obtain a plurality of pieces of result data; and
a data caching module, configured to sequentially store the result data into a first buffer queue.
7. The apparatus of claim 6, wherein the apparatus further comprises:
a queue generating module, configured to generate a second buffer queue according to the first buffer queue, where the result data stored in the second buffer queue is completely the same as the result data currently stored in the first buffer queue;
a data sending module, configured to send the result data in the second buffer queue to a database; and
a queue deleting module, configured to delete the second buffer queue when the second buffer queue is empty.
8. A server, comprising: memory, processor and executable instructions stored in the memory and executable in the processor, characterized in that the processor implements the method according to any of claims 1-5 when executing the executable instructions.
9. A time series data processing system, comprising: a plurality of devices, a server according to claim 8 and a database; the server receives a plurality of pieces of data to be processed from the plurality of devices and sends a plurality of pieces of result data to the database.
10. A computer-readable storage medium having computer-executable instructions stored thereon, wherein the executable instructions, when executed by a processor, implement the method of any of claims 1-5.
CN201910549734.8A 2019-06-24 2019-06-24 Time sequence data processing method, device, system, server and readable storage medium Active CN112134909B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910549734.8A CN112134909B (en) 2019-06-24 2019-06-24 Time sequence data processing method, device, system, server and readable storage medium
PCT/CN2020/084962 WO2020259017A1 (en) 2019-06-24 2020-04-15 Time sequence data processing method, apparatus and system, and server and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910549734.8A CN112134909B (en) 2019-06-24 2019-06-24 Time sequence data processing method, device, system, server and readable storage medium

Publications (2)

Publication Number Publication Date
CN112134909A true CN112134909A (en) 2020-12-25
CN112134909B CN112134909B (en) 2022-04-19

Family

ID=73849297

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910549734.8A Active CN112134909B (en) 2019-06-24 2019-06-24 Time sequence data processing method, device, system, server and readable storage medium

Country Status (2)

Country Link
CN (1) CN112134909B (en)
WO (1) WO2020259017A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282604A (en) * 2021-07-14 2021-08-20 北京远舢智能科技有限公司 High-availability time sequence database cluster system realized based on message queue
CN114553700A (en) * 2022-02-24 2022-05-27 树根互联股份有限公司 Equipment grouping method and device, computer equipment and storage medium
CN116089414A (en) * 2023-04-10 2023-05-09 之江实验室 Time sequence database writing performance optimization method and device based on mass data scene

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113220780B (en) * 2021-04-29 2023-12-05 北京字跳网络技术有限公司 Data processing method, device, equipment and medium
CN113377501A (en) * 2021-06-08 2021-09-10 中国农业银行股份有限公司 Data processing method, apparatus, device, medium, and program product
CN114500673A (en) * 2022-02-15 2022-05-13 江苏提米智能科技有限公司 Encoding method and decoding method based on time series data transmission, device, electronic equipment and storage medium
CN116192985B (en) * 2023-02-08 2023-11-21 广东保伦电子股份有限公司 Data transmission method, device and system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101682565A (en) * 2007-03-12 2010-03-24 思杰系统有限公司 Systems and methods for dynamic bandwidth control by proxy
US20100148940A1 (en) * 1999-10-06 2010-06-17 Gelvin David C Apparatus for internetworked wireless integrated network sensors (wins)
CN103109285A (en) * 2010-08-31 2013-05-15 佳能株式会社 Mechanism for autotuning mass data transfer from a sender to a receiver over parallel connections
CN107391719A (en) * 2017-07-31 2017-11-24 南京邮电大学 Distributed stream data processing method and system in a kind of cloud environment
CN108228730A (en) * 2017-12-11 2018-06-29 深圳市买买提信息科技有限公司 Data lead-in method, device, computer equipment and readable storage medium storing program for executing
CN108459917A (en) * 2018-03-15 2018-08-28 欧普照明股份有限公司 A kind of message distribution member, message handling system and message distribution method
CN109145051A (en) * 2018-07-03 2019-01-04 阿里巴巴集团控股有限公司 The data summarization method and device and electronic equipment of distributed data base

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9870385B2 (en) * 2013-04-01 2018-01-16 Hitachi, Ltd. Computer system, data management method, and computer
CN107193539B (en) * 2016-03-14 2020-11-24 北京京东尚科信息技术有限公司 Multithreading concurrent processing method and multithreading concurrent processing system

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282604A (en) * 2021-07-14 2021-08-20 北京远舢智能科技有限公司 High-availability time sequence database cluster system realized based on message queue
CN113282604B (en) * 2021-07-14 2021-10-22 北京远舢智能科技有限公司 High-availability time sequence database cluster system realized based on message queue
CN114553700A (en) * 2022-02-24 2022-05-27 树根互联股份有限公司 Equipment grouping method and device, computer equipment and storage medium
CN116089414A (en) * 2023-04-10 2023-05-09 之江实验室 Time sequence database writing performance optimization method and device based on mass data scene
CN116089414B (en) * 2023-04-10 2023-09-08 之江实验室 Time sequence database writing performance optimization method and device based on mass data scene

Also Published As

Publication number Publication date
WO2020259017A1 (en) 2020-12-30
CN112134909B (en) 2022-04-19

Similar Documents

Publication Publication Date Title
CN112134909B (en) Time sequence data processing method, device, system, server and readable storage medium
EP3726411A1 (en) Data desensitising method, server, terminal, and computer-readable storage medium
US9736034B2 (en) System and method for small batching processing of usage requests
CN110096344A (en) Task management method, system, server cluster and computer-readable medium
US20200285508A1 (en) Method and Apparatus for Assigning Computing Task
CN110413384B (en) Delay task processing method and device, storage medium and electronic equipment
US11188443B2 (en) Method, apparatus and system for processing log data
CN112306719B (en) Task scheduling method and device
CN108874524A (en) Big data distributed task dispatching system
CN111045911B (en) Performance test method, performance test device, storage medium and electronic equipment
CN109783562B (en) Service processing method and device
CN110569252A (en) Data processing system and method
CN114090366A (en) Method, device and system for monitoring data
US20230037783A1 (en) Resource scheduling method and related apparatus
CN111371585A (en) Configuration method and device for CDN node
CN113127564B (en) Parameter synchronization method and device
CN115221033A (en) Interface protocol testing method and device, computer readable medium and electronic equipment
CN116842090A (en) Accounting system, method, equipment and storage medium
CN111738721A (en) Block chain transaction monitoring method and related device
CN116701020A (en) Message delay processing method, device, equipment, medium and program product
CN113407551A (en) Data consistency determining method, device, equipment and storage medium
CN111209263A (en) Data storage method, device, equipment and storage medium
US10623523B2 (en) Distributed communication and task handling to facilitate operations of application system
CN108805741B (en) Fusion method, device and system of power quality data
CN113778700A (en) Message processing method, system, medium and computer system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant