WO2023109806A1 - Method and apparatus for processing active data for internet of things device, and storage medium

Info

Publication number
WO2023109806A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
active
online
cache
time
Application number
PCT/CN2022/138668
Other languages
French (fr)
Chinese (zh)
Inventor
贾水钦
朱明�
任勇强
丁霞
王世杰
Original Assignee
天翼物联科技有限公司
Application filed by 天翼物联科技有限公司
Publication of WO2023109806A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management

Definitions

  • The present application relates to the field of data processing, and in particular to a method, apparatus, and storage medium for processing active data of an Internet of Things device.
  • This application aims to solve one of the technical problems in the related art at least to a certain extent. To this end, the present application proposes a method, device and storage medium for processing active data of an Internet of Things device.
  • An embodiment of the present application provides a method for processing active data of an Internet of Things device, including: obtaining a device report message and pushing it to a distributed message queue, where the device report message includes a message type; constructing, based on a stream computing service, a data stream of the device report messages in the message queue; cleaning the data stream according to the message type to obtain a data stream containing only device active data; adding, modifying, or not processing the device active data in a cache according to the cleaned data stream; adding, modifying, or not processing the device active data in Hbase according to the device active data in the cache; and, based on a batch processing service, batch processing the device active data in Hbase to obtain active basic statistical data and storing the active basic statistical data in a database.
  • The device active data includes the device ID, the online and offline date of the device, and the online and offline time of the device.
  • Adding, modifying, or not processing the device active data in the cache according to the cleaned data stream includes: querying the cache according to the device ID and the online and offline date in the data stream; when the data stream and the cache do not both contain device active data of the same device on the same date, adding the device active data to the cache according to the device ID, the online and offline date, and the online and offline time in the data stream; and when the data stream and the cache both contain device active data of the same device on the same date, modifying or not processing the device active data in the cache according to the online and offline time.
  • Modifying or not processing the device active data in the cache according to the online and offline time includes: when the data stream and the cache both contain device active data of the same device on the same date, comparing the first online and offline time in the data stream with the second online and offline time in the cache; when the first online and offline time is later than the second online and offline time, replacing the second online and offline time in the cache with the first online and offline time; and when the first online and offline time is earlier than or equal to the second online and offline time, performing no processing.
  • Adding, modifying, or not processing the device active data in Hbase according to the device active data in the cache includes: setting a timed task according to business requirements, where the timed task includes an operation interval; and each time the operation interval elapses, adding, modifying, or not processing the device active data in Hbase according to the device active data in the cache.
  • Adding, modifying, or not processing the device active data in Hbase according to the device active data in the cache includes: querying Hbase according to the device ID in the cache; when device active data of the same device exists in both the cache and Hbase, comparing the first online and offline date in the cache with the second online and offline date in Hbase; when the first online and offline date differs from the second online and offline date, adding the device active data to Hbase according to the device ID, the online and offline date, and the online and offline time in the cache; when the first online and offline date is the same as the second online and offline date and the first online and offline time in the cache is earlier than the second online and offline time in Hbase, replacing the device active data in Hbase with the device active data in the cache; and when the first online and offline date is the same as the second online and offline date and the first online and offline time in the cache is later than or equal to the second online and offline time in Hbase, performing no processing.
  • Batch processing the device active data in Hbase based on the batch processing service to obtain active basic statistical data, and storing the active basic statistical data in a database, includes: obtaining the device active data in Hbase in batches based on the batch processing service; parsing the device active data in batches to obtain the user field, product field, and device field in the device active data; assembling the parsed device active data into an online and offline data sequence composed of a plurality of unit data; starting from the last unit data in the online and offline data sequence and iterating toward the front of the sequence, judging from the unit data the active status of the device within a specified period; and, according to the user field, the product field, the device field, and the specified period, performing aggregation calculations on the active status to obtain the active basic statistical data, and storing the active basic statistical data in the database.
  • The unit data includes a device operation and an operation time. Starting from the last unit data in the online and offline data sequence and iterating toward the front of the sequence, judging from the unit data the active status of the device within the specified period includes: when the operation time is within the specified period, determining that the device is active within the specified period; when the operation time is later than the end time of the specified period, taking the previous unit data and judging again; when the operation time is earlier than or equal to the start time of the specified period and the device operation is going online, determining that the device is active within the specified period; and when the operation time is earlier than or equal to the start time of the specified period and the device operation is going offline, determining that the device is inactive within the specified period.
  • An embodiment of the present application provides a system for processing active data of an Internet of Things device, including: a first module, configured to obtain a device report message and push it to a distributed message queue, where the device report message includes a message type; a second module, configured to construct, based on a stream computing service, a data stream of the device report messages in the message queue; a third module, configured to clean the data stream according to the message type to obtain a data stream containing only device active data; and a fourth module, configured to add, modify, or not process the device active data in the cache according to the cleaned data stream.
  • The system further includes a fifth module, configured to add, modify, or not process the device active data in Hbase according to the device active data in the cache.
  • The system further includes a sixth module, configured to batch process the device active data in Hbase based on a batch processing service to obtain active basic statistical data, and to store the active basic statistical data in a database.
  • An embodiment of the present application provides an apparatus, including: at least one processor; and at least one memory for storing at least one program; when the at least one program is executed by the at least one processor, the at least one processor implements the method for processing active data of an Internet of Things device as described in the first aspect.
  • An embodiment of the present application provides a computer storage medium storing a processor-executable program; when executed by a processor, the processor-executable program is used to implement the method described in the first aspect.
  • The beneficial effects of the embodiments of the present application are as follows: a device report message is first obtained and pushed to a distributed message queue, where the device report message includes a message type; based on a stream computing service, a data stream of the device report messages is constructed in the message queue; according to the message type, the data stream is cleaned to obtain a data stream containing only device active data; according to the cleaned data stream, the device active data in the cache is added, modified, or not processed; according to the device active data in the cache, the device active data in Hbase is added, modified, or not processed; and, based on a batch processing service, the device active data in Hbase is batch processed to obtain active basic statistical data, which is stored in a database.
  • This application processes device active data through stream processing services, which can further meet the high concurrency requirements generated by massive device data. Moreover, this application introduces cache as a buffer for device active data, which can reduce data interaction with Hbase and reduce a large amount of disk I/O processing, thereby further improving data processing performance.
  • FIG. 1 is a flow chart of the steps of the method for processing active data of an Internet of Things device provided by an embodiment of the present application;
  • FIG. 2 is an implementation flow chart of the data collection optimization algorithm proposed by an embodiment of the present application;
  • FIG. 3 is a flow chart of the steps of the process, provided by an embodiment of the present application, for synchronizing device active data from the cache to Hbase;
  • FIG. 4 is a schematic diagram of a system for processing active data of an Internet of Things device provided by an embodiment of the present application;
  • FIG. 5 is a schematic diagram of the apparatus provided by an embodiment of the present application.
  • FIG. 1 is a flow chart of the steps of the active data processing method of the Internet of Things device provided by the embodiment of the present application.
  • the method includes but is not limited to steps S100-S150:
  • the Internet of Things platform accesses a large number of different types of Internet of Things devices through various protocols such as MQTT (Message Queuing Telemetry Transport) and LWM2M (Lightweight Machine-To-Machine, Lightweight M2M).
  • A device needs to report its own information to the Internet of Things platform; this information is called the device report message. Because the volume of device report messages is often huge, after the IoT platform obtains them it performs only simple parsing and pushes them to the distributed message queue, where they are stored temporarily.
  • the information reported by the device can include various conditions of the IoT device during operation, so the information reported by the device may include device failure data, device energy consumption data, or device activity data.
  • The message type of the device report message can be used to distinguish among these multiple types of device data.
  • Based on the stream computing service, a data source stream object is built and the data source to be consumed in the distributed message queue is specified, so as to construct the data stream of the device report messages in the message queue; the stream computing engine then monitors in real time the data consumed from the message queue (that is, the device report messages).
  • the message reported by the device reflecting different situations of the device can be distinguished through the message type of the information reported by the device.
  • Because the embodiment of the present application proposes a method for processing active data of IoT devices, in this step the data stream is cleaned according to the message type of the device report message; specifically, non-device-active data in the data stream, such as device failure data and device energy consumption data, is cleaned and filtered out to obtain a data stream containing only device active data.
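The cleaning step can be pictured as a filter over the data stream keyed on the message type. The following is a minimal sketch in Python rather than a stream-computing-engine job; the field name `message_type` and the labels `device_online` and `device_offline` are assumptions made for illustration, since the source does not name the message types.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical message-type labels for device active data (not from the source).
ACTIVE_TYPES = {"device_online", "device_offline"}


def clean_stream(messages: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Yield only the messages whose type marks them as device active data."""
    for msg in messages:
        # Failure data, energy consumption data, and anything untyped are dropped.
        if msg.get("message_type") in ACTIVE_TYPES:
            yield msg


reports = [
    {"device_id": "dev-1", "message_type": "device_online"},
    {"device_id": "dev-1", "message_type": "device_failure"},
    {"device_id": "dev-2", "message_type": "device_offline"},
]
cleaned = list(clean_stream(reports))
```

In this sketch `cleaned` keeps the first and third messages only; a real deployment would express the same predicate as a filter operator in whichever stream computing engine is used.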
  • A device being in the active state generally means that it is online, and a device being in the inactive state generally means that it is offline; the device active data is therefore data that reflects the online and offline state of the device.
  • Device active data includes, but is not limited to, the device ID, the online and offline date of the device, and the online and offline time of the device. It can be understood that the online and offline date is divided into an online date and an offline date, and the online and offline time is divided into an online time and an offline time; the time at which the device goes online or goes offline is its online time or offline time.
  • the embodiment of the present application introduces a cache in the process of processing the active data of the Internet of Things device.
  • the cache is a window aggregation database.
  • The device active data first enters the cache from the message queue, and the data collection optimization algorithm proposed in the embodiment of the present application runs in the cache to further screen and consolidate the device active data; the data in the cache is then merged into Hbase. This greatly reduces the amount of data entering Hbase and, by reducing the amount of data, reduces disk I/O processing, thereby improving data processing capability.
  • As noted above, device active data includes, but is not limited to, the device ID, the online and offline date of the device, and the online and offline time of the device.
  • the data acquisition optimization algorithm proposed in the embodiment of the present application can integrate the device activity data of the data stream into the device activity data in the cache according to the device ID and other data. The specific flow of the data acquisition optimization algorithm is described below.
  • Fig. 2 is the implementation flowchart of the data acquisition optimization algorithm proposed by the embodiment of the present application, the method includes but not limited to steps S200-S250:
  • the query is performed in the cache using the device ID and the online and offline dates in the data stream as keys.
  • Data that reflects device activity, whether it is held in the data stream, the cache, or Hbase, is collectively referred to as device active data; device active data held in these different places can be compared to determine the subsequent data processing steps.
  • Step S200 queries the device active data in the cache using two conditions, the device ID and the online and offline date. If the device ID in the data stream is found in the cache, data of the same device exists in both the data stream and the cache; if the online and offline date of that device in the cache is the same as the online and offline date in the data stream, device active data of the same device on the same date exists in both the data stream and the cache.
  • This step judges the result of the query performed in step S200, that is, whether device active data of the same device on the same date exists in both the data stream and the cache; if not, go to step S220; if yes, go to step S230.
  • If step S210 determines that device active data of the same device on the same date does not exist in both the data stream and the cache, a piece of device active data is added to the cache according to the device ID, the online and offline date, and the online and offline time.
  • If step S210 determines that device active data of the same device on the same date exists in both the data stream and the cache, a further judgment is made in this step: the first online and offline time in the data stream is compared with the second online and offline time in the cache.
  • The online and offline time is divided into an online time and an offline time. It can be understood that when the first online and offline time is compared with the second online and offline time, the online time in the first online and offline time is compared with the online time in the second online and offline time, and the offline time in the first online and offline time is compared with the offline time in the second online and offline time.
  • In this step, it is judged whether the first online and offline time in the data stream is later than the second online and offline time in the cache; if not, go to step S240; if yes, go to step S250.
  • If step S230 determines that the first online and offline time in the data stream is earlier than or equal to the second online and offline time in the cache, the device active data in the cache is not processed.
  • That the first online and offline time is earlier than or equal to the second online and offline time specifically means that the online time in the first online and offline time is earlier than or equal to the online time in the second online and offline time, and the offline time in the first online and offline time is earlier than or equal to the offline time in the second online and offline time.
  • If step S230 determines that the first online and offline time in the data stream is later than the second online and offline time in the cache, the second online and offline time corresponding to the device ID in the cache is replaced with the first online and offline time.
  • That the first online and offline time is later than the second online and offline time specifically means that the online time in the first online and offline time is later than the online time in the second online and offline time, and/or the offline time in the first online and offline time is later than the offline time in the second online and offline time; in that case, the second online and offline time is replaced with the first online and offline time.
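Steps S200 to S250 can be sketched as a single merge function. This is a minimal illustration under stated assumptions, not the applicant's implementation: the cache is modeled as a plain Python dictionary keyed by (device ID, online and offline date), the function name `merge_into_cache` is hypothetical, and "later" is read as "the online time or the offline time in the data stream is later than its cached counterpart".

```python
from datetime import time
from typing import Dict, Tuple

CacheKey = Tuple[str, str]   # (device ID, online and offline date)
Times = Tuple[time, time]    # (online time, offline time)


def merge_into_cache(cache: Dict[CacheKey, Times], device_id: str,
                     date: str, times: Times) -> str:
    """Apply one cleaned stream record to the cache and report what happened."""
    key = (device_id, date)
    cached = cache.get(key)              # S200: query by device ID and date
    if cached is None:                   # S210 -> S220: no record yet, add one
        cache[key] = times
        return "added"
    # S230: compare online time with online time, offline with offline.
    if times[0] > cached[0] or times[1] > cached[1]:
        cache[key] = times               # S250: stream record is later, replace
        return "modified"
    return "unchanged"                   # S240: earlier or equal, do nothing


cache: Dict[CacheKey, Times] = {}
r1 = merge_into_cache(cache, "dev-1", "2022-12-13", (time(8, 0), time(9, 0)))
r2 = merge_into_cache(cache, "dev-1", "2022-12-13", (time(7, 0), time(8, 30)))
r3 = merge_into_cache(cache, "dev-1", "2022-12-13", (time(8, 0), time(10, 0)))
```

However many records arrive for one device on one date, the dictionary holds a single entry for that key, which is the reduction in stored data that the algorithm aims for.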
  • the embodiment of the present application provides a data collection optimization algorithm, and running the data collection optimization algorithm on the data stream can greatly simplify the active data of the device.
  • For example, under the traditional approach a device may store as many as 86,400 records per day (equivalent to one record per second), whereas calculating daily activity with the device activity optimization collection algorithm requires storing only one original record for the daily and monthly activity calculations.
  • As another example, under the traditional approach the daily activity of a device may store as many as 96 records per day (equivalent to one record every 15 minutes), whereas calculating daily activity with the device activity optimization collection algorithm again requires storing only one original record for the daily and monthly activity calculations. The data collection optimization algorithm in the embodiment of the present application therefore helps to reduce the amount of data stored in the cache, and effectively avoids the storage and computation pressure that devices going online and offline at high frequency place on the system.
  • Step S130 has been described through the above content, and step S140 will be described below.
  • A timed task can be set according to business requirements; the timed task includes the length of the operation interval. Each time the operation interval elapses, the device active data in Hbase is added, modified, or not processed according to the device active data in the cache, so that the device active data in the cache is synchronized to Hbase at regular intervals.
  • the data in the cache can be further simplified, and the number of data interactions with Hbase can be further reduced, thereby improving the performance of data processing.
  • the following describes the synchronization process of device active data from cache to Hbase.
  • Fig. 3 is a flow chart of the steps of the synchronization process of the device active data from the cache to the Hbase provided by the embodiment of the present application.
  • the method includes but is not limited to steps S300-S370:
  • If Hbase contains a device ID identical to a device ID in the cache, device active data of the same device exists in both the cache and Hbase. This step therefore judges whether device active data of the same device exists in both the cache and Hbase; if not, go to step S320; if yes, go to step S330.
  • If step S310 determines that device active data of the same device does not exist in both the cache and Hbase, the device active data corresponding to the device ID is added to Hbase according to the device ID, the online and offline date, and the online and offline time. It can be understood that, from then on, the device active data corresponding to the device ID is recorded under the newly created device ID.
  • If step S310 determines that device active data of the same device exists in both the cache and Hbase, the date corresponding to the device ID is judged further: the online date in the first online and offline date (in the cache) is compared with the online date in the second online and offline date (in Hbase), and the offline date in the first online and offline date is compared with the offline date in the second online and offline date.
  • If the first online and offline date is different from the second online and offline date, that is, the online date in the first online and offline date is different from the online date in the second online and offline date, and the offline date in the first online and offline date is different from the offline date in the second online and offline date, the device active data in the cache is inserted under the device ID in Hbase.
  • If the first online and offline date is the same as the second online and offline date, that is, the online date in the first online and offline date is the same as the online date in the second online and offline date, and the offline date in the first online and offline date is the same as the offline date in the second online and offline date, it is further judged whether the first online and offline time in the cache is earlier than the second online and offline time in Hbase; if not, go to step S360; if so, go to step S370.
  • The online time in the first online and offline time is compared with the online time in the second online and offline time, and the offline time in the first online and offline time is compared with the offline time in the second online and offline time.
  • If the first online and offline time is later than or equal to the second online and offline time in Hbase, that is, the online time in the first online and offline time is later than or equal to the online time in the second online and offline time, and the offline time in the first online and offline time is later than or equal to the offline time in the second online and offline time, the device active data in Hbase is not processed.
  • If the first online and offline time is earlier than the second online and offline time in Hbase, that is, the online time in the first online and offline time is earlier than the online time in the second online and offline time, and/or the offline time in the first online and offline time is earlier than the offline time in the second online and offline time, the device active data in Hbase is replaced with the device active data in the cache; specifically, the device active data in Hbase is deleted, and a piece of device active data is added based on the device active data in the cache.
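The synchronization from the cache to Hbase in steps S300 to S370 can be sketched as follows. This is an illustration only: both stores are modeled as plain Python dictionaries rather than real cache or HBase clients, the function name `sync_cache_to_store` is hypothetical, "the dates are the same" is read as "a record for that date already exists under the device ID", and times are zero-padded `HH:MM:SS` strings so that string comparison matches chronological order.

```python
from collections import Counter
from typing import Dict, Tuple

Times = Tuple[str, str]                    # (online time, offline time)
Cache = Dict[Tuple[str, str], Times]       # (device ID, date) -> times
Store = Dict[str, Dict[str, Times]]        # device ID -> {date: times}


def sync_cache_to_store(cache: Cache, store: Store) -> Counter:
    """Run one pass of the timed task and count the outcome per record."""
    outcome: Counter = Counter()
    for (device_id, date), (online, offline) in cache.items():
        by_date = store.setdefault(device_id, {})   # S300/S310: look up device ID
        existing = by_date.get(date)                 # S330/S340: compare dates
        if existing is None:
            by_date[date] = (online, offline)        # S320/S350: add a record
            outcome["added"] += 1
        elif online < existing[0] or offline < existing[1]:
            by_date[date] = (online, offline)        # S370: delete and re-add
            outcome["modified"] += 1
        else:
            outcome["unchanged"] += 1                # S360: no processing
    return outcome


cache: Cache = {
    ("dev-1", "2022-12-13"): ("08:00:00", "09:00:00"),
    ("dev-2", "2022-12-13"): ("08:00:00", "09:00:00"),
}
store: Store = {"dev-1": {"2022-12-13": ("08:30:00", "09:00:00")}}
first_pass = sync_cache_to_store(cache, store)
second_pass = sync_cache_to_store(cache, store)
```

Running the pass from a scheduler at each operation interval, instead of writing every stream record through, is what keeps the number of interactions with Hbase low.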
  • Step S140 has been described, and step S150 will be described below.
  • the embodiment of the present application proposes to establish a batch processing task based on a batch processing engine, thereby performing batch processing on device active data in Hbase, and finally obtaining active basic statistical data, and storing the active basic statistical data in the database.
  • When the IoT platform needs to compile user-level, product-level, or platform-level daily and monthly device activity statistics, it can directly use the active basic statistical data in the database.
  • The method for processing the device active data in Hbase with the batch processing engine includes: first, based on the batch processing service, obtaining the device active data in Hbase in batches; then parsing the device active data in batches to obtain the user field, product field, and device field in the device active data, and assembling the parsed device active data into an online and offline data sequence composed of multiple unit data. Next, according to the active determination optimization algorithm proposed in the embodiment of the present application, the online and offline data sequence is processed logically; that is, starting from the last unit data in the online and offline data sequence and iterating toward the front of the sequence, the active status of the device within a specified period is judged from the unit data.
  • According to the user field, product field, and device field obtained by parsing, the active status is then aggregated over the specified period to obtain active basic statistical data at different levels. For example, it is possible to calculate the active basic statistical data, within a specified period, of the multiple IoT devices belonging to the same user; of all devices of the same product (such as all cameras, face access control units, and other IoT devices in a smart community); or of all devices on the IoT platform. Finally, the calculated active basic statistical data is stored in the database.
  • The implementation process of the active determination optimization algorithm proposed in the embodiment of the present application is described below. First, the last unit data in the online and offline data sequence is determined.
  • The unit data includes a device operation and an operation time.
  • The device operation refers to the device going online or going offline.
  • The operation time refers to the time at which the device goes online or goes offline.
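The backward walk over the online and offline data sequence can be sketched as below, using the four rules stated earlier for the unit data (within the period, later than the end, at or before the start with an online operation, at or before the start with an offline operation). The function name `is_active`, the string labels `"online"` and `"offline"`, the assumption that the sequence is sorted by operation time, and the choice to treat an exhausted sequence as inactive are all illustrative assumptions rather than details given in the source.

```python
from datetime import datetime
from typing import List, Tuple

Unit = Tuple[str, datetime]   # (device operation: "online" or "offline", operation time)


def is_active(sequence: List[Unit], start: datetime, end: datetime) -> bool:
    """Judge whether the device is active within the period (start, end]."""
    for operation, when in reversed(sequence):   # begin with the last unit data
        if when > end:
            continue                  # later than the period end: take the previous unit
        if when > start:
            return True               # operation time falls within the period
        return operation == "online"  # at or before the start: active only if it went online
    return False                      # assumption: no usable unit data means inactive


day_start = datetime(2022, 12, 13, 0, 0, 0)
day_end = datetime(2022, 12, 13, 23, 59, 59)
spans_the_day = [("online", datetime(2022, 12, 12, 23, 0)),
                 ("offline", datetime(2022, 12, 14, 1, 0))]
left_the_day_before = [("offline", datetime(2022, 12, 12, 22, 0))]
```

Because the walk stops at the first unit data that is not later than the end of the period, most devices are decided after inspecting one or two unit data rather than the whole sequence.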
  • In summary, the embodiment of the present application provides a method for processing active data of an Internet of Things device.
  • The method includes: first obtaining a device report message and pushing it to a distributed message queue, where the device report message includes a message type; based on a stream computing service, constructing a data stream of the device report messages in the message queue; according to the message type, cleaning the data stream to obtain a data stream containing only device active data; according to the cleaned data stream, adding, modifying, or not processing the device active data in the cache using the data collection optimization algorithm; according to the device active data in the cache, adding, modifying, or not processing the device active data in Hbase; and, based on a batch processing service, batch processing the device active data in Hbase using the active determination optimization algorithm to obtain active basic statistical data, and storing the active basic statistical data in the database.
  • This application processes device active data through stream processing services, which can further meet the high concurrency requirements generated by massive device data. Moreover, this application introduces cache as a buffer for device active data, which can reduce data interaction with Hbase and reduce a large amount of disk I/O processing, thereby further improving data processing performance.
  • A push queue for device report messages is created on Pulsar; the IoT platform uses it to receive the messages reported by a large number of devices and push them to the designated queue.
  • After the stream computing service obtains a device report message, it first cleans the message, filtering out messages that are not device online or offline messages, and then parses the cleaned data to identify the user, product, device, operation time, operation identifier, and other information, forming structured data.
  • The scheduling service of the IoT platform triggers the batch processing service at 00:05 every night to calculate the daily and monthly active status of the massive number of devices on the platform.
  • The batch processing service loads the massive device online and offline data from Hbase in one pass and parses it into a structured list.
  • The active determination optimization algorithm is applied to the device online and offline data list; refer to step S150 above for the specific processing method. To calculate the daily activity of devices, the specified period is a specified day; to calculate the monthly activity of devices, the specified period is a specified month. Active devices are marked as 1 and inactive devices are marked as 0.
  • The device data is aggregated by user, product and time range to generate active basic statistical data.
  • The massive active basic statistical data is then inserted into a MySQL database for subsequent business-layer statistics.
  • Using this basic data, the IoT platform can compile daily and monthly activity statistics for a user's devices, for devices at the product level, or for devices at the platform level.
  • FIG. 4 is a schematic diagram of an active data processing system of an Internet of Things device provided by an embodiment of the present application.
  • The system 400 includes a first module 410, a second module 420, a third module 430, a fourth module 440, a fifth module 450 and a sixth module 460.
  • The first module is used to obtain a device report message and push it to a distributed message queue, wherein the device report message includes a message type; the second module is used to construct, based on a stream computing service, a data stream of the device report messages in the message queue; the third module is used to clean the data stream according to the message type and obtain a data stream containing only device active data; the fourth module is used to add, modify or leave unprocessed the device active data in the cache according to the cleaned data stream; the fifth module is used to add, modify or leave unprocessed the device active data in HBase according to the device active data in the cache; the sixth module is used to batch-process, based on a batch processing service, the device active data in HBase to obtain active basic statistical data and store the active basic statistical data in a database.
  • FIG. 5 is a schematic diagram of a device provided by an embodiment of the present application.
  • The device 500 includes at least one processor 510 and at least one memory 520 for storing at least one program; FIG. 5 takes one processor and one memory as an example.
  • the processor and the memory may be connected through a bus or in other ways, and connection through a bus is taken as an example in FIG. 5 .
  • memory can be used to store non-transitory software programs and non-transitory computer-executable programs.
  • the memory may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, flash memory device or other non-transitory solid-state storage device.
  • the memory optionally includes memory located remotely from the processor, which remote memory may be connected to the device via a network. Examples of the aforementioned networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • Another embodiment of the present application further provides an apparatus, which can be used to execute the control method in any of the above embodiments, for example, execute the method steps in FIG. 1 described above.
  • the device embodiments described above are only illustrative, and the units described as separate components may or may not be physically separated, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • The embodiment of the present application also discloses a computer storage medium storing a processor-executable program which, when executed by a processor, implements the active data processing method for Internet of Things devices proposed in this application.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store desired information and that can be accessed by a computer.
  • Communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Abstract

A method and apparatus for processing active data for an Internet of things device, and a storage medium. The method comprises: obtaining a device reporting message, and pushing the device reporting message to a distributed message queue, the device reporting message comprising a message type (S100); constructing, on the basis of a stream computing service, a data stream of the device reporting message in the message queue (S110); cleaning the data stream according to the message type to obtain a data stream only comprising device active data (S120); processing device active data in a cache according to the data stream (S130); processing device active data in Hbase according to the device active data in the cache (S140); and performing batch processing on the device active data in the HBase on the basis of a batch processing service to obtain active basic statistical data and storing same into a database (S150). In the method, high concurrency requirements generated by mass device data are satisfied by means of a stream processing service; the cache is introduced as a buffer, thereby reducing data interaction with HBase, and improving data processing performance.

Description

Active data processing method, apparatus and storage medium for Internet of Things devices
Technical field
The present application relates to the field of data processing, and in particular to an active data processing method, apparatus and storage medium for Internet of Things devices.
Background
With the development of Internet of Things technology, the number of devices in the Internet of Things keeps growing, and device active data is an important indicator and operational data for running an Internet of Things platform. Because the number of devices is huge, device forms differ, and device behavior varies widely, schemes in the related art that use traditional database collection and recording to gather active data such as daily active data and monthly active data can no longer meet the high-concurrency data requirements generated by massive numbers of devices.
Summary of the invention
The present application aims to solve, at least to some extent, one of the technical problems in the related art. To this end, the present application proposes an active data processing method, apparatus and storage medium for Internet of Things devices.
In a first aspect, an embodiment of the present application provides an active data processing method for Internet of Things devices, including: obtaining a device report message and pushing the device report message to a distributed message queue, wherein the device report message includes a message type; constructing, based on a stream computing service, a data stream of the device report messages in the message queue; cleaning the data stream according to the message type to obtain the data stream containing only device active data; adding, modifying or leaving unprocessed the device active data in a cache according to the cleaned data stream; adding, modifying or leaving unprocessed the device active data in HBase according to the device active data in the cache; and, based on a batch processing service, batch-processing the device active data in HBase to obtain active basic statistical data, and storing the active basic statistical data in a database.
Optionally, the device active data includes a device ID, an online/offline date of the device and an online/offline time of the device, and adding, modifying or leaving unprocessed the device active data in the cache according to the cleaned data stream includes: querying the cache according to the device ID and the online/offline date in the data stream; when the query finds that the data stream and the cache do not contain device active data of the same device on the same date, adding the device active data to the cache according to the device ID, the online/offline date and the online/offline time in the data stream; and when the query finds that the data stream and the cache contain device active data of the same device on the same date, modifying the device active data in the cache or leaving it unprocessed according to the online/offline time.
Optionally, modifying the device active data in the cache or leaving it unprocessed according to the online/offline time, when the query finds that the data stream and the cache contain device active data of the same device on the same date, includes: comparing a first online/offline time in the data stream with a second online/offline time in the cache; when the first online/offline time is later than the second online/offline time, replacing the second online/offline time in the cache with the first online/offline time; and when the first online/offline time is earlier than or equal to the second online/offline time, performing no processing.
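The add / modify / no-processing decision described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the cache is modeled as an in-memory dictionary, the names `ActiveRecord` and `update_cache` are hypothetical, and a single zero-padded time string stands in for either an online time or an offline time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActiveRecord:
    """One piece of device active data (field names are assumptions)."""
    device_id: str
    date: str  # online/offline date, e.g. "2022-12-13"
    time: str  # online/offline time, zero-padded so strings sort chronologically


def update_cache(cache: dict, record: ActiveRecord) -> str:
    """Apply one cleaned stream record to the cache and report the action taken."""
    key = (record.device_id, record.date)  # same device on the same date
    cached = cache.get(key)
    if cached is None:
        cache[key] = record  # no entry for this device and date: add
        return "add"
    if record.time > cached.time:
        cache[key] = record  # stream time is later than cached time: modify
        return "modify"
    return "skip"  # earlier than or equal to the cached time: no processing
```

Keying on the pair (device ID, date) means at most one cache entry is kept per device per day, which is what lets the cache absorb repeated reports before anything is written to HBase.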
Optionally, adding, modifying or leaving unprocessed the device active data in HBase according to the device active data in the cache includes: setting a timed task according to business requirements, the timed task including an operation interval; and each time the operation interval elapses, adding, modifying or leaving unprocessed the device active data in HBase according to the device active data in the cache.
Optionally, adding, modifying or leaving unprocessed the device active data in HBase according to the device active data in the cache includes: querying HBase according to the device ID in the cache; when the cache and HBase contain device active data of the same device, comparing a first online/offline date in the cache with a second online/offline date in HBase; when the first online/offline date differs from the second online/offline date, adding the device active data to HBase according to the device ID, the online/offline date and the online/offline time in the cache; when the first online/offline date is the same as the second online/offline date and a first online/offline time in the cache is earlier than a second online/offline time in HBase, replacing the device active data in HBase with the device active data in the cache; and when the first online/offline date is the same as the second online/offline date and the first online/offline time in the cache is later than or equal to the second online/offline time in HBase, performing no processing.
Optionally, batch-processing the device active data in HBase based on a batch processing service to obtain active basic statistical data, and storing the active basic statistical data in a database, includes: obtaining the device active data in HBase in batches based on the batch processing service; parsing the device active data in batches to obtain a user field, a product field and a device field in the device active data; assembling the parsed device active data into an online/offline data sequence composed of multiple unit data; recursing from the last unit data in the online/offline data sequence toward earlier unit data and judging, according to the unit data, the active status of the device within a specified period; and aggregating the active status according to the user field, the product field, the device field and the specified period to obtain the active basic statistical data, and storing the active basic statistical data in the database.
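As a rough illustration of the aggregation step, the sketch below counts active devices per user, product and period. The function name `aggregate_active`, the tuple layout of the input and the shape of the result are assumptions made for the example; the source only states that the active status is aggregated over these dimensions.

```python
from collections import Counter


def aggregate_active(flags, period):
    """Count active devices per (user, product, period).

    `flags` is an iterable of (user, product, device, active) tuples,
    where active is 1 for an active device and 0 for an inactive one.
    """
    counts = Counter()
    for user, product, _device, active in flags:
        counts[(user, product, period)] += active
    return dict(counts)
```

Summing the per-user rows of such a result would give product-level or platform-level totals, so only the finest-grained counts need to be stored.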
Optionally, the unit data includes a device operation and an operation time, and recursing from the last unit data in the online/offline data sequence toward earlier unit data and judging, according to the unit data, the active status of the device within the specified period includes: when the operation time falls within the specified period, judging that the device is active within the specified period; when the operation time is later than the end time of the specified period, taking the previous unit data and judging again; when the operation time is earlier than or equal to the start time of the specified period and the device operation is going online, judging that the device is active within the specified period; and when the operation time is earlier than or equal to the start time of the specified period and the device operation is going offline, judging that the device is inactive within the specified period.
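The four rules above can be sketched as a backward scan over the sequence. The sketch makes several assumptions that are not in the source: the function name `is_active`, the string values "online" and "offline", the use of a loop in place of recursion, treating "within the period" as strictly after the start and no later than the end, and reporting a device as inactive when no usable unit data remains.

```python
from datetime import datetime


def is_active(units, start, end):
    """Judge activity within [start, end] from chronologically ordered unit data.

    `units` is a list of (operation, time) pairs, operation being
    "online" or "offline" (assumed encoding).
    """
    for operation, when in reversed(units):
        if when > end:
            continue  # later than the period: take the previous unit data
        if when > start:
            return True  # an operation happened inside the period
        # At or before the start: the device is active only if it was left online.
        return operation == "online"
    return False  # assumption: nothing usable means inactive
```

For daily activity, `start` and `end` would bound a single day; for monthly activity, a single month.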
In a second aspect, an embodiment of the present application provides an active data processing system for Internet of Things devices, including: a first module, configured to obtain a device report message and push the device report message to a distributed message queue, wherein the device report message includes a message type; a second module, configured to construct, based on a stream computing service, a data stream of the device report messages in the message queue; a third module, configured to clean the data stream according to the message type to obtain the data stream containing only device active data; a fourth module, configured to add, modify or leave unprocessed the device active data in a cache according to the cleaned data stream; a fifth module, configured to add, modify or leave unprocessed the device active data in HBase according to the device active data in the cache; and a sixth module, configured to batch-process, based on a batch processing service, the device active data in HBase to obtain active basic statistical data, and to store the active basic statistical data in a database.
In a third aspect, an embodiment of the present application provides an apparatus, including: at least one processor; and at least one memory for storing at least one program; when the at least one program is executed by the at least one processor, the at least one processor implements the active data processing method for Internet of Things devices described in the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer storage medium storing a processor-executable program which, when executed by the processor, implements the active data processing method for Internet of Things devices described in the first aspect.
The beneficial effects of the embodiments of the present application are as follows: a device report message is obtained and pushed to a distributed message queue, wherein the device report message includes a message type; based on a stream computing service, a data stream of the device report messages is constructed in the message queue; the data stream is cleaned according to the message type to obtain a data stream containing only device active data; the device active data in the cache is added, modified or left unprocessed according to the cleaned data stream; the device active data in HBase is added, modified or left unprocessed according to the device active data in the cache; and, based on a batch processing service, the device active data in HBase is batch-processed to obtain active basic statistical data, which is stored in a database. The present application processes device active data through a stream processing service, which can further meet the high-concurrency requirements generated by massive device data. The present application also introduces a cache as a buffer for device active data, which reduces data interaction with HBase and avoids a large amount of disk I/O, thereby further improving data processing performance.
Description of the drawings
The accompanying drawings provide a further understanding of the technical solution of the present application and constitute a part of the specification. Together with the embodiments of the present application, they serve to explain the technical solution and do not limit it.
FIG. 1 is a flow chart of the steps of the active data processing method for Internet of Things devices provided by an embodiment of the present application;
FIG. 2 is an implementation flow chart of the data collection optimization algorithm proposed by an embodiment of the present application;
FIG. 3 is a flow chart of the steps of synchronizing device active data from the cache to HBase, provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of the active data processing system for Internet of Things devices provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of the apparatus provided by an embodiment of the present application.
Detailed description
To make the purpose, technical solution and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the present application and do not limit it.
It should be noted that although functional modules are divided in the system schematic diagram and a logical order is shown in the flow charts, in some cases the steps shown or described may be executed with a module division different from that in the system, or in an order different from that in the flow charts. The terms "first", "second" and the like in the specification, the claims and the above drawings distinguish similar objects and do not necessarily describe a specific order or sequence.
The embodiments of the present application are further described below with reference to the accompanying drawings.
Referring to FIG. 1, FIG. 1 is a flow chart of the steps of the active data processing method for Internet of Things devices provided by an embodiment of the present application. The method includes but is not limited to steps S100-S150:
S100. Obtain a device report message and push the device report message to a distributed message queue, wherein the device report message includes a message type;
Specifically, the IoT platform connects a huge number of IoT devices of various types through multiple protocols such as MQTT (Message Queuing Telemetry Transport) and LWM2M (Lightweight Machine-to-Machine). In the daily operation of the Internet of Things, devices need to report their own information to the IoT platform; this information is called device report messages. Because the volume of device report messages is often very large, the IoT platform, after obtaining them, performs a simple parse and pushes them to a distributed message queue, where they are stored temporarily.
It can be understood that device report messages can cover the various conditions of an IoT device during operation, so they may include device fault data, device energy consumption data or device active data. The message type of a device report message distinguishes these types of device data.
S110. Based on a stream computing service, construct a data stream of the device report messages in the message queue;
Specifically, based on the stream computing service, a data source stream object is built and the data source to be consumed in the distributed message queue is specified. This constructs the data stream of device report messages in the message queue, and the stream computing engine monitors the consumed data in the message queue (that is, the device report messages) in real time.
S120. Clean the data stream according to the message type to obtain a data stream containing only device active data;
Specifically, as mentioned above, the message type of a device report message distinguishes messages that reflect different conditions of a device. Since the embodiment of the present application concerns a method for processing the active data of IoT devices, this step cleans the data stream according to the message type of each device report message: data that is not device active data, such as device fault data and device energy consumption data, is cleaned and filtered out, leaving a data stream that contains only device active data.
It can be understood that a device in the active state is generally online and, likewise, a device in the inactive state is generally offline, so device active data is data that reflects the online/offline state of a device.
More specifically, device active data includes but is not limited to the device ID, the online/offline date of the device and the online/offline time of the device. It can be understood that the online/offline date is either an online date or an offline date, and the online/offline time is either an online time or an offline time: the time at which a device goes online or goes offline is its online time or offline time.
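The cleaning in step S120 amounts to a filter on the message type. The sketch below uses plain Python dictionaries in place of a real stream computing engine; the function name `clean_stream`, the keys `type`, `device_id`, `date` and `time`, and the type values "online" and "offline" are all hypothetical stand-ins for whatever the platform's message schema defines.

```python
def clean_stream(messages, active_types=frozenset({"online", "offline"})):
    """Yield only device active data from an iterable of device report messages."""
    for msg in messages:
        if msg.get("type") not in active_types:
            continue  # fault data, energy data, etc. are filtered out
        yield {
            "device_id": msg["device_id"],
            "operation": msg["type"],  # going online or going offline
            "date": msg["date"],       # online/offline date
            "time": msg["time"],       # online/offline time
        }
```

Because the function is a generator, it processes messages one at a time as they arrive, which loosely mirrors how a stream operator would behave.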
S130. According to the cleaned data stream, add, modify or leave unprocessed the device active data in the cache;
Specifically, the embodiment of the present application introduces a cache into the flow for processing the active data of IoT devices. The cache is a window aggregation database: device active data first enters the cache from the message queue, the data collection optimization algorithm proposed in this embodiment is run in the cache to further filter and consolidate the device active data, and the data in the cache is then consolidated into HBase. This greatly reduces the amount of data entering HBase, which in turn reduces disk I/O and improves data processing capability.
As mentioned above, device active data includes but is not limited to the device ID, the online/offline date of the device and the online/offline time of the device. The data collection optimization algorithm proposed in the embodiment of the present application can, based on data such as the device ID, consolidate the device active data in the data stream into the device active data in the cache. The specific flow of the data collection optimization algorithm is described below.
Referring to FIG. 2, FIG. 2 is an implementation flow chart of the data collection optimization algorithm proposed by an embodiment of the present application. The method includes but is not limited to steps S200-S250:
S200. Query the cache according to the device ID and the online/offline date in the data stream;
Specifically, the device ID and the online/offline date in the data stream are used as the key for the query in the cache.
It can be understood that when data moves from the data stream to the cache, or from the cache to HBase, some adjustment of the storage format may be needed, so the device active data in the data stream, the cache and HBase may differ. However, the device active data in all three contains information that reflects the activity of a given device, such as the device ID, the online/offline date and the online/offline time. In the embodiments of the present application, the data stored in the data stream, the cache and HBase that reflects device activity is therefore collectively called device active data, and device active data located in these different places can be compared to determine the subsequent data processing steps.
S210、判断数据流和缓存中是否存在同一设备在同一日期内的设备活跃数据;S210, judging whether there is device active data of the same device in the same date in the data stream and cache;
具体地,步骤S200通过设备ID和上下线日期这两个条件,对缓存内的设备活跃数据进行查询,若缓存中查询到数据流中的设备ID,说明数据流和缓存中存在同一设备的数据;若缓存中查询到该设备的上下线日期与数据流中的上下线信息相同,则说明数据流和缓存中存在同一设备在同一日期内的设备活跃数据。Specifically, step S200 queries the active data of the device in the cache through the two conditions of the device ID and the online and offline date. If the device ID in the data stream is found in the cache, it means that there is data of the same device in the data stream and the cache. ; If the online and offline date of the device queried in the cache is the same as the online and offline information in the data stream, it means that there is device active data of the same device on the same date in the data stream and cache.
因此,本步骤是对步骤S200实现的查询的结果进行判断,也就是判断数据流和缓存中是否存在同一设备在同一日期内的设备活跃数据,若否,跳转到步骤S220;若是,跳转到步骤S230。Therefore, this step is to judge the result of the query realized in step S200, that is, to judge whether there is device active data of the same device in the same date in the data stream and cache, if not, jump to step S220; if yes, jump to Go to step S230.
S220. Add device activity data to the cache according to the device ID, online/offline date, and online/offline time in the data stream.
Specifically, if step S210 determines that the data stream and the cache do not contain device activity data for the same device on the same date, a new device activity record is added to the cache according to the device ID, online/offline date, and online/offline time in the data stream.
S230. Determine whether the first online/offline time in the data stream is later than the second online/offline time in the cache.
Specifically, if step S210 determines that the data stream and the cache contain device activity data for the same device on the same date, this step makes a further determination by comparing the first online/offline time in the data stream with the second online/offline time in the cache.
As noted above, an online/offline time consists of an online time and an offline time. When the first and second online/offline times are compared, the online time in the first is therefore compared with the online time in the second, and the offline time in the first is compared with the offline time in the second.
Using this comparison, determine whether the first online/offline time in the data stream is later than the second online/offline time in the cache. If not, go to step S240; if so, go to step S250.
S240. Leave the device activity data in the cache unchanged.
Specifically, if step S230 determines that the first online/offline time in the data stream is earlier than or equal to the second online/offline time in the cache, the device activity data in the cache is not processed.
Here, "the first online/offline time is earlier than or equal to the second online/offline time" means that the online time in the first is earlier than or equal to the online time in the second, and the offline time in the first is earlier than or equal to the offline time in the second.
S250. Replace the second online/offline time in the cache with the first online/offline time.
Specifically, if step S230 determines that the first online/offline time in the data stream is later than the second online/offline time in the cache, the second online/offline time stored in the cache for that device ID is replaced with the first online/offline time.
Here, "the first online/offline time is later than the second online/offline time" means that the online time in the first is later than the online time in the second, and/or the offline time in the first is later than the offline time in the second. Replacing the second online/offline time with the first accordingly means replacing the online time in the second with the later online time from the first, and/or replacing the offline time in the second with the later offline time from the first.
Steps S200-S250 thus provide a data collection optimization algorithm. Running this algorithm on the data stream greatly reduces the volume of device activity data. Take the reporting frequency of elevator devices in a smart community as an example: conventional daily-activity storage keeps up to 86,400 records a day, whereas with the optimized collection algorithm only 1 original record needs to be stored for the daily and monthly activity calculations. Likewise, for electricity meters in smart power systems, conventional daily-activity storage keeps up to 96 records a day, whereas the optimized collection algorithm needs only 1 original record for the daily and monthly activity calculations. The data collection optimization algorithm of this embodiment therefore helps reduce the amount of data stored in the cache and effectively avoids the storage and computation pressure that devices going online and offline at high frequency place on the system.
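The decision flow of steps S200-S250 can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the application: the cache is modelled as a plain dictionary keyed by device ID and date (a stand-in for a real cache service), times are zero-padded "HH:MM:SS" strings so that string comparison matches chronological order, and a time that has not been reported yet is stored as None. The names ActivityRecord and merge_into_cache are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass
class ActivityRecord:
    device_id: str
    date: str                            # "YYYY-MM-DD"
    online_time: Optional[str] = None    # zero-padded "HH:MM:SS" (assumption)
    offline_time: Optional[str] = None


CacheKey = Tuple[str, str]  # (device_id, date)


def _is_later(new: Optional[str], old: Optional[str]) -> bool:
    # Assumption: a missing cached time is treated as older than any reported time.
    return new is not None and (old is None or new > old)


def merge_into_cache(cache: Dict[CacheKey, ActivityRecord],
                     incoming: ActivityRecord) -> str:
    key = (incoming.device_id, incoming.date)
    cached = cache.get(key)                  # S200/S210: same device, same date?
    if cached is None:
        cache[key] = replace(incoming)       # S220: add a copy as a new record
        return "added"
    changed = False
    # S230/S250: keep the later online time and the later offline time.
    if _is_later(incoming.online_time, cached.online_time):
        cached.online_time = incoming.online_time
        changed = True
    if _is_later(incoming.offline_time, cached.offline_time):
        cached.offline_time = incoming.offline_time
        changed = True
    return "updated" if changed else "unchanged"   # "unchanged" is S240
```

However many messages arrive for a device on a given date, the dictionary holds one entry for that device and date. For scale, 86,400 records a day corresponds to one report per second, and 96 records a day to one report every 15 minutes.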
Step S130 has now been described. Step S140 is described next.
S140. Add, modify, or leave unchanged the device activity data in HBase according to the device activity data in the cache.
Specifically, different services in the Internet of Things have different timing requirements, so a scheduled task can be configured according to service needs; the scheduled task includes an operation interval. Each time the operation interval elapses, the device activity data in HBase is added to, modified, or left unchanged according to the device activity data in the cache, so that the device activity data in the cache is periodically synchronized to HBase.
During synchronization, the data in the cache can be reduced further, which lowers the number of interactions with HBase and so improves data processing performance. The synchronization of device activity data from the cache to HBase is described below.
Referring to FIG. 3, which is a flowchart of the steps for synchronizing device activity data from the cache to HBase according to an embodiment of this application, the method includes, but is not limited to, steps S300-S370:
S300. Query HBase according to the device ID in the cache.
Specifically, the device ID in the cache is used as the key for the query in HBase.
S310. Determine whether the cache and HBase contain device activity data for the same device.
Specifically, if HBase contains a device ID identical to the device ID in the cache, the cache and HBase contain device activity data for the same device. This step therefore determines whether the cache and HBase contain device activity data for the same device. If not, go to step S320; if so, go to step S330.
S320. Add device activity data to HBase according to the device ID, online/offline date, and online/offline time.
Specifically, if step S310 determines that the cache and HBase do not contain device activity data for the same device, the device activity data corresponding to that device ID is added to HBase according to the device ID, online/offline date, and online/offline time. From then on, device activity data for that device ID is recorded under the newly created device ID.
S330. Determine whether the first online/offline date in the cache is the same as the second online/offline date in HBase.
Specifically, if step S310 determines that the cache and HBase contain device activity data for the same device, the online/offline date corresponding to that device ID is examined further.
The online date in the first online/offline date is compared with the online date in the second, and the offline date in the first is compared with the offline date in the second. This step determines whether the first online/offline date in the cache is the same as the second online/offline date in HBase. If not, go to step S340; if so, go to step S350.
S340. Add device activity data to HBase.
Specifically, if step S330 determines that the first and second online/offline dates differ, that is, the online date in the first differs from the online date in the second and the offline date in the first differs from the offline date in the second, the device activity data in the cache is inserted into HBase under that device ID.
S350. Determine whether the first online/offline time in the cache is earlier than the second online/offline time in HBase.
Specifically, if step S330 determines that the first and second online/offline dates are the same, that is, the online date in the first is the same as the online date in the second and the offline date in the first is the same as the offline date in the second, it is further determined whether the first online/offline time in the cache is earlier than the second online/offline time in HBase. If not, go to step S360; if so, go to step S370.
The online time in the first online/offline time is compared with the online time in the second, and the offline time in the first is compared with the offline time in the second.
S360. Discard the current data stream.
Specifically, if step S350 determines that the first online/offline time is later than or equal to the second online/offline time in HBase, that is, the online time in the first is later than or equal to the online time in the second and the offline time in the first is later than or equal to the offline time in the second, HBase is left unchanged and the current data stream in the cache is discarded.
S370. Replace the device activity data in HBase with the device activity data in the cache.
Specifically, if step S350 determines that the first online/offline time is earlier than the second online/offline time in HBase, that is, the online time in the first is earlier than the online time in the second and/or the offline time in the first is earlier than the offline time in the second, the device activity data in HBase is replaced with the device activity data in the cache. Concretely, the device activity data in HBase is deleted and a new device activity record is added according to the device activity data in the cache.
Steps S300-S370 thus provide a process for synchronizing device activity data from the cache to HBase. Step S140 has now been described; step S150 is described next.
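The branch structure of steps S300-S370 can be sketched in Python as follows. This is an illustration under stated assumptions, not the application's implementation: a nested dictionary stands in for the HBase table (no HBase client calls are shown), each device has at most one stored record per date, and times are zero-padded "HH:MM:SS" strings. The names Store and sync_record are hypothetical.

```python
from typing import Dict, Tuple

Times = Tuple[str, str]               # (online_time, offline_time), "HH:MM:SS"
Store = Dict[str, Dict[str, Times]]   # device_id -> date -> times (HBase stand-in)


def sync_record(store: Store, device_id: str, date: str, times: Times) -> str:
    rows = store.get(device_id)               # S300: look up by device ID
    if rows is None:                          # S310: device not present
        store[device_id] = {date: times}      # S320: add under a new device ID
        return "added_device"
    if date not in rows:                      # S330: dates differ
        rows[date] = times                    # S340: add under existing device ID
        return "added_date"
    stored_online, stored_offline = rows[date]
    # S350: "earlier" holds if the online time and/or the offline time is earlier.
    if times[0] < stored_online or times[1] < stored_offline:
        rows[date] = times                    # S370: delete old row, insert new one
        return "replaced"
    return "discarded"                        # S360: leave the store untouched
```

Calling sync_record once per cached record at each operation interval would reproduce the scheduled synchronization described in step S140.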
S150. Using a batch processing service, batch-process the device activity data in HBase to obtain basic activity statistics, and store the basic activity statistics in a database.
Specifically, this embodiment proposes creating batch processing tasks on a batch processing engine to batch-process the device activity data in HBase, finally obtaining basic activity statistics and storing them in a database. Later, when the IoT platform needs daily or monthly device activity statistics at the user level, product level, or platform level, it can compute them directly from the basic activity statistics in the database.
In this embodiment, the device activity data in HBase is processed on the batch processing engine as follows. First, the device activity data in HBase is fetched in bulk by the batch processing service. Then, by extending the batch JobTask, the device activity data is parsed in bulk to obtain the user field, product field, and device field it contains, and the parsed data is assembled into an online/offline data sequence made up of multiple unit data items. Next, the activity determination optimization algorithm proposed in this embodiment is applied to the online/offline data sequence: starting from the last unit data item in the sequence and working backwards, the unit data is used to determine whether the device was active in a specified period. Once the sequence has been processed, the activity results for the specified period are aggregated by the parsed user, product, and device fields to obtain basic activity statistics at different levels. For example, it is possible to compute basic activity statistics for the IoT devices belonging to one user in the specified period, for all devices of the same product (for example, all cameras or all face-recognition access control units in a smart community) in the specified period, or for all devices on the IoT platform. Finally, the computed basic activity statistics are stored in the database.
The activity determination optimization algorithm proposed in this embodiment works as follows. First, take the last unit data item in the online/offline data sequence. A unit data item consists of a device operation and an operation time: the device operation is either the device going online or the device going offline, and the operation time is the time at which it went online or offline. Using the current unit data item as the reference: if the operation time falls within the specified period, the device is judged active in that period; if the operation time is later than the end of the specified period, take the previous unit data item in the sequence and repeat the determination; if the operation time is earlier than or equal to the start of the specified period and the operation is going online, the device is judged active in that period; if the operation time is earlier than or equal to the start of the specified period and the operation is going offline, the device is judged inactive in that period.
In this way the activity determination optimization algorithm is run on the online/offline data sequence, and the basic activity statistics are finally determined.
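The four rules above can be written as a short Python function. This is a sketch under assumptions not stated in the application: the sequence is sorted by operation time in ascending order, operations are the strings "online" and "offline", "within the period" means later than the start and no later than the end, and a device whose sequence runs out without a decision is treated as inactive. The backward recursion is written as an equivalent loop; the name is_active is hypothetical.

```python
from datetime import datetime
from typing import List, Tuple

Unit = Tuple[str, datetime]   # (operation, operation_time)


def is_active(sequence: List[Unit], start: datetime, end: datetime) -> bool:
    # Walk backwards from the last unit data item.
    for operation, when in reversed(sequence):
        if when > end:
            continue                    # after the period: try the previous item
        if when > start:
            return True                 # inside the period: active
        # At or before the start: the device stayed in whatever state it entered.
        return operation == "online"
    return False                        # assumption: no usable item means inactive
```

For a daily figure, start and end would bound one day; for a monthly figure they would bound one month.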
Steps S100-S150 thus provide an active data processing method for Internet of Things devices. The method includes: obtaining device report messages and pushing them to a distributed message queue, where each device report message includes a message type; using a stream computing service to build a data stream of the device report messages in the message queue; cleaning the data stream according to the message type to obtain a data stream containing only device activity data; adding, modifying, or leaving unchanged the device activity data in the cache according to the cleaned data stream, using the data collection optimization algorithm; adding, modifying, or leaving unchanged the device activity data in HBase according to the device activity data in the cache; and, using a batch processing service, batch-processing the device activity data in HBase with the activity determination optimization algorithm to obtain basic activity statistics and storing them in a database. Because device activity data is handled by a stream processing service, the method can better meet the high-concurrency demands created by massive volumes of device data. Because a cache is introduced as a buffer for device activity data, interactions with HBase and the associated disk I/O are greatly reduced, which further improves data processing performance.
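The cleaning step in this summary amounts to a filter on the message type. A minimal Python sketch follows; the field name "message_type" and the values "online" and "offline" are assumptions made for illustration, since the application does not give the message schema, and the function name clean_stream is hypothetical.

```python
from typing import Dict, Iterable, Iterator

# Assumed message types that count as device activity data.
ACTIVITY_TYPES = {"online", "offline"}


def clean_stream(messages: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    # Pass through only online/offline messages; drop every other report.
    for message in messages:
        if message.get("message_type") in ACTIVITY_TYPES:
            yield message
```

Written as a generator, the filter handles one message at a time and never holds the whole stream in memory, which is in keeping with stream processing.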
Next, drawing on the embodiments above and on a practical deployment, the active data processing method for Internet of Things devices provided by this application is described. The specific steps are as follows:
1) Prepare 21 virtual machines, each with 16 GB of memory and an 8-core CPU. Deploy the distributed real-time stream computing engine Flink on 9 servers, with one Flink cluster per 3 servers (Flink clusters 1 to 3); deploy the Redis cache service on 3 servers; deploy the distributed message queue Pulsar on 3 servers; deploy the HBase database on 3 servers; and deploy the MySQL database on 3 servers.
2) Create a push queue for device report messages on Pulsar, so that the IoT platform can receive the messages reported by massive numbers of devices and push them to the designated queue.
3) Deploy the stream computing service for data collection and storage on Flink clusters 1 and 2, and deploy the batch processing service for massive device activity computation on Flink cluster 3.
4) Set the number of job tasks for the stream computing service on Flink clusters 1 and 2 to 30, so that at most 30 data streams are processed at the same time; configure the stream computing service for data collection and storage to listen to the device report message topic queue of the Pulsar cluster.
5) After obtaining device report messages, the stream computing service first cleans them, filtering out messages that are not device online/offline messages. It then parses the cleaned data to identify the user, product, device, operation time, operation flag, and similar information, producing structured data.
6) Apply the data collection optimization algorithm to the structured data; for the specific processing, see steps S200-S250 above.
7) Set the number of job tasks for the batch computing service on Flink cluster 3 to 15, so that at most 15 data streams are batch-processed at the same time.
8) The IoT platform scheduling service triggers the batch processing service at 00:05 every night to compute the daily and monthly activity status of the platform's massive device population for that day.
9) The batch processing service loads the massive device online/offline data from HBase in one pass and parses it into a structured List.
10) Apply the activity determination optimization algorithm to the device online/offline data List; for the specific processing, see step S150 above. To compute daily device activity, the specified period is a given day; to compute monthly device activity, the specified period is a given month. Devices judged active are marked 1, and inactive devices are marked 0.
11) Aggregate and sum the judged device data along the user, product, and time-range dimensions to generate basic activity statistics.
12) Once all 15 job tasks have finished, write the massive basic activity statistics to storage by inserting them into the MySQL database for subsequent business-layer statistics.
13) Once the basic activity data for the massive device population is available, the IoT platform can use it to compute daily and monthly activity statistics at the user level, the product level, or the whole-platform level.
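Steps 10) and 11) above mark each device 1 or 0 and then sum the flags along several dimensions. The following Python sketch shows that aggregation for a single specified period. The row layout (user, product, device ID, flag) and the function name aggregate_activity are assumptions for illustration; the application does not specify the record schema.

```python
from collections import Counter
from typing import Iterable, Tuple

Row = Tuple[str, str, str, int]   # (user, product, device_id, active_flag 0/1)


def aggregate_activity(rows: Iterable[Row]) -> Tuple[Counter, Counter, int]:
    by_user: Counter = Counter()
    by_product: Counter = Counter()
    platform_total = 0
    for user, product, _device_id, flag in rows:
        by_user[user] += flag          # active devices per user
        by_product[product] += flag    # active devices per product
        platform_total += flag         # active devices on the whole platform
    return by_user, by_product, platform_total
```

Because the flags are already 0 or 1, each level of statistic is a plain sum, so the per-task results of the 15 job tasks could be merged by adding the counters together.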
In addition, referring to FIG. 4, which is a schematic diagram of an active data processing system for Internet of Things devices according to an embodiment of this application, the system 400 includes a first module 410, a second module 420, a third module 430, a fourth module 440, a fifth module 450, and a sixth module 460. The first module obtains device report messages and pushes them to a distributed message queue, where each device report message includes a message type. The second module uses a stream computing service to build a data stream of the device report messages in the message queue. The third module cleans the data stream according to the message type to obtain a data stream containing only device activity data. The fourth module adds, modifies, or leaves unchanged the device activity data in the cache according to the cleaned data stream. The fifth module adds, modifies, or leaves unchanged the device activity data in HBase according to the device activity data in the cache. The sixth module uses a batch processing service to batch-process the device activity data in HBase, obtain basic activity statistics, and store them in a database.
Referring to FIG. 5, which is a schematic diagram of an apparatus according to an embodiment of this application, the apparatus 500 includes at least one processor 510 and at least one memory 520 for storing at least one program. FIG. 5 shows one processor and one memory as an example.
The processor and the memory may be connected by a bus or in another manner; FIG. 5 shows a bus connection as an example.
As a non-transitory computer-readable storage medium, the memory can store non-transitory software programs and non-transitory computer-executable programs. The memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some implementations, the memory optionally includes memory located remotely from the processor, and such remote memory may be connected to the apparatus over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Another embodiment of this application further provides an apparatus that can be used to perform the method of any of the embodiments above, for example the method steps of FIG. 1 described above.
The apparatus embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The embodiments of this application also disclose a computer storage medium storing a processor-executable program which, when executed by a processor, implements the active data processing method for Internet of Things devices proposed in this application.
Those of ordinary skill in the art will understand that all or some of the steps and systems of the methods disclosed above may be implemented as software, firmware, hardware, or suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. It is also well known to those of ordinary skill in the art that communication media typically contain computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
The preferred implementations of this application have been described in detail above, but this application is not limited to those implementations. Those skilled in the art may make various equivalent modifications or substitutions without departing from the spirit of this application, and all such equivalent modifications or substitutions fall within the scope defined by the claims of this application.

Claims (10)

  1. A method for processing active data of an Internet of Things device, characterized by comprising:
    obtaining a device report message and pushing the device report message to a distributed message queue, wherein the device report message includes a message type;
    constructing, based on a stream computing service, a data stream of the device report messages in the message queue;
    cleaning the data stream according to the message type to obtain a data stream containing only device active data;
    adding, modifying, or leaving unprocessed the device active data in a cache according to the cleaned data stream;
    adding, modifying, or leaving unprocessed the device active data in HBase according to the device active data in the cache; and
    performing, based on a batch processing service, batch processing on the device active data in HBase to obtain basic activity statistics, and storing the basic activity statistics in a database.
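As an illustration of the cleaning step in claim 1, the following Python sketch filters a stream of reported messages by message type. The field name `msg_type` and the type values `online` and `offline` are assumptions made for this example; the claims do not specify them, and a real deployment would run this inside a stream computing service rather than over an in-memory iterable.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical message types that count as device active data.
ACTIVE_TYPES = {"online", "offline"}


def clean_stream(messages: Iterable[Dict]) -> Iterator[Dict]:
    """Yield only the messages whose type marks them as device active data."""
    for msg in messages:
        # "msg_type" is an assumed field name for the message type.
        if msg.get("msg_type") in ACTIVE_TYPES:
            yield msg


reported = [
    {"device_id": "d1", "msg_type": "online"},
    {"device_id": "d1", "msg_type": "telemetry"},
    {"device_id": "d2", "msg_type": "offline"},
]
cleaned = list(clean_stream(reported))
```

Here `cleaned` keeps the online and offline messages and drops the telemetry message.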
  2. The method for processing active data of an Internet of Things device according to claim 1, characterized in that the device active data includes a device ID, an online/offline date of the device, and an online/offline time of the device, and the adding, modifying, or leaving unprocessed the device active data in the cache according to the cleaned data stream comprises:
    querying the cache according to the device ID and the online/offline date in the data stream;
    when the query finds that the cache holds no device active data for the same device on the same date as in the data stream, adding the device active data to the cache according to the device ID, the online/offline date, and the online/offline time in the data stream; and
    when the query finds that the data stream and the cache both contain device active data of the same device on the same date, modifying the device active data in the cache or leaving it unprocessed according to the online/offline time.
  3. The method for processing active data of an Internet of Things device according to claim 2, characterized in that, when the query finds that the data stream and the cache both contain device active data of the same device on the same date, the modifying the device active data in the cache or leaving it unprocessed according to the online/offline time comprises:
    when the data stream and the cache both contain device active data of the same device on the same date, comparing a first online/offline time in the data stream with a second online/offline time in the cache;
    when the first online/offline time is later than the second online/offline time, replacing the second online/offline time in the cache with the first online/offline time; and
    when the first online/offline time is earlier than or equal to the second online/offline time, performing no processing.
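The cache update rule of claims 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the cache is modelled as an in-memory dictionary keyed by (device ID, date), and the function name `update_cache` and its return labels are hypothetical.

```python
from datetime import date, time
from typing import Dict, Tuple

# Assumed cache layout: (device ID, online/offline date) -> online/offline time.
Cache = Dict[Tuple[str, date], time]


def update_cache(cache: Cache, device_id: str, day: date, at: time) -> str:
    """Apply one cleaned stream record to the cache and report what happened."""
    key = (device_id, day)
    cached = cache.get(key)
    if cached is None:
        # No record for this device on this date: add it.
        cache[key] = at
        return "added"
    if at > cached:
        # The stream record is later than the cached one: replace it.
        cache[key] = at
        return "modified"
    # Earlier than or equal to the cached time: leave the cache untouched.
    return "unchanged"


cache: Cache = {}
day = date(2022, 12, 13)
first = update_cache(cache, "d1", day, time(8, 0))
second = update_cache(cache, "d1", day, time(9, 0))
third = update_cache(cache, "d1", day, time(8, 30))
```

After these three calls the cache holds a single entry for `d1` on that date, with the time 09:00.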
  4. The method for processing active data of an Internet of Things device according to claim 2, characterized in that the adding, modifying, or leaving unprocessed the device active data in HBase according to the device active data in the cache comprises:
    setting a scheduled task according to business requirements, the scheduled task including an operation interval; and
    each time the operation interval elapses, adding, modifying, or leaving unprocessed the device active data in HBase according to the device active data in the cache.
  5. The method for processing active data of an Internet of Things device according to any one of claims 2-4, characterized in that the adding, modifying, or leaving unprocessed the device active data in HBase according to the device active data in the cache comprises:
    querying HBase according to the device ID in the cache;
    when the cache and HBase both contain device active data of the same device, comparing a first online/offline date in the cache with a second online/offline date in HBase;
    when the first online/offline date differs from the second online/offline date, adding the device active data to HBase according to the device ID, the online/offline date, and the online/offline time in the cache;
    when the first online/offline date is the same as the second online/offline date and a first online/offline time in the cache is earlier than a second online/offline time in HBase, replacing the device active data in HBase with the device active data in the cache; and
    when the first online/offline date is the same as the second online/offline date and the first online/offline time in the cache is later than or equal to the second online/offline time in HBase, performing no processing.
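The decision made in claim 5 when a cache record is synchronised to HBase can be sketched as a pure function. The sketch follows the comparisons exactly as the claim words them. The `ActiveRecord` type, the function name, and the action labels are hypothetical, and treating a device with no HBase record as an "add" is an assumption, since the claim only covers the case where both stores hold data for the device. No HBase client calls are shown.

```python
from datetime import date, time
from typing import NamedTuple, Optional


class ActiveRecord(NamedTuple):
    """Hypothetical shape of one device active record."""
    device_id: str
    day: date   # online/offline date
    at: time    # online/offline time


def decide_hbase_action(cache_rec: ActiveRecord,
                        hbase_rec: Optional[ActiveRecord]) -> str:
    """Return "add", "replace", or "skip" for one cache record."""
    if hbase_rec is None:
        # Assumption: a device not yet present in HBase is simply added.
        return "add"
    if cache_rec.day != hbase_rec.day:
        # Dates differ: add the cache record as new device active data.
        return "add"
    if cache_rec.at < hbase_rec.at:
        # Same date, cache time earlier than the HBase time: replace.
        return "replace"
    # Same date, cache time later than or equal to the HBase time: skip.
    return "skip"


stored = ActiveRecord("d1", date(2022, 12, 13), time(9, 0))
action_new_day = decide_hbase_action(
    ActiveRecord("d1", date(2022, 12, 14), time(8, 0)), stored)
action_earlier = decide_hbase_action(
    ActiveRecord("d1", date(2022, 12, 13), time(8, 0)), stored)
action_same = decide_hbase_action(stored, stored)
```

Keeping the decision separate from the storage call makes the rule easy to test and leaves the choice of HBase client to the surrounding scheduled task.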
  6. The method for processing active data of an Internet of Things device according to claim 1, characterized in that the performing, based on a batch processing service, batch processing on the device active data in HBase to obtain basic activity statistics, and storing the basic activity statistics in a database comprises:
    obtaining, based on the batch processing service, the device active data in HBase in batches;
    parsing the device active data in batches to obtain a user field, a product field, and a device field in the device active data;
    assembling the parsed device active data into an online/offline data sequence, the online/offline data sequence consisting of a plurality of unit data;
    recursing backward toward earlier entries, starting from the last unit data in the online/offline data sequence, and determining the activity status of the device within a specified period according to the unit data; and
    aggregating the activity status according to the user field, the product field, the device field, and the specified period to obtain the basic activity statistics, and storing the basic activity statistics in the database.
  7. The method for processing active data of an Internet of Things device according to claim 6, characterized in that the unit data includes a device operation and an operation time, and the recursing backward, starting from the last unit data in the online/offline data sequence, and determining the activity status of the device within the specified period according to the unit data comprises:
    when the operation time falls within the specified period, determining that the device is active within the specified period;
    when the operation time is later than the end time of the specified period, taking the preceding unit data and repeating the determination;
    when the operation time is earlier than or equal to the start time of the specified period and the device operation is going online, determining that the device is active within the specified period; and
    when the operation time is earlier than or equal to the start time of the specified period and the device operation is going offline, determining that the device is inactive within the specified period.
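The determination in claim 7 can be sketched as a backward walk over the online/offline data sequence. This is an illustrative sketch only: it uses a loop instead of recursion, assumes the sequence is sorted by ascending operation time, encodes the device operation as the strings `online` and `offline`, and returns inactive when the sequence is exhausted, which the claim does not address.

```python
from datetime import datetime
from typing import List, Tuple

# Assumed unit data shape: (device operation, operation time).
Unit = Tuple[str, datetime]


def is_active(units: List[Unit], start: datetime, end: datetime) -> bool:
    """Decide whether a device was active within the period (start, end]."""
    for operation, op_time in reversed(units):
        if op_time > end:
            # Later than the end of the period: look at the preceding unit.
            continue
        if op_time > start:
            # The operation happened inside the period: the device is active.
            return True
        # At or before the start: active only if the device went online
        # and has not gone offline since.
        return operation == "online"
    # Assumption: no usable unit data means the device is inactive.
    return False


start = datetime(2022, 12, 13)
end = datetime(2022, 12, 14)
stayed_online = is_active(
    [("online", datetime(2022, 12, 12, 8)), ("offline", datetime(2022, 12, 14, 9))],
    start, end)
went_offline = is_active([("offline", datetime(2022, 12, 12, 8))], start, end)
```

In the first call the device came online before the period and went offline only after it ended, so it counts as active; in the second it went offline before the period began, so it counts as inactive.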
  8. A system for processing active data of an Internet of Things device, characterized by comprising:
    a first module configured to obtain a device report message and push the device report message to a distributed message queue, wherein the device report message includes a message type;
    a second module configured to construct, based on a stream computing service, a data stream of the device report messages in the message queue;
    a third module configured to clean the data stream according to the message type to obtain a data stream containing only device active data;
    a fourth module configured to add, modify, or leave unprocessed the device active data in a cache according to the cleaned data stream;
    a fifth module configured to add, modify, or leave unprocessed the device active data in HBase according to the device active data in the cache; and
    a sixth module configured to perform, based on a batch processing service, batch processing on the device active data in HBase to obtain basic activity statistics, and to store the basic activity statistics in a database.
  9. An apparatus, characterized by comprising:
    at least one processor; and
    at least one memory for storing at least one program;
    wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method for processing active data of an Internet of Things device according to any one of claims 1-7.
  10. A computer storage medium storing a processor-executable program, characterized in that the processor-executable program, when executed by the processor, is used to implement the method for processing active data of an Internet of Things device according to any one of claims 1-7.
PCT/CN2022/138668 2021-12-14 2022-12-13 Method and apparatus for processing active data for internet of things device, and storage medium WO2023109806A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111523668.0 2021-12-14
CN202111523668.0A CN114385378A (en) 2021-12-14 2021-12-14 Active data processing method and device for Internet of things equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2023109806A1 true WO2023109806A1 (en) 2023-06-22

Family

ID=81196015

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/138668 WO2023109806A1 (en) 2021-12-14 2022-12-13 Method and apparatus for processing active data for internet of things device, and storage medium

Country Status (2)

Country Link
CN (1) CN114385378A (en)
WO (1) WO2023109806A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114385378A (en) * 2021-12-14 2022-04-22 天翼物联科技有限公司 Active data processing method and device for Internet of things equipment and storage medium
CN114706537B (en) * 2022-06-02 2022-09-02 深圳市迅犀数字科技有限公司 Inactive data processing method and processing system of Internet of things equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170075693A1 (en) * 2015-09-16 2017-03-16 Salesforce.Com, Inc. Handling multiple task sequences in a stream processing framework
CN106873945A (en) * 2016-12-29 2017-06-20 中山大学 Data processing architecture and data processing method based on batch processing and Stream Processing
CN109753531A (en) * 2018-12-26 2019-05-14 深圳市麦谷科技有限公司 A kind of big data statistical method, system, computer equipment and storage medium
CN109800129A (en) * 2019-01-17 2019-05-24 青岛特锐德电气股份有限公司 A kind of real-time stream calculation monitoring system and method for processing monitoring big data
US20190303487A1 (en) * 2018-03-27 2019-10-03 Paypal, Inc. System and platform for computing and analyzing big data
CN110704484A (en) * 2019-09-09 2020-01-17 华迪计算机集团有限公司 Method and system for processing mass real-time data stream
CN114385378A (en) * 2021-12-14 2022-04-22 天翼物联科技有限公司 Active data processing method and device for Internet of things equipment and storage medium

Also Published As

Publication number Publication date
CN114385378A (en) 2022-04-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22906551

Country of ref document: EP

Kind code of ref document: A1