CN117076850A - Internet of things data cleaning method and device - Google Patents



Publication number
CN117076850A
Authority
CN
China
Prior art keywords
rule
analysis
internet
target
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310675597.9A
Other languages
Chinese (zh)
Inventor
金海卫
王磊
李爱武
程昌权
陆青健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Pudong Oriental Cable Network Co ltd
Original Assignee
Shanghai Pudong Oriental Cable Network Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Pudong Oriental Cable Network Co ltd filed Critical Shanghai Pudong Oriental Cable Network Co ltd
Priority to CN202310675597.9A priority Critical patent/CN117076850A/en
Publication of CN117076850A publication Critical patent/CN117076850A/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16Y INFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR THE INTERNET OF THINGS [IoT]
    • G16Y40/00 IoT characterised by the purpose of the information processing

Abstract

The invention relates to the technical field of data cleaning, and in particular to an Internet of things data cleaning method and device. The method comprises: setting a cleaning target for Internet of things service data; analyzing the corresponding Internet of things service data according to the equipment type corresponding to the cleaning target to obtain analysis data; formulating analysis rule requirements according to the cleaning target based on the analysis data; defining rules according to the analysis rule requirements to obtain target rules; setting analysis rules according to the target rules; selecting an encapsulation algorithm to encapsulate the analysis rules to obtain an encapsulation rule; setting an execution time, frequency and range based on the encapsulation rule; executing analysis tasks based on the execution time, frequency and range to obtain an analysis report; and outputting the analysis report in a selected form. The invention thereby solves the problem that rule-based analysis cannot be performed for data cleaning in Internet of things applications.

Description

Internet of things data cleaning method and device
Technical Field
The invention relates to the technical field of data cleaning, in particular to an Internet of things data cleaning method and device.
Background
Existing data cleaning schemes mainly address general, systematic data defects. If the data involves associated data across multiple systems, relevant specifications must first be formulated and each data source system modified accordingly; only after data standardization is complete is the standardized data (that is, data conforming to the standard data specifications) cleaned.
Existing data cleaning schemes lack a targeted approach to data defects specific to a business or industry. Compression and summarization of mass data with industry attributes therefore requires special processing that falls outside the coverage of general data cleaning techniques, so rule-based analysis cannot be performed for data cleaning in Internet of things industry applications, and the data cleaning effect is reduced.
Disclosure of Invention
The invention aims to provide an Internet of things data cleaning method and device that solve the problem that rule-based analysis cannot be performed for data cleaning in Internet of things industry applications.
In order to achieve the above object, in a first aspect, the present invention provides an internet of things data cleaning method, including the following steps:
setting a cleaning target of the business data of the Internet of things;
analyzing the corresponding Internet of things service data according to the equipment type corresponding to the cleaning target to obtain analysis data;
formulating analysis rule requirements according to the cleaning target based on the analysis data;
defining rules according to the analysis rule requirements to obtain target rules;
setting analysis rules according to the target rules;
selecting an encapsulation algorithm to encapsulate the analysis rules to obtain an encapsulation rule;
setting execution time, frequency and range based on the encapsulation rule;
executing analysis tasks based on the execution time, the frequency and the range to obtain an analysis report;
selecting an output form for the analysis report.
The cleaning target comprises alarm characteristics of data reported by the sensor, alarm characteristics required by the internet of things service and parameter characteristics of the data reported by the sensor.
Wherein the analysis rule requirements include a rule name and a rule type.
Wherein defining rules according to the analysis rule requirements to obtain target rules comprises:
searching in a rule base according to the analysis rule requirements; if a corresponding rule exists in the rule base, selecting the corresponding rule as the target rule, and if no corresponding rule exists in the rule base, customizing a rule to obtain the target rule.
Wherein customizing the rule comprises:
setting the name of a custom rule;
selecting a rule object;
setting an algorithm expression of the object;
storing the name and the algorithm expression to obtain a target rule;
and storing the target rule into the rule base.
Wherein, the setting the analysis rule according to the target rule includes:
setting a verification expression according to the target rule;
setting analysis parameters according to the attribute or the algorithm expression of the object of the target rule;
and storing the verification expression and the analysis parameters to obtain an analysis rule.
Wherein the setting of the execution time, frequency and range based on the encapsulation rule includes:
setting tasks based on the encapsulation rules;
setting the execution time, frequency and range of the task.
In a second aspect, the invention provides an internet of things data cleaning device, which comprises an acquisition module, a rule definition module, a rule task execution module and an analysis report output module, wherein the acquisition module, the rule definition module, the rule task execution module and the analysis report output module are sequentially connected;
the acquisition module is used for acquiring corresponding Internet of things service data from a sensor or Internet of things equipment;
the rule definition module is used for setting a cleaning target of the internet of things service data; analyzing the corresponding Internet of things service data according to the equipment type corresponding to the cleaning target to obtain analysis data; formulating analysis rule requirements according to the cleaning target based on the analysis data; defining rules according to the analysis rule requirements to obtain target rules; setting analysis rules according to the target rules;
the rule task execution module is used for selecting an encapsulation algorithm to encapsulate the analysis rules to obtain an encapsulation rule; setting execution time, frequency and range based on the encapsulation rule; executing analysis tasks based on the execution time, the frequency and the range to obtain an analysis report; selecting a form for the generated report based on the analysis result;
the analysis report output module is used for outputting the generated report.
According to the Internet of things data cleaning method of the invention, a cleaning target for Internet of things service data is set; the corresponding Internet of things service data is analyzed according to the equipment type corresponding to the cleaning target to obtain analysis data; analysis rule requirements are formulated according to the cleaning target based on the analysis data; rules are defined according to the analysis rule requirements to obtain target rules; analysis rules are set according to the target rules; an encapsulation algorithm is selected to encapsulate the analysis rules to obtain an encapsulation rule; execution time, frequency and range are set based on the encapsulation rule; analysis tasks are executed based on the execution time, the frequency and the range to obtain an analysis report; and the analysis report is output in a selected form. An analysis rule base is abstracted so that analysis rules can be defined, the rules are encapsulated to form analysis tasks, and data cleaning analysis is driven by those tasks. This solves the problem that rule-based analysis cannot be performed for data cleaning in Internet of things industry applications.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a simplified flow chart of an internet of things data cleaning method provided by the invention.
Fig. 2 is a table of a common data processing and cleaning rule base.
Fig. 3 is a detailed flowchart of an internet of things data cleaning method provided by the invention.
Fig. 4 is a schematic structural diagram of an internet of things data cleaning device provided by the invention.
Fig. 5 is a workflow diagram of a rule definition module.
Fig. 6 is a workflow diagram of the rule task execution module.
Reference numerals: 1 - acquisition module; 2 - rule definition module; 3 - rule task execution module; 4 - analysis report output module.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
Referring to fig. 1 to 3, in a first aspect, the present invention provides an internet of things data cleaning method, which includes the following steps:
S1, setting a cleaning target of Internet of things service data;
Specifically, the cleaning target comprises: alarm features of data reported by sensors, such as a fire door switch state or a fire alarm reported by a smoke sensing device; alarm features required by an Internet of things service, such as elderly care (also called the "four-piece set": smoke sensing + SOS + gas detection + human body sensing); and parameter features of data reported by sensors, such as water pH value and water turbidity. Default settings are provided for the alarm indexes and service parameter features required by common Internet of things services. A common data processing and cleaning rule base is shown in Fig. 2.
S2, analyzing the corresponding Internet of things service data according to the equipment type corresponding to the cleaning target to obtain analysis data;
Specifically, according to the equipment type or equipment model corresponding to the set target, the Internet of things service data or service alarms of that type or model are analyzed to obtain analysis data.
S3, formulating analysis rule requirements according to the cleaning target based on the analysis data;
specifically, the analysis rule requirements include a rule name and a rule type.
S4, defining rules according to the analysis rule requirements to obtain target rules;
Specifically, the rule base is searched according to the analysis rule requirements; if a corresponding rule exists in the rule base, it is selected as the target rule, and if not, a rule is customized to obtain the target rule.
Customizing the rule comprises:
S41, setting the name of the custom rule;
S42, selecting a rule object;
Specifically, the object may be any object existing in the system, such as a smoke sensing device or a wireless geomagnetic device.
S43, setting an algorithm expression of the object;
Specifically, the algorithm expression may be computed over the attribute fields and associated fields of the analysis object.
S44, storing the name and the algorithm expression to obtain a target rule;
S45, storing the target rule into the rule base.
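The lookup-or-customize flow of S4 and S41 to S45 can be sketched as follows. This is only a minimal illustration under stated assumptions: the names `TargetRule`, `RuleBase` and `define_rule`, the in-memory dictionary standing in for the rule base, and the pH example are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Optional, Tuple


@dataclass(frozen=True)
class TargetRule:
    """A stored rule: name, type, rule object and algorithm expression (hypothetical layout)."""
    name: str
    rule_type: str
    target_object: str  # e.g. "water_sensor"
    expression: Callable[[Mapping[str, float]], float]


class RuleBase:
    """In-memory stand-in for the rule base, keyed by (name, type)."""

    def __init__(self) -> None:
        self._rules: Dict[Tuple[str, str], TargetRule] = {}

    def find(self, name: str, rule_type: str) -> Optional[TargetRule]:
        return self._rules.get((name, rule_type))

    def save(self, rule: TargetRule) -> None:
        self._rules[(rule.name, rule.rule_type)] = rule


def define_rule(base: RuleBase, name: str, rule_type: str, target_object: str,
                expression: Callable[[Mapping[str, float]], float]) -> TargetRule:
    """Reuse a matching rule if the base has one; otherwise create and store a custom rule."""
    existing = base.find(name, rule_type)
    if existing is not None:
        return existing
    rule = TargetRule(name, rule_type, target_object, expression)  # S41 to S44
    base.save(rule)                                                # S45
    return rule
```

Calling `define_rule` twice with the same name and type returns the stored rule the second time, which mirrors the "select if present, customize if absent" branch.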
S5, setting analysis rules according to the target rules;
The specific method is as follows:
S51, setting a verification expression according to the target rule;
Specifically, the verification expression includes comparison operators such as ">=", "<", "<=", "=", "not equal to", "contains" and "does not contain".
S52, setting analysis parameters according to the attribute or the algorithm expression of the object of the target rule;
S53, storing the verification expression and the analysis parameters to obtain an analysis rule.
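One way to picture an analysis rule is as a verification operator paired with an analysis parameter and applied to one attribute of a record. The sketch below assumes that reading; `VERIFIERS`, `AnalysisRule` and the field names are hypothetical, and the operator table is only one plausible mapping of the verification expressions listed in S51.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, Mapping

# Hypothetical mapping from verification expressions to Python comparisons.
VERIFIERS: Dict[str, Callable[[Any, Any], bool]] = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    "!=": operator.ne,
    "contains": lambda value, param: param in value,
    "not_contains": lambda value, param: param not in value,
}


@dataclass(frozen=True)
class AnalysisRule:
    """Verification expression plus analysis parameter for one attribute (S51 to S53)."""
    attribute: str
    verifier: str
    parameter: Any

    def check(self, record: Mapping[str, Any]) -> bool:
        """Return True when the record's attribute satisfies the verification expression."""
        return VERIFIERS[self.verifier](record[self.attribute], self.parameter)
```

For instance, `AnalysisRule("turbidity", ">=", 5.0)` would flag a reading whose turbidity is at least 5.0; the threshold is an illustrative value only.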
S6, selecting an encapsulation algorithm to encapsulate the analysis rules, so as to obtain an encapsulation rule;
The encapsulation algorithm may combine rules through "AND" and "OR" relationships, and rules may also be grouped.
"AND" operation: the algorithm requirements of all the rules must be satisfied simultaneously.
"OR" operation: only one of the rules or conditions needs to be satisfied.
"Group" operation: several rules may be formed into a group and combined by AND or OR within the group, after which the group result is combined with other groups or rules.
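The three encapsulation operations can be illustrated with plain predicate combinators, where a "group" is simply a nested combination. This is a sketch, not the patent's implementation; `all_of`, `any_of` and the sensor field names are assumptions made for the example.

```python
from typing import Any, Callable, Mapping

Predicate = Callable[[Mapping[str, Any]], bool]


def all_of(*rules: Predicate) -> Predicate:
    """AND: every rule must hold for the record."""
    return lambda record: all(rule(record) for rule in rules)


def any_of(*rules: Predicate) -> Predicate:
    """OR: at least one rule must hold for the record."""
    return lambda record: any(rule(record) for rule in rules)


# Hypothetical single rules over an elderly-care record.
smoke_high: Predicate = lambda r: r["smoke"] > 0.5
gas_high: Predicate = lambda r: r["gas"] > 0.5
sos_pressed: Predicate = lambda r: bool(r["sos"])

# Group: (smoke AND gas) is evaluated first, then ORed with the SOS rule.
encapsulated: Predicate = any_of(sos_pressed, all_of(smoke_high, gas_high))
```

Because each combinator returns another predicate, groups can be nested to any depth without a separate grouping construct.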
S7, setting execution time, frequency and range based on the encapsulation rule;
The specific method is as follows:
S71, setting tasks based on the encapsulation rule;
S72, setting the execution time, frequency and range of the task. The execution time may be specified down to the hour; the frequency may be hourly, daily, weekly, etc.; the range may be a geographic range, such as a street or town, a neighborhood, a residential community or a unit, or a device-type range, such as smoke sensing devices or intelligent manhole cover devices.
S8, executing analysis tasks based on the execution time, the frequency and the range to obtain an analysis report;
Specifically, the execution is either manual or automatic;
Manual execution: the analysis task is triggered manually, and the corresponding analysis is executed immediately to obtain an analysis report.
Automatic execution: the analysis task is executed according to its configured time and frequency.
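Steps S7 and S8 can be sketched as a task object that carries a start time, a frequency and a device-type range, with one entry point for manual execution and one for automatic execution. The class name `AnalysisTask`, its fields and the report layout are hypothetical; a real scheduler would drive `run_if_due` from a timer, which is left out here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Any, Callable, Dict, FrozenSet, List, Mapping, Optional


@dataclass
class AnalysisTask:
    rule: Callable[[Mapping[str, Any]], bool]  # the encapsulated rule
    start: datetime                            # execution time
    every: timedelta                           # execution frequency
    device_types: FrozenSet[str]               # execution range (device-type range)
    last_run: Optional[datetime] = None

    def is_due(self, now: datetime) -> bool:
        """A task is due once the start time has passed and one interval has elapsed."""
        if now < self.start:
            return False
        return self.last_run is None or now - self.last_run >= self.every

    def run(self, records: List[Mapping[str, Any]], now: datetime) -> Dict[str, Any]:
        """Manual execution: analyze immediately and return a small report."""
        in_range = [r for r in records if r["device_type"] in self.device_types]
        matched = [r for r in in_range if self.rule(r)]
        self.last_run = now
        return {"checked": len(in_range), "matched": len(matched), "run_at": now.isoformat()}

    def run_if_due(self, records: List[Mapping[str, Any]], now: datetime) -> Optional[Dict[str, Any]]:
        """Automatic execution: run only when the configured time and frequency allow it."""
        return self.run(records, now) if self.is_due(now) else None
```

Keeping manual and automatic execution on the same `run` method means both paths produce the same report structure.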
S9, selecting an output form for the analysis report.
Specifically, the analysis report can be output in Excel, Word or PDF format.
Transient report handling mechanism
By introducing a transient report handling mechanism, the Internet of things data cleaning can also deal with repeatedly occurring fault, alarm and recovery events, compressing and suppressing the number of events.
First, a discrimination rule for transient reports is defined. One condition is that "fault + recovery" events occur in pairs; the other is that the number of such pairs reaches a set count within a specified time period. The time period and the count are therefore the basis for judging a transient report.
Second, transient reports are handled by defining a delayed-suppression mechanism for the events. Suppression is configured through parameters such as device model, fault level, fault type and transient duration. When the system receives such faults it does not upload them; only when the duration threshold is exceeded are the faults uploaded to real-time fault monitoring. The transient report switch and the judgment conditions are flexibly configured by the administrator to control suppression and upload.
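The two judgment conditions described above, paired fault and recovery events and a pair count reached within a time period, can be sketched as follows. The event layout (timestamp in seconds plus a kind string), the function names and the sample thresholds are assumptions for illustration only.

```python
from typing import List, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, "fault" or "recovery")


def count_fault_recovery_pairs(events: List[Event]) -> int:
    """Count faults that are each followed by a recovery, in time order."""
    pairs = 0
    fault_open = False
    for _, kind in events:
        if kind == "fault":
            fault_open = True
        elif kind == "recovery" and fault_open:
            pairs += 1
            fault_open = False
    return pairs


def is_transient_report(events: List[Event], now: float,
                        window_seconds: float, min_pairs: int) -> bool:
    """True when at least min_pairs fault/recovery pairs fall inside the time window."""
    recent = sorted(e for e in events if now - window_seconds <= e[0] <= now)
    return count_fault_recovery_pairs(recent) >= min_pairs
```

A device judged transient by this check would then be subject to the delayed suppression described above instead of being uploaded immediately.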
High-frequency fault and alarm suppression mechanism
When a large number of repeated alarms occur within a certain time window, uploading of the events is suppressed according to a threshold set on the fault and alarm frequency. Conditions such as the Internet of things device type, the fault/alarm type, the fault level, the fault/alarm cause and the number of occurrences can be configured. A fault/alarm is uploaded if and only if the number of occurrences of faults/alarms satisfying the conditions is above, below or equal to the set threshold, according to the configured comparison.
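A sliding-window counter is one simple way to realize this kind of frequency gate. The sketch below assumes timestamps in seconds and a single stream of events that already satisfy the configured conditions; `FrequencyGate` and `COMPARATORS` are hypothetical names.

```python
import operator
from collections import deque
from typing import Deque

# Configurable comparison between the occurrence count and the threshold.
COMPARATORS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


class FrequencyGate:
    """Decide whether an event is uploaded, based on occurrences in a sliding time window."""

    def __init__(self, window_seconds: float, threshold: int, compare: str = ">") -> None:
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.compare = COMPARATORS[compare]
        self._times: Deque[float] = deque()

    def offer(self, timestamp: float) -> bool:
        """Record one occurrence and return True if it should be uploaded."""
        self._times.append(timestamp)
        # Drop occurrences that have fallen out of the window.
        while self._times and self._times[0] <= timestamp - self.window_seconds:
            self._times.popleft()
        return self.compare(len(self._times), self.threshold)
```

In practice one gate would likely be kept per combination of device type, fault/alarm type and level, which this sketch leaves to the caller.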
Multisource data fusion analysis and assisted learning
Internet of things devices in a specific scene can be organized into "associated devices" according to a device relationship. For example, the smoke sensing devices of the same building unit can be grouped into one set of associated devices, and several geomagnetic devices distributed along a large fire lane can be grouped into another.
The Internet of things devices are associated in an orderly way through natural dimensions such as time and region. The time sequence in which faults and alarms of associated devices are generated and cleared is studied, the association factors of the multi-source data are derived, and root cause analysis of multi-source faults or alarms is thereby achieved.
Association analysis is performed on faults or alarms reported together by several Internet of things devices. For example, when a gas alarm and a smoke alarm are received simultaneously or in succession, the platform automatically determines that the service alarm was caused by a gas leak; combined with the "service false alarm dynamic identification rule", false alarms caused by normal operation (such as igniting the gas several times) are eliminated, so that the real alarm is accurately located.
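The gas-plus-smoke example can be sketched as a pairing of alarms from the same associated-device group that arrive within a short time of each other. The alarm fields (`group`, `kind`, `time`, `device_id`), the function name and the window length are assumptions for illustration; the false alarm filtering step is not shown.

```python
from typing import Any, Dict, List, Tuple

Alarm = Dict[str, Any]


def correlate_gas_and_smoke(alarms: List[Alarm], window_seconds: float) -> List[Tuple[str, str]]:
    """Return (gas device, smoke device) pairs from the same group reported close in time."""
    gas_alarms = [a for a in alarms if a["kind"] == "gas"]
    smoke_alarms = [a for a in alarms if a["kind"] == "smoke"]
    pairs: List[Tuple[str, str]] = []
    for gas in gas_alarms:
        for smoke in smoke_alarms:
            same_group = gas["group"] == smoke["group"]
            close_in_time = abs(gas["time"] - smoke["time"]) <= window_seconds
            if same_group and close_in_time:
                pairs.append((gas["device_id"], smoke["device_id"]))
    return pairs
```

Each returned pair is a candidate service alarm attributable to a gas leak, to be confirmed or discarded by the false alarm identification rules.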
An association traversal matrix is constructed to trace back the association identification process of each fault/alarm in the fusion data management module. When one association rule is satisfied, the identification process does not end; traversal continues over all association rules, and the full set of association rules that the alarm satisfies is collected.
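A minimal reading of the association traversal matrix is a boolean table with one row per alarm and one column per association rule, filled without stopping at the first match. The sketch below follows that reading; the function names and the sample rules are hypothetical.

```python
from typing import Any, Callable, Dict, List, Mapping, Set, Tuple

Alarm = Mapping[str, Any]
Rule = Callable[[Alarm], bool]


def build_traversal_matrix(alarms: List[Alarm],
                           rules: Dict[str, Rule]) -> Tuple[List[str], List[List[bool]]]:
    """Evaluate every rule against every alarm; rows are alarms, columns are rules."""
    names = list(rules)
    matrix = [[bool(rules[name](alarm)) for name in names] for alarm in alarms]
    return names, matrix


def matched_rule_sets(alarms: List[Alarm], rules: Dict[str, Rule]) -> List[Set[str]]:
    """For each alarm, collect the names of all association rules it satisfies."""
    names, matrix = build_traversal_matrix(alarms, rules)
    return [{name for name, hit in zip(names, row) if hit} for row in matrix]
```

Keeping the full matrix, and not just the first hit, is what makes the later comparison against manually reviewed results possible.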
Based on the association traversal matrix, an accuracy refinement analysis mode is entered: within one period, manual review is provided in a fault/alarm sand table to determine the coverage of the accurate association rules, which is compared with the matrix analysis result, sorted and confirmed, and the accuracy of the association result set is output.
The invention provides an Internet of things data cleaning method in which a cleaning target for Internet of things service data is set; the corresponding Internet of things service data is analyzed according to the equipment type corresponding to the cleaning target to obtain analysis data; analysis rule requirements are formulated according to the cleaning target based on the analysis data; rules are defined according to the analysis rule requirements to obtain target rules; analysis rules are set according to the target rules; an encapsulation algorithm is selected to encapsulate the analysis rules to obtain an encapsulation rule; execution time, frequency and range are set based on the encapsulation rule; analysis tasks are executed based on the execution time, the frequency and the range to obtain an analysis report; and the analysis report is output in a selected form. An analysis rule base is abstracted so that analysis rules can be defined, the rules are encapsulated to form analysis tasks, and data cleaning analysis is driven by those tasks. This solves the problem that rule-based analysis cannot be performed for data cleaning in Internet of things industry applications. In addition, the data fields of sensor services are abstracted and normalized by type according to the application type of the sensor, so that unified standardization of the data format is completed at the device level.
Referring to fig. 4 to 6, in a second aspect, the present invention provides an internet of things data cleaning device, which includes an acquisition module 1, a rule definition module 2, a rule task execution module 3, and an analysis report output module 4, where the acquisition module 1, the rule definition module 2, the rule task execution module 3, and the analysis report output module 4 are sequentially connected;
the acquisition module 1 is used for acquiring corresponding internet of things service data from a sensor or internet of things equipment;
the rule definition module 2 is used for setting a cleaning target of the internet of things service data; analyzing the corresponding Internet of things service data according to the equipment type corresponding to the cleaning target to obtain analysis data; formulating analysis rule requirements according to the cleaning target based on the analysis data; defining rules according to the analysis rule requirements to obtain target rules; setting analysis rules according to the target rules;
the rule task execution module 3 is used for selecting an encapsulation algorithm to encapsulate the analysis rules to obtain an encapsulation rule; setting execution time, frequency and range based on the encapsulation rule; executing analysis tasks based on the execution time, the frequency and the range to obtain an analysis report;
the analysis report output module 4 is used for outputting the analysis report in a selected form.
Specifically, the invention addresses the continuous reporting of mass alarms through a service data and alarm compression mechanism, and reduces the pushing of service data and alarms by constructing data cleaning rules for Internet of things devices.
The data cleaning rule dynamically allocates a buffer queue to each Internet of things device that reports service data or alarms, pushes the received service data or alarms into the queue one by one in order of receiving time, and monitors the state of each event, with events of the same state accumulating in the queue. Only when both an alarm event and a recovery event are present in the queue is an entry recorded in the real-time alarm list, together with the number of events in the queue, and the buffer queue is emptied at the same time.
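The per-device buffer queue can be sketched as follows, assuming each event carries a device identifier, a state string ("alarm" or "recovery") and a timestamp. `AlarmCompressor` and the layout of the real-time alarm entries are hypothetical.

```python
from collections import defaultdict
from typing import Any, DefaultDict, Dict, List, Tuple


class AlarmCompressor:
    """Buffer events per device and emit one compressed entry per alarm/recovery cycle."""

    def __init__(self) -> None:
        self.queues: DefaultDict[str, List[Tuple[float, str]]] = defaultdict(list)
        self.realtime_alarms: List[Dict[str, Any]] = []

    def push(self, device_id: str, state: str, timestamp: float) -> None:
        queue = self.queues[device_id]  # allocated on first use for this device
        queue.append((timestamp, state))
        states = {s for _, s in queue}
        # Only when both an alarm and a recovery are buffered is an entry recorded.
        if {"alarm", "recovery"} <= states:
            self.realtime_alarms.append({
                "device_id": device_id,
                "event_count": len(queue),
                "first_seen": queue[0][0],
                "last_seen": timestamp,
            })
            queue.clear()
```

Under these assumptions, three repeated alarms followed by one recovery produce a single real-time entry with an event count of four, so downstream consumers see one record in place of four pushes.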
A service false alarm dynamic identification rule is constructed from the custom rule base, and the rule is set according to device type and model. For example, within a certain time period after a device reports an alarm, the rule matching the device type and model is used to judge whether a service alarm needs to be generated.
The parameters of the false alarm dynamic identification rule differ between device types or device models. For geomagnetic devices, for example, the identification rule is based on the duration from alarm to clearance (accurate to the minute), with the time dimension as the main consideration; for manhole cover devices, the identification rule is based on the magnitude of the detected change in tilt angle.
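Per-device-type identification rules can be sketched as a lookup table of predicates. The thresholds below (5 minutes, 10 degrees), the field names and the fallback behavior for unknown device types are illustrative assumptions, not values from the patent.

```python
from typing import Any, Callable, Dict, Mapping

Alarm = Mapping[str, Any]

# Hypothetical false alarm rules: each returns True when the alarm looks like a false alarm.
FALSE_ALARM_RULES: Dict[str, Callable[[Alarm], bool]] = {
    # Geomagnetic devices: a short alarm-to-clearance duration is treated as a false alarm.
    "geomagnetic": lambda alarm: alarm["duration_minutes"] < 5,
    # Manhole cover devices: a small tilt angle change is treated as a false alarm.
    "manhole_cover": lambda alarm: abs(alarm["tilt_change_degrees"]) < 10,
}


def needs_service_alarm(alarm: Alarm) -> bool:
    """Generate a service alarm unless the rule for this device type flags a false alarm."""
    rule = FALSE_ALARM_RULES.get(alarm["device_type"])
    if rule is None:
        return True  # no rule configured: pass the alarm through (an assumption)
    return not rule(alarm)
```

Because the rules live in a table keyed by device type, parameters can be changed per type or model without touching the dispatch logic.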
The false alarm dynamic identification rule is not fixed permanently once set. A data sand table can be constructed from the rule verification expressions and used for routine, high-fidelity optimization testing of the identification rules. The real historical raw alarm data of one period is run through the optimized identification rule to output the predicted false alarm handling result, from which the optimization effect and feasibility of the identification rule are judged.
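The sand table idea amounts to replaying one period of historical alarms through a current rule and a candidate rule and comparing the outcomes. The sketch below assumes alarms are dictionaries and rules are predicates that return True for a false alarm; `replay`, `compare_rules` and the sample thresholds are hypothetical.

```python
from typing import Any, Callable, Dict, List, Mapping

Alarm = Mapping[str, Any]
Rule = Callable[[Alarm], bool]  # returns True when the alarm is judged a false alarm


def replay(history: List[Alarm], is_false_alarm: Rule) -> Dict[str, int]:
    """Run historical alarms through one rule and summarize the predicted handling."""
    suppressed = sum(1 for alarm in history if is_false_alarm(alarm))
    return {"total": len(history), "suppressed": suppressed,
            "raised": len(history) - suppressed}


def compare_rules(history: List[Alarm], current: Rule, candidate: Rule) -> Dict[str, Dict[str, int]]:
    """Replay the same history through both rules so their effects can be compared."""
    return {"current": replay(history, current), "candidate": replay(history, candidate)}
```

The comparison only predicts how many alarms each rule would suppress; judging whether the suppressed alarms were truly false still relies on the manual review described earlier.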
The above disclosure is merely a preferred embodiment of the present invention and is not intended to limit its scope. Those skilled in the art will appreciate that implementations of all or part of the procedures of the above embodiment, and equivalent changes made according to the claims, still fall within the scope of the present invention.

Claims (8)

1. An Internet of things data cleaning method, characterized by comprising the following steps:
setting a cleaning target of the business data of the Internet of things;
analyzing the corresponding Internet of things service data according to the equipment type corresponding to the cleaning target to obtain analysis data;
formulating analysis rule requirements according to the cleaning target based on the analysis data;
defining rules according to the analysis rule requirements to obtain target rules;
setting analysis rules according to the target rules;
selecting an encapsulation algorithm to encapsulate the analysis rules to obtain an encapsulation rule;
setting execution time, frequency and range based on the encapsulation rule;
executing analysis tasks based on the execution time, the frequency and the range to obtain an analysis report;
selecting an output form for the analysis report.
2. The method for cleaning Internet of things data according to claim 1, wherein,
the cleaning target comprises alarm characteristics of data reported by the sensor, alarm characteristics required by the internet of things service and parameter characteristics of the data reported by the sensor.
3. The method for cleaning Internet of things data according to claim 1, wherein,
the analysis rule requirements include a rule name and a rule type.
4. The method for cleaning Internet of things data according to claim 1, wherein,
defining rules according to the analysis rule requirement to obtain target rules, including:
searching in a rule base according to the analysis rule requirement, if a corresponding rule exists in the rule base, selecting the corresponding rule as a target rule, and if the corresponding rule does not exist in the rule base, customizing the rule to obtain the target rule.
5. The method for cleaning Internet of things data according to claim 4, wherein,
the custom rule comprises:
setting the name of a custom rule;
selecting a rule object;
setting an algorithm expression of the object;
storing the name and the algorithm expression to obtain a target rule;
and storing the target rule into the rule base.
6. The method for cleaning Internet of things data according to claim 1, wherein,
the setting of the analysis rule according to the target rule includes:
setting a verification expression according to the target rule;
setting analysis parameters according to the attribute or the algorithm expression of the object of the target rule;
and storing the verification expression and the analysis parameters to obtain an analysis rule.
7. The method for cleaning Internet of things data according to claim 1, wherein,
the setting of the execution time, frequency and range based on the encapsulation rule includes:
setting tasks based on the encapsulation rules;
setting the execution time, frequency and range of the task.
8. An internet of things data cleaning device applied to the internet of things data cleaning method as set forth in claim 1, characterized in that,
the system comprises an acquisition module, a rule definition module, a rule task execution module and an analysis report output module, wherein the acquisition module, the rule definition module, the rule task execution module and the analysis report output module are connected in sequence;
the acquisition module is used for acquiring corresponding Internet of things service data from a sensor or Internet of things equipment;
the rule definition module is used for setting a cleaning target of the internet of things service data; analyzing the corresponding Internet of things service data according to the equipment type corresponding to the cleaning target to obtain analysis data; formulating analysis rule requirements according to the cleaning target based on the analysis data; defining rules according to the analysis rule requirements to obtain target rules; setting analysis rules according to the target rules;
the rule task execution module is used for selecting an encapsulation algorithm to encapsulate the analysis rules to obtain an encapsulation rule; setting execution time, frequency and range based on the encapsulation rule; executing analysis tasks based on the execution time, the frequency and the range to obtain an analysis report; selecting a form for the generated report based on the analysis result;
the analysis report output module is used for outputting the generated report.
CN202310675597.9A 2023-06-08 2023-06-08 Internet of things data cleaning method and device Pending CN117076850A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310675597.9A CN117076850A (en) 2023-06-08 2023-06-08 Internet of things data cleaning method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310675597.9A CN117076850A (en) 2023-06-08 2023-06-08 Internet of things data cleaning method and device

Publications (1)

Publication Number Publication Date
CN117076850A true CN117076850A (en) 2023-11-17

Family

ID=88714136

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310675597.9A Pending CN117076850A (en) 2023-06-08 2023-06-08 Internet of things data cleaning method and device

Country Status (1)

Country Link
CN (1) CN117076850A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination