CN117830028A - Data preprocessing method, engine, equipment and storage medium - Google Patents

Data preprocessing method, engine, equipment and storage medium

Info

Publication number
CN117830028A
CN117830028A CN202410014810.6A CN202410014810A CN117830028A CN 117830028 A CN117830028 A CN 117830028A CN 202410014810 A CN202410014810 A CN 202410014810A CN 117830028 A CN117830028 A CN 117830028A
Authority
CN
China
Prior art keywords
data
power data
current power
preprocessing
rule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410014810.6A
Other languages
Chinese (zh)
Inventor
李世豪
缪巍巍
曾锃
夏元轶
滕昌志
张瑞
余益团
张震
李马峰
邱文元
顾亚林
张俊杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nari Information and Communication Technology Co
Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Original Assignee
Nari Information and Communication Technology Co
Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nari Information and Communication Technology Co, Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd filed Critical Nari Information and Communication Technology Co
Priority to CN202410014810.6A priority Critical patent/CN117830028A/en
Publication of CN117830028A publication Critical patent/CN117830028A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data preprocessing method, engine, equipment and storage medium. The data preprocessing engine is connected to a pre-message cluster, a post-message cluster, a multifunctional platform and a rule base, and the post-message cluster is connected to a big data processing platform. The method comprises: acquiring current power data sent by the pre-message cluster; invoking the rule base, determining a corresponding preprocessing rule according to the current power data, and preprocessing the current power data according to the preprocessing rule to obtain a preprocessing result; and sending the current power data to the multifunctional platform or the post-message cluster according to the preprocessing result. This technical scheme saves storage space in the distributed cluster, reduces the processing pressure on the big data processing platform, and improves the data processing efficiency of the Internet of Things management platform.

Description

Data preprocessing method, engine, equipment and storage medium
Technical Field
The present invention relates to the field of power technologies, and in particular, to a data preprocessing method, an engine, a device, and a storage medium.
Background
The power Internet of Things is developing rapidly, and the number of power terminal devices keeps growing. Power terminal devices include electric meters, distribution boxes, intelligent sensors and the like, and they continuously generate large amounts of data such as power consumption data, device operating state data and environmental data. These data need to be processed and analyzed in real time to support the operation and management of the power system.
However, existing data processing technology and hardware often cannot meet the demands of mass data processing. At present, an Internet of Things management platform comprises a distributed message cluster and a big data processing platform. The distributed message cluster receives the power data transmitted by the power terminal devices and then stores it or sends it to the big data processing platform at the back end, which verifies and processes the power data. This approach suffers from insufficient storage capacity, high data processing pressure and low processing efficiency, which restricts the development of the power Internet of Things.
Disclosure of Invention
The invention provides a data preprocessing method, an engine, equipment and a storage medium, which save the storage space of a distributed cluster, reduce the processing pressure of a big data processing platform and improve the data processing efficiency of an Internet of things management platform.
In a first aspect, an embodiment of the present disclosure provides a data preprocessing method, applied to a data preprocessing engine, where the data preprocessing engine is connected to a pre-message cluster, a post-message cluster, a multifunctional platform and a rule base, and the post-message cluster is connected to a big data processing platform, where the method includes:
acquiring current power data sent by the pre-message cluster;
invoking the rule base, determining a corresponding preprocessing rule according to the current power data, and preprocessing the current power data according to the preprocessing rule to obtain a preprocessing result;
and sending the current power data to the multifunctional platform or the post-message cluster according to the preprocessing result.
In a second aspect, an embodiment of the present disclosure provides a data preprocessing engine, which is connected to a pre-message cluster, a post-message cluster, a multifunctional platform and a rule base, where the post-message cluster is connected to a big data processing platform, and which includes:
the power data acquisition module, used for acquiring current power data sent by the pre-message cluster;
the power data processing module, used for invoking the rule base, determining a corresponding preprocessing rule according to the current power data, and preprocessing the current power data according to the preprocessing rule to obtain a preprocessing result;
and the power data transmitting module, used for sending the current power data to the multifunctional platform or the post-message cluster according to the preprocessing result.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the data preprocessing method provided by the embodiment of the first aspect described above.
In a fourth aspect, embodiments of the present disclosure provide a computer readable storage medium storing computer instructions for causing a processor to execute the data preprocessing method provided in the above first aspect.
The embodiments of the invention provide a data preprocessing method, engine, equipment and storage medium. The data preprocessing engine is connected to a pre-message cluster, a post-message cluster, a multifunctional platform and a rule base, and the post-message cluster is connected to a big data processing platform. The method comprises: acquiring current power data sent by the pre-message cluster; invoking the rule base, determining a corresponding preprocessing rule according to the current power data, and preprocessing the current power data according to the preprocessing rule to obtain a preprocessing result; and sending the current power data to the multifunctional platform or the post-message cluster according to the preprocessing result. This technical scheme saves storage space in the distributed cluster, reduces the processing pressure on the big data processing platform, and improves the data processing efficiency of the Internet of Things management platform.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a data preprocessing method according to a first embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an Internet of Things management platform related to the data preprocessing method according to the first embodiment of the present invention;
Fig. 3 is a flowchart of a data preprocessing method according to a second embodiment of the present invention;
Fig. 4 is a schematic diagram of a data preprocessing engine according to a third embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and "object" in the description of the present invention and the claims and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of a data preprocessing method according to an embodiment of the present invention, where the method may be performed by a data preprocessing engine, and the data preprocessing engine may be implemented in hardware and/or software.
Fig. 2 is a schematic structural diagram of an internet of things management platform related to a data preprocessing method according to an embodiment of the present invention. As shown in fig. 2, the internet of things management platform includes a data preprocessing module 10 and a big data processing platform 20, wherein the data preprocessing module 10 includes a pre-message cluster 11, a data preprocessing engine 12, a post-message cluster 13, a multi-function platform 14 and a rule base 15, and in the data preprocessing module 10, the pre-message cluster 11, the post-message cluster 13, the multi-function platform 14 and the rule base 15 are all connected with the data preprocessing engine 12.
The data preprocessing module improves on the single distributed message cluster of the prior art, which is used only for storage. It is an operation processing module intended to save data storage space and relieve the processing pressure on the big data processing platform, and it performs data preprocessing through the data preprocessing engine. The pre-message cluster and the post-message cluster are both distributed message clusters, for example Kafka message clusters. The data preprocessing engine can be understood as a rule processing engine that preprocesses the power data according to set rules; it is equivalent to an electronic device with processing and computing capability. The multifunctional platform can be understood as an operation platform with multiple functions, including at least data storage and alarm notification; it may also be regarded as a set of functional platforms including at least a storage platform and an alarm platform, where the storage platform may be a database, such as a relational database. The rule base is an orchestratable rule base that stores preset operation processing rules. When the rules in the rule base change, the data preprocessing engine processes the corresponding power data according to the most recently modified rules that are in effect.
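The "most recently modified rule in effect" behaviour described above can be pictured with a minimal sketch. The class `RuleBase`, its methods `publish` and `effective_rule`, and the `rule_version` field are hypothetical names introduced only for illustration; the document does not specify how the rule base stores or versions its rules.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Rule = Callable[[dict], dict]  # a rule takes a record and returns a processed record


@dataclass
class RuleBase:
    """In-memory stand-in for the rule base (assumption: rules are kept per device type)."""
    _versions: Dict[str, List[Rule]] = field(default_factory=dict)

    def publish(self, device_type: str, rule: Rule) -> None:
        # A newly published rule becomes the effective rule for that device type.
        self._versions.setdefault(device_type, []).append(rule)

    def effective_rule(self, device_type: str) -> Rule:
        # The engine always reads the most recently published rule.
        return self._versions[device_type][-1]


rule_base = RuleBase()
rule_base.publish("meter", lambda record: {**record, "rule_version": 1})
first = rule_base.effective_rule("meter")({"device_id": "001"})

# After the rule is modified, the next record is processed with the new rule.
rule_base.publish("meter", lambda record: {**record, "rule_version": 2})
second = rule_base.effective_rule("meter")({"device_id": "001"})
```

Looking up the rule per record, rather than caching it in the engine, is one simple way to make a rule change take effect without restarting the engine.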
Specifically, the pre-message cluster is responsible for receiving massive amounts of Internet of Things terminal data (the current power data). Various device models and device processing logic rules are preset in the rule base. The data preprocessing engine parses the data processing logic set in the rule base and, according to the corresponding logic, either distributes the terminal data (the current power data) to the queues in the post-message cluster or directly performs operations such as storing the current power data and issuing alarm notifications.
In the data processing method provided by the embodiment of the invention, the rule base provides user-defined orchestration of device data processing, and the data preprocessing engine is placed in the buffer layer to divert raw power data that would otherwise enter the message queue directly and be handled by stream processing. In the original approach, raw power data is stored in the queues of a distributed message cluster whether or not it is abnormal, so the data backs up together awaiting processing and incurs a certain transmission delay. In this method, the traditional distributed message cluster is restructured into a pre-message cluster, a data preprocessing engine, a post-message cluster, a multifunctional platform and a rule base. The device orchestration capability of the data preprocessing engine and the rule base enables preprocessing of massive power data: abnormal data found by real-time monitoring triggers real-time alarms directly in the buffer layer, while normal monitoring data can enter an archiving database through a low-level queue channel. This helps the big data processing platform process valid power data efficiently, further improves the data processing efficiency of the Internet of Things management platform, and alleviates performance problems such as data backlog and processing delay faced by the big data processing framework (big data platform) and the message queue (distributed message cluster).
As shown in fig. 1, the method includes:
S101, acquiring current power data sent by the pre-message cluster.
In this embodiment, the current power data may be understood as power data of each power terminal device at the current moment, for example, may be real-time acquisition information and monitoring information of devices such as an edge internet of things agent, various electric meters, and a distribution box.
Specifically, each power terminal device reports its current power data to a pre-message cluster, and the pre-message cluster forwards the current power data to a data preprocessing engine after receiving the current power data reported by each power terminal device.
The current power data that each power terminal device reports to the pre-message cluster is data that has been connected to the Internet of Things gateway through the MQTT (Message Queuing Telemetry Transport) protocol and whose device data has passed security authentication by a security terminal.
S102, calling a rule base, determining a corresponding preprocessing rule according to the current power data, and preprocessing the current power data according to the preprocessing rule to obtain a preprocessing result.
In this embodiment, the preprocessing rule may be understood as a processing rule corresponding to the device type in the current power data. The preprocessing result may be understood as a result determined after preprocessing the current power data, and at least includes an abnormality judgment result and a data classification result. The abnormality judgment result is a detection result of whether the current power data has abnormality or not, and is used for determining whether the current power data can be stored in the post-message cluster or not. The data classification result is a classification result of the data type and is used for determining the storage position of the current power data in the post-message cluster.
Specifically, after receiving the current power data sent by the pre-message cluster, the data preprocessing engine determines the device type of the power terminal device from the current power data, invokes the associated rule base, and determines the preprocessing rule corresponding to that device type. It then performs the corresponding preprocessing operations on the current power data according to the preprocessing rule and judges whether the current power data is abnormal. If the data is not abnormal, the engine determines which queue in the post-message cluster the current power data should be stored in, thereby obtaining the corresponding preprocessing result.
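As a rough illustration of step S102, the sketch below selects a rule by device type and returns a result holding the two parts described above: an abnormality judgment and a data classification. The names `PreprocessResult`, `meter_rule` and `RULE_BASE`, the queue name, and the 198 to 242 V voltage range are all assumptions made for the example and are not taken from the document.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class PreprocessResult:
    is_abnormal: bool           # abnormality judgment result
    data_class: Optional[str]   # data classification result (target queue); None if abnormal


def meter_rule(record: dict) -> PreprocessResult:
    # Assumed check: a meter voltage outside 198-242 V is treated as abnormal.
    if not 198 <= record["voltage"] <= 242:
        return PreprocessResult(True, None)
    return PreprocessResult(False, "meter-readings")


# Hypothetical rule base keyed by device type.
RULE_BASE: Dict[str, Callable[[dict], PreprocessResult]] = {"meter": meter_rule}


def preprocess(record: dict) -> PreprocessResult:
    rule = RULE_BASE[record["type"]]  # the rule is chosen by the device type in the data
    return rule(record)


normal = preprocess({"device_id": "001", "type": "meter", "voltage": 220})
abnormal = preprocess({"device_id": "002", "type": "meter", "voltage": 300})
```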
S103, sending the current power data to the multifunctional platform or the post-message cluster according to the preprocessing result.
In this embodiment, when it is determined that the current power data has an abnormality according to the preprocessing result, the current power data is sent to the multifunctional platform to be stored, and the multifunctional platform performs alarm notification according to the abnormal current power data. When the current power data is determined to have no abnormality according to the preprocessing result, the current power data is sent to a corresponding queue in the post-message cluster according to the data type of the current power data in the preprocessing result, so that the post-message cluster stores the current power data in a classified manner, and efficient and orderly storage is provided.
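The routing decision in S103 can be sketched as follows. Plain Python lists and dictionaries stand in for the multifunctional platform and the post-message cluster; the function `route`, the queue name and the alarm text are hypothetical and serve only to show the branching.

```python
from collections import defaultdict
from typing import DefaultDict, List

multifunction_platform: List[dict] = []                          # stores abnormal records
post_cluster: DefaultDict[str, List[dict]] = defaultdict(list)   # queue name -> records
alarms: List[str] = []                                           # alarm notifications issued


def route(record: dict, is_abnormal: bool, data_class: str) -> str:
    """Send a record to the multifunctional platform or to a post-message queue."""
    if is_abnormal:
        # Abnormal data is stored by the multifunctional platform, which also raises an alarm.
        multifunction_platform.append(record)
        alarms.append(f"abnormal data from device {record['device_id']}")
        return "multifunction_platform"
    # Normal data goes to the queue that matches its data type.
    post_cluster[data_class].append(record)
    return f"post_cluster/{data_class}"


dest_ok = route({"device_id": "001", "voltage": 220}, False, "meter-readings")
dest_bad = route({"device_id": "002", "voltage": 300}, True, "meter-readings")
```

Because abnormal records never reach the post-message cluster in this sketch, anything a downstream consumer reads from a queue is already checked and classified.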
After the post-message cluster stores the current power data by category, the data is sent to the big data processing platform for processing. At this point the data handled by the big data processing platform has been preprocessed and contains no abnormal data, and the data type is already known when data is retrieved from the post-message cluster. This effectively reduces the processing pressure on the big data processing platform and improves the data processing efficiency of the Internet of Things management platform.
Specifically, in the data processing method provided by the embodiment of the invention, the data preprocessing engine consumes the data reported by devices in the pre-message queue (the current power data sent by the pre-message cluster) in a stream processing manner, parses it according to the device data processing rules defined by the administrator of the Internet of Things management platform, and triggers the various data processing nodes. It then forwards the device data to the corresponding destination (the multifunctional platform or a message queue in the post-message cluster), for example by writing the data directly into a relational database for storage or by writing it into a preset message queue according to the device type, after which the subsequent data processing flow consumes the data in that queue.
According to the data preprocessing method provided by the embodiment of the invention, current power data sent by the pre-message cluster is acquired; the rule base is invoked, a corresponding preprocessing rule is determined according to the current power data, and the current power data is preprocessed according to the preprocessing rule to obtain a preprocessing result; and the current power data is sent to the multifunctional platform or the post-message cluster according to the preprocessing result. In this technical scheme, the traditional distributed message cluster is divided into a pre-message cluster and a post-message cluster, with the data preprocessing engine placed between them to preprocess the current power data. Current power data that shows no abnormality after preprocessing is passed to the post-message cluster, from which the big data processing platform retrieves data for processing. This effectively saves storage space in the distributed cluster, reduces the processing pressure on the big data processing platform, and improves the data processing efficiency of the Internet of Things management platform.
As a first optional embodiment, on the basis of the foregoing embodiment, the method preferably further includes, before invoking the rule base and determining the corresponding preprocessing rule according to the current power data:
a1) Judging whether the current power data is complete.
In this embodiment, when a power terminal device reports its current power data in practice, the device type may be lost, in which case the reported current power data is incomplete. It is therefore necessary to determine whether any data is missing from the current power data, that is, whether the current power data is complete. The specific judgment may be made from the value of each field in the current power data or by a preset check function, which is not limited in this embodiment.
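A field-value check of the kind mentioned above might look like the following minimal sketch. The list of required fields and the function names are assumptions; the document leaves the exact judgment method open.

```python
from typing import Iterable, List

REQUIRED_FIELDS = ("device_id", "type")  # assumed set of required fields


def missing_fields(record: dict, required: Iterable[str] = REQUIRED_FIELDS) -> List[str]:
    """Return the required fields that are absent or empty in the record."""
    return [name for name in required if record.get(name) in (None, "")]


def is_complete(record: dict) -> bool:
    return not missing_fields(record)


complete = {"device_id": "001", "voltage": 220, "current": 10, "type": 1}
incomplete = {"device_id": "002", "voltage": 230, "current": 12}
```

Here `is_complete(complete)` holds, while `missing_fields(incomplete)` reports that `type` is missing, which is the case the following steps handle.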
b1) If the device type in the current power data is missing, determining the device type corresponding to the current power data according to the current power data and a pre-constructed feature model, wherein the feature model is constructed from historical power data.
In this embodiment, the device type may be understood as the type of a power terminal device, such as an electric meter, a distribution box or an edge Internet of Things agent. Each power terminal device has corresponding power data. For example, the power data of an electric meter may include the device type, device number, electricity consumption, area and coordinates; the power data of a distribution box may include the device type, device number, number of loops, area and coordinates. The feature model may be understood as a model, or a feature table, that stores the latest complete power data for each device type and is constructed from historical power data. Historical power data may be understood as the power data of each power terminal device at historical times. The complete power data stored in the feature model for each device type is the power data from the historical time closest to the current time.
Specifically, if the missing data in the current power data is the device type of the power terminal device, the current power data is matched against the pre-constructed feature model: each field in the current power data is compared with each field in the historical power data, feature extraction is performed, and the similarity between the current power data and the historical power data is calculated. The device type of the historical power data that has the highest similarity to the current power data, or that meets a similarity threshold, is determined as the device type of the current power data.
c1) Filling the device type into the current power data.
In this embodiment, after feature matching is performed according to the current power data and the feature model, device type information of the current power data is determined, and the determined device type is filled into the current power data to form complete current power data.
Further, determining the device type corresponding to the current power data according to the current power data and the pre-constructed feature model comprises the following steps:
b11) Segmenting the current power data and the historical power data into words to form a current power data word bag and a historical power data word bag, respectively.
In this embodiment, the current power data word bag may be understood as a word bag formed by performing word segmentation and word frequency statistics on current power data. The word bag of the historical power data can be understood as a word bag formed by word segmentation and word frequency statistics of the historical power data.
Specifically, device matching is assisted by feature extraction and vectorization. Anomaly-free device data of various types (historical power data and/or current power data) is copied by a data copying technique, the content of the data fields of each device type is segmented into words, and word frequencies are counted to form word bags. The content of the data field values is segmented according to a certain rule, for example by splitting 4-digit data values. Each segmented word or phrase is called a word or token, and the frequency with which each word appears in each field is counted. The current power data word bag is formed from the counts for the current power data, and the historical power data word bag is formed from the counts for the historical power data.
Illustratively, data1 is complete, standard historical power data, and data2 is current power data that is missing the device type field:
data1="{'device_id':'001','voltage':220,'current':10,'type':1}"
data2="{'device_id':'002','voltage':230,'current':12}"
The field content is segmented. In this example the JSON field content is in string format and can be split on spaces or other characters.
The generated word bag is as follows:
'device_id:':1,
'001':1,
'voltage:':1,
'220':1,
'current:':1,
'10':1,
'type:':1,
'1':1
The generated word vectors are:
data1=[1,1,1,1,1,1,1,1]
data2=[1,1,1,1,1,1,0,0]
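The example above can be reproduced with a short sketch under one stated assumption: the vectors in the example mark whether each field name and its value slot are present, rather than matching literal values (data2 holds '002', not '001', yet its second entry is 1). The helper names `parse`, `word_bag` and `vectorize`, and the `name=<value>` token form, are hypothetical.

```python
import json
from collections import Counter
from typing import List

data1 = "{'device_id':'001','voltage':220,'current':10,'type':1}"
data2 = "{'device_id':'002','voltage':230,'current':12}"


def parse(raw: str) -> dict:
    # The sample strings use single quotes; swap them to obtain valid JSON.
    return json.loads(raw.replace("'", '"'))


def word_bag(record: dict) -> Counter:
    # Two tokens per field: the field name and a placeholder for its value slot.
    tokens = []
    for name in record:
        tokens += [f"{name}:", f"{name}=<value>"]
    return Counter(tokens)  # token -> frequency


def vectorize(record: dict, vocabulary: List[str]) -> List[int]:
    bag = word_bag(record)
    return [bag[token] for token in vocabulary]  # a missing token counts as 0


# The vocabulary comes from the complete historical record.
vocabulary = list(word_bag(parse(data1)))
vector1 = vectorize(parse(data1), vocabulary)
vector2 = vectorize(parse(data2), vocabulary)
```

Under this reading, `vector1` and `vector2` come out as the two vectors listed in the example. A tokenizer that compared literal values instead would give data2 more zero entries.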
b12) Determining a consistency result of the current power data word bag and the historical power data word bag through a preset consistency algorithm.
In this embodiment, the preset consistency algorithm may be understood as a preset algorithm for calculating consistency of two power data, for example, a cosine similarity algorithm. The consistency result may be understood as a similarity value of the current power data and the historical power data.
Specifically, a preset consistency algorithm is used for carrying out consistency calculation on the current power data word bags and the historical power data word bags, and a similarity value of the current power data and the historical power data is determined.
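Illustratively, the consistency of the two pieces of device data can be calculated by a cosine similarity algorithm. The inner product is calculated first: dot product = 1×1 + 1×1 + 1×1 + 1×1 + 1×1 + 1×1 + 1×0 + 1×0 = 6. The cosine similarity is then calculated as the inner product divided by the product of the two vector norms: cos θ = 6 / (√8 × √6) ≈ 0.866.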
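The cosine similarity calculation for the two example vectors can be checked with a few lines of standard-library Python. The function name `cosine_similarity` is an illustrative choice, and returning 0.0 for a zero vector is an assumption about how an empty word bag should be treated.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of a and b divided by the product of their Euclidean norms."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is treated as having no similarity
    return dot / (norm_a * norm_b)


vec_data1 = [1, 1, 1, 1, 1, 1, 1, 1]
vec_data2 = [1, 1, 1, 1, 1, 1, 0, 0]
similarity = cosine_similarity(vec_data1, vec_data2)  # 6 / (sqrt(8) * sqrt(6)), about 0.866
```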
b13) Determining the device type corresponding to the current power data based on the consistency result.
In this embodiment, when the similarity value of a consistency result meets the set similarity threshold, the device type of the historical power data that meets the threshold is regarded as the same as that of the current power data, and is determined as the device type corresponding to the current power data; and/or the similarity values between the current power data and all historical power data are compared, and the device type of the historical power data with the highest similarity value is determined as the device type corresponding to the current power data.
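One way to combine the two selection criteria is sketched below: take the historical record with the highest similarity and accept it only if it also clears the threshold. The document allows either criterion alone ("and/or"), so this combination, the 0.8 threshold, the similarity scores and the function names are all assumptions.

```python
from typing import Dict, Optional


def infer_device_type(similarities: Dict[str, float], threshold: float = 0.8) -> Optional[str]:
    """Return the device type with the highest similarity if it meets the threshold."""
    if not similarities:
        return None
    best_type = max(similarities, key=similarities.get)
    return best_type if similarities[best_type] >= threshold else None


def fill_device_type(record: dict, similarities: Dict[str, float], threshold: float = 0.8) -> dict:
    # Step c1: write the inferred type back into the record; leave it unchanged otherwise.
    inferred = infer_device_type(similarities, threshold)
    return record if inferred is None else {**record, "type": inferred}


# Hypothetical similarity of the current record to the stored record of each device type.
scores = {"meter": 0.866, "distribution_box": 0.61}
filled = fill_device_type({"device_id": "002", "voltage": 230, "current": 12}, scores)
```

With these scores the record is completed with the type `meter`; raising the threshold to 0.9 would leave the type undetermined.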
Example 2
Fig. 3 is a flowchart of a data preprocessing method according to a second embodiment of the present invention, which further optimizes any of the foregoing embodiments. The method is applicable to preprocessing the massive power data of an Internet of Things management platform. The method may be performed by a data preprocessing engine, which may be implemented in hardware and/or software.
As shown in fig. 3, the method includes:
S201, acquiring current power data sent by the pre-message cluster.
S202, calling a rule base, and determining a target processing rule corresponding to the equipment type in the rule base according to the equipment type in the current power data.
In this embodiment, the target processing rule may be understood as a pre-configured processing rule corresponding to a device type. It is composed of some or all of the following nodes: a flow orchestration node, a data processing node, a condition judgment node, a data storage node, a data transmission node, a data verification node and a data notification node. Which nodes are used is decided by the operation and maintenance administrator. The flow orchestration node provides user-selectable flow connectors, including a start node, an end node, node connecting lines and the like. The data processing node covers data cleaning, matching, encryption and decryption, field assembly and the like; the administrator can select data from certain devices and convert its content during rule engine processing, for example by decrypting the electricity usage data of an electric meter. The condition judgment node includes a judgment condition setter, a loop controller and the like. The data storage node allows the administrator to store data produced during rule engine parsing in relational and non-relational databases. The data transmission node allows the administrator to select the corresponding post-message queue according to information such as device type, area and coordinates. The data verification node allows the administrator to clean, within the rule engine, Internet of Things device information that contains monitoring index data, and to mark abnormal data so that it is stored and reported directly. The data notification node allows the administrator to select the notification type, such as App notification, Internet of Things notification or platform alarm notification.
Specifically, the rule base stores data processing rules corresponding to various types of power terminal equipment, and the data processing rules are processing flows formed according to each processing node. And calling a rule base associated with the data preprocessing engine, and determining a data processing rule corresponding to the equipment type of the current power data in the rule base as a target processing rule.
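The lookup in step S202 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names `ProcessingRule` and `RuleBase`, the `device_type` field name, and the node names are all assumptions made for illustration, and the rule base is modeled as an in-memory dictionary rather than a real store.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingRule:
    """A processing flow: an ordered list of node names (hypothetical layout)."""
    device_type: str
    nodes: list = field(default_factory=list)


class RuleBase:
    """In-memory stand-in for the rule base, keyed by equipment type."""

    def __init__(self) -> None:
        self._rules = {}

    def publish(self, rule: ProcessingRule) -> None:
        # A later rule for the same equipment type replaces the earlier one.
        self._rules[rule.device_type] = rule

    def target_rule(self, power_data: dict) -> ProcessingRule:
        # S202: select the rule by the equipment type carried in the data.
        device_type = power_data.get("device_type")
        if device_type not in self._rules:
            raise LookupError(f"no processing rule for device type {device_type!r}")
        return self._rules[device_type]


rule_base = RuleBase()
rule_base.publish(ProcessingRule("smart_meter", ["decrypt", "clean", "verify", "transmit"]))
rule = rule_base.target_rule({"device_type": "smart_meter", "kwh": 12.5})
```

Raising an error for an unknown equipment type is a design choice of this sketch; the source does not say how such data is handled.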
S203, preprocessing the current power data according to a target processing rule to obtain a preprocessing result, wherein the preprocessing operation at least comprises cleaning, matching, encryption and decryption, verification and field assembly.
In this embodiment, the preprocessing result may be understood as a processing result generated after preprocessing the current power data according to the target processing rule, where the preprocessing result includes an abnormality determination result and a data classification result. The abnormality judgment result is a judgment result of whether the current power data is abnormal data or not, and comprises that the current power data is abnormal and the current power data is not abnormal. The data classification result may be understood as a data type of the current power data.
Specifically, according to a target processing rule corresponding to the equipment type of the current power data, preprocessing operations such as cleaning, matching, encryption and decryption, verification, field assembly and the like are performed on the current power data, and an abnormality judgment result and a data classification result of the current power data after preprocessing are obtained.
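As a rough sketch of step S203, the following Python code applies three of the named operations (cleaning, verification, field assembly) and returns both parts of the preprocessing result. Matching and encryption/decryption are omitted for brevity. The rule layout (`limits`, `data_class`), the field names, and the helper functions are assumptions, not taken from the source.

```python
from dataclasses import dataclass


@dataclass
class PreprocessingResult:
    data: dict
    is_abnormal: bool   # abnormality judgment result
    data_class: str     # data classification result


def clean(record: dict) -> dict:
    """Cleaning: drop fields whose value is missing."""
    return {k: v for k, v in record.items() if v is not None}


def verify(record: dict, limits: dict) -> bool:
    """Verification: True if any monitored field exceeds its limit."""
    return any(record.get(name, 0) > limit for name, limit in limits.items())


def assemble(record: dict) -> dict:
    """Field assembly: add a derived key (hypothetical field layout)."""
    return {**record, "key": f"{record['device_type']}:{record['device_id']}"}


def preprocess(record: dict, rule: dict) -> PreprocessingResult:
    cleaned = clean(record)
    abnormal = verify(cleaned, rule["limits"])
    return PreprocessingResult(assemble(cleaned), abnormal, rule["data_class"])


rule = {"limits": {"cpu_usage": 90}, "data_class": "edge_proxy_monitoring"}
result = preprocess(
    {"device_type": "edge_iot_proxy", "device_id": "gw-01", "cpu_usage": 97, "note": None},
    rule,
)
```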
S204, when the abnormality judgment result is that the current power data is abnormal, generating an alarm identifier, and sending the alarm identifier and the current power data to the multifunctional platform.
In this embodiment, the alarm identifier may be understood as an alarm information identifier, i.e. an alarm ID.
Specifically, when the abnormality judgment result in the preprocessing result is that the current power data is abnormal, an alarm identifier for the current power data is generated. When the multifunctional platform is a single platform, the alarm identifier and the current power data are sent to the multifunctional platform, which stores the current power data and pushes the alarm identifier to the associated client, so that a manager can conveniently look up the abnormal current power data by its alarm identifier. When the multifunctional platform consists of a separate storage platform and alarm platform, the alarm identifier is sent to the alarm platform, which stores it and pushes it to the associated client, and the storage platform stores the abnormal current power data in a relational database.
S205, when the abnormality judgment result is that the current power data has no abnormality, the current power data is sent to the post-message cluster according to the data classification result corresponding to the current power data.
In this embodiment, when the abnormality determination result in the preprocessing result is that the current power data is not abnormal, it is determined that the current power data is safe and standard data, and according to the data classification result corresponding to the current power data, a message queue corresponding to the data type of the current power data is determined, and the current power data is sent to a message queue corresponding to the current power data in the post-message cluster to be stored.
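The branch between steps S204 and S205 can be sketched as follows. This is an illustration under assumptions: the multifunctional platform is modeled as a plain list, the post-message cluster as a dictionary of queues keyed by data class, and the alarm ID as a random UUID; the source does not specify how the alarm identifier is generated.

```python
import uuid
from collections import defaultdict
from typing import Optional


def dispatch(data: dict, is_abnormal: bool, data_class: str,
             platform: list, post_cluster: dict) -> Optional[str]:
    """Route one preprocessed record; return the alarm ID if one was generated."""
    if is_abnormal:
        # S204: generate an alarm identifier and send it with the data.
        alarm_id = uuid.uuid4().hex
        platform.append({"alarm_id": alarm_id, "data": data})
        return alarm_id
    # S205: the queue is chosen by the data classification result.
    post_cluster[data_class].append(data)
    return None


platform = []
post_cluster = defaultdict(list)
alarm = dispatch({"cpu": 97}, True, "edge_proxy_monitoring", platform, post_cluster)
dispatch({"cpu": 12}, False, "edge_proxy_monitoring", platform, post_cluster)
```

Abnormal data never reaches the post-message cluster in this sketch, which mirrors the behaviour described for the edge proxy example.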
It can be understood that the Internet of Things management platform provides drag-and-drop configuration of device data processing rules: managers of the platform can define data processing rules for each type of device on the device data processing rule page, and can build custom device data processing scenarios by dragging and arranging the node types described in step S202.
For example, the data uploaded by an edge Internet of Things proxy device includes monitoring data of the Internet of Things proxy gateway itself, such as CPU usage and memory usage. To arrange the data processing flow of the edge Internet of Things proxy device, a data verification node is set first, together with the fields to be verified and the alarm threshold of each data index. A condition node is then added: when the detected data is larger than the set threshold, a data notification node is added whose targets are the App of the Internet of Things proxy gateway maintainer and the Internet of Things data management platform; when the detected data is smaller than the set threshold, the post distributed message queue is set, and the historical monitoring information is archived and stored by a subsequent lower-level data processor.
After this configuration, when an edge Internet of Things proxy device (a power terminal device) with abnormal monitoring data reports its data to the front message cluster, the data preprocessing engine determines, according to the processing flow corresponding to the device type, that the device is abnormal. It then queries the corresponding maintainer information by the serial number of the Internet of Things proxy device and sends alarm information through the management-end App. Data detected as abnormal is no longer sent to the post-message cluster; instead, the data preprocessing engine stores the abnormal data directly in a relational database (the multifunctional platform) and sets an alarm identifier value. The Internet of Things management platform is notified synchronously and can query the details of the abnormality based on the alarm identifier.
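The edge proxy example above can be expressed as a small rule configuration plus an evaluation function. The sketch below is hypothetical: the threshold values (90 and 85 percent), the queue name, the notification target names, and the dictionary layout are invented for illustration. The source only describes the "larger than" and "smaller than" cases; here a value exactly equal to the threshold is treated as normal, which is an assumption.

```python
# Hypothetical rule for the edge IoT proxy device type.
RULE = {
    "device_type": "edge_iot_proxy",
    # Data verification node: fields to check and their alarm thresholds (%).
    "check_fields": {"cpu_usage": 90.0, "memory_usage": 85.0},
    # Data notification node: who is told when a threshold is exceeded.
    "notify": ["maintainer_app", "iot_management_platform"],
    # Data transmission node: post queue for normal monitoring history.
    "post_queue": "edge_proxy_history",
}


def evaluate(report: dict, rule: dict) -> dict:
    """Condition node: notify if any checked field exceeds its threshold."""
    exceeded = [name for name, limit in rule["check_fields"].items()
                if report.get(name, 0.0) > limit]
    if exceeded:
        return {"action": "notify", "targets": rule["notify"], "fields": exceeded}
    return {"action": "forward", "queue": rule["post_queue"]}


abnormal = evaluate({"cpu_usage": 97.0, "memory_usage": 40.0}, RULE)
normal = evaluate({"cpu_usage": 10.0, "memory_usage": 40.0}, RULE)
```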
After the device data processing rules have been arranged, they can be published and take effect in real time, and the rule engine then processes data according to the newly published rules.
Existing real-time processing methods for Internet of Things big data generally add a data processing framework on top of distributed middleware to handle massive data and optimize parts of the big data processing framework. This decouples the massive data from the data processing applications, but it does not handle well the updating of power equipment in the power industry. For example, if one district of a city carries out a pilot upgrade of smart meter equipment while other districts are not yet upgraded, then for the Internet of Things management platform to interface with the newly added pilot equipment, the service logic in the big data processing framework must be upgraded; otherwise the data of the pilot equipment cannot be adapted and identified. Frequent upgrades of big data processing services in the live network bring unpredictable risks.
According to the data preprocessing method provided by the embodiment of the invention, the current power data sent by the pre-message cluster is obtained; the rule base is invoked, and a target processing rule corresponding to the equipment type is determined in the rule base according to the equipment type in the current power data; the current power data is preprocessed according to the target processing rule to obtain a preprocessing result, wherein the preprocessing operation at least comprises cleaning, matching, encryption and decryption, verification and field assembly; when the abnormality judgment result is that the current power data is abnormal, an alarm identifier is generated, and the alarm identifier and the current power data are sent to the multifunctional platform; and when the abnormality judgment result is that the current power data is not abnormal, the current power data and a data classification result corresponding to the current power data are sent to the post-message cluster. In this technical scheme, front and rear distributed message queues are adopted and the power equipment rule engine is placed in the middle of the data buffer layer, so that different types of power terminal equipment can be modeled, their data processing rules can be arranged, and the target channel for subsequent data processing can be set, which brings great flexibility to the data processing of Internet of Things power terminal equipment.
By way of example, when pilot equipment is newly added in a district of a city, the service logic in the big data processing framework already deployed in the live network does not need to be modified and republished. It is only necessary to subscribe to a new queue, and to adapt and publish a processing service for the new equipment data. Meanwhile, the pilot equipment is modeled in the preprocessing engine, and the corresponding processing logic is set on the visual device processing rule arrangement page, for example a rule that forwards data messages whose device type is the pilot equipment to the newly added queue; the device processing rule is then published, and the data preprocessing engine applies the newly effective rule in real time. Similarly, for old power equipment, the corresponding power equipment rule can be invalidated directly, and the equipment data is taken offline once the rule base has been republished. In summary, the invention effectively saves the storage space of the distributed cluster, reduces the processing pressure on the big data processing platform, and improves the data processing efficiency of the Internet of Things management platform.
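One way to picture publishing a rule for pilot equipment and invalidating a rule for retired equipment at run time is a registry guarded by a lock. This is a sketch only; the class `LiveRuleRegistry`, its method names, and the device and queue names are assumptions, and the source does not describe how the real engine applies rule changes.

```python
import threading


class LiveRuleRegistry:
    """Maps a device type to its target post queue; safe to update while in use."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rules = {}

    def publish(self, device_type, target_queue):
        # Takes effect for the next message; no engine restart is needed.
        with self._lock:
            self._rules[device_type] = target_queue

    def invalidate(self, device_type):
        # Retire the rule of old equipment; unknown types are ignored.
        with self._lock:
            self._rules.pop(device_type, None)

    def queue_for(self, device_type):
        with self._lock:
            return self._rules.get(device_type)


registry = LiveRuleRegistry()
registry.publish("legacy_meter", "legacy_meter_queue")
registry.publish("pilot_meter", "pilot_meter_queue")   # newly added pilot equipment
registry.invalidate("legacy_meter")                     # old equipment taken offline
```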
Example III
Fig. 4 is a schematic structural diagram of a data preprocessing engine according to a third embodiment of the present invention. The data preprocessing engine is respectively connected with the front message cluster, the rear message cluster, the multifunctional platform and the rule base, and the rear message cluster is connected with the big data processing platform. As shown in fig. 4, the engine includes:
a power data acquisition module 31, configured to acquire current power data sent by the front message cluster;
the power data processing module 32 is configured to invoke the rule base, determine a corresponding preprocessing rule according to the current power data, and perform preprocessing on the current power data according to the preprocessing rule to obtain a preprocessing result;
and the power data sending module 33 is configured to send the current power data to the multifunctional platform or the post-message cluster according to the preprocessing result.
The data preprocessing engine adopted by the technical scheme saves the storage space of the distributed clusters, reduces the processing pressure of a big data processing platform and improves the data processing efficiency of the Internet of things management platform.
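The three modules of the engine can be sketched as three methods of one class. All names here are illustrative assumptions: the front message cluster is modeled as a `queue.Queue`, the rule base as a dictionary of callables, the multifunctional platform as a list, and the rear message cluster as a dictionary of lists. A real deployment would use distributed message queues and a database.

```python
import queue
import uuid
from collections import defaultdict


class DataPreprocessingEngine:
    """Toy engine wired to in-memory stand-ins for its four neighbours."""

    def __init__(self, front_cluster, rule_base, platform, rear_cluster):
        self.front_cluster = front_cluster   # queue.Queue of incoming records
        self.rule_base = rule_base           # device type -> callable rule
        self.platform = platform             # stand-in for the multifunctional platform
        self.rear_cluster = rear_cluster     # data class -> list of records

    def acquire(self):
        """Module 31: take the current power data from the front message cluster."""
        return self.front_cluster.get_nowait()

    def process(self, data):
        """Module 32: look up the rule and apply it; returns (is_abnormal, data_class)."""
        return self.rule_base[data["device_type"]](data)

    def send(self, data, result):
        """Module 33: route to the platform or to the rear message cluster."""
        is_abnormal, data_class = result
        if is_abnormal:
            self.platform.append({"alarm_id": uuid.uuid4().hex, "data": data})
        else:
            self.rear_cluster[data_class].append(data)

    def run_once(self):
        data = self.acquire()
        self.send(data, self.process(data))


def meter_rule(data):
    # Hypothetical rule: a negative reading is treated as abnormal.
    return data["kwh"] < 0, "meter_readings"


front = queue.Queue()
front.put({"device_type": "smart_meter", "kwh": 3.2})
front.put({"device_type": "smart_meter", "kwh": -1.0})
engine = DataPreprocessingEngine(front, {"smart_meter": meter_rule}, [], defaultdict(list))
engine.run_once()
engine.run_once()
```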
Optionally, the engine further comprises:
the data integrity judging module is used for judging whether the current power data is complete or not before the rule base is called and the corresponding preprocessing rule is determined according to the current power data;
The device type determining module is used for determining the device type corresponding to the current power data according to the current power data and a pre-constructed characteristic model if the device type in the current power data is lost, wherein the characteristic model is constructed according to historical power data;
and the power data filling module is used for filling the equipment type into the current power data.
Optionally, the device type determining module is specifically configured to:
word segmentation is carried out on the current power data and the historical power data respectively to form a current power data word bag and a historical power data word bag;
determining a consistency result of the current power data word bag and the historical power data word bag through a preset consistency algorithm;
and determining the equipment type corresponding to the current power data according to the consistency result.
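The bag-of-words comparison can be sketched as below. The source does not name the word segmentation method or the "preset consistency algorithm"; this example assumes that field names are split on underscores and that consistency is measured with Jaccard similarity over the two bags. The device types and field names are invented for illustration.

```python
from collections import Counter


def bag_of_words(record: dict) -> Counter:
    """Segment a record into a bag of field-name tokens (assumed segmentation)."""
    tokens = []
    for key in record:
        tokens.extend(key.lower().split("_"))
    return Counter(tokens)


def jaccard(a: Counter, b: Counter) -> float:
    """Multiset Jaccard similarity, standing in for the consistency algorithm."""
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 0.0


def infer_device_type(current: dict, history: dict) -> str:
    """Return the device type whose historical bag is most consistent with the data."""
    current_bag = bag_of_words(current)
    scores = {}
    for device_type, records in history.items():
        bag = Counter()
        for record in records:
            bag |= bag_of_words(record)   # union, so repeated records do not inflate counts
        scores[device_type] = jaccard(current_bag, bag)
    return max(scores, key=scores.get)


history = {
    "smart_meter": [{"active_energy_kwh": 1.0, "voltage_v": 230}],
    "edge_iot_proxy": [{"cpu_usage": 10, "memory_usage": 20}],
}
current = {"cpu_usage": 95, "memory_usage": 40, "device_id": "gw-01"}
current["device_type"] = infer_device_type(current, history)   # fill the missing field
```

The last line corresponds to the power data filling module: the inferred type is written back into the current power data before the rule base is called.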
Optionally, the power data processing module 32 is specifically configured to:
invoking the rule base, and determining a target processing rule corresponding to the equipment type in the rule base according to the equipment type in the current power data;
and preprocessing the current power data according to the target processing rule to obtain a preprocessing result, wherein the preprocessing operation at least comprises cleaning, matching, encryption and decryption, verification and field assembly.
Optionally, the preprocessing result includes an abnormality judgment result and a data classification result.
Optionally, the power data transmission module 33 is specifically configured to:
and when the abnormality judgment result is that the current power data is abnormal, generating an alarm identifier, and sending the alarm identifier and the current power data to a multifunctional platform.
Optionally, the power data transmission module 33 is specifically configured to:
and when the abnormality judgment result is that the current power data is not abnormal, sending the current power data and a data classification result corresponding to the current power data to a post-message cluster.
The data preprocessing engine provided by the embodiment of the invention can execute the data preprocessing method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example IV
Fig. 5 shows a schematic diagram of an electronic device 40 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 40 includes at least one processor 41, and a memory communicatively connected to the at least one processor 41, such as a Read Only Memory (ROM) 42, a Random Access Memory (RAM) 43, etc., in which the memory stores a computer program executable by the at least one processor, and the processor 41 may perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM) 42 or the computer program loaded from the storage unit 48 into the Random Access Memory (RAM) 43. In the RAM 43, various programs and data required for the operation of the electronic device 40 may also be stored. The processor 41, the ROM 42 and the RAM 43 are connected to each other via a bus 44. An input/output (I/O) interface 45 is also connected to bus 44.
Various components in electronic device 40 are connected to I/O interface 45, including: an input unit 46 such as a keyboard, a mouse, etc.; an output unit 47 such as various types of displays, speakers, and the like; a storage unit 48 such as a magnetic disk, an optical disk, or the like; and a communication unit 49 such as a network card, modem, wireless communication transceiver, etc. The communication unit 49 allows the electronic device 40 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 41 may be various general and/or special purpose processing components with processing and computing capabilities. Some examples of processor 41 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 41 performs the various methods and processes described above, such as the data preprocessing method.
In some embodiments, the data preprocessing method may be implemented as a computer program tangibly embodied on a computer readable storage medium, such as the storage unit 48. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 40 via the ROM 42 and/or the communication unit 49. When the computer program is loaded into RAM 43 and executed by processor 41, one or more steps of the data preprocessing method described above may be performed. Alternatively, in other embodiments, the processor 41 may be configured to perform the data preprocessing method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the Internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. The data preprocessing method is characterized by being applied to a data preprocessing engine, wherein the data preprocessing engine is respectively connected with a front message cluster, a rear message cluster, a multifunctional platform and a rule base, and the rear message cluster is connected with a big data processing platform, and the method comprises the following steps:
acquiring current power data sent by the pre-message cluster;
invoking the rule base, determining a corresponding preprocessing rule according to the current power data, and preprocessing the current power data according to the preprocessing rule to obtain a preprocessing result;
and according to the preprocessing result, the current power data is sent to the multifunctional platform or the post-message cluster.
2. The method of claim 1, further comprising, prior to said invoking the rule base to determine a corresponding preprocessing rule based on the current power data:
judging whether the current power data is complete or not;
if the equipment type in the current power data is lost, determining the equipment type corresponding to the current power data according to the current power data and a pre-constructed characteristic model, wherein the characteristic model is constructed according to historical power data;
and filling the equipment type into the current power data.
3. The method of claim 2, wherein determining the device type corresponding to the current power data from the current power data and a pre-constructed feature model comprises:
word segmentation is carried out on the current power data and the historical power data respectively to form a current power data word bag and a historical power data word bag;
determining a consistency result of the current power data word bag and the historical power data word bag through a preset consistency algorithm;
and determining the equipment type corresponding to the current power data according to the consistency result.
4. The method of claim 1, wherein the invoking the rule base to determine a corresponding preprocessing rule according to the current power data, preprocessing the current power data according to the preprocessing rule, and obtaining a preprocessing result includes:
invoking the rule base, and determining a target processing rule corresponding to the equipment type in the rule base according to the equipment type in the current power data;
and preprocessing the current power data according to the target processing rule to obtain a preprocessing result, wherein the preprocessing operation at least comprises cleaning, matching, encryption and decryption, verification and field assembly.
5. The method of claim 1, wherein the pre-processing results include an anomaly determination result and a data classification result.
6. The method according to claim 5, wherein the sending the current power data to the multi-function platform or the post-message cluster according to the preprocessing result comprises:
and when the abnormality judgment result is that the current power data is abnormal, generating an alarm identifier, and sending the alarm identifier and the current power data to a multifunctional platform.
7. The method according to claim 5, wherein the sending the current power data to the multi-function platform or the post-message cluster according to the preprocessing result comprises:
and when the abnormality judgment result is that the current power data is not abnormal, sending the current power data to a post-message cluster according to a data classification result corresponding to the current power data.
8. The data preprocessing engine is characterized by being respectively connected with a front message cluster, a rear message cluster, a multifunctional platform and a rule base, wherein the rear message cluster is connected with a big data processing platform, and comprises the following components:
the power data acquisition module is used for acquiring current power data sent by the front message cluster;
the power data processing module is used for calling the rule base, determining corresponding preprocessing rules according to the current power data, and preprocessing the current power data according to the preprocessing rules to obtain preprocessing results;
and the power data transmitting module is used for transmitting the current power data to the multifunctional platform or the post-message cluster according to the preprocessing result.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform a data preprocessing method according to any one of claims 1-7.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores computer instructions for causing a processor to implement a data preprocessing method according to any one of claims 1-7 when executed.
CN202410014810.6A 2024-01-04 2024-01-04 Data preprocessing method, engine, equipment and storage medium Pending CN117830028A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410014810.6A CN117830028A (en) 2024-01-04 2024-01-04 Data preprocessing method, engine, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410014810.6A CN117830028A (en) 2024-01-04 2024-01-04 Data preprocessing method, engine, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117830028A (en) 2024-04-05

Family

ID=90517144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410014810.6A Pending CN117830028A (en) 2024-01-04 2024-01-04 Data preprocessing method, engine, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117830028A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination