CN115484326A - Method, system and storage medium for processing data - Google Patents

Publication number: CN115484326A
Authority: CN (China)
Prior art keywords: data, fusion, industrial, analysis, result
Prior art date
Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN202211001422.1A
Other languages: Chinese (zh)
Inventors: 江百川, 王立恒, 乔浩磊, 王启蒙, 龚亮华
Current Assignee: Fengtai Technology Beijing Co ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Fengtai Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers
    • G PHYSICS
    • G07 CHECKING-DEVICES
    • G07C TIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
    • G07C3/00 Registering or indicating the condition or the working of machines or other apparatus, other than vehicles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1491 Countermeasures against malicious traffic using deception as countermeasure, e.g. honeypots, honeynets, decoys or entrapment
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/18 Multiprotocol handlers, e.g. single devices capable of handling multiple protocols

Abstract

The application is applicable to the technical field of data processing, and provides a method, a system and a storage medium for processing data, which comprises the following steps: acquiring industrial data generated by each device in an industrial environment, wherein the communication protocols adopted by each device are different; analyzing each industrial data to obtain a plurality of analyzed data; determining a fusion algorithm corresponding to a plurality of analysis data according to a preset fusion purpose, wherein the fusion purpose comprises equipment early warning and topological graph construction; and performing fusion processing on the plurality of analysis data according to a fusion algorithm to obtain a fusion result, wherein the fusion result is used for describing the operation condition of each device. According to the technical scheme, the industrial data of different communication protocols generated by the equipment are analyzed, and the analyzed data are fused according to the fusion purpose of the user, so that the comprehensive analysis of the industrial data is realized, the obtained fusion result can describe the operation condition of the equipment in the industrial environment in a multi-dimensional manner, and the current condition of the equipment can be more accurately and completely reflected.

Description

Method, system and storage medium for processing data
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method, a system, and a storage medium for processing data.
Background
With the continuous development of computer networks, from local area networks in laboratories to wide area networks reaching countless households, computers of all kinds have become increasingly common in daily life and industry, and more and more devices are connected to networks. Whether it is a computer or an industrial control device, every device on a network must communicate by means of a protocol, and a variety of communication protocols have emerged as a result.
Each device of the industrial environment may generate different industrial data according to different communication protocols. In the prior art, the industrial data of each protocol is usually analyzed and processed independently, and comprehensive analysis cannot be performed, so that the industrial data in the industrial environment is analyzed one-sidedly, and the condition of the industrial environment cannot be accurately described.
Disclosure of Invention
In view of this, embodiments of the present application provide a method, a system, and a storage medium for processing data, so as to solve the problem in the prior art that the analysis of industrial data in an industrial environment is one-sided and the condition of the industrial environment cannot be accurately described, because the industrial data of each protocol is usually analyzed and processed independently and cannot be comprehensively analyzed.
A first aspect of an embodiment of the present application provides a method for processing data, including: acquiring industrial data generated by each device in an industrial environment, wherein the communication protocols adopted by each device are different;
analyzing each industrial data to obtain a plurality of analyzed data;
determining a fusion algorithm corresponding to a plurality of analysis data according to a preset fusion purpose, wherein the fusion purpose comprises equipment early warning and topological graph construction;
and performing fusion processing on the plurality of analysis data according to a fusion algorithm to obtain a fusion result, wherein the fusion result is used for describing the operation condition of each device.
In the implementation mode, the industrial data of different communication protocols generated by the equipment are analyzed, and then the analyzed data are fused according to the fusion purpose of the user, so that the comprehensive analysis of the industrial data is realized, the obtained fusion result can describe the operation condition of the equipment in the industrial environment in a multi-dimensional manner, and the current condition of the equipment can be more accurately and completely reflected.
Optionally, each industrial data includes a communication protocol identifier, and the analyzing of each industrial data to obtain a plurality of analyzed data includes: determining an analysis scheme corresponding to each industrial data according to each communication protocol identifier; and analyzing each industrial data according to each analysis scheme to obtain a plurality of analysis data.
Optionally, analyzing each industrial data according to each analysis scheme to obtain a plurality of analysis data includes: analyzing each industrial data into analyzable data according to each analysis scheme; and filtering out redundant data in each analyzable data to obtain the plurality of analysis data.
Optionally, performing fusion processing on the multiple analysis data according to a fusion algorithm to obtain a fusion result, including: searching for complementary data in the plurality of parsed data; and combining the complementary data to obtain a fusion result.
Optionally, the industrial data includes data entered into the device, log data of the device, and data output by the device.
Optionally, after the fusion processing is performed on the multiple analysis data according to a fusion algorithm to obtain a fusion result, the method further includes: and constructing honeypot data according to the fusion result, wherein the honeypot data is used for honeypot deployment.
Optionally, the method further comprises: backing up each industrial data and the fusion result to obtain backup data, wherein the backup data is used for generating simulation data.
A second aspect of an embodiment of the present application provides a system for processing data, including:
an acquisition module, used for collecting industrial data generated by each device in an industrial environment, wherein the communication protocols adopted by the devices are different;
the analysis module is used for analyzing each industrial data to obtain a plurality of analysis data;
the fusion module is used for determining fusion algorithms corresponding to the plurality of analysis data according to a preset fusion purpose, wherein the fusion purpose comprises equipment early warning and topological graph construction; and performing fusion processing on the plurality of analysis data according to a fusion algorithm to obtain a fusion result, wherein the fusion result is used for describing the operation condition of each device.
Optionally, the system further comprises: a storage module, used for backing up each industrial data and the fusion result to obtain backup data, wherein the backup data is used for generating simulation data.
A third aspect of embodiments of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of the method according to the first aspect.
A fourth aspect of an embodiment of the present application provides a chip, including: a processor, configured to call and run a computer program from a memory, so that a device on which the chip is installed performs the steps of the method according to the first aspect.
A fifth aspect of embodiments of the present application provides a computer program product, which, when run on a system for processing data, causes the system for processing data to perform the steps of the method according to the first aspect described above.
For other advantages of the present application, please refer to the description of the advantages of the first aspect, which is not repeated herein.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart diagram of a method of processing data provided by an exemplary embodiment of the present application;
FIG. 2 is a flowchart illustrating in detail step S102 of a method for processing data according to another exemplary embodiment of the present application;
FIG. 3 is a detailed flowchart illustrating a step S104 of a method for processing data according to yet another exemplary embodiment of the present application;
FIG. 4 is a schematic flow chart diagram of a method of processing data provided by yet another exemplary embodiment of the present application;
FIG. 5 is a schematic flow chart diagram of a method of processing data provided by yet another exemplary embodiment of the present application;
FIG. 6 is a schematic diagram of a system for processing data according to an exemplary embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In the description of the embodiments of the present application, "/" means "or" unless otherwise specified; for example, A/B may mean A or B. "And/or" herein merely describes an association between associated objects and indicates that three relationships are possible; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, in the description of the embodiments of the present application, "a plurality" means two or more unless otherwise specified.
In the following, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature.
With the continuous development of computer networks, from local area networks in laboratories to wide area networks reaching countless households, computers of all kinds have become increasingly common in daily life and industry, and more and more devices are connected to networks. Whether it is a computer or an industrial control device, every device on a network must communicate by means of a protocol, and a variety of communication protocols have emerged as a result.
Each device in an industrial environment may generate different industrial data according to its communication protocol, and industrial data generated under different communication protocols must be collected by different kinds of matching collectors. Traditionally, industrial data of a single communication protocol is collected and then analyzed on its own; industrial data of multiple communication protocols cannot be analyzed at the same time, so no comprehensive analysis is possible. As a result, the analysis of industrial data in the industrial environment is one-sided and cannot accurately describe the condition of the industrial environment.
In view of the above, the present application provides a method, system and storage medium for processing data, including: acquiring industrial data generated by each device in an industrial environment, wherein the communication protocols adopted by each device are different; analyzing each industrial data to obtain a plurality of analyzed data; determining a fusion algorithm corresponding to a plurality of analysis data according to a preset fusion purpose, wherein the fusion purpose comprises equipment early warning and topological graph construction; and performing fusion processing on the plurality of analysis data according to a fusion algorithm to obtain a fusion result, wherein the fusion result is used for describing the operation condition of each device. In the implementation mode, the industrial data of different communication protocols generated by the equipment is analyzed, and the analyzed data is fused according to the fusion purpose of the user, so that the comprehensive analysis of the industrial data is realized, the obtained fusion result can describe the operation state of the equipment in the industrial environment in a multi-dimensional manner, and the current state of the equipment can be more accurately and completely reflected.
The technical solution of the present application will be described in detail below with specific examples. These several specific embodiments may be combined with each other below, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a method for processing data according to an exemplary embodiment of the present application. The method may be executed by a terminal device, including but not limited to a vehicle-mounted computer, a tablet computer, a smartphone, a wearable device, or a Personal Digital Assistant (PDA). It may also be executed by various types of servers; for example, the server may be an independent server, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a Content Delivery Network (CDN), and big data and artificial intelligence platforms. The method may also be executed by a system for processing data.
As shown in FIG. 1, the method may include steps S101 to S104, as follows:
S101: Collect industrial data generated by each device in an industrial environment, wherein the communication protocols adopted by the devices are different.
Industrial environment refers to the physical space of an industrial economic activity, which includes all external influences and forces acting on the industrial economic activity. Industrial environments are provided with a plurality of devices, such as various industrial devices, industrial control devices, production devices, and the like. The communication protocol employed by each different type of device is different.
A communication protocol, also called a communication procedure, is an agreement that governs data transmission between two communicating parties. The agreement may lay down unified rules on matters such as data format, synchronization mode, transmission speed, transmission steps, error detection and correction mode, and control character definitions. Both parties must comply with these rules, which are also called a link control procedure.
Illustratively, the communication protocols may include: the Process Field Bus protocol (PROFIBUS), the serial communication protocol Modbus, the Controller Area Network (CAN) serial communication protocol, the Highway Addressable Remote Transducer (HART) open communication protocol, Ethernet for Control Automation Technology (EtherCAT), the industrial Ethernet communication protocol EtherNet/IP, Modbus/TCP, the industrial-Ethernet-based automation bus standard PROFINET, OPC Unified Architecture (OPC UA), and so on, and may further include proprietary protocols of different manufacturers. The description is given for illustrative purposes only and is not intended to be limiting.
Industrial data refers to the variety of data that the device is capable of generating. For example, the industrial data may include any one or any combination of data entered into the device, log data of the device, and data output by the device.
For example, the data input into the device may include manipulation instruction data input into the device, data transmitted to the device by other devices, and the like. The data output by the device may include feedback data of the device on the manipulation instruction, data output by the device after processing data transmitted to the device by other devices, and the like. The log data of the device refers to data generated by triggering of an event during the operation of the device. The log data of the device may include data of the current state of the device, real-time data read by the sensor, and the like.
Illustratively, each device is provided with a collector, and each collector is configured according to actual requirements, for example with a port matching the communication protocol of its device; the configured collectors collect the industrial data generated by the devices in real time. A collector is an automated device that collects and processes real-time data on site. It provides functions such as real-time collection, automatic storage, instant display, instant feedback, automatic processing, and automatic transmission, which helps ensure the authenticity, validity, timeliness, and usability of the collected industrial data.
Illustratively, the collectors may include industrial data collectors and network data collection software. The industrial data collector can be used for collecting data input and/or output by each device, and the network data collection software can be used for collecting log data of each device. The description is given for illustrative purposes only and is not intended to be limiting.
S102: and analyzing each industrial data to obtain a plurality of analyzed data.
Since the directly collected industrial data may include garbled data, encrypted data, redundant data, and the like, which cannot be analyzed directly, each collected industrial data needs to be parsed to obtain a plurality of analysis data.
Analysis data is data that can be used directly for analysis. Specifically, all data in the analysis data conforms to a preset data format, has been decrypted, and has had redundant data removed.
Because the communication protocols adopted by each different type of equipment are different, an analysis scheme corresponding to each equipment can be set in advance according to different communication protocols, and the analysis scheme is used for analyzing industrial data generated by the equipment. Illustratively, after industrial data of each device is collected, the industrial data generated by each device is analyzed according to an analysis scheme corresponding to each device, so that a plurality of analysis data are obtained.
S103: and determining a fusion algorithm corresponding to the plurality of analysis data according to a preset fusion purpose.
The preset fusion purpose is set by the user according to the actual requirement. For example, the preset fusion purpose may include device early warning, topological graph construction, vulnerability detection, hidden danger detection, attack information prediction, sensor early warning, user suggestion generation, data report generation, data processing result recording, global information generation, and the like.
Illustratively, different fusion algorithms are set for different fusion purposes in advance, and the different fusion purposes and the corresponding fusion algorithms are stored in a database or an algorithm library in an associated manner. And after the fusion purpose is determined, searching a fusion algorithm matched with the fusion purpose in a database or an algorithm library, and determining the searched fusion algorithm as the fusion algorithm corresponding to the plurality of analysis data.
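The lookup described above can be pictured as a simple mapping from fusion purpose to fusion function. The following is a minimal sketch under stated assumptions: the purpose keys, the record fields (`is_attack`, `src`, `dst`), and both placeholder algorithms are hypothetical and are not taken from the application.

```python
from typing import Callable, Dict, List

# Hypothetical type: a fusion algorithm takes parsed records and returns a result.
FusionAlgorithm = Callable[[List[dict]], dict]


def fuse_for_early_warning(parsed: List[dict]) -> dict:
    # Placeholder algorithm: count the records flagged as attacks.
    return {"attack_records": sum(1 for r in parsed if r.get("is_attack"))}


def fuse_for_topology(parsed: List[dict]) -> dict:
    # Placeholder algorithm: collect unique (source, destination) pairs as edges.
    edges = sorted({(r["src"], r["dst"]) for r in parsed if "src" in r and "dst" in r})
    return {"edges": edges}


# Stand-in for the "database or algorithm library" that stores each fusion
# purpose together with its fusion algorithm.
ALGORITHM_LIBRARY: Dict[str, FusionAlgorithm] = {
    "device_early_warning": fuse_for_early_warning,
    "topology_construction": fuse_for_topology,
}


def select_fusion_algorithm(purpose: str) -> FusionAlgorithm:
    """Return the fusion algorithm associated with a preset fusion purpose."""
    if purpose not in ALGORITHM_LIBRARY:
        raise LookupError(f"no fusion algorithm stored for purpose {purpose!r}")
    return ALGORITHM_LIBRARY[purpose]


algorithm = select_fusion_algorithm("topology_construction")
result = algorithm([
    {"src": "plc1", "dst": "hmi"},
    {"src": "plc1", "dst": "hmi"},
    {"src": "hmi", "dst": "server"},
])
```

Keeping the association in a table means a new fusion purpose can be supported by adding one entry, without changing the selection logic.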
Optionally, the algorithms may be designed according to the actual environment of the field deployment, so that different fusion algorithms are designed for the plurality of analysis data, for example weighted fusion, feature-level fusion, and decision-level fusion.
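The application does not define weighted fusion further. One common reading is a weighted average of values reported by several sources; the sketch below assumes that reading, and the function name and sample numbers are illustrative only.

```python
from typing import Sequence


def weighted_fusion(values: Sequence[float], weights: Sequence[float]) -> float:
    """Fuse several readings of the same quantity into one weighted average.

    Assumption: a larger weight means the source is trusted more.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("the sum of the weights must be positive")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Two hypothetical temperature readings; the second source has three times the weight.
fused = weighted_fusion([10.0, 20.0], [1.0, 3.0])
```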
S104: and carrying out fusion processing on the plurality of analysis data according to a fusion algorithm to obtain a fusion result.
The fusion result describes the operation condition of each device in multiple dimensions. For example, the fusion result may describe the operation condition of each device in terms of network security, network speed, device operating efficiency, degree of functional diversity of the device, economic benefit of the device, network environment of the device (such as a general network environment, a local area network environment, or a wide area network environment), device security, device failure rate, device power consumption, and the degree of coordination among the devices.
For example, when the fusion purpose is device early warning, the corresponding fusion algorithm assigns levels or weights to the plurality of analysis data, analyzes the data with a high level or large weight to obtain a first analysis result, and analyzes the data with other levels or weights to obtain a second analysis result. The first analysis result and the second analysis result jointly form the fusion result. The first analysis result is used for issuing early warnings about the user's devices, and the second analysis result is used for describing the operation condition of the devices in aspects other than network security.
For example, network attack data (e.g., data generated after a device is attacked) is selected from the plurality of analysis data and combined, a high level or large weight is assigned to it, and the number of attacks and the attack severity are derived from it. The number of attacks is compared with a preset number of attacks, and the attack severity with a preset attack severity; the comparison result is recorded as the first analysis result.
If the number of attacks exceeds the preset number of attacks and/or the attack severity exceeds the preset attack severity, an early warning is issued for the device. For example, a message or an email is sent to the user promptly to indicate that the number of attacks on the user's device exceeds the preset number and/or that the attack severity exceeds the preset severity; the early warning may also be given by voice, by an alarm, or in other ways.
Data with other levels or weights is then analyzed. For example, data on the functions a device can perform is analyzed to obtain the degree of functional diversity of the device, and data on the frequency and time of device failures is analyzed to obtain the failure rate of the device. These analysis results are recorded as the second analysis result.
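The early-warning example above can be sketched as follows. This is only an illustration under stated assumptions: the record fields, the use of the highest severity as the "attack degree", and the threshold values are hypothetical choices, not details given in the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ParsedRecord:
    device: str
    kind: str          # hypothetical categories: "attack", "fault", ...
    severity: int = 0  # only meaningful for attack records in this sketch


def fuse_for_early_warning(records: List[ParsedRecord],
                           max_attacks: int,
                           max_severity: int) -> dict:
    # High-priority group: network attack data.
    attacks = [r for r in records if r.kind == "attack"]
    # Everything else is treated as the lower-priority group.
    others = [r for r in records if r.kind != "attack"]

    attack_count = len(attacks)
    # Assumption: the attack degree is measured as the highest severity seen.
    worst = max((r.severity for r in attacks), default=0)

    first_result = {
        "attack_count": attack_count,
        "max_severity": worst,
        # Warn when either preset threshold is exceeded ("and/or" in the text).
        "warn": attack_count > max_attacks or worst > max_severity,
    }
    second_result = {
        "fault_count": sum(1 for r in others if r.kind == "fault"),
    }
    # The first and second analysis results together form the fusion result.
    return {"first": first_result, "second": second_result}


records = [
    ParsedRecord("plc1", "attack", severity=2),
    ParsedRecord("plc1", "attack", severity=5),
    ParsedRecord("hmi", "attack", severity=1),
    ParsedRecord("plc1", "fault"),
]
fusion_result = fuse_for_early_warning(records, max_attacks=5, max_severity=4)
```

Here the attack count (3) stays under its threshold, but the highest severity (5) exceeds the preset value of 4, so the sketch raises a warning.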
The fusion result can be displayed in the forms of a document, an analysis graph, a table, a log, a topological graph and the like according to the actual requirements of the user. The description is given for illustrative purposes only and is not intended to be limiting.
Optionally, in a possible implementation, fusing the plurality of analysis data according to the fusion algorithm may also consist of associating the plurality of analysis data with the plurality of industrial data, that is, associating the parsed industrial data with the unparsed industrial data, so as to obtain a target position or attribute. For example, in some special environments the network carries information such as topology information and actual geographic location; once the analysis data are associated with the industrial data, this information can be obtained, which facilitates later operations such as drawing and logical association.
Optionally, in a possible implementation, the fusion result obtained by fusing the plurality of analysis data may provide real-time data support for platform displays in competition, learning, and simulation scenarios.
In this implementation, the industrial data of different communication protocols generated by the devices is analyzed, and the analysis data is fused according to the fusion purpose of the user, so that comprehensive analysis of the industrial data is achieved. The resulting fusion result can describe the operation condition of the devices in the industrial environment in multiple dimensions and reflects the current condition of the devices more accurately and completely. This processing approach can also handle industrial data in batches, which improves processing efficiency. The method provided by the present application can be applied to various network environments, which improves its applicability.
Referring to fig. 2, fig. 2 is a detailed flowchart of step S102 of a method for processing data according to another exemplary embodiment of the present application, and optionally, in a possible implementation manner, the step S102 may include steps S1021 to S1022, specifically as follows:
S1021: Determine the analysis scheme corresponding to each industrial data according to each communication protocol identifier.
Illustratively, since each of the different types of devices employs a different communication protocol, the industrial data generated by each device may include therein a communication protocol identification for identifying the communication protocol employed by the device.
Different analysis schemes are set in advance for different communication protocols, and each communication protocol is stored in a protocol library in association with its analysis scheme. If a new communication protocol is later obtained or cracked, it can also be stored in the protocol library. An analysis scheme may be provided by a customer, or may be worked out through packet capture analysis; this is not limited herein.
The analysis schemes in the protocol library can be loaded dynamically: the communication protocol is determined from each communication protocol identifier, the analysis scheme corresponding to that protocol is looked up in the protocol library at run time, and the scheme found is taken as the analysis scheme for the corresponding industrial data.
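A minimal sketch of such a protocol library is given below, assuming the library is an in-memory registry keyed by the protocol identifier. The identifier string `"modbus"`, the record layout, and the toy frame format are assumptions made for illustration; the toy parser does not implement the real Modbus frame.

```python
from typing import Callable, Dict

# Hypothetical type: an analysis scheme turns raw bytes into parsed fields.
Parser = Callable[[bytes], dict]

# Stand-in for the protocol library: protocol identifier -> analysis scheme.
PROTOCOL_LIBRARY: Dict[str, Parser] = {}


def register_parser(protocol_id: str, parser: Parser) -> None:
    """Store an analysis scheme under its communication protocol identifier."""
    PROTOCOL_LIBRARY[protocol_id] = parser


def find_parser(protocol_id: str) -> Parser:
    """Look up the analysis scheme for a protocol identifier at run time."""
    try:
        return PROTOCOL_LIBRARY[protocol_id]
    except KeyError:
        raise LookupError(f"no analysis scheme registered for {protocol_id!r}") from None


def parse_toy_frame(payload: bytes) -> dict:
    # Toy layout (an assumption): 1 byte unit id, 1 byte function code, then data.
    return {"unit": payload[0], "function": payload[1], "data": payload[2:]}


# A newly obtained protocol can be added later with another register_parser call.
register_parser("modbus", parse_toy_frame)

# Each piece of industrial data carries its communication protocol identifier.
record = {"protocol_id": "modbus", "payload": bytes([0x01, 0x03, 0x00, 0x10])}
parsed = find_parser(record["protocol_id"])(record["payload"])
```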
S1022: and analyzing each industrial data according to each analysis scheme to obtain a plurality of analysis data.
Illustratively, the analysis scheme includes a decoding mode, a decryption mode, a mode of eliminating redundant data, and the like. Garbled data in the industrial data is decoded into data conforming to a preset data format according to the decoding mode, encrypted data in the industrial data is decrypted according to the decryption mode, and redundant data in the industrial data is removed according to the mode of eliminating redundant data, so that a plurality of analysis data are obtained.
In the embodiment, each industrial data is analyzed according to different analysis schemes, and the industrial data is analyzed in a targeted manner, so that the analyzed data obtained by analysis is more accurate, and the method is favorable for accurately describing the operation condition of the equipment based on the fusion result obtained by the analyzed data.
Optionally, when each industrial data is analyzed according to each analysis scheme, the analysis may be performed in a cluster manner. The method includes that different analysis schemes and corresponding industrial data to be analyzed are distributed to a plurality of mutually independent computers in a cluster, and each computer analyzes the industrial data distributed to the computer. The method adopts a design idea of light weight, rapidness and convenient iteration, thereby not only improving the analysis speed, but also being convenient for rapidly expanding the capacity to deal with a large amount of data in a short time.
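The idea of distributing parsing work across independent workers can be sketched on a single machine. This is only an approximation under stated assumptions: a thread pool stands in for the cluster of independent computers, and `parse_record` is a hypothetical placeholder for a real parsing scheme.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_record(record: dict) -> dict:
    # Hypothetical parsing step: decode the payload as UTF-8 text.
    return {"device": record["device"], "text": record["payload"].decode("utf-8")}


def parse_in_parallel(records: list, max_workers: int = 4) -> list:
    # Each worker parses the records assigned to it, independently of the others.
    # Executor.map returns results in the same order as the input.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(parse_record, records))


records = [
    {"device": f"dev-{i}", "payload": f"value={i}".encode("utf-8")}
    for i in range(8)
]
results = parse_in_parallel(records)
```

Because the workers share no state, capacity can be increased by raising the number of workers, which mirrors the scale-out property described above.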
Optionally, in a possible implementation, S1022 may include S10221 to S10222, as follows:
S10221: Parse each industrial data into analyzable data according to each parsing scheme.
The analyzable data refers to data which conforms to a preset data format and is decrypted.
The parsing scheme includes a decoding mode, a decryption mode, a redundant-data elimination mode, and the like. Garbled data in the industrial data is decoded into data conforming to the preset data format according to the decoding mode, and encrypted data in the industrial data is decrypted according to the decryption mode; the data obtained is the analyzable data.
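A minimal sketch of S10221 follows. It assumes, purely for illustration, that the raw industrial data is Base64 text, that the decryption mode is a repeating-key XOR (a toy placeholder, not a real cipher and not the application's method), and that the preset data format is JSON. The names `to_analyzable` and `xor_bytes` are hypothetical.

```python
import base64
import json


def xor_bytes(data: bytes, key: bytes) -> bytes:
    # Toy placeholder for a decryption mode: repeating-key XOR.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def to_analyzable(raw: str, key: bytes) -> dict:
    encrypted = base64.b64decode(raw)           # decoding mode (assumed: Base64)
    plaintext = xor_bytes(encrypted, key)       # decryption mode (assumed: XOR)
    return json.loads(plaintext.decode("utf-8"))  # preset data format (assumed: JSON)


# Build a sample raw record by applying the inverse steps.
key = b"k3y"
original = {"device": "plc-1", "speed": 1200}
raw = base64.b64encode(
    xor_bytes(json.dumps(original).encode("utf-8"), key)
).decode("ascii")

analyzable = to_analyzable(raw, key)
```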
Optionally, in a possible implementation, after each industrial data is parsed into analyzable data, the function of the analyzable data may be further analyzed and labeled.
S10222: Filter the redundant data out of each analyzable data to obtain a plurality of parsed data.
The redundant data may include heartbeat packet data, time calibration packet data, duplicate data, and the like. The redundant data in the analyzable data is removed according to the redundant-data elimination mode to obtain the plurality of parsed data. For example, the heartbeat packet data and time calibration packet data in the analyzable data are deleted, one copy of each group of duplicate data is retained and the other copies are deleted, and the data that finally remains is the parsed data.
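The filtering in S10222 can be sketched as follows. The record layout (`device`, `type`, `value`) and the type labels `heartbeat` and `time_check` are assumptions made for this example; two records are treated as duplicates when all three fields are equal.

```python
def filter_redundant(records: list) -> list:
    seen = set()
    kept = []
    for rec in records:
        # Drop heartbeat packets and time calibration packets outright.
        if rec["type"] in ("heartbeat", "time_check"):
            continue
        # Keep the first copy of each duplicate group and drop the rest.
        key = (rec["device"], rec["type"], rec.get("value"))
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept


records = [
    {"device": "plc-1", "type": "heartbeat"},
    {"device": "plc-1", "type": "reading", "value": 42},
    {"device": "plc-1", "type": "reading", "value": 42},
    {"device": "plc-1", "type": "time_check", "value": 1660867200},
    {"device": "plc-2", "type": "reading", "value": 42},
]
parsed = filter_redundant(records)
```

In this sample the result keeps one reading from `plc-1` and one from `plc-2`; the same value from a different device is not considered a duplicate.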
Optionally, the redundant data in each analyzable data may be filtered out on the basis of historical data and empirical analysis to obtain the plurality of parsed data.
In this embodiment, each industrial data is parsed according to its own parsing scheme and the redundant data is filtered out, so the parsed data finally obtained is more accurate, and the fusion result subsequently obtained from the parsed data can accurately describe the operating condition of the devices. Removing the interference of redundant data also indirectly speeds up the subsequent fusion processing.
Referring to fig. 3, fig. 3 is a detailed flowchart of step S104 of a method for processing data according to another exemplary embodiment of the present application. Optionally, in a possible implementation, S104 may include S1041 to S1042, as follows:
S1041: Search for complementary data in the plurality of parsed data.
S1042: Combine the complementary data to obtain a fusion result.
Different association relationships may exist among the plurality of parsed data, such as complementary, redundancy-type, and collaborative relationships. When the relationship among the plurality of parsed data is complementary, the plurality of parsed data includes complementary data. Complementary data refers to different data belonging to the same scene; combining the different data of one scene yields more complete global information about that scene. The same scene refers to the scene in which a device completes a certain event.
Complementary data is searched for in the plurality of parsed data; it is understood that complementary data comprises at least two sets of data. The complementary data is then combined to obtain the fusion result.
In this embodiment, complementary data is searched for in the plurality of parsed data and combined, and the resulting fusion result reflects global information more completely, so that the operating state of the devices can be reflected accurately.
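Steps S1041 and S1042 can be sketched as a group-and-merge operation. The sketch assumes that each parsed data item carries a `scene` label and a `fields` dictionary; both names, and the sample scenes, are hypothetical. A scene with only one item is skipped, since complementary data comprises at least two sets of data.

```python
from collections import defaultdict


def fuse_complementary(parsed: list) -> dict:
    # S1041: find data that belongs to the same scene.
    by_scene = defaultdict(list)
    for item in parsed:
        by_scene[item["scene"]].append(item)

    # S1042: combine the complementary data of each scene.
    fused = {}
    for scene, items in by_scene.items():
        if len(items) < 2:
            continue  # a single item has nothing to complement it
        merged = {}
        for item in items:
            # On a key conflict the later item wins (a simplifying assumption).
            merged.update(item["fields"])
        fused[scene] = merged
    return fused


parsed = [
    {"scene": "batch-7", "fields": {"input_command": "start"}},
    {"scene": "batch-7", "fields": {"log_level": "info"}},
    {"scene": "batch-7", "fields": {"output_state": "running"}},
    {"scene": "batch-8", "fields": {"input_command": "stop"}},
]
fused = fuse_complementary(parsed)
```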
Optionally, when the association relationship among the plurality of parsed data is a redundancy-type relationship, the plurality of parsed data includes redundancy-type data. It is worth noting that redundancy-type data is not the same as redundant data: redundant data refers to completely identical data, whereas redundancy-type data means that two or more sources of input data provide information about the same target, or that two or more parsed data represent the same target. Fusing redundancy-type data can enhance reliability.
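One possible way to fuse redundancy-type data is to take the median of the values that different sources report for the same target, so that a single faulty source does not dominate. The median is a technique chosen here for illustration; the application does not specify which algorithm is used, and the field names `target`, `source`, and `value` are assumptions.

```python
from collections import defaultdict
from statistics import median


def fuse_redundancy_type(readings: list) -> dict:
    # Group the values that several sources report for the same target.
    by_target = defaultdict(list)
    for reading in readings:
        by_target[reading["target"]].append(reading["value"])
    # The median tolerates a minority of outlying sources.
    return {target: median(values) for target, values in by_target.items()}


readings = [
    {"target": "boiler_temp", "source": "sensor-a", "value": 70.1},
    {"target": "boiler_temp", "source": "sensor-b", "value": 70.3},
    {"target": "boiler_temp", "source": "sensor-c", "value": 95.0},  # outlier
]
fused = fuse_redundancy_type(readings)
```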
Optionally, when the relationship among the plurality of parsed data is a collaborative relationship, the plurality of parsed data includes collaborative data, and multiple collaborative data may be combined to express more complex information than any single data can.
Optionally, after data fusion, the fusion algorithm may be fine-tuned on the basis of analysis of existing real data, so that a more accurate, more complete, and more reliable post-fusion data report can be provided. This shortens the time related personnel spend on comparison and analysis and greatly facilitates later network security work, honeypot deployment, competition learning, and data collection and analysis.
Referring to fig. 4, fig. 4 is a schematic flow chart of a method for processing data according to still another exemplary embodiment of the present application. The method shown in fig. 4 may include S201 to S205. It should be noted that S201 to S204 in this embodiment are identical to S101 to S104 in the embodiment corresponding to fig. 1; refer to the description of S101 to S104 there, which is not repeated here. S205 is as follows:
S205: Construct honeypot data according to the fusion result, where the honeypot data is used for honeypot deployment.
Honeypot technology is essentially a technique for deceiving attackers. Hosts, network services, or information are set up as decoys to induce attackers to attack them, so that the attack behavior can be captured and analyzed, the tools and methods used by the attackers can be learned, and the attack intentions and motivations can be inferred. This lets defenders clearly understand the security threats they face and strengthen the security protection of the actual system through technical and management means.
Illustratively, data matching the honeypot type is selected from the fusion result according to the type of honeypot, and the selected data is used as the honeypot data. With the selected honeypot data, a device can be simulated more realistically and the honeypot can be deployed. Honeypot types may include research honeypots, production honeypots, high-interaction honeypots, low-interaction honeypots, and the like.
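Selecting honeypot data by honeypot type might look like the sketch below. The mapping from honeypot type to the fields it needs is entirely an assumption made for illustration (the application does not define what "matching" means), as are the field names and the sample device.

```python
# Assumption: each honeypot type needs a different subset of the fused fields.
FIELDS_BY_HONEYPOT_TYPE = {
    "low_interaction": {"vendor", "model", "open_ports"},
    "high_interaction": {"vendor", "model", "open_ports", "registers", "logs"},
}


def build_honeypot_data(fusion_result: dict, honeypot_type: str) -> dict:
    wanted = FIELDS_BY_HONEYPOT_TYPE[honeypot_type]
    # Keep, for every device, only the fields that match the honeypot type.
    return {
        device: {name: value for name, value in fields.items() if name in wanted}
        for device, fields in fusion_result.items()
    }


fusion_result = {
    "plc-1": {
        "vendor": "ExampleCo",
        "model": "X100",
        "open_ports": [102, 502],
        "registers": {"40001": 17},
        "logs": ["boot ok"],
    }
}
low = build_honeypot_data(fusion_result, "low_interaction")
high = build_honeypot_data(fusion_result, "high_interaction")
```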
In this embodiment, honeypot data is constructed according to the fusion result. Such data simulates the devices more realistically and in turn attracts real data from attackers. On the one hand, attacks on the real devices are blocked; on the other hand, research samples are obtained, which makes it easier to study attackers' methods and thus better protect the real devices.
Referring to fig. 5, fig. 5 is a schematic flow chart of a method for processing data according to another exemplary embodiment of the present application. The method shown in fig. 5 may include S301 to S305. It should be noted that S301 to S304 in this embodiment are identical to S101 to S104 in the embodiment corresponding to fig. 1; refer to the description of S101 to S104 there, which is not repeated here. S305 is as follows:
S305: Back up each industrial data and the fusion result to obtain backup data.
Illustratively, each industrial data and the fusion result may be backed up and stored. For example, each industrial data and the fusion result are copied and stored in a database, and the copies stored in the database are the backup data.
Optionally, each industrial data may also be backed up right after the industrial data generated by each device in the industrial environment is collected.
In this embodiment, each industrial data and the fusion result are backed up to obtain backup data. On the one hand, the backup data can be used to produce simulation data: when realistic simulated data is needed in certain scenarios, the backup data can be retrieved from the database and used for the simulation. On the other hand, it allows each industrial data and the fusion result to be replayed and retrieved when archived records are reviewed later.
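The backup and later retrieval in S305 can be sketched with an in-memory SQLite database. The table name `backup`, its two columns, and the JSON serialization are assumptions for this example; a real deployment would choose its own database and schema.

```python
import json
import sqlite3


def backup(conn: sqlite3.Connection, industrial_data: list, fusion_result: dict) -> None:
    conn.execute("CREATE TABLE IF NOT EXISTS backup (kind TEXT, body TEXT)")
    # Copy every industrial data record and the fusion result into the database.
    rows = [("industrial", json.dumps(item)) for item in industrial_data]
    rows.append(("fusion", json.dumps(fusion_result)))
    conn.executemany("INSERT INTO backup (kind, body) VALUES (?, ?)", rows)
    conn.commit()


def restore(conn: sqlite3.Connection, kind: str) -> list:
    # Retrieve the backup data in insertion order, e.g. for replay or simulation.
    cursor = conn.execute(
        "SELECT body FROM backup WHERE kind = ? ORDER BY rowid", (kind,)
    )
    return [json.loads(body) for (body,) in cursor]


conn = sqlite3.connect(":memory:")
industrial_data = [{"device": "plc-1", "value": 42}, {"device": "plc-2", "value": 7}]
fusion_result = {"plc-1": "normal", "plc-2": "normal"}
backup(conn, industrial_data, fusion_result)
replayed = restore(conn, "industrial")
```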
Fig. 6 is a schematic diagram of a system for processing data according to an exemplary embodiment of the present application, and as shown in fig. 6, the system includes an acquisition module, a parsing module, and a fusion module.
The acquisition module is used for acquiring industrial data generated by each device in an industrial environment, where the communication protocols adopted by the devices are different. The acquisition module transmits each acquired industrial data to the parsing module.
The parsing module is used for parsing each industrial data to obtain a plurality of parsed data. The parsing module transmits the plurality of parsed data to the fusion module.
The fusion module is used for determining a fusion algorithm corresponding to the plurality of parsed data according to a preset fusion purpose, where the fusion purpose includes equipment early warning and topological graph construction; and for performing fusion processing on the plurality of parsed data according to the fusion algorithm to obtain a fusion result, where the fusion result is used for describing the operation condition of each device.
Optionally, each industrial data includes a communication protocol identifier, and the parsing module is specifically configured to: determine a parsing scheme corresponding to each industrial data according to each communication protocol identifier; and parse each industrial data according to each parsing scheme to obtain the plurality of parsed data.
Optionally, the parsing module is further configured to: parse each industrial data into analyzable data according to each parsing scheme; and filter the redundant data out of each analyzable data to obtain the plurality of parsed data.
Optionally, the fusion module is specifically configured to: search for complementary data in the plurality of parsed data; and combine the complementary data to obtain the fusion result.
Optionally, the industrial data includes data entered into the device, log data of the device, and data output by the device.
Optionally, the system further includes a building module, configured to build honeypot data according to the fusion result, where the honeypot data is used for honeypot deployment.
Optionally, the system further includes a storage module, configured to back up each industrial data and the fusion result to obtain backup data, where the backup data is used to produce simulation data.
Illustratively, the storage module is mainly used for storing all the collected industrial data and the fusion result obtained by fusion processing. Since the data volume may be very large, corresponding clusters, cache areas, and backup areas are provided for processing. For example, the fusion processing of the plurality of parsed data can be performed in the cache area, and the backup of the industrial data and the fusion result can be performed in the backup area.
It should be noted that the acquisition module, the parsing module, the fusion module, the building module, and the storage module included in the system may be physical modules or virtual modules, and may all be used to implement the steps in the above method embodiments. This may be set according to the actual situation and is not limited here.
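The cooperation of the acquisition, parsing, and fusion modules, including the choice of fusion algorithm by fusion purpose, can be sketched as follows. Every class, function, field name, and the temperature limit are hypothetical; the two fusion functions are simple stand-ins for whatever algorithms an implementation would actually select for equipment early warning and topological graph construction.

```python
from collections import defaultdict


def fuse_for_early_warning(parsed: list, limit: float = 80.0) -> dict:
    # Stand-in algorithm: flag devices whose temperature exceeds a limit.
    return {p["device"]: p["temperature"] > limit for p in parsed if "temperature" in p}


def fuse_for_topology(parsed: list) -> dict:
    # Stand-in algorithm: build an undirected adjacency list from peer links.
    graph = defaultdict(set)
    for p in parsed:
        if "peer" in p:
            graph[p["device"]].add(p["peer"])
            graph[p["peer"]].add(p["device"])
    return {node: sorted(peers) for node, peers in graph.items()}


# The preset fusion purpose determines the fusion algorithm.
FUSION_ALGORITHMS = {
    "early_warning": fuse_for_early_warning,
    "topology": fuse_for_topology,
}


class AcquisitionModule:
    def __init__(self, source: list) -> None:
        self._source = source

    def acquire(self) -> list:
        return list(self._source)


class ParsingModule:
    def parse(self, industrial_data: list) -> list:
        # Placeholder parsing: keep only records that identify their device.
        return [d for d in industrial_data if "device" in d]


class FusionModule:
    def __init__(self, purpose: str) -> None:
        self._algorithm = FUSION_ALGORITHMS[purpose]

    def fuse(self, parsed: list):
        return self._algorithm(parsed)


def run_pipeline(source: list, purpose: str):
    industrial_data = AcquisitionModule(source).acquire()
    parsed = ParsingModule().parse(industrial_data)
    return FusionModule(purpose).fuse(parsed)


source = [
    {"device": "plc-1", "temperature": 91.5, "peer": "hmi-1"},
    {"device": "plc-2", "temperature": 40.0, "peer": "hmi-1"},
    {"note": "record without a device field"},
]
warnings = run_pipeline(source, "early_warning")
topology = run_pipeline(source, "topology")
```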
The embodiments of the present application further provide a computer storage medium, where the computer storage medium may be nonvolatile or volatile, and the computer storage medium stores a computer program, and the computer program, when executed by a processor, implements the steps in the above method embodiments.
The present application also provides a computer program product for causing a device to perform the steps of the above-described method embodiments when the computer program product is run on the device.
An embodiment of the present application further provides a chip or an integrated circuit, where the chip or the integrated circuit includes: a processor for calling and running the computer program from the memory so that the device on which the chip or integrated circuit is installed performs the steps of the above-mentioned method embodiments.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules, so as to perform all or part of the functions described above. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the description of each embodiment has its own emphasis, and reference may be made to the related description of other embodiments for parts that are not described or recited in any embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included within the scope of the present application.

Claims (10)

1. A method of processing data, the method comprising:
acquiring industrial data generated by each device in an industrial environment, wherein the communication protocols adopted by the devices are different;
parsing each industrial data to obtain a plurality of parsed data;
determining a fusion algorithm corresponding to the plurality of parsed data according to a preset fusion purpose, wherein the fusion purpose comprises equipment early warning and topological graph construction; and
performing fusion processing on the plurality of parsed data according to the fusion algorithm to obtain a fusion result, wherein the fusion result is used for describing the operation condition of each device.
2. The method of claim 1, wherein each industrial data includes a communication protocol identifier, and the parsing each industrial data to obtain a plurality of parsed data comprises:
determining a parsing scheme corresponding to each industrial data according to each communication protocol identifier; and
parsing each industrial data according to each parsing scheme to obtain the plurality of parsed data.
3. The method of claim 2, wherein the parsing each industrial data according to each parsing scheme to obtain the plurality of parsed data comprises:
parsing each industrial data into analyzable data according to each parsing scheme; and
filtering redundant data out of each analyzable data to obtain the plurality of parsed data.
4. The method according to claim 1, wherein the performing fusion processing on the plurality of parsed data according to the fusion algorithm to obtain a fusion result comprises:
searching for complementary data in the plurality of parsed data; and
combining the complementary data to obtain the fusion result.
5. The method of claim 1, wherein the industrial data comprises data entered into the device, log data of the device, and data output by the device.
6. The method according to any one of claims 1 to 5, wherein after the performing fusion processing on the plurality of parsed data according to the fusion algorithm to obtain a fusion result, the method further comprises:
constructing honeypot data according to the fusion result, wherein the honeypot data is used for honeypot deployment.
7. The method of any of claims 1 to 5, further comprising:
backing up each industrial data and the fusion result to obtain backup data, wherein the backup data is used to produce simulation data.
8. A system for processing data, the system comprising:
an acquisition module, configured to acquire industrial data generated by each device in an industrial environment, wherein the communication protocols adopted by the devices are different;
a parsing module, configured to parse each industrial data to obtain a plurality of parsed data; and
a fusion module, configured to determine a fusion algorithm corresponding to the plurality of parsed data according to a preset fusion purpose, wherein the fusion purpose comprises equipment early warning and topological graph construction, and to perform fusion processing on the plurality of parsed data according to the fusion algorithm to obtain a fusion result, wherein the fusion result is used for describing the operation condition of each device.
9. The system of claim 8, wherein the system further comprises:
a storage module, configured to back up each industrial data and the fusion result to obtain backup data, wherein the backup data is used to produce simulation data.
10. A computer-readable storage medium, in which a computer program is stored which, when executed by a processor, carries out the method according to any one of claims 1 to 7.
CN202211001422.1A 2022-08-19 2022-08-19 Method, system and storage medium for processing data Pending CN115484326A (en)

Publications (1)

Publication Number Publication Date
CN115484326A (en) 2022-12-16

Family

ID=84420848

Country Status (1)

Country Link
CN (1) CN115484326A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116248778A (en) * 2023-05-15 2023-06-09 珠海迈科智能科技股份有限公司 Data fusion transmission method and system in multi-protocol environment
CN116248778B (en) * 2023-05-15 2023-08-11 珠海迈科智能科技股份有限公司 Data fusion transmission method and system in multi-protocol environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination