CN115941357A - Flow log detection method and device based on industrial safety and electronic equipment - Google Patents

Flow log detection method and device based on industrial safety and electronic equipment

Info

Publication number
CN115941357A
Authority
CN
China
Prior art keywords
identification
vector
sample
target
identifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310023746.3A
Other languages
Chinese (zh)
Other versions
CN115941357B (en
Inventor
姜双林
李小龙
赵时晴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Andi Technology Co ltd
Original Assignee
Beijing Andi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Andi Technology Co ltd
Priority to CN202310023746.3A
Publication of CN115941357A
Application granted
Publication of CN115941357B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02: Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Data Exchanges In Wide-Area Networks

Abstract

Embodiments of the present disclosure disclose an industrial-security-based traffic log detection method and apparatus, and an electronic device. One embodiment of the method comprises: acquiring a traffic log from a target industrial network device, the traffic log including a target device identifier; vectorizing the target device identifier to generate a target device identifier vector; inputting the target device identifier vector into a pre-trained false device identifier recognition model to obtain a false device identifier recognition result; in response to the false device identifier recognition result indicating that the target device identifier is a real device identifier, performing data splitting on the traffic log according to a data splitting format to generate a split traffic data set; clustering the split traffic data set according to the analysis time and the data type corresponding to the split traffic data in the set, to generate a set of split traffic data groups; and sending the set of split traffic data groups to a data detection terminal according to a preset data sending format. This embodiment shortens the detection time.

Description

Flow log detection method and device based on industrial safety and electronic equipment
Technical Field
Embodiments of the present disclosure relate to the field of the industrial internet, and in particular to an industrial-security-based traffic log detection method and apparatus, and an electronic device.
Background
With the rapid iteration of industrial control networks, the informatization level of industrial control systems keeps rising, and their information security problems are becoming increasingly prominent. Currently, a traffic log in an industrial device (e.g., an industrial switch) is generally detected as follows: the traffic log in the industrial device is sent to a detection device for detection.
However, this approach generally has the following technical problems:
1. The traffic log is sent directly to the detection device for detection, so the detection efficiency is low and the detection time is long;
2. The device identifier in the traffic log is not checked, so the detection accuracy is low and detecting the traffic log takes a long time.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Some embodiments of the present disclosure propose an industrial-security-based traffic log detection method, apparatus, electronic device, and computer-readable medium to solve one or more of the technical problems mentioned in the background section above.
In a first aspect, some embodiments of the present disclosure provide an industrial-security-based traffic log detection method, the method including: obtaining a traffic log from a target industrial network device, wherein the traffic log includes a target device identifier; vectorizing the target device identifier to generate a target device identifier vector; inputting the target device identifier vector into a pre-trained false device identifier recognition model to obtain a false device identifier recognition result corresponding to the target device identifier vector; in response to determining that the false device identifier recognition result indicates that the target device identifier is a real device identifier, performing data splitting on the traffic log according to a preset data splitting format to generate a split traffic data set; clustering the split traffic data set according to the analysis time and the data type corresponding to the split traffic data in the split traffic data set, to generate a set of split traffic data groups; and sending the set of split traffic data groups to an associated data detection terminal according to a preset data sending format.
In a second aspect, some embodiments of the present disclosure provide an industrial-security-based traffic log detection apparatus, including: an obtaining unit configured to obtain a traffic log from a target industrial network device, wherein the traffic log includes a target device identifier; a vectorization unit configured to vectorize the target device identifier to generate a target device identifier vector; an input unit configured to input the target device identifier vector into a pre-trained false device identifier recognition model to obtain a false device identifier recognition result corresponding to the target device identifier vector; a splitting unit configured to, in response to determining that the false device identifier recognition result indicates that the target device identifier is a real device identifier, perform data splitting on the traffic log according to a preset data splitting format to generate a split traffic data set; a clustering unit configured to cluster the split traffic data set according to the analysis time and the data type corresponding to the split traffic data in the split traffic data set, to generate a set of split traffic data groups; and a sending unit configured to send the set of split traffic data groups to an associated data detection terminal according to a preset data sending format.
In a third aspect, some embodiments of the present disclosure provide an electronic device, comprising: one or more processors; a storage device having one or more programs stored thereon, which when executed by one or more processors, cause the one or more processors to implement the method described in any of the implementations of the first aspect.
In a fourth aspect, some embodiments of the present disclosure provide a computer readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method described in any of the implementations of the first aspect.
The above embodiments of the present disclosure have the following advantages: the industrial-security-based traffic log detection method of some embodiments of the present disclosure improves the detection efficiency for traffic logs and shortens the detection time. Specifically, the detection time is long because the traffic log is sent directly to the detection device for detection, which makes detection inefficient. Based on this, the method of some embodiments of the present disclosure first obtains a traffic log from a target industrial network device, wherein the traffic log includes a target device identifier. This facilitates detection of the traffic log. Second, the target device identifier is vectorized to generate a target device identifier vector. This makes it convenient to detect whether the device identifier in the traffic log is a false device identifier. Next, the target device identifier vector is input into a pre-trained false device identifier recognition model to obtain a false device identifier recognition result corresponding to the target device identifier vector. In this way, whether the target device identifier included in the traffic log is a false device identifier can be detected, so the traffic log undergoes a preliminary check and abnormal traffic logs are exposed. Then, in response to determining that the false device identifier recognition result indicates that the target device identifier is a real device identifier, data splitting is performed on the traffic log according to a preset data splitting format to generate a split traffic data set. Thus, once the target device identifier is determined to be a real device identifier, the data in the traffic log can be split to facilitate subsequent data detection. Then, the split traffic data set is clustered according to the analysis time and the data type corresponding to the split traffic data in the set, to generate a set of split traffic data groups. Different types of data in different time periods can thereby be clustered, which makes it convenient to detect each type of data subsequently. Finally, the set of split traffic data groups is sent to an associated data detection terminal according to a preset data sending format. Because abnormal traffic logs are preliminarily screened out through device identifier detection, the detection load on the data detection terminal is reduced. In addition, because the data included in the traffic logs is parsed and clustered, detection by the data detection terminal is further facilitated, the detection efficiency for traffic logs is improved, and the detection time is shortened.
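As a non-authoritative illustration of the clustering step summarized above, the following minimal Python sketch groups split traffic data by analysis time and data type. The disclosure does not specify the clustering algorithm at this point; simple bucketing by a fixed time window is swapped in here as an assumption, and the names `group_split_data`, `analysis_time`, `data_type`, and `window_seconds` are hypothetical.

```python
from collections import defaultdict


def group_split_data(items, window_seconds=60):
    """Bucket split traffic data by (time window, data type).

    Assumption: each item is a dict with 'analysis_time' (seconds) and
    'data_type' keys; fixed-window bucketing stands in for the clustering.
    """
    groups = defaultdict(list)
    for item in items:
        window = int(item["analysis_time"] // window_seconds)
        groups[(window, item["data_type"])].append(item)
    return dict(groups)


split_data = [
    {"analysis_time": 5.0, "data_type": "source_ip", "value": "10.0.0.5"},
    {"analysis_time": 42.0, "data_type": "source_ip", "value": "10.0.0.7"},
    {"analysis_time": 65.0, "data_type": "qos", "value": 3},
]
groups = group_split_data(split_data)
```

Each resulting group holds data of one type from one time window, which is the property the summary relies on when it says that different types of data can be detected separately.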
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a flow diagram of some embodiments of an industrial safety-based flow log detection method according to the present disclosure;
FIG. 2 is a schematic block diagram of some embodiments of an industrial safety-based flow log detection apparatus according to the present disclosure;
FIG. 3 is a schematic block diagram of an electronic device suitable for use in implementing some embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be noted that, for convenience of description, only the portions relevant to the invention are shown in the drawings. The embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It should be noted that the modifiers "a", "an", and "the" in this disclosure are illustrative rather than limiting, and those skilled in the art will understand that they should be read as "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 shows a flow 100 of some embodiments of an industrial-security-based traffic log detection method according to the present disclosure. The industrial-security-based traffic log detection method comprises the following steps:
Step 101, obtaining a traffic log from a target industrial network device.
In some embodiments, an execution subject (e.g., a server) of the industrial-security-based traffic log detection method may obtain the traffic log from the target industrial network device through a wired or wireless connection. The traffic log includes a target device identifier. Here, the target device identifier may refer to the device identifier of the industrial network device carried in the transmitted traffic data (such as a URL or a preset identifier that uniquely represents the industrial network device). The target industrial network device may refer to a computing device (e.g., an industrial personal computer) operating in an industrial control network environment. The traffic log may include a timestamp, source IP, destination IP, source port, destination port, inbound and outbound traffic, quality of service (QoS), and the like. Here, the device identifier carried in the data can be used to preliminarily detect whether the traffic data has been tampered with or whether the data sending end is abnormal.
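The fields listed above can be pictured with a small Python sketch. It is only an illustration: the disclosure does not define a record layout, so the `key=value;...` wire format, the class name `TrafficLog`, and every field name below are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficLog:
    device_id: str   # target device identifier carried in the traffic data
    timestamp: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    bytes_in: int    # inbound traffic
    bytes_out: int   # outbound traffic
    qos: int         # quality of service level


def parse_traffic_log(line: str) -> TrafficLog:
    """Parse one 'key=value;key=value' record (hypothetical wire format)."""
    fields = dict(part.split("=", 1) for part in line.strip().split(";") if part)
    return TrafficLog(
        device_id=fields["device_id"],
        timestamp=float(fields["ts"]),
        src_ip=fields["src_ip"],
        dst_ip=fields["dst_ip"],
        src_port=int(fields["src_port"]),
        dst_port=int(fields["dst_port"]),
        bytes_in=int(fields["bytes_in"]),
        bytes_out=int(fields["bytes_out"]),
        qos=int(fields["qos"]),
    )


SAMPLE = ("device_id=plc-0017;ts=1672531200.0;src_ip=10.0.0.5;dst_ip=10.0.0.9;"
          "src_port=502;dst_port=40122;bytes_in=1280;bytes_out=640;qos=3")
log = parse_traffic_log(SAMPLE)
```

Only `device_id` is used by the following steps; the remaining fields are what would later be split and clustered.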
Step 102, vectorizing the target device identifier to generate a target device identifier vector.
In some embodiments, the execution subject may vectorize the target device identifier to generate a target device identifier vector. The target device identifier may be vectorized by a pre-trained vector coding model to generate the target device identifier vector. The vector coding model may refer to a pre-trained neural network model that takes the target device identifier as input and outputs the target device identifier vector. For example, the vector coding model may be a BERT model.
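To make the input/output contract of the vectorization step concrete, here is a minimal Python sketch. It does not use BERT: a deterministic hashed character-trigram encoding is swapped in as a stand-in so the example stays self-contained, and the function name `vectorize_identifier` and the dimension of 16 are assumptions.

```python
import hashlib
import math


def vectorize_identifier(identifier: str, dim: int = 16) -> list[float]:
    """Map a device identifier string to a fixed-length unit vector.

    Stand-in for the pre-trained vector coding model: character trigrams
    are hashed into `dim` buckets and the counts are L2-normalized.
    """
    padded = f"^{identifier}$"          # mark the start and end of the string
    vec = [0.0] * dim
    for i in range(len(padded) - 2):
        gram = padded[i:i + 3]
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


target_vector = vectorize_identifier("plc-0017")
```

Whatever encoder is used, the property that matters for the later steps is the same as here: every identifier maps to a vector of one fixed dimension.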
Optionally, a device identifier sample set is obtained.
In some embodiments, the execution subject may obtain the device identifier sample set from a terminal device through a wired or wireless connection. The device identifier sample set includes a positive device identifier sample group and a negative device identifier sample group. Here, the positive device identifier samples in the positive device identifier sample group may be samples of real device identifiers. The negative device identifier samples in the negative device identifier sample group may be samples of false device identifiers. For example, a negative device identifier sample may be obtained by adding a false device identifier to a real device identifier and taking the resulting identifier as the negative device identifier sample. Alternatively, a negative device identifier sample may be obtained by applying noise to a positive device identifier sample and taking the resulting noise-added device identifier sample as the negative device identifier sample.
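The noise-adding route to negative samples can be sketched as follows. The disclosure does not say what kind of noise is applied; random character substitution is assumed here purely for illustration, and `make_negative_sample` and `n_edits` are hypothetical names.

```python
import random
import string


def make_negative_sample(real_id: str, rng: random.Random, n_edits: int = 2) -> str:
    """Derive a false identifier by substituting characters of a real one.

    Assumption: 'noise' means replacing up to `n_edits` characters with
    different characters drawn from lowercase letters and digits.
    """
    chars = list(real_id)
    alphabet = string.ascii_lowercase + string.digits
    positions = rng.sample(range(len(chars)), k=min(n_edits, len(chars)))
    for pos in positions:
        replacement = rng.choice(alphabet)
        while replacement == chars[pos]:   # make sure the character really changes
            replacement = rng.choice(alphabet)
        chars[pos] = replacement
    return "".join(chars)


rng = random.Random(0)                     # seeded for reproducibility
positive_sample = "plc-0017"
negative_sample = make_negative_sample(positive_sample, rng)
```

Because the negative sample stays close to a real identifier, the recognition model has to learn fine-grained differences rather than reject obviously malformed strings.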
Optionally, a positive device identifier sample is selected from the positive device identifier sample group, and a negative device identifier sample is selected from the negative device identifier sample group.
In some embodiments, the execution subject may randomly select a positive device identifier sample from the positive device identifier sample group and a negative device identifier sample from the negative device identifier sample group.
Optionally, the positive device identifier sample and the negative device identifier sample are each vectorized to generate a positive device identifier sample vector and a negative device identifier sample vector.
In some embodiments, the execution subject may vectorize the positive device identifier sample and the negative device identifier sample respectively to generate a positive device identifier sample vector and a negative device identifier sample vector. That is, the positive device identifier sample and the negative device identifier sample may be vectorized by a pre-trained vector coding model. The vector coding model may refer to a pre-trained neural network model that takes a device identifier as input and outputs a device identifier vector. For example, the vector coding model may be a BERT model.
Optionally, the negative device identifier sample vector is input into an initial sample denoising model included in an initial false device identifier recognition model to obtain a negative device identifier noise removal vector.
In some embodiments, the execution subject may input the negative device identifier sample vector into the initial sample denoising model included in the initial false device identifier recognition model to obtain a negative device identifier noise removal vector. The initial false device identifier recognition model may be an untrained false device identifier recognition model. The false device identifier recognition model may be a model for recognizing false device identifiers. For example, the false device identifier recognition model may be a multi-layer convolutional neural network (CNN). The initial sample denoising model may be an untrained sample denoising model. The sample denoising model may be a model that removes false noise information from device identifier samples. For example, the sample denoising model may be a convolutional neural network model or a recurrent neural network model. The negative device identifier noise removal vector may represent the feature information that remains after the noise features are removed from the sample features corresponding to the negative device identifier sample. Here, the initial sample denoising model includes an initial coding model and an initial decoding model. The initial coding model may be an untrained coding model. The coding model may be a model that encodes the device identifier vector to extract noise feature information. The resulting encoded vector may represent the noise information of the false device identifier. For example, the coding model may be a deep neural network (DNN) model. The initial decoding model may be an untrained decoding model. The decoding model may be a model that decodes the encoded vector to extract a device identifier sample vector that does not include noise feature information. For example, the decoding model may be a deep neural network model.
In practice, the execution subject may input the negative device identifier sample vector into the initial sample denoising model included in the initial false device identifier recognition model through the following steps to obtain the negative device identifier noise removal vector:
First, inputting the negative device identifier sample vector into the initial coding model to obtain a negative device identifier sample coding vector. The vector dimension of the negative device identifier sample vector is larger than the vector dimension of the negative device identifier sample coding vector.
Second, inputting the negative device identifier sample coding vector into the initial decoding model to obtain a negative device identifier sample decoding vector, which serves as the negative device identifier noise removal vector. The vector dimension of the negative device identifier sample coding vector is smaller than the vector dimension of the negative device identifier noise removal vector.
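The two steps above describe a bottleneck: the coding vector is narrower than both the sample vector and the noise removal vector. The following Python sketch shows only that dimensional relationship with untrained, randomly initialized single-layer models; the dimensions 16 and 4, the tanh activation, and all names are assumptions rather than details from the disclosure.

```python
import math
import random

INPUT_DIM, CODE_DIM = 16, 4               # hypothetical sizes; CODE_DIM < INPUT_DIM


def init_linear(in_dim, out_dim, rng):
    """Randomly initialize an untrained weight matrix of shape out_dim x in_dim."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, x):
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


rng = random.Random(0)
initial_coding_model = init_linear(INPUT_DIM, CODE_DIM, rng)
initial_decoding_model = init_linear(CODE_DIM, INPUT_DIM, rng)


def denoise(sample_vector):
    """Encode to a lower dimension, then decode back to the input dimension."""
    coding_vector = [math.tanh(v) for v in apply_linear(initial_coding_model, sample_vector)]
    noise_removal_vector = apply_linear(initial_decoding_model, coding_vector)
    return coding_vector, noise_removal_vector


negative_sample_vector = [0.25] * INPUT_DIM
coding_vector, noise_removal_vector = denoise(negative_sample_vector)
```

The coding vector is returned alongside the decoded vector because the classification step also consumes it.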
Optionally, device identifier classification information that characterizes whether the device identifier corresponding to the negative device identifier sample is a false device identifier is generated according to the negative device identifier noise removal vector, the negative device identifier sample vector, and an initial device identifier classification model.
In some embodiments, the execution subject may generate, according to the negative device identifier noise removal vector, the negative device identifier sample vector, and the initial device identifier classification model, device identifier classification information that characterizes whether the device identifier corresponding to the negative device identifier sample is a false device identifier. The initial false device identifier recognition model includes the initial device identifier classification model. The initial device identifier classification model may be an untrained device identifier classification model. The device identifier classification model may be a classification model for determining whether a device identifier is a false device identifier. For example, the device identifier classification model may be a convolutional neural network model. The device identifier classification information is, for example, either information indicating that the device identifier corresponding to the negative device identifier sample is a false device identifier, or information indicating that it is not a false device identifier.
In practice, the execution subject may generate the device identifier classification information according to the negative device identifier noise removal vector, the negative device identifier sample vector, and the initial device identifier classification model through the following steps:
First, splicing the negative device identifier noise removal vector, the negative device identifier sample vector, and the negative device identifier sample coding vector to obtain a negative device identifier spliced vector.
Second, inputting the negative device identifier spliced vector into the initial device identifier classification model to obtain the device identifier classification information.
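The splice-then-classify steps can be sketched in Python as below. The text suggests a convolutional neural network for the classifier; a single logistic unit is swapped in here only to keep the example short, and the vector sizes, weights, and function names are assumptions.

```python
import math


def splice(*vectors):
    """Concatenate vectors end to end into one spliced vector."""
    spliced = []
    for vec in vectors:
        spliced.extend(vec)
    return spliced


def classify(spliced_vector, weights, bias=0.0):
    """Return the probability that the identifier is false (logistic stand-in)."""
    z = sum(w * x for w, x in zip(weights, spliced_vector)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical vectors with the bottleneck shape: 16-dim, 16-dim, 4-dim.
noise_removal_vector = [0.1] * 16
sample_vector = [0.2] * 16
coding_vector = [0.3] * 4

spliced_vector = splice(noise_removal_vector, sample_vector, coding_vector)
untrained_weights = [0.0] * len(spliced_vector)   # an untrained model knows nothing yet
probability_false = classify(spliced_vector, untrained_weights)
is_false_identifier = probability_false >= 0.5
```

Splicing all three vectors lets the classifier compare the original sample with its denoised version and with the extracted noise features in a single input.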
Optionally, the initial false device identifier recognition model is trained according to the positive device identifier sample vector, the negative device identifier noise removal vector, and the device identifier classification information, so as to obtain a trained false device identifier recognition model.
In some embodiments, the execution subject may train the initial false device identifier recognition model according to the positive device identifier sample vector, the negative device identifier noise removal vector, and the device identifier classification information to obtain a trained false device identifier recognition model.
In practice, the execution subject may train the initial false device identifier recognition model to obtain the trained false device identifier recognition model through the following steps:
First, generating a first sample loss value based on the positive device identifier sample vector and the negative device identifier noise removal vector. In practice, the loss value between the positive device identifier sample vector and the negative device identifier noise removal vector may be determined as the first sample loss value through a preset loss function. The loss function may be, but is not limited to: a mean square error (MSE) loss function, a hinge loss function (SVM), a cross-entropy loss function, a 0-1 loss function, an absolute value loss function, a logarithmic loss function, a squared loss function, an exponential loss function, and the like.
Second, generating a second sample loss value based on the device identifier classification information. In practice, the execution subject may determine, through a second preset loss function, the loss value between the vector corresponding to the device identifier classification information and the negative device identifier spliced vector as the second sample loss value.
Third, generating a first sample weight corresponding to the first sample loss value and a second sample weight corresponding to the second sample loss value according to the training batch corresponding to the device identifier sample set. The execution subject may first look up the training batch corresponding to the device identifier sample set by means of a batch query. Then, the execution subject may determine the first sample weight and the second sample weight corresponding to that training batch. That is, the first sample weight and the second sample weight differ from one training batch to another. Each training batch corresponds to one first sample weight and one second sample weight. Here, every first sample weight and every second sample weight is preset.
Fourth, determining the product of the first sample weight and the first sample loss value as a first weight product.
Fifth, determining the product of the second sample weight and the second sample loss value as a second weight product.
Sixth, determining the sum of the first weight product and the second weight product as a model training loss value.
Seventh, training the initial false device identifier recognition model according to the model training loss value to obtain the trained false device identifier recognition model. That is, the execution subject may adjust the network parameters of the initial false device identifier recognition model in response to determining that the model training loss value is greater than or equal to a preset loss value. For example, the difference between the model training loss value and the preset loss value may be computed. On this basis, the error value is propagated backward from the last layer of the model using methods such as back propagation and stochastic gradient descent, so as to adjust the parameters of each layer. Of course, as needed, a network freezing (dropout) method may also be adopted, in which the network parameters of some layers are kept unchanged and are not adjusted; this is not limited in any way. The execution subject may determine the initial false device identifier recognition model as the trained false device identifier recognition model in response to determining that the model training loss value is smaller than the preset loss value.
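The arithmetic of the third to seventh steps can be sketched in Python as follows. The weight table, the preset loss threshold, and the choice of mean square error for the first loss are illustrative assumptions; the disclosure only states that the weights are preset per training batch.

```python
def mse(a, b):
    """Mean square error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical preset table: training batch -> (first weight, second weight).
BATCH_WEIGHTS = {0: (0.8, 0.2), 1: (0.5, 0.5)}
PRESET_LOSS = 0.05                         # hypothetical stopping threshold


def model_training_loss(first_loss, second_loss, batch):
    first_weight, second_weight = BATCH_WEIGHTS[batch]
    first_product = first_weight * first_loss       # fourth step
    second_product = second_weight * second_loss    # fifth step
    return first_product + second_product           # sixth step


def needs_parameter_update(loss):
    """Seventh step: keep adjusting parameters while loss >= preset value."""
    return loss >= PRESET_LOSS


first_loss = mse([1.0, 2.0], [1.0, 4.0])   # positive vector vs. noise removal vector
second_loss = 0.3                          # placeholder classification loss
loss = model_training_loss(first_loss, second_loss, batch=0)
```

With these numbers the first loss is 2.0, so the combined loss for batch 0 is 0.8 × 2.0 + 0.2 × 0.3 = 1.66, which is above the threshold and would trigger another parameter update.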
It should be noted that training the initial false device identifier recognition model may include the following training steps:
First, determining the network structure of the initial false device identifier recognition model and initializing the network parameters of the initial false device identifier recognition model.
Second, taking the device identifier sample as the input of the initial false device identifier recognition model, taking the device identifier classification information corresponding to the device identifier sample as the expected output of the initial false device identifier recognition model, and training the initial false device identifier recognition model using a deep learning method.
The content described above is an inventive point of the present disclosure and addresses the technical problem mentioned in the background, namely that "detecting the traffic log takes a long time". The main factors behind the long detection time are that the device identifier in the traffic log is not checked and that the detection accuracy is low. Addressing these factors shortens the detection time for the traffic log. To this end, first, a device identifier sample set is obtained, where the device identifier sample set includes a positive device identifier sample group and a negative device identifier sample group. The device identifier sample set can thus serve as training data for the subsequent training of the false device identifier recognition model. Next, a positive device identifier sample is selected from the positive device identifier sample group, and a negative device identifier sample is selected from the negative device identifier sample group; these two samples are used to train the model. Then, the positive device identifier sample and the negative device identifier sample are each vectorized to generate a positive device identifier sample vector and a negative device identifier sample vector, which converts the samples into a vector form that can be input into the model. Then, the negative device identifier sample vector is input into the initial sample denoising model included in the initial false device identifier recognition model to obtain a negative device identifier noise removal vector. The initial sample denoising model thus extracts feature information that is free of noise information, which facilitates the subsequent training of the initial false device identifier recognition model. Then, device identifier classification information, which represents whether the device identifier corresponding to the negative device identifier sample is a false device identifier, is generated according to the negative device identifier noise removal vector, the negative device identifier sample vector, and an initial device identifier classification model, where the initial false device identifier recognition model includes the initial device identifier classification model. The classification information generated by the initial device identifier classification model likewise supports the subsequent training. Finally, the initial false device identifier recognition model is trained according to the positive device identifier sample vector, the negative device identifier noise removal vector, and the device identifier classification information to obtain a trained false device identifier recognition model. Training on these three inputs allows the initial false device identifier recognition model to be trained more effectively and accurately, which yields a false device identifier recognition model that recognizes false device identifiers more accurately. As a result, the device identifier in the traffic log can be identified accurately, and the detection time for the traffic log is shortened.
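The data flow of one training example can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the character-code `vectorize` function, the moving-average `moving_average` denoiser, and the constant classifier passed in the usage line are hypothetical stand-ins for the vectorization processing, the initial sample denoising model, and the initial device identifier classification model, and the sample strings are invented.

```python
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def vectorize(identifier: str, length: int = 16) -> Vector:
    # Hypothetical encoding: scaled character codes, padded or truncated to a fixed length.
    codes = [min(ord(ch), 255) / 255.0 for ch in identifier[:length]]
    return codes + [0.0] * (length - len(codes))


def moving_average(vector: Vector) -> Vector:
    # Stand-in for the sample denoising model: a 3-point moving average.
    smoothed = []
    for i in range(len(vector)):
        window = vector[max(0, i - 1): i + 2]
        smoothed.append(sum(window) / len(window))
    return smoothed


def build_training_example(
    positive_group: Sequence[str],
    negative_group: Sequence[str],
    denoise: Callable[[Vector], Vector],
    classify: Callable[[Vector, Vector], int],
    rng: random.Random,
) -> Tuple[Vector, Vector, int]:
    # Select one sample from each group, then vectorize both.
    positive_vec = vectorize(rng.choice(positive_group))
    negative_vec = vectorize(rng.choice(negative_group))
    # Denoise only the negative vector, as in the description.
    denoised_vec = denoise(negative_vec)
    # Classification uses both the noise-removal vector and the original vector.
    label = classify(denoised_vec, negative_vec)
    # These three items are the inputs to the final training step.
    return positive_vec, denoised_vec, label


example = build_training_example(
    ["PLC-0001"], ["PLC-XXXX"], moving_average, lambda d, o: 1, random.Random(0)
)
```

The returned triple corresponds to the positive device identifier sample vector, the negative device identifier noise removal vector, and the device identifier classification information; how the model parameters are updated from them is left open here.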
Step 103, inputting the target device identifier vector into a pre-trained false device identifier recognition model to obtain a false device identifier recognition result corresponding to the target device identifier vector.
In some embodiments, the execution body may input the target device identifier vector into a pre-trained false device identifier recognition model to obtain a false device identifier recognition result corresponding to the target device identifier vector. The false device identifier recognition model may be a pre-trained convolutional neural network (CNN) that takes the target device identifier vector as input and outputs the false device identifier recognition result. The false device identifier recognition result may indicate that the target device identifier is either a real device identifier or a false device identifier.
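To make the input and output of step 103 concrete, the following is a minimal sketch of a one-layer 1-D convolutional forward pass in plain Python. The layer layout (convolution, ReLU, global max pooling, sigmoid), the kernel values, the weights, and the 0.5 threshold are all assumptions chosen for illustration; the source only states that the model may be a CNN.

```python
import math
from typing import List


def conv1d(x: List[float], kernel: List[float], bias: float = 0.0) -> List[float]:
    # Valid (no padding) 1-D convolution, written as cross-correlation as in most CNN libraries.
    k = len(kernel)
    return [
        sum(x[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(x) - k + 1)
    ]


def relu(values: List[float]) -> List[float]:
    return [max(0.0, v) for v in values]


def recognize(
    vector: List[float],
    kernels: List[List[float]],
    weights: List[float],
    bias: float,
    threshold: float = 0.5,
) -> str:
    # One feature per kernel: convolution, ReLU, then global max pooling.
    features = [max(relu(conv1d(vector, kernel))) for kernel in kernels]
    # Linear layer plus sigmoid gives the estimated probability of a false identifier.
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    probability_false = 1.0 / (1.0 + math.exp(-logit))
    return "false" if probability_false >= threshold else "real"


# Hypothetical, untrained parameters; a real model would learn these.
result = recognize([0.0] * 16, kernels=[[1.0, 0.0, -1.0]], weights=[1.0], bias=-1.0)
```

With these placeholder parameters the output only demonstrates the shape of the computation: a fixed-length identifier vector goes in, and one of two labels comes out.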
Step 104, in response to determining that the false device identifier recognition result indicates that the target device identifier is a real device identifier, performing data splitting processing on the traffic log according to a preset data splitting format to generate a split traffic data set.
In some embodiments, the execution body may, in response to determining that the false device identifier recognition result indicates that the target device identifier is a real device identifier, perform data splitting processing on the traffic log according to a preset data splitting format to generate a split traffic data set. Here, the preset data splitting format may be a data format set in advance. For example, the preset data splitting format may be the JSON (JavaScript Object Notation) format or a regular-expression format. In practice, the execution body may first split the data in the traffic log that conforms to the preset data splitting format according to that format. Then, each piece of data in the traffic log that does not conform to the preset data splitting format is parsed and split using a classification and regression tree algorithm and a K-nearest-neighbor algorithm. A split traffic data set is thereby obtained.
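The format-based part of the splitting step can be sketched as below. This is an illustration under assumptions: the one-entry-per-line log layout, the `key=value` regular expression, and the field names in the sample lines are hypothetical. Entries that match neither format are only collected here; the description handles them with a classification and regression tree algorithm and a K-nearest-neighbor algorithm, which this sketch does not attempt.

```python
import json
import re
from typing import Dict, Iterable, List, Tuple

# Hypothetical regular-expression format: whitespace-separated key=value pairs.
KV_PATTERN = re.compile(r"(\w+)=(\S+)")


def split_traffic_log(lines: Iterable[str]) -> Tuple[List[Dict], List[str]]:
    split_data: List[Dict] = []
    unmatched: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # First try the JSON format.
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            record = None
        if isinstance(record, dict):
            split_data.append(record)
            continue
        # Then try the regular-expression format.
        pairs = KV_PATTERN.findall(line)
        if pairs:
            split_data.append(dict(pairs))
        else:
            # Left for the fallback parsing described in the text.
            unmatched.append(line)
    return split_data, unmatched


split_data, unmatched = split_traffic_log(
    ['{"type": "modbus", "time": 1}', "type=dnp3 time=2", "???"]
)
```

Note that values parsed by the regular expression stay strings, while JSON values keep their JSON types; a real implementation would probably normalize them.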
Step 105, clustering the split traffic data set according to the parsing time and the data type corresponding to the split traffic data in the split traffic data set to generate a split traffic data group set.
In some embodiments, the execution body may cluster the split traffic data set according to the parsing time and the data type corresponding to the split traffic data in the split traffic data set to generate a split traffic data group set. That is, split traffic data that fall within the same time interval and have the same data type are grouped into one class to generate a split traffic data group, thereby obtaining the split traffic data group set. The data in each split traffic data group may then be merged and parsed, for example using a moving window.
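Grouping by time interval and data type can be sketched as a simple bucketing step. The field names `parse_time` and `data_type` and the 60-second interval are assumptions made for illustration; the source does not specify the record layout or the interval length.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_split_traffic(
    records: List[Dict], interval_seconds: int = 60
) -> Dict[Tuple[int, str], List[Dict]]:
    groups: Dict[Tuple[int, str], List[Dict]] = defaultdict(list)
    for record in records:
        # Records whose parsing times fall in the same interval share a bucket index.
        bucket = int(record["parse_time"] // interval_seconds)
        groups[(bucket, record["data_type"])].append(record)
    return dict(groups)


records = [
    {"parse_time": 5, "data_type": "modbus"},
    {"parse_time": 30, "data_type": "modbus"},
    {"parse_time": 70, "data_type": "modbus"},
    {"parse_time": 10, "data_type": "dnp3"},
]
group_set = group_split_traffic(records)
```

Each dictionary value plays the role of one split traffic data group, and the dictionary as a whole corresponds to the group set.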
Step 106, sending the split traffic data group set to an associated data detection terminal according to a preset data sending format.
In some embodiments, the execution body may send the split traffic data group set to an associated data detection terminal according to a preset data sending format. Here, the preset data sending format may be a format for sending data that is set in advance, for example the JSON format or the Syslog protocol format. The data detection terminal may be a terminal that detects the split traffic data. For example, the data detection terminal may be a server used for data detection.
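The two sending formats mentioned above can be sketched as message builders. The Syslog variant below assumes the RFC 5424 layout (`<PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG`); the source names only "Syslog protocol format", so that choice, the host name, the application name, and the facility and severity values are all assumptions. The actual network transmission is omitted.

```python
import json
from typing import Dict, List


def to_json_message(group: List[Dict]) -> str:
    # JSON sending format: the whole group as one JSON document.
    return json.dumps({"records": group}, sort_keys=True)


def to_syslog_message(
    group: List[Dict],
    hostname: str,
    timestamp: str,
    app_name: str = "trafficlog",  # hypothetical application name
    facility: int = 16,            # local0
    severity: int = 6,             # informational
) -> str:
    # PRI is facility * 8 + severity; "-" is the nil value for unused fields.
    pri = facility * 8 + severity
    payload = json.dumps(group, sort_keys=True)
    return f"<{pri}>1 {timestamp} {hostname} {app_name} - - - {payload}"


group = [{"data_type": "modbus", "parse_time": 5}]
json_message = to_json_message(group)
syslog_message = to_syslog_message(group, "gw01", "2023-01-09T00:00:00Z")
```

Carrying the group as a JSON payload inside the Syslog message is one possible design; structured-data elements would be another.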
Optionally, in response to determining that the false device identifier recognition result indicates that the target device identifier is a false device identifier, abnormal traffic log information corresponding to the target industrial network device is generated.
In some embodiments, the execution body may, in response to determining that the false device identifier recognition result indicates that the target device identifier is a false device identifier, generate abnormal traffic log information corresponding to the target industrial network device. Here, the abnormal traffic log information may indicate that all of the traffic data recorded in the traffic log is false information.
Optionally, the abnormal traffic log information is sent to an associated network device detection terminal, and an associated alarm device is controlled to perform an alarm operation.
In some embodiments, the execution body may send the abnormal traffic log information to an associated network device detection terminal and control an associated alarm device to perform an alarm operation. Here, the network device detection terminal may be a terminal that is communicatively connected to the execution body and is used to detect industrial network devices. For example, after receiving the abnormal traffic log information, the network device detection terminal may notify the relevant technician to inspect and maintain the network device. The alarm device may be a device that signals that the network device is abnormal. For example, the alarm device may be a voice prompt device, which may be controlled to emit a voice alert indicating that the network device is abnormal.
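The branch between the normal path (steps 104 to 106) and the optional abnormal path can be sketched as a small dispatcher. The result labels `"real"` and `"false"`, the callback names, and the fields of the abnormal information dictionary are hypothetical; the source does not define a data structure for the abnormal traffic log information.

```python
from typing import Callable, Dict, Optional


def handle_recognition_result(
    result: str,
    device_name: str,
    process_real_log: Callable[[], None],
    send_abnormal_info: Callable[[Dict], None],
    trigger_alarm: Callable[[], None],
) -> Optional[Dict]:
    if result == "real":
        # Real identifier: continue with splitting, clustering, and sending.
        process_real_log()
        return None
    # False identifier: report the whole traffic log as abnormal and raise the alarm.
    abnormal_info = {
        "device": device_name,
        "status": "abnormal",
        "reason": "false device identifier",
    }
    send_abnormal_info(abnormal_info)
    trigger_alarm()
    return abnormal_info


events = []
info = handle_recognition_result(
    "false",
    "plc-01",
    process_real_log=lambda: events.append("processed"),
    send_abnormal_info=lambda payload: events.append("sent"),
    trigger_alarm=lambda: events.append("alarm"),
)
```

Passing the actions as callbacks keeps the decision logic separate from how the terminal and the alarm device are actually reached.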
With further reference to fig. 2, as an implementation of the methods shown in the above figures, the present disclosure provides some embodiments of an industrial safety-based traffic log detection apparatus, which correspond to those of the method embodiments shown in fig. 1, and which can be applied in various electronic devices in particular.
As shown in fig. 2, the industrial safety-based flow log detection apparatus 200 of some embodiments includes: an obtaining unit 201, a vectorization unit 202, an input unit 203, a splitting unit 204, a clustering unit 205, and a sending unit 206. The obtaining unit 201 is configured to obtain a traffic log from a target industrial network device, where the traffic log includes a target device identifier; the vectorization unit 202 is configured to vectorize the target device identifier to generate a target device identifier vector; the input unit 203 is configured to input the target device identifier vector into a pre-trained false device identifier recognition model to obtain a false device identifier recognition result corresponding to the target device identifier vector; the splitting unit 204 is configured to, in response to determining that the false device identifier recognition result indicates that the target device identifier is a real device identifier, perform data splitting processing on the traffic log according to a preset data splitting format to generate a split traffic data set; the clustering unit 205 is configured to cluster the split traffic data set according to the parsing time and the data type corresponding to the split traffic data in the split traffic data set to generate a split traffic data group set; and the sending unit 206 is configured to send the split traffic data group set to an associated data detection terminal according to a preset data sending format.
It is understood that the units described in the industrial safety-based flow log detecting apparatus 200 correspond to the respective steps in the method described with reference to fig. 1. Therefore, the operations, features and advantageous effects of the methods described above are also applicable to the flow log detection apparatus 200 based on industrial safety and the units included therein, and are not described herein again.
Referring now to fig. 3, a block diagram of an electronic device (e.g., a server) 300 suitable for use in implementing some embodiments of the present disclosure is shown. The electronic device in some embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 3 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 3, the electronic device 300 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 301 that may perform various appropriate actions and processes in accordance with a program stored in a read-only memory (ROM) 302 or a program loaded from a storage device 308 into a random access memory (RAM) 303. The RAM 303 also stores various programs and data necessary for the operation of the electronic device 300. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
Generally, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage devices 308 including, for example, magnetic tape, hard disk, etc.; and a communication device 309. The communication means 309 may allow the electronic device 300 to communicate wirelessly or by wire with other devices to exchange data. While fig. 3 illustrates an electronic device 300 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 3 may represent one device or may represent multiple devices, as desired.
In particular, according to some embodiments of the present disclosure, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In some such embodiments, the computer program may be downloaded and installed from a network through the communication device 309, or installed from the storage device 308, or installed from the ROM 302. The computer program, when executed by the processing apparatus 301, performs the above-described functions defined in the methods of some embodiments of the present disclosure.
It should be noted that the computer readable medium described in some embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the present disclosure, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device, or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtain a traffic log from a target industrial network device, where the traffic log includes a target device identifier; vectorize the target device identifier to generate a target device identifier vector; input the target device identifier vector into a pre-trained false device identifier recognition model to obtain a false device identifier recognition result corresponding to the target device identifier vector; in response to determining that the false device identifier recognition result indicates that the target device identifier is a real device identifier, perform data splitting processing on the traffic log according to a preset data splitting format to generate a split traffic data set; cluster the split traffic data set according to the parsing time and the data type corresponding to the split traffic data in the split traffic data set to generate a split traffic data group set; and send the split traffic data group set to an associated data detection terminal according to a preset data sending format.
Computer program code for carrying out operations of embodiments of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in some embodiments of the present disclosure may be implemented in software or in hardware. The described units may also be provided in a processor, which may be described, for example, as: a processor including an obtaining unit, a vectorization unit, an input unit, a splitting unit, a clustering unit, and a sending unit. In some cases the names of these units do not limit the units themselves; for example, the sending unit may also be described as a "unit that sends the split traffic data group set to the associated data detection terminal according to a preset data sending format".
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
The foregoing description covers only some preferred embodiments of the present disclosure and illustrates the technical principles employed. Those skilled in the art will appreciate that the scope of the invention in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above features, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) features having similar functions disclosed in the embodiments of the present disclosure.

Claims (6)

1. An industrial safety-based traffic log detection method, comprising:
obtaining a traffic log from a target industrial network device, wherein the traffic log comprises a target device identifier;
vectorizing the target device identifier to generate a target device identifier vector;
inputting the target device identifier vector into a pre-trained false device identifier recognition model to obtain a false device identifier recognition result corresponding to the target device identifier vector;
in response to determining that the false device identifier recognition result indicates that the target device identifier is a real device identifier, performing data splitting processing on the traffic log according to a preset data splitting format to generate a split traffic data set;
clustering the split traffic data set according to the parsing time and the data type corresponding to the split traffic data in the split traffic data set to generate a split traffic data group set;
and sending the split traffic data group set to an associated data detection terminal according to a preset data sending format.
2. The method of claim 1, wherein before the inputting the target device identifier vector into a pre-trained false device identifier recognition model to obtain a false device identifier recognition result corresponding to the target device identifier vector, the method further comprises:
obtaining a device identifier sample set, wherein the device identifier sample set comprises: a positive device identifier sample group and a negative device identifier sample group;
selecting a positive device identifier sample from the positive device identifier sample group and a negative device identifier sample from the negative device identifier sample group;
vectorizing the positive device identifier sample and the negative device identifier sample respectively to generate a positive device identifier sample vector and a negative device identifier sample vector;
inputting the negative device identifier sample vector into an initial sample denoising model included in an initial false device identifier recognition model to obtain a negative device identifier noise removal vector;
generating device identifier classification information representing whether the device identifier corresponding to the negative device identifier sample is a false device identifier according to the negative device identifier noise removal vector, the negative device identifier sample vector, and an initial device identifier classification model, wherein the initial false device identifier recognition model comprises the initial device identifier classification model;
and training the initial false device identifier recognition model according to the positive device identifier sample vector, the negative device identifier noise removal vector, and the device identifier classification information to obtain a trained false device identifier recognition model.
3. The method of claim 1, wherein the method further comprises:
in response to determining that the false device identifier recognition result indicates that the target device identifier is a false device identifier, generating abnormal traffic log information corresponding to the target industrial network device;
and sending the abnormal traffic log information to an associated network device detection terminal, and controlling an associated alarm device to perform an alarm operation.
4. An industrial safety-based traffic log detection device, comprising:
an obtaining unit configured to obtain a traffic log from a target industrial network device, wherein the traffic log comprises a target device identifier;
a vectorization unit configured to vectorize the target device identifier to generate a target device identifier vector;
an input unit configured to input the target device identifier vector into a pre-trained false device identifier recognition model to obtain a false device identifier recognition result corresponding to the target device identifier vector;
a splitting unit configured to, in response to determining that the false device identifier recognition result indicates that the target device identifier is a real device identifier, perform data splitting processing on the traffic log according to a preset data splitting format to generate a split traffic data set;
a clustering unit configured to cluster the split traffic data set according to the parsing time and the data type corresponding to the split traffic data in the split traffic data set to generate a split traffic data group set;
and a sending unit configured to send the split traffic data group set to an associated data detection terminal according to a preset data sending format.
5. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-3.
6. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-3.
CN202310023746.3A 2023-01-09 2023-01-09 Industrial safety-based flow log detection method and device and electronic equipment Active CN115941357B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310023746.3A CN115941357B (en) 2023-01-09 2023-01-09 Industrial safety-based flow log detection method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN115941357A true CN115941357A (en) 2023-04-07
CN115941357B CN115941357B (en) 2023-05-12

Family

ID=85827088

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310023746.3A Active CN115941357B (en) 2023-01-09 2023-01-09 Industrial safety-based flow log detection method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN115941357B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110750412A (en) * 2019-09-02 2020-02-04 北京云集智造科技有限公司 Log abnormity detection method
CN112511459A (en) * 2020-11-23 2021-03-16 恒安嘉新(北京)科技股份公司 Traffic identification method and device, electronic equipment and storage medium
CN114676104A (en) * 2022-03-28 2022-06-28 珠海金山数字网络科技有限公司 Log generation method and device
CN114912500A (en) * 2021-11-29 2022-08-16 长沙理工大学 Unsupervised log anomaly detection method based on pre-training model
WO2022227388A1 (en) * 2021-04-29 2022-11-03 华为技术有限公司 Log anomaly detection model training method, apparatus and device

Also Published As

Publication number Publication date
CN115941357B (en) 2023-05-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant