CN113010500A - Processing method and processing system for DPI data - Google Patents
- Publication number
- CN113010500A (application CN201911305426.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- dpi
- time period
- dpi data
- sample
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The disclosure provides a processing method and a processing system for DPI data, and relates to the field of data processing. The processing method comprises the following steps: detecting a first time period of missing DPI data; acquiring DPI data of a second time period adjacent to the first time period; inputting the DPI data of the second time period into a DPI data completion model unit; and the DPI data completion model unit generates missing DPI data of the first time period based on the DPI data of the second time period. The method and the device for completing the DPI data achieve completion of the missing DPI data, and reduce the influence of data missing when a user uses the data.
Description
Technical Field
The present disclosure relates to the field of data processing, and in particular, to a method and a system for processing DPI data.
Background
With the rapid development of internet and data technologies, every large internet company now holds a data stock at the PB (petabyte) level and a daily data increment at the hundred-TB (terabyte) level. Data, as the raw material for data service products, is an important asset for large companies, so guaranteeing data stability and availability is core to data operations. DPI (Deep Packet Inspection) data is very large in volume. During data transmission, uncontrollable factors such as network fluctuation, resource load, or abnormal source data may cause DPI data to be lost, which makes subsequent use of the data difficult.
Disclosure of Invention
The technical problem addressed by this disclosure is to provide a processing method for DPI data that completes missing DPI data.
According to an aspect of the present disclosure, there is provided a processing method for deep packet inspection (DPI) data, including: detecting a first time period of missing DPI data; acquiring DPI data of a second time period adjacent to the first time period; inputting the DPI data of the second time period into a DPI data completion model unit; and the DPI data completion model unit generating the missing DPI data of the first time period based on the DPI data of the second time period.
In some embodiments, prior to detecting the first time period of missing DPI data, the processing method further comprises: acquiring sample DPI data of a sample time period; and inputting the sample DPI data into the DPI data completion model unit to train the DPI data completion model unit.
In some embodiments, the step of training the DPI data completion model unit comprises: preprocessing the sample DPI data, and sequentially inputting the preprocessed sample DPI data into a convolutional layer, a rectified linear unit (ReLU) layer, a pooling layer, and a fully connected layer for processing to obtain feature data of the sample DPI data; inputting the feature data of the sample DPI data into a discriminator of a generative adversarial network (GAN); inputting a random value into a generator of the GAN; the generator performing a calculation on the random value to generate random feature data, and inputting the random feature data into the discriminator; the discriminator comparing the feature data of the sample DPI data with the random feature data to obtain a discrimination result; when the discrimination result is not within a predetermined range, the discriminator determining that the current DPI data completion model unit has not reached the optimal state and returning the discrimination result to the generator, so that the generator generates the next random feature data; and when the discrimination result is within the predetermined range, the discriminator determining that the current DPI data completion model unit has reached the optimal state.
In some embodiments, the predetermined range is 0.45 to 0.55.
In some embodiments, the step of the generator generating random feature data comprises: the generator generates a data sequence for an initial time period based on the random value and, taking a preset time period as the incremental time period, extends the data sequence step by step until its time span equals the length of the sample time period; the resulting data sequence is the random feature data, and the time information of the random feature data is captured using a forget gate.
In some embodiments, the preprocessing comprises at least one of: missing-value removal, dimensionality reduction, normalization, and vector encoding.
According to another aspect of the present disclosure, there is provided a processing system for DPI data, comprising: an acquisition unit configured to detect a first time period of missing DPI data, acquire DPI data of a second time period adjacent to the first time period, and input the DPI data of the second time period into a DPI data completion model unit; and the DPI data completion model unit, configured to generate the missing DPI data of the first time period based on the DPI data of the second time period.
In some embodiments, the acquisition unit is further configured to acquire sample DPI data of a sample time period and input the sample DPI data into the DPI data completion model unit; the DPI data completion model unit is further configured to be trained on the sample DPI data.
In some embodiments, the DPI data completion model unit comprises: a data processing module configured to preprocess the sample DPI data, sequentially input the preprocessed sample DPI data into a convolutional layer, a rectified linear unit (ReLU) layer, a pooling layer, and a fully connected layer for processing to obtain feature data of the sample DPI data, and input the feature data of the sample DPI data into a discriminator of a generative adversarial network (GAN); and the GAN, comprising a generator and the discriminator; the generator is configured to receive a random value, perform a calculation on the random value to generate random feature data, and input the random feature data into the discriminator; the discriminator is configured to compare the feature data of the sample DPI data with the random feature data to obtain a discrimination result; when the discrimination result is not within a predetermined range, to determine that the current DPI data completion model unit has not reached the optimal state and return the discrimination result to the generator, so that the generator generates the next random feature data; and when the discrimination result is within the predetermined range, to determine that the current DPI data completion model unit has reached the optimal state.
In some embodiments, the predetermined range is 0.45 to 0.55.
In some embodiments, the generator is configured to generate a data sequence for an initial time period based on the random value and, taking a preset time period as the incremental time period, extend the data sequence step by step until its time span equals the length of the sample time period; the resulting data sequence is the random feature data, and the time information of the random feature data is captured using a forget gate.
In some embodiments, the preprocessing comprises at least one of: missing-value removal, dimensionality reduction, normalization, and vector encoding.
According to another aspect of the present disclosure, there is provided a processing system for DPI data, comprising: a memory; and a processor coupled to the memory, the processor configured to perform the method as previously described based on instructions stored in the memory.
According to another aspect of the present disclosure, a computer-readable storage medium is provided, having stored thereon computer program instructions, which when executed by a processor, implement the steps of the method as previously described.
In the processing method, a first time period of missing DPI data is detected; DPI data of a second time period adjacent to the first time period is acquired; the DPI data of the second time period is input into a DPI data completion model unit; and the DPI data completion model unit generates the missing DPI data of the first time period based on the DPI data of the second time period. The processing method thus completes the missing DPI data and reduces the impact of missing data on users of the data.
Other features of the present disclosure and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The present disclosure may be more clearly understood from the following detailed description, taken with reference to the accompanying drawings, in which:
figure 1 is a flow diagram illustrating a processing method for DPI data according to some embodiments of the present disclosure;
figure 2 is a schematic diagram illustrating missing DPI data according to some embodiments of the present disclosure;
figure 3 is a flow diagram illustrating a method of training a DPI data completion model unit in accordance with some embodiments of the present disclosure;
figure 4 is a block diagram illustrating a processing system for DPI data according to some embodiments of the present disclosure;
figure 5 is a block diagram illustrating a processing system for DPI data in accordance with further embodiments of the present disclosure;
figure 6 is a block diagram illustrating a processing system for DPI data in accordance with further embodiments of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
Figure 1 is a flow diagram illustrating a processing method for DPI data according to some embodiments of the present disclosure. As shown in fig. 1, the processing method may include steps S102 to S108.
In step S102, a first time period of missing DPI data is detected.
Figure 2 is a schematic diagram illustrating missing DPI data according to some embodiments of the present disclosure. For example, as shown in fig. 2, within a span of DPI data, the DPI data for one time period is missing; detection identifies this period as the first time period.
Returning to fig. 1, at step S104, DPI data of a second time period adjacent to the first time period is acquired.
For example, as shown in figure 2, the second time period with DPI data is adjacent to the first time period. In some embodiments, as shown in fig. 2, the second time period may precede the first time period. In other embodiments, the second time period may be subsequent to the first time period. In other embodiments, the second time period may be on both sides of the first time period, i.e., the second time period may be divided into two portions: one part before the first period and the other part after the first period. In either case, the second time period is adjacent to the first time period. In this step, DPI data for the second time period may be acquired.
In step S106, the DPI data of the second time period is input to the DPI data completion model unit. The DPI data completion model unit is a model that has been trained on sample data.
In step S108, the DPI data completion model unit generates missing DPI data of the first period based on the DPI data of the second period.
For example, if the first time period is a particular day and the second time period is the 30 days preceding that day, the DPI data completion model unit may generate the missing DPI data for that day based on the DPI data of the preceding 30 days.
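Steps S102 and S104 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes one DPI record per calendar day stored in a dictionary, and the function names `find_missing_days` and `preceding_window` are hypothetical.

```python
from datetime import date, timedelta


def find_missing_days(records, start, end):
    """Return the days in [start, end] that have no DPI record (step S102)."""
    days = []
    current = start
    while current <= end:
        if current not in records:
            days.append(current)
        current += timedelta(days=1)
    return days


def preceding_window(records, missing_day, window_days=30):
    """Collect available records from the window_days days before missing_day (step S104)."""
    window = []
    for offset in range(window_days, 0, -1):
        day = missing_day - timedelta(days=offset)
        if day in records:
            window.append((day, records[day]))
    return window


# Hypothetical daily DPI values for 2020-01-01 .. 2020-01-30 and 2020-02-01;
# 2020-01-31 is missing.
records = {date(2020, 1, 1) + timedelta(days=i): 1000 + i for i in range(30)}
records[date(2020, 2, 1)] = 1031

missing = find_missing_days(records, date(2020, 1, 1), date(2020, 2, 1))
context = preceding_window(records, missing[0])
```

Here `context` holds the 30 days before the gap, which is the input that would be passed to the completion model in step S106.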
Thus, a processing method for DPI data according to some embodiments of the present disclosure has been described. The processing method comprises: detecting a first time period of missing DPI data; acquiring DPI data of a second time period adjacent to the first time period; inputting the DPI data of the second time period into a DPI data completion model unit; and the DPI data completion model unit generating the missing DPI data of the first time period based on the DPI data of the second time period. The processing method completes the missing DPI data and reduces the impact of missing data on users of the data.
The processing method helps make DPI data resilient to fluctuation and provides technical data for back-end users. The method is based on mining and analysis of a large amount of telecommunication access data; it completes data across multiple dimensions and reduces the impact of data loss on users of the data.
In some embodiments, before step S102, the processing method may further include: acquiring sample DPI data of a sample time period; and inputting the sample DPI data into a DPI data completion model unit to train the DPI data completion model unit. Through training, the DPI data completion model unit which reaches the optimal state can be obtained, and therefore completion of missing DPI data is facilitated.
Figure 3 is a flow diagram illustrating a method of training a DPI data completion model unit according to some embodiments of the present disclosure. The process of training the DPI data completion model unit is described in detail below with reference to fig. 3. As shown in fig. 3, the method may include steps S302 to S314.
In step S302, the sample DPI data is preprocessed, and the preprocessed sample DPI data is sequentially input into a convolutional layer, a Rectified Linear Unit (ReLU) layer, a pooling layer, and a fully connected layer for processing, so as to obtain feature data of the sample DPI data. For example, the sample DPI data may be 30 days of DPI data (30 days being the sample time period). In some embodiments, the sample DPI data may take the form of a data matrix.
In some embodiments, the preprocessing may include at least one of: missing-value removal, dimensionality reduction, normalization, and vector encoding. These preprocessing operations may be performed in a manner known to those skilled in the art and will therefore not be described in detail here.
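Two of the listed preprocessing operations can be sketched as follows. The source does not say which normalization is used; min-max scaling is an assumption here, as are the row layout (a list of numeric rows with `None` marking a missing value) and the function names.

```python
def drop_missing(rows):
    """Remove rows that contain a missing value (None)."""
    return [row for row in rows if all(v is not None for v in row)]


def min_max_normalize(rows):
    """Scale each column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical sample matrix: one row per record, two numeric fields.
sample = [[10.0, 200.0], [None, 250.0], [20.0, 300.0], [30.0, 400.0]]
clean = drop_missing(sample)
normalized = min_max_normalize(clean)
# normalized == [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
```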
The convolutional layer, the ReLU layer, the pooling layer, and the fully connected layer are described below, respectively.
Convolutional layer: the parameters of a convolutional layer consist of a set of learnable filters. Each filter is relatively small spatially (e.g., in width and height), but its depth matches the depth of the input data.
ReLU layer: the ReLU layer applies an activation function and increases the network's ability to separate data nonlinearly.
Pooling layer: pooling layers are typically inserted periodically between convolutional layers. They gradually reduce the spatial size of the data volume, which reduces the number of parameters in the network, lowers computational resource consumption, and helps control overfitting.
Fully connected layer: each neuron of the fully connected layer is connected to all neurons of the previous layer, whereas a neuron in a convolutional layer of a Convolutional Neural Network (CNN) is connected only to a local region of the input data, and the output neurons in each depth slice share parameters.
The convolutional layers, the ReLU layers, the pooling layers, and the fully-connected layers described above may all be convolutional layers, ReLU layers, pooling layers, and fully-connected layers known to those skilled in the art, and thus their specific functions or operations will not be described in detail herein.
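The order of the four layers in step S302 can be illustrated with a toy one-dimensional pipeline. This is only a sketch under simplifying assumptions: the source describes matrix-shaped input and learned parameters, whereas the kernel, weights, and biases below are fixed, hypothetical values.

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation, as used in CNN layers)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Rectified linear unit: negative values become 0."""
    return [max(0.0, v) for v in values]


def max_pool(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def fully_connected(values, weights, bias):
    """Each output neuron is a weighted sum of all inputs plus a bias."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, bias)]


signal = [1.0, 3.0, 2.0, 5.0, 4.0]
kernel = [1.0, -1.0]

convolved = conv1d(signal, kernel)      # [-2.0, 1.0, -3.0, 1.0]
activated = relu(convolved)             # [0.0, 1.0, 0.0, 1.0]
pooled = max_pool(activated)            # [1.0, 1.0]
features = fully_connected(pooled, [[1.0, 1.0], [0.5, -0.5]], [0.0, 1.0])
# features == [2.0, 1.0]
```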
Through this step S302, characteristic data of the sample DPI data can be obtained. The characteristic data may represent the primary information of the sample DPI data. The characteristic data may be embodied in the form of a data matrix, for example.
In step S304, the feature data of the sample DPI data is input to a discriminator of a GAN (Generative Adversarial Network).
The GAN may include a discriminator D and a generator G. For example, the generator G and the discriminator D may be implemented by a network composed of LSTM (Long Short-Term Memory) units. In this step, the characteristic data of the sample DPI data is input to the discriminator D.
In step S306, a random value is input into the generator of GAN.
For example, a known algorithm may be used to generate the random value z and input the random value z into the generator G of GAN.
In step S308, the generator generates random feature data, which is input to the discriminator.
For example, the generator G may perform a calculation on the random value z to generate random feature data, which is input into the discriminator D.
In some embodiments, the step of the generator generating the random feature data may comprise: the generator generates a data sequence for an initial time period based on the random value and, taking a preset time period as the incremental time period, extends the data sequence step by step until its time span equals the length of the sample time period; the resulting data sequence is the random feature data, and the time information of the random feature data is captured using a forget gate.
For example, the generator G first generates a data sequence for day 1 (the initial time period). Taking 1 day as the incremental time period, it then extends the data sequence step by step, for example by using a known algorithm to add the data for day 2, day 3, and so on, until the sequence covers 30 days (the sample time period). The resulting 30-day data sequence is the random feature data, and its time information is captured using a known forget gate technique.
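The day-by-day growth of the generated sequence can be sketched with a scalar, LSTM-style cell. This is a heavily simplified illustration, not the generator of the source: the weights are fixed, hypothetical values rather than learned ones, and the input gate is coupled to the forget gate as `1 - f` for brevity, which a standard LSTM does not do.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def generate_sequence(z, sample_days=30, w_forget=0.8, w_input=0.5):
    """Grow a sequence one day at a time, starting from a random value z.

    A scalar cell state carries time information between days; the forget
    gate f decides how much of the previous state is kept at each step.
    """
    cell = 0.0
    value = z
    sequence = []
    while len(sequence) < sample_days:
        f = sigmoid(w_forget * value)            # forget gate, in (0, 1)
        candidate = math.tanh(w_input * value)   # candidate update, in (-1, 1)
        cell = f * cell + (1.0 - f) * candidate  # keep part of the old state
        value = math.tanh(cell)                  # output for this day
        sequence.append(value)
    return sequence


random.seed(0)
z = random.gauss(0.0, 1.0)   # the random value fed to the generator
fake = generate_sequence(z)  # 30 values, one per day of the sample period
```

The loop stops once the sequence length equals the sample time period, mirroring the "extend until 30 days" condition described above.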
In step S310, the discriminator compares the feature data of the sample DPI data with the random feature data to obtain a discrimination result, and determines whether the discrimination result is within a predetermined range.
For example, the discriminator may compare the feature data of the sample DPI data with the random feature data using a known discrimination method, thereby obtaining a discrimination result, and determine whether the discrimination result is within the predetermined range. If so, the process advances to step S314; otherwise, the process proceeds to step S312.
In some embodiments, the predetermined range may be 0.45 to 0.55.
In step S312, when the discrimination result is not within the predetermined range, the discriminator determines that the current DPI data completion model unit has not reached the optimal state, and returns the discrimination result to the generator. This causes the generator to generate the next random feature data (e.g., random feature data generated from another random value). The generator inputs the next random feature data into the discriminator; the discriminator again compares the feature data of the sample DPI data with the next random feature data to obtain the next discrimination result, and this repeats until the discrimination result is within the predetermined range.
In step S314, when the discrimination result is within the predetermined range, the discriminator determines that the current DPI data completion model unit has reached the optimal state.
Here, the optimal state means that the DPI data completion model unit can be used to perform a completion operation on missing DPI data, and the completed DPI data is very close to the missing real DPI data (i.e. the difference is within an acceptable range).
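The control flow of steps S310 to S314 can be sketched as a loop that stops once a discrimination result falls inside the predetermined range. The sketch covers only the stopping rule; the scores are hypothetical values standing in for real discriminator outputs, and the function names are assumptions.

```python
def in_optimal_range(score, low=0.45, high=0.55):
    """Check whether a discrimination result lies in the predetermined range."""
    return low <= score <= high


def train_until_balanced(discriminator_scores):
    """Consume discrimination results until one lands in the predetermined range.

    Returns the number of rounds used, or None if the range was never reached.
    """
    for round_index, score in enumerate(discriminator_scores, start=1):
        if in_optimal_range(score):
            return round_index  # step S314: optimal state reached
        # step S312: result is returned to the generator, which tries again
    return None


# Hypothetical discrimination results from successive training rounds.
scores = [0.05, 0.21, 0.38, 0.47, 0.50]
rounds = train_until_balanced(scores)  # 4
```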
Thus, a method of training a DPI data completion model unit according to some embodiments of the present disclosure has been described. The method comprises: preprocessing sample DPI data, and sequentially inputting the preprocessed sample DPI data into a convolutional layer, a rectified linear unit (ReLU) layer, a pooling layer, and a fully connected layer for processing to obtain feature data of the sample DPI data; inputting the feature data of the sample DPI data into a discriminator of the GAN; inputting a random value into a generator of the GAN; the generator performing a calculation on the random value to generate random feature data, and inputting the random feature data into the discriminator; the discriminator comparing the feature data of the sample DPI data with the random feature data to obtain a discrimination result; when the discrimination result is not within the predetermined range, the discriminator determining that the current DPI data completion model unit has not reached the optimal state and returning the discrimination result to the generator, so that the generator generates the next random feature data; and when the discrimination result is within the predetermined range, the discriminator determining that the current DPI data completion model unit has reached the optimal state.
For example, the discriminator D compares the feature data of the processed sample DPI data with the random feature data generated by the generator G. When the discrimination result D(G(z)) is about 0.5, the model has reached the optimal state; that is, the data generated by the generator differs little from the real data. The generator can therefore generate the missing DPI data of a given time period to complete the data, which makes the data resilient to fluctuation.
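The source does not state a training objective. Assuming the standard GAN minimax formulation, the value 0.5 can be motivated as follows, where p_data is the distribution of real feature data, p_z the distribution of the random value, and p_g the distribution of generated data (these symbols are not in the source):

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

D^{*}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)},
\qquad
p_g = p_{\mathrm{data}} \;\Longrightarrow\; D^{*}(x) = \tfrac{1}{2}
```

For a fixed generator, the optimal discriminator is D*; when the generated distribution matches the real one, D* outputs 1/2 everywhere, meaning it can no longer tell generated data from real data. Under this assumption, the range 0.45 to 0.55 acts as a tolerance band around that equilibrium value.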
Through the training of the DPI data completion model unit, the DPI data completion model unit can implement completion operation on missing DPI data.
The DPI data completion model unit differs from some existing algorithm models, such as the K-means clustering algorithm. The model of the embodiments of the present disclosure is closer to actual application scenarios: it adds intelligent extraction of hidden features, captures long-range dependencies in time-series information, and generates missing data adversarially. It is applied to a multi-channel stream-processing cleaning platform for big data, serves as the core algorithm model of that platform, and gives the platform system resilience to data fluctuation.
In some embodiments, the DPI data completion model unit may be trained continuously on sample DPI data of a sample time period (for example, 30 days) before the current day, so that the output of the DPI data completion model unit stays as close as possible to the real data.
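Selecting the rolling training window can be sketched as follows, assuming daily granularity; the function name `training_window` is hypothetical.

```python
from datetime import date, timedelta


def training_window(today, sample_days=30):
    """Return the sample_days dates immediately before today, oldest first."""
    return [today - timedelta(days=offset)
            for offset in range(sample_days, 0, -1)]


window = training_window(date(2020, 3, 1))
# window[0] is 2020-01-31 and window[-1] is 2020-02-29
```

Calling this once per day moves the window forward by one day, so each retraining run sees the most recent 30 days of sample DPI data.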
Figure 4 is a block diagram illustrating a processing system for DPI data according to some embodiments of the present disclosure. As shown in fig. 4, the processing system may include an acquisition unit 410 and a DPI data completion model unit 420.
The acquisition unit 410 is configured to detect a first time period in which DPI data is missing, acquire DPI data of a second time period adjacent to the first time period, and input the DPI data of the second time period to the DPI data completion model unit 420.
The DPI data completion model unit 420 is configured to generate missing DPI data for the first time period based on the DPI data for the second time period.
Thus, a processing system for DPI data is provided according to some embodiments of the present disclosure. In the processing system, an acquisition unit is used for detecting a first time period of missing DPI data, acquiring DPI data of a second time period adjacent to the first time period, and inputting the DPI data of the second time period into a DPI data completion model unit; the DPI data completion model unit is used for generating missing DPI data of the first time period based on the DPI data of the second time period. The processing system realizes the completion of missing DPI data and reduces the influence of data missing when a user uses the data.
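The division of work between the two units can be sketched as two small classes. This is an illustrative structure only: the class and method names are hypothetical, days are plain integer indices, and `mean_model` is a stand-in for the trained GAN generator that the source uses.

```python
from typing import Callable, Dict, List


class AcquisitionUnit:
    """Detects missing days and gathers the adjacent (preceding) data."""

    def __init__(self, records: Dict[int, float], window: int = 30) -> None:
        self.records = records  # hypothetical layout: day index -> DPI value
        self.window = window

    def detect_missing(self, first_day: int, last_day: int) -> List[int]:
        return [d for d in range(first_day, last_day + 1)
                if d not in self.records]

    def adjacent_data(self, missing_day: int) -> List[float]:
        days = range(missing_day - self.window, missing_day)
        return [self.records[d] for d in days if d in self.records]


class CompletionModelUnit:
    """Wraps whatever trained model maps adjacent data to the missing value."""

    def __init__(self, model: Callable[[List[float]], float]) -> None:
        self.model = model

    def complete(self, context: List[float]) -> float:
        return self.model(context)


def mean_model(context: List[float]) -> float:
    # Stand-in only: the source uses a trained GAN generator here.
    return sum(context) / len(context)


records = {0: 1.0, 1: 2.0, 2: 3.0, 4: 5.0}  # day 3 is missing
acquisition = AcquisitionUnit(records, window=3)
model_unit = CompletionModelUnit(mean_model)
for day in acquisition.detect_missing(0, 4):
    records[day] = model_unit.complete(acquisition.adjacent_data(day))
```

Keeping the model behind a callable means the placeholder can be swapped for a trained generator without changing the acquisition logic.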
In some embodiments, the acquisition unit 410 may be further configured to acquire sample DPI data for the sample time period and input the sample DPI data to the DPI data completion model unit 420. The DPI data completion model unit 420 may also be configured to be trained on the sample DPI data.
In some embodiments, as shown in fig. 4, the DPI data completion model unit 420 may include a data processing module 421 and a GAN 422.
The data processing module 421 is configured to preprocess the sample DPI data, sequentially input the preprocessed sample DPI data into the convolutional layer, the rectified linear unit (ReLU) layer, the pooling layer, and the fully connected layer for processing so as to obtain feature data of the sample DPI data, and input the feature data of the sample DPI data into the discriminator 4222 of the GAN 422. For example, the preprocessing may include at least one of: missing-value removal, dimensionality reduction, normalization, and vector encoding.
The GAN 422 may include a generator 4221 and a discriminator 4222.
The generator 4221 is configured to receive the random value, perform a calculation on the random value to generate random feature data, and input the random feature data into the discriminator 4222.
The discriminator 4222 is configured to compare the feature data of the sample DPI data with the random feature data to obtain a discrimination result; when the discrimination result is not within the predetermined range, to determine that the current DPI data completion model unit 420 has not reached the optimal state and return the discrimination result to the generator 4221, so that the generator 4221 generates the next random feature data; and when the discrimination result is within the predetermined range, to determine that the current DPI data completion model unit 420 has reached the optimal state.
In some embodiments, the predetermined range may be 0.45 to 0.55.
In some embodiments, the generator 4221 may be configured to generate a data sequence of an initial time period based on the random value and, using a preset time period as the increment, extend the data sequence step by step until its time period equals the length of the sample time period; the resulting data sequence is the random feature data. The generator 4221 acquires time information of the random feature data by using a forget gate.
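The incremental generation described above can be sketched with a simplified, scalar LSTM-style cell in which a forget gate controls how much of the earlier state carries forward in time. This is only an illustration under assumptions: the disclosure does not specify the generator's network, so the gate formulas, the fixed untrained weights, the noise level, and the names `lstm_like_step` and `grow_sequence` are hypothetical.

```python
import math
import random
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_like_step(cell: float, x: float, noise: float) -> float:
    """One scalar LSTM-style update; returns the new cell state.

    The weights are fixed, hypothetical constants; a real generator would learn them.
    """
    forget = sigmoid(x + 1.0)            # forget gate: share of the old state to keep
    write = sigmoid(0.5 * x)             # input gate: share of new information to write
    candidate = math.tanh(x + noise)     # candidate value derived from the last output
    return forget * cell + write * candidate


def grow_sequence(seed: int, initial_len: int, increment: int, target_len: int) -> List[float]:
    """Grow a sequence from initial_len to target_len in chunks of `increment` steps."""
    if not 1 <= initial_len <= target_len or increment < 1:
        raise ValueError("require 1 <= initial_len <= target_len and increment >= 1")
    rng = random.Random(seed)            # the random value that seeds generation
    sequence = [rng.uniform(-1.0, 1.0) for _ in range(initial_len)]
    cell = 0.0
    for value in sequence[:-1]:          # fold the initial period into the cell state
        cell = lstm_like_step(cell, value, 0.0)
    while len(sequence) < target_len:
        chunk = min(increment, target_len - len(sequence))  # last chunk may be shorter
        for _ in range(chunk):
            cell = lstm_like_step(cell, sequence[-1], rng.gauss(0.0, 0.1))
            sequence.append(math.tanh(cell))
    return sequence


random_features = grow_sequence(seed=7, initial_len=2, increment=3, target_len=10)
```

Because each new value depends on the cell state, which the forget gate attenuates at every step, later values reflect recent history more strongly than distant history; this is one way to read "acquiring time information" in the text.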
Figure 5 is a block diagram illustrating a processing system for DPI data in accordance with further embodiments of the present disclosure. The processing system includes a memory 510 and a processor 520. Wherein:
the memory 510 may be a magnetic disk, flash memory, or any other non-volatile storage medium. The memory is used for storing instructions in the embodiments corresponding to fig. 1 and/or fig. 3.
In some embodiments, as shown in fig. 6, the processing system 600 includes a memory 610 and a processor 620. The processor 620 is coupled to the memory 610 through a bus 630. The processing system 600 may also be connected to an external storage device 650 through a storage interface 640 in order to access external data, and may be connected to a network or another computer system (not shown) through a network interface 660; these are not described in detail here.
In this embodiment, instructions are stored in the memory and executed by the processor, so that missing DPI data is completed and the impact of missing data on users of the data is reduced.
In other embodiments, the present disclosure also provides a computer-readable storage medium on which computer program instructions are stored, the instructions implementing the steps of the method in the embodiment corresponding to fig. 1 and/or fig. 3 when executed by a processor. As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, apparatus, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Thus far, the present disclosure has been described in detail. Some details that are well known in the art have not been described in order to avoid obscuring the concepts of the present disclosure. It will be fully apparent to those skilled in the art from the foregoing description how to practice the presently disclosed embodiments.
The method and system of the present disclosure may be implemented in a number of ways. For example, the methods and systems of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
Although some specific embodiments of the present disclosure have been described in detail by way of example, it should be understood by those skilled in the art that the foregoing examples are for purposes of illustration only and are not intended to limit the scope of the present disclosure. It will be appreciated by those skilled in the art that modifications may be made to the above embodiments without departing from the scope and spirit of the present disclosure. The scope of the present disclosure is defined by the appended claims.
Claims (14)
1. A processing method for Deep Packet Inspection (DPI) data comprises the following steps:
detecting a first time period of missing DPI data;
acquiring DPI data of a second time period adjacent to the first time period;
inputting the DPI data of the second time period into a DPI data completion model unit; and
the DPI data completion model unit generates missing DPI data of the first time period based on the DPI data of the second time period.
2. The processing method of claim 1, wherein prior to detecting the first time period of missing DPI data, the processing method further comprises:
acquiring sample DPI data of a sample time period; and
inputting the sample DPI data into the DPI data completion model unit to train the DPI data completion model unit.
3. The processing method of claim 2, wherein the step of training the DPI data completion model unit comprises:
preprocessing the sample DPI data, and sequentially inputting the preprocessed sample DPI data into a convolutional layer, a rectified linear unit (ReLU) layer, a pooling layer, and a fully connected layer for processing to obtain feature data of the sample DPI data;
inputting the feature data of the sample DPI data into a discriminator of a generative adversarial network (GAN);
inputting a random value into a generator of the GAN;
computing, by the generator, random feature data from the random value, and inputting the random feature data into the discriminator; and
comparing, by the discriminator, the feature data of the sample DPI data with the random feature data to obtain a determination result;
wherein when the determination result is not within a predetermined range, the discriminator determines that the current DPI data completion model unit has not reached the optimal state and returns the determination result to the generator, so that the generator generates the next random feature data;
and when the determination result is within the predetermined range, the discriminator determines that the current DPI data completion model unit has reached the optimal state.
4. The processing method of claim 3, wherein
the predetermined range is 0.45 to 0.55.
5. The processing method of claim 3, wherein the step of generating the random feature data by the generator comprises:
generating, by the generator, a data sequence of an initial time period based on the random value, extending the data sequence step by step using a preset time period as the increment until the time period of the data sequence equals the length of the sample time period, the resulting data sequence being the random feature data, and acquiring time information of the random feature data by using a forget gate.
6. The processing method of claim 3, wherein
the preprocessing comprises at least one of: missing-value removal, dimension reduction, normalization, and vector encoding.
7. A processing system for DPI data, comprising:
an acquisition unit configured to detect a first time period of missing DPI data, acquire DPI data of a second time period adjacent to the first time period, and input the DPI data of the second time period into a DPI data completion model unit; and
the DPI data completion model unit, configured to generate the missing DPI data of the first time period based on the DPI data of the second time period.
8. The processing system of claim 7, wherein
the acquisition unit is further configured to acquire sample DPI data of a sample time period and input the sample DPI data into the DPI data completion model unit; and
the DPI data completion model unit is further configured to be trained based on the sample DPI data.
9. The processing system of claim 8, wherein the DPI data completion model unit comprises:
a data processing module configured to preprocess the sample DPI data, sequentially input the preprocessed sample DPI data into a convolutional layer, a rectified linear unit (ReLU) layer, a pooling layer, and a fully connected layer for processing to obtain feature data of the sample DPI data, and input the feature data of the sample DPI data into a discriminator of a generative adversarial network (GAN); and
the GAN, comprising a generator and the discriminator, wherein
the generator is configured to receive a random value, compute random feature data from the random value, and input the random feature data into the discriminator; and
the discriminator is configured to compare the feature data of the sample DPI data with the random feature data to obtain a determination result; when the determination result is not within a predetermined range, determine that the current DPI data completion model unit has not reached the optimal state and return the determination result to the generator, so that the generator generates the next random feature data; and when the determination result is within the predetermined range, determine that the current DPI data completion model unit has reached the optimal state.
10. The processing system of claim 9, wherein
the predetermined range is 0.45 to 0.55.
11. The processing system of claim 9, wherein
the generator is configured to generate a data sequence of an initial time period based on the random value, extend the data sequence step by step using a preset time period as the increment until the time period of the data sequence equals the length of the sample time period, the resulting data sequence being the random feature data, and acquire time information of the random feature data by using a forget gate.
12. The processing system of claim 9, wherein
the preprocessing comprises at least one of: missing-value removal, dimension reduction, normalization, and vector encoding.
13. A processing system for DPI data, comprising:
a memory; and
a processor coupled to the memory, the processor configured to perform the method of any of claims 1-6 based on instructions stored in the memory.
14. A computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911305426.7A CN113010500A (en) | 2019-12-18 | 2019-12-18 | Processing method and processing system for DPI data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113010500A (en) | 2021-06-22 |
Family
ID=76381114
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911305426.7A Pending CN113010500A (en) | 2019-12-18 | 2019-12-18 | Processing method and processing system for DPI data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113010500A (en) |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 20220126. Address after: 100007 room 205-32, floor 2, building 2, No. 1 and No. 3, qinglonghutong a, Dongcheng District, Beijing. Applicant after: Tianyiyun Technology Co.,Ltd. Address before: No.31, Financial Street, Xicheng District, Beijing, 100033. Applicant before: CHINA TELECOM Corp.,Ltd. |