CN113010500A - Processing method and processing system for DPI data - Google Patents

Publication number
CN113010500A
Authority
CN
China
Prior art keywords
data
dpi
time period
dpi data
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911305426.7A
Other languages
Chinese (zh)
Inventor
安翔宇
闫健儒
马奕凡
朱晨曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Cloud Technology Co Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CN201911305426.7A
Publication of CN113010500A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The disclosure provides a processing method and a processing system for DPI data, and relates to the field of data processing. The processing method comprises the following steps: detecting a first time period in which DPI data is missing; acquiring DPI data of a second time period adjacent to the first time period; inputting the DPI data of the second time period into a DPI data completion model unit; and generating, by the DPI data completion model unit, the missing DPI data of the first time period based on the DPI data of the second time period. The method and the system complete the missing DPI data and reduce the impact of missing data on users of the data.

Description

Processing method and processing system for DPI data
Technical Field
The present disclosure relates to the field of data processing, and in particular, to a method and a system for processing DPI data.
Background
With the rapid development of Internet technology and data technology, every large Internet company now holds petabyte (PB)-scale data stock, with daily increments on the order of hundreds of terabytes (TB). Data, as the raw material for data service products, is an important asset for large companies. Guaranteeing data stability and availability is therefore a core task of data operations. DPI (Deep Packet Inspection) data is data of very large magnitude. During data transmission, uncontrollable factors such as network fluctuation, resource load, or source data anomalies may cause DPI data to be lost, which makes subsequent use difficult.
Disclosure of Invention
The technical problem addressed by the present disclosure is to provide a processing method for DPI data that completes missing DPI data.
According to an aspect of the present disclosure, there is provided a processing method for deep packet inspection (DPI) data, including: detecting a first time period in which DPI data is missing; acquiring DPI data of a second time period adjacent to the first time period; inputting the DPI data of the second time period into a DPI data completion model unit; and generating, by the DPI data completion model unit, the missing DPI data of the first time period based on the DPI data of the second time period.
In some embodiments, prior to detecting the first time period of missing DPI data, the processing method further comprises: acquiring sample DPI data of a sample time period; and inputting the sample DPI data into the DPI data completion model unit to train the DPI data completion model unit.
In some embodiments, the step of training the DPI data completion model unit comprises: preprocessing the sample DPI data, and sequentially inputting the preprocessed sample DPI data into a convolutional layer, a rectified linear unit (ReLU) layer, a pooling layer, and a fully connected layer for processing to obtain feature data of the sample DPI data; inputting the feature data of the sample DPI data into a discriminator of a generative adversarial network (GAN); inputting a random value into a generator of the GAN; the generator performing a computation on the random value to generate random feature data and inputting the random feature data into the discriminator; the discriminator comparing the feature data of the sample DPI data with the random feature data and making a judgment to obtain a discrimination result; when the discrimination result is not within a predetermined range, the discriminator determining that the current DPI data completion model unit has not reached the optimal state and returning the discrimination result to the generator, so that the generator generates the next random feature data; and when the discrimination result is within the predetermined range, the discriminator determining that the current DPI data completion model unit has reached the optimal state.
In some embodiments, the predetermined range is 0.45 to 0.55.
In some embodiments, the step of the generator generating random feature data comprises: the generator generates a data sequence for an initial time period based on the random value and, using a preset time period as the increment, extends the data sequence step by step until its time span equals the length of the sample time period; the resulting data sequence is the random feature data, and the time information of the random feature data is captured by using a forget gate.
In some embodiments, the preprocessing comprises at least one of: missing-value removal, dimensionality reduction, normalization, and vector encoding.
According to another aspect of the present disclosure, there is provided a processing system for DPI data, comprising: an acquisition unit configured to detect a first time period in which DPI data is missing, acquire DPI data of a second time period adjacent to the first time period, and input the DPI data of the second time period into a DPI data completion model unit; and the DPI data completion model unit, configured to generate the missing DPI data of the first time period based on the DPI data of the second time period.
In some embodiments, the acquisition unit is further configured to acquire sample DPI data of a sample time period and input the sample DPI data into the DPI data completion model unit; and the DPI data completion model unit is further configured to be trained based on the sample DPI data.
In some embodiments, the DPI data completion model unit comprises: a data processing module configured to preprocess the sample DPI data, sequentially input the preprocessed sample DPI data into a convolutional layer, a rectified linear unit (ReLU) layer, a pooling layer, and a fully connected layer for processing to obtain feature data of the sample DPI data, and input the feature data of the sample DPI data into a discriminator of a generative adversarial network (GAN); and the GAN, including a generator and the discriminator. The generator is configured to receive a random value, perform a computation on the random value to generate random feature data, and input the random feature data into the discriminator. The discriminator is configured to: compare the feature data of the sample DPI data with the random feature data and make a judgment to obtain a discrimination result; when the discrimination result is not within a predetermined range, determine that the current DPI data completion model unit has not reached the optimal state and return the discrimination result to the generator, so that the generator generates the next random feature data; and when the discrimination result is within the predetermined range, determine that the current DPI data completion model unit has reached the optimal state.
In some embodiments, the predetermined range is 0.45 to 0.55.
In some embodiments, the generator is configured to generate a data sequence for an initial time period based on the random value and, using a preset time period as the increment, extend the data sequence step by step until its time span equals the length of the sample time period; the resulting data sequence is the random feature data, and the time information of the random feature data is captured by using a forget gate.
In some embodiments, the preprocessing comprises at least one of: missing-value removal, dimensionality reduction, normalization, and vector encoding.
According to another aspect of the present disclosure, there is provided a processing system for DPI data, comprising: a memory; and a processor coupled to the memory, the processor configured to perform the method as previously described based on instructions stored in the memory.
According to another aspect of the present disclosure, a computer-readable storage medium is provided, having stored thereon computer program instructions, which when executed by a processor, implement the steps of the method as previously described.
In the processing method, a first time period in which DPI data is missing is detected; DPI data of a second time period adjacent to the first time period is acquired; the DPI data of the second time period is input into a DPI data completion model unit; and the DPI data completion model unit generates the missing DPI data of the first time period based on the DPI data of the second time period. The processing method thereby completes the missing DPI data and reduces the impact of missing data on users of the data.
Other features of the present disclosure and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The present disclosure may be more clearly understood from the following detailed description, taken with reference to the accompanying drawings, in which:
figure 1 is a flow diagram illustrating a processing method for DPI data according to some embodiments of the present disclosure;
figure 2 is a schematic diagram illustrating missing DPI data according to some embodiments of the present disclosure;
figure 3 is a flow diagram illustrating a method of training a DPI data completion model unit in accordance with some embodiments of the present disclosure;
figure 4 is a block diagram illustrating a processing system for DPI data according to some embodiments of the present disclosure;
figure 5 is a block diagram illustrating a processing system for DPI data in accordance with further embodiments of the present disclosure;
figure 6 is a block diagram illustrating a processing system for DPI data in accordance with further embodiments of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
Meanwhile, it should be understood that, for convenience of description, the sizes of the respective portions shown in the drawings are not drawn to scale.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
Figure 1 is a flow diagram illustrating a processing method for DPI data according to some embodiments of the present disclosure. As shown in fig. 1, the processing method may include steps S102 to S108.
In step S102, a first time period in which DPI data is missing is detected.
Figure 2 is a schematic diagram illustrating missing DPI data according to some embodiments of the present disclosure. For example, as shown in fig. 2, within a span of DPI data, the DPI data of a first time period is missing, and this first time period of missing DPI data can be detected.
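The detection in step S102 can be illustrated with a minimal sketch. It assumes, purely for illustration, that DPI data is partitioned by calendar day and that the set of days with data is known; the function name `find_missing_periods` and this daily granularity are hypothetical and not taken from the disclosure.

```python
from datetime import date, timedelta

def find_missing_periods(days_with_data, start, end):
    """Return (first_missing_day, last_missing_day) pairs for every gap in [start, end].

    Assumption: DPI data is stored per calendar day (hypothetical layout).
    """
    present = set(days_with_data)
    gaps = []
    gap_start = None
    day = start
    while day <= end:
        if day not in present:
            # Open a new gap if we are not already inside one.
            if gap_start is None:
                gap_start = day
        elif gap_start is not None:
            # Data is present again, so the gap ended on the previous day.
            gaps.append((gap_start, day - timedelta(days=1)))
            gap_start = None
        day += timedelta(days=1)
    if gap_start is not None:
        gaps.append((gap_start, end))
    return gaps

# Example: data exists for 1-10 March 2020 except 5 and 6 March.
observed = [date(2020, 3, d) for d in range(1, 11) if d not in (5, 6)]
missing = find_missing_periods(observed, date(2020, 3, 1), date(2020, 3, 10))
```

Each returned pair corresponds to one "first time period" in the sense used above.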
Returning to fig. 1, at step S104, DPI data of a second time period adjacent to the first time period is acquired.
For example, as shown in figure 2, the second time period with DPI data is adjacent to the first time period. In some embodiments, as shown in fig. 2, the second time period may precede the first time period. In other embodiments, the second time period may be subsequent to the first time period. In other embodiments, the second time period may be on both sides of the first time period, i.e., the second time period may be divided into two portions: one part before the first period and the other part after the first period. In either case, the second time period is adjacent to the first time period. In this step, DPI data for the second time period may be acquired.
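The three placements of the second time period (before, after, or on both sides of the gap) can be sketched as follows. The function `adjacent_period`, its `side` parameter, and the even split used for the two-sided case are illustrative assumptions; the disclosure does not specify how a two-sided period is divided.

```python
from datetime import date, timedelta

def adjacent_period(gap_start, gap_end, length_days, side="before"):
    """Return the days that form the second time period adjacent to the gap.

    side: "before", "after", or "both" (length split across the two sides;
    the split rule is an assumption made for this sketch).
    """
    one = timedelta(days=1)
    if side == "before":
        return [gap_start - one * k for k in range(length_days, 0, -1)]
    if side == "after":
        return [gap_end + one * k for k in range(1, length_days + 1)]
    if side == "both":
        half = length_days // 2
        before = [gap_start - one * k for k in range(half, 0, -1)]
        after = [gap_end + one * k for k in range(1, length_days - half + 1)]
        return before + after
    raise ValueError("side must be 'before', 'after', or 'both'")

# Example: the gap is 5 March 2020; take a 4-day second time period.
gap = date(2020, 3, 5)
before_days = adjacent_period(gap, gap, 4, side="before")
both_days = adjacent_period(gap, gap, 4, side="both")
```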
In step S106, the DPI data of the second time period is input to the DPI data completion model unit. The DPI data completion model unit is a model that has been trained on sample data.
In step S108, the DPI data completion model unit generates missing DPI data of the first period based on the DPI data of the second period.
For example, if the first time period is a given day and the second time period is the 30 days preceding that day, the DPI data completion model unit may generate the missing DPI data of that day based on the DPI data of the preceding 30 days.
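Steps S104 to S108 for the 30-day example can be sketched as below. The per-day scalar `series` format and the function `complete_missing_day` are hypothetical, and the default `model` is only a placeholder (the mean of the adjacent days) that stands in for the trained DPI data completion model unit; it is not the GAN-based model of the disclosure.

```python
from datetime import date, timedelta
from statistics import mean

def complete_missing_day(series, missing_day, window_days=30, model=None):
    """Generate a value for missing_day from the preceding window_days of data.

    series: dict mapping date -> numeric DPI metric (hypothetical format).
    model:  callable taking the list of adjacent values; defaults to a
            placeholder mean, NOT the trained completion model.
    """
    # S104: acquire the DPI data of the adjacent second time period.
    window = [missing_day - timedelta(days=k) for k in range(window_days, 0, -1)]
    history = [series[d] for d in window if d in series]
    if not history:
        raise ValueError("no adjacent data available for completion")
    # S106 and S108: feed the adjacent data to the model and generate the value.
    predictor = model if model is not None else mean
    return predictor(history)

# Example: 1-30 March 2020 are present, 31 March is missing.
series = {date(2020, 3, d): float(d) for d in range(1, 31)}
completed = complete_missing_day(series, date(2020, 3, 31))
```

Passing a trained model as `model` keeps the surrounding control flow unchanged.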
Thus, a processing method for DPI data according to some embodiments of the present disclosure is provided. The processing method comprises the following steps: detecting a first time period in which DPI data is missing; acquiring DPI data of a second time period adjacent to the first time period; inputting the DPI data of the second time period into a DPI data completion model unit; and generating, by the DPI data completion model unit, the missing DPI data of the first time period based on the DPI data of the second time period. The processing method completes the missing DPI data and reduces the impact of missing data on users of the data.
The processing method helps make DPI data robust to fluctuation and provides data support for back-end users. The method is based on mining and analysis of a large volume of telecommunication access data, completes data along multiple dimensions, and reduces the impact of missing data on users of the data.
In some embodiments, before step S102, the processing method may further include: acquiring sample DPI data of a sample time period; and inputting the sample DPI data into a DPI data completion model unit to train the DPI data completion model unit. Through training, the DPI data completion model unit which reaches the optimal state can be obtained, and therefore completion of missing DPI data is facilitated.
Figure 3 is a flow diagram illustrating a method of training a DPI data completion model unit according to some embodiments of the present disclosure. The process of training the DPI data completion model unit is described in detail below with reference to fig. 3. As shown in fig. 3, the method may include steps S302 to S314.
In step S302, sample DPI data is preprocessed, and the preprocessed sample DPI data is sequentially input to a convolutional layer, a rectified linear unit (ReLU) layer, a pooling layer, and a fully connected layer for processing, so as to obtain feature data of the sample DPI data. For example, the sample DPI data may be DPI data for 30 days (as the sample time period). In some embodiments, the sample DPI data may be embodied in the form of a data matrix.
In some embodiments, the preprocessing may include at least one of: missing-value removal, dimensionality reduction, normalization, and vector encoding. These preprocessing operations may be performed in a manner known to those skilled in the art and will therefore not be described in detail here.
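As an illustration of three of the listed preprocessing operations, the sketch below removes rows with missing values, applies min-max normalization per column, and one-hot encodes a categorical field. The disclosure leaves the concrete techniques open, so these particular choices, the function names, and the row layout are assumptions; dimensionality reduction is omitted for brevity.

```python
def drop_missing(rows):
    """Missing-value removal: keep only rows in which every field is present."""
    return [row for row in rows if all(value is not None for value in row)]

def min_max_normalize(rows):
    """Normalization: scale each column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def one_hot(labels):
    """Vector encoding: map each categorical label to a one-hot vector."""
    vocabulary = sorted(set(labels))
    return [[1 if label == entry else 0 for entry in vocabulary] for label in labels]

# Example rows: (packet count, byte count); the second row has a missing field.
raw = [[10, 200], [None, 300], [20, 400], [30, 600]]
clean = min_max_normalize(drop_missing(raw))
encoded = one_hot(["http", "dns", "http"])
```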
The convolutional layer, the ReLU layer, the pooling layer, and the fully connected layer are described below, respectively.
Convolutional layer: the parameters of a convolutional layer consist of a set of learnable filters. Each filter is relatively small spatially (e.g., in width and height), but its depth matches the depth of the input data.
ReLU layer: the ReLU layer applies an activation function, which increases the nonlinear separation capability of the network.
Pooling layer: pooling layers are typically inserted periodically between convolutional layers. They gradually reduce the spatial size of the data volume, which reduces the number of parameters in the network, lowers computational resource consumption, and helps control overfitting.
Fully connected layer: each neuron of a fully connected layer is connected to all neurons of the previous layer, whereas each neuron of a convolutional layer in a convolutional neural network (CNN) is connected only to a local region of the input data, and the neurons in each depth slice of the output share parameters.
The convolutional layer, the ReLU layer, the pooling layer, and the fully connected layer described above may all be layers known to those skilled in the art, and thus their specific functions or operations will not be described in detail herein.
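The order of the four layers can be made concrete with a small forward pass. This is a one-dimensional sketch with hand-picked, untrained weights; the kernel, pool size, and weight values are illustrative assumptions, and a real implementation would operate on the data matrix with learned parameters.

```python
def conv1d(signal, kernel):
    """Convolutional layer: slide a small filter over the input (valid padding)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(values):
    """ReLU layer: element-wise max(0, x)."""
    return [max(0.0, v) for v in values]

def max_pool(values, size):
    """Pooling layer: keep the maximum of each non-overlapping window."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

def fully_connected(values, weights, biases):
    """Fully connected layer: every output depends on every input."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, biases)]

def extract_features(signal, kernel, pool_size, weights, biases):
    """Apply the layers in the order conv -> ReLU -> pooling -> fully connected."""
    pooled = max_pool(relu(conv1d(signal, kernel)), pool_size)
    return fully_connected(pooled, weights, biases)

# Example with illustrative (untrained) parameters.
features = extract_features(
    signal=[1.0, 3.0, 2.0, 5.0, 4.0],
    kernel=[1.0, -1.0],
    pool_size=2,
    weights=[[0.5, 0.5]],
    biases=[0.0],
)
```

Here the convolution yields `[-2, 1, -3, 1]`, ReLU zeroes the negatives, pooling reduces the four values to two, and the fully connected layer combines them into a single feature.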
Through this step S302, characteristic data of the sample DPI data can be obtained. The characteristic data may represent the primary information of the sample DPI data. The characteristic data may be embodied in the form of a data matrix, for example.
In step S304, the feature data of the sample DPI data is input to a discriminator of a GAN (Generative Adversarial Network).
The GAN may include a discriminator D and a generator G. For example, the generator G and the discriminator D may be implemented by a network composed of LSTM (Long Short-Term Memory) units. In this step, the characteristic data of the sample DPI data is input to the discriminator D.
In step S306, a random value is input into the generator of GAN.
For example, a known algorithm may be used to generate the random value z and input the random value z into the generator G of GAN.
In step S308, the generator generates random feature data, which is input to the discriminator.
For example, the generator G may perform a computation on the random value z to generate random feature data, which is input into the discriminator D.
In some embodiments, the step of the generator generating the random feature data may comprise: the generator generates a data sequence for an initial time period based on the random value and, using a preset time period as the increment, extends the data sequence step by step until its time span equals the length of the sample time period; the resulting data sequence is the random feature data, and the time information of the random feature data is captured by using a forget gate.
For example, the generator G first generates a data sequence for day 1. Then, using 1 day as the preset incremental time period, it extends the data sequence step by step (day 2, day 3, and so on), for example with a known algorithm, until the data sequence covers 30 days (the sample time period). The resulting 30-day data sequence is the random feature data, and the time information of the random feature data is captured using a known forget gate technique.
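The day-by-day growth of the generated sequence, with a forget gate carrying time information forward, can be sketched with a scalar LSTM cell. The gate equations are the standard LSTM ones; the scalar state, the randomly initialized weights, and the names `lstm_step`, `generate_sequence`, and `random_params` are illustrative assumptions rather than the generator of the disclosure.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p holds hypothetical, untrained weights."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate state
    c = f * c_prev + i * g   # the forget gate decides how much past state to keep
    h = o * math.tanh(c)
    return h, c

def generate_sequence(z, total_days, params, increment_days=1):
    """Grow a sequence from the random value z until it spans total_days."""
    if total_days < 1 or increment_days < 1:
        raise ValueError("total_days and increment_days must be positive")
    h, c = lstm_step(z, 0.0, 0.0, params)
    sequence = [h]  # data sequence for the initial time period (day 1)
    while len(sequence) < total_days:
        steps = min(increment_days, total_days - len(sequence))
        for _ in range(steps):
            # Each new day is generated from the previous day's output.
            h, c = lstm_step(sequence[-1], h, c, params)
            sequence.append(h)
    return sequence

def random_params(rng):
    """Untrained weights for the four gates (illustration only)."""
    return {prefix + gate: rng.uniform(-1.0, 1.0)
            for gate in ("f", "i", "o", "g") for prefix in ("w", "u", "b")}

rng = random.Random(0)
sequence = generate_sequence(rng.gauss(0.0, 1.0), total_days=30, params=random_params(rng))
```

With untrained weights the output is not meaningful DPI data; the sketch only shows how the sequence is extended one increment at a time until it matches the 30-day sample period.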
In step S310, the discriminator compares the feature data of the sample DPI data with the random feature data and makes a judgment to obtain a discrimination result, and determines whether the discrimination result is within a predetermined range.
For example, the discriminator may compare the feature data of the sample DPI data with the random feature data, and may perform the discrimination using a known discrimination method, thereby obtaining a discrimination result, and determine whether the discrimination result is within a predetermined range. If so, the process advances to step S314; otherwise the process proceeds to step S312.
In some embodiments, the predetermined range may be 0.45 to 0.55.
In step S312, when the determination result is not within the predetermined range, the discriminator determines that the current DPI data completion model unit does not reach the optimum state, and returns the determination result to the generator. This may cause the generator to generate the next random feature data (e.g., random feature data may be generated based on other random values). The generator inputs the next random feature data into the discriminator; the discriminator continues to compare the characteristic data of the sample DPI data with the next random characteristic data and makes a decision to obtain a next decision until the decision is within a predetermined range.
In step S314, when the determination result is within the predetermined range, the discriminator determines that the current DPI data completion model unit reaches the optimum state.
Here, the optimal state means that the DPI data completion model unit can be used to perform a completion operation on missing DPI data, and the completed DPI data is very close to the missing real DPI data (i.e. the difference is within an acceptable range).
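The control flow of steps S306 to S314 can be sketched as a loop that stops once the discrimination result falls within the predetermined range. The function `train_until_balanced` and its callable parameters are hypothetical; the toy generator, discriminator, and update rule below are simple arithmetic stand-ins chosen so the loop terminates, not neural networks, and in a real GAN the discriminator is trained as well.

```python
def train_until_balanced(generate, discriminate, update_generator, sample_z,
                         low=0.45, high=0.55, max_rounds=1000):
    """Repeat generation and discrimination until the result lies in [low, high]."""
    for round_index in range(1, max_rounds + 1):
        fake_features = generate(sample_z())   # S306/S308: random value -> random feature data
        score = discriminate(fake_features)    # S310: discrimination result D(G(z))
        if low <= score <= high:               # S314: optimal state reached
            return round_index, score
        update_generator(score)                # S312: return the result to the generator
    raise RuntimeError("discrimination result never entered the predetermined range")

# Toy stand-ins (assumptions for illustration only).
state = {"mu": 0.0}   # the generator's single parameter
REAL_MEAN = 5.0       # stands in for the feature data of the sample DPI data

def toy_generate(z):
    return state["mu"] + z

def toy_discriminate(x):
    # Approaches 0.5 as generated data becomes indistinguishable from real data.
    return 0.5 * (1.0 - min(1.0, abs(x - REAL_MEAN) / REAL_MEAN))

def toy_update(score):
    # Move the generator further the farther the score is from 0.5.
    state["mu"] += 5.0 * (0.5 - score)

rounds, final_score = train_until_balanced(
    toy_generate, toy_discriminate, toy_update, sample_z=lambda: 0.0)
```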
Thus, a method of training a DPI data completion model unit is provided according to some embodiments of the present disclosure. The method comprises the following steps: preprocessing sample DPI data, and sequentially inputting the preprocessed sample DPI data into a convolutional layer, a rectified linear unit (ReLU) layer, a pooling layer, and a fully connected layer for processing to obtain feature data of the sample DPI data; inputting the feature data of the sample DPI data into a discriminator of the GAN; inputting a random value into a generator of the GAN; the generator performing a computation on the random value to generate random feature data and inputting the random feature data into the discriminator; the discriminator comparing the feature data of the sample DPI data with the random feature data and making a judgment to obtain a discrimination result; when the discrimination result is not within the predetermined range, the discriminator determining that the current DPI data completion model unit has not reached the optimal state and returning the discrimination result to the generator, so that the generator generates the next random feature data; and when the discrimination result is within the predetermined range, the discriminator determining that the current DPI data completion model unit has reached the optimal state.
For example, the discriminator D compares the feature data of the processed sample DPI data with the random feature data generated by the generator G. When the discrimination result D(G(z)) is approximately 0.5, the model has reached the optimal state, that is, the data generated by the generator differs little from the real data. The generator can therefore generate the missing DPI data of a given time period to complete the data, which makes the data robust to fluctuation.
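The value 0.5 is consistent with the standard GAN formulation. The disclosure does not state its loss function, so the following assumes the usual minimax objective, with $p_{\text{data}}$ the distribution of real feature data, $p_g$ the distribution of generated feature data, and $p_z$ the distribution of the random value $z$:

```latex
\min_G \max_D \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\left[\log\bigl(1 - D(G(z))\bigr)\right]

% For a fixed generator, the optimal discriminator is
D^{*}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}

% When the generator matches the real distribution, p_g = p_{\text{data}}, so
D^{*}(x) = \frac{1}{2} \quad \text{for all } x
```

Under this assumption, a discrimination result within 0.45 to 0.55 indicates that the discriminator can barely distinguish generated feature data from real feature data.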
Through the training of the DPI data completion model unit, the DPI data completion model unit can implement completion operation on missing DPI data.
The DPI data completion model unit differs from some existing algorithm models, such as the K-means clustering algorithm. The model of the embodiments of the present disclosure is closer to the actual application scenario: it adds intelligent extraction of hidden features, captures long-range temporal dependencies in the sequence, and generates the missing data adversarially. It is applied to a multi-path streaming cleaning platform for big data, serves as the core algorithm model of that platform, and provides the platform system with a data anti-fluctuation function.
In some embodiments, the DPI data completion model unit may continuously train with sample DPI data of a sample time period (for example, 30 days) before the current day, so as to keep the calculation result of the DPI data completion model unit as close as possible to the real data.
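Continuous training on the most recent sample period can be sketched as a daily job that selects the 30 days before the current day. The names `rolling_sample_window` and `retrain_daily`, the `train` callable, and the choice to skip retraining when the window is incomplete are all assumptions made for this sketch.

```python
from datetime import date, timedelta

def rolling_sample_window(current_day, sample_days=30):
    """Days of the sample time period that immediately precedes current_day."""
    return [current_day - timedelta(days=k) for k in range(sample_days, 0, -1)]

def retrain_daily(series, current_day, train, sample_days=30):
    """Retrain on the latest window; skip if any day in the window is missing.

    series: dict mapping date -> DPI sample (hypothetical format).
    train:  hypothetical callable that fits the completion model on the samples.
    """
    window = rolling_sample_window(current_day, sample_days)
    samples = [series[d] for d in window if d in series]
    if len(samples) < sample_days:
        return None  # incomplete window: keep the previously trained model
    return train(samples)

# Example: 1-30 April 2020 are available; retrain on 1 May 2020.
series = {date(2020, 4, d): float(d) for d in range(1, 31)}
result = retrain_daily(series, date(2020, 5, 1), train=len)
```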
Figure 4 is a block diagram illustrating a processing system for DPI data according to some embodiments of the present disclosure. As shown in fig. 4, the processing system may include an acquisition unit 410 and a DPI data completion model unit 420.
The acquisition unit 410 is configured to detect a first time period in which DPI data is missing, acquire DPI data of a second time period adjacent to the first time period, and input the DPI data of the second time period to the DPI data completion model unit 420.
The DPI data completion model unit 420 is configured to generate missing DPI data for the first time period based on the DPI data for the second time period.
Thus, a processing system for DPI data is provided according to some embodiments of the present disclosure. In the processing system, an acquisition unit is used for detecting a first time period of missing DPI data, acquiring DPI data of a second time period adjacent to the first time period, and inputting the DPI data of the second time period into a DPI data completion model unit; the DPI data completion model unit is used for generating missing DPI data of the first time period based on the DPI data of the second time period. The processing system realizes the completion of missing DPI data and reduces the influence of data missing when a user uses the data.
In some embodiments, the acquisition unit 410 may be further configured to acquire sample DPI data of the sample time period and input the sample DPI data to the DPI data completion model unit 420. The DPI data completion model unit 420 may also be trained based on the sample DPI data.
In some embodiments, as shown in fig. 4, the DPI data completion model unit 420 may include a data processing module 421 and a GAN 422.
The data processing module 421 is configured to preprocess the sample DPI data, sequentially input the preprocessed sample DPI data into the convolutional layer, the rectified linear unit (ReLU) layer, the pooling layer, and the fully connected layer for processing, so as to obtain feature data of the sample DPI data, and input the feature data of the sample DPI data into the discriminator 4222 of the GAN 422. For example, the preprocessing may include at least one of: missing-value removal, dimensionality reduction, normalization, and vector encoding.
The GAN 422 may include a generator 4221 and a discriminator 4222.
The generator 4221 is configured to receive the random value, calculate the random value to generate random feature data, and input the random feature data into the discriminator 4222.
The discriminator 4222 is used for comparing and judging the characteristic data of the DPI data of the sample with the random characteristic data to obtain a judgment result; when the determination result is not within the predetermined range, determining that the current DPI data completion model unit 420 does not reach the optimal state, and returning the determination result to the generator 4221, so that the generator 4221 generates the next random feature data; when the determination result is within the predetermined range, it is determined that the current DPI data completion model unit 420 reaches the optimal state.
In some embodiments, the predetermined range may be 0.45 to 0.55.
In some embodiments, the generator 4221 may be configured to generate a data sequence for an initial time period based on the random value and, using a preset time period as the increment, extend the data sequence step by step until its time span equals the length of the sample time period; the resulting data sequence is the random feature data, and the time information of the random feature data is captured by using a forget gate.
Figure 5 is a block diagram illustrating a processing system for DPI data in accordance with further embodiments of the present disclosure. The processing system includes a memory 510 and a processor 520. Wherein:
the memory 510 may be a magnetic disk, flash memory, or any other non-volatile storage medium. The memory is used for storing instructions in the embodiments corresponding to fig. 1 and/or fig. 3.
Processor 520 is coupled to memory 510 and may be implemented as one or more integrated circuits, such as a microprocessor or microcontroller. The processor 520 is configured to execute instructions stored in the memory, so as to complete missing DPI data and reduce the influence of data missing when a user uses data.
In some embodiments, as also shown in fig. 6, the processing system 600 includes a memory 610 and a processor 620. Processor 620 is coupled to memory 610 through a BUS 630. The processing system 600 may also be coupled to an external storage device 650 via a storage interface 640 for facilitating retrieval of external data, and may also be coupled to a network or another computer system (not shown) via a network interface 660, which will not be described in detail herein.
In this embodiment, instructions are stored in the memory and executed by the processor, so that missing DPI data is completed and the impact of missing data on users of the data is reduced.
In other embodiments, the present disclosure also provides a computer-readable storage medium on which computer program instructions are stored, the instructions implementing the steps of the method in the embodiment corresponding to fig. 1 and/or fig. 3 when executed by a processor. As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, apparatus, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Thus far, the present disclosure has been described in detail. Some details that are well known in the art have not been described in order to avoid obscuring the concepts of the present disclosure. It will be fully apparent to those skilled in the art from the foregoing description how to practice the presently disclosed embodiments.
The method and system of the present disclosure may be implemented in a number of ways. For example, the methods and systems of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
Although some specific embodiments of the present disclosure have been described in detail by way of example, it should be understood by those skilled in the art that the foregoing examples are for purposes of illustration only and are not intended to limit the scope of the present disclosure. It will be appreciated by those skilled in the art that modifications may be made to the above embodiments without departing from the scope and spirit of the present disclosure. The scope of the present disclosure is defined by the appended claims.

Claims (14)

1. A processing method for deep packet inspection (DPI) data, comprising:
detecting a first time period in which DPI data is missing;
acquiring DPI data of a second time period adjacent to the first time period;
inputting the DPI data of the second time period into a DPI data completion model unit; and
generating, by the DPI data completion model unit, the missing DPI data of the first time period based on the DPI data of the second time period.
2. The processing method of claim 1, wherein before detecting the first time period of missing DPI data, the processing method further comprises:
acquiring sample DPI data of a sample time period; and
inputting the sample DPI data into the DPI data completion model unit to train the DPI data completion model unit.
3. The processing method of claim 2, wherein the step of training the DPI data completion model unit comprises:
preprocessing the sample DPI data, and sequentially inputting the preprocessed sample DPI data into a convolutional layer, a rectified linear unit (ReLU) layer, a pooling layer and a fully connected layer for processing to obtain feature data of the sample DPI data;
inputting the feature data of the sample DPI data into a discriminator of a generative adversarial network (GAN);
inputting a random value into a generator of the GAN;
performing, by the generator, a calculation on the random value to generate random feature data, and inputting the random feature data into the discriminator;
comparing, by the discriminator, the feature data of the sample DPI data with the random feature data to obtain a determination result;
when the determination result is not within a predetermined range, determining, by the discriminator, that the current DPI data completion model unit has not reached an optimal state, and returning the determination result to the generator, so that the generator generates the next random feature data; and
when the determination result is within the predetermined range, determining, by the discriminator, that the current DPI data completion model unit has reached the optimal state.
4. The processing method according to claim 3,
the predetermined range is 0.45 to 0.55.
5. The processing method according to claim 3, wherein the step of generating the random feature data by the generator comprises:
generating, by the generator, a data sequence of an initial time period based on the random value; using a preset time period as the increment, extending the data sequence step by step until its time period equals the length of the sample time period, the resulting data sequence being the random feature data; and acquiring time information of the random feature data by using a forget gate.
6. The processing method according to claim 3,
the preprocessing comprises at least one of: missing-value removal processing, dimension reduction processing, normalization processing, and vector encoding processing.
7. A processing system for DPI data, comprising:
an acquisition unit, configured to detect a first time period in which DPI data is missing, acquire DPI data of a second time period adjacent to the first time period, and input the DPI data of the second time period into a DPI data completion model unit; and
the DPI data completion model unit, configured to generate the missing DPI data of the first time period based on the DPI data of the second time period.
8. The processing system of claim 7,
the acquisition unit is further used for acquiring sample DPI data of a sample time period and inputting the sample DPI data into the DPI data completion model unit;
the DPI data completion model unit is further used for training based on the sample DPI data.
9. The processing system of claim 8, wherein the DPI data completion model unit comprises:
a data processing module, configured to preprocess the sample DPI data, sequentially input the preprocessed sample DPI data into a convolutional layer, a rectified linear unit (ReLU) layer, a pooling layer and a fully connected layer for processing to obtain feature data of the sample DPI data, and input the feature data of the sample DPI data into a discriminator of a generative adversarial network (GAN); and
the GAN, comprising a generator and the discriminator, wherein:
the generator is configured to receive a random value, perform a calculation on the random value to generate random feature data, and input the random feature data into the discriminator;
the discriminator is configured to compare the feature data of the sample DPI data with the random feature data to obtain a determination result; when the determination result is not within a predetermined range, determine that the current DPI data completion model unit has not reached an optimal state, and return the determination result to the generator so that the generator generates the next random feature data; and when the determination result is within the predetermined range, determine that the current DPI data completion model unit has reached the optimal state.
10. The processing system of claim 9,
the predetermined range is 0.45 to 0.55.
11. The processing system of claim 9,
the generator is configured to generate a data sequence of an initial time period based on the random value; using a preset time period as the increment, extend the data sequence step by step until its time period equals the length of the sample time period, the resulting data sequence being the random feature data; and acquire time information of the random feature data by using a forget gate.
12. The processing system of claim 9,
the preprocessing comprises at least one of: missing-value removal processing, dimension reduction processing, normalization processing, and vector encoding processing.
13. A processing system for DPI data, comprising:
a memory; and
a processor coupled to the memory, the processor configured to perform the method of any of claims 1-6 based on instructions stored in the memory.
14. A computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 6.
CN201911305426.7A 2019-12-18 2019-12-18 Processing method and processing system for DPI data Pending CN113010500A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911305426.7A CN113010500A (en) 2019-12-18 2019-12-18 Processing method and processing system for DPI data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911305426.7A CN113010500A (en) 2019-12-18 2019-12-18 Processing method and processing system for DPI data

Publications (1)

Publication Number Publication Date
CN113010500A true CN113010500A (en) 2021-06-22

Family

ID=76381114

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911305426.7A Pending CN113010500A (en) 2019-12-18 2019-12-18 Processing method and processing system for DPI data

Country Status (1)

Country Link
CN (1) CN113010500A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106301825A (en) * 2015-05-18 2017-01-04 中兴通讯股份有限公司 Method and device for generating DPI rules
CN106971348A (en) * 2016-01-14 2017-07-21 阿里巴巴集团控股有限公司 Data prediction method and device based on time series
CN107133190A (en) * 2016-02-29 2017-09-05 阿里巴巴集团控股有限公司 Training method and training system for a machine learning system
CN107169520A (en) * 2017-05-19 2017-09-15 济南浪潮高新科技投资发展有限公司 Method for completing missing attributes in big data
WO2017215565A1 (en) * 2016-06-12 2017-12-21 中兴通讯股份有限公司 Method and device for transmitting DPI policy
CN109063433A (en) * 2018-07-09 2018-12-21 中国联合网络通信集团有限公司 Method, device and readable storage medium for identifying fake users
CN109165664A (en) * 2018-07-04 2019-01-08 华南理工大学 Completion and prediction method for attribute-missing datasets based on generative adversarial network
CN109815223A (en) * 2019-01-21 2019-05-28 北京科技大学 Completion method and completion device for missing industrial monitoring data
WO2019100724A1 (en) * 2017-11-24 2019-05-31 华为技术有限公司 Method and device for training multi-label classification model
CN110288537A (en) * 2019-05-20 2019-09-27 湖南大学 Face image completion method based on self-attention deep generative adversarial network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
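冀俭俭; 杨刚: "Hierarchical joint image completion method based on generative adversarial networks", Journal of Graphics (《图学学报》), no. 6, 15 December 2019 (2019-12-15), pages 29 - 37 *
王力 et al.: "Road network traffic flow data completion method based on generative adversarial networks", Journal of Transportation Systems Engineering and Information Technology (《交通运输系统工程与信息》), vol. 18, no. 6, 15 December 2018 (2018-12-15), pages 63 - 71 *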

Similar Documents

Publication Publication Date Title
CN109271958B (en) Face age identification method and device
CN113221687B (en) Training method of pressing plate state recognition model and pressing plate state recognition method
CN111899759B (en) Method, device, equipment and medium for pre-training and model training of audio data
CN110991321B (en) Video pedestrian re-identification method based on tag correction and weighting feature fusion
CN103744974B (en) Method and device for selecting local interest points
US11645328B2 (en) 3D-aware image search
CN114693942A (en) Multimode fault understanding and auxiliary labeling method for intelligent operation and maintenance of instruments and meters
CN112597831A (en) Signal abnormity detection method based on variational self-encoder and countermeasure network
CN117557872B (en) Unsupervised anomaly detection method and device for optimizing storage mode
CN114691868A (en) Text clustering method and device and electronic equipment
CN111010595B (en) New program recommendation method and device
CN113313065A (en) Video processing method and device, electronic equipment and readable storage medium
CN111353526A (en) Image matching method and device and related equipment
CN112906883A (en) Hybrid precision quantization strategy determination method and system for deep neural network
CN113010500A (en) Processing method and processing system for DPI data
CN112465012A (en) Machine learning modeling method and device, electronic equipment and readable storage medium
CN110071845B (en) Method and device for classifying unknown applications
CN116468947A (en) Cutter image recognition method, cutter image recognition device, computer equipment and storage medium
CN112115991B (en) Mobile terminal change prediction method, device, equipment and readable storage medium
Zhong et al. Target aware network adaptation for efficient representation learning
CN113743593A (en) Neural network quantization method, system, storage medium and terminal
CN110210518B (en) Method and device for extracting dimension reduction features
CN112738098A (en) Anomaly detection method and device based on network behavior data
CN114626501A (en) Data processing method and device, electronic equipment and storage medium
CN113343938B (en) Image identification method, device, equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220126

Address after: 100007 room 205-32, floor 2, building 2, No. 1 and No. 3, qinglonghutong a, Dongcheng District, Beijing

Applicant after: Tianyiyun Technology Co.,Ltd.

Address before: No.31, Financial Street, Xicheng District, Beijing, 100033

Applicant before: CHINA TELECOM Corp.,Ltd.