CN112784910A - Deep filtering method and system for junk data - Google Patents

Info

Publication number: CN112784910A
Application number: CN202110122376.XA
Authority: CN (China)
Prior art keywords: data, filtering, text, cluster, clustering
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 蒙政先, 蔡楚才
Current Assignee: Wuhan Bochang Software Development Co ltd
Original Assignee: Wuhan Bochang Software Development Co ltd
Priority date: 2021-01-28
Filing date: 2021-01-28
Publication date: 2021-05-11

Classifications

    • G06F18/23213 Pattern recognition; Clustering techniques; Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions, with fixed number of clusters, e.g. K-means clustering
    • G06F18/214 Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/22 Pattern recognition; Matching criteria, e.g. proximity measures
    • G06N3/006 Computing arrangements based on biological models; Artificial life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G06N3/045 Neural networks; Architecture, e.g. interconnection topology; Combinations of networks
    • G06N3/08 Neural networks; Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a system for deep filtering of junk data. The method comprises: acquiring network data and performing preliminary five-tuple filtering on it; vectorizing the text of the preliminarily filtered data, clustering the vectorized text with an improved k-means clustering algorithm, determining the data source, and performing secondary filtering based on the data source; and performing deep content filtering based on a convolutional neural network. The invention realizes multi-level deep filtering of illegal and junk data, ensures data security, and improves filtering precision.

Description

Deep filtering method and system for junk data
Technical Field
The invention relates to the technical field of data filtering, and in particular to a deep filtering method and system for junk data.
Background
With the development of information industries such as computing and the Internet of Things, massive data streams flow across networks continuously. With the rise of big data, processing, monitoring, and filtering this data by means of big data analysis can improve related business services efficiently and accurately.
Existing data filtering is mostly protocol-based: it matches on one or more fields of the five-tuple, i.e. the source IP address, source port, destination IP address, destination port, and transport layer protocol. With blacklist filtering, data that matches a record in the blacklist is rejected and/or discarded, and all other data is allowed to pass. With whitelist filtering, data that matches a record in the whitelist is allowed to pass, and all other data is rejected or discarded. These modes protect data security to a certain extent, but they have only a limited filtering effect on the junk data in massive network data.
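The blacklist and whitelist semantics described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the `FiveTuple` and `Rule` types, the `allow` function, and the convention that an unset rule field matches any value are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class FiveTuple:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


@dataclass(frozen=True)
class Rule:
    # Assumption of this sketch: None means "match any value" for that field.
    src_ip: Optional[str] = None
    src_port: Optional[int] = None
    dst_ip: Optional[str] = None
    dst_port: Optional[int] = None
    protocol: Optional[str] = None

    def matches(self, pkt: FiveTuple) -> bool:
        pairs = (
            (self.src_ip, pkt.src_ip),
            (self.src_port, pkt.src_port),
            (self.dst_ip, pkt.dst_ip),
            (self.dst_port, pkt.dst_port),
            (self.protocol, pkt.protocol),
        )
        return all(want is None or want == got for want, got in pairs)


def allow(pkt: FiveTuple, rules: Iterable[Rule], mode: str) -> bool:
    """Return True when the packet may pass the filter."""
    hit = any(rule.matches(pkt) for rule in rules)
    if mode == "blacklist":
        return not hit  # a matching record means reject or discard
    if mode == "whitelist":
        return hit      # only matching records may pass
    raise ValueError(f"unknown mode: {mode}")


pkt = FiveTuple("10.0.0.5", 40000, "192.0.2.1", 25, "TCP")
rules = [Rule(dst_port=25, protocol="TCP")]
print(allow(pkt, rules, "blacklist"), allow(pkt, rules, "whitelist"))
```

The same rule list gives opposite decisions in the two modes, which is the only difference between blacklist and whitelist filtering in the description above.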
Disclosure of Invention
In view of this, the invention provides a deep filtering method and system for junk data, which solve the problem that existing data filtering methods cannot effectively filter illegal data.
In a first aspect of the present invention, a method for deep filtering of junk data is disclosed, the method comprising:
acquiring network data, and performing preliminary five-tuple filtering on the network data;
vectorizing the text of the preliminarily filtered data, clustering the vectorized text with an improved k-means clustering algorithm, determining a data source, and performing secondary filtering based on the data source;
and performing deep content filtering based on a convolutional neural network.
Preferably, the preliminary five-tuple filtering filters on the source IP address, destination IP address, source port number, destination port number, and transport protocol type.
Preferably, vectorizing the text of the preliminarily filtered data, clustering the vectorized text, and determining the data source specifically includes:
restoring the preliminarily filtered network data packets according to their protocols to obtain binary files;
extracting features from the binary files based on a bag-of-words model to obtain text feature vectors, completing the text vectorization;
obtaining a standard text vector set, dividing it into a plurality of clusters with the improved k-means clustering algorithm, determining the center point of each cluster, computing the cluster to which a vectorized text belongs, and determining the data source within that cluster by similarity calculation.
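The bag-of-words vectorization step can be sketched as follows. The sketch assumes the restored binary file has already been decoded to text; the tokenizer, the function names, and the toy corpus are hypothetical and only illustrate how a text becomes a count vector over a fixed vocabulary.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(documents):
    """Assign each distinct token an index in order of first appearance."""
    vocab = {}
    for doc in documents:
        for token in tokenize(doc):
            vocab.setdefault(token, len(vocab))
    return vocab


def vectorize(text, vocab):
    """Count how often each vocabulary token occurs; unknown tokens are ignored."""
    counts = Counter(tokenize(text))
    vec = [0] * len(vocab)
    for token, n in counts.items():
        if token in vocab:
            vec[vocab[token]] = n
    return vec


corpus = ["win a free prize now", "meeting notes for the project"]
vocab = build_vocabulary(corpus)
print(vectorize("free prize prize", vocab))  # [0, 0, 1, 2, 0, 0, 0, 0, 0, 0]
```

Every text is mapped to a vector of the same length, which is what allows the later Euclidean and cosine comparisons against the standard text vector set.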
Preferably, dividing the standard text vector set into a plurality of clusters with the improved k-means clustering algorithm specifically includes:
setting a population size and a boundary for a particle swarm algorithm; initializing the numbers of clusters within the boundary range; performing k-means clustering for each candidate number of clusters; calculating the fitness of each clustering result; determining the optimal position from the fitness and updating the individual positions; and iterating until the optimal position is obtained, which is taken as the number of clusters K. The optimization target of the particle swarm algorithm is to minimize the sum of the mean intra-cluster distances of the clusters.
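A particle swarm search over the number of clusters can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented procedure: the inertia and acceleration constants (0.7, 1.5), the plain k-means routine, and the fitness (sum of mean intra-cluster distances divided by the smallest distance between centers, so that a larger K is not trivially favoured) are all choices made for this example.

```python
import math
import random


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, centers):
    """Group points by their nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, rng, iters=10):
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = assign(points, centers)
        # An empty cluster keeps its previous center.
        centers = [
            [sum(col) / len(c) for col in zip(*c)] if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, assign(points, centers)


def fitness(centers, clusters):
    """Lower is better: compact clusters whose centers are far apart."""
    intra = sum(
        sum(dist(p, c) for p in pts) / len(pts)
        for c, pts in zip(centers, clusters) if pts
    )
    inter = min(dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:])
    return intra / inter if inter > 0 else float("inf")


def pso_select_k(points, lower, upper, n_particles=6, iters=10, seed=0):
    """Search [lower, upper] for a cluster count; requires 2 <= lower <= upper <= len(points)."""
    rng = random.Random(seed)

    def to_k(position):
        return min(upper, max(lower, round(position)))

    def score(position):
        return fitness(*kmeans(points, to_k(position), rng))

    pos = [rng.uniform(lower, upper) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest, pbest_fit = list(pos), [score(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = 0.7 * vel[i] + 1.5 * r1 * (pbest[i] - pos[i]) + 1.5 * r2 * (gbest - pos[i])
            pos[i] = min(upper, max(lower, pos[i] + vel[i]))
            f = score(pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i], f
    return to_k(gbest)


data_rng = random.Random(1)
blobs = [(0, 0), (10, 0), (0, 10)]
points = [(cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5))
          for cx, cy in blobs for _ in range(20)]
best_k = pso_select_k(points, lower=2, upper=6)
print(best_k)
```

Each particle's position is a real number that is rounded and clamped to the boundary range before clustering, which is one simple way to apply a continuous swarm update to an integer quantity such as K.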
Preferably, performing deep content filtering based on the convolutional neural network specifically includes:
using a conditional generative adversarial network to augment the standard text vector set, obtaining a training set;
training a convolutional neural network classification model on the training set, inputting the secondarily filtered data into the model, and performing deep content filtering according to the classification result.
In a second aspect of the present invention, a deep filtering system for junk data is disclosed, the system comprising:
a preliminary filtering module: acquiring network data, and performing preliminary five-tuple filtering on the network data;
a data source filtering module: vectorizing the text of the preliminarily filtered data, and clustering the vectorized text with an improved k-means clustering algorithm to determine a data source;
a content filtering module: performing deep content filtering based on the convolutional neural network.
Preferably, the data source filtering module specifically includes:
restoring the preliminarily filtered network data packets according to their protocols to obtain binary files;
a vectorization unit: extracting features from the binary files based on a bag-of-words model to obtain text feature vectors, completing the text vectorization;
a clustering unit: obtaining a standard text vector set, dividing it into a plurality of clusters with the improved k-means clustering algorithm, determining the center point of each cluster, and computing the cluster to which the vectorized text belongs;
a data source filtering unit: determining the data source within that cluster by similarity calculation, and performing secondary filtering.
Preferably, the content filtering module specifically includes:
a data enhancement unit: using a conditional generative adversarial network to augment the standard text vector set, obtaining a training set;
a depth filtering unit: training a convolutional neural network classification model on the training set, inputting the secondarily filtered data into the model, and performing deep content filtering according to the classification result.
Compared with the prior art, the invention has the following beneficial effects:
the invention performs preliminary five-tuple filtering on network data, traces the data source with an improved clustering algorithm, and performs secondary filtering based on the data source; it then augments the data set with a conditional generative adversarial network and performs deep content filtering based on a convolutional neural network model.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of the deep filtering method for junk data according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
As shown in Fig. 1, the present invention provides a deep filtering method for junk data, comprising:
S1, acquiring network data, and performing preliminary five-tuple filtering on the network data; the preliminary five-tuple filtering filters on the source IP address, destination IP address, source port number, destination port number, and transport protocol type.
S2, vectorizing the text of the preliminarily filtered data, clustering all vectorized texts with an improved k-means clustering algorithm, determining the data source, and performing secondary filtering based on the data source. This specifically comprises the following steps:
S21, restoring the preliminarily filtered network data packets according to their protocols to obtain binary files;
S22, extracting features from the binary files based on the bag-of-words model to obtain text feature vectors, completing the text vectorization;
S23, obtaining a standard text vector set, and dividing it into a plurality of clusters with a k-means clustering algorithm improved by a particle swarm algorithm. Specifically:
setting a population size N and a boundary range [L, U], and initializing the population positions within the boundary range to obtain N different initial numbers of clusters;
performing k-means clustering on the standard text vector set for each candidate number of clusters, calculating and ranking the fitness of the clustering results, obtaining the optimal position from the fitness ranking, and updating the individual positions;
iterating until a termination condition is reached and taking the optimal individual position as the optimal number of clusters K, where the optimization target of the particle swarm algorithm is to minimize the sum of the mean intra-cluster distances of all clusters and to maximize the inter-cluster distance;
performing k-means clustering on the standard text vector set with the optimal number of clusters K, dividing the set into K clusters, and determining the center point of each cluster.
The number of clusters for the k-means clustering algorithm usually has to be given in advance. Different numbers of clusters strongly affect the partitioning result, and the number is difficult to determine in advance for unknown or large data sets. Based on the optimal number of clusters, the optimal combination of cluster centers is calculated with the sooty tern optimization algorithm. Optimizing both the number of clusters and the cluster centers yields an accurate cluster division, which improves the accuracy of data tracing and reduces the secondary filtering errors caused by inaccurate cluster division.
S24, based on the optimal number of clusters, calculating the optimal combination of cluster centers with a k-means clustering algorithm improved by the sooty tern optimization algorithm, and determining the center point of each cluster;
Specifically, the population size, boundary, and number of iterations of the sooty tern optimization algorithm are set. The population dimension equals the number of clusters K, and the data in each dimension represents one cluster center position. The population is initialized within the boundary range; the migration operation and the attack operation are applied to the individuals in the population; positions are updated; the fitness values are calculated; and the global optimum is recorded. If the number of iterations has not been reached, the migration and attack operations, the position update, and the fitness calculation are repeated. The optimal position finally obtained by iteration is taken as the optimal combination of cluster centers, giving the center point of each cluster. The fitness function of the algorithm is the sum of the intra-cluster distances over all clusters, which is to be minimized.
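The encoding and fitness described for the cluster-center search can be sketched as follows. The sketch covers only how an individual is represented (one center per dimension, K centers in total) and how its fitness is scored; the migration and attack position updates are deliberately left out, and the function names and sample data are hypothetical.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def intra_cluster_distance_sum(individual, points):
    """Fitness of one individual: each point contributes the distance to its
    nearest candidate center, so the total is the sum of intra-cluster distances."""
    return sum(min(dist(p, center) for center in individual) for p in points)


def global_best(population, points):
    """Pick the individual (a list of K centers) with the smallest fitness."""
    return min(population, key=lambda ind: intra_cluster_distance_sum(ind, points))


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = [(0, 1), (10, 1)]   # K = 2 centers placed inside the two groups
bad = [(5, 0), (5, 2)]     # K = 2 centers placed between the groups
print(intra_cluster_distance_sum(good, points))  # 4.0
print(intra_cluster_distance_sum(bad, points))   # 20.0
print(global_best([bad, good], points) == good)  # True
```

Any population-based optimizer can be plugged in around this fitness; recording the global best after every update round is what the iteration loop in the description amounts to.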
S25, calculating the Euclidean distance between the vectorized text and the center point of each cluster, and selecting the cluster with the smallest Euclidean distance as the cluster to which the text belongs;
S26, determining the data source within that cluster by similarity calculation: calculating the cosine similarity between the vectorized text and each sample in the cluster, determining the data source according to the cosine similarity, and performing secondary filtering based on the data source.
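Steps S25 and S26 can be sketched as follows. The sketch assumes that every sample in the standard text vector set carries a source label and that secondary filtering drops data traced to a blocked source; the `trace_source` function, the labels, and the `BLOCKED_SOURCES` set are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def trace_source(vec, centers, clusters):
    """clusters[i] holds (source_label, vector) samples belonging to centers[i]."""
    # S25: the nearest center by Euclidean distance decides the cluster.
    idx = min(range(len(centers)), key=lambda i: euclidean(vec, centers[i]))
    # S26: inside that cluster, the most cosine-similar sample decides the source.
    source, sample = max(clusters[idx], key=lambda s: cosine_similarity(vec, s[1]))
    return idx, source, cosine_similarity(vec, sample)


centers = [(5, 1), (1, 5)]
clusters = [
    [("source-a", (6, 1)), ("source-b", (4, 2))],
    [("source-c", (1, 6))],
]
BLOCKED_SOURCES = {"source-b"}  # hypothetical secondary-filter list

idx, source, similarity = trace_source((5, 2), centers, clusters)
print(idx, source, source in BLOCKED_SOURCES)  # 0 source-b True
```

Restricting the cosine comparison to one cluster keeps the number of similarity computations proportional to the cluster size rather than to the whole standard set.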
S3, performing deep content filtering based on the convolutional neural network:
using a conditional generative adversarial network to augment the standard text vector set, obtaining a training set;
training a convolutional neural network classification model on the training set, inputting the secondarily filtered data into the model, and performing deep content filtering according to the classification result.
The invention performs preliminary five-tuple filtering on network data, traces the data source with an improved clustering algorithm, and performs secondary filtering based on the data source. It then augments the data set with a conditional generative adversarial network and performs deep content filtering based on a convolutional neural network model.
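The three filtering levels compose into a single pipeline in which a record must pass every stage. The sketch below shows only that composition; the stage functions are hypothetical stand-ins (a blocked-port check for the five-tuple filter, a blocked-source check for the secondary filter, and a precomputed `junk_score` threshold in place of the convolutional neural network classifier).

```python
from typing import Any, Callable, Dict, Iterable, List

Record = Dict[str, Any]
Stage = Callable[[Record], bool]  # returns True when the record may pass


def deep_filter(records: Iterable[Record], stages: List[Stage]) -> List[Record]:
    """Keep records that pass every stage; all() stops at the first failing
    stage, so cheaper stages listed first shield the more expensive ones."""
    return [r for r in records if all(stage(r) for stage in stages)]


BLOCKED_PORTS = {25}            # hypothetical five-tuple rule
BLOCKED_SOURCES = {"source-b"}  # hypothetical traced-source rule


def five_tuple_ok(r: Record) -> bool:
    return r["dst_port"] not in BLOCKED_PORTS


def source_ok(r: Record) -> bool:
    return r["source"] not in BLOCKED_SOURCES


def content_ok(r: Record) -> bool:
    # Stand-in for the classifier decision; 0.5 is an arbitrary threshold.
    return r["junk_score"] < 0.5


records = [
    {"dst_port": 25, "source": "source-a", "junk_score": 0.1},
    {"dst_port": 80, "source": "source-b", "junk_score": 0.1},
    {"dst_port": 80, "source": "source-a", "junk_score": 0.9},
    {"dst_port": 80, "source": "source-a", "junk_score": 0.1},
]
kept = deep_filter(records, [five_tuple_ok, source_ok, content_ok])
print(len(kept))  # 1
```

Ordering the stages from cheapest to most expensive mirrors the method's structure: header checks first, source tracing second, and content classification only for data that survives both.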
Corresponding to the method embodiment, the invention also provides a deep filtering system for junk data, comprising:
a preliminary filtering module, configured to acquire network data and perform preliminary five-tuple filtering on the network data;
a data source filtering module, configured to vectorize the text of the preliminarily filtered data and cluster the vectorized text with an improved k-means clustering algorithm to determine the data source; the data source filtering module specifically comprises:
restoring the preliminarily filtered network data packets according to their protocols to obtain binary files;
a vectorization unit: extracting features from the binary files based on a bag-of-words model to obtain text feature vectors, completing the text vectorization;
a clustering unit: obtaining a standard text vector set, dividing it into a plurality of clusters with the improved k-means clustering algorithm, determining the center point of each cluster, and computing the cluster to which the vectorized text belongs;
a data source filtering unit: determining the data source within that cluster by similarity calculation, and performing secondary filtering.
a content filtering module, configured to perform deep content filtering based on the convolutional neural network. The content filtering module specifically comprises:
a data enhancement unit: using a conditional generative adversarial network to augment the standard text vector set, obtaining a training set;
a depth filtering unit: training a convolutional neural network classification model on the training set, inputting the secondarily filtered data into the model, and performing deep content filtering according to the classification result.
The above describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims (8)

1. A method for deep filtering of junk data, the method comprising:
acquiring network data, and performing preliminary five-tuple filtering on the network data;
vectorizing the text of the preliminarily filtered data, clustering the vectorized text with an improved k-means clustering algorithm, determining a data source, and performing secondary filtering based on the data source;
and performing deep content filtering based on a convolutional neural network.
2. The method of claim 1, wherein the preliminary five-tuple filtering comprises filtering on a source IP address, a destination IP address, a source port number, a destination port number, and a transport protocol type.
3. The method according to claim 1, wherein vectorizing the text of the preliminarily filtered data, clustering the vectorized text, and determining the data source specifically includes:
restoring the preliminarily filtered network data packets according to their protocols to obtain binary files;
extracting features from the binary files based on a bag-of-words model to obtain text feature vectors, completing the text vectorization;
obtaining a standard text vector set, dividing it into a plurality of clusters with the improved k-means clustering algorithm, determining the center point of each cluster, computing the cluster to which a vectorized text belongs, and determining the data source within that cluster by similarity calculation.
4. The method for deep filtering of junk data according to claim 3, wherein dividing the standard text vector set into a plurality of clusters with the improved k-means clustering algorithm specifically comprises:
optimizing the number of clusters of the k-means clustering algorithm through a particle swarm algorithm: setting the population size and the boundary of the particle swarm algorithm; initializing the numbers of clusters within the boundary range; performing k-means clustering on the standard text vector set for each candidate number of clusters; calculating the fitness of each clustering result; setting a fitness threshold; determining an optimal position from the fitness and updating the individual positions; and iterating until an optimal position meeting the fitness threshold is obtained, which is taken as the number of clusters K; wherein the optimization target of the particle swarm algorithm is to minimize the sum of the mean intra-cluster distances of the clusters.
5. The method for deep filtering of junk data according to claim 4, wherein performing deep content filtering based on the convolutional neural network specifically comprises:
using a conditional generative adversarial network to augment the standard text vector set, obtaining a training set;
training a convolutional neural network classification model on the training set, inputting the secondarily filtered data into the model, and performing deep content filtering according to the classification result.
6. A deep filtering system for junk data, the system comprising:
a preliminary filtering module: acquiring network data, and performing preliminary five-tuple filtering on the network data;
a data source filtering module: vectorizing the text of the preliminarily filtered data, and clustering the vectorized text with an improved k-means clustering algorithm to determine a data source;
a content filtering module: performing deep content filtering based on a convolutional neural network.
7. The system of claim 6, wherein the data source filtering module specifically comprises:
restoring the preliminarily filtered network data packets according to their protocols to obtain binary files;
a vectorization unit: extracting features from the binary files based on a bag-of-words model to obtain text feature vectors, completing the text vectorization;
a clustering unit: obtaining a standard text vector set, dividing it into a plurality of clusters with the improved k-means clustering algorithm, determining the center point of each cluster, and computing the cluster to which the vectorized text belongs;
a data source filtering unit: determining the data source within that cluster by similarity calculation, and performing secondary filtering.
8. The deep filtering system for junk data according to claim 7, wherein the content filtering module specifically comprises:
a data enhancement unit: using a conditional generative adversarial network to augment the standard text vector set, obtaining a training set;
a depth filtering unit: training a convolutional neural network classification model on the training set, inputting the secondarily filtered data into the model, and performing deep content filtering according to the classification result.
CN202110122376.XA 2021-01-28 2021-01-28 Deep filtering method and system for junk data Pending CN112784910A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110122376.XA CN112784910A (en) 2021-01-28 2021-01-28 Deep filtering method and system for junk data

Publications (1)

Publication Number Publication Date
CN112784910A true CN112784910A (en) 2021-05-11

Family

ID=75759619

Country Status (1)

Country Link
CN (1) CN112784910A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663100A (en) * 2012-04-13 2012-09-12 西安电子科技大学 Two-stage hybrid particle swarm optimization clustering method
CN103838210A (en) * 2014-02-25 2014-06-04 北京理工大学 Emergency scene poisonous gas remote wireless monitoring system and method
CN105306296A (en) * 2015-10-21 2016-02-03 北京工业大学 Data filter processing method based on LTE (Long Term Evolution) signaling
CN106528705A (en) * 2016-10-26 2017-03-22 桂林电子科技大学 Repeated record detection method and system based on RBF neural network
CN108363810A (en) * 2018-03-09 2018-08-03 南京工业大学 A kind of file classification method and device
CN108595499A (en) * 2018-03-18 2018-09-28 西安财经学院 A kind of population cluster High dimensional data analysis method of clone's optimization
CN110399485A (en) * 2019-07-01 2019-11-01 上海交通大学 The data source tracing method and system of word-based vector sum machine learning
WO2019214133A1 (en) * 2018-05-08 2019-11-14 华南理工大学 Method for automatically categorizing large-scale customer complaint data
CN111260157A (en) * 2020-02-21 2020-06-09 天津开发区精诺瀚海数据科技有限公司 Smelting ingredient optimization method based on ecological niche optimization genetic algorithm
CN111368077A (en) * 2020-02-28 2020-07-03 大连大学 K-Means text classification method based on particle swarm location updating thought wolf optimization algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
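魏萌 (Wei Meng): "Research on lung nodule segmentation algorithms based on deep learning", China Excellent Doctoral and Master's Theses Full-text Database (Master's), Medicine and Health Sciences *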

Legal Events

Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210511)