CN108874850B - Network video service feature selection method based on PSOGSA-CI - Google Patents

Publication number
CN108874850B
Authority
CN
China
Prior art keywords
particle
psogsa
video service
data
feature selection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810151475.9A
Other languages
Chinese (zh)
Other versions
CN108874850A (en)
Inventor
董育宁
吴兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN201810151475.9A
Publication of CN108874850A
Application granted
Publication of CN108874850B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G06F18/2113 Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention belongs to the technical field of pattern recognition and video service classification, and in particular relates to a network video service feature selection method based on PSOGSA-CI, that is, the hybrid particle swarm optimization and gravitational search algorithm (PSOGSA) combined with a variance importance coefficient (CI). The method jointly tunes the parameters of the PSOGSA algorithm while performing feature selection: it ranks the features by the variance importance coefficient, uses the top-ranked features to guide the initialization of the particle population, and uses the variance importance coefficient, which has low complexity and strong feature discrimination, as the fitness function to select the optimal feature subset. This both strengthens the search capability of the algorithm and preserves its convergence. The method is applied to classifying seven types of video service flows: standard-definition, high-definition and ultra-definition Web video, instant video communication, live network video, P2P client video and HTTP download video. Experimental results show that it classifies better than existing methods and greatly reduces running time.

Description

Network video service feature selection method based on PSOGSA-CI
Technical Field
The invention belongs to the technical field of pattern recognition and video service classification, and particularly relates to a network video service feature selection method based on PSOGSA-CI.
Background
With the rapid development of Internet technology, networks have become increasingly important in daily life, to the point that people can hardly do without them. This raises the problems of how to manage network resources effectively and how to guarantee network quality of service. Among network services, network video dominates and continues to grow strongly. Classifying network video service flows can guide ISPs toward efficient network resource management and better service for users. The main flow classification methods at present are port-number-based methods, deep-packet-inspection-based methods, and machine learning methods based on statistical features. The first two are no longer applicable because of dynamic ports and encrypted data transmission. Current research therefore focuses on machine learning methods based on the statistical features of network flows; widely used algorithms include support vector machines (SVM), naive Bayes, artificial neural networks, K-nearest neighbors (K-NN) and K-Means clustering.
When a machine learning algorithm is used for feature selection, the amount of computation is often large, and for network video flows with high feature dimensionality the time cost is especially severe. Redundant or even irrelevant features among the many candidates can strongly affect both feature selection and classifier performance. Network video flows therefore need a feature selection preprocessing step, which reduces classification time and improves classification performance. In addition, feature selection can be made more efficient with a search algorithm, and heuristic search algorithms are a current research focus. Common examples are genetic algorithms, ant colony algorithms, particle swarm algorithms and artificial neural networks, which have gradually been applied to feature selection.
Network video flows have a large number of statistical features. Searching such a large feature set directly with an existing particle swarm search algorithm has very high time complexity and gives poor search results. Moreover, most search algorithms use classifier accuracy as the fitness function, which further increases the time complexity. The initialization of the particle population, the choice of fitness function and the choice of parameters all affect the search capability and convergence speed of the algorithm.
Disclosure of Invention
The invention provides a network video service feature selection method based on PSOGSA-CI. The method selects features for seven types of video flows: standard-definition, high-definition and ultra-definition Web video (from websites such as Youku and iQIYI), instant video communication (QQ video), live webcast video (CBox, SopCast and the like), P2P client video (Kankan) and HTTP download video, and then classifies them with an SVM classifier. Experimental results show that the method achieves higher classification accuracy than existing comparable methods and greatly reduces running time.
The technical scheme of the invention is a network video service flow characteristic selection method based on PSOGSA-CI, which specifically comprises the following steps:
(1) data acquisition and feature extraction: collecting data of various video service flows on the Internet by using packet capturing software, and then extracting statistical characteristics of the video flows;
(2) feature selection and analysis: analyzing the extracted statistical characteristics of the video stream, and selecting a characteristic subset capable of effectively distinguishing the video service stream by using the method;
(3) classification process: carrying out classification experiments on the network video service flows by using the selected feature subset and an SVM classifier, and verifying the effectiveness of the method.
Further, the data acquisition and feature extraction specifically include:
(1.1) in an open Internet environment, capturing the required multimedia service flow data with the network packet analysis software Wireshark, and then converting the raw data into a standard five-tuple text format, namely the arrival time, source IP address, destination IP address, protocol and packet size of each data packet;
(1.2) extracting basic statistical features from the standard five-tuple file of the raw multimedia service flow, the features comprising: uplink/downlink packet size, entropy of the uplink/downlink packet size, overall packet size, uplink/downlink packet inter-arrival time, downlink packet rate, downlink byte rate, and the ratio of uplink to downlink byte counts.
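As an illustration of step (1.2), the following minimal sketch derives several of the listed statistics from five-tuple records. The tuple layout, the function name `flow_features`, the feature names and the `client_ip` argument used to separate uplink from downlink are all assumptions made for this example; the text does not specify how direction is determined or how each statistic is aggregated (means are used here).

```python
from statistics import mean

def flow_features(packets, client_ip):
    """packets: list of (arrival_time, src_ip, dst_ip, protocol, size) tuples."""
    up = [p for p in packets if p[1] == client_ip]    # sent by the client (assumed uplink)
    down = [p for p in packets if p[2] == client_ip]  # received by the client (assumed downlink)
    duration = max(p[0] for p in packets) - min(p[0] for p in packets)

    def mean_size(pkts):
        return mean(p[4] for p in pkts) if pkts else 0.0

    def mean_gap(pkts):
        # Mean inter-arrival time between consecutive packets.
        times = sorted(p[0] for p in pkts)
        gaps = [b - a for a, b in zip(times, times[1:])]
        return mean(gaps) if gaps else 0.0

    up_bytes = sum(p[4] for p in up)
    down_bytes = sum(p[4] for p in down)
    return {
        "up_mean_size": mean_size(up),
        "down_mean_size": mean_size(down),
        "all_mean_size": mean_size(packets),
        "up_mean_gap": mean_gap(up),
        "down_mean_gap": mean_gap(down),
        "down_pkt_rate": len(down) / duration if duration > 0 else 0.0,
        "down_byte_rate": down_bytes / duration if duration > 0 else 0.0,
        "up_down_byte_ratio": up_bytes / down_bytes if down_bytes else 0.0,
    }

# Hypothetical four-packet flow, for illustration only.
packets = [
    (0.0, "10.0.0.2", "1.2.3.4", "TCP", 100),
    (0.5, "1.2.3.4", "10.0.0.2", "TCP", 1400),
    (1.0, "1.2.3.4", "10.0.0.2", "TCP", 1400),
    (2.0, "10.0.0.2", "1.2.3.4", "TCP", 100),
]
features = flow_features(packets, client_ip="10.0.0.2")
```

Each captured flow would yield one such feature vector, and the set of vectors with their service labels forms the input to the feature selection stage.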
The feature selection and analysis specifically includes:
(2.1) discretizing the statistical features of the multimedia service flow;
(2.2) ranking all features by the variance importance coefficient (CI);
(2.3) particle design: each particle consists of a bit string (whose length is the number of features D) together with the PSOGSA parameters $c_1$ and $c_2$, for a total length of D + 2. Each bit takes the value '0' or '1': '1' means the corresponding feature is selected and '0' means it is not. Because $c_1$ and $c_2$ are part of the particle, the PSOGSA algorithm jointly optimizes its own parameters while selecting features, which gives better search performance. In summary, the i-th particle is designed as $x_i = (x_{i1}, x_{i2}, \ldots, x_{iD}, c_{i1}, c_{i2})$, where $x_{id} \in \{0, 1\}$, $d = 1, 2, \ldots, D$, $c_{i1} \in [0, 1]$ and $c_{i2} \in [1, 2]$;
(2.4) particle population initialization: selecting the n top-ranked features from step (2.2), that is, setting the first n bits of the particle to 1 and the remaining D - n bits to 0, and setting $c_1$ to 0.75 and $c_2$ to 1.5;
(2.5) particle fitness calculation: the fitness function of a particle is the mean of the variance importance coefficient (CI), which has low complexity and strong feature discrimination. The CI of a feature is the ratio of its between-class variance coefficient to its within-class variance coefficient. Variance essentially reflects the distance between data points: the more the data differ, the larger the variance, and vice versa. CI therefore accounts for both the dispersion between classes and the dispersion within each class;
(2.6) particle update: updating the acceleration of each particle according to the fitness value from (2.5), and then, using the parameters $c_1$ and $c_2$ carried in the particle, updating the particle velocity, the particle position, the best position of each particle and the best position of the particle population;
(2.7) if the maximum number of iterations (iteration) is reached or CI stays unchanged over the iterations, outputting the optimal feature subset; otherwise, repeating steps (2.5) and (2.6).
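Steps (2.1), (2.2) and the CI used in (2.5) can be sketched as follows. The text defines CI only as a ratio of between-class to within-class variance, so the exact formula below (sample-weighted squared deviations of class means over summed squared deviations inside each class) is an assumption, as are the equal-width binning and the names `discretize`, `ci` and `rank_features`.

```python
from statistics import mean

def discretize(values, bins=10):
    """Equal-width binning (assumed scheme): map each value to a bin index."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]

def ci(values, labels):
    """Between-class variance divided by within-class variance for one feature."""
    overall = mean(values)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups.values())
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return between / within if within > 0 else float("inf")

def rank_features(columns, labels):
    """columns: dict of feature name -> values. Returns names sorted by CI, best first."""
    scores = {name: ci(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True), scores

# Hypothetical data: "good" separates the two classes, "noisy" does not.
labels = ["a", "a", "b", "b"]
columns = {"good": [1, 2, 9, 10], "noisy": [1, 9, 2, 10]}
ranking, scores = rank_features(columns, labels)
bins = discretize([0, 5, 10], bins=2)
```

A feature whose class means are far apart relative to the spread inside each class gets a high CI, which is why no classifier needs to be trained to evaluate a candidate subset.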
The classification process specifically includes:
(3.1) selecting the characteristics of the original multimedia service flow by adopting a characteristic selection method;
(3.2) classifying the selected features with an SVM classifier, using ten-fold cross-validation;
(3.3) compiling statistics on the overall classification performance.
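The ten-fold cross-validation in the classification process can be sketched as below. The split is stratified by class, which is an assumption since the text only says ten-fold; the classifier is passed in as a hypothetical `train_and_predict` callable, and a one-nearest-neighbour stand-in is used here instead of an SVM so that the example needs no external library.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    """Return k lists of sample indices with each class spread evenly across folds."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    position = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        for idx in indices:
            folds[position % k].append(idx)
            position += 1
    return folds

def cross_validate(samples, labels, train_and_predict, k=10):
    """Accuracy over k folds; train_and_predict(train_x, train_y, test_x) -> labels."""
    correct = 0
    for held_out in stratified_folds(labels, k):
        test_set = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in test_set]
        preds = train_and_predict([samples[i] for i in train_idx],
                                  [labels[i] for i in train_idx],
                                  [samples[i] for i in held_out])
        correct += sum(p == labels[i] for p, i in zip(preds, held_out))
    return correct / len(labels)

def nearest_neighbor(train_x, train_y, test_x):
    """Stand-in classifier (not an SVM): label of the closest training sample."""
    return [train_y[min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))]
            for x in test_x]

# Hypothetical one-feature data set with two well-separated classes.
samples = [float(i) for i in range(20)] + [float(i) for i in range(100, 120)]
labels = ["a"] * 20 + ["b"] * 20
folds = stratified_folds(labels)
accuracy = cross_validate(samples, labels, nearest_neighbor)
```

In practice the stand-in would be replaced by an SVM trained on the selected feature subset; the fold logic stays the same.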
The invention has the beneficial effects that:
1. the PSOGSA-CI-based network video service feature selection method provided by the invention jointly tunes the parameters of the PSOGSA algorithm while selecting the feature subset, and the search capability and results of the algorithm are clearly better than those of existing comparable methods;
2. the method first ranks the features by CI and uses the top-ranked features to guide the initialization of the particle population, and it uses CI, which has low complexity and strong feature discrimination, as the fitness function of the algorithm; its running time is far lower than that of comparable algorithms that use classifier accuracy as the fitness function, and its classification performance is better.
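The CI-guided initialization can be illustrated with a short sketch. It assumes the features have already been reordered by descending CI, so that the first n bits correspond to the top-ranked features, and it takes n = 0.3 × D from claim 3; the function names and the ten feature names are hypothetical. How the remaining particles of the population are varied is not stated in the text, so only a single particle is built here.

```python
def init_particle(num_features, top_fraction=0.3, c1=0.75, c2=1.5):
    """Build one particle: D selection bits followed by the parameters c1 and c2."""
    n = round(top_fraction * num_features)       # number of top-ranked features to select
    bits = [1] * n + [0] * (num_features - n)    # first n bits on, remaining D - n off
    return bits + [c1, c2]

def selected_features(particle, ranked_names):
    """Names of the features whose bit is 1, given names in CI rank order."""
    bits = particle[:len(ranked_names)]
    return [name for name, bit in zip(ranked_names, bits) if bit == 1]

# Hypothetical feature names, already sorted by descending CI.
ranked_names = ["f%d" % i for i in range(10)]
particle = init_particle(len(ranked_names))
chosen = selected_features(particle, ranked_names)
```

Starting the search from features that already discriminate well is what lets the swarm converge within a small number of iterations.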
Drawings
FIG. 1 is a flow chart of a PSOGSA-CI-based network video service feature selection method according to the present invention.
FIG. 2 compares the precision of the PSOGSA-CI method of the present invention and the GSAFODPSO-SVM algorithm on the seven video service classes.
FIG. 3 compares the F-measure of the PSOGSA-CI method of the present invention and the GSAFODPSO-SVM algorithm on the seven video service classes.
Detailed Description
For the purpose of enhancing the understanding of the present invention, the present invention will be described in further detail with reference to the accompanying drawings and examples, which are provided for the purpose of illustration only and are not intended to limit the scope of the present invention.
As shown in fig. 1, the present invention provides a network video service feature selection method based on PSOGSA-CI, which includes data set acquisition and feature extraction of video service flow, network video service feature selection based on PSOGSA-CI, output of experiment results using SVM classifier, etc., and includes the following steps:
Step 1: data acquisition and feature extraction, with the following specific steps:
(1) in an open Internet environment, capturing the required multimedia service flow data with the network packet analysis software Wireshark, and then converting the raw data into a standard five-tuple text format, namely the arrival time, source IP address, destination IP address, protocol and packet size of each data packet;
(2) extracting basic statistical features from the standard five-tuple file of the raw multimedia service flow, the features comprising: uplink/downlink packet size, entropy of the uplink/downlink packet size, overall packet size, uplink/downlink packet inter-arrival time, downlink packet rate, downlink byte rate, and the ratio of uplink to downlink byte counts.
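Among the features listed in step (2), the entropy of the packet size is the least self-explanatory. A minimal sketch, assuming it means the Shannon entropy (in bits) of the empirical distribution of packet sizes within one direction of a flow; the function name `size_entropy` and the sample sizes are hypothetical.

```python
import math
from collections import Counter

def size_entropy(sizes):
    """Shannon entropy, in bits, of the empirical packet-size distribution."""
    counts = Counter(sizes)
    total = len(sizes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Two equally likely sizes give exactly one bit; a single size gives zero.
mixed = size_entropy([100, 100, 1400, 1400])
constant = size_entropy([1400, 1400, 1400, 1400])
```

A flow that always sends full-size packets, such as a bulk download, has entropy near zero, while interactive video with varied packet sizes scores higher.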
Step 2: feature selection and analysis, with the following specific steps:
(1) discretizing the statistical features of the multimedia service flow;
(2) ranking all features by the variance importance coefficient (CI);
(3) particle design: each particle consists of a bit string (whose length is the number of features D) together with the PSOGSA parameters $c_1$ and $c_2$, for a total length of D + 2. Each bit takes the value '0' or '1': '1' means the corresponding feature is selected and '0' means it is not. Because $c_1$ and $c_2$ are part of the particle, the PSOGSA algorithm jointly optimizes its own parameters while selecting features, which gives better search performance. In summary, the i-th particle is designed as $x_i = (x_{i1}, x_{i2}, \ldots, x_{iD}, c_{i1}, c_{i2})$, where $x_{id} \in \{0, 1\}$, $d = 1, 2, \ldots, D$, $c_{i1} \in [0, 1]$ and $c_{i2} \in [1, 2]$;
(4) particle population initialization: selecting the n top-ranked features from step (2), that is, setting the first n bits of the particle to 1 and the remaining D - n bits to 0, and setting $c_1$ to 0.75 and $c_2$ to 1.5;
(5) particle fitness calculation: the fitness function of a particle is the mean of the variance importance coefficient (CI), which has low complexity and strong feature discrimination. The CI of a feature is the ratio of its between-class variance coefficient to its within-class variance coefficient. Variance essentially reflects the distance between data points: the more the data differ, the larger the variance, and vice versa. CI therefore accounts for both the dispersion between classes and the dispersion within each class;
(6) particle update: updating the acceleration of each particle according to the fitness value from (5), and then, using the parameters $c_1$ and $c_2$ carried in the particle, updating the particle velocity, the particle position, the best position of each particle and the best position of the particle population;
(7) if the maximum number of iterations (iteration) is reached or CI stays unchanged over the iterations, outputting the optimal feature subset; otherwise, repeating steps (5) and (6).
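One pass of steps (5) and (6) might look like the sketch below. The text gives no update equations, so this follows one common formulation of PSOGSA (fitness-derived masses, a decaying gravitational constant, and a velocity that combines the gravitational acceleration with attraction toward the population best). The sigmoid mapping from velocity to bit, the velocity clamp, the constants `g0`, `alpha` and `w`, and the toy CI scores are all assumptions; tracking of the per-particle and population best positions is left to the caller.

```python
import math
import random

def psogsa_step(swarm, velocities, fitness, gbest, t, max_iter,
                g0=1.0, alpha=20.0, w=0.5, rng=random):
    """One update of every particle; a particle is D bits followed by c1 and c2."""
    dim = len(swarm[0])
    num_bits = dim - 2
    fit = [fitness(p[:num_bits]) for p in swarm]
    best, worst = max(fit), min(fit)
    span = (best - worst) or 1.0
    raw = [(f - worst) / span for f in fit]       # higher fitness -> larger mass
    total = sum(raw) or 1.0
    mass = [m / total for m in raw]
    g = g0 * math.exp(-alpha * t / max_iter)      # gravitational constant decays over time

    new_swarm, new_vel = [], []
    for i, (x, v) in enumerate(zip(swarm, velocities)):
        c1, c2 = x[-2], x[-1]                     # parameters carried in the particle
        acc = [0.0] * dim
        for j, other in enumerate(swarm):
            if j == i:
                continue
            pull = rng.random() * g * mass[j] / (math.dist(x, other) + 1e-9)
            for d in range(dim):
                acc[d] += pull * (other[d] - x[d])
        v_next = [max(-6.0, min(6.0, w * v[d] + c1 * rng.random() * acc[d]
                                + c2 * rng.random() * (gbest[d] - x[d])))
                  for d in range(dim)]
        x_next = []
        for d in range(num_bits):
            prob = 1.0 / (1.0 + math.exp(-v_next[d]))   # sigmoid transfer (assumption)
            x_next.append(1 if rng.random() < prob else 0)
        x_next.append(min(1.0, max(0.0, c1 + v_next[-2])))  # keep c1 in [0, 1]
        x_next.append(min(2.0, max(1.0, c2 + v_next[-1])))  # keep c2 in [1, 2]
        new_swarm.append(x_next)
        new_vel.append(v_next)
    return new_swarm, new_vel

# Toy run: fitness is the mean of hypothetical CI scores over the selected bits.
ci_scores = [4.0, 3.0, 0.5, 0.1]
def mean_ci(bits):
    return sum(c * b for c, b in zip(ci_scores, bits)) / max(1, sum(bits))

rng = random.Random(0)
swarm = [[1, 1, 0, 0, 0.75, 1.5], [1, 0, 1, 0, 0.75, 1.5], [0, 1, 1, 1, 0.75, 1.5]]
velocities = [[0.0] * 6 for _ in swarm]
gbest = max(swarm, key=lambda p: mean_ci(p[:4]))
swarm, velocities = psogsa_step(swarm, velocities, mean_ci, gbest,
                                t=0, max_iter=20, rng=rng)
```

Because the fitness is a lookup over precomputed CI values rather than a classifier run, each iteration costs very little, which is consistent with the running-time advantage the text claims.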
Step 3: classification process, with the following specific steps:
(1) selecting features of the raw multimedia service flow with the feature selection method;
(2) classifying the selected features with an SVM classifier, using ten-fold cross-validation;
(3) compiling statistics on the overall classification performance.
As can be seen from FIGS. 2 and 3, the method of the present invention has a better classification effect, and the precision ratio and the F measure are better than those of the comparative method.
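The two metrics compared in FIGS. 2 and 3 can be computed per class as in the sketch below. It assumes the F-measure is the balanced F1 score, the harmonic mean of precision and recall; the text does not state the weighting, and the function name and labels are hypothetical.

```python
def precision_recall_f(true_labels, predicted_labels, target):
    """Precision, recall and F1 for one class, treating it as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == target and p == target for t, p in pairs)
    fp = sum(t != target and p == target for t, p in pairs)
    fn = sum(t == target and p != target for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

# Hypothetical predictions for two of the seven classes.
true_labels = ["hd", "hd", "sd", "sd"]
predicted = ["hd", "sd", "sd", "sd"]
precision, recall, f_measure = precision_recall_f(true_labels, predicted, "sd")
```

Repeating the call for each of the seven service classes gives the per-class values that such comparison figures plot.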
The foregoing illustrates and describes the principles, general features, and advantages of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (4)

1. The network video service feature selection method based on PSOGSA-CI is characterized by comprising the following steps:
(1) data acquisition and feature extraction: collecting data of various video service flows on the Internet by using packet capturing software, and then extracting statistical characteristics of the video flows;
(2) feature selection and analysis: analyzing the extracted statistical characteristics of the video stream, and selecting a characteristic subset capable of effectively distinguishing the video service stream;
(3) classification process: carrying out classification experiments on the network video service flows by using the selected feature subset and an SVM classifier, and verifying the effectiveness of the method;
wherein the feature selection and analysis specifically comprises:
(3.1) carrying out discretization operation on the statistical characteristics of the multimedia service flow;
(3.2) sorting all the features through the variance importance degree coefficient CI;
(3.3) particle design: each particle consists of a bit string together with the PSOGSA parameters $c_1$ and $c_2$; each bit takes the value '0' or '1', where '1' means the corresponding feature is selected and '0' means it is not; because $c_1$ and $c_2$ are part of the particle, the PSOGSA algorithm jointly optimizes its parameters while selecting features, so that better search performance is obtained; in summary, the i-th particle $x_i$ is designed as $x_i = (x_{i1}, x_{i2}, \ldots, x_{iD}, c_{i1}, c_{i2})$, where $x_{id} \in \{0, 1\}$, $d = 1, 2, \ldots, D$, $c_{i1} \in [0, 1]$ and $c_{i2} \in [1, 2]$;
(3.4) particle population initialization: selecting the n top-ranked features from step (3.2), that is, setting the first n bits of the particle to 1 and the remaining D - n bits to 0, and setting $c_1$ to 0.75 and $c_2$ to 1.5;
(3.5) particle fitness calculation: the particle fitness function is the mean of the variance importance coefficient CI, which has low complexity and strong feature discrimination; the variance importance coefficient CI of a feature is the ratio of the between-class variance coefficient to the within-class variance coefficient; variance essentially reflects the distance between data points, being larger when the data differ more and smaller otherwise; CI accounts for both the dispersion between classes and the dispersion within classes;
(3.6) particle update: updating the acceleration of each particle according to the fitness value from (3.5), and then, using the parameters $c_1$ and $c_2$ carried in the particle, updating the particle velocity, the particle position, the best position of each particle and the best position of the particle population;
(3.7) if the maximum number of iterations (iteration) is reached or CI stays unchanged over the iterations, outputting the optimal feature subset; otherwise, repeating steps (3.5) and (3.6).
2. The PSOGSA-CI based network video service feature selection method according to claim 1, wherein the data collection and preprocessing operation specifically comprises:
(2.1) capturing required multimedia service flow data through network packet analysis software WireShark in an open internet environment, and then converting the original data into a standard five-tuple text format, namely the arrival time of a data packet, a source IP address, a destination IP address, a protocol and the packet size of the data packet;
(2.2) extracting basic statistical characteristics from the standard quintuple file of the original multimedia service flow, wherein the characteristics comprise: uplink/downlink packet size, entropy of uplink/downlink packet size information, overall packet size, uplink/downlink packet arrival time interval, downlink data packet rate, downlink byte rate, and ratio of uplink and downlink byte number.
3. The PSOGSA-CI based network video service feature selection method according to claim 2, wherein the number n of top-ranked features in step (3.4) is n = 0.3 × D, and the maximum number of iterations (iteration) in step (3.7) is 20.
4. The PSOGSA-CI-based network video service feature selection method according to claim 1, wherein the classification process specifically comprises:
(5.1) selecting the characteristics of the original multimedia service flow by adopting the characteristic selection method provided by the invention;
(5.2) classifying the selected features with an SVM classifier, using ten-fold cross-validation;
(5.3) compiling statistics on the overall classification performance.
CN201810151475.9A 2018-02-14 2018-02-14 Network video service feature selection method based on PSOGSA-CI Active CN108874850B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810151475.9A CN108874850B (en) 2018-02-14 2018-02-14 Network video service feature selection method based on PSOGSA-CI

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810151475.9A CN108874850B (en) 2018-02-14 2018-02-14 Network video service feature selection method based on PSOGSA-CI

Publications (2)

Publication Number Publication Date
CN108874850A CN108874850A (en) 2018-11-23
CN108874850B true CN108874850B (en) 2022-02-22

Family

ID=64325996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810151475.9A Active CN108874850B (en) 2018-02-14 2018-02-14 Network video service feature selection method based on PSOGSA-CI

Country Status (1)

Country Link
CN (1) CN108874850B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104331893A (en) * 2014-11-14 2015-02-04 东南大学 Complex image multi-threshold segmentation method
CN105787512A (en) * 2016-02-29 2016-07-20 南京邮电大学 Network browsing and video classification method based on novel characteristic selection method
CN106897733A (en) * 2017-01-16 2017-06-27 南京邮电大学 Video stream characteristics selection and sorting technique based on particle swarm optimization algorithm

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
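An Efficient Feature Selection Method for Network Video Traffic Classification; Yuning Dong et al.; 2017 17th IEEE International Conference on Communication Technology; 2017-10-30; pp. 1608-1612 *
Video stream feature selection method based on particle swarm optimization algorithm; Feng Mao et al.; Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition); 2017-04-30; Vol. 37, No. 2; pp. 80-85 *
Feature analysis and classification methods for web browsing and network video services; Wang Kai et al.; Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition); 2016-12-31; Vol. 36, No. 6; pp. 81-89 *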

Also Published As

Publication number Publication date
CN108874850A (en) 2018-11-23

Similar Documents

Publication Publication Date Title
Erman et al. Traffic classification using clustering algorithms
CN108650194B (en) Network traffic classification method based on K _ means and KNN fusion algorithm
Este et al. Support vector machines for TCP traffic classification
Jin et al. A modular machine learning system for flow-level traffic classification in large networks
Zander et al. Self-learning IP traffic classification based on statistical flow characteristics
WO2018054342A1 (en) Method and system for classifying network data stream
Wang et al. Real-time load reduction in multimedia big data for mobile Internet
Wang et al. Real network traffic collection and deep learning for mobile app identification
CN110290022B (en) Unknown application layer protocol identification method based on adaptive clustering
Areström et al. Early online classification of encrypted traffic streams using multi-fractal features
CN113329023A (en) Encrypted flow malice detection model establishing and detecting method and system
CN111565156B (en) Method for identifying and classifying network traffic
Hajjar et al. Network traffic application identification based on message size analysis
Apiletti et al. SeLINA: A self-learning insightful network analyzer
CN109286576A (en) A kind of network agent encryption traffic characteristic extracting method of data packet frequency analysis
Yang et al. Smiler: Towards practical online traffic classification
Dixit et al. Internet traffic detection using naïve bayes and K-Nearest neighbors (KNN) algorithm
CN108494620B (en) Network service flow characteristic selection and classification method
Himura et al. Synoptic graphlet: Bridging the gap between supervised and unsupervised profiling of host-level network traffic
Min et al. Online Internet traffic identification algorithm based on multistage classifier
Oudah et al. A novel features set for internet traffic classification using burstiness
Takyi et al. Clustering techniques for traffic classification: A comprehensive review
CN108874850B (en) Network video service feature selection method based on PSOGSA-CI
CN108307231B (en) Network video stream feature selection and classification method based on genetic algorithm
Alizadeh et al. Timely classification and verification of network traffic using Gaussian mixture models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant