WO2022116811A1 - Method and device for predicting definition of video having encrypted traffic - Google Patents

Method and device for predicting definition of video having encrypted traffic

Info

Publication number
WO2022116811A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
definition
encrypted traffic
data block
model
Prior art date
Application number
PCT/CN2021/130890
Other languages
French (fr)
Chinese (zh)
Inventor
王赟
侯贺明
曾伟
Original Assignee
武汉绿色网络信息服务有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 武汉绿色网络信息服务有限责任公司
Publication of WO2022116811A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N21/47 End-user applications
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4782 Web browsing, e.g. WebTV
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/643 Communication protocols

Definitions

  • the invention belongs to the field of computer servers, and more particularly, relates to a method and device for predicting the definition of encrypted traffic video.
  • DPI Deep Packet Inspection
  • the present invention provides a method and device for predicting the definition of encrypted traffic video.
  • the model is trained with the feature set of the known data packet definition, and then the model is used to predict the definition of the encrypted video file of unknown definition, thus solving the technical problem that the DPI manufacturer cannot analyze the video definition when the video transmission is encrypted.
  • the present invention provides a method for predicting the clarity of encrypted traffic video, and the method for predicting the clarity of the video includes:
  • A model is established and trained using the correspondence between the labeled definition of known video files and the feature sets of their data blocks. After training is complete, features are extracted from the TCP stream data packets of the encrypted traffic video to be tested, and the definition of that video is predicted from the correspondence between feature sets and definitions learned by the model.
  • the present invention also includes the following additional technical features.
  • the method for collecting data of network video playback with HTTPS encrypted traffic includes:
  • each encoding method corresponds to a unique video definition number in the play log, and when the browser plays the video, the video encoding number and the corresponding definition in the encrypted traffic network video play log are simultaneously recorded and collected.
  • the data block is detected according to the ACK field of the TCP message, and specifically includes:
  • the message whose ACK value has changed is recorded as a new data block.
  • The features corresponding to the definition, and the averages of those features, are extracted for the data blocks from the TCP stream file of the video transmission, wherein:
  • the features comprise one or more of: data block size, number of data packets, first byte arrival time, data block download time, data block idle time, data block transmission time, and data transmission rate;
  • the feature averages comprise one or more of: average data block size, average number of data packets, average first byte arrival time, average data block download time, average data block idle time, average data block transmission time, and average data transmission rate.
  • The features of a data block and the feature averages over all data blocks of the current TCP stream file are combined into a feature set sample of known definition, and the model receives at least one feature set sample for training.
  • The definition indicated by the video encoding number in the playback log of the video of known definition is used to verify the definition predicted by the model. If the accuracy of the model's predictions is higher than the preset value, the model is considered successfully trained.
  • When the model predicts the definition of encrypted traffic video of unknown definition, the model trained on the Android platform is used in mobile communication networks, and the model trained on the PC platform is used in traditional fixed-network environments.
  • A random oversampling method is used to balance the definition categories in the sample set.
  • When filtering the target HTTPS-encrypted network video traffic out of the network traffic, the SNI field is compared with the strings in the domain name. If the SNI field fully matches the preset domain format and the preset string, the traffic is video traffic in HLS transmission mode and can serve as encrypted traffic video whose definition is to be predicted.
  • the present invention also provides a device for predicting the definition of encrypted traffic video, the device comprising:
  • DPI manufacturers can build a model and use the feature set in the video file of known definition to predict the definition of the encrypted traffic video file to be tested.
  • Fig. 1 shows the process of training the model in Embodiment 1 of the present invention;
  • Fig. 2 shows how the video encoding numbers in the play log are used to label the corresponding definitions in Embodiment 1 of the present invention;
  • Fig. 3 shows the process of detecting data blocks from the ACK field of TCP messages in Embodiment 1 of the present invention;
  • Fig. 4 shows how the SNI field is compared with the character strings in the domain name to filter out the encrypted traffic network video to be tested in Embodiment 1 of the present invention;
  • Fig. 5 is a schematic structural diagram of an apparatus for predicting the definition of encrypted traffic video according to an embodiment of the present invention.
  • A first feature being "on" or "under" a second feature may include the first and second features being in direct contact, or the first and second features not being in direct contact but being in contact through additional features between them.
  • The first feature being "above", "over" or "on top of" the second feature includes the first feature being directly above or obliquely above the second feature, or simply means that the first feature is at a higher level than the second feature.
  • The first feature being "below", "under" or "beneath" the second feature includes the first feature being directly below or obliquely below the second feature, or simply means that the first feature is at a lower level than the second feature.
  • the first embodiment provides a method for predicting the clarity of the encrypted traffic video.
  • the prediction method for video resolution includes the following steps:
  • Transmission Control Protocol (TCP) stream data packets and the playback log of HTTPS-encrypted network video playback are captured.
  • TCP Transmission Control Protocol
  • On the web side, that is, when Tencent Video is watched in a browser, HTTPS encrypted transmission is used. The web side also has a P2P (peer-to-peer) transmission mechanism; the P2P part is transmitted over the User Datagram Protocol (UDP), and this part of the traffic is not encrypted;
  • P2P Peer-to-Peer (peer-to-peer network)
  • UDP User Datagram Protocol
  • the PC side and the mobile phone side use HTTP and UDP transmission, both of which are transmitted in plain text and are not encrypted; UDP transmission accounts for more than 95%;
  • the first embodiment is to identify the clarity of the HTTPS encrypted traffic video.
  • Tencent Video uses different transmission modes depending on the client type and the video type. For example, long videos such as TV dramas and movies are transmitted in blocks using HTTP Live Streaming (HLS), a dynamic bit rate adaptation technology, or in blocks as MP4; user-uploaded videos and short videos are transmitted as whole MP4 files; advertising videos are transmitted in blocks as MP4.
  • HLS HTTP Live Streaming (dynamic bit rate adaptation technology)
  • The encoding methods and transmission modes of Tencent Video are also constantly evolving.
  • The most mainstream transmission mode is HLS; Embodiment 1 covers only the HLS transmission mode and does not describe the other transmission modes.
  • In step 102, the definition of the captured encrypted traffic network video is labeled according to the video encoding number in the play log.
  • Tencent Video divides videos into 4 definitions, namely 270p, 480p, 720p and 1080p.
  • Each encoding method has a unique number, and the number corresponds to a specific definition.
  • In step 103, data blocks are detected from the TCP stream data packets.
  • In step 1031, multiple video data blocks are downloaded within one TCP stream, following the keep-alive mechanism of the HTTP protocol. Statistics show that a TCP stream carries one or more data blocks, up to several dozen. Within the same TCP stream, all downloaded video data blocks have the same definition; to switch definition, the TCP connection is closed and a new TCP connection is used for the download.
  • In step 104, the features corresponding to the definition are extracted from the data blocks to form feature sets of known definition.
  • Feature extraction first extracts the relevant features of each data block, then averages each feature over all data blocks in the same TCP connection, and finally combines the per-block features from the first step with the stream-wide feature averages from the second step to form the feature set of a data block.
  • In step 105, a model is established and trained using the correspondence between the labeled definition of known video files and the data block feature sets. After training is complete, features are extracted from the TCP stream data packets of the encrypted traffic video to be tested, and the definition of that video is predicted from the correspondence between feature sets and definitions.
  • Tencent Video will encode a video into multiple files of different resolutions, and then perform segmentation and block processing for each file, and record the video block information in an index file.
  • In HLS transmission mode, the index file corresponding to the video definition is downloaded first.
  • The index file divides the video into segments of a certain duration, and each video segment has a unique URL.
  • The client downloads the video segments one by one according to the URLs in the index file.
  • When Tencent Video performs HLS transmission, the video is processed in two stages: segmentation and blocking.
  • The Tencent Video server first segments a complete video into data segments of about 5 minutes each, named 1.ts, 2.ts, 3.ts, and so on;
  • The Tencent Video server then splits each data segment into data blocks of about 10 seconds each, and the blocks are numbered starting from 0.
  • a typical video clip URL format is as follows:
  • the first field is the downloaded video clip file name 00_b0033m9le2c.321002.1.ts, and each field is explained as follows:
  • 00 represents the data block index number
  • b0033m9le2c represents the video ID
  • 321002 represents the video encoding label
  • index 0: data block index number
  • When Tencent Video is watched in a browser, the browser first downloads an index file and then downloads each video data block using the URL listed for it in the index file.
  • Tencent Video uses the HLS transmission mode, which essentially divides the video into many video data blocks that the client requests one by one.
  • TLS Transport Layer Security
  • Because the transmission is encrypted with TLS, the DPI manufacturer cannot obtain the content of the video file and cannot identify the video definition from that content. However, TLS encryption will not change the length of the data, so length-based features at the TCP layer are preserved.
  • Each video data block can therefore still be identified, and the size and related features of each video data block can be calculated. Since the video file is divided into blocks by duration, a model can be constructed to predict the definition of the video from the length of the data blocks and information related to that length.
  • A random forest classification algorithm is used to construct the model. First, feature sets of video data blocks of known definition are collected; the model is then trained and tested on these feature sets, and its predictions are compared with the known definitions. If the accuracy exceeds a preset value, such as 80% or 70%, the model is considered well trained. The trained model is then frozen and applied to definition prediction for the encrypted video traffic to be tested.
  • In tests with experimental data, the prediction accuracy of this data block model reaches at least 70%, which largely solves the DPI manufacturers' problem of being unable to determine the definition of encrypted traffic video files.
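
The acceptance check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names `accuracy` and `is_trained` are hypothetical, the default threshold of 0.7 simply mirrors the 70% figure in the text, and the random forest training itself is not shown.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of data blocks whose predicted definition equals the labeled one."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equally long")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


def is_trained(predicted: Sequence[str], actual: Sequence[str],
               threshold: float = 0.7) -> bool:
    """Accept the model only if its accuracy exceeds the preset value.

    The threshold default is an assumption taken from the 70% example.
    """
    return accuracy(predicted, actual) > threshold
```

For example, four correct predictions out of five give an accuracy of 0.8, which passes a 0.7 threshold but does not strictly exceed a 0.8 threshold.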
  • In the browser, videos are requested from the database that stores video information and played over HTTPS-encrypted traffic; at least two video files that differ in definition and content are selected. The more video files are played, the more samples are collected.
  • Each encoding method corresponds to a unique video definition number in the play log; when the browser plays a video, the video encoding number and the corresponding definition are recorded from the play log of the encrypted traffic network video.
  • Tencent Video divides videos into 4 definitions, namely 270p, 480p, 720p and 1080p, and at least one video file is required as a sample for each. Figure 2 lists the relationship between the definitions and the corresponding video encoding numbers, so the definition can be obtained by looking up the video encoding number in the play log.
  • The data blocks are detected according to the acknowledgment (ACK) field of the TCP messages. In the first embodiment, the video is transmitted in one direction, from the server to the client, and in blocks. While a data block is being transmitted, the client sends no messages to the server; only once a data block has been fully transmitted does the client send a request message to the server for the next data block.
  • While a data block is being transmitted, the ACK field of the TCP messages sent by the server to the client therefore remains unchanged. After the client sends an HTTP request, the ACK value increases, and the increase equals the length of the HTTP request message sent by the client.
  • the steps to detect data blocks are as follows:
  • Step 201: first determine whether a TCP stream is a Tencent Video HLS video stream by checking whether the SNI field in the Client Hello packet of the TLS handshake is one of Tencent Video's specific domain names, which include ltsbsy.qq.com, ltscsy.qq.com, ltssjy.qq.com, ltsws.qq.com and stsbsy.qq.com;
  • Step 202: parse the TLS messages, remove the TLS handshake messages, and extract the messages that carry transmitted data;
  • TLS performs the handshake first and then transmits data; parsing according to the TLS protocol specification identifies which packets are handshake packets and which carry data;
  • Step 203: determine which packets are uplink and which are downlink;
  • Step 204: process the ACK values of the downlink messages to detect whether the ACK value changes;
  • Step 204 can be implemented as the following steps 2041 and 2042.
  • Step 2041: messages with the same ACK value are recorded as one data block;
  • Step 2042: a message whose ACK value has changed starts a new data block.
  • Data packets with the same ACK value are recorded as data block 1; that is, data block 1 is a set of many individual data packets that all share the same ACK value. The data packets are processed in order of their timestamps, and the subsequent data blocks are recorded as data blocks 2, 3, 4, ..., N in turn. All video data blocks belonging to the same TCP stream have the same video definition, and when the model predicts definition, the data blocks it considers together all belong to the same TCP stream.
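
Steps 203 to 2042 can be sketched in a few lines of Python. This is a hedged illustration rather than the patented implementation: the `Packet` and `DataBlock` types and the `detect_data_blocks` function are hypothetical names, and the sketch assumes that the TLS handshake packets have already been removed (step 202) and that each packet's direction is already known.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Packet:
    timestamp: float   # capture time in seconds
    ack: int           # TCP acknowledgment number carried by the packet
    payload_len: int   # TCP payload length in bytes
    downlink: bool     # True if sent from server to client


@dataclass
class DataBlock:
    ack: int                                   # ACK value shared by all packets
    packets: List[Packet] = field(default_factory=list)


def detect_data_blocks(packets: List[Packet]) -> List[DataBlock]:
    """Group downlink packets into data blocks; an ACK change starts a new block."""
    blocks: List[DataBlock] = []
    # Step 203: keep only downlink packets, processed in time order.
    downlink = sorted((p for p in packets if p.downlink), key=lambda p: p.timestamp)
    for pkt in downlink:
        # Step 2042: a changed ACK value opens a new data block.
        if not blocks or pkt.ack != blocks[-1].ack:
            blocks.append(DataBlock(ack=pkt.ack))
        # Step 2041: packets with the same ACK value belong to the same block.
        blocks[-1].packets.append(pkt)
    return blocks
```

The difference between the ACK values of two consecutive blocks corresponds to the length of the HTTP request the client sent between them, as described above.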
  • The features corresponding to the definition, and their averages, are extracted for the data blocks from the TCP stream file of the video transmission, wherein:
  • the features include one or more of: data block size, number of data packets, first byte arrival time, data block download time, data block idle time, data block transmission time, and data transmission rate;
  • the feature averages include one or more of: average data block size, average number of data packets, average first byte arrival time, average data block download time, average data block idle time, average data block transmission time, and average data transmission rate.
  • Feature extraction is divided into three steps: first, extract the relevant features of each data block; second, average each feature over all data blocks in the same TCP connection; finally, combine the per-block features from the first step with the stream-wide feature averages from the second step to form the feature set of a data block.
  • the first step is to extract 7 features for each data block, which are:
  • Chunk_size the number of bytes of the data block
  • Packet_number the number of packets of the data block
  • Time to first byte (TTFB) the time from when the GET is issued to the arrival of the first byte of the response;
  • the GET time refers to the moment the HTTP request message is sent by the client.
  • the response refers to the first message that the server sends back to the client.
  • the client and the server exchange messages over the TCP protocol.
  • After the client sends an HTTP request message, the server replies with a TCP ACK message.
  • The TTFB is thus the time from the client sending the HTTP request to the first packet returned by the server.
  • Download_time the time between the first packet and the last packet of the data block
  • Slack_time the time from the last message of the data block to the next GET request
  • Duration_time the time from the GET request to the next GET request
  • the data block transmission time is equal to the arrival time of the first byte plus the download time plus the idle time;
  • Download_speed data block bytes divided by download time
  • High-definition data blocks are significantly larger than low-definition data blocks in both byte count and packet count. These two features are the most important, and their meaning is the most obvious and the easiest to understand. Statistical analysis shows that the other, time-related features also help distinguish definition.
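
The seven per-block features can be computed as in the sketch below. It is an illustration under assumptions: the `BlockTiming` type and `block_features` function are hypothetical, the feature keys are lower-cased forms of the names in the list above, and the sketch assumes the time of the next GET request is known (the last block of a stream has no next request and would need separate handling).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BlockTiming:
    get_time: float            # when the client sent the GET for this block
    packet_times: List[float]  # arrival times of the block's packets, ascending
    packet_sizes: List[int]    # payload bytes of each packet
    next_get_time: float       # when the client sent the GET for the next block


def block_features(b: BlockTiming) -> Dict[str, float]:
    """Compute the seven per-block features named in the text."""
    first, last = b.packet_times[0], b.packet_times[-1]
    chunk_size = sum(b.packet_sizes)
    download_time = last - first
    return {
        "chunk_size": chunk_size,
        "packet_number": len(b.packet_sizes),
        "ttfb": first - b.get_time,                   # GET to first response byte
        "download_time": download_time,               # first to last packet
        "slack_time": b.next_get_time - last,         # last packet to next GET
        "duration_time": b.next_get_time - b.get_time,  # GET to next GET
        # Guard against single-packet blocks, whose download time is zero.
        "download_speed": chunk_size / download_time if download_time > 0 else 0.0,
    }
```

By construction, `duration_time` equals `ttfb + download_time + slack_time`, matching the relationship stated for the data block transmission time.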
  • the second step is to calculate the average feature value for each TCP connection of Tencent Video, which are:
  • ave_chunk_size the average of all chunk sizes
  • ave_packet_number the average of all packet counts
  • ave_ttfb the average of all first byte arrival times
  • ave_download_time the average of all download times
  • ave_slack the average of all idle times
  • ave_duration_time the average of all data block transmission times
  • ave_download_speed the average block size, divided by the average download time
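
The second and third extraction steps (averaging per TCP stream, then attaching the averages to every block) might look like the sketch below. The function names `stream_averages` and `build_feature_sets` are hypothetical, and the per-block dictionary keys are assumed lower-cased forms of the feature names; only the `ave_*` names come from the list above.

```python
from typing import Dict, List

# Maps each per-block feature key (assumed naming) to its per-stream average name.
AVERAGED = {
    "chunk_size": "ave_chunk_size",
    "packet_number": "ave_packet_number",
    "ttfb": "ave_ttfb",
    "download_time": "ave_download_time",
    "slack_time": "ave_slack",
    "duration_time": "ave_duration_time",
}


def stream_averages(blocks: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each feature over all data blocks of one TCP stream."""
    if not blocks:
        raise ValueError("a TCP stream needs at least one data block")
    ave = {name: sum(b[key] for b in blocks) / len(blocks)
           for key, name in AVERAGED.items()}
    # Per the text: average block size divided by average download time.
    dl = ave["ave_download_time"]
    ave["ave_download_speed"] = ave["ave_chunk_size"] / dl if dl > 0 else 0.0
    return ave


def build_feature_sets(blocks: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Combine each block's own features with the stream-wide averages."""
    ave = stream_averages(blocks)
    return [{**b, **ave} for b in blocks]
```

Each resulting row holds 14 values: the 7 features of one block plus the 7 averages shared by every block of the same stream.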
  • Example data is as follows:
  • Because an encryption protocol is used during video transmission, the content of the video file cannot be obtained directly, and the video definition cannot be identified from the video content. However, each video data block can be identified at the TCP layer, and the size and feature parameters of each video data block can be calculated. Since video files are divided into blocks by duration, high-definition video blocks are generally larger than low-definition ones, although block size is affected by many factors. For example, a video with many static images encodes to smaller data blocks than a video with many action scenes; likewise, a video of the same definition is encoded differently on the mobile platform and the browser platform, so the data block sizes also differ.
  • Step 105 in the first embodiment requires model training.
  • The features of a data block and the feature averages over all data blocks of the current TCP stream file are combined into a feature set sample of known definition, and the model receives at least one feature set sample for training.
  • The definition indicated by the video encoding number in the playback log of the video of known definition is used to verify the definition predicted by the model. If the accuracy of the model's predictions is higher than the preset value, the model is considered successfully trained.
  • After repeated rounds of feature collection, data verification, and feature weight adjustment, the model is successfully trained once the accuracy of its predictions exceeds the preset value.
  • All video data blocks belonging to the same TCP stream have the same video definition, so averaging the feature parameters of all data blocks in the stream effectively abstracts the entire TCP stream into a single data block. After the definitions of all data blocks have been predicted, a "training result optimization" step is applied.
  • The process is as follows: count the predicted definition categories of all data blocks in a TCP stream, find the category with the largest share, and change the predictions of all other data blocks to that category. In other words, all predictions for a TCP stream are tallied and forcibly corrected by majority rule.
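
The majority-rule correction can be sketched as follows. The function name `correct_by_majority` is hypothetical, and the tie-breaking behavior (the category encountered first wins) is an assumption of this sketch; the text does not say how ties are resolved.

```python
from collections import Counter
from typing import List


def correct_by_majority(predictions: List[str]) -> List[str]:
    """Replace every per-block prediction in one TCP stream with the majority category."""
    if not predictions:
        return []
    # most_common(1) returns the category with the highest count; on a tie,
    # Counter keeps the category that was encountered first (sketch assumption).
    majority, _count = Counter(predictions).most_common(1)[0]
    return [majority] * len(predictions)
```

Because all data blocks of one TCP stream share the same definition, a stream predicted as mostly 720p with a few 480p or 1080p outliers is reported as 720p throughout.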
  • When the model predicts the definition of encrypted traffic video of unknown definition, the model trained on the Android platform is used in mobile communication networks, and the model trained on the PC platform is used in traditional fixed-network environments.
  • The average size of the video data blocks produced by the Tencent Video client on the Android mobile platform is smaller than on the PC platform; therefore, the Android and PC platforms need to be distinguished during training and prediction.
  • Using the Android-trained model in mobile networks and the PC-trained model in traditional fixed-network environments improves the accuracy of the model's predictions.
  • Random oversampling is used to balance the definition categories in the sample set. The machine learning model uses the random forest algorithm with all parameters left at their defaults. The class distribution of the sample set is not uniform, so category balancing must be performed on the sample set.
  • The Random Oversampler random oversampling method is used for category balancing. Random oversampling is a standard procedure: minority-class samples are randomly copied and repeated until the minority and majority classes have the same number of samples, yielding a new balanced dataset.
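
Random oversampling as described above can be illustrated with the standard library alone. This is a sketch, not the tool used in the patent: the function name `random_oversample` and the `seed` parameter are hypothetical. The "Random Oversampler" named in the text may refer to the `RandomOverSampler` class of the imbalanced-learn library, but that is an assumption.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def random_oversample(samples: Sequence[list], labels: Sequence[str],
                      seed: int = 0) -> Tuple[List[list], List[str]]:
    """Duplicate randomly chosen minority-class samples until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    by_label: Dict[str, List[list]] = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)
    target = max(len(group) for group in by_label.values())
    out_samples: List[list] = []
    out_labels: List[str] = []
    for label, group in by_label.items():
        # Draw copies with replacement from the class's own samples.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels
```

Oversampling would normally be applied to the training split only, so that duplicated samples do not leak into the data used to measure accuracy.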
  • The SNI field is compared with the strings in the domain name. If the SNI field fully matches the preset domain format and the preset string, the traffic is video traffic in HLS transmission mode and can serve as encrypted traffic video whose definition is to be predicted.
  • HTTPS is the HTTP protocol carried over the TLS protocol.
  • The TLS ClientHello message contains an SNI (Server Name Indication) field.
  • This field indicates the domain name of the server to be connected. The SNI field can be compared with the domain names of the Tencent Video servers to match the corresponding traffic.
  • The domain name must have the format "lts***.qq.com" or "sts***.qq.com", so it is only necessary to compare the SNI field against this format and the lts or sts string. If they match exactly, the traffic is Tencent Video traffic in HLS transmission mode and can serve as encrypted traffic video whose definition is to be predicted.
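
The SNI comparison can be sketched with a regular expression. The function name `is_hls_video_sni` is hypothetical, and reading "***" as one or more letters or digits is an assumption of this sketch; extracting the SNI value from the ClientHello message is not shown.

```python
import re

# "lts***.qq.com" or "sts***.qq.com"; "***" is assumed to be one or more
# lowercase letters or digits.
_SNI_PATTERN = re.compile(r"(lts|sts)[a-z0-9]+\.qq\.com")


def is_hls_video_sni(sni: str) -> bool:
    """Return True if the SNI value has the preset lts/sts domain format."""
    return _SNI_PATTERN.fullmatch(sni.strip().lower()) is not None
```

All five example domains listed in step 201 (ltsbsy.qq.com, ltscsy.qq.com, ltssjy.qq.com, ltsws.qq.com and stsbsy.qq.com) satisfy this pattern, while other qq.com hosts do not.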
  • the method of identifying the sharpness through video data block information still has a high accuracy rate, which is a practical method.
  • the information of the video data block is collected, trained, and then the trained model is applied to the definition prediction of the encrypted traffic video to be tested.
  • Embodiment 2:
  • FIG. 5 is a schematic structural diagram of an apparatus for predicting the definition of encrypted traffic video according to an embodiment of the present invention.
  • the apparatus for predicting the sharpness of the encrypted traffic video in this embodiment includes one or more processors 21 and a memory 22 .
  • one processor 21 is taken as an example in FIG. 5 .
  • the processor 21 and the memory 22 may be connected by a bus or in other ways, and the connection by a bus is taken as an example in FIG. 5 .
  • the memory 22 can be used to store non-volatile software programs and non-volatile computer-executable programs, such as the method for sharpness prediction of encrypted traffic video in Embodiment 1 .
  • the processor 21 executes the method of sharpness prediction for encrypted traffic video by running non-volatile software programs and instructions stored in the memory 22 .
  • Memory 22 may include high speed random access memory, and may also include nonvolatile memory, such as at least one magnetic disk storage device, flash memory device, or other nonvolatile solid state storage device.
  • the memory 22 may optionally include memory located remotely from the processor 21, and these remote memories may be connected to the processor 21 through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
  • The program instructions/modules are stored in the memory 22 and, when executed by the one or more processors 21, perform the method for predicting the definition of encrypted traffic video in Embodiment 1 above, for example the steps shown in Figures 1 and 3 described above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Disclosed in the present invention are a method and device for predicting the definition of a video having encrypted traffic. The method comprises: capturing TCP stream data packets for playback of network videos having encrypted traffic and a playback log; according to video code numbers in the playback log, labelling the definitions of the captured network videos having encrypted traffic; detecting data blocks from the TCP stream data packets; extracting features corresponding to the definitions in the data blocks and feature average values, and forming feature sets of the definitions of known data packets; establishing a model by means of correspondences between definitions labeled on known video files and the feature sets of the data blocks, training the model, performing feature extraction on a TCP stream data packet for a video having encrypted traffic to be predicted, and according to the correspondences between the feature sets and the definitions in the model, predicting the definition of the video file having encrypted traffic to be predicted. When video transmission is encrypted and the content of the video file cannot be obtained, the definition of a video file having encrypted traffic to be predicted is predicted by building a model.

Description

A method and device for predicting the definition of encrypted traffic video
Technical Field
The invention belongs to the field of computer servers and, more particularly, relates to a method and device for predicting the definition of encrypted traffic video.
Background
When a video website transmits video over HTTP, DPI (Deep Packet Inspection) vendors can extract the transmitted video file from the network traffic. The header of the video file contains information such as the video encoding, definition, bit rate, and picture size. In recent years, almost all large websites have deployed digital certificates and use the HTTPS protocol when interacting with clients, and video websites are no exception. As of 2020, China's mainstream video platforms all use HTTPS encrypted transmission when videos are watched in a browser.
When video transmission is encrypted, DPI vendors cannot obtain the content of the video file and therefore cannot analyze the video definition.
Summary of the Invention
In view of the above defects of, or needs for improvement in, the prior art, the present invention provides a method and device for predicting the definition of encrypted traffic video. Because the feature set of the data packets corresponds to the video definition, a model is trained on feature sets of data packets whose definition is known and is then used to predict the definition of encrypted video files whose definition is unknown. This solves the technical problem that DPI vendors cannot analyze video definition when video transmission is encrypted.
To achieve the above object, in a first aspect, the present invention provides a method for predicting the definition of encrypted traffic video, the method comprising:
capturing the TCP stream data packets and the playback log of network video playback with HTTPS encrypted traffic;
labeling the captured encrypted traffic network video with its definition according to the video encoding number in the playback log;
detecting data blocks from the TCP stream data packets;
extracting, from the data blocks, the features corresponding to the definition and the feature averages, to form a feature set of data packets of known definition;
building a model from the correspondence between the definitions labeled on known video files and the data block feature sets, and training the model; after training is complete, performing feature extraction on the TCP stream data packets of the encrypted traffic video to be tested, and predicting the definition of the encrypted traffic video file to be tested according to the correspondence between feature sets and definitions in the model.
As a further improvement of and supplement to the above solution, the present invention also includes the following additional technical features.
Preferably, collecting the data of network video playback with HTTPS encrypted traffic comprises:
requesting, in a browser, playback of videos with HTTPS encrypted traffic from the database that stores the video information, and selecting video files of at least two definitions and with different video content.
Preferably, each encoding mode corresponds to a unique video definition number in the playback log; when the browser plays a video, the video encoding number in the playback log of the encrypted traffic network video and the corresponding definition are recorded and collected together.
Preferably, the data blocks are detected according to the ACK field of the TCP packets, which specifically comprises:
determining, for all packets of a TCP stream, whether the stream is an HLS video stream;
parsing the TLS messages, discarding the TLS handshake packets, and keeping the packets that carry data;
distinguishing uplink packets from downlink packets, and processing the downlink packets;
classifying the downlink packets by ACK value;
recording packets with the same ACK value as one data block;
recording packets whose ACK value has changed as a new data block.
Preferably, the features corresponding to the definition, and the feature averages, are extracted from the data blocks of the TCP stream file during video transmission, wherein:
the features include one or more of: data block size, number of data packets, time to first byte, data block download time, data block idle time, data block transmission time, and data transmission rate;
the feature averages include one or more of: average data block size, average number of data packets, average time to first byte, average data block download time, average data block idle time, average data block transmission time, and average data transmission rate.
Preferably, the features in the data block, the feature averages, and the data block of the current TCP stream file are combined into one feature set sample of known data packet definition. The model is trained on at least one feature set sample, and the definitions predicted by the model are verified against the video encoding numbers in the playback logs of videos of known definition. If the accuracy of the model's predictions is higher than a preset value, the model has been trained successfully.
Preferably, when the model predicts encrypted traffic video of unknown definition, a model trained on the Android platform is used in a mobile communication network, and a model trained on the PC platform is used in a traditional fixed network environment.
Preferably, random oversampling is applied to the sample set to balance the definition classes.
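The class balancing step could be sketched as follows. This is a minimal illustration in Python, assuming that "random oversampling" means duplicating randomly chosen samples of the under-represented definitions until every class is as large as the largest one; the function name `random_oversample` and the toy values are hypothetical and not taken from the invention.

```python
import random
from collections import Counter, defaultdict

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)
    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels

# Toy sample set: three 720p blocks and one 1080p block (made-up values).
X = [[2480000, 1750], [2300000, 1690], [2410000, 1720], [5100000, 3600]]
y = ["720p", "720p", "720p", "1080p"]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Oversampling only duplicates existing samples, so it should be applied to the training portion of the data, not to the samples held back for verifying accuracy.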
Preferably, when the target traffic carrying HTTPS encrypted network video is filtered out of the network traffic, the SNI field and character string in the domain name are compared with a preset SNI field and a preset character string; an exact match indicates that the traffic is video traffic in HLS transmission mode and can serve as an encrypted traffic video file whose definition is to be predicted.
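A minimal Python sketch of this exact-match filter is given below. It assumes the SNI string has already been extracted from the TLS Client Hello by some other parser; the domain names are the ones listed for Tencent Video in step 201 of Embodiment 1, while the function name `is_target_hls_flow` and the case and trailing-dot normalization are assumptions made for illustration.

```python
# Preset SNI values, taken from step 201 of Embodiment 1.
TENCENT_HLS_SNI = frozenset({
    "ltsbsy.qq.com", "ltscsy.qq.com", "ltssjy.qq.com",
    "ltsws.qq.com", "stsbsy.qq.com",
})

def is_target_hls_flow(sni, allowed=TENCENT_HLS_SNI):
    """Return True only when the SNI exactly matches a preset domain name."""
    if sni is None:
        return False
    # Assumption: host names are compared case-insensitively and without
    # a trailing dot; sub-domains and look-alikes do not match.
    return sni.strip().lower().rstrip(".") in allowed

print(is_target_hls_flow("ltsws.qq.com"))       # exact match
print(is_target_hls_flow("evil.ltsws.qq.com"))  # not an exact match
```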
In a second aspect, the present invention also provides a device for predicting the definition of encrypted traffic video, the device comprising:
at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being configured to perform the method for predicting the definition of encrypted traffic video described in the first aspect.
In general, compared with the prior art, the above technical solutions conceived by the present invention have the following beneficial effects:
Even when video transmission is encrypted and the content of the video file cannot be obtained, DPI vendors can build a model and use feature sets from video files of known definition to predict the definition of the encrypted traffic video file to be tested.
Brief Description of the Drawings
Fig. 1 shows the process of training the model in Embodiment 1 of the present invention;
Fig. 2 shows the definitions labeled from the video encoding numbers in the playback log in Embodiment 1 of the present invention;
Fig. 3 shows the process of detecting data blocks from the ACK field of TCP packets in Embodiment 1 of the present invention;
Fig. 4 shows the comparison of the SNI field and character string in the domain name to filter out the encrypted traffic network video to be tested in Embodiment 1 of the present invention;
Fig. 5 is a schematic structural diagram of a device for predicting the definition of encrypted traffic video provided by an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention, not to limit it. In addition, the technical features involved in the embodiments described below can be combined with one another as long as they do not conflict.
In the present invention, unless otherwise expressly specified and limited, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that they are not in direct contact but are in contact through another feature between them. Moreover, a first feature being "on", "above", or "over" a second feature includes the first feature being directly above or obliquely above the second feature, or simply means that the first feature is at a higher level than the second feature. A first feature being "under", "below", or "beneath" a second feature includes the first feature being directly below or obliquely below the second feature, or simply means that the first feature is at a lower level than the second feature.
Embodiment 1:
When a video website transmits video over HTTP, DPI vendors can extract the transmitted video file from the network traffic. The header of the video file contains information such as the video encoding, definition, bit rate, and picture size. In recent years, almost all large websites have deployed digital certificates and use the HTTPS protocol when interacting with clients, and video websites are no exception. As of 2020, China's mainstream video platforms are Tencent Video, iQiyi, and Youku; all three use HTTPS encrypted transmission when videos are watched in a browser.
When video transmission is encrypted, DPI vendors cannot obtain the content of the video file and therefore cannot analyze the video quality. To solve this problem, Embodiment 1 provides a method for predicting the definition of encrypted traffic video, using the Tencent Video platform as an example. As shown in Fig. 1 and Fig. 3, the method for predicting the video definition comprises the following steps:
In step 101, the Transmission Control Protocol (TCP) stream data packets and the playback log of network video playback with HTTPS encrypted traffic are captured.
On the web side, that is, when Tencent Video is watched in a browser, HTTPS encrypted transmission is used. The web side also has a P2P (peer-to-peer) transmission mechanism; the P2P part is carried over the User Datagram Protocol (UDP), and this part of the traffic is not encrypted;
The PC client and the mobile client use HTTP and UDP, both in plaintext and unencrypted; UDP accounts for more than 95% of this traffic;
Embodiment 1 identifies the definition of video carried in HTTPS encrypted traffic.
Tencent Video uses different transmission modes depending on the client type and the video type. For long videos such as TV series and movies, it uses HTTP Live Streaming (HLS), an adaptive bitrate technology, with chunked transmission, or chunked MP4 transmission; for user-uploaded videos and short videos, it uses whole-file MP4 transmission; for advertising videos, it uses chunked MP4 transmission.
In addition, the encoding and transmission modes of Tencent Video keep evolving. At present, the most common transmission mode is HLS; Embodiment 1 therefore describes only the HLS transmission mode and does not cover the others.
In step 102, the captured encrypted traffic network video is labeled with its definition according to the video encoding number in the playback log.
Tencent Video offers four definitions: 270p, 480p, 720p, and 1080p. Each definition may be encoded in several ways, but each encoding mode has a unique number, and that number maps to a specific definition.
In step 103, data blocks are detected from the TCP stream data packets.
Tencent Video transmission proceeds as follows:
In step 1031, multiple video data blocks are downloaded within one TCP stream, relying on the keep-alive mechanism of HTTP. According to statistics, one TCP stream carries one or more data blocks, up to several dozen. All video data blocks downloaded within the same TCP stream have the same definition; to switch definition, the client closes this TCP connection and downloads over a different TCP connection.
In step 104, the features corresponding to the definition are extracted from the data blocks to form a feature set of data packets of known definition. Feature extraction first extracts the relevant features of each data block, then averages each feature over all data blocks in the same TCP connection, and finally combines the per-block features from the first step with the stream-wide feature averages from the second step into the feature set of one data block.
In step 105, a model is built and trained from the correspondence between the definitions labeled on known video files and the data block feature sets. After training is complete, features are extracted from the TCP stream data packets of the encrypted traffic video to be tested, and the definition of the encrypted traffic video file to be tested is predicted according to the correspondence between feature sets and definitions in the model.
Tencent Video encodes one video into multiple files of different definitions, splits each file into segments and blocks, and records the block information in an index file. In HLS transmission mode, the client first downloads the index file for the chosen definition. The index file divides the video into pieces of a fixed duration, each with a unique URL, and the client downloads the video pieces one by one from the URLs in the index file.
For HLS transmission, Tencent Video processes the video at two levels, segments and blocks. The Tencent Video server first splits a complete video into data segments of about 5 minutes each, named 1.ts, 2.ts, 3.ts, and so on. It then splits each data segment into data blocks of about 10 seconds each, numbered from 0.
A typical video block URL has the following format:
00_b0033m9le2c.321002.1.ts?index=0&start=0&end=7000&brs=0&bre=222967&ver=4
The first part is the file name of the downloaded video piece, 00_b0033m9le2c.321002.1.ts, whose fields are as follows:
00 is the data block index number;
b0033m9le2c is the video ID;
321002 is the video encoding tag;
1.ts is the segment index;
The URL parameters are as follows:
index=0: data block index number;
start=0&end=7000: start and end time;
brs=0&bre=222967: start and end offsets of the data; these offsets are relative to the current segment, so the value restarts from 0 when switching from 1.ts to 2.ts.
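The field layout described above can be illustrated with a short Python sketch that splits the example URL into its parts. The function name `parse_segment_url` and the dictionary keys are hypothetical; the sketch also assumes the file name always has the form `<block>_<video id>.<encoding tag>.<segment>.ts` and that every query parameter is an integer, as in the example.

```python
from urllib.parse import urlsplit, parse_qs

def parse_segment_url(url):
    """Split a block URL of the form shown above into named fields."""
    parts = urlsplit(url)
    file_name = parts.path.rsplit("/", 1)[-1]
    # "00_b0033m9le2c.321002.1.ts" -> "00" and "b0033m9le2c.321002.1.ts"
    block_index, rest = file_name.split("_", 1)
    video_id, encoding_tag, segment_index, _ext = rest.split(".")
    params = {key: int(values[0]) for key, values in parse_qs(parts.query).items()}
    return {
        "block_index": int(block_index),
        "video_id": video_id,
        "encoding_tag": encoding_tag,   # maps to a definition via the playback log
        "segment_index": int(segment_index),
        "params": params,
    }

info = parse_segment_url(
    "00_b0033m9le2c.321002.1.ts?index=0&start=0&end=7000&brs=0&bre=222967&ver=4"
)
print(info["video_id"], info["encoding_tag"], info["params"]["bre"])
```

Under HTTPS this URL is only visible on the capturing client (for example in the playback log), not to a device observing the encrypted traffic, which is why the encoding tag is used for labeling training data only.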
When Tencent Video is watched in a browser, the browser first downloads an index file and then downloads each video data block from the URL listed for it in the index file. Tencent Video uses the HLS transmission mode, which in essence divides the video into many video data blocks that the client requests one by one.
Although the video is encrypted with the Transport Layer Security (TLS) protocol during transmission, so that DPI vendors cannot obtain the content of the video file or identify the definition from the video content, TLS encryption leaves the length of the data essentially unchanged. Length-based features at the TCP layer are therefore preserved: from the lengths of the requests and responses, each video data block can still be identified, and its size and related features can be calculated. Because the video file is divided into blocks by duration, a model can be built to predict the video definition from the length of the data blocks and from information related to that length.
In Embodiment 1, the model is built with the random forest classification algorithm. Feature sets of video data blocks of known definition are collected first, and the model is then trained and tested on these feature sets, with its results compared against the known definitions of the videos. If the accuracy exceeds a preset value, for example 80% or 70%, the model is considered trained; it is then frozen and applied to predicting the definition of the encrypted video traffic to be tested.
In tests on experimental data, the prediction accuracy of this data block model reaches at least 70%, which largely solves the problem that DPI vendors cannot determine the definition of encrypted traffic video files.
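The training and acceptance procedure could be sketched as below with scikit-learn's `RandomForestClassifier`. This is only an illustration under stated assumptions: the data are synthetic and deliberately well separated (they are not measurements), only two of the features are used for brevity, and the hyperparameters, the 25% hold-out split, and the names `fake_sample` and `PRESET_ACCURACY` are choices made here, not values given by the invention.

```python
import random

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = random.Random(0)

def fake_sample(label):
    # Synthetic (chunk_size, packet_number) pair; NOT measured data.
    base = {"480p": 1_000_000, "720p": 2_400_000, "1080p": 5_000_000}[label]
    size = base * rng.uniform(0.9, 1.1)
    return [size, size / 1400.0]

labels = [lab for lab in ("480p", "720p", "1080p") for _ in range(60)]
features = [fake_sample(lab) for lab in labels]

# Hold back part of the labeled samples to test the trained model.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=0, stratify=labels)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
PRESET_ACCURACY = 0.7  # the text gives 70% or 80% as example thresholds
model_accepted = accuracy > PRESET_ACCURACY
print(f"accuracy={accuracy:.2f} accepted={model_accepted}")
```

On real traffic the classes overlap far more than in this toy data, so the accuracy obtained here says nothing about the accuracy reported in the text.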
The collection of data from network video playback with HTTPS encrypted traffic in Embodiment 1 comprises:
requesting, in a browser, playback of videos with HTTPS encrypted traffic from the database that stores the video information, and selecting video files of at least two definitions and with different video content. The more video files are played, the more samples are collected.
In Embodiment 1, each encoding mode corresponds to a unique video definition number in the playback log; when the browser plays a video, the video encoding number in the playback log of the encrypted traffic network video and the corresponding definition are recorded and collected together. Tencent Video offers four definitions, 270p, 480p, 720p, and 1080p, and at least one video file is needed as a sample for each of them. Fig. 2 lists the relationship between the definitions and the corresponding video encoding numbers, so the definition can be looked up from the video encoding number in the playback log.
In Embodiment 1, the encrypted traffic data is preprocessed into a TCP stream file, and data blocks are detected according to the acknowledgment (ACK) field of the TCP packets. Video flows in one direction, from the server to the client, and is transmitted in blocks. While a data block is being transmitted, the client sends no request messages to the server; only after a data block has been fully transmitted does the client send a request message for the next data block. During the transmission of a data block, the ACK field of the TCP packets sent by the server to the client therefore stays constant. Once the client sends an HTTP request, the ACK value increases, and the increase equals the length of the HTTP request message sent by the client.
As shown in Fig. 3, the steps for detecting data blocks are as follows:
In step 201, first determine whether a TCP stream is a Tencent Video HLS video stream, by checking whether the SNI field in the Client Hello message of the TLS handshake is one of Tencent Video's specific domain names; these domain names include ltsbsy.qq.com, ltscsy.qq.com, ltssjy.qq.com, ltsws.qq.com, and stsbsy.qq.com;
In step 202, parse the TLS messages, discard the TLS handshake packets, and extract the packets that carry data;
A TLS connection performs a handshake first and then transmits data; parsing according to the TLS specification shows which packets are handshake packets and which carry data;
In step 203, distinguish uplink packets from downlink packets;
In step 204, process the ACK values of the downlink packets and detect whether the ACK value changes;
Step 204 can be implemented as the following steps 2041 and 2042.
In step 2041, packets with the same ACK value are recorded as one data block;
In step 2042, packets whose ACK value has changed are recorded as a new data block.
Data packets with the same ACK value are recorded as data block 1; that is, data block 1 is a set of many individual data packets that all have the same ACK value. The data packets are processed in order of their timestamps, and the subsequent data blocks are recorded in turn as data blocks 2, 3, 4, ..., N. All video data blocks belonging to the same TCP stream have the same video definition, and when the model predicts the definition, the data blocks all belong to the same TCP stream.
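Steps 203 to 2042 can be sketched in Python as follows. The sketch assumes the TLS handshake packets have already been removed (step 202) and that each packet has been reduced to a small record; the `Packet` fields, the function name `split_into_blocks`, and the toy packet values are hypothetical.

```python
from collections import namedtuple
from itertools import groupby

# time: capture timestamp in seconds; downlink: True for server-to-client;
# ack: TCP acknowledgment number; payload_len: TCP payload length in bytes.
Packet = namedtuple("Packet", "time downlink ack payload_len")

def split_into_blocks(packets):
    """Group consecutive downlink data packets that share one ACK value."""
    downlink = sorted(
        (p for p in packets if p.downlink and p.payload_len > 0),
        key=lambda p: p.time,
    )
    # A change of ACK value starts a new data block.
    return [list(group) for _ack, group in groupby(downlink, key=lambda p: p.ack)]

packets = [
    Packet(0.000, False, 1, 300),     # client request, 300 bytes
    Packet(0.012, True, 301, 1400),   # block 1
    Packet(0.020, True, 301, 1400),   # block 1
    Packet(4.000, False, 2801, 310),  # next client request, 310 bytes
    Packet(4.015, True, 611, 1400),   # block 2: ACK grew by the request length
]
blocks = split_into_blocks(packets)
print([len(block) for block in blocks])
```

In the toy data the server's ACK value rises from 301 to 611 between the two blocks, that is, by the 310 bytes of the second request, which matches the behavior described for the ACK field.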
In Embodiment 1, the features corresponding to the definition, and their averages, are extracted from the data blocks of the TCP stream file during video transmission, wherein:
the features include one or more of: data block size, number of data packets, time to first byte, data block download time, data block idle time, data block transmission time, and data transmission rate;
the feature averages include one or more of: average data block size, average number of data packets, average time to first byte, average data block download time, average data block idle time, average data block transmission time, and average data transmission rate.
Feature extraction has three steps: first, extract the relevant features of each data block; second, average each feature over all data blocks in the same TCP connection; third, combine the per-block features from the first step with the stream-wide feature averages from the second step into the feature set of one data block.
In the first step, 7 features are extracted for each data block:
1. Data block size;
Chunk_size, the number of bytes in the data block;
2. Number of data packets;
Packet_number, the number of data packets in the data block;
3. Time to first byte;
Time to first byte, TTFB for short, the time from sending the GET to the arrival of the first byte of the response;
GET refers to the HTTP request message sent by the client. The response refers to the first packet the server sends back to the client. The client and the server exchange packets over TCP: after the client sends an HTTP request message, the server replies with a TCP ACK packet. TTFB is the time from the client sending the HTTP request to the first packet returned by the server.
4. Download time;
Download_time, the time between the first packet and the last packet of the data block;
5. Idle time;
Slack_time, the time from the last packet of the data block to the sending of the next GET request;
6. Data block transmission time;
Duration_time, the time from one GET request to the next GET request;
The data block transmission time equals the time to first byte plus the download time plus the idle time;
7. Transmission rate;
Download_speed, the number of bytes in the data block divided by the download time;
A high-definition data block is clearly larger than a low-definition data block in both byte count and packet count. These two features are the most important, and their meaning is the most obvious and the easiest to understand. Statistical analysis shows that the other, time-related features also help to distinguish definitions.
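The seven per-block features can be computed from packet timestamps and sizes, as in the following Python sketch. The function name `block_features` and its arguments are hypothetical, the key names follow the column names of the example data in this document, and the zero guard for single-packet blocks is an assumption that the text does not address.

```python
def block_features(get_time, packet_times, packet_sizes, next_get_time):
    """Compute the seven per-block features from the packet timestamps
    (seconds) and payload sizes (bytes) of one data block."""
    first, last = packet_times[0], packet_times[-1]
    chunk_size = sum(packet_sizes)
    download_time = last - first
    return {
        "chunk_size": chunk_size,
        "packet_number": len(packet_sizes),
        "ttfb": first - get_time,
        "download_time": download_time,
        "slack": next_get_time - last,
        "duration_time": next_get_time - get_time,
        # Guard against single-packet blocks (an assumption, not from the text).
        "download_speed": chunk_size / download_time if download_time > 0 else 0.0,
    }

example = block_features(
    get_time=0.0,
    packet_times=[0.012, 2.0, 4.212],
    packet_sizes=[1400, 1400, 1400],
    next_get_time=7.212,
)
print(example)
```

With these toy timestamps the identity stated in item 6 holds: the transmission time of 7.212 s is the sum of the time to first byte (0.012 s), the download time (4.2 s), and the idle time (3.0 s).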
In the second step, the feature averages are calculated for each Tencent Video TCP connection:
1. Average data block size;
ave_chunk_size, the average of all data block sizes;
2. Average number of packets;
ave_packet_number, the average of all packet counts;
3. Average time to first byte;
ave_ttfb, the average of all times to first byte;
4. Average download time;
ave_download_time, the average of all download times;
5. Average idle time;
ave_slack, the average of all idle times;
6. Average transmission time;
ave_duration_time, the average of all transmission times;
7. Average download rate;
ave_download_speed, the average data block size divided by the average download time;
In the third step, for each data block, the 7 features extracted from the data block are combined with the 7 average features of the whole TCP stream, 14 features in total; together with the definition of the data block, they form one sample.
示例数据如下:Example data is as follows:
chunk_size,packet_number,ttfb,download_time,slack,duration_time,download_speed,ave_chunk_size,ave_packet_number,ave_ttfb,ave_download_time,ave_slack,ave_duration_time,ave_download_speed,resolutionchunk_size,packet_number,ttfb,download_time,slack,duration_time,download_speed,ave_chunk_size,ave_packet_number,ave_ttfb,ave_download_time,ave_slack,ave_duration_time,ave_download_speed,resolution
2480000,1750,0.012,4.2,3.0,7.212,590476.1904761905,2338589.1333333333,1801.2666666666667,0.019322029749552407,3.6202432473500568,3.8155619303385415,7.455127207438151,645975.6910105784,720p2480000,1750,0.012,4.2,3.0,7.212,590476.1904761905,2338589.1333333333,1801.2666666666667,0.019322029749552407,3.6202432473500568,3.8155619303385415,7.455127207438151,645975.6910105784,720p
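Assembling one sample from the per-block features, the per-stream averages and the definition label might look like the sketch below. The column order follows the example header; the function name `build_sample` and the dict inputs are hypothetical.

```python
# Column order taken from the example data header.
CHUNK_COLS = ["chunk_size", "packet_number", "ttfb", "download_time",
              "slack", "duration_time", "download_speed"]
FLOW_COLS = ["ave_" + name for name in CHUNK_COLS]


def build_sample(chunk, flow_avg, resolution):
    """Return (features, label): 14 numeric features plus the definition label."""
    features = [chunk[c] for c in CHUNK_COLS] + [flow_avg[c] for c in FLOW_COLS]
    return features, resolution


# The example row above, split into its two halves.
chunk = dict(zip(CHUNK_COLS,
                 [2480000, 1750, 0.012, 4.2, 3.0, 7.212, 590476.1904761905]))
flow_avg = dict(zip(FLOW_COLS,
                    [2338589.1333333333, 1801.2666666666667,
                     0.019322029749552407, 3.6202432473500568,
                     3.8155619303385415, 7.455127207438151,
                     645975.6910105784]))
features, label = build_sample(chunk, flow_avg, "720p")
print(len(features), label)
```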
Although an encryption protocol is used during video transmission, so that the content of the video file cannot be obtained directly and the definition cannot be identified from the video content, each video data block can still be identified at the TCP layer, and the size and the feature parameters of each video data block can be calculated. Because video files are segmented by duration, a high-definition video block is generally longer than a low-definition video block. The length of a video data block is nevertheless affected by many factors. For example, for a video made up mostly of static images, the encoded data blocks are shorter than for a video with many action scenes; likewise, a video of the same definition uses different video encodings on the mobile phone platform and on the browser platform, so the data block lengths also differ.
Step 105 of Embodiment 1 requires model training. The features of the data block, the feature averages, and the data block of the current TCP stream file are combined into a feature set sample of known data packet definition. The model is trained on at least one feature set sample, and its predicted definition is verified against the video coding number in the playback log of the known-definition video. If the accuracy of the model's predictions is higher than a preset value, the model is successfully trained.
After repeated rounds of feature collection, data verification and feature weight adjustment, the model is successfully trained once the accuracy of its predictions exceeds the preset value.
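The success criterion described above, accuracy against the labels derived from the playback log exceeding a preset value, can be sketched as follows. The function names and the default threshold of 0.9 are assumptions; the text does not state the preset value.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that equal the labels from the playback log."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equal in length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def training_succeeded(predicted, actual, preset=0.9):
    # preset is a hypothetical threshold; the source only says "preset value".
    return accuracy(predicted, actual) > preset


predicted = ["720p", "720p", "480p", "1080p"]
actual = ["720p", "720p", "480p", "720p"]
print(accuracy(predicted, actual), training_succeeded(predicted, actual))
```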
All video data blocks belonging to the same TCP stream have the same video definition, so calculating the average of the feature parameters over all data blocks of the stream amounts to abstracting the entire TCP stream into a single data block. After the definition of all data blocks has been obtained, a "training result optimization" step is applied, as follows: count the predicted definition category of every data block in a TCP stream, find the category with the largest share, and change the predictions of all other data blocks to that category. In other words, all predictions for a TCP stream are tallied, and a majority-rule mechanism is used to force-correct the predictions.
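The majority-rule correction described above can be sketched with a simple vote count. The function name `majority_correct` is hypothetical, and the text does not say how ties are broken; in this sketch the category that appears first among the tied ones wins, which is the behavior of `Counter.most_common`.

```python
from collections import Counter


def majority_correct(predictions):
    """Replace every per-block prediction in one TCP stream with the majority class."""
    if not predictions:
        return []
    winner, _count = Counter(predictions).most_common(1)[0]
    return [winner] * len(predictions)


stream_predictions = ["720p", "720p", "480p", "720p", "1080p"]
print(majority_correct(stream_predictions))
```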
In Embodiment 1, when the model predicts encrypted traffic video of unknown definition, the model trained on the Android platform is used in a mobile communication network, and the model trained on the PC platform is used in a traditional fixed network environment.
The video data blocks generated by the Tencent Video client on the Android mobile phone platform are, on average, smaller than those generated by the Tencent Video client on the PC platform; the Android platform and the PC platform therefore need to be treated separately for training and prediction. When training the model, traffic can be collected separately for each platform; when using the model for prediction, the target traffic can be filtered in the same way. For example, in a 4G mobile communication network the model trained on the Android platform is used for prediction, and in a traditional fixed network environment the model trained on the PC platform is used, which improves the accuracy of the model's predictions.
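Selecting the model according to the network environment could be expressed as a small lookup, as in the sketch below. The keys "mobile" and "fixed", the function name `select_model` and the placeholder model objects are all assumptions for illustration.

```python
# Mapping from network environment to the platform whose model is used.
PLATFORM_FOR_NETWORK = {
    "mobile": "android",  # e.g. a 4G mobile communication network
    "fixed": "pc",        # traditional fixed network environment
}


def select_model(network_type, models):
    """Return the model trained on the platform that matches the network type."""
    try:
        platform = PLATFORM_FOR_NETWORK[network_type]
    except KeyError:
        raise ValueError(f"unknown network type: {network_type!r}") from None
    return models[platform]


# Placeholder strings stand in for trained model objects.
models = {"android": "model_trained_on_android", "pc": "model_trained_on_pc"}
print(select_model("mobile", models))
```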
In Embodiment 1, a random oversampling method is applied to the sample set to balance the definition categories, and the machine-learning model uses the random forest algorithm with all parameters left at their defaults. Because the definitions are not evenly distributed when the data set is collected, the sample set needs category balancing. Embodiment 1 uses the Random Oversampler random oversampling method for this purpose. Random oversampling is a standard procedure: samples of the minority classes are randomly copied and repeated until every minority class has the same number of samples as the majority class, which yields a new, balanced data set.
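The random oversampling procedure described above can be illustrated with a short standard-library sketch. This is a stand-in written for illustration, not the Random Oversampler implementation named in the text; the function name `random_oversample` and the fixed seed are assumptions.

```python
import random
from collections import defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes are equal in size."""
    if not samples:
        return [], []
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw with replacement from the class's own samples to fill the gap.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_samples.extend(items + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels


samples = [[1], [2], [3], [4], [5]]
labels = ["720p", "720p", "720p", "480p", "1080p"]
balanced_samples, balanced_labels = random_oversample(samples, labels)
print(len(balanced_samples), sorted(set(balanced_labels)))
```

After balancing, each of the three classes in this made-up input has three samples, and every added sample is a copy of an original sample of the same class.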
In Embodiment 1, when the target traffic carrying HTTPS-encrypted network video is filtered out of the network traffic, the SNI field and the character string in the domain name are compared with a preset SNI field and a preset character string. An exact match indicates that the traffic is video traffic in the HLS transmission mode and can serve as an encrypted traffic video file whose definition is to be predicted.
As shown in FIG. 4, in actual use only the TLS-encrypted traffic is available and the plaintext URL is unknown, so the target traffic must first be filtered out of the network traffic, namely the encrypted Tencent Video traffic that uses the HLS transmission mode.
HTTPS is the HTTP protocol carried over the TLS protocol. Among the TLS handshake messages, the ClientHello message contains an extension field called SNI, which indicates the domain name of the server to connect to. The SNI field can be compared with the domain names of the Tencent Video servers to match the corresponding traffic. For Tencent Video servers using the HLS transmission mode, the domain names all follow the format "lts***.qq.com" or "sts***.qq.com", so it is only necessary to compare the SNI field with the string lts or sts. An exact match indicates Tencent Video traffic in the HLS transmission mode, which can serve as an encrypted traffic video file whose definition is to be predicted.
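The SNI comparison described above can be sketched as a prefix and suffix check on the server name. The sketch assumes the SNI value has already been extracted from the ClientHello message and that "***" in the two formats stands for arbitrary characters; the function name `is_hls_tencent_sni` and the host names in the usage lines are made up for illustration.

```python
def is_hls_tencent_sni(sni: str) -> bool:
    """Return True if the server name matches lts***.qq.com or sts***.qq.com."""
    host = sni.strip().lower().rstrip(".")
    # Assumption: "***" is a wildcard, so only the prefix and the suffix are checked.
    return host.startswith(("lts", "sts")) and host.endswith(".qq.com")


# Hypothetical host names, used only to exercise the check.
print(is_hls_tencent_sni("ltsabc.qq.com"))
print(is_hls_tencent_sni("www.qq.com"))
```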
Despite the influence of these factors, practical tests show that identifying definition from video data block information still achieves a high accuracy and is a practical method. Using machine learning, information about the video data blocks is collected and used for training, and the trained model is then applied to predicting the definition of the encrypted traffic video to be tested.
Embodiment 2:
FIG. 5 is a schematic architecture diagram of an apparatus for predicting the definition of encrypted traffic video according to an embodiment of the present invention. The apparatus of this embodiment includes one or more processors 21 and a memory 22. One processor 21 is taken as an example in FIG. 5.
The processor 21 and the memory 22 may be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 5.
As a non-volatile computer-readable storage medium, the memory 22 can be used to store non-volatile software programs and non-volatile computer-executable programs, such as the method for predicting the definition of encrypted traffic video in Embodiment 1. The processor 21 executes the method for predicting the definition of encrypted traffic video by running the non-volatile software programs and instructions stored in the memory 22.
The memory 22 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 22 may optionally include memory located remotely from the processor 21, and such remote memory may be connected to the processor 21 through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The program instructions/modules are stored in the memory 22 and, when executed by the one or more processors 21, perform the method for predicting the definition of encrypted traffic video in Embodiment 1 above, for example, the steps shown in FIG. 1 and FIG. 3 described above.
It should be noted that the information exchange and execution processes between the modules and units of the above apparatus and system are based on the same concept as the method embodiments of the present invention. For details, refer to the description in the method embodiments of the present invention, which is not repeated here.
Those of ordinary skill in the art will understand that all or some of the steps of the methods in the embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which may include read-only memory (ROM, Read Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, and the like.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

  1. A method for predicting the definition of encrypted traffic video, characterized in that the method comprises:
    capturing TCP stream data packets and playback logs of network video playback with HTTPS encrypted traffic;
    labeling the captured encrypted traffic network video with its definition according to the video coding number in the playback log;
    detecting data blocks from the TCP stream data packets;
    extracting, from the data blocks, the features corresponding to the definition and the feature averages, to form a feature set of known data packet definition;
    building a model from the correspondence between the labeled definition of known video files and the data block feature sets, and training the model; after model training is complete, performing feature extraction on the TCP stream data packets of the encrypted traffic video to be tested, and predicting the definition of the encrypted traffic video file to be tested according to the correspondence between feature sets and definition in the model.
  2. The method for predicting the definition of encrypted traffic video according to claim 1, characterized in that collecting the data of network video playback with HTTPS encrypted traffic comprises:
    requesting, in a browser, playback of video with HTTPS encrypted traffic from a database storing video information, and selecting video files of at least two definitions and with differing video content.
  3. The method for predicting the definition of encrypted traffic video according to claim 1, characterized in that each encoding mode corresponds to a unique video definition number in the playback log, and when the browser plays a video, the video coding number in the encrypted traffic network video playback log and the corresponding definition are recorded and collected together.
  4. The method for predicting the definition of encrypted traffic video according to claim 1, characterized in that the data blocks are detected according to the ACK field of TCP packets, specifically comprising:
    determining, for all packets of a TCP stream, whether the stream is an HLS video stream;
    parsing the TLS messages, removing the TLS handshake packets, and retaining the packets that carry data;
    distinguishing uplink packets from downlink packets, and processing the downlink packets;
    classifying the downlink packets by ACK value;
    recording packets with the same ACK value as one data block;
    recording packets whose ACK value has changed as a new data block.
  5. The method for predicting the definition of encrypted traffic video according to claim 4, characterized in that, for the TCP stream file during video transmission, the features corresponding to the definition and the feature averages are extracted from the data blocks, wherein:
    the features include one or more of: data block size, number of data packets, first-byte arrival time, data block download time, data block idle time, data block transmission time and data transmission rate;
    the feature averages include one or more of: average data block size, average number of data packets, average first-byte arrival time, average data block download time, average data block idle time, average data block transmission time and average data transmission rate.
  6. The method for predicting the definition of encrypted traffic video according to claim 5, characterized in that the features of the data block, the feature averages and the data block of the current TCP stream file are combined into a feature set sample of known data packet definition; the model is trained on at least one feature set sample, and its predicted definition is verified against the video coding number in the playback log of the known-definition video; if the accuracy of the model's predictions is higher than a preset value, the model is successfully trained.
  7. The method for predicting the definition of encrypted traffic video according to claim 1, characterized in that, when the model predicts encrypted traffic video of unknown definition, the model trained on the Android platform is used in a mobile communication network, and the model trained on the PC platform is used in a traditional fixed network environment.
  8. The method for predicting the definition of encrypted traffic video according to claim 6, characterized in that a random oversampling method is applied to the sample set to balance the definition categories.
  9. The method for predicting the definition of encrypted traffic video according to claim 1, characterized in that, when the target traffic carrying HTTPS-encrypted network video is filtered out of the network traffic, the SNI field and the character string in the domain name are compared with a preset SNI field and a preset character string, and an exact match indicates that the traffic is video traffic in the HLS transmission mode and can serve as an encrypted traffic video file whose definition is to be predicted.
  10. An apparatus for predicting the definition of encrypted traffic video, characterized in that the apparatus comprises:
    at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being programmed to perform the method for predicting the definition of encrypted traffic video according to any one of claims 1-9.
PCT/CN2021/130890 2020-12-04 2021-11-16 Method and device for predicting definition of video having encrypted traffic WO2022116811A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011397431.8 2020-12-04
CN202011397431.8A CN112203136B (en) 2020-12-04 2020-12-04 Method and device for predicting definition of encrypted flow video

Publications (1)

Publication Number Publication Date
WO2022116811A1 true WO2022116811A1 (en) 2022-06-09

Family

ID=74033677

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/130890 WO2022116811A1 (en) 2020-12-04 2021-11-16 Method and device for predicting definition of video having encrypted traffic

Country Status (2)

Country Link
CN (1) CN112203136B (en)
WO (1) WO2022116811A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115278349A (en) * 2022-07-21 2022-11-01 北京邮电大学 Method for processing dragging watching video under wireless communication environment
CN117240735A (en) * 2023-11-09 2023-12-15 湖南戎腾网络科技有限公司 Method, system, equipment and storage medium for filtering audio and video streams

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112203136B (en) * 2020-12-04 2021-03-30 武汉绿色网络信息服务有限责任公司 Method and device for predicting definition of encrypted flow video
US20220417303A1 (en) * 2021-06-28 2022-12-29 Tencent America LLC Techniques for monitoring encrypted streaming traffic using underlying transport metrics

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107071399A (en) * 2017-04-26 2017-08-18 华为技术有限公司 The method for evaluating quality and device of a kind of encrypted video stream
CN107833214A (en) * 2017-11-03 2018-03-23 北京奇虎科技有限公司 Video definition detection method, device, computing device and computer-readable storage medium
CN107888579A (en) * 2017-11-06 2018-04-06 浙江大学 A kind of mobile video user experience quality index modeling method of non-interfering type
CN108696403A (en) * 2018-03-23 2018-10-23 中国科学技术大学 A kind of encrypted video QoE evaluating methods based on the study of network flow latent structure
CN109905696A (en) * 2019-01-09 2019-06-18 浙江大学 A kind of recognition methods of the Video service Quality of experience based on encryption data on flows
CN110620766A (en) * 2019-09-05 2019-12-27 东南大学 Method for extracting TLS data block in encrypted network flow
CN112203136A (en) * 2020-12-04 2021-01-08 武汉绿色网络信息服务有限责任公司 Method and device for predicting definition of encrypted flow video

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10733457B1 (en) * 2019-03-11 2020-08-04 Wipro Limited Method and system for predicting in real-time one or more potential threats in video surveillance
CN110197234B (en) * 2019-06-13 2020-05-19 四川大学 Encrypted flow classification method based on dual-channel convolutional neural network

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115278349A (en) * 2022-07-21 2022-11-01 北京邮电大学 Method for processing dragging watching video under wireless communication environment
CN115278349B (en) * 2022-07-21 2023-05-23 北京邮电大学 Method for processing drag watching video under wireless communication environment
CN117240735A (en) * 2023-11-09 2023-12-15 湖南戎腾网络科技有限公司 Method, system, equipment and storage medium for filtering audio and video streams
CN117240735B (en) * 2023-11-09 2024-01-19 湖南戎腾网络科技有限公司 Method, system, equipment and storage medium for filtering audio and video streams

Also Published As

Publication number Publication date
CN112203136B (en) 2021-03-30
CN112203136A (en) 2021-01-08

Similar Documents

Publication Publication Date Title
WO2022116811A1 (en) Method and device for predicting definition of video having encrypted traffic
Dimopoulos et al. Measuring video QoE from encrypted traffic
Ameigeiras et al. Analysis and modelling of YouTube traffic
Mangla et al. emimic: Estimating http-based video qoe metrics from encrypted network traffic
Krishnamoorthi et al. BUFFEST: Predicting buffer conditions and real-time requirements of HTTP (S) adaptive streaming clients
US10869067B2 (en) Method for detecting a live adaptive bit rate stream
RU2487484C2 (en) Stream media server, client terminal, method and system for downloading stream media
US20170093648A1 (en) System and method for assessing streaming video quality of experience in the presence of end-to-end encryption
WO2020052110A1 (en) Service quality monitoring method, apparatus, and system
US9781474B2 (en) Content playback information estimation apparatus and method and program
CN112601072B (en) Video service quality assessment method and device
US20200374333A1 (en) Methods and systems for codec detection in video streams
CN106656629B (en) Method for predicting streaming media playing quality
CN109936769B (en) Video jamming detection method, video jamming detection system, mobile terminal and storage device
US10193814B2 (en) Method and apparatus for categorizing a download of a resource
US11743195B2 (en) System and method for monitoring and managing video stream content
TWI455529B (en) A method, a system, a server, a device, a computer program and a computer program product for transmitting data in a computer network
WO2015009828A1 (en) Method and system for detecting live over the top streams
Dubin et al. Video quality representation classification of encrypted http adaptive video streaming
Wu et al. Monitoring video resolution of adaptive encrypted video traffic based on HTTP/2 features
CN113438503A (en) Video file restoration method and device, computer equipment and storage medium
US11196652B2 (en) Transport layer monitoring and performance assessment for OTT services
CN114679606B (en) Video flow identification method, system, electronic equipment and storage medium based on Burst characteristics

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21899845

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE