CN117729192A

CN117729192A - Video data acquisition method, system and electronic equipment

Info

Publication number: CN117729192A
Application number: CN202311714993.4A
Authority: CN
Inventors: 刘辉; 周汉川; 冯建业
Original assignee: Beijing Ruian Technology Co Ltd
Current assignee: Beijing Ruian Technology Co Ltd
Priority date: 2023-12-13
Filing date: 2023-12-13
Publication date: 2024-03-19

Abstract

The invention provides a video data acquisition method, a system and electronic equipment, and relates to the field of data communication.

Description

Video data acquisition method, system and electronic equipment

Technical Field

The present invention relates to the field of data communications, and in particular, to a method, a system, and an electronic device for acquiring video data.

Background

The hypertext transfer protocol (Hypertext Transfer Protocol, HTTP) is currently the most widely used transfer protocol in the internet and can be used for file transfer procedures between clients and servers. When using hypertext transfer protocol for large file transfers such as video transfers, this is typically accomplished through multiple sessions. In the HTTP protocol, each request needs to establish a connection, and each request is encapsulated in an HTTP message, so that multiple HTTP sessions need to be established during the transmission of a video file.

In actual session processing, a specific website or piece of software is usually analyzed manually, and the video data is extracted and processed by sorting out features such as the session identifier and the session range identifier. This approach extracts well and yields files of high integrity, but its analysis time is long, it applies only to limited scenarios, and it consumes considerable manpower.

Disclosure of Invention

Accordingly, the present invention is directed to a video data acquisition method, system and electronic device, which can fully analyze and sort video stream data in hypertext protocol data, and generate recommended data of a multi-session transmission scene to realize a final video data acquisition process, thereby reducing manual participation, improving automation degree and reducing analysis time, and solving the problems in the prior art.

In a first aspect, an embodiment of the present invention provides a video data acquisition method, including:

after the hypertext transfer protocol data is acquired, filtering the request data and the response data in the hypertext transfer protocol data according to a preset filtering rule to obtain filtering data corresponding to the hypertext transfer protocol data; wherein the hypertext transfer protocol data comprises video stream data;

Acquiring header field data and a target address of the hypertext transfer protocol data in the filtering data, grouping the hypertext transfer protocol data based on the header field data and the target address to obtain grouping data, and determining recommended data corresponding to the filtering data according to the grouping data;

after updating the filtering rule according to the characteristic data contained in the recommended data, acquiring video stream data contained in the hypertext transfer protocol data by utilizing the updated filtering rule;

and caching the video stream data based on the characteristic data corresponding to the filtering rule to obtain video data corresponding to the video stream data.

In one embodiment, after obtaining the hypertext transfer protocol data, filtering the request data and the response data in the hypertext transfer protocol data according to a preset filtering rule to obtain filtered data corresponding to the hypertext transfer protocol data, including:

acquiring hypertext transfer protocol data, and analyzing to obtain header field data of the hypertext transfer protocol data;

acquiring request data, response data and video stream data in the hypertext transfer protocol data based on the header field data, and filtering the request data and the response data;

Acquiring a preset matching feature library and a filtering rule corresponding to the matching feature library;

and based on the matching feature library, matching and filtering the video stream data by utilizing a filtering rule to obtain filtering data corresponding to the hypertext transfer protocol data.

In one embodiment, the step of obtaining header field data and a destination address of hypertext transfer protocol data in the filter data, grouping the hypertext transfer protocol data based on the header field data and the destination address to obtain grouping data, and determining recommended data corresponding to the filter data according to the grouping data includes:

acquiring header field type data, header field length data, header field range data and a uniform resource locator corresponding to the hypertext transfer protocol data in the filter data, and determining the header field data according to the header field type data, the header field length data and the header field range data;

acquiring a target address corresponding to the hypertext transfer protocol data in the filtered data, determining key data corresponding to the hypertext transfer protocol data according to the target address, and analyzing value data corresponding to the hypertext transfer protocol data by utilizing the key data;

grouping the hypertext transfer protocol data according to the header field data to obtain grouping data;

And determining recommended data according to the key data, the header field data and the uniform resource locator corresponding to the key data in the grouping data when the key data meets the preset clustering condition.

In one embodiment, the step of determining recommended data according to the key data, the header field data, and the uniform resource locator corresponding to the key data in the packet data when the key data satisfies the preset clustering condition includes:

key data corresponding to each group in the grouping data is obtained, and quintuple information corresponding to the hypertext transfer protocol data is obtained; wherein the quintuple information comprises source IP address information, source port information, destination IP address information, destination port information and transport layer protocol information corresponding to the hypertext transfer protocol data;

if the key data are consistent, clustering the grouping data according to preset clustering conditions, and judging whether five-tuple information contained in the clusters are consistent and whether the value data are consistent;

if so, determining recommended data according to the key data, the header field data and the uniform resource locator.

In one embodiment, the step of obtaining video stream data included in hypertext transfer protocol data using the updated filtering rule after updating the filtering rule according to the feature data included in the recommendation data includes:

Updating a matching feature library corresponding to the filtering rule by utilizing feature data contained in the recommendation data;

and acquiring the updated filtering rule based on the matching feature library, and acquiring video stream data contained in the hypertext transfer protocol data according to the filtering rule.

In one embodiment, the step of updating the matching feature library corresponding to the filtering rule by using feature data included in the recommendation data includes:

acquiring feature data contained in the recommended data, and determining type marks contained in the feature data and corresponding value data thereof;

determining a start value, an end value and length data corresponding to the value data based on the type mark;

and acquiring a matching feature library corresponding to the filtering rule, and updating the matching feature library by using a start value, an end value and length data corresponding to the value data.

In one embodiment, the step of caching video stream data based on feature data corresponding to a filtering rule to obtain video data corresponding to the video stream data includes:

analyzing and obtaining video stream data contained in the characteristic data based on the filtering rule;

judging whether the video stream data is complete or not by utilizing header field range data corresponding to the hypertext transfer protocol data in the filtered data;

And if the video stream data is complete data transmission, the video stream data is subjected to sequencing, de-duplication and splicing and then is cached to generate the video data.

In one embodiment, the steps of sorting, de-duplicating and splicing the video stream data, and then buffering the video stream data to generate video data include:

after sequencing, de-duplication and splicing of the video stream data, quintuple information and a session identifier corresponding to the hypertext transfer protocol data are obtained;

generating query conditions corresponding to the video stream data according to the quintuple information, the session identifier and the matching feature library;

and caching the video stream data by using the query condition to generate the video data.

In a second aspect, embodiments of the present invention provide a video data acquisition system, the system comprising:

the data filtering module is used for filtering the request data and the response data in the hypertext transfer protocol data according to a preset filtering rule after the hypertext transfer protocol data is acquired, so as to obtain filtering data corresponding to the hypertext transfer protocol data; wherein the hypertext transfer protocol data comprises video stream data;

the characteristic analysis module is used for acquiring header field data and a target address of the hypertext transfer protocol data in the filtered data, grouping the hypertext transfer protocol data based on the header field data and the target address to obtain grouping data, and determining recommended data corresponding to the filtered data according to the grouping data;

The data arrangement module is used for acquiring video stream data contained in the hypertext transfer protocol data by utilizing the updated filtering rule after updating the filtering rule according to the characteristic data contained in the recommended data;

and the data processing module is used for caching the video stream data based on the characteristic data corresponding to the filtering rule and obtaining the video data corresponding to the video stream data.

In a third aspect, embodiments of the present invention also provide an electronic device, including a processor and a memory, the memory storing computer-executable instructions executable by the processor, the processor executing the computer-executable instructions to implement the steps of the video data acquisition method provided in the first aspect.

In a fourth aspect, embodiments of the present invention also provide a storage medium storing computer-executable instructions that, when invoked and executed by a processor, cause the processor to implement the steps of the video data acquisition method provided in the first aspect.

In the process of analyzing hypertext transfer data containing video stream data, after the hypertext transfer protocol data is acquired, the request data and response data in the hypertext transfer protocol data are first filtered according to a preset filtering rule to obtain filtered data corresponding to the hypertext transfer protocol data, wherein the hypertext transfer protocol data comprises video stream data. Then, header field data and a target address of the hypertext transfer protocol data in the filtered data are obtained, the hypertext transfer protocol data are grouped based on the header field data and the target address to obtain grouping data, and recommendation data corresponding to the filtered data are determined according to the grouping data. Next, after the filtering rule is updated according to the characteristic data contained in the recommended data, the video stream data contained in the hypertext transfer protocol data is acquired by utilizing the updated filtering rule. Finally, the video stream data is cached based on the characteristic data corresponding to the filtering rule to obtain the video data corresponding to the video stream data. The scheme can fully analyze and sort the video stream data in the hypertext protocol data, and realizes the final video data acquisition process by generating recommended data for multi-session transmission scenarios, so that manual participation is reduced, the degree of automation is improved, the analysis time is shortened, and the problems in the prior art are solved.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

In order to make the above objects, features and advantages of the present invention more comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

Fig. 1 is a flowchart of a video data acquisition method according to an embodiment of the present invention;

fig. 2 is a flowchart of step S101 in a video data acquisition method according to an embodiment of the present invention;

fig. 3 is a flowchart of step S102 in a video data acquisition method according to an embodiment of the present invention;

Fig. 4 is a flowchart of step S304 in a video data acquisition method according to an embodiment of the present invention;

fig. 5 is a flowchart of step S103 in a video data acquisition method according to an embodiment of the present invention;

fig. 6 is a flowchart of step S501 in a video data acquisition method according to an embodiment of the present invention;

fig. 7 is a flowchart of step S104 in a video data acquisition method according to an embodiment of the present invention;

fig. 8 is a flowchart of step S703 in a video data acquisition method according to an embodiment of the present invention;

FIG. 9 is a flowchart of another video data acquisition method according to an embodiment of the present invention;

fig. 10 is a schematic structural diagram of a video data acquisition system according to an embodiment of the present invention;

fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Reference numerals:

1010-data filtering module; 1020-feature analysis module; 1030-data arrangement module; 1040-data processing module;

101-processor; 102-memory; 103-bus; 104-communication interface.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described in conjunction with the embodiments, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Specifically, in the process of carrying out multi-session transmission on HTTP video data, the method is mainly realized through the following steps:

1. The client breaks the large attachment into a plurality of small data blocks and creates a unique identifier for each data block.

2. The client creates an HTTP request for each data block and encapsulates the request in an HTTP message.

3. The client sends each HTTP message to the server and adds a session identifier to the request header to mark that the request belongs to the same session.

4. After receiving the HTTP message, the server connects each request according to the session identifier and splices the content of each request into a complete large attachment.

5. The server processes the large attachment, for example by storing or transmitting it.

It should be noted that when multiple HTTP sessions are established, reliability of each session connection needs to be guaranteed, and integrity of each request often needs to be guaranteed. Therefore, attention is also required to be paid to the lost and repeated transmission of the communication message of the data block, so that the situation of data errors is avoided.
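The five steps above, together with the note on lost and repeated blocks, can be illustrated with a minimal in-memory sketch. It is not taken from the embodiment: the names `Block`, `split_into_blocks` and `reassemble` are hypothetical, and using the byte offset as the unique identifier of each block is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Block:
    session_id: str   # marks which transfer the block belongs to (hypothetical field)
    offset: int       # byte offset of the block; assumed to serve as its unique identifier
    payload: bytes


def split_into_blocks(data: bytes, session_id: str, block_size: int) -> List[Block]:
    """Client side: break a large attachment into fixed-size blocks."""
    return [
        Block(session_id, off, data[off:off + block_size])
        for off in range(0, len(data), block_size)
    ]


def reassemble(blocks: Iterable[Block], session_id: str) -> bytes:
    """Server side: join the blocks of one session, tolerating reordering and repeats."""
    unique: Dict[int, bytes] = {}
    for b in blocks:
        if b.session_id == session_id:
            unique.setdefault(b.offset, b.payload)  # ignore retransmitted duplicates
    out = bytearray()
    for off in sorted(unique):
        if off != len(out):
            # A block before this offset was lost. A missing tail cannot be
            # detected here without also knowing the total length.
            raise ValueError(f"missing data before offset {off}")
        out += unique[off]
    return bytes(out)
```

In this sketch a lost block surfaces as an error rather than as silently corrupted output, which is the kind of data error the note above warns about.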

In actual session processing, a specific website or piece of software is usually analyzed manually, and the video data is extracted and processed by sorting out features such as the host, uri, session identifier and session range identifier. This approach extracts well and yields files of high integrity, but its analysis time is long, it applies only to limited scenarios, and it consumes considerable manpower. Based on the above, the embodiment of the invention provides a video data acquisition method, a system and electronic equipment, which can fully analyze and sort video stream data in hypertext protocol data and realize the final video data acquisition process by generating recommended data for multi-session transmission scenarios, thereby reducing manual participation, improving the degree of automation and reducing the analysis time.

For the sake of understanding the present embodiment, first, a method for acquiring video data disclosed in the present embodiment is described in detail, where the method is shown in fig. 1, and includes:

Step S101, after obtaining the hypertext transfer protocol data, filtering the request data and the response data in the hypertext transfer protocol data according to a preset filtering rule to obtain filtering data corresponding to the hypertext transfer protocol data; the hypertext transfer protocol data includes video stream data.

In plain terms, this is a data filtering step: a preset filtering rule is used to filter a large amount of hypertext transfer protocol data, and the result is used in the subsequent data feature collection and processing. The filtering process includes parsing the hypertext transfer protocol; after the request data and response data are obtained by parsing, they are filtered using the filtering rule to obtain the corresponding filtered data.

Step S102, header field data and a target address of the hypertext transfer protocol data in the filtered data are obtained, the hypertext transfer protocol data are grouped based on the header field data and the target address to obtain grouping data, and recommendation data corresponding to the filtered data are determined according to the grouping data.

In plain terms, this is a data feature collection step: recommendation data is formed by feature analysis of the hypertext transfer protocol data contained in the filtered data. In the process, the hypertext transfer protocol data is grouped based on the header field data and the target address, and the recommendation data corresponding to the filtered data is determined according to the grouping.

Step S103, after updating the filtering rule according to the characteristic data contained in the recommended data, the video stream data contained in the hypertext transfer protocol data is obtained by utilizing the updated filtering rule.

This is a recommendation data sorting step: after the filtering rule is updated with the feature data in the recommended data, the updated filtering rule is used to analyze and sort the session-related features, so that the feature data of different scenarios are organized and recorded, and the video stream data contained in the hypertext transfer protocol data is finally obtained.

Step S104, caching the video stream data based on the characteristic data corresponding to the filtering rule, and obtaining the video data corresponding to the video stream data.

This is a real-time video data processing step: the final video data is obtained by caching the video stream data and reassembling its packets. In practice, the HTTP session data containing the video attachment is cached and its packets are reassembled to obtain the final video data.

In one embodiment, after obtaining the hypertext transfer protocol data, filtering the request data and the response data in the hypertext transfer protocol data according to a preset filtering rule to obtain filtered data corresponding to the hypertext transfer protocol data, as shown in fig. 2, the step S101 includes:

Step S201, obtaining hypertext transfer protocol data, analyzing and obtaining header field data of the hypertext transfer protocol data;

step S202, request data, response data and video stream data in the hypertext transfer protocol data are acquired based on the header field data, and the request data and the response data are filtered;

step S203, a preset matching feature library is obtained, and a filtering rule corresponding to the matching feature library is obtained;

step S204, based on the matching feature library, the filtering rules are utilized to carry out matching filtering on the video stream data, and then the filtering data corresponding to the hypertext transfer protocol data is obtained.

In the data filtering process, the header field data corresponding to the HTTP data is parsed first, and the request data, the response data and the video stream data are then determined according to the header field data. Next, the corresponding filtering rule is acquired using the preset matching feature library, and the video stream data is matched and filtered based on the matching feature library. Specifically, after massive HTTP data is accessed, the HTTP header field data is parsed first, the video stream data is filtered out according to the corresponding Content-Type content, and the video stream data is output to a dedicated data feature collection component for feature collection. In this process only the HTTP request and response packets are filtered and the video stream data itself is not downloaded, which improves efficiency. The data is then matched against the matching feature library and filtered, and the filtered video stream data is fed into a dedicated real-time data processing component for subsequent processing.
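A minimal sketch of this header-only filtering might look as follows. The message layout (a dict with `host` and `headers` keys), the contents of `FEATURE_LIBRARY`, and the routing labels `"realtime"` and `"collect"` are all assumptions introduced for illustration; the embodiment does not specify them.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple

# Hypothetical matching feature library: hosts already known to carry
# multi-session video transfers.
FEATURE_LIBRARY: Set[str] = {"video.example.com"}


def is_video(headers: Dict[str, str]) -> bool:
    """Return True when the Content-Type header announces a video payload."""
    for name, value in headers.items():
        if name.lower() == "content-type":
            media_type = value.split(";")[0].strip().lower()
            return media_type.startswith("video/")
    return False


def filter_messages(messages: Iterable[dict]) -> Iterator[Tuple[str, dict]]:
    """Yield (destination component, message) pairs for video messages only.

    Only the headers are inspected; the entity body is never read here.
    """
    for msg in messages:
        if not is_video(msg["headers"]):
            continue
        # Assumed routing: hosts in the library go to real-time processing,
        # everything else goes to feature collection.
        target = "realtime" if msg["host"] in FEATURE_LIBRARY else "collect"
        yield target, msg
```

Keeping the decision on headers alone reflects the efficiency point made above: nothing in the filter needs the video payload itself.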

In one embodiment, the step S102 of obtaining header field data and a destination address of hypertext transfer protocol data in the filtered data, grouping the hypertext transfer protocol data based on the header field data and the destination address to obtain grouping data, and determining recommended data corresponding to the filtered data according to the grouping data, as shown in fig. 3, includes:

step S301, head domain type data, head domain length data, head domain range data and uniform resource locators corresponding to the hypertext transfer protocol data in the filtered data are obtained, and the head domain data are determined according to the head domain type data, the head domain length data and the head domain range data;

step S302, a target address corresponding to the hypertext transfer protocol data in the filtered data is obtained, key data corresponding to the hypertext transfer protocol data is determined according to the target address, and value data corresponding to the hypertext transfer protocol data is analyzed by utilizing the key data;

step S303, grouping the hypertext transfer protocol data according to the header field data to obtain grouping data;

step S304, according to the key data, the header field data and the uniform resource locator corresponding to the key data in the grouping data meeting the preset clustering condition, the recommendation data is determined.

In the data feature collection step, the header field type data (Content-Type), header field length data (Content-Length), header field range data (Content-Range) and uniform resource locator (URL) corresponding to the hypertext transfer protocol data in the filtered data are acquired first. After Content-Type, Content-Length and Content-Range are determined as the header field data, the HTTP data are grouped according to the target address (host), and key-value pair extraction is then performed on the URL, so that recommended data are formed.

In one embodiment, the step S304 of determining recommended data according to the key data, header field data and uniform resource locator corresponding to the key data in the grouping data when the key data satisfies the preset clustering condition, as shown in fig. 4, includes:

step S401, key data corresponding to each group in the grouping data is obtained, and quintuple information corresponding to the hypertext transfer protocol data is obtained; wherein the quintuple information comprises source IP address information, source port information, destination IP address information, destination port information and transport layer protocol information corresponding to the hypertext transfer protocol data;

step S402, if the key data are consistent, clustering the grouping data according to preset clustering conditions, and judging whether five-tuple information contained in the clusters are consistent and whether value data are consistent;

Step S403, if yes, determining recommended data according to the key data, the header field data and the uniform resource locator.

Specifically, the accumulated parsed HTTP data is first grouped by host, and the records in which the same key data appear together within a group are clustered. Each cluster is then evaluated using the five-tuple information and the value data: if multiple sessions in one cluster repeatedly carry the same value and their five-tuple information is consistent, HTTP multi-session transmission is present, and the header field data (host), the uniform resource locator (uri) and the key data (key) are then taken as recommended data.
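One possible reading of this grouping and clustering logic is sketched below. The `Record` type, the `recommend` function, the `min_repeats` threshold, and the choice to cluster on the exact set of query keys are assumptions made for illustration; the embodiment leaves the preset clustering condition open.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set, Tuple
from urllib.parse import parse_qsl, urlsplit

# (source IP, source port, destination IP, destination port, transport protocol)
FiveTuple = Tuple[str, int, str, int, str]


class Record(NamedTuple):
    host: str
    uri: str
    five_tuple: FiveTuple


def recommend(records: List[Record], min_repeats: int = 2) -> Set[Tuple[str, str, str]]:
    """Return (host, uri path, key) triples that look like multi-session transfers."""
    # Group by host, then cluster records that carry the same set of query keys.
    clusters: Dict[tuple, List[Tuple[Record, Dict[str, str]]]] = defaultdict(list)
    for rec in records:
        params = dict(parse_qsl(urlsplit(rec.uri).query))
        clusters[(rec.host, frozenset(params))].append((rec, params))

    result: Set[Tuple[str, str, str]] = set()
    for (host, keys), members in clusters.items():
        for key in keys:
            # Collect the records sharing each value of this key.
            by_value: Dict[str, List[Record]] = defaultdict(list)
            for rec, params in members:
                by_value[params[key]].append(rec)
            for recs in by_value.values():
                same_tuple = len({r.five_tuple for r in recs}) == 1
                # Same value seen repeatedly with a consistent five-tuple.
                if len(recs) >= min_repeats and same_tuple:
                    result.add((host, urlsplit(recs[0].uri).path, key))
    return result
```

Under these assumptions, a key whose value stays fixed across several requests (such as an authorization key) is recommended, while keys whose values change per request (such as range offsets) are not.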

In one embodiment, after updating the filtering rule according to the feature data included in the recommendation data, step S103 of acquiring video stream data included in the hypertext transfer protocol data by using the updated filtering rule, as shown in fig. 5, includes:

step S501, updating a matching feature library corresponding to the filtering rule by utilizing feature data contained in the recommended data;

step S502, the updated filtering rule is obtained based on the matching feature library, and video stream data contained in the hypertext transfer protocol data is obtained according to the filtering rule.

In the recommended data sorting step, the recommended data is examined and assessed to quickly determine whether it belongs to a scenario of multi-session transmission of large attachment data. Specifically, the matching feature library corresponding to the filtering rule is updated with the feature data contained in the recommendation data, thereby organizing the library; the matching feature library may include matching features (host, uri, key and the like), session identification features and session range features.

In one embodiment, the step S501 of updating the matching feature library corresponding to the filtering rule by using the feature data included in the recommendation data, as shown in fig. 6, includes:

step S601, obtaining feature data contained in the recommended data, and determining type marks contained in the feature data and corresponding value data thereof;

step S602, determining a start value, an end value and length data corresponding to the value data based on the type mark;

step S603, a matching feature library corresponding to the filtering rule is obtained, and the matching feature library is updated by using a start value, an end value and length data corresponding to the value data.

The session features are classified and sorted, the features of different scenarios are organized and recorded, and the extracted value data (value) is converted into a general start value, end value and length data. In actual implementation, only the session range entries of the matching feature library need to be updated periodically; when a new scenario appears, only its extraction code is added, so the sorting, de-duplication and splicing code can remain generic.
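The conversion of scenario-specific values into a general start value, end value and length can be sketched as a small dispatch table keyed by type mark. The type mark names `"content-range"` and `"start-end"`, the `SessionRange` type, and the treatment of the end value as inclusive are assumptions for illustration only.

```python
import re
from typing import Callable, Dict, NamedTuple, Optional


class SessionRange(NamedTuple):
    start: int
    end: int      # assumed inclusive
    length: int   # number of bytes covered by this session


_CONTENT_RANGE = re.compile(r"bytes\s+(\d+)-(\d+)/(\d+|\*)")


def from_content_range(value: str) -> Optional[SessionRange]:
    """Parse a Content-Range header value such as 'bytes 0-1023/4096'."""
    m = _CONTENT_RANGE.fullmatch(value.strip())
    if m is None:
        return None
    start, end = int(m.group(1)), int(m.group(2))
    return SessionRange(start, end, end - start + 1)


def from_query_params(params: Dict[str, str]) -> Optional[SessionRange]:
    """Read 'start' and 'end' keys extracted from a URI query string."""
    try:
        start, end = int(params["start"]), int(params["end"])
    except (KeyError, ValueError):
        return None
    return SessionRange(start, end, end - start + 1)


# One extractor per type mark; supporting a new scenario adds one entry here.
EXTRACTORS: Dict[str, Callable] = {
    "content-range": from_content_range,
    "start-end": from_query_params,
}


def normalize(type_mark: str, value) -> Optional[SessionRange]:
    """Convert a scenario-specific value into the general range form."""
    extractor = EXTRACTORS.get(type_mark)
    return extractor(value) if extractor else None
```

Because every extractor returns the same `SessionRange` shape, downstream sorting, de-duplication and splicing code does not need to know which scenario produced a range.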

In one embodiment, the step S104 of caching the video stream data based on the feature data corresponding to the filtering rule to obtain the video data corresponding to the video stream data, as shown in fig. 7, includes:

Step S701, analyzing and obtaining video stream data contained in the characteristic data based on the filtering rule;

step S702, judging whether the video stream data is complete or not by utilizing header field range data corresponding to the hypertext transfer protocol data in the filtered data;

in step S703, if the video stream data is complete data transmission, the video stream data is sequenced, de-duplicated and spliced and then buffered to generate video data.

The real-time video data processing step comprises three processes: data parsing, data splicing and data caching. Specifically, the data parsing process parses the HTTP data, extracts the video stream data contained in the feature data, and determines whether the current session is complete according to the Content-Length of the current header field data. In the data splicing process, each time new data is parsed, a query is made using the five-tuple information, matching features and session identification value, and all session range values are retrieved for calculation. If the attachment is judged complete, all attachment message data are taken out, sorted, de-duplicated, spliced and finally output; if not, the data is cached and waiting continues.
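The completeness check followed by sorting, de-duplication and splicing can be sketched as below. The `Segment` layout and the function name `splice_if_complete` are hypothetical, and the total attachment length is assumed to be known in advance (for example from the total given in a Content-Range header).

```python
from typing import Dict, List, Optional, Tuple

Segment = Tuple[int, bytes]  # (start offset, payload)


def splice_if_complete(segments: List[Segment], total_length: int) -> Optional[bytes]:
    """Return the full attachment when segments cover [0, total_length), else None."""
    # De-duplicate by start offset, keeping the longest payload seen per offset.
    longest: Dict[int, bytes] = {}
    for start, payload in segments:
        if len(payload) > len(longest.get(start, b"")):
            longest[start] = payload

    out = bytearray()
    for start in sorted(longest):        # sort by offset
        if start > len(out):             # gap: some session has not arrived yet
            return None
        overlap = len(out) - start       # bytes already written by earlier segments
        out += longest[start][overlap:]  # splice, skipping the overlapping bytes
    return bytes(out) if len(out) == total_length else None
```

A `None` result corresponds to the "cache the data and continue waiting" branch above; the caller would retry once more segments have been stored.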

In one embodiment, the step S703 of generating video data by sorting, de-duplicating and splicing video stream data and then buffering the video stream data includes, as shown in fig. 8:

Step S801, after ordering, de-duplication and splicing video stream data, quintuple information and session identifier corresponding to hypertext transfer protocol data are obtained;

step S802, generating query conditions corresponding to video stream data according to quintuple information, a session identifier and a matching feature library;

step S803, the video stream data is cached by using the query condition, and the video data is generated.

The data caching process caches the parsed data using the five-tuple information, matching features and session identification value as the query condition; the cached content is the attachment messages and the session range values. An upper limit is configured for the cache timeout and for the cached message size; data whose cache time has expired is discarded, and an error log is generated for troubleshooting at the front end.
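A cache of this kind might be sketched as follows. The class name `SessionCache`, its methods, and the decision to drop a whole transfer when the size limit is exceeded are assumptions for illustration; the embodiment only states that timeout and size limits are configured and that timed-out data is discarded with an error log.

```python
import logging
import time
from typing import Callable, Dict, Hashable, List, Tuple

log = logging.getLogger("video_cache")


class SessionCache:
    """Cache of partial transfers, keyed by a query condition.

    The key is assumed to be (five-tuple, matching feature, session identifier);
    each cached entry is a (range start, attachment payload) pair.
    """

    def __init__(self, timeout_s: float, max_bytes: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self.max_bytes = max_bytes
        self._clock = clock
        # key -> (time first seen, bytes cached so far, cached entries)
        self._entries: Dict[Hashable, Tuple[float, int, List[Tuple[int, bytes]]]] = {}

    def add(self, key: Hashable, start: int, payload: bytes) -> bool:
        """Cache one entry; return False if the size limit forced a drop."""
        self.evict_expired()
        first_seen, size, parts = self._entries.get(key, (self._clock(), 0, []))
        if size + len(payload) > self.max_bytes:
            # Assumed policy: exceeding the size limit discards the transfer.
            log.error("cache size limit exceeded for %r; dropping transfer", key)
            self._entries.pop(key, None)
            return False
        parts.append((start, payload))
        self._entries[key] = (first_seen, size + len(payload), parts)
        return True

    def get(self, key: Hashable) -> List[Tuple[int, bytes]]:
        """Return a copy of the entries cached under the query condition."""
        return list(self._entries.get(key, (0.0, 0, []))[2])

    def evict_expired(self) -> List[Hashable]:
        """Discard transfers older than the timeout and log an error for each."""
        now = self._clock()
        expired = [k for k, (t, _, _) in self._entries.items()
                   if now - t > self.timeout_s]
        for k in expired:
            log.error("cache timeout for %r; discarding buffered data", k)
            del self._entries[k]
        return expired
```

The injectable `clock` parameter is a design convenience so that timeout behavior can be exercised without waiting in real time.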

As shown in the flowchart of another video data acquisition method in fig. 9, the method corresponds to four steps, and is specifically as follows:

A traffic filtering step: massive HTTP traffic is accessed and the HTTP header field information is parsed. Video traffic is filtered out according to the Content-Type content and output to the data feature collection component; only the HTTP request and response packets are filtered and the entity data stream is not downloaded, which improves processing efficiency. The traffic is also matched against the compiled matching feature library, and the specific video traffic filtered out in this way is fed to the real-time data processing component.
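The Content-Type based filtering in this step might look like the following sketch. The helper names `parse_headers` and `is_video`, and the rule that a media type beginning with "video/" counts as video traffic, are assumptions for illustration; matching against the feature library is not shown.

```python
def parse_headers(raw: bytes) -> dict:
    """Parse only the header section of a raw HTTP message.

    The entity body after the blank line is ignored, in line with the idea
    of filtering on headers without downloading the entity data stream.
    """
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    headers = {}
    for line in head.split("\r\n")[1:]:  # skip the request/status line
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def is_video(headers: dict) -> bool:
    """Assumed rule: a Content-Type starting with 'video/' is video traffic."""
    return headers.get("content-type", "").lower().startswith("video/")


raw = (b"HTTP/1.1 206 Partial Content\r\n"
       b"Content-Type: video/mp4\r\n"
       b"Content-Length: 4\r\n\r\nabcd")
print(is_video(parse_headers(raw)))
```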

A data feature collection step: the parsing process of the filtered HTTP data is optimized by adding content extraction for the attachment-related header fields (Content-Type, Content-Length, Content-Range). The program groups the HTTP data by host (or destination IP), performs key-value extraction on the URI, and caches the data. The main extraction modes are as follows:

"/1ea3473255bc078c/fe40862b5323a5fd1ea3473255bc078c-4ad5ec65a154da62cee38ae0a7d5ec9d-720p.mp4?"; in the URI format of the HTTP protocol, the part between the last '/' and the '?' is defined as the file name part, which is "fe40862b5323a5fd1ea3473255bc078c-4ad5ec65a154da62cee38ae0a7d5ec9d-720p.mp4" in the sample and is one of the values to be extracted.

"auth_key=1693641864-373804353-0-031c8da432bd0f4&clientCacheKey=e04a1a8ede0625ef3ae&tt=ZHIZHENQINGLIU_H264_720P&start=1541&end=1050116"; data in this format is subjected to key-value processing, and the extraction results are shown in the following table:

Key	Value
auth_key	1693641864-373804353-0-031c8da432bd0f4
clientCacheKey	e04a1a8ede0625ef3ae
tt	ZHIZHENQINGLIU_H264_720P
start	1541
end	1050116
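The two extraction modes above (file name part and key-value pairs) can be sketched with the Python standard library. The function names are hypothetical, and the sample inputs are the URI path from the description and the values listed in the table.

```python
from urllib.parse import parse_qsl, urlsplit


def file_name_from_uri(uri: str) -> str:
    """Return the part between the last '/' and the '?' of a URI."""
    path = urlsplit(uri).path
    return path.rsplit("/", 1)[-1]


def extract_key_values(query: str) -> dict:
    """Split a query string into key-value pairs, keeping empty values."""
    return dict(parse_qsl(query, keep_blank_values=True))


uri = ("/1ea3473255bc078c/fe40862b5323a5fd1ea3473255bc078c-"
       "4ad5ec65a154da62cee38ae0a7d5ec9d-720p.mp4?")
query = ("auth_key=1693641864-373804353-0-031c8da432bd0f4"
         "&clientCacheKey=e04a1a8ede0625ef3ae"
         "&tt=ZHIZHENQINGLIU_H264_720P&start=1541&end=1050116")

print(file_name_from_uri(uri))
print(extract_key_values(query))
```

All values are kept as strings at this stage; converting `start` and `end` to integers would be left to the later session range handling.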

The data parsed from HTTP is accumulated and grouped by host, and within each group the data in which the same key appears at the same time is clustered. A judgment is then made using the five-tuple information and the values: if multiple sessions in one cluster repeatedly carry the same value and their five-tuple information is consistent, this indicates that HTTP split-session transmission exists; the host, uri and key are made into recommended data, and sample data is retained.
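One possible reading of this grouping and clustering rule is sketched below. The record layout (dictionaries with `host`, `uri`, `five_tuple` and `params`), the function name `recommend` and the threshold `min_sessions` are assumptions for illustration and are not taken from the embodiment.

```python
from collections import defaultdict


def recommend(records, min_sessions=2):
    """Flag (host, key) pairs that look like split-session transmission.

    Sessions are clustered by host, key, value and five-tuple; a cluster
    with at least `min_sessions` sessions yields one recommendation that
    keeps the matching sessions as sample data.
    """
    clusters = defaultdict(list)
    for rec in records:
        for key, value in rec["params"].items():
            cluster_id = (rec["host"], key, value, rec["five_tuple"])
            clusters[cluster_id].append(rec)

    recommendations = []
    for (host, key, _value, _five_tuple), recs in clusters.items():
        if len(recs) >= min_sessions:
            recommendations.append({
                "host": host,
                "key": key,
                "uris": sorted({r["uri"] for r in recs}),
                "samples": recs,
            })
    return recommendations
```

In this sketch a key such as `auth_key`, whose value repeats across sessions, is recommended as a candidate session identifier, while keys such as `start`, whose values differ per session, are not.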

The recommended data is reviewed manually at regular intervals, which makes it possible to determine quickly whether it belongs to a scenario of multi-session transmission of large attachment data, and a data feature library is compiled. The data feature library comprises matching features (host, uri, key and the like), session identification features and session range features.

The session features are classified and organized, and the features of different scenarios are compiled and recorded. The extracted values are converted into a universal start value, end value and total length, so that only the session range feature library file needs to be updated regularly, with extraction code added when a new scenario appears; in this way, universal sorting, de-duplication and splicing code can be realized. The following table lists several common session range features and their judgment bases; in actual processing, the data still needs to be analyzed:

A real-time data processing step, which comprises three stages. Data parsing: the HTTP data is parsed, the attachment of the current session is extracted, and whether the current session is complete is judged according to the Content-Length of the current header field. Data splicing: each time a new piece of data is parsed, a query is made using the five-tuple information, the matching features and the session identification value, and all session range values are obtained for calculation; if the attachment is judged to be complete, all attachment message data are taken out, sorted, de-duplicated and spliced, and finally output; if not, the data is cached and the process continues to wait. Data caching: the attachment messages and session range values of the parsed data are cached, using the five-tuple information, the matching features and the session identification value as query conditions; an upper limit is configured for the cache timeout and for the cached message size, data whose cache time has expired is discarded, and an error log is generated for front-end troubleshooting.
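The data splicing stage can be sketched as follows, assuming each session's range has already been normalized to an inclusive start value, end value and total length. The Content-Range parser covers only the standard "bytes start-end/total" form; the names `parse_content_range` and `splice` are hypothetical.

```python
import re

_CONTENT_RANGE = re.compile(r"bytes\s+(\d+)-(\d+)/(\d+)")


def parse_content_range(value: str):
    """Parse 'bytes start-end/total' into (start, end, total), else None."""
    m = _CONTENT_RANGE.fullmatch(value.strip())
    if not m:
        return None
    return int(m.group(1)), int(m.group(2)), int(m.group(3))


def splice(fragments, total):
    """Sort, de-duplicate and splice (start, end, payload) fragments.

    `end` is inclusive. Returns the assembled bytes when the fragments
    cover [0, total) without gaps, otherwise None (keep waiting).
    """
    out = bytearray()
    # Sort by start; for equal starts, take the longest fragment first.
    for start, end, payload in sorted(fragments, key=lambda f: (f[0], -f[1])):
        if len(payload) != end - start + 1:
            raise ValueError("payload length does not match its range")
        if start > len(out):
            return None          # gap: a fragment is still missing
        if end < len(out):
            continue             # duplicate or fully covered fragment
        out += payload[len(out) - start:]  # append only the new bytes
    return bytes(out) if len(out) == total else None


start, end, total = parse_content_range("bytes 0-3/8")
frags = [(4, 7, b"efgh"), (start, end, b"abcd"), (0, 3, b"abcd")]
print(splice(frags, total))
```

Overlapping fragments are handled by appending only the bytes beyond what has already been assembled, which is one way to realize de-duplication; other scenarios may require a different rule.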

When the mass traffic is too large, part of the traffic may first be filtered to the data feature collection component; after part of the matching feature library has been generated within a specified time threshold, that part of the traffic is filtered to the data processing component. Once the processing pressure on the data feature collection component has dropped, new traffic is received; the whole process runs continuously and new features are continuously generated. A website host can also be configured independently, so that the video data of a specific website is filtered, characterized and organized, which reduces the manual workload.

The video data acquisition method in the embodiment of the invention can fully analyze and sort the video stream data in the hypertext protocol data, and the final video data acquisition process is realized by generating the recommended data of the multi-session transmission scene, so that the manual participation degree is reduced, the automation degree is improved, and the analysis time is reduced.

For the video data acquisition method provided in the foregoing embodiment, an embodiment of the present invention provides a video data acquisition system, as shown in fig. 10, including:

the data filtering module 1010 is configured to perform filtering processing on the request data and the response data in the hypertext transfer protocol data according to a preset filtering rule after the hypertext transfer protocol data is acquired, so as to obtain filtering data corresponding to the hypertext transfer protocol data; wherein the hypertext transfer protocol data comprises video stream data;

The feature analysis module 1020 is configured to obtain header field data and a target address of the hypertext transfer protocol data in the filtered data, group the hypertext transfer protocol data based on the header field data and the target address to obtain group data, and determine recommended data corresponding to the filtered data according to the group data;

the data sorting module 1030 is configured to obtain video stream data included in the hypertext transfer protocol data by using the updated filtering rule after updating the filtering rule according to the feature data included in the recommended data;

the data processing module 1040 is configured to cache the video stream data based on the feature data corresponding to the filtering rule, and obtain video data corresponding to the video stream data.

The video data acquisition system mentioned in the above embodiment can fully analyze and sort the video stream data in the hypertext protocol data, and can realize the final video data acquisition process by generating the recommended data of the multi-session transmission scene, thereby reducing the manual participation degree, improving the automation degree and reducing the analysis time.

The implementation principle and the technical effects of the video data acquisition system provided by the embodiment of the present invention are the same as those of the foregoing embodiment of the video data acquisition method. For brevity, where the system embodiment does not mention something, reference may be made to the corresponding content in the embodiment of the video data acquisition method.

The embodiment also provides an electronic device, the structural schematic diagram of which is shown in fig. 11, the device includes a processor 101 and a memory 102; the memory 102 is used for storing one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the steps of the video data acquisition method described above.

The electronic device shown in fig. 11 further comprises a bus 103 and a communication interface 104, the processor 101, the communication interface 104 and the memory 102 being connected by the bus 103.

The memory 102 may include a high-speed random access memory (Random Access Memory, RAM), and may further include a non-volatile memory, such as at least one magnetic disk memory. The bus 103 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, etc. For ease of illustration, only one bi-directional arrow is shown in FIG. 11, but this does not mean that there is only one bus or one type of bus.

The communication interface 104 is configured to connect with at least one user terminal and other network units through a network interface, and send the encapsulated IPv4 message or the IPv4 message to the user terminal through the network interface.

The processor 101 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 101 or by instructions in the form of software. The processor 101 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU for short), a network processor (Network Processor, NP for short), etc.; it may also be a digital signal processor (Digital Signal Processor, DSP for short), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), a field-programmable gate array (Field-Programmable Gate Array, FPGA for short) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present disclosure. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present disclosure may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory 102; the processor 101 reads the information in the memory 102 and, in combination with its hardware, performs the steps of the method of the foregoing embodiment.

The embodiment of the present invention also provides a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the video data acquisition method in the foregoing embodiment.

In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be other manners of division in actual implementation, and for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some communication interface, indirect coupling or communication connection of devices or units, electrical, mechanical, or other form.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.

Finally, it should be noted that the above examples are only specific embodiments of the present invention and are not intended to limit its scope, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing examples, those skilled in the art should understand that any person skilled in the art may, within the technical scope disclosed by the present invention, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A method of video data acquisition, the method comprising:

after updating the filtering rule according to the characteristic data contained in the recommended data, acquiring the video stream data contained in the hypertext transfer protocol data by utilizing the updated filtering rule;

2. The method for obtaining video data according to claim 1, wherein the step of obtaining the filtered data corresponding to the hypertext transfer protocol data by filtering the request data and the response data in the hypertext transfer protocol data according to a preset filtering rule after obtaining the hypertext transfer protocol data comprises the steps of:

Acquiring the hypertext transfer protocol data, and analyzing to obtain header field data of the hypertext transfer protocol data;

acquiring a preset matching feature library, and acquiring the filtering rule corresponding to the matching feature library;

and based on the matching feature library, matching and filtering the video stream data by utilizing the filtering rule to obtain the filtering data corresponding to the hypertext transfer protocol data.

3. The video data acquisition method according to claim 1, wherein the step of acquiring header field data and a destination address of the hypertext transfer protocol data in the filter data, grouping the hypertext transfer protocol data based on the header field data and the destination address to obtain grouping data, and determining recommended data corresponding to the filter data based on the grouping data comprises:

acquiring header field type data, header field length data, header field range data and a uniform resource locator corresponding to the hypertext transfer protocol data in the filtering data, and determining the header field data according to the header field type data, the header field length data and the header field range data;

Acquiring a target address corresponding to the hypertext transfer protocol data in the filtering data, determining key data corresponding to the hypertext transfer protocol data according to the target address, and analyzing value data corresponding to the hypertext transfer protocol data by utilizing the key data;

and determining the recommended data according to the key data, the header field data and the uniform resource locator corresponding to the key data when the key data in the grouping data meets the preset clustering condition.

4. The video data acquisition method according to claim 3, wherein the step of determining the recommended data according to the key data, the header field data, and the uniform resource locator corresponding to the key data satisfying a preset clustering condition in the packet data comprises:

acquiring the key data corresponding to each group in the group data, and acquiring quintuple information corresponding to the hypertext transfer protocol data; wherein the quintuple information comprises source IP address information, source port information, destination IP address information, destination port information and transport layer protocol information corresponding to the hypertext transfer protocol data;

If the key data are consistent, clustering the grouping data according to preset clustering conditions, and judging whether the five-tuple information contained in the clusters is consistent and whether the value data are consistent;

and if so, determining the recommended data according to the key data, the header field data and the uniform resource locator.

5. The video data acquisition method according to claim 2, wherein the step of acquiring the video stream data included in the hypertext transfer protocol data using the updated filter rule after updating the filter rule based on the feature data included in the recommendation data, comprises:

updating the matching feature library corresponding to the filtering rule by utilizing the feature data contained in the recommended data;

and acquiring the updated filtering rule based on the matching feature library, and acquiring the video stream data contained in the hypertext transfer protocol data according to the filtering rule.

6. The video data acquisition method according to claim 5, wherein the step of updating the matching feature library corresponding to the filtering rule by using the feature data included in the recommendation data includes:

Acquiring the characteristic data contained in the recommended data, and determining a type mark contained in the characteristic data and corresponding value data thereof;

and acquiring the matching feature library corresponding to the filtering rule, and updating the matching feature library by using the starting value, the ending value and the length data corresponding to the value data.

7. The method for obtaining video data according to claim 2, wherein the step of buffering the video stream data based on the feature data corresponding to the filtering rule to obtain the video data corresponding to the video stream data comprises:

analyzing and obtaining the video stream data contained in the characteristic data based on the filtering rule;

judging whether the video stream data is complete or not by utilizing header field range data corresponding to the hypertext transfer protocol data in the filtering data;

and if the video stream data is complete data transmission, the video stream data is subjected to sequencing, de-duplication and splicing and then is cached, so that the video data is generated.

8. The method of claim 7, wherein the step of buffering the video stream data after sorting, de-duplication and splicing to generate the video data comprises:

After the video stream data are sequenced, de-duplicated and spliced, quintuple information and a session identifier corresponding to the hypertext transfer protocol data are obtained;

and caching the video stream data by utilizing the query condition to generate the video data.

9. A video data acquisition system, the system comprising:

the characteristic analysis module is used for acquiring header field data and a target address of the hypertext transfer protocol data in the filtering data, grouping the hypertext transfer protocol data based on the header field data and the target address to obtain grouping data, and determining recommended data corresponding to the filtering data according to the grouping data;

The data arrangement module is used for acquiring the video stream data contained in the hypertext transfer protocol data by utilizing the updated filtering rule after updating the filtering rule according to the characteristic data contained in the recommended data;

10. An electronic device comprising a processor and a memory, the memory storing computer executable instructions executable by the processor, the processor executing the computer executable instructions to implement the steps of the video data acquisition method of any one of claims 1 to 8.