CN112449212B - Audio and video code stream processing method and device and control equipment - Google Patents

Audio and video code stream processing method and device and control equipment

Publication number
CN112449212B
Authority
CN
China
Prior art keywords
audio
code stream
information
video
video code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910804736.7A
Other languages
Chinese (zh)
Other versions
CN112449212A (en)
Inventor
尹星晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision System Technology Co Ltd
Original Assignee
Hangzhou Hikvision System Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision System Technology Co Ltd
Priority to CN201910804736.7A
Publication of CN112449212A
Application granted
Publication of CN112449212B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/643 Communication protocols
    • H04N21/6437 Real-time Transport Protocol [RTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The embodiment of the invention provides a method and a device for processing an audio and video code stream, and control equipment. The method is applied to the control equipment in a video monitoring system, and the video monitoring system comprises the control equipment and audio and video acquisition equipment. The method comprises the following steps: in the process that the control equipment takes the stream from the audio and video acquisition equipment, determining the original packaging format of the original audio and video code stream obtained by stream taking; judging whether the original packaging format matches the target packaging format, wherein the target packaging format is a packaging format that the control equipment can support; if not, converting the original packaging format of the original audio and video code stream into the target packaging format; and when the conversion of the packaging format is successful, obtaining a target audio and video code stream to be unpacked. Compared with the prior art, applying the scheme provided by the embodiment of the invention can improve the universality of the control equipment.

Description

Audio and video code stream processing method and device and control equipment
Technical Field
The invention relates to the technical field of video monitoring, in particular to a method and a device for processing an audio and video code stream and control equipment.
Background
The video monitoring system is a component of a safety precaution system and is a comprehensive system with strong precaution capacity. The video monitoring system is visual, convenient and rich in information content, and can be widely applied to many occasions, such as public transportation hubs like stations and airports, and key departments of enterprises like storehouses and research and development laboratories.
The video monitoring system comprises a control device and an audio and video acquisition device installed on a monitored site. The control device is used for acquiring the audio and video code stream generated by the audio and video acquisition device and playing audio and video contents corresponding to the acquired audio and video code stream, so that a user can check the condition of the monitored site. In order to play the audio/video content corresponding to the audio/video code stream, the control device needs to decapsulate the corresponding audio/video code stream first.
In the related art, after the control device acquires the audio/video code stream, the acquired audio/video code stream is directly used as a code stream to be decapsulated, and then the code stream to be decapsulated is decapsulated. However, since the encapsulation formats that can be supported by the control device are fixed and limited, the control device can only decapsulate the audio/video code stream in the limited kinds of encapsulation formats, which results in poor versatility of the control device.
Disclosure of Invention
The embodiment of the invention aims to provide a method and a device for processing an audio and video code stream and control equipment, so as to improve the universality of the control equipment. The specific technical scheme is as follows:
in a first aspect, an embodiment of the present invention provides a method for processing an audio/video code stream, where the method is applied to a control device in a video monitoring system, and the video monitoring system includes the control device and an audio/video acquisition device; the method comprises the following steps:
in the process that the control equipment takes the stream from the audio and video acquisition equipment, determining the original packaging format of the original audio and video code stream obtained by stream taking;
judging whether the original packaging format is matched with the target packaging format; wherein the target packaging format is: a packaging format that the control device is capable of supporting;
if not, converting the original packaging format of the original audio and video code stream into a target packaging format;
and when the conversion of the packaging format is successful, obtaining a target audio and video code stream to be unpacked.
Optionally, in a specific implementation manner, the step of determining an original packaging format of the original audio/video code stream includes:
detecting the packaging attribute of the original audio and video code stream to obtain the target packaging attribute of the original audio and video code stream;
respectively determining the matching degree of the target packaging attribute and the packaging attributes of various types of packaging formats;
and determining the packaging format corresponding to the determined highest matching degree as the original packaging format of the original audio and video code stream.
Or, the step of determining the original packaging format of the original audio/video code stream includes:
acquiring packaging information fed back by the audio and video acquisition equipment;
and determining the packaging format included in the packaging information as the original packaging format of the original audio and video code stream.
Optionally, in a specific implementation manner, before the step of determining whether the original packaging format is matched with the target packaging format, the method further includes:
acquiring first type perception information corresponding to the original audio and video code stream;
wherein the first type of perceptual information comprises at least one of the following information: a transport protocol type, a transport control protocol type, and a code stream type.
After the step of obtaining the first type of perceptual information corresponding to the original audio/video code stream, the method further includes:
acquiring second type perception information and/or third type perception information corresponding to the original audio and video code stream;
wherein the second type of perceptual information comprises: information extracted from the audio and video description information fed back by the audio and video acquisition equipment, wherein the audio and video description information is the information fed back by the audio and video acquisition equipment in the process that the control equipment fetches the stream from the audio and video acquisition equipment; the third type of perceptual information includes frame attribute information.
Optionally, in a specific implementation manner, the method further includes:
decapsulating and decoding the target audio and video code stream to be decapsulated to obtain decoded audio and video data;
analyzing the decoded audio and video data to obtain fourth type perception information;
wherein the fourth type of perceptual information comprises: audio attribute information and/or video attribute information.
Optionally, in a specific implementation manner, the method further includes:
generating a global report about the original audio and video code stream based on the determined various kinds of perception information;
when the global report includes at least two types of perception information, the method further includes:
judging whether target information representing the same attribute exists in different types of perception information included in the global report;
if yes, judging whether the attribute values of the target information are the same in the different kinds of perception information;
if not, the target information is marked as an abnormal item in the global report.
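The consistency check on the global report can be illustrated with a minimal Python sketch. It assumes that each kind of perception information is held as a dictionary of attribute names and values; the function name `build_global_report`, the key `abnormal_items`, and the sample attributes are hypothetical and do not come from the embodiment.

```python
from collections import defaultdict


def build_global_report(perception):
    """Build a report and flag attributes whose values disagree.

    `perception` maps a perception-information type (hypothetical names such
    as "second_type") to a dict of attribute name -> attribute value.
    """
    # Collect, for every attribute, the value reported by each type.
    seen = defaultdict(dict)
    for type_name, attributes in perception.items():
        for name, value in attributes.items():
            seen[name][type_name] = value

    # An attribute is abnormal when at least two types report it
    # and the reported values are not all the same.
    abnormal = sorted(
        name
        for name, by_type in seen.items()
        if len(by_type) >= 2 and len(set(by_type.values())) > 1
    )
    return {"perception": perception, "abnormal_items": abnormal}


report = build_global_report({
    "second_type": {"video_codec": "H.264", "frame_rate": 25},
    "fourth_type": {"video_codec": "H.264", "frame_rate": 30,
                    "resolution": "1920x1080"},
})
print(report["abnormal_items"])
```

In this sample, `frame_rate` is reported by two types with different values and is therefore flagged, while `video_codec` agrees and `resolution` appears in only one type.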
In a second aspect, an embodiment of the present invention provides a processing apparatus for an audio/video code stream, which is applied to a control device in a video monitoring system, where the video monitoring system includes the control device and an audio/video acquisition device; the device comprises:
the format determining module is used for determining the original packaging format of the original audio and video code stream obtained by stream taking in the process that the control equipment takes the stream from the audio and video acquisition equipment;
the format judging module is used for judging whether the original packaging format is matched with the target packaging format; if not, triggering the format conversion module; wherein the target packaging format is: a packaging format that the control device is capable of supporting;
the format conversion module is used for converting the original packaging format of the original audio and video code stream into a target packaging format;
and the code stream obtaining module is used for obtaining a target audio and video code stream to be unpacked when the conversion of the packaging format is successful.
Optionally, in a specific implementation manner, the format determining module is specifically configured to:
detecting the packaging attribute of the original audio and video code stream to obtain the target packaging attribute of the original audio and video code stream; respectively determining the matching degree of the target packaging attribute and the packaging attribute of each type of packaging format; determining the packaging format corresponding to the determined highest matching degree as the original packaging format of the original audio and video code stream;
or, the format determination module is specifically configured to:
acquiring packaging information fed back by the audio and video acquisition equipment; and determining the packaging format included in the packaging information as the original packaging format of the original audio and video code stream.
Optionally, in a specific implementation manner, the apparatus further includes:
the first information acquisition module is used for acquiring first type perception information corresponding to the original audio and video code stream before judging whether the original packaging format is matched with the target packaging format;
wherein the first type of perceptual information comprises at least one of the following information: a transmission protocol type, a transmission control protocol type and a code stream type;
the second information acquisition module is used for acquiring second type perception information and/or third type perception information corresponding to the original audio and video code stream after the step of acquiring the first type perception information corresponding to the original audio and video code stream;
wherein the second type of perceptual information comprises: information extracted from the audio and video description information fed back by the audio and video acquisition equipment, wherein the audio and video description information is the information fed back by the audio and video acquisition equipment in the process that the control equipment fetches the stream from the audio and video acquisition equipment; the third type of perceptual information includes frame attribute information.
Optionally, in a specific implementation manner, the apparatus further includes:
the third information acquisition module is used for decapsulating and decoding the target audio/video code stream to be decapsulated to obtain decoded audio/video data; analyzing the decoded audio and video data to obtain fourth type perception information;
wherein the fourth type of perceptual information includes: audio attribute information and/or video attribute information.
Optionally, in a specific implementation manner, the apparatus further includes:
the report generation module is used for generating a global report related to the original audio and video code stream based on the determined various kinds of perception information;
when the global report includes at least two types of perception information, the apparatus further includes:
the information judgment module is used for judging whether target information representing the same attribute exists in different types of perception information contained in the global report; if yes, triggering an attribute value judgment module;
the attribute value judging module is used for judging whether the attribute values of the target information are the same in the different types of perception information; if not, triggering an information marking module;
the information marking module is used for marking the target information as an abnormal item in the global report.
In a third aspect, an embodiment of the present invention provides a control device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing the steps of any audio and video code stream processing method provided by the first aspect when executing the program stored in the memory.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements the steps of any one of the processing methods for an audio/video code stream provided in the first aspect.
As can be seen from the above, with the adoption of the scheme provided by the embodiment of the present invention, when the control device obtains the original audio/video code stream generated by the audio/video acquisition device, the control device can determine the original packaging format of the original audio/video code stream, and when the original packaging format is judged to be not matched with the target packaging format, the control device converts the original packaging format of the original audio/video code stream into the target packaging format. Therefore, when the conversion of the packaging format is successful, the target audio/video code stream to be unpacked can be obtained. Therefore, the original audio and video code stream with the original packaging format not matched with the packaging format which can be supported by the control equipment can be converted into the target audio and video code stream which can be unpacked by the control equipment through the packaging format conversion, and the universality of the control equipment is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic flow chart of a method for processing an audio/video code stream according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of another method for processing an audio/video code stream according to an embodiment of the present invention;
Fig. 3 is a schematic flow chart of another method for processing an audio/video code stream according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a device for processing an audio/video code stream according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a control device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the related art, after the control device acquires the audio/video code stream, the acquired audio/video code stream is directly used as the code stream to be decapsulated, and then the code stream to be decapsulated is decapsulated. However, since the encapsulation format that can be supported by the control device is fixed and limited, the control device can only decapsulate the audio/video code stream in the limited kinds of encapsulation formats, which results in poor versatility of the control device. In order to solve the above technical problem, an embodiment of the present invention provides a method for processing an audio/video code stream.
First, a method for processing an audio/video code stream provided in an embodiment of the present invention is described below.
Fig. 1 is a schematic flow diagram of a processing method of an audio/video code stream according to an embodiment of the present invention. The method is applied to control equipment in a video monitoring system, and the video monitoring system comprises the control equipment and audio and video acquisition equipment.
The control device may be any type of electronic device in a Video monitoring system, for example, an NVR (Network Video Recorder), a smart phone, a desktop computer, a tablet computer, a server, and the like. The control device can execute corresponding software codes through the processor to realize the method for processing the audio and video code stream provided by the embodiment of the invention.
Specifically, the processing method of the audio/video code stream can be applied to a processing device of the audio/video code stream in the control equipment. The processing device can be a new device arranged in the control equipment to realize the processing of the audio/video code stream, which can communicate with the other existing devices in the control equipment; or it can be a new plug-in installed in an existing device of the control equipment to realize the processing of the audio and video code stream. Both arrangements are reasonable.
As shown in fig. 1, a method for processing an audio/video code stream according to an embodiment of the present invention may include the following steps:
S101: in the process that the control equipment extracts the stream from the audio and video acquisition equipment, determining the original packaging format of the original audio and video code stream obtained by extracting the stream;
The audio and video acquisition equipment in the video monitoring system is used for generating audio and video code stream information about a monitored site. Specifically, the audio/video acquisition device can capture video images and sound at the monitored site to obtain original video data and original audio data of the monitored site. Then, video coding and audio coding are respectively carried out on the obtained original video data and original audio data using a preset coding format, and the video coding result and the audio coding result obtained through coding are packaged, so that audio and video code stream information about the monitored site is obtained.
Because the audio and video acquisition equipment and the control equipment in the video monitoring system are in communication connection, the control equipment can acquire audio and video code streams from the audio and video acquisition equipment. The process in which the control device acquires the audio/video code stream from the audio/video acquisition device may be referred to as a stream taking process.
Optionally, the control device may implement streaming from the audio/video capture device based on a type of a transmission protocol with the audio/video capture device. Specifically, when the control device fetches a stream from the audio/video acquisition device, a transmission protocol type between the control device and the audio/video acquisition device is determined, and stream fetching information required during stream fetching is obtained through analysis according to the transmission protocol type, so that a stream fetching layer corresponding to the transmission protocol type is established.
After the stream taking layer corresponding to the transmission protocol type is created, the control device can take the stream from the audio and video acquisition device by using the stream taking information obtained by analysis through the stream taking layer to obtain the audio and video code stream generated by the audio and video acquisition device. The obtained audio/video code stream may be referred to as an original audio/video code stream. Further, in the process that the control device fetches the stream from the audio/video acquisition device, the control device may determine the packaging format of the original audio/video code stream in the process of obtaining the original audio/video code stream. The determined packaging format of the original audio and video code stream can be called as an original packaging format.
The stream fetching information corresponding to different transmission protocol types is different. For example, the stream fetching information corresponding to rtsp (Real Time Streaming Protocol), hls (HTTP Live Streaming, an HTTP-based streaming media network transmission protocol), http (HyperText Transfer Protocol), mms (Microsoft Media Server protocol, a streaming media transmission protocol), and rtmp (Real Time Messaging Protocol) includes: the IP (Internet Protocol) address of the audio/video acquisition device and the port number used for acquiring the audio/video code stream. The stream fetching information corresponding to onvif (Open Network Video Interface Forum) includes: the IP address, the user name, and the password of the audio/video acquisition device, the channel number for transmitting the audio/video code stream, the port number for acquiring the audio/video code stream, the information of the main code stream and the sub code stream, and the like. Because the stream fetching information differs between transmission protocol types, the stream fetching layers corresponding to different transmission protocol types are also different. The stream fetching information and the stream fetching layers corresponding to several stream fetching protocols are shown in the following table.
Stream fetching information                               Streaming protocol  Created stream fetching layer
rtsp://10.17.39.17:1935/vod/sample.mp4                    rtsp                rtsp protocol stream fetching layer
hls://10.17.39.17:1935/vod/mp4:sample.mp4                 hls                 hls protocol stream fetching layer
http://10.17.39.17:1935/vod/mp4:sample.mp4/playlist.m3u8  http                http protocol stream fetching layer
mms://10.17.34.107:1935/vod/mp4:sample.mp4                mms                 mms protocol stream fetching layer
rtmp://10.17.34.107:1935/vod/mp4:sample.mp4               rtmp                rtmp protocol stream fetching layer
onvif://10.17.33.13:8000:0:MAIN:admin:Abc12345:TCP        onvif               onvif protocol stream fetching layer
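As an illustration of how stream fetching information such as the entries in the table above might be analyzed to select a stream fetching layer, the following Python sketch splits off the protocol prefix and extracts the address fields. The function name `parse_fetch_info` and the returned dictionary layout are hypothetical. The field order assumed for the onvif entry (IP, port, channel number, main/sub code stream, user name, password, transport) is inferred from the single example in the table and is not stated by the embodiment.

```python
SUPPORTED_PROTOCOLS = ("rtsp", "hls", "http", "mms", "rtmp", "onvif")


def parse_fetch_info(info):
    """Split stream fetching information into protocol and address fields."""
    protocol, separator, rest = info.partition("://")
    if not separator or protocol not in SUPPORTED_PROTOCOLS:
        raise ValueError(f"unsupported stream fetching information: {info!r}")

    result = {
        "protocol": protocol,
        "layer": f"{protocol} protocol stream fetching layer",
    }
    if protocol == "onvif":
        # Assumed field order, inferred from the table's onvif example only.
        # A password that itself contains ":" would break this simple split.
        ip, port, channel, stream, user, password, transport = rest.split(":")
        result.update(ip=ip, port=int(port), channel=int(channel),
                      stream=stream, user=user, password=password,
                      transport=transport)
    else:
        # URL-like entries: host[:port] comes before the first "/".
        host_port = rest.split("/", 1)[0]
        ip, _, port = host_port.partition(":")
        result.update(ip=ip, port=int(port) if port else None)
    return result


rtsp_info = parse_fetch_info("rtsp://10.17.39.17:1935/vod/sample.mp4")
onvif_info = parse_fetch_info(
    "onvif://10.17.33.13:8000:0:MAIN:admin:Abc12345:TCP")
print(rtsp_info["layer"], onvif_info["layer"], sep="\n")
```

Keying the choice of stream fetching layer on the protocol prefix alone keeps the dispatch simple; the per-protocol fields are then only needed by the layer that is actually created.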
In addition, in step S101, the control device may determine the original packaging format of the original audio/video code stream obtained by fetching the stream in multiple ways, which is not limited in this embodiment of the present invention. For convenience of description, a specific manner in which the control device performs the above-described step S101 will be described later by way of example.
S103: judging whether the original packaging format is matched with the target packaging format; if not, executing S104;
wherein the target packaging format is: a packaging format that the control device is capable of supporting;
S104: converting the original packaging format of the original audio and video code stream into a target packaging format;
S105: when the conversion of the packaging format is successful, obtaining a target audio and video code stream to be unpacked.
Thus, for the obtained original audio and video code stream, after determining the original packaging format of the original audio and video code stream, the control device can determine whether the original packaging format is matched with the packaging format that can be supported by the control device, that is, whether the original packaging format is matched with the target packaging format.
Furthermore, when the determination result is yes, it can be described that the control device can support the original encapsulation format of the original audio/video code stream, that is, the control device can decapsulate the original audio/video code stream. Therefore, the control equipment does not need to convert the original packaging format of the original audio and video code stream.
Correspondingly, when the judgment result is negative, it can be stated that the control device cannot support the original encapsulation format of the original audio/video code stream, that is, the control device cannot decapsulate the original audio/video code stream. Therefore, in order to ensure that the control device can play the audio/video content corresponding to the original audio/video code stream, the control device needs to convert the original packaging format of the original audio/video code stream into the target format.
Furthermore, after the packaging format of the original audio/video code stream is successfully converted into the target format, the control device can decapsulate the target audio/video code stream after the packaging format is successfully converted, because the target format is the packaging format that the control device can support. Based on the method, when the conversion of the packaging format is successful, the control equipment can obtain the target audio and video code stream to be unpacked. Therefore, the control equipment can decapsulate the obtained target audio/video code stream to be decapsulated.
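The decision flow of steps S101 and S103 to S105 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the function `prepare_for_decapsulation`, the callables `detect_format` and `convert`, the dictionary representation of a stream, and the format names "flv" and "ps" are all hypothetical stand-ins.

```python
def prepare_for_decapsulation(stream, detect_format, convert,
                              supported_formats, target_format):
    """Return a stream whose packaging format the control device supports."""
    original_format = detect_format(stream)            # S101
    if original_format in supported_formats:           # S103: formats match
        return stream                                  # no conversion needed
    converted = convert(stream, original_format, target_format)  # S104
    if converted is None:                              # conversion failed
        raise ValueError(
            f"cannot convert {original_format!r} to {target_format!r}")
    return converted                                   # S105


# Toy stand-ins so the sketch runs; a real device would inspect actual bytes.
def detect_format(stream):
    return stream["format"]


def convert(stream, source_format, target_format):
    return {"format": target_format, "payload": stream["payload"]}


flv_stream = {"format": "flv", "payload": b"\x01\x02"}
result = prepare_for_decapsulation(
    flv_stream, detect_format, convert, {"ps"}, "ps")
print(result["format"])
```

Passing the detection and conversion steps in as callables mirrors the way the embodiment leaves the concrete detection manner open.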
Specifically, the process of converting the original packaging format of the original audio/video code stream into the target format by the control device may be: detecting the original audio and video code stream to obtain an original packaging format of the original audio and video code stream, and then disassembling the obtained original packaging format according to the characteristics of the obtained original packaging format, so as to disassemble to obtain original code stream data of the original audio and video code stream, namely obtaining a bare code stream corresponding to the original audio and video code stream. And then, encapsulating the bare code stream corresponding to the obtained original audio and video code stream according to the encapsulation requirement of the target format, thereby obtaining the original audio and video code stream of which the encapsulation format is converted into the target format, namely obtaining the target audio and video code stream to be decapsulated.
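The disassemble-then-repackage process described above can be illustrated with two toy packaging formats. Both formats are invented for this sketch and do not correspond to any real container: the "original" format prefixes each frame with a 4-byte big-endian length, and the "target" format prefixes each frame with the marker `FRM` and a 2-byte big-endian length. The function names are likewise hypothetical.

```python
import struct


def demux_length_prefixed(data):
    """Disassemble the toy original format into bare frames."""
    frames, pos = [], 0
    while pos < len(data):
        (size,) = struct.unpack_from(">I", data, pos)  # 4-byte frame length
        pos += 4
        frames.append(data[pos:pos + size])
        pos += size
    return frames


def mux_marker_prefixed(frames):
    """Package bare frames according to the toy target format."""
    return b"".join(
        b"FRM" + struct.pack(">H", len(frame)) + frame for frame in frames)


def demux_marker_prefixed(data):
    """Decapsulate the toy target format (what the device supports)."""
    frames, pos = [], 0
    while pos < len(data):
        if data[pos:pos + 3] != b"FRM":
            raise ValueError("missing frame marker")
        (size,) = struct.unpack_from(">H", data, pos + 3)
        pos += 5
        frames.append(data[pos:pos + size])
        pos += size
    return frames


# Original stream: two frames in the length-prefixed toy format.
original = struct.pack(">I", 3) + b"abc" + struct.pack(">I", 2) + b"de"
bare_frames = demux_length_prefixed(original)   # bare code stream
target = mux_marker_prefixed(bare_frames)       # repackaged stream
print(demux_marker_prefixed(target))
```

The coded payload is carried over unchanged; only the packaging around each frame differs between the original and target streams.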
As can be seen from the above, with the adoption of the scheme provided by the embodiment of the present invention, when the control device obtains the original audio/video code stream generated by the audio/video acquisition device, the control device can determine the original packaging format of the original audio/video code stream, and when the original packaging format is judged to be not matched with the target packaging format, the control device converts the original packaging format of the original audio/video code stream into the target packaging format. Therefore, when the conversion of the packaging format is successful, the target audio/video code stream to be unpacked can be obtained. Therefore, the original audio and video code stream with the original packaging format not matched with the packaging format which can be supported by the control equipment can be converted into a target audio and video code stream which can be unpacked by the control equipment through packaging format conversion, and therefore the universality of the control equipment is improved.
Next, a manner of determining the original packaging format of the original audio/video code stream obtained by streaming in step S101 performed by the control device is described as an example.
Optionally, in a specific implementation manner, the manner in which the control device determines the original encapsulation format of the original audio/video code stream obtained by stream fetching in step S101 may include the following steps A1 to A3:
step A1: detecting the packaging attribute of the original audio and video code stream to obtain the target packaging attribute of the original audio and video code stream;
step A2: respectively determining the matching degree of the target packaging attribute and the packaging attribute of each type of packaging format;
step A3: and determining the packaging format corresponding to the determined highest matching degree as the original packaging format of the original audio and video code stream.
In this specific implementation manner, the control device may perform encapsulation attribute detection on the original audio/video code stream, so as to obtain a target encapsulation attribute of the original audio/video code stream.
Thus, since the control device can obtain the package attributes of various package formats, the control device can determine the matching degree of the target package attribute and the package attribute of each type of package format, and obtain the maximum value of the determined matching degrees. Furthermore, the control device can determine the packaging format corresponding to the determined maximum value of the matching degree as the original packaging format of the original audio and video code stream.
That is to say, in the present specific implementation manner, the control device determines, according to the characteristics of each type of encapsulation format, the possibility that the encapsulation attribute of the original audio/video code stream belongs to each type of encapsulation format, so as to determine the encapsulation format with the highest possibility as the original encapsulation format of the original audio/video code stream. The possibility that the packaging attribute of the original audio/video code stream belongs to each type of packaging format is characterized by the matching degree of the determined target packaging attribute and the packaging attribute of each type of packaging format. Obviously, the greater the matching degree between the determined target package attribute and the package attribute of a certain type of package format is, the greater the possibility that the original package format of the original audio/video code stream is the type of package format is.
Specifically, for packaging formats in which a feature code exists at the beginning of the file, such as avi (Audio Video Interleave), rm/rmvb (RealMedia/RealMedia Variable Bitrate, a streaming video file format and its variable-bit-rate variant), flv (Flash Video), mkv (Matroska Video, the Matroska multimedia container), and asf (Advanced Streaming Format), the feature code can be used as the packaging attribute of that type of format. In the above step A1, detection therefore determines whether a feature code exists at the beginning of the original audio/video code stream and, when it exists, the specific content of the feature code. For streaming media formats such as TS (Transport Stream), the 0x47 sync byte and the PID can be used as the packaging attributes of that type of format, so that in the above step A1 the 0x47 sync byte and the PID of the original audio/video code stream are determined by probing.
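As an illustration of the probing in step A1, the following minimal sketch checks the beginning of a byte buffer for well-known feature codes and, for TS, checks for the 0x47 sync byte at 188-byte intervals and extracts the 13-bit PID. The byte signatures are the commonly documented ones for these formats and are not taken from the embodiment; the function names and the choice of three packets are assumptions.

```python
from typing import List, Optional

TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47
# ASF header object GUID as it appears on disk.
ASF_GUID = bytes.fromhex("3026b2758e66cf11a6d900aa0062ce6c")


def probe_feature_code(head: bytes) -> Optional[str]:
    """Return a format name if a known feature code starts the buffer."""
    if head[:4] == b"RIFF" and head[8:12] == b"AVI ":
        return "avi"
    if head[:4] == b".RMF":
        return "rm/rmvb"
    if head[:3] == b"FLV":
        return "flv"
    if head[:4] == b"\x1a\x45\xdf\xa3":  # EBML header, also used by WebM
        return "mkv"
    if head[:16] == ASF_GUID:
        return "asf"
    return None


def probe_ts(data: bytes, min_packets: int = 3) -> Optional[List[int]]:
    """Return the PIDs of the leading packets if the buffer looks like TS."""
    if len(data) < min_packets * TS_PACKET_SIZE:
        return None
    pids = []
    for i in range(min_packets):
        off = i * TS_PACKET_SIZE
        if data[off] != TS_SYNC_BYTE:
            return None
        # PID is the low 5 bits of byte 1 followed by all of byte 2.
        pids.append(((data[off + 1] & 0x1F) << 8) | data[off + 2])
    return pids
```

A real probe would usually tolerate a stream that does not start on a packet boundary; this sketch assumes the buffer begins at a sync byte.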
When the control device executes the step A2, the matching degree between the target package attribute and the package attribute of each type of package format may be determined in various ways.
For example, for the package attribute of each type of package format, the control device may determine the number of the package attributes of the type of package format and the same attributes in the target package attribute, and further calculate a ratio of the number of the same attributes to the number of each attribute in the target package attribute, so as to determine the calculated ratio as a matching degree between the target package attribute and the package attribute of the type of package format.
For another example, the control device may set a weight for each attribute in the target package attributes in advance, so that, for the package attributes of each type of package format, the control device may calculate a weight sum of the package attributes of the type of package format and the same attribute in the target package attributes, and determine the calculated weight sum as a matching degree between the target package attributes and the package attributes of the type of package format.
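The two ways of computing the matching degree in step A2, together with the selection of the highest match in step A3, can be sketched as follows. Packaging attributes are modeled here as sets of strings; that representation, the attribute names used in the usage note, and the function names are assumptions made for illustration.

```python
from typing import Dict, Optional, Set


def ratio_match(target: Set[str], candidate: Set[str]) -> float:
    """Share of the target attributes that also appear in the candidate format."""
    if not target:
        return 0.0
    return len(target & candidate) / len(target)


def weighted_match(target: Set[str], candidate: Set[str],
                   weights: Dict[str, float]) -> float:
    """Sum of the preset weights of the attributes common to both sets."""
    return sum(weights.get(attr, 0.0) for attr in target & candidate)


def best_format(target: Set[str],
                known_formats: Dict[str, Set[str]]) -> Optional[str]:
    """Step A3: pick the format with the highest matching degree."""
    if not known_formats:
        return None
    return max(known_formats,
               key=lambda name: ratio_match(target, known_formats[name]))
```

For example, with a target attribute set of {"sync_0x47", "pid"} and known formats "ts" (with the same two attributes) and "flv" (with {"magic_FLV"}), `best_format` selects "ts". When two formats tie, `max` returns the first one encountered; the text does not say how ties should be broken.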
Optionally, in another specific implementation manner, the manner in which the control device determines the original packaging format of the original audio/video code stream obtained by fetching the stream in step S101 may include the following steps B1 to B2:
step B1: acquiring encapsulation information fed back by audio and video acquisition equipment;
step B2: determining the packaging format included in the packaging information as the original packaging format of the original audio and video code stream.
In this specific implementation manner, when the control device fetches the stream through the created stream-taking layer, using the stream-taking information obtained by analysis, and thereby obtains the original audio/video code stream, the audio/video acquisition device may feed back relevant information about the original audio/video code stream to the control device, such as the codec format and the encapsulation information.
Therefore, the control equipment can acquire the encapsulation information fed back by the audio and video acquisition equipment in the process of acquiring the stream from the audio and video acquisition equipment to obtain the original audio and video code stream. Further, the packaging format included in the packaging information can be determined as the original packaging format of the original audio/video code stream.
When the transmission protocol between the control device and the audio/video acquisition device differs, the related information of the original audio/video code stream fed back by the audio/video acquisition device while the control device fetches the stream also differs. For example, when the transmission protocol is RTSP, the fed-back information is SDP (Session Description Protocol) description information, which describes the initialization parameters of the original audio/video code stream; when the transmission protocol is RTMP or ONVIF, the fed-back information is frame header information, which includes the codec format, the packaging information, and the like of the original audio/video code stream.
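For the RTSP case, the following minimal sketch shows how media and codec information might be read from SDP description text. It handles only the `m=` line and the `a=rtpmap:` attribute, whose layouts are defined by the SDP specification; the function name and the returned dictionary keys are assumptions, and a real SDP description may carry further attributes that this sketch ignores.

```python
from typing import Dict, List


def parse_sdp_media(sdp: str) -> List[Dict[str, str]]:
    """Collect media type, transport, codec name and clock rate per media section."""
    medias: List[Dict[str, str]] = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            parts = line[2:].split()
            if len(parts) >= 3:
                medias.append({"media": parts[0], "proto": parts[2]})
        elif line.startswith("a=rtpmap:") and medias:
            # a=rtpmap:<payload type> <encoding name>/<clock rate>[/<params>]
            fields = line[len("a=rtpmap:"):].split(" ", 1)
            if len(fields) == 2:
                encoding = fields[1].split("/")
                medias[-1]["codec"] = encoding[0]
                if len(encoding) > 1:
                    medias[-1]["clock_rate"] = encoding[1]
    return medias
```

Applied to a description containing `m=video 0 RTP/AVP 96` followed by `a=rtpmap:96 H264/90000`, the sketch reports a video section with codec H264 and a 90000 Hz clock.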
In addition, in order to be able to globally recognize the obtained original audio/video code stream and/or the target audio/video code stream after the format conversion is successful, so as to perform global analysis, the control device expects to be able to obtain the relevant sensing information of the original audio/video code stream and/or the target audio/video code stream.
Based on this, optionally, in a specific implementation manner, as shown in fig. 2, before the step S103, the method for processing the audio/video code stream may further include the step S102:
s102: acquiring first-class sensing information corresponding to an original audio and video code stream;
wherein the first type of perceptual information comprises at least one of the following information: a transmission protocol type, a transmission control protocol type and a code stream type;
specifically, in practical application, when establishing a communication connection between the control device and the audio/video acquisition device, a user may specify in advance a transmission protocol type, a transmission control protocol type, and a code stream type corresponding to an original audio/video code stream. Therefore, when the control equipment realizes the stream taking from the audio and video acquisition equipment based on the transmission protocol type between the control equipment and the audio and video acquisition equipment, the control equipment can obtain the transmission protocol type, the transmission control protocol type and the code stream type corresponding to the original audio and video code stream. In this way, the control device can determine at least one of the transmission protocol type, the transmission control protocol type and the code stream type as the first type of perception information corresponding to the original audio/video code stream. Therefore, the control equipment can acquire the first type of perception information corresponding to the original audio and video code stream.
Therefore, in this specific implementation manner, the control device may obtain the first type of sensing information corresponding to the original audio/video code stream, that is, obtain at least one type of information among the transmission protocol type, the transmission control protocol type, and the code stream type of the original audio/video code stream.
Further, in the process in which the control device fetches the stream from the audio/video acquisition device, the audio/video acquisition device may feed back information related to the audio/video code stream to the control device, that is, feed back the audio/video description information of the audio/video code stream. After obtaining the audio/video description information, the control device can extract from it information such as the transmission protocol type, the codec format, the receiving and transmitting mode, the time zone information, and the frame rate corresponding to the original audio/video code stream. The information extracted by the control device from the audio/video description information fed back by the audio/video acquisition device can be determined as the second type of perception information corresponding to the original audio/video code stream. The second type of perception information may further include information that the control device obtains from other control devices and that those devices extracted from the audio/video description information fed back by the audio/video acquisition device.
Furthermore, after the original audio/video code stream is obtained and its original packaging format is determined, the control device can perform frame header analysis on the original audio/video code stream according to the original packaging format, so as to obtain the frame attribute information of the original audio/video code stream. The frame attribute information may include the timestamp information, absolute time information, packet type, frame rate, frame number, whether the stream is encrypted, the encryption type when it is encrypted, and other information of the original audio/video code stream. By performing frame header analysis on the original audio/video code stream, the control device can determine the frame attribute information as the third type of perception information corresponding to the original audio/video code stream. The third type of perception information may further include information that the control device acquires from other control devices and that those devices extracted from the frame attribute information, for example the encapsulation version, the stream type, whether a key frame exists, and the encapsulation global time information corresponding to the original audio/video code stream.
To make the frame attribute information easier to understand, H264 encapsulated as PS is taken as an example and described below:
for video data: each IDR (Instantaneous Decoding Refresh) NALU (Network Abstraction Layer Unit) may be preceded by NALUs such as the SPS (Sequence Parameter Set) and PPS (Picture Parameter Set), so the SPS, PPS, and IDR NALUs are encapsulated into one PS (Program Stream) packet: a PS header is added, followed by the PS system header, the PS system map, and the PES header + h264 raw data. Thus, the outer-to-inner order of the PS packet for an IDR NALU is: PS header | PS system header | PS system map | PES header | h264 raw data. For the PS packets of other, non-key frames, the PS header and the PES header can be added directly, so the outer-to-inner order of the PS packet for a non-key frame is: PS header | PES header | h264 raw data.
Further, the audio data may also be packetized into PS packets, that is, when there is audio data, a PES header is added to it and the resulting audio PES is placed after the video PES. The encapsulated audio/video code stream is thereby obtained, and the order of its header elements is: PS packet = PS header | PES (video) | PES (audio).
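The section order described above can be observed by scanning a PS packet for start codes. The following minimal sketch assumes the start-code values defined for MPEG-2 Program Streams (0x000001BA pack header, 0x000001BB system header, 0x000001BC program stream map, 0xE0 to 0xEF video PES, 0xC0 to 0xDF audio PES); these values and the function names are not stated in the embodiment. It is a simplification: a real parser would follow the length fields rather than search the payload.

```python
from typing import List, Optional

START_CODE_PREFIX = b"\x00\x00\x01"


def classify(stream_id: int) -> Optional[str]:
    """Map the byte after a start-code prefix to the section name used in the text."""
    if stream_id == 0xBA:
        return "PS header"
    if stream_id == 0xBB:
        return "PS system header"
    if stream_id == 0xBC:
        return "PS system map"
    if 0xE0 <= stream_id <= 0xEF:
        return "PES (video)"
    if 0xC0 <= stream_id <= 0xDF:
        return "PES (audio)"
    return None  # e.g. an H.264 NALU start code inside the payload


def list_sections(packet: bytes) -> List[str]:
    """Return the PS/PES sections in the order they appear in the packet."""
    sections = []
    pos = packet.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(packet):
        name = classify(packet[pos + 3])
        if name is not None:
            sections.append(name)
        pos = packet.find(START_CODE_PREFIX, pos + 3)
    return sections
```

A packet whose listing contains the PS system header and PS system map matches the key-frame layout; one that contains only the PS header followed by PES sections matches the non-key-frame layout.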
Based on this, optionally, in a specific implementation manner, after the control device executes the step S102 and acquires the first type of sensing information corresponding to the original audio/video code stream, the method for processing the audio/video code stream may further include the following step C1:
step C1: acquiring second type perception information or third type perception information corresponding to an original audio and video code stream;
wherein the second type of perceptual information comprises: extracting the obtained information from the audio and video description information fed back by the audio and video acquisition equipment, wherein the audio and video description information is as follows: in the process that the control equipment fetches the stream from the audio and video acquisition equipment, the audio and video acquisition equipment feeds back information; the third type of perceptual information comprises frame attribute information.
Specifically, in this specific implementation manner, the types of perception information acquired by the control device may be any one of the following cases. Case 1: the first type of perception information and the second type of perception information are acquired. Case 2: the first type of perception information and the third type of perception information are acquired. Case 3: the first type, the second type, and the third type of perception information are acquired. This is all reasonable.
In addition, regarding the execution sequence of step S102 and step S103: step S102 may be executed first and then step S103; step S103 may be executed first and then step S102; or step S102 and step S103 may be executed simultaneously. This is all reasonable. When the processing method of the audio/video code stream includes step C1, it is only necessary to ensure that step C1 is executed after step S102.
Optionally, in another specific implementation manner, as shown in fig. 3, on the basis of the specific implementation manner shown in fig. 2, the processing method of the audio/video code stream may further include step S106:
s106: decapsulating and decoding a target audio/video code stream to be decapsulated to obtain decoded audio/video data; analyzing the decoded audio and video data to obtain fourth type perception information;
wherein the fourth type of perceptual information comprises: audio attribute information and/or video attribute information.
In the specific implementation mode, after the target audio/video code stream to be decapsulated is obtained, the control device may decapsulate and decode the target audio/video code stream to be decapsulated, thereby obtaining decoded audio/video data; furthermore, the control device can analyze the decoded audio/video data to obtain a fourth type of perception information, namely audio attribute information and/or video attribute information. And the obtained fourth type perception information aims at a target audio and video code stream obtained after the encapsulation conversion is successful.
The audio attribute information may include information such as the sampling frequency, sampling bit number, channel number, bit rate, and baud rate; the video attribute information may include information such as the image size, code stream, bit rate, and timestamp.
In addition, in this specific implementation, the control device may acquire only the audio attribute information or the video attribute information as the fourth type of perceptual information, and may also acquire the audio attribute information and the video attribute information as the fourth type of perceptual information. This is all reasonable.
Thus, in this specific implementation, the perception information obtained by the control device may include at least the first type of perception information and the fourth type of perception information.
Specifically, the types of perception information acquired by the control device may be any one of the following cases. Case 1: the first type and the fourth type of perception information are acquired. Case 2: the first type, the second type, and the fourth type of perception information are acquired. Case 3: the first type, the third type, and the fourth type of perception information are acquired. Case 4: the first type, the second type, the third type, and the fourth type of perception information are acquired. This is all reasonable.
It should be noted that, in the processing method of the audio/video code stream provided in the embodiment of the present invention, the processing method may be applied to a processing device of the audio/video code stream in the control device.
Therefore, optionally, after the target audio/video code stream to be decapsulated is obtained, the processing device may send the target audio/video code stream to another device in the control device, decapsulate and decode the target audio/video code stream by using the other device, and the other device may feed back the audio/video data obtained after decoding to the processing device.
In addition, optionally, after the target audio/video code stream to be decapsulated is obtained, the processing device may also directly decapsulate and decode the target audio/video code stream to obtain decoded audio/video data.
Therefore, in this specific implementation manner, after the target audio/video code stream is decapsulated and decoded, the decoded audio/video data obtained by the processing device may be the audio/video data fed back by other devices after the target audio/video code stream is decapsulated and decoded; or the processing device directly decapsulates and decodes the target audio/video code stream to obtain decoded audio/video data. This is all reasonable.
In contrast to the case in step S105, where the conversion of the encapsulation format succeeds and the control device obtains the target audio/video code stream to be decapsulated, when the control device determines that the original encapsulation format of the original audio/video code stream does not match the target encapsulation format and attempts to convert it into the target encapsulation format, the conversion may fail, for example because of an error in the conversion path, because the original encapsulation format is an uncommon one, or because the control device is unable to perform the encapsulation format conversion. In this case, the control device still expects to obtain the relevant perception information of the original audio/video code stream.
Based on this, optionally, in a specific implementation manner, when the conversion of the encapsulation format is unsuccessful, in order to know the relevant information of the original audio/video code stream, the method for processing the audio/video code stream provided in the embodiment of the present invention may further include step D1:
step D1: detecting the original audio and video code stream to obtain fifth type perception information corresponding to the original audio and video code stream;
wherein the fifth type of perceptual information comprises: audio attribute information and/or video attribute information.
Because the original packaging format of the original audio/video code stream does not match the target packaging format and the conversion has failed, the control device cannot decapsulate and decode the original audio/video code stream. The control device can therefore instead detect the original audio/video code stream to obtain the fifth type of perception information corresponding to it.
The fifth type of perception information may include audio attribute information and/or video attribute information of the original audio/video code stream. The audio attribute information may include information such as the sampling frequency, sampling bit number, channel number, bit rate, and baud rate; the video attribute information may include information such as the image size, code stream, bit rate, and timestamp.
In addition, when the original audio and video code stream is detected, the packaging attribute of the original audio and video code stream, such as the original packaging format, can also be obtained. Further, the obtained encapsulation attribute of the original audio/video code stream can also be used as the fifth type of perception information corresponding to the original audio/video code stream.
In this way, in this specific implementation manner, before executing step S103, the control device may execute step S102 to obtain the first type of perception information corresponding to the original audio/video code stream. Then, when the format conversion is unsuccessful, the control device executes step D1 to obtain the fifth type of perception information. The fifth type of perception information corresponding to the original audio/video code stream can thus be obtained comprehensively and accurately by detecting the original audio/video code stream.
Based on this, for an original audio/video code stream whose original packaging format does not match the target packaging format and whose format conversion was unsuccessful, the control device can obtain at least the first type of perception information and the fifth type of perception information corresponding to that code stream.
Specifically, the types of perception information acquired by the control device may be any one of the following cases. Case 1: the first type and the fifth type of perception information are acquired. Case 2: the first type, the second type, and the fifth type of perception information are acquired. Case 3: the first type, the third type, and the fifth type of perception information are acquired. Case 4: the first type, the second type, the third type, and the fifth type of perception information are acquired. This is all reasonable.
In addition, as with an original audio/video code stream whose original packaging format does not match the target packaging format, for an original audio/video code stream whose original packaging format matches the target packaging format, at least the first type of perception information can be obtained. The second type of perception information and/or the third type of perception information of such a code stream can also be obtained. Furthermore, after such a code stream is decapsulated and decoded, the decoded audio/video data can be analyzed to obtain the sixth type of perception information, which includes audio attribute information and video attribute information.
That is to say, for an original audio/video code stream with an original packaging format matching a target packaging format, in the embodiment of the present invention, at least first type sensing information and sixth type sensing information of the original audio/video code stream with the original packaging format matching the target packaging format may also be obtained.
It can be understood that, in each of the above specific implementation manners, for each obtained original audio/video code stream, the control device in the video monitoring system can obtain at least one type of perception information corresponding to that code stream. Based on this, the control device can acquire more information about the original audio/video code stream. Furthermore, the control device can present the acquired perception information to the user in various forms, such as tables and reports, so that the user gains a better understanding of the original audio/video code stream.
After the various kinds of perception information corresponding to the original audio/video code stream and/or the target audio/video code stream are obtained, they should be easy for technical personnel to view, so that the original and/or target audio/video code stream can be better analyzed as a whole. To this end, optionally, in a specific implementation manner, the processing method of the audio/video code stream may further include the following step E1:
step E1: and generating a global report about the original audio and video code stream based on the determined various kinds of perception information.
In this specific implementation manner, the control device may generate a global report about the original audio/video code stream based on the determined various types of perception information. Wherein, the acquisition source of each type of perception information is different.
When the above-mentioned fourth type of perceptual information exists in the global report, the fourth type of perceptual information is audio attribute information and/or video attribute information about a target audio/video code stream. The target audio and video code stream is obtained by successfully converting the packaging format of the original audio and video code stream, so the fourth sensing information can also be used as indirect sensing information about the original audio and video code stream.
Further, in the description of the above specific implementation manners, it can be found that: information characterizing the same attribute may be present in different kinds of perceptual information. For example, the frame rate may be included in both the second type of perceptual information and the third type of perceptual information. Ideally, the attribute values of information representing the same attribute present in different kinds of perceptual information should be the same.
Based on this, in a specific implementation manner, when the global report includes at least two types of perceptual information, the processing method of the audio/video code stream may further include the following steps F1 to F3:
step F1: judging whether target information representing the same attribute exists in different types of perception information included in the global report; if yes, executing step F2;
step F2: judging whether the attribute values of the target information are the same in different types of perception information; if not, executing the step F3;
step F3: the target information is marked as an outlier in the global report.
In this specific implementation manner, when the global report includes at least two types of perception information, the control device may determine whether target information representing the same attribute exists in the different types of perception information included in the global report; if so, it determines whether the attribute values of the target information are the same across those types of perception information. When the attribute values differ, the target information about the original audio/video code stream is abnormal, and the control device can mark the target information as an abnormal item.
For example, the global report includes first type of sensing information, second type of sensing information, third type of sensing information, and fourth type of sensing information, and the frame rate is included in both the second type of sensing information and the third type of sensing information. In this way, the control device can determine whether the specific values of the frame rates included in the second type of perceptual information and the third type of perceptual information are the same. If not, the control device may mark the frame rate as an outlier.
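Steps F1 to F3 can be sketched as a comparison over the perception information collected in the global report. In this minimal sketch the report is modeled as a dictionary mapping each type of perception information to its attribute values; that structure, the key names in the usage note, and the function name are assumptions made for illustration.

```python
from typing import Any, Dict, List


def find_outliers(perception: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return attributes that appear in several types of perception
    information (F1) with differing values (F2), to be marked abnormal (F3)."""
    values_by_attr: Dict[str, List[Any]] = {}
    for attrs in perception.values():
        for attr, value in attrs.items():
            values_by_attr.setdefault(attr, []).append(value)

    outliers = []
    for attr, values in values_by_attr.items():
        # Only attributes reported by at least two types can conflict.
        if len(values) > 1 and any(v != values[0] for v in values[1:]):
            outliers.append(attr)
    return sorted(outliers)
```

For instance, if the second type of perception information reports a frame rate of 25 and the third type reports 30, `find_outliers` returns `["frame_rate"]`, and the frame rate would be marked as an abnormal item in the global report.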
The embodiment of the invention also provides a processing device of the audio and video code stream.
Fig. 4 is a schematic structural diagram of a device for processing an audio/video code stream according to an embodiment of the present invention. The device is applied to control equipment in a video monitoring system, and the video monitoring system comprises the control equipment and audio and video acquisition equipment. As shown in fig. 4, the apparatus may include the following modules:
the format determining module 410 is configured to determine an original packaging format of an original audio/video code stream obtained by stream fetching in a process that the control device fetches a stream from the audio/video acquisition device;
a format judging module 420, configured to judge whether the original packaging format matches the target packaging format; if not, triggering the format conversion module; wherein the target packaging format is a packaging format that the control device is capable of supporting;
the format conversion module 430 is configured to convert an original packaging format of the original audio/video code stream into a target packaging format;
and a code stream obtaining module 440, configured to obtain a target audio/video code stream to be decapsulated when the conversion of the encapsulation format is successful.
As can be seen from the above, with the adoption of the scheme provided by the embodiment of the present invention, when the control device obtains the original audio/video code stream generated by the audio/video acquisition device, the control device can determine the original packaging format of the original audio/video code stream, and when the original packaging format is judged to be not matched with the target packaging format, the control device converts the original packaging format of the original audio/video code stream into the target packaging format. Therefore, when the conversion of the packaging format is successful, the target audio/video code stream to be unpacked can be obtained. Therefore, the original audio and video code stream with the original packaging format not matched with the packaging format which can be supported by the control equipment can be converted into the target audio and video code stream which can be unpacked by the control equipment through the packaging format conversion, and the universality of the control equipment is improved.
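The cooperation of the modules above can be sketched as follows. This is a minimal sketch under stated assumptions: the format names, the `convert` callback, the `ConversionError` exception, and the choice to pass a matching stream through unchanged are all hypothetical and are not taken from the embodiment, which only states what happens when the conversion succeeds.

```python
SUPPORTED_FORMATS = {"PS", "TS"}  # hypothetical formats the control device supports


class ConversionError(Exception):
    """Raised in this sketch when the packaging-format conversion fails."""


def obtain_target_stream(original_stream, original_format, convert, target_format="PS"):
    """Return a (stream, format) pair that the control device can decapsulate."""
    # Format judging: a matching stream needs no conversion (assumed behaviour).
    if original_format in SUPPORTED_FORMATS:
        return original_stream, original_format
    # Format conversion: delegate to a caller-supplied converter.
    converted = convert(original_stream, original_format, target_format)
    if converted is None:
        raise ConversionError(f"cannot convert {original_format} to {target_format}")
    # Code stream obtaining: the converted stream is the target stream.
    return converted, target_format


def fake_convert(stream, src, dst):
    # Stand-in converter that only tags the payload with the new format name.
    return b"[" + dst.encode() + b"]" + stream


stream, fmt = obtain_target_stream(b"payload", "FLV", fake_convert)
```

Passing the converter in as a parameter keeps the judging logic independent of any particular conversion library.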
Optionally, in a specific implementation manner, the format determining module 410 may be specifically configured to:
detecting the packaging attribute of the original audio and video code stream to obtain the target packaging attribute of the original audio and video code stream; respectively determining the matching degree of the target packaging attribute and the packaging attribute of each type of packaging format; and determining the packaging format corresponding to the determined highest matching degree as the original packaging format of the original audio and video code stream.
Alternatively, the format determining module 410 may be specifically configured to:
acquiring encapsulation information fed back by audio and video acquisition equipment; and determining the packaging format included in the packaging information as the original packaging format of the original audio and video code stream.
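The two alternatives above can be sketched as follows. The attribute names, the format table `FORMAT_ATTRIBUTES`, and the use of Jaccard similarity as the matching degree are assumptions made for illustration; the embodiment does not specify how the matching degree is computed.

```python
# Hypothetical packaging attributes per packaging format.
FORMAT_ATTRIBUTES = {
    "format_a": {"sync_byte", "fixed_packet_size", "pid_field"},
    "format_b": {"pack_start_code", "system_header"},
    "format_c": {"tag_header", "previous_tag_size"},
}


def matching_degree(detected, reference):
    """Jaccard similarity between two attribute sets (an assumed metric)."""
    if not detected and not reference:
        return 0.0
    return len(detected & reference) / len(detected | reference)


def determine_original_format(detected_attributes, fed_back_info=None):
    """Determine the original packaging format of a stream."""
    # Second alternative: trust the format fed back by the acquisition device.
    if fed_back_info and "packaging_format" in fed_back_info:
        return fed_back_info["packaging_format"]
    # First alternative: pick the format with the highest matching degree.
    return max(
        FORMAT_ATTRIBUTES,
        key=lambda name: matching_degree(detected_attributes, FORMAT_ATTRIBUTES[name]),
    )


by_detection = determine_original_format({"sync_byte", "pid_field"})
by_feedback = determine_original_format(set(), {"packaging_format": "format_c"})
```

A real detector would likely weight attributes differently; an unweighted set overlap is used here only to keep the sketch short.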
Optionally, in a specific implementation manner, the processing device for the audio/video code stream may further include:
the first information acquisition module is used for acquiring first-class sensing information corresponding to an original audio and video code stream before judging whether the original packaging format is matched with the target packaging format;
wherein the first type of perceptual information comprises at least one of the following information: a transport protocol type, a transport control protocol type, and a code stream type.
The second information acquisition module is used for acquiring second type perception information and/or third type perception information corresponding to the original audio and video code stream after the step of acquiring the first type perception information corresponding to the original audio and video code stream;
wherein the second type of perceptual information includes information extracted from the audio and video description information fed back by the audio and video acquisition equipment, the audio and video description information being the information fed back by the audio and video acquisition equipment in the process that the control equipment fetches the stream from the audio and video acquisition equipment; and the third type of perceptual information comprises frame attribute information.
Optionally, in a specific implementation manner, the processing device for the audio/video code stream may further include:
the third information acquisition module is used for decapsulating and decoding a target audio/video code stream to be decapsulated to obtain decoded audio/video data, and analyzing the decoded audio/video data to obtain fourth-class sensing information;
wherein the fourth type of perceptual information comprises: audio attribute information and/or video attribute information.
Optionally, in a specific implementation manner, the processing device for the audio/video code stream may further include:
and the report generation module is used for generating a global report related to the original audio and video code stream based on the determined various kinds of perception information.
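One possible shape for the report generation module is sketched below. The `GlobalReport` class, its fields, the stream identifier, and the sample attribute names are hypothetical; the embodiment only states that the report is generated from the determined kinds of perception information.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalReport:
    stream_id: str
    # Perception-information type -> {attribute: value}
    perception: dict = field(default_factory=dict)

    def add(self, info_type, attributes):
        # Merge, so a type collected in several steps accumulates its attributes.
        self.perception.setdefault(info_type, {}).update(attributes)
        return self


def generate_global_report(stream_id, collected):
    """Build a report from (info_type, attributes) pairs, skipping empty ones."""
    report = GlobalReport(stream_id)
    for info_type, attributes in collected:
        if attributes:  # a type that was not obtained is left out
            report.add(info_type, attributes)
    return report


global_report = generate_global_report("camera-01", [
    ("first", {"transport_protocol": "RTP", "stream_type": "main"}),
    ("second", {"frame_rate": 25}),
    ("third", {}),
    ("fourth", {"video_resolution": "1920x1080"}),
])
```

Keeping each type under its own key preserves where every value came from, which a later consistency check between types would need.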
When the global report includes at least two types of perceptual information, the processing device for the audio/video code stream may further include:
the information judgment module is used for judging whether target information representing the same attribute exists in different types of perception information included in the global report; if yes, triggering an attribute value judgment module;
the attribute value judging module is used for judging whether the attribute values of the target information are the same in different types of perception information; if not, triggering an information marking module;
and the information marking module is used for marking the target information as an abnormal item in the global report.
Corresponding to the method for processing an audio/video code stream provided in the foregoing embodiment of the present invention, an embodiment of the present invention further provides a control device, as shown in fig. 5, including a processor 501, a communication interface 502, a memory 503 and a communication bus 504, where the processor 501, the communication interface 502 and the memory 503 complete mutual communication through the communication bus 504,
a memory 503 for storing a computer program;
the processor 501 is configured to implement the processing method of the audio/video code stream provided by the above embodiment of the present invention when executing the program stored in the memory 503.
Specifically, the processing method of the audio and video code stream is applied to a control device in a video monitoring system, wherein the video monitoring system comprises the control device and an audio and video acquisition device; the method comprises the following steps:
in the process that the control equipment extracts the stream from the audio and video acquisition equipment, determining the original packaging format of the original audio and video code stream obtained by extracting the stream;
judging whether the original packaging format is matched with the target packaging format; wherein, the target packaging format is: the packaging formats that the control device can support;
if not, converting the original packaging format of the original audio and video code stream into a target packaging format;
and when the conversion of the packaging format is successful, obtaining a target audio and video code stream to be unpacked.
It should be noted that other implementation manners of the processing method for an audio/video code stream, which is implemented by the processor 501 executing the program stored in the memory 503, are the same as the embodiment of the processing method for an audio/video code stream provided in the foregoing method embodiment, and are not described here again.
As can be seen from the above, with the adoption of the scheme provided by the embodiment of the present invention, when the control device obtains the original audio/video code stream generated by the audio/video acquisition device, the original packaging format of the original audio/video code stream can be determined, and when the original packaging format is judged to be not matched with the target packaging format, the original packaging format of the original audio/video code stream is converted into the target packaging format. Therefore, when the conversion of the packaging format is successful, the target audio/video code stream to be unpacked can be obtained. Therefore, the original audio and video code stream with the original packaging format not matched with the packaging format which can be supported by the control equipment can be converted into a target audio and video code stream which can be unpacked by the control equipment through packaging format conversion, and therefore the universality of the control equipment is improved.
The communication bus mentioned above for the control device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the control device and other devices.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Corresponding to the method for processing an audio/video code stream provided in the embodiment of the present invention, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the method for processing an audio/video code stream provided in the embodiment of the present invention is implemented.
Specifically, the processing method of the audio and video code stream is applied to a control device in a video monitoring system, wherein the video monitoring system comprises the control device and an audio and video acquisition device; the method comprises the following steps:
in the process that the control equipment extracts the stream from the audio and video acquisition equipment, determining the original packaging format of the original audio and video code stream obtained by extracting the stream;
judging whether the original packaging format is matched with the target packaging format; wherein, the target packaging format is: the packaging formats that the control device can support;
if not, converting the original packaging format of the original audio and video code stream into a target packaging format;
and when the conversion of the packaging format is successful, obtaining a target audio/video code stream to be unpacked.
It should be noted that other implementation manners of the method for processing an audio/video code stream, which are implemented when the computer program is executed by the processor, are the same as the embodiments of the method for processing an audio/video code stream provided in the foregoing method embodiment, and are not described herein again.
As can be seen from the above, with the adoption of the scheme provided by the embodiment of the present invention, when the computer program is executed by the processor, when the original audio/video code stream generated by the audio/video acquisition device is acquired, the original packaging format of the original audio/video code stream can be determined, and when the original packaging format is judged to be not matched with the target packaging format, the original packaging format of the original audio/video code stream is converted into the target packaging format. Therefore, when the conversion of the packaging format is successful, the target audio/video code stream to be unpacked can be obtained. Therefore, the original audio and video code stream with the original packaging format not matched with the packaging format which can be supported by the control equipment can be converted into a target audio and video code stream which can be unpacked by the control equipment through packaging format conversion, and therefore the universality of the control equipment is improved.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. The method for processing the audio and video code stream is characterized by being applied to control equipment in a video monitoring system, wherein the video monitoring system comprises the control equipment and audio and video acquisition equipment; the method comprises the following steps:
determining an original packaging format of an original audio and video code stream obtained by stream taking in the process of the control equipment taking the stream from the audio and video acquisition equipment;
judging whether the original packaging format is matched with the target packaging format; wherein the target packaging format is: a packaging format that the control device is capable of supporting;
if not, converting the original packaging format of the original audio and video code stream into a target packaging format;
when the conversion of the packaging format is successful, obtaining a target audio/video code stream to be unpacked;
acquiring at least two types of perception information in the process of processing the original audio and video code stream and/or the target audio and video code stream; the processing comprises at least one of fetching the original audio and video code stream, performing frame header analysis after the original audio and video code stream is fetched, performing packaging format conversion on the original audio and video code stream, decapsulating the target audio and video code stream, and decoding the decapsulated target audio and video code stream, wherein the sensing information comprises at least one of audio attribute information of the original audio and video code stream, video attribute information of the original audio and video code stream, audio attribute information of the decoded target audio and video code stream, video attribute information of the decoded target audio and video code stream, frame header analysis information of the original audio and video code stream, and transmission attribute information of the original audio and video code stream;
generating a global report about the original audio and video code stream based on the acquired perception information;
judging whether target information representing the same attribute exists in different types of perception information included in the global report;
if yes, judging whether the attribute values of the target information are the same in the different kinds of perception information;
if not, the target information is marked as an abnormal item in the global report.
2. The method of claim 1,
the step of determining the original packaging format of the original audio and video code stream comprises the following steps:
detecting the packaging attribute of the original audio and video code stream to obtain the target packaging attribute of the original audio and video code stream;
respectively determining the matching degree of the target packaging attribute and the packaging attribute of each type of packaging format;
determining the packaging format corresponding to the determined highest matching degree as the original packaging format of the original audio and video code stream;
or,
the step of determining the original packaging format of the original audio and video code stream comprises the following steps:
acquiring packaging information fed back by the audio and video acquisition equipment;
and determining the packaging format included in the packaging information as the original packaging format of the original audio and video code stream.
3. The method according to claim 1, wherein said step of obtaining at least two types of perceptual information during the processing of said original audio/video codestream and/or said target audio/video codestream comprises:
before the step of judging whether the original packaging format is matched with the target packaging format, acquiring first type perception information corresponding to the original audio and video code stream;
wherein the first type of perceptual information comprises at least one of the following pieces of information: a transmission protocol type, a transmission control protocol type and a code stream type;
after the step of obtaining the first type of perception information corresponding to the original audio and video code stream, obtaining second type of perception information and/or third type of perception information corresponding to the original audio and video code stream;
wherein the second type of perceptual information comprises information extracted from the audio and video description information fed back by the audio and video acquisition equipment, the audio and video description information being the information fed back by the audio and video acquisition equipment in the process that the control equipment fetches the stream from the audio and video acquisition equipment; and the third type of perceptual information comprises frame attribute information.
4. The method according to claim 1, wherein the step of obtaining at least two types of perceptual information during the process of processing the original audio/video code stream and/or the target audio/video code stream comprises:
decapsulating and decoding the target audio/video code stream to be decapsulated to obtain decoded audio/video data;
analyzing the decoded audio and video data to obtain fourth type perception information;
wherein the fourth type of perceptual information comprises: audio attribute information and/or video attribute information.
5. The processing device of the audio and video code stream is characterized by being applied to control equipment in a video monitoring system, wherein the video monitoring system comprises the control equipment and audio and video acquisition equipment; the device comprises:
the format determining module is used for determining the original packaging format of the original audio and video code stream obtained by stream taking in the process that the control equipment takes the stream from the audio and video acquisition equipment;
the format judging module is used for judging whether the original packaging format is matched with the target packaging format; if not, triggering the format conversion module; wherein the target packaging format is: a packaging format that the control device is capable of supporting;
the format conversion module is used for converting the original packaging format of the original audio and video code stream into a target packaging format;
the code stream obtaining module is used for obtaining a target audio and video code stream to be unpacked when the conversion of the packaging format is successful;
the information acquisition module is used for acquiring at least two types of perception information in the process of processing the original audio and video code stream and/or the target audio and video code stream; the processing comprises at least one of fetching the original audio and video code stream, performing frame header analysis after the original audio and video code stream is fetched, performing packaging format conversion on the original audio and video code stream, decapsulating the target audio and video code stream, and decoding the decapsulated target audio and video code stream, wherein the sensing information comprises at least one of audio attribute information of the original audio and video code stream, video attribute information of the original audio and video code stream, audio attribute information of the decoded target audio and video code stream, video attribute information of the decoded target audio and video code stream, frame header analysis information of the original audio and video code stream, and transmission attribute information of the original audio and video code stream;
the report generating module is used for generating a global report about the original audio and video code stream based on the determined various kinds of perception information;
the information judgment module is used for judging whether target information representing the same attribute exists in different types of perception information contained in the global report; if yes, triggering an attribute value judgment module;
the attribute value judging module is used for judging whether the attribute values of the target information are the same in the different types of perception information; if not, triggering an information marking module;
the information marking module is used for marking the target information as an abnormal item in the global report.
6. The apparatus of claim 5, wherein the format determination module is specifically configured to:
detecting the packaging attribute of the original audio and video code stream to obtain the target packaging attribute of the original audio and video code stream; respectively determining the matching degree of the target packaging attribute and the packaging attributes of various types of packaging formats; determining the packaging format corresponding to the determined highest matching degree as the original packaging format of the original audio and video code stream;
or, the format determination module is specifically configured to:
acquiring packaging information fed back by the audio and video acquisition equipment; and determining the packaging format included in the packaging information as the original packaging format of the original audio and video code stream.
7. The apparatus of claim 5, wherein the information obtaining module comprises:
the first information acquisition module is used for acquiring first-class sensing information corresponding to the original audio and video code stream before judging whether the original packaging format is matched with the target packaging format;
wherein the first type of perceptual information comprises at least one of the following information: a transmission protocol type, a transmission control protocol type and a code stream type;
the second information acquisition module is used for acquiring second type perception information and/or third type perception information corresponding to the original audio and video code stream after the step of acquiring the first type perception information corresponding to the original audio and video code stream;
wherein the second type of perceptual information includes information extracted from the audio and video description information fed back by the audio and video acquisition equipment, the audio and video description information being the information fed back by the audio and video acquisition equipment in the process that the control equipment fetches the stream from the audio and video acquisition equipment; and the third type of perceptual information comprises frame attribute information.
8. The apparatus of claim 5, wherein the information obtaining module comprises:
the third information acquisition module is used for decapsulating and decoding the target audio/video code stream to be decapsulated to obtain decoded audio/video data; analyzing the decoded audio and video data to obtain fourth type perception information;
wherein the fourth type of perceptual information includes: audio attribute information and/or video attribute information.
9. The control equipment is characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 1 to 4 when executing a program stored in the memory.
10. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 4.
CN201910804736.7A 2019-08-28 2019-08-28 Audio and video code stream processing method and device and control equipment Active CN112449212B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910804736.7A CN112449212B (en) 2019-08-28 2019-08-28 Audio and video code stream processing method and device and control equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910804736.7A CN112449212B (en) 2019-08-28 2019-08-28 Audio and video code stream processing method and device and control equipment

Publications (2)

Publication Number Publication Date
CN112449212A CN112449212A (en) 2021-03-05
CN112449212B true CN112449212B (en) 2022-11-04

Family

ID=74741889

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910804736.7A Active CN112449212B (en) 2019-08-28 2019-08-28 Audio and video code stream processing method and device and control equipment

Country Status (1)

Country Link
CN (1) CN112449212B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011198244A (en) * 2010-03-23 2011-10-06 Hiromitsu Hama Object recognition system, monitoring system using the same, and watching system
CN102256162A (en) * 2011-07-22 2011-11-23 网宿科技股份有限公司 Method and system for optimizing media-on-demand based on real-time file format conversion
CN103002353A (en) * 2011-09-16 2013-03-27 杭州海康威视数字技术股份有限公司 Method and device for packaging multimedia documents
CN103957469A (en) * 2014-05-21 2014-07-30 百视通网络电视技术发展有限责任公司 Internet video on demand method and system based on real-time packaging switching
CN103973653A (en) * 2013-02-01 2014-08-06 上海迪爱斯通信设备有限公司 Intelligent sensing analyzer
CN108156481A (en) * 2016-12-02 2018-06-12 深圳市优朋普乐传媒发展有限公司 A kind of detection method and device of live source

Also Published As

Publication number Publication date
CN112449212A (en) 2021-03-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant