CN115278305A - Video processing method, video processing system, and storage medium - Google Patents


Info

Publication number
CN115278305A
Authority
CN
China
Prior art keywords
video stream
end device
video
decoding
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210514646.6A
Other languages
Chinese (zh)
Other versions
CN115278305B (en)
Inventor
卢成翔
吴惠敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202210514646.6A (granted as CN115278305B)
Publication of CN115278305A
Application granted
Publication of CN115278305B
Legal status: Active (current)
Anticipated expiration of legal status

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/23418: Processing of video elementary streams (server side) involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N21/234363: Reformatting operations of video signals (server side) by altering the spatial resolution, e.g. for clients with a lower screen resolution
    • H04N21/234381: Reformatting operations of video signals (server side) by altering the temporal resolution, e.g. decreasing the frame rate by frame skipping
    • H04N21/44008: Processing of video elementary streams (client side) involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N21/440263: Reformatting operations of video signals (client side) by altering the spatial resolution, e.g. for displaying on a connected PDA
    • H04N21/440281: Reformatting operations of video signals (client side) by altering the temporal resolution, e.g. by frame skipping

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application relates to a video processing method, a video processing system, and a storage medium. A control instruction is sent to a front-end device, where the control instruction instructs the front-end device to adjust the inter-frame dependency relationships of reference frames in a video stream, yielding an encoded first video stream; the back-end device then decodes the first video stream, which supports decoding in at least two sequential video frame sequences. This solves the problem in the related art that the back-end device cannot support a newly added channel while the accessed video streams satisfy the preview and/or playback requirements, so that the back-end device can support newly added channels.

Description

Video processing method, video processing system, and storage medium
Technical Field
The present application relates to the field of video processing, and in particular, to a video processing method, a video processing system, and a storage medium.
Background
After the front-end device captures and encodes a video stream, the stream is pulled by the back-end device for analysis. Besides analysis by the back-end device, the video stream is also used by client devices for preview or playback, so in general the frame rate of the video stream is chosen to preserve the preview or playback effect first. Some analysis algorithms require a high frame rate and others a low one; to save the performance overhead of the analysis algorithms, the video stream is usually decoded into YUV frames, which are then passed to the analysis algorithm module after frame extraction or scaling.
Although the related art saves the performance overhead of the analysis algorithm, the number of channels the whole back-end device can support shrinks considerably. Under a given allocation ratio, the decoding capability and the analysis capability of the back-end device are roughly matched, i.e. constrained by the analysis specification. For example, when the back-end device accesses 1080P video streams, it can support 16 channels of analysis (including decoding and analysis, with frame extraction and scaling applied) at a frame rate of 30 frames/second; when 4K video streams are accessed, the back-end device supports only 4 such channels overall at 30 frames/second, and if a new channel accessing a video stream is added, the back-end device cannot support it.
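Assuming the formula given later in the disclosure, decoding capability = resolution width × resolution height × frame rate, the channel counts in this example can be checked with a short sketch (variable and function names are illustrative, not from the patent):

```python
def decoding_capability(width: int, height: int, fps: float) -> float:
    # Decoding load of one channel: resolution width x resolution height x frame rate.
    return width * height * fps

# Total decoding budget implied by the 1080P example: 16 channels at 30 fps.
total_budget = 16 * decoding_capability(1920, 1080, 30)

# Channels supported for 4K (3840x2160) streams under the same budget.
channels_4k = int(total_budget // decoding_capability(3840, 2160, 30))
print(channels_4k)  # 4, matching the 4-channel figure in the example above
```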
In the related art, no effective solution has yet been proposed for the problem that the back-end device cannot support a newly added channel while the accessed video streams satisfy the preview and/or playback requirements.
Disclosure of Invention
In the present embodiment, a video processing method, a video processing system, and a storage medium are provided to solve the problem in the related art that a back-end device cannot support a newly added channel while the accessed video streams satisfy the preview and/or playback requirements.
In a first aspect, in this embodiment, a video processing method is provided, which is applied to a backend device, and includes:
sending a control instruction to the front-end device, wherein the control instruction is used for instructing the front-end device to adjust the inter-frame dependency relationships of the reference frames in the video stream to obtain an encoded first video stream;
decoding the first video stream, wherein the first video stream supports a backend device to decode in at least two sequential sequences of video frames.
In some embodiments, before sending the control instruction to the front-end device, the method further includes:
determining a current decoding capability and a desired decoding capability required to be provided when decoding the video stream of the front-end device;
comparing the current decoding capability with the desired decoding capability, and generating the control instruction according to the comparison result.
In some of these embodiments, the first video stream includes at least one group of pictures, each group of pictures including a plurality of non-key frames, and comparing the current decoding capability with the desired decoding capability and generating the control instruction according to the comparison result includes:
generating a first control instruction when the current decoding capability is smaller than the desired decoding capability, wherein the first control instruction is used for instructing the front-end device to release the inter-frame dependency relationships between some non-key frames in each group of pictures of the first video stream.
In some of these embodiments, the method further comprises:
generating a second control instruction when the current decoding capability is not smaller than the desired decoding capability, wherein the second control instruction is used for instructing the front-end device to add inter-frame dependency relationships between some non-key frames in each group of pictures of the first video stream.
In some embodiments, generating the second control instruction in the case that the current decoding capability is not smaller than the desired decoding capability includes:
determining the maximum decoding capability, and judging whether the maximum decoding capability is greater than the desired decoding capability;
generating the second control instruction when the maximum decoding capability is judged to be greater than the desired decoding capability.
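The two-branch instruction generation described in the embodiments above can be sketched as follows; the enum and function names are illustrative assumptions, not terms from the patent:

```python
from enum import Enum

class Control(Enum):
    RELEASE_DEPENDENCIES = 1  # "first control instruction": release dependencies between some non-key frames
    ADD_DEPENDENCIES = 2      # "second control instruction": add dependencies between some non-key frames
    NO_CHANGE = 3

def generate_control(current: float, desired: float, maximum: float) -> Control:
    # current < desired: the back-end cannot keep up, so ask the front end
    # to release inter-frame dependencies (a sparser first video frame sequence).
    if current < desired:
        return Control.RELEASE_DEPENDENCIES
    # current >= desired: only add dependencies back if the maximum decoding
    # capability still exceeds what the stream would then require.
    if maximum > desired:
        return Control.ADD_DEPENDENCIES
    return Control.NO_CHANGE
```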
In some of these embodiments, the method further comprises:
determining a current analysis capability and determining a desired analysis capability required to be provided when analyzing the first video stream;
determining whether the current analysis capability is less than the desired analysis capability;
refusing to access the front-end device when the current analysis capability is judged to be smaller than the desired analysis capability.
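A minimal sketch of this admission check, with illustrative names:

```python
def admit_front_end(current_analysis: float, desired_analysis: float) -> bool:
    # Access is refused when the channel's current analysis capability is
    # smaller than the capability analyzing the first video stream would require.
    return current_analysis >= desired_analysis

print(admit_front_end(8.0, 30.0))   # False: access refused
print(admit_front_end(30.0, 8.0))   # True: front-end device admitted
```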
In some of these embodiments, decoding the first video stream comprises:
extracting some of the reference frames in the first video stream for decoding, and/or decoding all of the reference frames in the first video stream.
In a second aspect, in this embodiment, a video processing method is provided, applied to a front-end device, and includes:
receiving a control instruction, and adjusting the inter-frame dependency relationship of a reference frame in a video stream according to the control instruction to obtain a first video stream;
and sending the first video stream to a back-end device, wherein the first video stream supports the back-end device to decode with at least two sequential video frame sequences.
In some embodiments, the first video stream includes at least one group of pictures, each group of pictures including a plurality of non-key frames, and adjusting the inter-frame dependency relationships of the reference frames in the video stream according to the control instruction includes:
removing inter-frame dependency relationships between some non-key frames in each group of pictures of the first video stream according to a first control instruction; or,
adding inter-frame dependency relationships between some non-key frames in each group of pictures of the first video stream according to a second control instruction.
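Under the assumption that each frame simply records the index of the frame it references, the front-end adjustment above can be sketched as follows (identifiers are illustrative):

```python
def build_references(n_frames: int, released: bool) -> dict:
    # Frame 0 is the key (I) frame; frames 1..n-1 are non-key (P) frames.
    refs = {0: None}
    for i in range(1, n_frames):
        if released and i % 2 == 0:
            refs[i] = i - 2  # even P frames skip over the odd frame before them
        else:
            refs[i] = i - 1  # normal chain: reference the immediately preceding frame
    return refs

# With dependencies released, an analysis decoder can follow 0 -> 2 -> 4 only,
# while 0 -> 1, 2 -> 3, ... still lets a full-frame-rate decode show every frame.
print(build_references(5, released=True))  # {0: None, 1: 0, 2: 0, 3: 2, 4: 2}
```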
In a third aspect, in this embodiment, a video processing system is provided, which includes a front-end device and a back-end device connected to each other, where the front-end device is configured to capture and encode video to generate a video stream and send the video stream to the back-end device, and the back-end device is configured to execute the video processing method according to the first aspect.
In some embodiments, the front-end device is further configured to perform the video processing method according to the second aspect.
In some of these embodiments, the video processing system further comprises: a third party device configured to be able to decode all reference frames in the first video stream.
In a fourth aspect, in the present embodiment, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the video processing method of the first or second aspect.
Compared with the related art, the video processing method, the video processing system, and the storage medium provided in this embodiment send a control instruction to the front-end device, where the control instruction instructs the front-end device to adjust the inter-frame dependency relationships of the reference frames in a video stream to obtain an encoded first video stream, and then decode the first video stream, which supports the back-end device decoding in at least two sequential video frame sequences. This solves the problem in the related art that the back-end device cannot support a newly added channel while the accessed video streams satisfy the preview and/or playback effect, so that the back-end device can support newly added channels.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a schematic diagram of a video frame sequence for full frame rate decoding according to an embodiment of the present application;
FIG. 2 is a first flowchart of a video processing method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a video frame sequence obtained after adjustment of encoding parameters according to an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating a principle of adjusting inter-frame dependency of a reference frame according to an embodiment of the present application;
FIG. 5 is a schematic flow chart illustrating analysis of a video stream by a backend device according to an embodiment of the present application;
FIG. 6 is a flowchart illustrating a method for dynamically adjusting inter-frame dependencies of reference frames according to an embodiment of the present application;
FIG. 7 is a flowchart illustrating a video processing method according to an embodiment of the present application;
FIG. 8 is a block diagram of a video processing system according to an embodiment of the present application;
fig. 9 is a schematic diagram of a video processing system according to a preferred embodiment of the present application.
Detailed Description
For a clearer understanding of the objects, technical solutions and advantages of the present application, reference is made to the following description and accompanying drawings.
Unless defined otherwise, technical or scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terms "a", "an", and "the" and similar referents in this application do not denote a limitation of quantity, whether singular or plural. The terms "comprises", "comprising", "has", "having", and any variations thereof are intended to cover non-exclusive inclusions; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (units) is not limited to the listed steps or modules, but may include other steps or modules (units) not listed or inherent to such a process, method, article, or apparatus. References in this application to "connected", "coupled", and the like are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. Reference to "a plurality" means two or more. "And/or" describes an association between objects and covers three cases; for example, "A and/or B" may mean that A exists alone, that A and B exist simultaneously, or that B exists alone. The character "/" generally indicates an "or" relationship between the associated objects. The terms "first", "second", "third", and the like are used for distinguishing between similar items and do not necessarily describe a particular sequential or chronological order.
Before describing the method provided in this embodiment, the inventive concept of this application should be explained. "The video stream frame rate preferentially ensures the preview or playback effect" means that the frame rate and resolution of the video stream generated after encoding by the front-end device do not change, so that a third-party device can decode the video stream at the full frame rate. Fig. 1 is a schematic diagram of a video frame sequence provided for full-frame-rate decoding in an embodiment of the present application. As shown in Fig. 1, an I frame is a complete image, and a P frame can be understood as a patch over a complete image. I frame 1 can be directly decoded into the first YUV frame (a full image); for P frame 1, the first YUV frame and P frame 1 (the patch) are decoded together to generate the second YUV frame; for P frame 2, the second YUV frame and P frame 2 (the patch) are decoded together to generate the third YUV frame. The whole decoding must proceed in the inter-frame dependency order of the reference frames, otherwise the subsequent YUV images are corrupted.
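The reference chain described above can be modeled with a toy decoder, where integers stand in for image data and addition stands in for applying a patch (a sketch, not an actual codec):

```python
def decode(frames):
    # An I frame decodes to a complete image; each P frame is a patch applied
    # to the most recently decoded frame, so decoding order must follow the
    # inter-frame dependency order.
    decoded, current = [], None
    for kind, payload in frames:
        if kind == "I":
            current = payload               # complete image
        else:  # "P"
            assert current is not None, "a P frame needs a decoded reference"
            current = current + payload     # apply the patch
        decoded.append(current)
    return decoded

stream = [("I", 100), ("P", 1), ("P", 2)]   # integers stand in for image data
print(decode(stream))  # [100, 101, 103]
```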
The back-end device has different frame-rate requirements for the video stream in the decoding stage and in the analysis stage. Suppose the frame rate of the video stream is left unchanged at its initial 30 frames/second; when the video stream generated by the front-end device is used simultaneously for analysis by the back-end device and for preview and/or playback by a third-party device, the back-end device may need only 8 frames/second to achieve a good analysis effect.
In order to solve the problem that the back-end device cannot support a newly added channel while the accessed video streams satisfy the preview and/or playback effect, the following hypothesis is proposed: on the premise of not affecting subsequent analysis, if the frame rate of the video stream is reduced when the back-end device decodes it, so as to fit the decoding capability of the back-end device, then a newly added channel can be supported.
On the one hand, because of the third-party device, the video stream generated by the front-end device must support full-frame-rate decoding, i.e. the frame rate of the video stream cannot be changed. On the other hand, when analyzing the video stream, the back-end device must decode all P frames and then perform frame extraction on the YUV output, rather than extracting frames before decoding. That is, there is a conflict between meeting the preview and/or playback requirements and meeting the analysis requirements across the back-end device and the third-party device; and within the back-end device, there is a tension between fully fitting the decoding capability and fully fitting the analysis capability.
In view of the above technical problems and contradictions, a video processing method applied to a back-end device is provided in this embodiment. Fig. 2 is a flowchart of the video processing method of this embodiment; as shown in fig. 2, the flow includes the following steps:
step S201, sending a control instruction to the front-end device, wherein the control instruction is used for instructing the front-end device to adjust the inter-frame dependency relationship of the reference frame in the video stream, so as to obtain a coded first video stream;
step S202, decoding a first video stream, where the first video stream supports the backend device to decode with at least two sequential video frame sequences.
In this embodiment, at the encoding stage, the front-end device is controlled by the back-end device to adjust the inter-frame dependency relationships of the reference frames in the video stream, yielding an encoded first video stream. Without the frame rate of the video stream being changed (the full frame rate is maintained), the first video stream allows the back-end device to decode according to one of the sequential video frame sequences, or according to the original sequential video frame sequence, i.e. at the full frame rate.
For example, suppose at least two sequential video frame sequences are provided: a first video frame sequence and a second video frame sequence, where the first video frame sequence is arranged in an order different from the full-frame-rate order and the second video frame sequence is arranged in the full-frame-rate order. With this arrangement, when the front-end device transmits the video stream to the back-end device, the back-end device decodes according to the first video frame sequence without the frame rate being changed.
Because the back-end device controls how the front-end device encodes, the inter-frame dependencies of the reference frames in the video stream are dynamically adjustable, i.e. the frame rate of the first video frame sequence is dynamically adjustable; to guarantee a certain number of channels, the frame rate of the first video frame sequence can be reduced.
For example, when the total decoding capability becomes insufficient after more analysis channels are accessed by the back-end device, the front-end device can be controlled to adjust its encoding parameters to obtain an encoded first video stream whose GOP frame reference relationships are shown in fig. 3. When a third-party device previews and/or plays back the first video stream, all frames are decoded and displayed, and the frame rate is still 30 frames/second; when the back-end device analyzes it, only the frames of the black part are decoded, and the frame rate is reduced to 1/3, lowering the decoding performance requirement.
In the above steps S201 to S202, the back-end device controls the front-end device to adjust the inter-frame dependency relationships of the reference frames in the video stream, obtaining a video stream that supports decoding in at least two sequential video frame sequences: the back-end device or the third-party device can decode the video stream at the full frame rate, achieving the preview and/or playback effect, while the back-end device can also decode it at a lower frame rate and support new channels, guaranteeing a certain number of channels. Through the above steps, the problem that the back-end device cannot support a newly added channel while the accessed video streams satisfy the preview and/or playback effect is solved, so that the back-end device can support newly added channels.
In one embodiment, before sending the control instruction to the front-end device, the back-end device also determines its current decoding capability and the desired decoding capability required for decoding the video stream of the front-end device, compares the two, and generates the control instruction according to the comparison result.
The current decoding capability of the back-end device refers to the decoding capability the back-end device can devote, at the current stage, to the channel analyzing the video stream; the desired decoding capability refers to the decoding capability required by the video stream specification of the currently accessed front-end device. The current decoding capability is determined from the analysis specification of the current channel in the back-end device, which carries resolution information and frame rate information; the desired decoding capability is determined from the resolution information and frame rate information carried by the video stream. Specifically, decoding capability = resolution width × resolution height × frame rate.
In this embodiment, the current decoding capability and the desired decoding capability are compared in real time so that the front-end device is controlled to encode the video stream reasonably, and the generated first video stream fits the current decoding capability of the back-end device.
In some embodiments of the present application, the video stream includes at least one Group of Pictures (GOP), each group of pictures including a plurality of non-key frames. After comparing the current decoding capability with the desired decoding capability, the back-end device controls the front-end device to adjust the inter-frame dependencies of the reference frames in the video stream according to the different cases, as follows:
in one embodiment, the back-end device compares the current decoding capability with the expected decoding capability, and generates a first control instruction when the current decoding capability is smaller than the expected decoding capability, wherein the first control instruction is used for instructing the front-end device to release the inter-frame dependency relationship between partial non-key frames in each picture group of the first video stream. The arrangement is such that the desired decoding capability does not exceed what the backend device can actually provide.
In one embodiment, the back-end device compares the current decoding capability with the desired decoding capability and generates a second control instruction when the current decoding capability is not smaller than the desired decoding capability, where the second control instruction instructs the front-end device to add inter-frame dependency relationships between some non-key frames in each group of pictures of the first video stream. This lets the desired decoding capability approach the decoding capability the back-end device can actually provide, so that its decoding capability is fully exploited.
Fig. 4 illustrates the principle of adjusting the inter-frame dependencies of reference frames in this embodiment. As shown in Fig. 4, taking the removal of inter-frame dependencies among some non-key frames as an example, two video frame sequences are established so that the back-end device and a third-party device can each decode in one of the following orders, where an I frame is a key frame and a P frame is a non-key frame:
when the video stream is provided to the back-end device for analysis, the first video frame sequence is used: I frame 1, P frame 2, P frame 4, P frame 6, … …, I frame 2;
when the video stream is provided to a third-party device for preview and/or playback, the second video frame sequence is used: I frame 1, P frame 2, P frame 3, … …, I frame 2.
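Under the stated assumption that the skipped non-key frames (P frame 3, P frame 5, …) are the ones whose dependencies were removed, both decode orders can be derived from a single GOP; the sketch below is illustrative and not the patent's bitstream format:

```python
def decode_orders(gop: list[str]) -> tuple[list[str], list[str]]:
    """Return (analysis_order, preview_order) for one group of pictures.

    gop is a list such as ["I1", "P2", "P3", "P4", "P5", "P6"].  The
    analysis order keeps the key frame plus every other non-key frame,
    the ones that still form an unbroken reference chain; the preview
    order keeps every frame, i.e. full frame rate.
    """
    analysis = [gop[0]] + gop[1::2]   # I1, P2, P4, P6, ...
    preview = list(gop)               # I1, P2, P3, P4, ...
    return analysis, preview
```

With a GOP of I1 through P6, the analysis order is I1, P2, P4, P6 and the preview order is the full sequence, matching the two sequences listed above.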
In this embodiment, controlling the direction in which the inter-frame dependencies of the reference frames are adjusted allows the back-end device to decode at a lower or a higher frame rate as conditions change, maximizing the decoding and analysis performance of the channels in the back-end device.
Further control of the inter-frame dependencies of reference frames is described below.
In one embodiment, the back-end device compares the current decoding capability with the expected decoding capability; when the current decoding capability is not less than the expected decoding capability, it determines the maximum decoding capability and judges whether the maximum decoding capability is greater than the expected decoding capability. If so, a second control instruction is generated.
When the current decoding capability is not less than the expected decoding capability, the back-end device has decoding headroom. The front-end device is therefore controlled to encode based on the back-end device's maximum decoding capability: inter-frame dependencies among some non-key frames are added to the first video stream to raise the frame rate of the first video frame sequence, so that the expected decoding capability approaches the maximum decoding capability and the back-end device's decoding capability is fully used.
In addition, in one embodiment, the back-end device compares the current decoding capability with the expected decoding capability; when the current decoding capability is not less than the expected decoding capability, it determines the maximum decoding capability and judges whether the maximum decoding capability is greater than the expected decoding capability. If the maximum decoding capability is judged not greater than the expected decoding capability, the frame rate of the first video frame sequence is determined to already be optimal; it cannot be raised further and is kept unchanged.
In one embodiment, the back-end device further scales the decoded video stream to a preset resolution for analysis after decoding the video stream according to the first video frame sequence.
In one embodiment, the back-end device determines its current analysis capability and the expected analysis capability that must be provided to analyze the first video stream, then judges whether the current analysis capability is less than the expected analysis capability; if so, the front-end device is refused access.
In this embodiment, decoding and analysis compete for resources on the back-end device. Even if adjusting the inter-frame dependencies of the reference frames in the front-end device's video stream yields a first video frame sequence whose decoding demand the back-end device can actually meet, the front-end device is still refused access to the back-end device when no spare analysis capability remains.
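The admission check can be sketched as follows; the function name and units are illustrative, with capability measured here in analyzable 1080p frames per second:

```python
def admit_new_channel(spare_analysis_fps: float,
                      required_analysis_fps: float) -> bool:
    """Refuse a front-end device when, even though decoding could be
    made to fit by adjusting reference-frame dependencies, no spare
    analysis capability remains for its stream."""
    return spare_analysis_fps >= required_analysis_fps
```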
In one embodiment, when decoding the first video stream, the back-end device may extract only some of the reference frames for decoding; or it may decode all reference frames in the first video stream; or some of its channels may extract only some reference frames while the other channels decode all of them; or it may first extract only some reference frames for decoding and later decode all reference frames in the first video stream.
Fig. 5 is a schematic flowchart of a backend device analyzing a video stream according to an embodiment of the present application, and as shown in fig. 5, the flowchart includes the following steps:
step S501, receiving an encoded first video stream;
step S502, decoding the first video stream to obtain a second video stream containing multiple YUV frames;
step S503, when its own analysis capability is insufficient, performing frame extraction on the second video stream and discarding some YUV frames;
step S504, scaling the YUV frames in the second video stream to a preset resolution to obtain a third video stream;
step S505, sending the third video stream to an algorithm module for analysis;
step S506, returning the bounding-box coordinates of the detected targets;
and step S507, performing cutout encoding based on the bounding-box coordinates of the detected targets.
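The flow of steps S501 to S507 can be sketched end to end; the decode and analyze callables are illustrative stand-ins for the real decoder and algorithm module, and the stride-based frame extraction is an assumed implementation of step S503:

```python
def analyze_stream(first_video_stream, decode, analyze,
                   stream_fps, analysis_fps, preset_resolution=(1920, 1080)):
    """Sketch of steps S501-S507 for one channel."""
    # S501/S502: decode the encoded first video stream into YUV frames
    # (the "second video stream").
    yuv_frames = decode(first_video_stream)
    # S503: when analysis capability is insufficient, drop frames evenly.
    if analysis_fps < stream_fps:
        stride = stream_fps // analysis_fps
        yuv_frames = yuv_frames[::stride]
    # S504: scale each kept frame to the preset resolution
    # (the "third video stream").
    third_stream = [(frame, preset_resolution) for frame in yuv_frames]
    # S505/S506: the algorithm module returns bounding-box coordinates.
    boxes = [analyze(frame) for frame in third_stream]
    # S507: cutout encoding would consume these boxes; returned here.
    return boxes
```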
The video processing method of the present application will be described below with reference to specific application scenarios.
In the related art, the number of analysis channels and the analysis specification (typically, a given number of channels at a given resolution and frame rate) are set according to the hardware performance of the back-end device. The resolution of video captured by front-end devices keeps growing, and a site may mix many resolutions; because analysis requires first decoding to YUV, the back-end device's analysis specification cannot be stated accurately. To solve this problem, some embodiments provide a flow for dynamically adjusting the inter-frame dependencies of reference frames, shown in Fig. 6, which includes the following steps:
step S601, calculating the current decoding capability of an analysis channel in the back-end equipment at regular time;
step S602, judging whether the current decoding capability is less than the expected decoding capability; if so, executing step S604; if not, executing step S603;
step S603, judging whether the maximum decoding capability is greater than the expected decoding capability; if so, executing step S606; if not, executing step S605;
step S604, instructing the front-end device to adjust the inter-frame dependencies of the reference frames and reduce the frame rate of the first video frame sequence so that the decoding capability that must be provided does not exceed the maximum decoding capability;
step S605, keeping the existing inter-frame dependencies of the reference frames unchanged;
step S606, instructing the front-end device to adjust the inter-frame dependencies of the reference frames and increase the frame rate of the first video frame sequence so that the decoding capability that must be provided approaches the maximum decoding capability.
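The branching of steps S601 to S606 reduces to a three-way decision per timer tick; the function and return names below are illustrative:

```python
def adjust_frame_rate(current: float, expected: float, maximum: float) -> str:
    """One pass of the timed loop in Fig. 6: decide what to ask the
    front-end device to do with the reference-frame dependencies."""
    if current < expected:
        # Decoding capability is insufficient -> S604: lower the frame
        # rate of the first video frame sequence.
        return "reduce_frame_rate"
    if maximum > expected:
        # Headroom remains -> S606: raise the frame rate toward the
        # maximum decoding capability.
        return "increase_frame_rate"
    # Already optimal -> S605: keep the dependencies unchanged.
    return "keep_unchanged"
```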
By dynamically adjusting the inter-frame dependencies of the reference frames in the front-end device, an analysis specification independent of video resolution can be given. Without affecting the resolution and frame rate of video preview and/or playback, a certain number of analysis channels can be guaranteed to decode the video stream smoothly, and that number can reach the index set in the product's delivery specification across different resolution scenarios.
The following embodiments describe how decoding overhead is adjusted by adjusting the inter-frame dependencies of reference frames.
In some of these embodiments, the decoding capability and analysis capability of the backend device in the initial state are as follows:
decoding capability: 16 channels of 1080p at 30 frames/second (equivalently, 4 channels of 4K at 30 frames/second); analysis capability: 1080p at 80 frames/second.
The back-end device first accesses 4 channels of 4K 30 frames/second video streams, which exhausts the decoding capability exactly. After decoding, the streams are scaled down to 1080p for analysis, giving 120 frames/second in total, which exceeds the analysis capability, so each channel is analyzed by frame extraction at 20 frames/second.
The back-end device then adds another 4 channels of 4K 30 frames/second video, for 8 channels in total, which exceeds the decoding capability. The back-end device therefore sends a control request so that all 8 front-end devices adjust the inter-frame dependencies of their reference frames to support decoding at 15 frames/second. The decoding overhead is then 8 channels of 4K at 15 frames/second, and after scaling to 1080p the intelligent-analysis frame-extraction strategy can be adjusted to 8 channels of 1080p at 10 frames/second.
When the back-end device then adds one more 4K video, with the algorithm's minimum requirement on a video stream being 1080p at 10 frames/second, 9 channels need at least 90 frames/second of analysis capability. Even if the decoding capability were met, no spare analysis capability remains, so the 9th analysis channel is refused and a failure prompt message is returned.
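The capacity bookkeeping in this scenario checks out numerically if one 4K frame is counted as four 1080p frames, an assumption about how the capabilities are normalized:

```python
# Decoding capability: 16 channels of 1080p@30, i.e. 16 * 30 = 480
# 1080p-equivalent frames/second; one 4K frame costs 4 units.
DECODE_BUDGET = 16 * 30          # 480 1080p-equivalent frames/second
ANALYSIS_BUDGET = 80             # analyzable 1080p frames/second

def decode_load(channels: int, fps: int, is_4k: bool = True) -> int:
    """1080p-equivalent decoding load of the given channels."""
    return channels * fps * (4 if is_4k else 1)

# 4 channels of 4K@30 exactly exhaust the decode budget.
assert decode_load(4, 30) == DECODE_BUDGET
# 8 channels of 4K@30 would need double the budget, so the front-end
# devices are told to support 15 fps decoding: 8 * 15 * 4 = 480.
assert decode_load(8, 15) == DECODE_BUDGET
# After scaling to 1080p, 8 channels at 10 fps fit the analysis budget.
assert 8 * 10 <= ANALYSIS_BUDGET
# A 9th channel needs at least 9 * 10 = 90 fps of analysis capability,
# which exceeds the 80 fps budget, so it is refused.
assert 9 * 10 > ANALYSIS_BUDGET
```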
The above covers only one algorithm and one resolution scenario. Real scenes contain video streams of many resolutions and frame rates, and different algorithms place different minimum requirements on the resolution and frame rate of the video streams. Adjusting decoding overhead by adjusting the inter-frame dependencies of reference frames, combined with a specific adjustment strategy, maximizes the decoding and analysis performance of the channels in the back-end device.
In an embodiment of the present application, another video processing method is provided, which is applied to a front-end device, and fig. 7 is a flowchart of the video processing method of the present embodiment, as shown in fig. 7, where the flowchart includes the following steps:
step S701, receiving a control instruction, and adjusting inter-frame dependency of a reference frame in a video stream according to the control instruction to obtain a first video stream;
step S702, sending a first video stream to the backend device, where the first video stream supports the backend device to decode with at least two sequential video frame sequences.
In this embodiment, at the encoding stage the front-end device is controlled to adjust the inter-frame dependencies of the reference frames in the video stream, producing an encoded first video stream. This first video stream allows the back-end device to decode according to one of the video frame sequences at a reduced frame rate, and also allows decoding according to the original video frame sequence, that is, full-frame-rate decoding.
By controlling the front-end device to adjust the inter-frame dependencies of the reference frames in the video stream, a video stream is obtained that supports decoding with at least two video frame sequences. This satisfies a back-end or third-party device that decodes the stream at full frame rate for preview and/or playback, while also letting the back-end device decode the stream at a lower frame rate so that newly added channels can be supported and a certain number of channels guaranteed. These steps solve the problem that the back-end device cannot support newly added channels while the accessed video streams must still meet preview and/or playback requirements.
In one embodiment, the first video stream includes at least one group of pictures, each containing a plurality of non-key frames. The front-end device removes the inter-frame dependencies among some non-key frames in each group of pictures of the first video stream according to the first control instruction, or adds inter-frame dependencies among some non-key frames in each group of pictures according to the second control instruction.
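One way the front-end device could realize the two instructions is to choose, per frame, which earlier frame it references; the structure below is an assumption for illustration, not the patent's actual encoding scheme:

```python
def reference_targets(gop: list[str], decimable: bool) -> dict[str, str]:
    """For each frame after the key frame, name the frame it references.

    decimable=False models the second control instruction: every frame
    references its immediate predecessor (a full chain).  decimable=True
    models the first control instruction: frames at odd list positions
    (P4, P6, ...) reference two frames back, so the frames they skip
    over (P3, P5, ...) can be dropped by a decoder without breaking
    any reference.
    """
    targets = {}
    for i in range(1, len(gop)):
        if decimable and i >= 3 and i % 2 == 1:
            targets[gop[i]] = gop[i - 2]  # skip over a droppable frame
        else:
            targets[gop[i]] = gop[i - 1]
    return targets
```

With a GOP of I1 through P6 and decimable=True, the chain I1, P2, P4, P6 never references P3 or P5, so the decimated decode order of Fig. 4 remains valid while the full order also still decodes.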
The video processing method executed in the front-end device corresponds to the one executed in the back-end device; for specific examples, reference may be made to the examples described above in the back-end device embodiments and optional implementations, which are not repeated here.
In an embodiment of the present application, there is further provided a video processing system, and fig. 8 is a schematic structural diagram of the video processing system in this embodiment, as shown in fig. 8, the system includes: the video processing system comprises a front-end device and a back-end device, wherein the front-end device is connected with the back-end device, the front-end device is configured to be capable of collecting and encoding to generate a video stream and sending the video stream to the back-end device, and the back-end device is configured to execute the video processing method of any corresponding embodiment.
In one embodiment, the front-end device is further configured to perform the video processing method of any of the above-described respective embodiments.
In one embodiment, the video processing system further comprises: a third party device configured to be able to decode all reference frames in the first video stream.
In one embodiment, the backend device further comprises: the storage module can store undecoded video streams, decoded video streams and analysis results obtained after the video streams are analyzed.
In one embodiment, the front-end device may be a camera such as an IPC (Internet Protocol Camera) or a dome camera, the back-end device is a terminal device or platform with video-analysis capability, and the third-party device is a client/Web platform with video preview and/or playback functions; the client/Web platform can log in to the front-end device directly to preview and/or play back video.
In addition, the third-party device may be connected to a backend device, for example, the client/Web platform is connected to a terminal device or a platform, and the client/Web platform may directly log in to the terminal device or the platform to access the analysis result.
Fig. 9 is a schematic structural diagram of a video processing system in a preferred embodiment of the present application, and as shown in fig. 9, the video processing system includes: the system comprises an IPC (Internet protocol Camera), an analysis platform and a client/Web platform, wherein the analysis platform comprises a storage module, the IPC is connected with the analysis platform, and the IPC and the analysis platform are respectively connected with the client/Web platform.
The IPC captures images and encodes them; the analysis platform pulls the video stream from the IPC for analysis, the storage module stores the video stream and the analysis results, and the analysis platform also forwards the video stream to the client/Web platform; the client/Web platform can log in to the IPC directly for video preview and/or playback, or log in to the analysis platform directly to access the analysis results.
It should be noted that, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and optional implementations, and details are not described again in this embodiment.
In addition, in combination with the video processing methods of the above embodiments, this embodiment may also provide a storage medium on which a computer program is stored; when executed by a processor, the computer program implements any of the video processing methods in the above embodiments.
It should be understood that the specific embodiments described herein are merely illustrative of this application and are not intended to be limiting. All other embodiments, which can be derived by a person skilled in the art from the examples provided herein without any inventive step, shall fall within the scope of protection of the present application.
It is obvious that the drawings are only examples or embodiments of the present application, and it is obvious to those skilled in the art that the present application can be applied to other similar cases according to the drawings without creative efforts. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
The term "embodiment" is used herein to mean that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is to be expressly or implicitly understood by one of ordinary skill in the art that the embodiments described in this application may be combined with other embodiments without conflict.
The above-mentioned embodiments only express several implementation modes of the present application, and the description thereof is specific and detailed, but not construed as limiting the scope of the patent protection. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (13)

1. A video processing method is applied to a back-end device and is characterized by comprising the following steps:
sending a control instruction to front-end equipment, wherein the control instruction is used for instructing the front-end equipment to adjust the inter-frame dependency relationship of a reference frame in a video stream to obtain a coded first video stream;
decoding the first video stream, wherein the first video stream supports decoding by a back-end device in at least two sequential sequences of video frames.
2. The video processing method of claim 1, wherein before sending the control command to the head-end device, the method further comprises:
determining a current decoding capability and a desired decoding capability required to be provided when decoding a video stream of the front-end device;
and comparing the current decoding capability with the expected decoding capability, and generating the control instruction according to the comparison result.
3. The video processing method according to claim 2, wherein the first video stream comprises at least one group of pictures, each group of pictures comprising a plurality of non-key frames, wherein comparing the current decoding capability with the desired decoding capability, and wherein generating the control command according to the comparison comprises:
and generating a first control instruction when the current decoding capability is smaller than the expected decoding capability, wherein the first control instruction is used for instructing the front-end equipment to release the inter-frame dependency relationship among partial non-key frames in each picture group of the first video stream.
4. The video processing method of claim 2, wherein the method further comprises:
and generating a second control instruction when the current decoding capability is not less than the expected decoding capability, wherein the second control instruction is used for instructing the front-end equipment to add the inter-frame dependency relationship between partial non-key frames in each picture group of the first video stream.
5. The video processing method of claim 4, wherein in the case that the current decoding capability is not less than the desired decoding capability, generating a second control instruction comprises:
determining the maximum decoding capacity, and judging whether the maximum decoding capacity is larger than the expected decoding capacity;
and generating the second control instruction when the maximum decoding capacity is judged to be larger than the expected decoding capacity.
6. The video processing method of claim 1, wherein the method further comprises:
determining a current analysis capability and a desired analysis capability required to be provided when analyzing the first video stream;
determining whether the current analysis capability is less than the desired analysis capability;
and refusing to access the front-end equipment under the condition that the current analysis capability is judged to be smaller than the expected analysis capability.
7. The video processing method of claim 1, wherein decoding the first video stream comprises:
and extracting part of reference frames in the first video stream to decode, and/or decoding all the reference frames in the first video stream.
8. A video processing method is applied to a front-end device, and is characterized by comprising the following steps:
receiving a control instruction, and adjusting the inter-frame dependency relationship of reference frames in a video stream according to the control instruction to obtain a first video stream;
and sending the first video stream to a back-end device, wherein the first video stream supports the back-end device to decode with at least two sequential video frame sequences.
9. The video processing method of claim 8, wherein the first video stream comprises at least one group of pictures, each group of pictures comprising a plurality of non-key frames, and wherein adjusting inter-frame dependencies of reference frames in the video stream according to the control instruction comprises:
removing inter-frame dependency relations among partial non-key frames in each picture group of the first video stream according to a first control instruction; or,
and adding inter-frame dependency relationship among partial non-key frames in each picture group of the first video stream according to a second control instruction.
10. A video processing system comprising a front-end device and a back-end device, the front-end device being connected to the back-end device, the front-end device being configured to be able to capture and encode a video stream and to transmit the video stream to the back-end device, the back-end device being configured to perform the video processing method of any one of claims 1 to 7.
11. The video processing system of claim 10, wherein the front-end device is further configured to perform the video processing method of claim 8 or 9.
12. The video processing system of claim 10, wherein the video processing system further comprises: a third party device configured to be able to decode all reference frames in the first video stream.
13. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the video processing method of any one of claims 1 to 9.
CN202210514646.6A 2022-05-12 2022-05-12 Video processing method, video processing system, and storage medium Active CN115278305B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210514646.6A CN115278305B (en) 2022-05-12 2022-05-12 Video processing method, video processing system, and storage medium


Publications (2)

Publication Number Publication Date
CN115278305A true CN115278305A (en) 2022-11-01
CN115278305B CN115278305B (en) 2024-05-07

Family

ID=83758687

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210514646.6A Active CN115278305B (en) 2022-05-12 2022-05-12 Video processing method, video processing system, and storage medium

Country Status (1)

Country Link
CN (1) CN115278305B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102263942A (en) * 2010-05-31 2011-11-30 苏州闻道网络科技有限公司 Scalable video transcoding device and method
US20120169883A1 (en) * 2010-12-31 2012-07-05 Avermedia Information, Inc. Multi-stream video system, video monitoring device and multi-stream video transmission method
US20150010286A1 (en) * 2003-04-25 2015-01-08 Gopro, Inc. Encoding and decoding selectively retrievable representations of video content
CN105023241A (en) * 2015-07-29 2015-11-04 华南理工大学 Fast image interpolation method for mobile terminal
CN108063973A (en) * 2017-12-14 2018-05-22 浙江大华技术股份有限公司 A kind of method for decoding video stream and equipment
US20190228804A1 (en) * 2018-01-05 2019-07-25 Shanghai Xiaoyi Technology Co., Ltd. Device, method, storage medium, and terminal for controlling video stream data playing
CN111988561A (en) * 2020-07-13 2020-11-24 浙江大华技术股份有限公司 Adaptive adjustment method and device for video analysis, computer equipment and medium
WO2021007702A1 (en) * 2019-07-12 2021-01-21 Huawei Technologies Co., Ltd. Video encoding method, video decoding method, video encoding device, and video decoding device
CN113473147A (en) * 2021-05-17 2021-10-01 浙江大华技术股份有限公司 Post-processing method and device of video code stream and computer readable storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WU Qida; HE Xiaohai; LIN Hongwei; TAO Qingchuan; WU Di: "A Novel Video Compression Coding Algorithm Combining Frame Rate Conversion with the HEVC Standard", Acta Automatica Sinica, no. 09, 12 December 2017 (2017-12-12) *

Also Published As

Publication number Publication date
CN115278305B (en) 2024-05-07

Similar Documents

Publication Publication Date Title
US8351513B2 (en) Intelligent video signal encoding utilizing regions of interest information
WO2021244341A1 (en) Picture coding method and apparatus, electronic device and computer readable storage medium
EP1177691B1 (en) Method and apparatus for generating compact transcoding hints metadata
US20080144949A1 (en) Portable Terminal
MX2007000810A (en) Method and apparatus for encoder assisted-frame rate up conversion (ea-fruc) for video compression.
JP2004502356A (en) Multicast transmission system including bandwidth scaler
WO2006082690A1 (en) Image encoding method and image encoding device
KR20160007564A (en) Tuning video compression for high frame rate and variable frame rate capture
TWI280060B (en) Video encoder and method for detecting and encoding noise
JP4764136B2 (en) Moving picture coding apparatus and fade scene detection apparatus
US7957604B2 (en) Moving image coding apparatus, moving image decoding apparatus, control method therefor, and computer-readable storage medium
US11778210B2 (en) Load balancing method for video decoding in a system providing hardware and software decoding resources
JP2004015501A (en) Apparatus and method for encoding moving picture
JPH11234644A (en) Multi-point conference system
Makar et al. Real-time video streaming with interactive region-of-interest
CN115278305A (en) Video processing method, video processing system, and storage medium
WO2023061129A1 (en) Video encoding method and apparatus, device, and storage medium
CN102484717A (en) Video encoding device, video encoding method and video encoding program
Kobayashi et al. A Low-Latency 4K HEVC Multi-Channel Encoding System with Content-Aware Bitrate Control for Live Streaming
EP2884742B1 (en) Process for increasing the resolution and the visual quality of video streams exchanged between users of a video conference service
JP2005341093A (en) Contents adaptating apparatus, contents adaptation system, and contents adaptation method
JP2009081622A (en) Moving image compression encoder
JP3202270B2 (en) Video encoding device
CN115225911B (en) Code rate self-adaption method and device, computer equipment and storage medium
Wang et al. Attention information based spatial adaptation framework for browsing videos via mobile devices

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant