CN116260990B - AI asynchronous detection and real-time rendering method and system for multipath video streams - Google Patents

AI asynchronous detection and real-time rendering method and system for multipath video streams

Info

Publication number
CN116260990B
CN116260990B (application CN202310549343.2A)
Authority
CN
China
Prior art keywords
detection
frame
video
module
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310549343.2A
Other languages
Chinese (zh)
Other versions
CN116260990A (en)
Inventor
宋艳枝
金晨曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Gauss Intelligent Technology Co ltd
Original Assignee
Hefei Gauss Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Gauss Intelligent Technology Co ltd filed Critical Hefei Gauss Intelligent Technology Co ltd
Priority to CN202310549343.2A priority Critical patent/CN116260990B/en
Publication of CN116260990A publication Critical patent/CN116260990A/en
Application granted granted Critical
Publication of CN116260990B publication Critical patent/CN116260990B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/436Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/439Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using cascaded computational arrangements for performing a single operation, e.g. filtering
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/513Processing of motion vectors
    • H04N19/517Processing of motion vectors by encoding
    • H04N19/52Processing of motion vectors by encoding by predictive encoding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23412Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs for generating or manipulating the scene composition of objects, e.g. MPEG-4 objects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234381Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the temporal resolution, e.g. decreasing the frame rate by frame skipping
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the technical field of multipath video stream processing and solves the technical problems of low rendering performance, significant resource occupation and delay in serial synchronous processing in the prior art. It provides an AI asynchronous detection and real-time rendering method and system for multipath video streams, the method comprising the following steps: S1, multiple camera streams are split into frames by a video decoding module to obtain each frame image, and the latest frame image is recorded as the Nth frame image; S2, the AI detection engine module determines the frame-extraction interval and the corresponding detection algorithm model according to its own performance evaluation, and at fixed time intervals sends the Nth frame image obtained from S1 into the detection algorithm model for the relevant detection and identification. The invention adopts a parallel asynchronous rendering scheme that greatly reduces the rendering delay of real-time video streams; when processing video data of the same order of magnitude and loading the same algorithm model, it has a lower resource occupancy rate and greatly improves resource utilization efficiency.

Description

AI asynchronous detection and real-time rendering method and system for multipath video streams
Technical Field
The invention relates to the technical field of multipath video stream processing, in particular to an AI asynchronous detection and real-time rendering method and system for multipath video streams.
Background
Most existing methods for AI detection and rendering of real-time video streams work as follows: the video stream is obtained from a camera and decoded and frame-extracted through OpenCV/FFmpeg; the obtained video frames are converted into images in a corresponding format (such as RGB); each frame image is sent into an AI model for detection, classification and identification of the relevant targets; the detection results output by the AI model are drawn on each frame image through OpenCV; and finally each frame image is video-encoded through FFmpeg to output a video stream in H264/H265 format.
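For illustration, the following Python sketch shows this conventional serial pipeline with OpenCV; the stream URL, output parameters and the detector are hypothetical stand-ins, not the code of any particular prior-art system:

```python
import cv2  # OpenCV, as used in the conventional pipeline described above

class StubDetector:                      # stand-in for a real AI model
    def detect(self, frame):
        return [(100, 100, 300, 300)]    # fixed dummy detection box

model = StubDetector()
cap = cv2.VideoCapture("rtsp://camera/stream")          # hypothetical camera URL
writer = cv2.VideoWriter("out.mp4",
                         cv2.VideoWriter_fourcc(*"avc1"),
                         25.0, (1280, 720))              # assumed 25 FPS, 720p

while True:
    ok, frame = cap.read()              # video decoding and frame extraction
    if not ok:
        break
    boxes = model.detect(frame)         # the whole loop blocks here, so the
                                        # output FPS is capped by inference speed
    for (x1, y1, x2, y2) in boxes:
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)  # draw results
    writer.write(frame)                 # encode each rendered frame

cap.release()
writer.release()
```

Every stage in this loop waits for the previous one, which is exactly the serial-synchronous bottleneck criticized below.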
However, the above AI detection and rendering have certain drawbacks, mainly:
1. each step of the scheme runs serially and synchronously, so the output FPS depends on the inference speed of the AI model and rendering performance is low;
2. one AI model can only serve one video stream, so multipath video rendering capability is lacking; if multiple video streams need to be rendered, multiple AI models must be loaded and resource occupation increases significantly;
3. the real-time video frames acquired from the camera spend a lot of time in steps such as image transcoding, AI detection and video encoding, producing a large delay, so that the real-time picture seen by the rendering terminal deviates greatly in time from the picture at the current moment in the real scene.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an AI asynchronous detection and real-time rendering method and system for multipath video streams, which solve the technical problems of low rendering performance, significant resource occupation and delay in serial synchronous processing in the prior art.
In order to solve the technical problems, the invention provides the following technical scheme: an AI asynchronous detection and real-time rendering method of multipath video stream comprises the following steps:
S1, multiple camera streams are split into frames by a video decoding module to obtain each frame image, and the latest frame image is recorded as the Nth frame image;
S2, the AI detection engine module determines the frame-extraction interval and the corresponding detection algorithm model according to its own performance evaluation, sends the Nth frame image obtained from S1 into the detection algorithm model for the relevant detection and identification at fixed time intervals, and pushes the detection and identification result into the corresponding result queue to obtain the detection result of the Nth frame image;
S3, the inter-frame prediction module predicts the position and state of the detection frame in the Nth frame image by a Kalman filtering algorithm, based on the Nth frame image and the (N-1)th frame image whose detection result has already been output, according to the historical position and displacement speed of the detection frame, so as to obtain the Nth frame image data;
S4, the image drawing and rendering module draws the final video frame according to the information of the Nth frame image data and the detection result of the Nth frame image;
S5, the final video frame is encoded and pushed through the video encoding module and the real-time streaming module, so that the user terminal can see the real-time picture after AI rendering.
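To make the asynchronous structure of steps S1-S5 concrete, the following toy skeleton (a sketch, not the patent's implementation; the frame payloads and detection results are dummy stand-ins) decouples decoding and detection into independent threads that communicate through shared state and a result queue, so the render path never waits for the model:

```python
import queue
import threading
import time

latest = {}               # stream_id -> (frame_no, image): newest decoded frame
results = queue.Queue()   # (stream_id, frame_no, boxes) produced by the detector

def decode_loop(stream_id):
    """S1: split the stream into frames; only the latest frame is kept."""
    n = 0
    while True:
        n += 1
        latest[stream_id] = (n, f"frame-{n}")   # stand-in for a decoded image
        time.sleep(1 / 25)                      # 25 FPS source

def detect_loop(stream_id):
    """S2: sample the latest frame at the engine's own frame-extraction interval."""
    while True:
        n, image = latest.get(stream_id, (0, None))
        if image is not None:
            results.put((stream_id, n, [(10, 10, 50, 50)]))  # dummy detection
        time.sleep(0.2)                         # detector-paced, slower than 25 FPS

threading.Thread(target=decode_loop, args=(0,), daemon=True).start()
threading.Thread(target=detect_loop, args=(0,), daemon=True).start()

# S3/S4 (sketched): the renderer reads the newest frame every tick, folds in any
# finished detections (or a Kalman prediction of them) and never blocks on the model.
for _ in range(5):
    time.sleep(1 / 25)
    frame = latest.get(0)
    while not results.empty():
        _, n_det, boxes = results.get()         # consume completed detections
    # draw the (predicted or detected) boxes on `frame`, then encode and push
```

The key design choice is that the decoder overwrites a single "latest frame" slot instead of queueing every frame, so the detector always works on the freshest image and stale frames are never processed.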
Further, in step S2, the AI detection engine module determines the frame-extraction interval and the corresponding detection algorithm model according to its own performance evaluation; the specific process includes the following steps:
S21, the AI detection engine module automatically reads the output frame rate of the multipath video streams, recording the output frame rate as r and the number of video streams as N;
S22, based on the loaded detection algorithm model, the AI detection engine module judges whether the processing frame rate FPS of the current detection algorithm model for the current input streams is enough to cope with the N video streams, i.e. judges whether the processing frame rate FPS of the current detection algorithm model is greater than or equal to N×r;
if yes, it judges whether the processing frame rate FPS of the current detection algorithm model is greater than or equal to 2×N×r;
if the processing frame rate FPS is greater than or equal to 2×N×r, half of the currently loaded detection algorithm models are unloaded and the related threads are stopped, reducing the resource occupancy rate of the AI detection engine module;
if the processing frame rate FPS is less than 2×N×r, the current detection algorithm model is retained and the process ends;
if not (the processing frame rate FPS is less than N×r), step S23 is entered;
S23, judging whether the remaining resources of the AI detection engine module can load a new detection algorithm model;
if yes, a new thread is started and a new detection algorithm model is loaded to jointly process the N video streams;
if not, the output frame rate r of each video stream is reduced, and the frame-extraction interval of the current detection algorithm model for each video stream is adjusted until the input of the N video streams can be handled.
Further, in step S3, the position and state of the detection frame in the Nth frame image are predicted; the specific process includes the following steps:
S31, initializing the state vector X and the state covariance matrix P;
S32, defining the state transition matrix F, the observation matrix H, the observation noise covariance matrix R and the process noise covariance matrix Q;
S33, predicting the position of the target detection frame: X' = F·X;
S34, predicting the state covariance matrix: P' = F·P·F^T + Q;
S35, the state update obtains the corrected state vector X and state covariance matrix P.
Further, in step S31,
the state vector X represents the state of the target, including position and velocity information; at the beginning, its initial value is set according to the detection result of the first frame;
the expression of the state vector X is:
X = [x1, y1, x2, y2, vx1, vy1, vx2, vy2]^T
the state covariance matrix P represents the uncertainty of the state estimate and may initially be defined as the identity matrix;
the expression of the state covariance matrix P is:
P = I (the 8×8 identity matrix)
In the above, (x1, y1) are the coordinates of the upper-left corner of the detection frame, (x2, y2) are the coordinates of the lower-right corner of the detection frame, and vx1, vy1, vx2, vy2 are the velocities of the upper-left and lower-right corner coordinates, respectively.
Further, in step S35, the state update obtains the corrected state vector X and state covariance matrix P; the specific process includes the following steps:
S351, a new observed value Z is obtained from the target detection algorithm: the upper-left corner coordinates (x1, y1) and lower-right corner coordinates (x2, y2) of the target detection frame in the new frame;
S352, the observation residual Y is calculated according to the predicted state vector X' and the observation matrix H; the observation residual Y represents the difference between the predicted value and the actual observed value, and the predicted state vector X' is the position predicted in step S33;
S353, the Kalman gain K is calculated according to the predicted state covariance matrix P', the observation matrix H and the observation noise covariance matrix R; the Kalman gain K is a weight matrix used to balance the uncertainty of the predicted value and the observed value;
S354, the predicted state vector X' is corrected using the Kalman gain K and the observation residual Y, obtaining the updated state vector X;
S355, the predicted state covariance matrix P' is corrected using the Kalman gain K and the observation matrix H, obtaining the updated state covariance matrix P;
S36, repeating steps S33-S35.
The technical scheme also provides a system for realizing the AI asynchronous detection and real-time rendering method, which comprises the following steps:
the video decoding module decodes the RTSP/RTMP stream by using a video processing tool FFmpeg to obtain frames of images;
the AI detection engine module, which dynamically evaluates the load of the current AI detection engine module and dynamically loads different detection algorithm models according to the number of accessed video streams, the frame rate of the original video streams and the category of the configured detection algorithm model;
the inter-frame prediction module, which predicts the position and state of the Nth frame image according to the detection result of the (N-1)th frame image based on a Kalman filtering algorithm to obtain a prediction result, and corrects the prediction result according to the detection result of the Nth frame image to generate the Nth frame image data;
the image drawing and rendering module, which uses an FFmpeg filter module to render the coordinate information detected by the detection algorithm model onto the video frame in the form of a rectangular frame or mask area;
the video coding module is used for coding the final video frame;
and the real-time streaming module is used for streaming the final video frame.
Further, the detection algorithm models are packaged into a dynamic library, and the AI detection engine module uses dlopen/dlclose to dynamically load/unload the relevant detection algorithm models at runtime; the detection algorithm models include a target detection model and an instance segmentation model.
Further, the target detection model is used to detect a certain target in real time and includes the YOLO series and Faster RCNN; the instance segmentation model is used to segment the boundary of a certain target object and includes Yolact and Mask RCNN.
Further, the AI detection engine module includes:
image coding and decoding: for converting the video frames of H264/H265 into RGB images;
resource monitor: the system is used for monitoring the utilization rate of the system disk, the memory, the CPU and the GPU resources;
algorithm scheduler: the detection algorithm model is used for dynamically loading or unloading related detection algorithm models according to system resource occupation, system performance and corresponding scenes;
reasoning engine: the detection algorithm model is used for running and outputting;
the algorithm module: used for image preprocessing, AI detection and post-processing.
By means of the above technical scheme, the invention provides an AI asynchronous detection and real-time rendering method and system for multipath video streams, which have at least the following beneficial effects:
1. the invention adopts a parallel asynchronous rendering scheme that greatly reduces the rendering delay of real-time video streams; when processing video data of the same order of magnitude and loading the same algorithm model, it has a lower resource occupancy rate and greatly improves resource utilization efficiency.
2. the invention disassembles the traditional serial processing flow into multiple modules such as image encoding and decoding, AI detection engine, inter-frame prediction, and image drawing and rendering, making full use of the computing resources and parallel processing capability of each module; on the premise that the output FPS remains unchanged, video delay is greatly reduced and system resource occupation is lowered.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a flow chart of the AI asynchronous detection and real-time rendering method of the present invention;
FIG. 2 is a network structure diagram of a conventional object detection model of the present invention;
FIG. 3 is a network block diagram of a common example segmentation model of the present invention;
FIG. 4 is a schematic diagram of a Kalman filtering algorithm of the present invention;
FIG. 5 is a diagram showing an example of predicting the position and state of an N-th frame image detection frame by using a Kalman filtering algorithm according to the present invention;
FIG. 6 is a block diagram of an AI detection engine module of the invention;
fig. 7 is a block diagram of the AI asynchronous detection and real-time rendering system of the present invention.
In the figure: 10. video decoding module; 20. AI detection engine module; 30. inter-frame prediction module; 40. image drawing and rendering module; 50. video encoding module; 60. real-time streaming module.
Detailed Description
In order that the above objects, features and advantages of the present invention may become more readily apparent, a more particular description of the invention is given below with reference to the accompanying drawings and the detailed description, so that how the technical means are applied to solve the technical problems and achieve the technical effects can be fully understood and implemented.
Those of ordinary skill in the art will appreciate that all or a portion of the steps in implementing an embodiment method may be performed by a program to instruct related hardware and, thus, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Referring to fig. 1-7, a specific implementation of this embodiment is shown. This embodiment adopts a parallel asynchronous rendering scheme that greatly reduces the rendering delay of real-time video streams; when processing video data of the same order of magnitude and loading the same algorithm model, its resource occupancy rate is lower and resource utilization efficiency is greatly improved.
Referring to fig. 1, the present embodiment provides an AI asynchronous detection and real-time rendering method for multi-path video streams, which includes the following steps:
s1, the multiple cameras are subjected to frame disassembly through a video decoding module to obtain each frame image, the latest frame image is recorded as an N frame image, and it is noted that videos shot by the multiple cameras are multiple video streams in the embodiment, and the video decoding module is used for carrying out frame disassembly on the multiple video streams to obtain each frame image.
S2, the AI detection engine module determines the frame-extraction interval and the corresponding detection algorithm model according to its own performance evaluation, sends the Nth frame image obtained from S1 into the detection algorithm model for the relevant detection and identification at fixed time intervals, and pushes the detection and identification result into the corresponding result queue to obtain the detection result of the Nth frame image.
In step S2, the number and types of the detection algorithm models are not limited; they merely serve as the various related models for realizing target detection and identification in this embodiment. All detection algorithm models are encapsulated into a dynamic library and integrated in the AI detection engine module. The AI detection engine module can load different detection algorithm models according to the number of accessed video streams, the frame rate of the original video streams and the load (CPU utilization, memory utilization, GPU utilization, etc.). The detection algorithm models include a target detection model and an instance segmentation model: the target detection model is used to detect a certain target in real time and includes the YOLO series and Faster RCNN; the instance segmentation model is used to segment the boundary of a certain target object and includes Yolact and Mask RCNN.
In order to fully disclose the object detection model in the prior art, please refer to fig. 2 and fig. 3, which are respectively a network structure diagram of a common object detection model and a network structure diagram of an example segmentation model.
In this embodiment, in order to clearly and completely describe the implementation of step S2, how the AI detection engine module determines the frame-extraction interval and the corresponding detection algorithm model according to its own performance evaluation is described; the specific process includes the following steps:
S21, the AI detection engine module automatically reads the output frame rate of the multipath video streams, recording the output frame rate as r and the number of video streams as N;
S22, based on the loaded detection algorithm model, the AI detection engine module judges whether the processing frame rate FPS of the current detection algorithm model for the current input streams is enough to cope with the N video streams, i.e. judges whether the processing frame rate FPS of the current detection algorithm model is greater than or equal to N×r;
if yes, it judges whether the processing frame rate FPS of the current detection algorithm model is greater than or equal to 2×N×r;
if the processing frame rate FPS is greater than or equal to 2×N×r, half of the currently loaded detection algorithm models are unloaded and the related threads are stopped, reducing the resource occupancy rate of the AI detection engine module;
if the processing frame rate FPS is less than 2×N×r, the current detection algorithm model is retained and the process ends;
if not (the processing frame rate FPS is less than N×r), step S23 is entered;
S23, judging whether the remaining resources of the AI detection engine module can load a new detection algorithm model;
if yes, a new thread is started and a new detection algorithm model is loaded to jointly process the N video streams;
if not, the output frame rate r of each video stream is reduced, and the frame-extraction interval of the current detection algorithm model for each video stream is adjusted until the input of the N video streams can be handled.
Specifically, if the processing frame rate FPS of the current detection algorithm model is insufficient to support the N video streams, it is judged whether the resources left to the AI detection engine module can load a new detection algorithm model; the remaining resources refer to CPU utilization, memory and video memory. If a new detection algorithm model can be loaded, a new thread is started and the new detection algorithm model is loaded to jointly process the N video streams. If the AI detection engine module is at its performance bottleneck and cannot load a new detection algorithm model, the output frame rate r of each video stream is reduced, and the frame-extraction interval of the current detection algorithm model for each video stream is adjusted until the input of the N video streams can be handled.
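The following sketch condenses the S21-S23 evaluation into code; `DetectionEnginePool`, its per-model FPS figure and its model limit are illustrative assumptions standing in for the patent's resource monitor, not its actual implementation:

```python
class DetectionEnginePool:
    """Hypothetical model pool; fps_per_model and max_models stand in for the
    measured inference speed and the CPU/memory/GPU headroom checks."""

    def __init__(self, fps_per_model: float, max_models: int):
        self.fps_per_model = fps_per_model
        self.max_models = max_models
        self.loaded = 1                  # one detection model loaded initially

    def total_fps(self) -> float:
        return self.loaded * self.fps_per_model

    def rebalance(self, n_streams: int, frame_rate: float) -> float:
        """S21-S23: returns the (possibly reduced) per-stream frame rate r."""
        required = n_streams * frame_rate                  # N * r
        if self.total_fps() >= required:                   # S22: enough capacity
            if self.total_fps() >= 2 * required:           # over-provisioned
                self.loaded = max(1, self.loaded // 2)     # unload half, stop threads
            return frame_rate                              # keep current models
        if self.loaded < self.max_models:                  # S23: headroom remains
            self.loaded += 1                               # new thread + new model
            return self.rebalance(n_streams, frame_rate)
        # bottleneck: lower r (i.e. widen the frame-extraction interval) to fit
        return self.total_fps() / n_streams

pool = DetectionEnginePool(fps_per_model=60.0, max_models=2)
print(pool.rebalance(n_streams=8, frame_rate=25.0))  # 8 streams @ 25 FPS -> 15.0
```

In the example run, one model (60 FPS) cannot cover 8×25 = 200 frames per second, a second model is loaded, and since 120 FPS is still insufficient the per-stream rate is reduced to 15 FPS, mirroring the fallback described above.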
S3, the inter-frame prediction module predicts the position and state of the detection frame in the Nth frame image by a Kalman filtering algorithm combined with historical detection data, based on the Nth frame image and the (N-1)th frame image whose detection result has already been output, according to the historical position and displacement speed of the detection frame, so as to obtain the Nth frame image data.
referring to fig. 4, a schematic diagram of a kalman filter algorithm is shown, and for the purpose of describing step S3, the kalman filter algorithm is a mathematical algorithm for a dynamic system for estimating state variables. The kalman filtering algorithm can be used for predicting the position and the speed of an object in the next frame image, and the basic idea is to consider the position and the speed of the object as state variables and predict the position and the state of the object in the next frame image by using the position of a detection frame in the previous frame image as prior information.
Referring to fig. 5, which shows an example of predicting the position and state of the detection frame in the Nth frame image by the Kalman filtering algorithm; in step S3, the specific process includes the following steps:
s31, initializing a state vector X and a state covariance matrix P;
the state vector X represents the state of the target, including position and velocity information; at the beginning, its initial value is set according to the detection result of the first frame;
the expression of the state vector X is:
X = [x1, y1, x2, y2, vx1, vy1, vx2, vy2]^T
the state covariance matrix P represents the uncertainty of the state estimate and may initially be defined as the identity matrix;
the expression of the state covariance matrix P is:
P = I (the 8×8 identity matrix)
In the above, (x1, y1) are the coordinates of the upper-left corner of the detection frame, (x2, y2) are the coordinates of the lower-right corner of the detection frame, and vx1, vy1, vx2, vy2 are the velocities of the upper-left and lower-right corner coordinates, respectively.
S32, defining a state transition matrix F, an observation matrix H, an observation noise covariance matrix R and a process noise covariance matrix Q;
State transition matrix F: describes how the state variables change within a time interval Δt; in this embodiment, the position coordinates change with the velocities, e.g. x1(t+Δt) = x1(t) + vx1·Δt, while the velocities remain constant.
Observation matrix H: maps the state vector to the observation space; the observations only include position coordinates, so the observation matrix H retains only the position information in the state vector.
Observation noise covariance matrix R: describes the uncertainty of the observation noise. In this embodiment the observation noise at each coordinate is assumed to be independent, so R is a diagonal matrix.
Process noise covariance matrix Q: describes the uncertainty of the process noise:
Q = σ²·G·G^T
In the above formula, σ is the standard deviation (0.01 in this embodiment) and G^T denotes the transpose of G; G is a matrix similar to the state transition matrix F but containing only the velocity terms, and the process noise covariance matrix Q is obtained by computing the G matrix, which in this embodiment is defined in terms of the time interval Δt.
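For concreteness, the matrices above can be assembled as follows; this is a sketch assuming the 8-dimensional corner-plus-velocity state, a frame interval Δt of 1/25 s and a standard constant-velocity form for G (the patent's exact G is not reproduced here):

```python
import numpy as np

dt = 1.0 / 25.0        # assumed time interval between frames (25 FPS source)
sigma = 0.01           # process-noise standard deviation, per the embodiment

# State: [x1, y1, x2, y2, vx1, vy1, vx2, vy2] -- box corners and their velocities
F = np.eye(8)
F[:4, 4:] = dt * np.eye(4)            # positions advance by velocity * dt

H = np.zeros((4, 8))
H[:, :4] = np.eye(4)                  # observations contain only the positions

R = np.eye(4)                         # independent per-coordinate observation noise

G = np.zeros((8, 4))                  # velocity-only noise input, as described above
G[:4, :] = 0.5 * dt ** 2 * np.eye(4)  # noise integrates into the positions...
G[4:, :] = dt * np.eye(4)             # ...and perturbs the velocities
Q = sigma ** 2 * (G @ G.T)            # Q = sigma^2 * G * G^T

X = np.zeros((8, 1))                  # initialized from the first frame's detection
P = np.eye(8)                         # identity initial state covariance
```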
S33, predicting the position of the target detection frame:
X' = F·X
In this embodiment, the state vector X contains position and velocity information, expressed as X = [x1, y1, x2, y2, vx1, vy1, vx2, vy2]^T at time t. The state transition matrix F already takes the influence of velocity on position into account, so the state X' calculated in this way contains both predicted position and velocity information. To obtain the predicted coordinates, only the position components of X' are taken: the predicted upper-left corner coordinates are (x1', y1') and the predicted lower-right corner coordinates are (x2', y2').
S34, predicting the state covariance matrix:
P' = F·P·F^T + Q
In the above formula, F is the state transition matrix of the system, which describes the evolution of the system state from the previous moment to the current moment; P is the state covariance matrix; Q is the process noise covariance matrix, which describes the noise of the control input and the noise during state evolution; F^T denotes the transpose of F. With this formula the predicted covariance matrix P' is obtained for the subsequent state update step.
S35, the state update obtains a corrected state vector X and a state covariance matrix P.
In step S35, the state update obtains the corrected state vector X and the state covariance matrix P, and the specific process includes the following steps:
S351, a new observed value Z is obtained from the target detection algorithm: the upper-left corner coordinates (x1, y1) and lower-right corner coordinates (x2, y2) of the target detection frame in the new frame;
the new observed value Z is given by:
Z = [x1, y1, x2, y2]^T
In the above formula, ^T denotes the matrix transpose.
S352, the observation residual Y is calculated according to the predicted state vector X' and the observation matrix H:
Y = Z − H·X'
The observation residual Y represents the difference between the predicted value and the actual observed value; the predicted state vector X' is the position of the target detection frame predicted in step S33, and H·X' is the predicted position of the target detection frame in the observation space.
S353, the Kalman gain K is calculated according to the predicted state covariance matrix P', the observation matrix H and the observation noise covariance matrix R:
K = P'·H^T·(H·P'·H^T + R)^(-1)
In the above formula, H^T denotes the transpose of the observation matrix. The Kalman gain K is a weight matrix used to balance the uncertainty of the predicted value and the observed value.
S354, the predicted state vector X' is corrected using the Kalman gain K and the observation residual Y, obtaining the updated state vector X; the correction formula is:
X = X' + K·Y
S355, the predicted state covariance matrix P' is corrected using the Kalman gain K and the observation matrix H, obtaining the updated state covariance matrix P; the correction formula is:
P = (I − K·H)·P'
In the above formula, I denotes the identity matrix, i.e. a square matrix in which all the elements on the diagonal are 1 and the remaining elements are 0; in this embodiment, I is an 8×8 identity matrix. In the state covariance matrix update formula, (I − K·H) combines the Kalman gain K and the observation matrix H and is multiplied by the predicted state covariance matrix P' to obtain the updated state covariance matrix P. The meaning of the formula is to correct the predicted state covariance matrix P' with the error weighting (the Kalman gain K) computed by the Kalman filtering algorithm, so as to obtain a more accurate state estimate.
S36, repeating the steps S33-S35.
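Putting steps S33-S35 together, a single predict/correct cycle looks like the sketch below, reusing the matrices F, H, Q, R defined in S32; Z is the detector's [x1, y1, x2, y2]^T box for the new frame (a minimal NumPy illustration, not the patent's code):

```python
import numpy as np

def kalman_step(X, P, Z, F, H, Q, R):
    """One S33-S35 cycle: predict the box, then correct it with a new detection."""
    # S33/S34: prediction
    X_pred = F @ X                        # X' = F X  (predicted state)
    P_pred = F @ P @ F.T + Q              # P' = F P F^T + Q
    # S35: update with the new observation
    Y = Z - H @ X_pred                    # observation residual
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    X_new = X_pred + K @ Y                # corrected state vector
    P_new = (np.eye(X.shape[0]) - K @ H) @ P_pred   # corrected covariance
    return X_new, P_new

# Between detections, only the prediction half runs: the first four entries of
# F @ X give the box drawn on frames whose AI result has not yet arrived.
```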
Because the AI detection engine module needs a certain amount of time for detection and identification, its output result lags behind. The inter-frame prediction module is therefore required to predict the position of the detection frame in the Nth frame image, based on the Nth frame image and the (N-1)th frame image whose detection result has already been output, combined with the historical detection data, according to the historical position and displacement speed of the detection frame, so as to obtain the Nth frame image data.
And S4, the image drawing and rendering module draws a final video frame by utilizing the information of the Nth frame image data and the detection result of the Nth frame image.
The image drawing and rendering module in this embodiment abandons the OpenCV image drawing scheme adopted in most schemes and directly uses the FFmpeg filter module to draw the detection result on the video frame. The advantage of this is that the time consumed by conversion between FFmpeg frames and the OpenCV image format is avoided, the encoding and decoding time is saved, and efficiency is improved.
More specifically, a filter is created using FFmpeg's AVFilterGraph and is used to render the coordinate information detected by the detection algorithm model onto the video frame in the form of a rectangular frame or mask. For example, when a rectangular frame needs to be drawn, the avfilter_init_str function is used to set the upper-left corner coordinates (x and y), the width and height (width and height) and the color (color) of the rectangular frame, and the AI detection engine module can dynamically adjust the parameters of the rectangular frame according to the detection result of each frame, thereby achieving dynamic drawing and rendering.
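As an illustration of the same drawing operation, the sketch below drives FFmpeg's drawbox filter from Python via the command line; in the patent the filter graph is built in-process with AVFilterGraph, so the stream URLs here are hypothetical and the CLI form only mirrors the filter parameters (x, y, w, h, color):

```python
import subprocess

# Box parameters taken from a detection/prediction result (illustrative values)
x, y, w, h = 120, 80, 200, 160
vf = f"drawbox=x={x}:y={y}:w={w}:h={h}:color=red@0.8:t=3"  # drawbox filter spec

subprocess.run([
    "ffmpeg", "-y",
    "-i", "rtsp://camera/stream",             # hypothetical input stream
    "-vf", vf,                                # draw the rectangle on every frame
    "-c:v", "libx264",
    "-f", "flv", "rtmp://server/live/out",    # hypothetical push-stream target
], check=True)
```

Rendering inside the filter graph keeps the frame in FFmpeg's native pixel format end to end, which is exactly the conversion cost the embodiment avoids.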
S5, the final video frame is encoded and pushed through the video encoding module and the real-time streaming module, so that the user terminal can see the real-time picture after AI rendering.
For the scene of multipath-video combined AI detection and rendering, this embodiment innovatively proposes an asynchronous parallel optimization scheme: the traditional serial processing flow is disassembled into multiple modules such as image encoding and decoding, AI detection engine, inter-frame prediction, and image drawing and rendering; the computing resources and parallel processing capability of each module are fully utilized, and on the premise that the output FPS remains unchanged, video delay is greatly reduced and system resource occupation is lowered.
The system for the AI asynchronous detection and real-time rendering method of this embodiment corresponds to the method of the foregoing embodiment. Since the two correspond, the implementation described for the AI asynchronous detection and real-time rendering method also applies to the system of this embodiment and is not described in detail again here.
Referring to fig. 7, which shows a block diagram of the AI asynchronous detection and real-time rendering system of this embodiment: the system is composed of the video decoding module 10, the AI detection engine module 20, the inter-frame prediction module 30, the image drawing and rendering module 40, the video encoding module 50 and the real-time streaming module 60; the communication connections between the modules, on the premise that each realizes its own function, are shown in fig. 7.
The video decoding module 10 decodes the RTSP/RTMP stream using the video processing tool FFmpeg to obtain frames of images.
The AI detection engine module 20 automatically and dynamically loads different detection algorithm models according to the number of accessed video streams, the frame rate of the original video streams and the category of the configured detection algorithm model, based on a dynamic evaluation of the load (CPU utilization, memory utilization, GPU utilization, etc.) of the current AI detection engine module 20.
All detection algorithm models are packaged into a dynamic library, and the AI detection engine module 20 can use dlopen/dlclose to dynamically load/unload the relevant detection algorithm models at runtime. The detection algorithm models include a target detection model and an instance segmentation model: the target detection model is used to detect a certain target in real time and includes the YOLO series and Faster RCNN; the instance segmentation model is used to segment the boundary of a certain target object and includes Yolact and Mask RCNN.
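For illustration, the ctypes sketch below mimics this runtime plug-in loading from Python (ctypes wraps dlopen on Linux); the library name and the detect() signature are hypothetical assumptions, and a C/C++ engine would pair dlopen with an explicit dlclose when unloading:

```python
import ctypes

# Load a detection-model plugin at runtime (dlopen under the hood on Linux).
lib = ctypes.CDLL("./libyolo_detector.so")       # hypothetical plugin library

# Declare the assumed exported function: int detect(const char *data, int len)
lib.detect.argtypes = [ctypes.c_char_p, ctypes.c_int]
lib.detect.restype = ctypes.c_int

rc = lib.detect(b"frame-bytes", 11)              # run one inference (illustrative)

# "Unloading": drop the handle; a C/C++ engine would call dlclose(handle) here
# to stop the related threads and release the model's memory.
del lib
```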
Referring to fig. 6, which shows a block diagram of the AI detection engine module: the AI detection engine module can be regarded as an algorithm scheduler, responsible for loading/unloading algorithms, delivering video frames into the algorithms for the relevant AI detection and output, and passing the results to the inter-frame prediction module for video frame drawing.
The AI detection engine module 20 includes:
image coding and decoding: for converting the video frames of H264/H265 into RGB images;
resource monitor: the system is used for monitoring the utilization rate of the system disk, the memory, the CPU and the GPU resources;
algorithm scheduler: the detection algorithm model is used for dynamically loading or unloading related detection algorithm models according to system resource occupation, system performance and corresponding scenes;
reasoning engine: the detection algorithm model is used for running and outputting;
the algorithm module: used for image preprocessing, AI detection and post-processing.
The AI detection engine module 20 of this embodiment is not tied to any specific detection algorithm model or structure; it already supports commonly used target detection models, such as Faster RCNN/YOLO, and also supports commonly used instance segmentation algorithms, such as Mask RCNN and Yolact, which take 2D/3D images as input and output pixel-based coordinates.
The inter-frame prediction module 30 predicts the position and state of the nth frame image based on the detection result of the nth frame image based on the kalman filter algorithm to obtain a prediction result, and corrects the prediction result according to the detection result of the nth frame image to generate the nth frame image data.
The image drawing and rendering module 40 renders the coordinate information detected by the detection algorithm model to the video frame in a rectangular frame or mask area manner by using a ffmpeg filter module.
The image drawing and rendering module 40 creates a filter using FFmpeg's AVFilterGraph; the filter is used to render the coordinate information detected by the detection algorithm model onto the video frame in the form of a rectangular frame or mask area. For example, when a rectangular frame needs to be drawn, the avfilter_init_str function is used to set the upper-left corner coordinates x and y, the width and height, and the color of the rectangular frame, and the AI detection engine module can dynamically adjust the parameters of the rectangular frame according to the detection result of each frame, achieving dynamic drawing and rendering.
The video encoding module 50 is used for encoding the final video frame;
the real-time push module 60 is used for pushing the final video frame.
In this embodiment, the image drawing and rendering module 40 and the AI detection engine module 20 are separated into different threads that operate independently, so that the serial synchronous processing flow is changed into an asynchronous parallel processing mode and rendering performance is greatly improved.
The AI detection engine module 20 can be connected to multiple different video streams and dynamically loads/unloads the relevant detection algorithm models according to the load state, so that the FPS of each video push stream is satisfied as far as possible and resource utilization efficiency is improved.
The AI detection engine module 20 draws the detection result directly into the video stream by means of a filter, saving the time consumed by steps such as image transcoding, AI detection and video encoding, and greatly reducing the delay of the real-time video stream.
The result detected by the AI detection engine module 20 is used as the prediction target: a correlation filter predicts the future detection result from the historical detection data, and the predicted target is corrected with the AI detection result of the next frame. This avoids the delay caused by video rendering having to wait for the AI detection result to be output, realizes inter-frame prediction, and improves the real-time performance of video rendering.
It should be noted that, in the system provided in the foregoing embodiment, when implementing the functions thereof, only the division of the foregoing functional modules is used as an example, in practical application, the foregoing functional allocation may be implemented by different functional modules, that is, the internal structure of the device is divided into different functional modules, so as to implement all or part of the functions described above. In addition, the system and method embodiments provided in the foregoing embodiments belong to the same concept, and specific implementation processes of the system and method embodiments are detailed in the method embodiments, which are not repeated herein.
The following performance comparison data were obtained by running the YOLOv5s model on an RTX 3080 Ti graphics card with the same 8 RTSP video streams (FPS: 25):
According to the comparison data, the parallel asynchronous rendering scheme provided by this embodiment greatly reduces the rendering delay of the real-time video stream; when processing video data of the same order of magnitude and loading the same algorithm model, the resource occupancy rate is lower and resource utilization efficiency is greatly improved.
The foregoing has described the invention in detail. Specific examples are used herein to explain the principles and embodiments of the invention, and the description of the above embodiments is only intended to help understand the method of the invention and its core idea. Meanwhile, those of ordinary skill in the art may, according to the idea of the invention, make changes in the specific embodiments and the scope of application. In summary, the content of this description should not be construed as limiting the invention.

Claims (8)

1. An AI asynchronous detection and real-time rendering method for multipath video streams, characterized by comprising the following steps:
S1, multiple camera streams are split into frames by a video decoding module to obtain each frame image, and the latest frame image is recorded as the Nth frame image;
S2, the AI detection engine module determines the frame-extraction interval and the corresponding detection algorithm model according to its own performance evaluation, sends the Nth frame image obtained from S1 into the detection algorithm model for the relevant detection and identification at fixed time intervals, and pushes the detection and identification result into the corresponding result queue to obtain the detection result of the Nth frame image;
in step S2, the AI detection engine module determines the frame-extraction interval and the corresponding detection algorithm model according to its own performance evaluation; the specific process includes the following steps:
S21, the AI detection engine module automatically reads the output frame rate of the multipath video streams, recording the output frame rate as r and the number of video streams as N;
S22, based on the loaded detection algorithm model, the AI detection engine module judges whether the processing frame rate FPS of the current detection algorithm model for the current input streams is enough to cope with the N video streams, i.e. judges whether the processing frame rate FPS of the current detection algorithm model is greater than or equal to N×r;
if yes, it judges whether the processing frame rate FPS of the current detection algorithm model is greater than or equal to 2×N×r;
if the processing frame rate FPS is greater than or equal to 2×N×r, half of the currently loaded detection algorithm models are unloaded and the related threads are stopped, reducing the resource occupancy rate of the AI detection engine module;
if the processing frame rate FPS is less than 2×N×r, the current detection algorithm model is retained and the process ends;
if not (the processing frame rate FPS is less than N×r), step S23 is entered;
S23, judging whether the remaining resources of the AI detection engine module can load a new detection algorithm model;
if yes, a new thread is started and a new detection algorithm model is loaded to jointly process the N video streams;
if not, the output frame rate r of each video stream is reduced, and the frame-extraction interval of the current detection algorithm model for each video stream is adjusted until the input of the N video streams can be handled;
S3, the inter-frame prediction module predicts the position and state of the detection frame in the Nth frame image by a Kalman filtering algorithm, based on the Nth frame image and the (N-1)th frame image whose detection result has already been output, according to the historical position and displacement speed of the detection frame, so as to obtain the Nth frame image data;
S4, the image drawing and rendering module draws the final video frame according to the information of the Nth frame image data and the detection result of the Nth frame image;
S5, the final video frame is encoded and pushed through the video encoding module and the real-time streaming module, so that the user terminal can see the real-time picture after AI rendering.
2. The AI asynchronous detection and real-time rendering method of claim 1, wherein: in step S3, the position and state of the detection frame in the Nth frame image are predicted; the specific process includes the following steps:
S31, initializing the state vector X and the state covariance matrix P;
S32, defining the state transition matrix F, the observation matrix H, the observation noise covariance matrix R and the process noise covariance matrix Q;
S33, predicting the position of the target detection frame: X' = F·X;
S34, predicting the state covariance matrix: P' = F·P·F^T + Q;
S35, the state update obtains the corrected state vector X and state covariance matrix P.
3. The AI asynchronous detection and real-time rendering method of claim 2, wherein: in step S31,
the state vector X represents the state of the target, including position and velocity information; at the beginning, its initial value is set according to the detection result of the first frame;
the expression of the state vector X is:
X = [x1, y1, x2, y2, vx1, vy1, vx2, vy2]^T
the state covariance matrix P represents the uncertainty of the state estimate and may initially be defined as the identity matrix;
the expression of the state covariance matrix P is:
P = I (the 8×8 identity matrix)
In the above, (x1, y1) are the coordinates of the upper-left corner of the detection frame, (x2, y2) are the coordinates of the lower-right corner of the detection frame, and vx1, vy1, vx2, vy2 are the velocities of the upper-left and lower-right corner coordinates, respectively.
4. The AI asynchronous detection and real-time rendering method of claim 2, wherein: in step S35, the state update obtains the corrected state vector X and state covariance matrix P; the specific process includes the following steps:
S351, a new observed value Z is obtained from the target detection algorithm: the upper-left corner coordinates (x1, y1) and lower-right corner coordinates (x2, y2) of the target detection frame in the new frame;
S352, the observation residual Y is calculated according to the predicted state vector X' and the observation matrix H; the observation residual Y represents the difference between the predicted value and the actual observed value, and the predicted state vector X' is the position predicted in step S33;
S353, the Kalman gain K is calculated according to the predicted state covariance matrix P', the observation matrix H and the observation noise covariance matrix R; the Kalman gain K is a weight matrix used to balance the uncertainty of the predicted value and the observed value;
S354, the predicted state vector X' is corrected using the Kalman gain K and the observation residual Y, obtaining the updated state vector X;
S355, the predicted state covariance matrix P' is corrected using the Kalman gain K and the observation matrix H, obtaining the updated state covariance matrix P;
S36, repeating steps S33-S35.
5. A system for implementing the AI asynchronous detection and real-time rendering method of any of the preceding claims 1-4, characterized in that the system comprises:
a video decoding module (10), wherein the video decoding module (10) decodes the RTSP/RTMP stream by using a video processing tool FFmpeg to obtain frames of images;
the AI detection engine module (20), which dynamically evaluates the load of the current AI detection engine module (20) and dynamically loads different detection algorithm models according to the number of accessed video streams, the frame rate of the original video streams and the category of the configured detection algorithm model;
the inter-frame prediction module (30), which predicts the position and state of the Nth frame image according to the detection result of the (N-1)th frame image based on a Kalman filtering algorithm to obtain a prediction result, and corrects the prediction result according to the detection result of the Nth frame image to generate the Nth frame image data;
the image drawing and rendering module (40), which uses an FFmpeg filter module to render the coordinate information detected by the detection algorithm model onto the video frame in the form of a rectangular frame or mask area;
a video encoding module (50), the video encoding module (50) being configured to encode a final video frame;
and the real-time streaming module (60) is used for streaming the final video frames.
6. The system according to claim 5, wherein: the AI detection engine module (20) uses dlopen/dlclose to dynamically load/unload the relevant detection algorithm models at runtime, and the detection algorithm models include a target detection model and an instance segmentation model.
7. The system according to claim 6, wherein: the target detection model is used to detect a certain target in real time and includes the YOLO series and Faster RCNN; the instance segmentation model is used to segment the boundary of a certain target object and includes Yolact and Mask RCNN.
8. The system according to claim 5, wherein: the AI detection engine module (20) includes:
image encoding and decoding: for converting H.264/H.265 video frames into RGB images;
resource monitor: for monitoring the utilization of the system disk, memory, CPU and GPU resources;
algorithm scheduler: for dynamically loading or unloading the relevant detection algorithm models according to system resource occupation, system performance and the corresponding scene;
inference engine: for running the detection algorithm models and outputting their results;
algorithm module: for image preprocessing, AI detection and post-processing.
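As a rough illustration of the algorithm scheduler above, the sketch below chooses a detection model from monitored resource utilization and stream count; the thresholds, model names and data structure are assumptions for illustration only.

#include <string>

struct ResourceSnapshot {
    double gpu_util;  // 0.0-1.0, reported by the resource monitor
    double mem_util;  // 0.0-1.0
};

// Pick a model category: a lighter model under pressure, a heavier,
// more accurate one when resources allow.
std::string pick_model(const ResourceSnapshot& r, int num_streams) {
    if (r.gpu_util > 0.85 || r.mem_util > 0.90 || num_streams > 16)
        return "light_yolo_variant";   // keeps all streams real-time
    return "mask_rcnn";                // better boundaries at higher cost
}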
CN202310549343.2A 2023-05-16 2023-05-16 AI asynchronous detection and real-time rendering method and system for multipath video streams Active CN116260990B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310549343.2A CN116260990B (en) 2023-05-16 2023-05-16 AI asynchronous detection and real-time rendering method and system for multipath video streams

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310549343.2A CN116260990B (en) 2023-05-16 2023-05-16 AI asynchronous detection and real-time rendering method and system for multipath video streams

Publications (2)

Publication Number Publication Date
CN116260990A (en) 2023-06-13
CN116260990B (en) 2023-07-28

Family

ID=86686575

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310549343.2A Active CN116260990B (en) 2023-05-16 2023-05-16 AI asynchronous detection and real-time rendering method and system for multipath video streams

Country Status (1)

Country Link
CN (1) CN116260990B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116761018B (en) * 2023-08-18 2023-10-17 湖南马栏山视频先进技术研究院有限公司 Real-time rendering system based on cloud platform
CN117196999B (en) * 2023-11-06 2024-03-12 浙江芯劢微电子股份有限公司 Self-adaptive video stream image edge enhancement method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105933724A (en) * 2016-05-23 2016-09-07 福建星网视易信息系统有限公司 Video producing method, device and system
CN114882400A (en) * 2022-04-28 2022-08-09 华北水利水电大学 Aggregate detection and classification method based on AI intelligent machine vision technology

Family Cites Families (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100571836B1 (en) * 2004-03-05 2006-04-17 삼성전자주식회사 Method and apparatus for detecting video shot change of a motion picture
CN102821323B (en) * 2012-08-01 2014-12-17 成都理想境界科技有限公司 Video playing method, video playing system and mobile terminal based on augmented reality technique
JP6373056B2 (en) * 2014-05-14 2018-08-15 キヤノン株式会社 IMAGING DEVICE, IMAGING DEVICE CONTROL METHOD, AND IMAGE PROCESSING METHOD
CN105792002B (en) * 2014-12-18 2019-07-02 广州市动景计算机科技有限公司 Video Rendering method and device
US20210344991A1 (en) * 2016-10-13 2021-11-04 Skreens Entertainment Technologies, Inc. Systems, methods, apparatus for the integration of mobile applications and an interactive content layer on a display
US11402909B2 (en) * 2017-04-26 2022-08-02 Cognixion Brain computer interface for augmented reality
WO2019164518A1 (en) * 2018-02-25 2019-08-29 Nokia Solutions And Networks Oy Method and system for automated dynamic network slice deployment using artificial intelligence
CN111833861A (en) * 2019-04-19 2020-10-27 微软技术许可有限责任公司 Artificial intelligence based event evaluation report generation
CN110490901A (en) * 2019-07-15 2019-11-22 武汉大学 The pedestrian detection tracking of anti-attitudes vibration
CN110852283A (en) * 2019-11-14 2020-02-28 南京工程学院 Helmet wearing detection and tracking method based on improved YOLOv3
CN110929683B (en) * 2019-12-09 2021-01-22 北京赋乐科技有限公司 Video public opinion monitoring method and system based on artificial intelligence
US11508156B2 (en) * 2019-12-27 2022-11-22 Magna Electronics Inc. Vehicular vision system with enhanced range for pedestrian detection
CN111541911A (en) * 2020-04-21 2020-08-14 腾讯科技(深圳)有限公司 Video detection method and device, storage medium and electronic device
CN111508578A (en) * 2020-05-19 2020-08-07 中国电子科技集团公司第三十八研究所 Brain wave checking device and method based on artificial intelligence
CN111882656A (en) * 2020-06-19 2020-11-03 深圳宏芯宇电子股份有限公司 Graph processing method, equipment and storage medium based on artificial intelligence
WO2022022368A1 (en) * 2020-07-28 2022-02-03 宁波环视信息科技有限公司 Deep-learning-based apparatus and method for monitoring behavioral norms in jail
CN113221706B (en) * 2021-04-30 2024-03-22 西安聚全网络科技有限公司 AI analysis method and system for multi-process-based multi-path video stream
CN113536915A (en) * 2021-06-09 2021-10-22 苏州数智源信息技术有限公司 Multi-node target tracking method based on visible light camera
CN113538873A (en) * 2021-07-28 2021-10-22 东莞全芯物联科技有限公司 AI position of sitting corrects camera based on image recognition technology
CN115942105A (en) * 2021-08-09 2023-04-07 华为技术有限公司 Scheduling operation method and device of AI model in camera and camera
WO2023038898A1 (en) * 2021-09-07 2023-03-16 Vizio, Inc. Methods and systems for detecting content within media streams
CN114155284A (en) * 2021-12-15 2022-03-08 天翼物联科技有限公司 Pedestrian tracking method, device, equipment and medium based on multi-target pedestrian scene
CN115131697A (en) * 2022-05-06 2022-09-30 腾讯科技(深圳)有限公司 Video detection method, device, equipment and storage medium
CN114926781A (en) * 2022-05-27 2022-08-19 北京邮电大学 Multi-user time-space domain abnormal behavior positioning method and system supporting real-time monitoring scene
CN115205753A (en) * 2022-07-22 2022-10-18 上海交通大学 Lightweight video action understanding method and system based on computer vision
US20230071470A1 (en) * 2022-11-15 2023-03-09 Arvind Radhakrishnen Method and system for real-time health monitoring and activity detection of users
CN115984675B (en) * 2022-12-01 2023-10-13 扬州万方科技股份有限公司 System and method for realizing multipath video decoding and AI intelligent analysis

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105933724A (en) * 2016-05-23 2016-09-07 福建星网视易信息系统有限公司 Video producing method, device and system
CN114882400A (en) * 2022-04-28 2022-08-09 华北水利水电大学 Aggregate detection and classification method based on AI intelligent machine vision technology

Also Published As

Publication number Publication date
CN116260990A (en) 2023-06-13

Similar Documents

Publication Publication Date Title
CN116260990B (en) AI asynchronous detection and real-time rendering method and system for multipath video streams
US20200322619A1 (en) Systems and Methods of Encoding Multiple Video Streams for Adaptive Bitrate Streaming
CN111147867B (en) Multifunctional video coding CU partition rapid decision-making method and storage medium
US9350990B2 (en) Systems and methods of encoding multiple video streams with adaptive quantization for adaptive bitrate streaming
US8154553B2 (en) Centralized streaming game server
US8264493B2 (en) Method and system for optimized streaming game server
WO2014190308A1 (en) Systems and methods of encoding multiple video streams with adaptive quantization for adaptive bitrate streaming
EP2364190A2 (en) Centralized streaming game server
US11330263B1 (en) Machine learning based coded size estimation in rate control of video encoding
JP2013532926A (en) Method and system for encoding video frames using multiple processors
US11470327B2 (en) Scene aware video content encoding
US10536696B2 (en) Image encoding device and image encoding method
CN107277519B (en) A kind of method and electronic equipment of the frame type judging video frame
US20100104010A1 (en) Real-time rate-control method for video encoder chip
US20220408097A1 (en) Adaptively encoding video frames using content and network analysis
KR20120010790A (en) Method for detecting scene change and apparatus therof
Lu et al. Dynamic offloading on a hybrid edge–cloud architecture for multiobject tracking
CN106658024A (en) Fast video coding method
JP2023546513A (en) Data encoding method, device, and computer program
CN110659571B (en) Streaming video face detection acceleration method based on frame buffer queue
US20220217378A1 (en) Computer Software Module, a Device and a Method for Accelerating Inference for Compressed Videos
CN112188212A (en) Method and device for intelligent transcoding of high-definition monitoring video
Huang et al. EdgeBooster: Edge-assisted real-time image segmentation for the mobile web in WoT
CN116797442A (en) Video processing method, device, computer equipment and storage medium
CN113660487B (en) Parameter determination method and device for distributing corresponding bit number for frame image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant