Multi-path video picture consistency detection method, system and electronic equipment

Publication number: CN117640925A
Authority: CN (China)
Prior art keywords: frame, video, frames, scene, video frame
Legal status: Pending
Application number: CN202410107600.1A
Other languages: Chinese (zh)
Inventors: 程亚辉, 李东, 刘鹏, 周筱婷
Current Assignee: Haikan Network Technology Shandong Co ltd
Original Assignee: Haikan Network Technology Shandong Co ltd
Application filed by: Haikan Network Technology Shandong Co ltd
Priority to: CN202410107600.1A
Priority date: 2024-01-26
Filing date: 2024-01-26
Publication date: 2024-03-01

Abstract

The application discloses a method, a system and electronic equipment for detecting the consistency of multiple video pictures, relating to the technical field of IPTV. The method comprises the following steps: splitting multiple video streams acquired in real time into video frame sequences and storing them in a memory queue; determining scene transition frames in the video frame sequences and searching for alignment frames among the scene transition frames to synchronize the multiple video streams; and discarding all video frames before the alignment frame and comparing the video frames after the alignment frame one by one to perform picture consistency detection. Multi-channel signal synchronization is achieved through video frame alignment; on this basis, picture consistency is compared frame by frame, real-time comparison data, screenshots and the like are stored, and finally the analysis result is displayed graphically or pushed as an abnormality alarm. By combining the characteristics of the IPTV source system, signal abnormality monitoring is performed on each link segment through picture consistency comparison, which remedies the shortcomings of conventional monitoring capability and improves the operating stability of the IPTV broadcasting system.

Description

Multi-path video picture consistency detection method, system and electronic equipment
Technical Field
The application relates to the technical field of IPTV, in particular to a method, a system and electronic equipment for detecting the consistency of multiple paths of video pictures.
Background
IPTV (Internet Protocol Television) is a technology for transmitting video and television programs over the Internet. Over the last decade, domestic IPTV has been rapidly popularized under the impetus of policy support, user demand and technological development; factors such as high definition, rich and varied content and a cross-platform viewing experience have promoted the popularization and development of IPTV services. Providing users with a good viewing experience, ensuring safe broadcasting and monitoring signal abnormalities are therefore of great importance.
At present, abnormality monitoring of IPTV system signals in the production environment mainly targets conventional picture abnormalities such as signal interruption, frozen frames, mosaic and black field. Limited by the coverage of the monitoring software's data sets, certain abnormal states such as frame skipping and picture jitter sometimes cannot be detected.
To address these problems, and drawing on the link characteristics of the IPTV system, comprehensive anomaly detection can be achieved by comparing the consistency of multiple video streams. In the traditional multi-channel consistency comparison technology, signals are synchronized according to the time offset derived from the timestamps of a reference signal and a comparison signal. However, the time offsets of multiple video streams in an IPTV platform are affected by many factors: the devices in the system come from several brands with different timestamp generation strategies, clock drift may exist between devices, and video streams transmitted over the network may suffer from transmission delay, packet loss or out-of-order delivery. All of these factors can cause the time offsets of the multiple video streams to change, so alignment based on the offset alone is not accurate enough, and accurate picture consistency detection cannot be achieved.
Disclosure of Invention
In order to solve the technical problems, the application provides the following technical scheme:
in a first aspect, an embodiment of the present application provides a method for detecting consistency of multiple video frames, including:
splitting a multipath video stream acquired in real time into a video frame sequence and storing the video frame sequence into a memory queue;
determining scene transition frames in the video frame sequence, and searching alignment frames through the scene transition frames to realize the synchronization of multiple paths of video streams;
and after the alignment frame is determined, discarding all video frames before the alignment frame, and comparing the video frames after the alignment frame one by one to perform picture consistency detection.
In one possible implementation manner, the splitting the multi-path video stream acquired in real time into a video frame sequence and storing the video frame sequence in a memory queue includes:
carrying out data preprocessing on the multipath video streams to obtain video stream TS fragments;
identifying and transcoding the TS fragments of the video stream based on ffmpeg to obtain parameter information, wherein the parameter information comprises: video format, resolution, code rate, and frame rate;
the TS fragments of the video stream are split into video frame sequences after being identified and transcoded;
reading each frame of data in the video frame sequence, and performing scaling, denoising, segmentation and enhancement processing on each frame of data;
and writing the finally processed frame data into a memory queue, so that each frame image of the video stream is cached.
In one possible implementation manner, the determining a scene transition frame in the video frame sequence, searching an alignment frame through the scene transition frame, and implementing synchronization of multiple paths of video streams includes:
converting the RGB image of the video frame sequence stored in the memory queue into a gray scale image;
then calculating a gray level histogram from the gray level image of each frame;
calculating the difference between the gray level histograms of two adjacent frames in each video frame sequence, and if the difference is larger than a first preset threshold value, extracting the corresponding frame as a scene transition frame and adding it to a scene transition frame list;
and carrying out pairwise similarity comparison on scene conversion frames in different scene conversion frame lists, and extracting aligned frames if the similarity is larger than a second preset threshold value.
In one possible implementation manner, the step of performing pairwise similarity comparison on the scene transition frames in the different scene transition frame lists, and extracting the alignment frame if the similarity is greater than a second preset threshold value includes:
selecting a first scene transition frame in the first scene transition frame list as a reference frame;
comparing the scene transition frames in the other scene transition frame list with the reference frame, and calculating the peak signal-to-noise ratio (PSNR) between each scene transition frame and the reference frame, wherein the higher the PSNR value, the smaller the picture difference between the scene transition frame and the reference frame;
and determining a second scene transition frame with the maximum PSNR value with the reference frame in the other scene transition frame list as an alignment frame with the reference frame.
In one possible implementation, the calculating the peak signal-to-noise ratio PSNR between each scene transition frame and the reference frame includes:

PSNR = 10\log_{10}\left(\frac{MAX^{2}}{MSE}\right), \quad MSE = \frac{1}{m}\sum_{i=1}^{m}(x_i - y_i)^{2}

wherein MAX is the maximum value of the image pixel values, and MSE is the mean square error, which measures the difference of the pixel values at each position of the two images; x_i is the i-th pixel value of the original image, y_i is the i-th pixel value of the image to be compared, and m is the total number of image pixels.
In one possible implementation, if two alignment frames are determined from the scene transition frame list and their indexes are consecutive in the scene transition frame list, the latter alignment frame is extracted as the final alignment frame.
In one possible implementation manner, the discarding all video frames before the alignment frame after the alignment frame is determined, and comparing the video frames after the alignment frame one by one to perform picture consistency detection, includes:
after the alignment frames are determined, deleting all frames before the alignment frames in the video frame sequence to realize signal synchronization;
performing frame-by-frame similarity comparison on all video frames after the alignment frame in the video frame sequence based on the SSIM structural similarity algorithm;
and if the comparison result is larger than a third preset threshold value, confirming that the pictures are consistent; or, if the comparison result is smaller than the third preset threshold value, performing misalignment comparison, and if the comparison result after misalignment comparison is larger than the third preset threshold value, confirming that the pictures are consistent.
In one possible implementation manner, if the comparison result is smaller than the third preset threshold, performing misalignment comparison includes:
determining the two video frames whose comparison result is smaller than the third preset threshold value as a reference video frame and a shifted video frame, respectively;
if the video frame sequence containing the reference video frame has lost frames, moving the shifted video frame left by n positions and comparing it with the reference video frame again; or, if the video frame sequence containing the shifted video frame has lost frames, moving the shifted video frame right by n positions and comparing it with the reference video frame again, wherein n is the number of lost frames and its value is determined according to the network environment.
In a second aspect, an embodiment of the present application provides a multi-path video frame consistency detection system, including:
the video processing module is used for splitting the multipath video stream acquired in real time into a video frame sequence and storing the video frame sequence into the memory queue;
the video stream synchronization module is used for determining scene conversion frames in the video frame sequence, searching alignment frames through the scene conversion frames and realizing the synchronization of multiple paths of video streams;
and the consistency detection module is used for discarding all video frames before the alignment frame after the alignment frame is determined, and comparing the video frames after the alignment frame one by one to perform picture consistency detection.
In a third aspect, an embodiment of the present application provides an electronic device, including:
a processor;
a memory;
and a computer program, wherein the computer program is stored in the memory, the computer program comprising instructions that, when executed by the processor, cause the electronic device to perform the multi-way video picture consistency detection method of any of the possible implementations of the first aspect.
In the embodiment of the application, real-time video stream data is read, converted into a structured form, analyzed and written into the memory queue; video frames are aligned to synchronize the multiple signals; on this basis, picture consistency is compared frame by frame, real-time comparison data, screenshots and the like are stored, and finally the analysis result is displayed graphically or pushed as an abnormality alarm. By combining the characteristics of the IPTV source system, signal abnormality monitoring is performed on each link segment through picture consistency comparison, which remedies the shortcomings of conventional monitoring capability and improves the operating stability of the IPTV broadcasting system.
Drawings
Fig. 1 is a typical link schematic diagram of an IPTV source system according to an embodiment of the present application;
fig. 2 is a flow chart of a method for detecting consistency of multiple video frames according to an embodiment of the present application;
fig. 3 is a schematic diagram of video frame alignment according to an embodiment of the present application;
fig. 4 is a schematic diagram of an A image frame provided in an embodiment of the present application;
fig. 5 is a schematic diagram of a B image frame provided in an embodiment of the present application;
fig. 6 is a schematic diagram of A and B video scene transition frames provided in an embodiment of the present application;
fig. 7 is a schematic diagram of image consistency detection according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a first set of statistical analysis SSIM thresholds provided in an embodiment of the present application;
FIG. 9 is a schematic diagram of a second set of statistical analysis SSIM thresholds provided in an embodiment of the present application;
FIG. 10 is a schematic diagram of a third set of statistical analysis SSIM thresholds provided in an embodiment of the present application;
FIG. 11 is a schematic diagram of misalignment comparison provided in an embodiment of the present application;
fig. 12 is a schematic modularized view of a multi-channel video picture consistency detection system according to an embodiment of the present application;
fig. 13 is a schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present invention is described below with reference to the drawings and the detailed description.
The multi-path video picture consistency detection method provided by the embodiment of the application is mainly used to compare multiple IPTV multicast streams and raise alarms for picture abnormalities. Taking a typical link of an IPTV signal source system as an example, as shown in fig. 1, the main signal source A before transcoding, the standby signal source B before transcoding and the signal C after transcoding are compared in pairs, and problems are found and the faulty link is located by judging picture consistency pairwise. If the comparison of the main and standby sources A and B is inconsistent, one of the signals is abnormal; the abnormal picture is identified and transcoding is switched to the normal signal source, ensuring that the output signal is normal. When the comparison of A and C, or of B and C, is found to be inconsistent, the problem lies in the transcoding link, which is then investigated and handled. The following description takes the comparison of the A and B signals as an example.
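The pairwise fault-localization logic just described can be summarized in a short sketch. This is only an illustration under stated assumptions: the function name locate_fault and the returned labels are hypothetical, while the decision rules follow the link description above (A/B inconsistency points to a source, A/C or B/C inconsistency points to the transcoding link).

```python
# Hypothetical sketch of the pairwise fault localization: the three boolean inputs
# are the A-B, A-C and B-C picture-consistency results.
def locate_fault(ab_consistent: bool, ac_consistent: bool, bc_consistent: bool) -> str:
    if ab_consistent and ac_consistent and bc_consistent:
        return "all links normal"
    if not ab_consistent:
        # Main/standby sources disagree: one source is abnormal; switch transcoding
        # to the source whose comparison with the transcoded output C still holds.
        if ac_consistent and not bc_consistent:
            return "standby source B abnormal, keep transcoding on A"
        if bc_consistent and not ac_consistent:
            return "main source A abnormal, switch transcoding to B"
        return "source abnormality, manual check required"
    # A and B agree but a comparison against C fails: the transcoding link is suspect.
    return "transcoding link abnormal, investigate the transcoder"
```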
Referring to fig. 2, the method for detecting the consistency of multiple video frames provided in this embodiment includes:
s101, splitting a multipath video stream acquired in real time into a video frame sequence and storing the video frame sequence into a memory queue.
In this embodiment, data preprocessing is first performed on the multiple video streams to obtain video stream TS segments; the TS segments are identified and transcoded based on ffmpeg to obtain parameter information, including video format, resolution, code rate and frame rate; the identified and transcoded TS segments are split into video frame sequences; each frame in the video frame sequence is read and subjected to scaling, denoising, segmentation and enhancement; and the finally processed frame data is written into the memory queue, caching each frame image of the video stream.
Specifically, for the characteristics of the IPTV system in this embodiment, the types of collected signals include, but are not limited to, two main categories: SDI signals and IP signals; IP stream signaling protocols include, but are not limited to, MPEG2, H.264, AVS+ and the like. Video information is obtained with ffprobe: both video paths use the H.264 transport protocol, with a resolution of 1080P, a code rate of 8.0 Mbps and a frame rate of 25 fps. The video stream is split into images by ffmpeg, stored as a frame sequence and read into the memory queue, with the following specific parameters: the image scaling ratio is 0.5, and an appropriate scaling ratio reduces the number of calculations and improves computational efficiency; the number of color channels of the RGB frame images is 3; every 250 frames of images are stored as one group of PNG images. A video tag is created for log recording, and the program loop runs asynchronously: the UDP stream read by ffmpeg is divided into a sequence of image frames, the frame images are resized based on OpenCV, the processed frame data and some metadata (URL, frame index, resolution) are saved as PNG images in the previously created directory every 250 frames, and the processing time of every 250 frames is recorded, looping until reading of the UDP stream stops.
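As an illustration of this acquisition step, the following minimal sketch (not the patented implementation) reads a UDP stream with ffmpeg, splits it into BGR frames, scales each frame by 0.5 with OpenCV and caches it in a memory queue; the multicast URL, the output directory and the queue size are assumptions.

```python
import os
import queue
import subprocess

import cv2
import numpy as np

URL = "udp://239.1.1.1:1234"   # hypothetical multicast address
WIDTH, HEIGHT = 1920, 1080     # 1080P source, as in the example above
SCALE = 0.5                    # image scaling ratio
frame_queue: "queue.Queue[np.ndarray]" = queue.Queue(maxsize=1000)


def capture_frames() -> None:
    """Decode the UDP stream into raw BGR frames and cache them in the memory queue."""
    os.makedirs("frames", exist_ok=True)
    proc = subprocess.Popen(
        ["ffmpeg", "-i", URL, "-f", "rawvideo", "-pix_fmt", "bgr24", "-"],
        stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
    )
    frame_bytes = WIDTH * HEIGHT * 3          # 3 color channels per frame
    index = 0
    while True:
        raw = proc.stdout.read(frame_bytes)
        if raw is None or len(raw) < frame_bytes:   # stream ended or was interrupted
            break
        frame = np.frombuffer(raw, np.uint8).reshape(HEIGHT, WIDTH, 3)
        small = cv2.resize(frame, None, fx=SCALE, fy=SCALE)
        frame_queue.put(small)
        if index % 250 == 0:                  # one PNG snapshot per 250-frame group
            cv2.imwrite(f"frames/frame_{index}.png", small)
        index += 1
    proc.stdout.close()
```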
S102, determining scene transition frames in the video frame sequence, and searching alignment frames through the scene transition frames to realize synchronization of multiple paths of video streams.
The multiple video streams are prone to picture asynchrony caused by network delay, packet loss, differences in main/standby transcoding efficiency and other factors. Picture consistency detection is based on frame-by-frame comparison between signals and must be carried out with the video streams fully synchronized, so video frame alignment is performed first as the premise of the comparison.
Referring to fig. 3, the RGB images of the A and B image frame sequences read into the memory queue in step S101 are converted into gray images, and a gray histogram is then calculated for the gray image of each frame. The difference between the gray histograms of two adjacent frames in the A image frame sequence is calculated; if it is larger than a first threshold α, the scene transition frame in the A image sequence is extracted and added to scene transition frame list L1. Likewise, the difference between the gray histograms of two adjacent frames in the B image frame sequence is calculated; if it is larger than the first threshold α, the scene transition frame in the B image sequence is extracted and added to scene transition frame list L2. The elements of L1 and L2 are then compared pairwise for similarity based on the PSNR algorithm, and an alignment frame is extracted when the similarity is larger than a second threshold β. When the indexes of two alignment frames found in the A image frame sequence are consecutive in the scene transition frame list, the latter alignment frame is extracted as the final alignment frame; requiring two consecutive alignment frames before declaring alignment improves alignment accuracy.
In this embodiment, an image frame list is analyzed and processed: each color image is converted into a gray image and its histogram is calculated; the histogram difference between the previous and current frame is computed using the mean absolute difference, and a larger difference indicates a more obvious change between frames. When the difference is larger than the first threshold of 2000, the current frame index is added to the scene transition frame list and the index and histogram difference of that frame are printed, i.e. a scene transition frame has been found. This is executed in a loop over the frame list in the queue, and the scene transition frame list is returned. The video scene transition frames of the A image frames in fig. 4 and the B image frames in fig. 5 are shown in fig. 6.
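A minimal sketch of this scene-transition detection is given below; the 256-bin histogram and the exact form of the mean absolute difference are assumptions, while the first threshold of 2000 follows the value given above.

```python
from typing import List

import cv2
import numpy as np

FIRST_THRESHOLD = 2000.0


def gray_histogram(frame_bgr: np.ndarray) -> np.ndarray:
    """Convert a BGR frame to gray scale and compute its 256-bin gray histogram."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    return cv2.calcHist([gray], [0], None, [256], [0, 256]).ravel()


def find_scene_transitions(frames: List[np.ndarray]) -> List[int]:
    """Return indexes of frames whose histogram differs strongly from the previous frame."""
    transitions: List[int] = []
    prev_hist = gray_histogram(frames[0])
    for index in range(1, len(frames)):
        hist = gray_histogram(frames[index])
        diff = float(np.mean(np.abs(hist - prev_hist)))   # mean absolute difference
        if diff > FIRST_THRESHOLD:
            transitions.append(index)                     # scene transition frame
            print(index, diff)                            # print index and histogram difference
        prev_hist = hist
    return transitions
```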
In order to accurately extract the alignment frame, the first scene transition frame in the first scene transition frame list is selected as the reference frame; the scene transition frames in the other scene transition frame list are compared with the reference frame, and the peak signal-to-noise ratio (PSNR) between each scene transition frame and the reference frame is calculated, wherein the higher the PSNR value, the smaller the picture difference between the scene transition frame and the reference frame. The second scene transition frame, i.e. the frame in the other scene transition frame list whose PSNR value with the reference frame is the largest, is determined as the frame aligned with the reference frame.
PSNR (peak signal-to-noise ratio) is an index for measuring image or video quality. It is commonly used to evaluate the degree of distortion between an image or video and an original reference image, and can also be used to judge the consistency of multiple pictures. With multiple pictures, one of them can be taken as the reference picture, the others are compared with it, and the PSNR value between each picture and the reference picture is calculated; the higher the PSNR value, the smaller the difference between the two pictures and the better the picture consistency. Generally, a PSNR above 20 dB is regarded as excellent, above 25 dB as very excellent, and above 30 dB it can be said to be "undistorted".
Specifically, the scene transition frames of video streams A and B are compared one by one based on PSNR, with the second threshold of PSNR similarity set to 25 dB. A pair of scene transition frames exceeding the threshold is obtained by calculating the PSNR value of the two images. When two pairs of frames with PSNR values above the threshold are found whose indexes are consecutive in the scene transition frame list, the alignment frame is considered found: the index of the latter consecutive frame is recorded and that frame is taken as the alignment frame. If no alignment frame is found in a group of frames, the group is deleted from the frame queue and logged. To avoid an infinite loop, if the number of comparisons exceeds a certain limit, the failure to find an alignment frame is recorded and the loop is exited.
In this embodiment, calculating the peak signal-to-noise ratio PSNR between each scene transition frame and the reference frame includes:

PSNR = 10\log_{10}\left(\frac{MAX^{2}}{MSE}\right), \quad MSE = \frac{1}{m}\sum_{i=1}^{m}(x_i - y_i)^{2}

wherein MAX is the maximum value of the image pixel values, and MSE is the mean square error, which measures the difference of the pixel values at each position of the two images; x_i is the i-th pixel value of the original image, y_i is the i-th pixel value of the image to be compared, and m is the total number of image pixels.
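The alignment search can be sketched as follows: PSNR is computed directly from the formula above (MAX = 255 for 8-bit images), and an alignment frame is reported once two matched scene-transition pairs are consecutive in the A list, as described. The exact pairing strategy between the two lists is simplified and should be read as an assumption, not the claimed implementation.

```python
from typing import List, Optional, Tuple

import numpy as np

SECOND_THRESHOLD_DB = 25.0


def psnr(x: np.ndarray, y: np.ndarray, max_value: float = 255.0) -> float:
    """PSNR = 10*log10(MAX^2 / MSE); returns +inf for identical images."""
    mse = float(np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2))
    if mse == 0.0:
        return float("inf")
    return 10.0 * float(np.log10(max_value ** 2 / mse))


def find_alignment(frames_a: List[np.ndarray], list_a: List[int],
                   frames_b: List[np.ndarray], list_b: List[int]
                   ) -> Optional[Tuple[int, int]]:
    """Return (frame index in A, frame index in B) of the final alignment frame, or None."""
    matches: List[Tuple[int, int]] = []          # (position in list_a, frame index in B)
    for pos_a, ia in enumerate(list_a):
        for ib in list_b:
            if psnr(frames_a[ia], frames_b[ib]) > SECOND_THRESHOLD_DB:
                matches.append((pos_a, ib))
                break
        # two matched scene-transition frames that are consecutive in the A list
        if len(matches) >= 2 and matches[-1][0] == matches[-2][0] + 1:
            final_pos, final_ib = matches[-1]    # the latter pair is the alignment frame
            return list_a[final_pos], final_ib
    return None
```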
S103, discarding all video frames before the alignment frame, and comparing the video frames after the alignment frame one by one to perform picture consistency detection.
Referring to fig. 7, after the video alignment frame is found, all frames before the alignment frame in the A and B video frame sequences are deleted, so that the signals are synchronized. On this basis, all frames of the A and B image frame sequences are compared frame by frame for similarity using the SSIM structural similarity algorithm; the pictures are considered consistent when the similarity is larger than a third threshold γ, and inconsistent otherwise. In a real environment, however, this comparison is often inaccurate, being easily affected by missing video frames caused by network fluctuation and packet loss, so a misalignment judgment mechanism is added to the picture consistency detection: if the pictures still compare as consistent after misalignment comparison, they are determined to be consistent; otherwise they are determined to be inconsistent.
The SSIM (structural similarity) index is widely applied in image quality evaluation and image processing to measure the similarity between two images. It considers similarity in three aspects: brightness, contrast and structure. Specifically:
Brightness refers to the overall lightness of an image and is represented by the average of its pixel values; \mu_x and \mu_y denote the pixel averages of the two images x and y, respectively. Contrast is the degree of variation of the pixel values in an image: a higher contrast indicates significant color or brightness variation, while a lower contrast indicates more uniform color or brightness; it is represented by the standard deviations \sigma_x and \sigma_y. Structure refers to high-frequency information such as edges and textures in the images and reflects image details; it is represented by \sigma_{xy}, the pixel covariance of the two images x and y.
The calculation formula of the SSIM is:

SSIM(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}

wherein C_1 and C_2 are constants introduced to avoid a zero denominator. The value of SSIM lies in [-1, 1]; the larger the value, the more similar the two images, and a threshold can be set for judging picture consistency.
Because the comparison workload is large and the algorithm parameters directly affect judgment accuracy, the program selected several important parameters through repeated tests, statistical analysis and continuous tuning: an original image scaling ratio of 0.5, an SSIM scaling ratio of 0.1 and an SSIM threshold of 0.6. The original image scaling ratio is the image scaling applied when the video frame sequence is added to the memory queue during video acquisition; the SSIM scaling ratio is the image scaling applied when two images are compared with the SSIM similarity algorithm; and an SSIM threshold of 0.6 means that the pictures are considered consistent when the calculated SSIM similarity of two images is larger than 0.6, and inconsistent otherwise. Figs. 8-10 show the experimental data of the statistical analysis. In figs. 8-10 the vertical axis is the frame count, the horizontal axis is the SSIM value, and min-max is the range of SSIM obtained in the experiment; for example, (0.9997, 10000) indicates 10000 frames whose SSIM similarity is 0.9997. Fig. 8(a), fig. 9(a) and fig. 10(a) respectively show the relationship between the SSIM similarity of the A and B video frames and the number of compared frames of the two videos; the closer the SSIM is to 1, the more similar the two video frames. Fig. 8(b), fig. 9(b) and fig. 10(b) are detail enlargements of fig. 8(a), fig. 9(a) and fig. 10(a), respectively, with the abscissa unchanged and a smaller ordinate range, showing the detail of the frame counts. As can be seen from figs. 8 to 10, the statistics in fig. 9 are optimal in terms of operating efficiency and judgment accuracy; the statistical images indicate SSIM values of 0.8 or slightly larger for consistent frames, and 0.6 is taken as the preferred threshold.
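A sketch of the frame-by-frame comparison is shown below. It follows the global SSIM formula given above rather than the windowed variant found in common libraries; the constants C1 and C2 (derived from K1 = 0.01, K2 = 0.03, L = 255), the gray-scale conversion and the 0.1 scaling ratio applied before comparison are assumptions consistent with the parameters described.

```python
import cv2
import numpy as np

SSIM_SCALE = 0.1
SSIM_THRESHOLD = 0.6


def global_ssim(x: np.ndarray, y: np.ndarray, max_value: float = 255.0) -> float:
    """SSIM(x,y) = ((2*mu_x*mu_y+C1)*(2*sigma_xy+C2)) / ((mu_x^2+mu_y^2+C1)*(sigma_x^2+sigma_y^2+C2))."""
    c1, c2 = (0.01 * max_value) ** 2, (0.03 * max_value) ** 2
    xf, yf = x.astype(np.float64).ravel(), y.astype(np.float64).ravel()
    mu_x, mu_y = xf.mean(), yf.mean()
    var_x, var_y = xf.var(), yf.var()
    cov_xy = float(np.mean((xf - mu_x) * (yf - mu_y)))
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))


def frames_consistent(frame_a: np.ndarray, frame_b: np.ndarray) -> bool:
    """Downscale both frames, convert to gray and compare against the SSIM threshold."""
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
    small_a = cv2.resize(gray_a, None, fx=SSIM_SCALE, fy=SSIM_SCALE)
    small_b = cv2.resize(gray_b, None, fx=SSIM_SCALE, fy=SSIM_SCALE)
    return global_ssim(small_a, small_b) > SSIM_THRESHOLD
```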
In this embodiment, when misalignment comparison is performed, the two video frames whose comparison result is smaller than the third preset threshold are determined as the reference video frame and the shifted video frame. If the video frame sequence containing the reference video frame has lost frames, the shifted video frame is moved left by n positions and compared with the reference video frame again; or, if the video frame sequence containing the shifted video frame has lost frames, the shifted video frame is moved right by n positions and compared with the reference video frame again, wherein n is the number of lost frames and its value is determined according to the network environment.
In this embodiment, when the comparison is inconsistent, it is first determined whether the indexes of the inconsistent frames are consecutive. If the number of consecutive inconsistent frames reaches 5, for example An\An+1\An+2 versus Bn\Bn+1\Bn+2, the B video frame queue is shifted by 1 or 2 frames, for example An\An+1\An+2 versus Bn-1\Bn\Bn+1 or Bn+1\Bn+2\Bn+3, and compared again. If the shifted comparison is consistent, frame misalignment has occurred; it is then determined whether the A queue or the B queue is misaligned, the queues are realigned, and picture consistency detection continues, as shown in fig. 7.
In fig. 11, the A video frames are the reference video frames and the B video frames are the shifted video frames. If B video frames have been lost, the B image frame sequence can be realigned with the A image frame sequence by shifting it right n times; if A video frames have been lost, the B image frame sequence can be realigned with the A image frame sequence by shifting it left n times. If the misalignment comparison is consistent, the A and B videos are considered consistent; otherwise they are considered inconsistent.
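The misalignment re-check can be sketched as follows. The run length of 5 consecutive inconsistent frames and the shift of 1 or 2 frames follow the values given above; the compare parameter stands for a per-frame consistency test such as the SSIM comparison sketched earlier, and the function only reports whether shifting the B queue restores consistency.

```python
from typing import Callable, List

import numpy as np

RUN_LENGTH = 5       # consecutive inconsistent frames that trigger the check
MAX_SHIFT_N = 2      # assumed number of lost frames, chosen from the network environment


def misalignment_consistent(frames_a: List[np.ndarray], frames_b: List[np.ndarray],
                            start: int,
                            compare: Callable[[np.ndarray, np.ndarray], bool]) -> bool:
    """Re-compare a run of frames after shifting the B queue by up to n frames."""
    for shift in range(1, MAX_SHIFT_N + 1):
        for signed in (shift, -shift):           # try shifting the B index in both directions
            ok = True
            for k in range(RUN_LENGTH):
                ia, ib = start + k, start + k + signed
                if ib < 0 or ib >= len(frames_b) or ia >= len(frames_a):
                    ok = False
                    break
                if not compare(frames_a[ia], frames_b[ib]):
                    ok = False
                    break
            if ok:
                return True                      # alignment recovered after the shift
    return False                                 # still inconsistent: report the abnormality
```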
If the final video picture comparison is inconsistent, an alarm is pushed to the network management side, reaching the mobile phone and web page terminals by mail, SMS, WeChat and the like.
Corresponding to the method for detecting the consistency of the multiple video frames provided by the embodiment, the application also provides an embodiment of a system for detecting the consistency of the multiple video frames.
Referring to fig. 12, the multi-channel video picture consistency detection system 20 includes:
the video processing module 201 is configured to split a multi-path video stream acquired in real time into a video frame sequence and store the video frame sequence in the memory queue;
the video stream synchronization module 202 is configured to determine a scene transition frame in the video frame sequence, and find an alignment frame through the scene transition frame, so as to realize synchronization of multiple paths of video streams;
and the consistency detection module 203 is configured to discard all video frames before the alignment frame, and compare the video frames after the alignment frame one by one to perform picture consistency detection.
Corresponding to the above embodiment, the present application further provides an electronic device, configured to implement detection of picture consistency of an IPTV playing system.
Referring to fig. 13, a schematic structural diagram of an electronic device according to an embodiment of the present application is provided.
As shown in fig. 13, the electronic device 300 may include: a processor 301, a memory 302 and a communication unit 303. The components may communicate via one or more buses, and it will be appreciated by those skilled in the art that the electronic device structure shown in the drawings is not limiting of the embodiments of the present application, and that it may be a bus-like structure, a star-like structure, or include more or fewer components than shown, or may be a combination of certain components or a different arrangement of components.
Wherein, the communication unit 303 is configured to establish a communication channel, so that the electronic device may communicate with other IPTV devices.
The processor 301, as the control center of the electronic device, connects the various parts of the entire electronic device through various interfaces and lines, and performs various functions of the electronic device and/or processes data by running or executing software programs and/or modules stored in the memory 302 and invoking data stored in the memory. The processor may consist of integrated circuits (ICs), for example a single packaged IC, or of multiple packaged ICs with the same or different functions connected together. For example, the processor 301 may include only a central processing unit (CPU). In the embodiment of the application, the CPU may have a single operation core or may include multiple operation cores.
Memory 302 for storing instructions for execution by processor 301, memory 302 may be implemented by any type of volatile or nonvolatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
The execution of the instructions in memory 302, when executed by processor 301, enables electronic device 300 to perform some or all of the steps of the method embodiments described above.
Corresponding to the above embodiment, the embodiment of the present application further provides a computer readable storage medium, where the computer readable storage medium may store a program, where when the program runs, the device where the computer readable storage medium is located may be controlled to execute some or all of the steps in the above method embodiment. In particular, the computer readable storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (random access memory, RAM), or the like.
Corresponding to the above embodiments, the present application also provides a computer program product comprising executable instructions which, when executed on a computer, cause the computer to perform some or all of the steps of the above method embodiments.
Those of ordinary skill in the art will appreciate that the various elements and algorithm steps described in the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In several embodiments provided herein, any of the functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a read-only memory (ROM), a random access memory (random access memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In the embodiments of the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relation of associated objects and indicates that three relations may exist; for example, A and/or B may indicate that A exists alone, A and B exist together, or B exists alone, where A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following" and similar expressions mean any combination of these items, including any combination of single or plural items. For example, at least one of a, b and c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b and c may each be single or plural.
The foregoing is merely specific embodiments of the present application, and any person skilled in the art may easily conceive of changes or substitutions within the technical scope of the present application, which should be covered by the protection scope of the present application. The protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for detecting the consistency of multiple video pictures, characterized by comprising the following steps:
splitting a multipath video stream acquired in real time into a video frame sequence and storing the video frame sequence into a memory queue;
determining scene transition frames in the video frame sequence, and searching alignment frames through the scene transition frames to realize the synchronization of multiple paths of video streams;
and discarding all video frames before the alignment frame, and comparing the video frames after the alignment frame one by one to perform picture consistency detection.
2. The method for detecting the consistency of multiple video frames according to claim 1, wherein the splitting the multiple video streams collected in real time into video frame sequences and storing the video frame sequences in the memory queue comprises:
carrying out data preprocessing on the multipath video streams to obtain video stream TS fragments;
identifying and transcoding the TS fragments of the video stream based on ffmpeg to obtain parameter information, wherein the parameter information comprises: video format, resolution, code rate, and frame rate;
the TS fragments of the video stream are split into video frame sequences after being identified and transcoded;
reading each frame of data in the video frame sequence, and performing scaling, denoising, segmentation and enhancement processing on each frame of data;
and writing the finally processed frame data into a memory queue, so that each frame image of the video stream is cached.
3. The method for detecting the consistency of multiple video pictures according to claim 1, wherein said determining the scene change frame in the video frame sequence and searching the alignment frame through the scene change frame, to realize the synchronization of multiple video streams, comprises:
converting the RGB image of the video frame sequence stored in the memory queue into a gray scale image;
then calculating a gray level histogram from the gray level image of each frame;
calculating the difference between the gray level histograms of two adjacent frames in each video frame sequence, and if the difference is larger than a first preset threshold value, extracting the corresponding frame as a scene transition frame and adding it to a scene transition frame list;
and carrying out pairwise similarity comparison on scene conversion frames in different scene conversion frame lists, and extracting aligned frames if the similarity is larger than a second preset threshold value.
4. The method for detecting the consistency of multiple video pictures according to claim 3, wherein the step of performing a pairwise similarity comparison on the scene transition frames in the different scene transition frame lists, and extracting the aligned frames if the similarity is greater than a second preset threshold value comprises:
selecting a first scene transition frame in the first scene transition frame list as a reference frame;
comparing the scene transition frames in the other scene transition frame list with the reference frame, and calculating the peak signal-to-noise ratio (PSNR) between each scene transition frame and the reference frame, wherein the higher the PSNR value, the smaller the picture difference between the scene transition frame and the reference frame;
and determining a second scene transition frame with the maximum PSNR value with the reference frame in the other scene transition frame list as an alignment frame with the reference frame.
5. The method according to claim 4, wherein calculating the peak signal-to-noise ratio PSNR between each scene transition frame and the reference frame comprises:

PSNR = 10\log_{10}\left(\frac{MAX^{2}}{MSE}\right), \quad MSE = \frac{1}{m}\sum_{i=1}^{m}(x_i - y_i)^{2}

wherein MAX is the maximum value of the image pixel values, and MSE is the mean square error, which measures the difference of the pixel values at each position of the two images; x_i is the i-th pixel value of the original image, y_i is the i-th pixel value of the image to be compared, and m is the total number of image pixels.
6. The method according to any one of claims 3 to 5, wherein if two consecutive alignment frames are determined from the scene change frame list and the two alignment frames are consecutive in index in the scene change frame list, the next alignment frame is extracted as the final alignment frame.
7. The method for detecting the consistency of multiple video pictures according to claim 1, wherein said discarding all video frames before said alignment frame and comparing video frames after said alignment frame one by one for picture consistency detection comprises:
after the alignment frame is determined, deleting all frames before the alignment frame in the video frame sequence to realize signal synchronization;
performing frame-by-frame similarity comparison on all video frames after the alignment frame in the video frame sequence based on the SSIM structural similarity algorithm;
and if the comparison result is larger than a third preset threshold value, confirming that the pictures are consistent; or, if the comparison result is smaller than the third preset threshold value, performing misalignment comparison, and if the comparison result after misalignment comparison is larger than the third preset threshold value, confirming that the pictures are consistent.
8. The method for detecting the consistency of multiple video pictures according to claim 7, wherein if the comparison result is smaller than the third preset threshold value, performing misalignment comparison comprises:
determining the two video frames whose comparison result is smaller than the third preset threshold value as a reference video frame and a shifted video frame, respectively;
if the video frame sequence containing the reference video frame has lost frames, moving the shifted video frame left by n positions and comparing it with the reference video frame again; or, if the video frame sequence containing the shifted video frame has lost frames, moving the shifted video frame right by n positions and comparing it with the reference video frame again, wherein n is the number of lost frames and its value is determined according to the network environment.
9. A multi-channel video picture consistency detection system, comprising:
the video processing module is used for splitting the multipath video stream acquired in real time into a video frame sequence and storing the video frame sequence into the memory queue;
the video stream synchronization module is used for determining scene conversion frames in the video frame sequence, searching alignment frames through the scene conversion frames and realizing the synchronization of multiple paths of video streams;
and the consistency detection module is used for discarding all video frames before the alignment frame, and comparing the video frames after the alignment frame one by one to perform picture consistency detection.
10. An electronic device, comprising:
a processor;
a memory;
and a computer program, wherein the computer program is stored in the memory, the computer program comprising instructions that, when executed by the processor, cause the electronic device to perform the multi-way video picture consistency detection method of any of claims 1 to 8.
CN202410107600.1A 2024-01-26 2024-01-26 Multi-path video picture consistency detection method, system and electronic equipment Pending CN117640925A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410107600.1A CN117640925A (en) 2024-01-26 2024-01-26 Multi-path video picture consistency detection method, system and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410107600.1A CN117640925A (en) 2024-01-26 2024-01-26 Multi-path video picture consistency detection method, system and electronic equipment

Publications (1)

Publication Number Publication Date
CN117640925A 2024-03-01

Family

ID=90025566

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410107600.1A Pending CN117640925A (en) 2024-01-26 2024-01-26 Multi-path video picture consistency detection method, system and electronic equipment

Country Status (1)

Country Link
CN (1) CN117640925A (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102014296A (en) * 2010-12-10 2011-04-13 北京中科大洋科技发展股份有限公司 Video consistency monitoring technology based on self-adaptive edge matching and local stream processing algorithm
CN104079924A (en) * 2014-03-05 2014-10-01 北京捷成世纪科技股份有限公司 Mistakenly-played video detection method and device
CN106713963A (en) * 2016-11-28 2017-05-24 天脉聚源(北京)科技有限公司 Method and apparatus for aligning play progress of video streams
CN109743591A (en) * 2019-01-04 2019-05-10 广州虎牙信息科技有限公司 The method of video frame alignment
CN112714309A (en) * 2020-12-22 2021-04-27 北京百度网讯科技有限公司 Video quality evaluation method, device, apparatus, medium, and program product
CN113316001A (en) * 2021-05-25 2021-08-27 上海哔哩哔哩科技有限公司 Video alignment method and device
US20210352341A1 (en) * 2020-05-06 2021-11-11 At&T Intellectual Property I, L.P. Scene cut-based time alignment of video streams
CN114640881A (en) * 2020-12-15 2022-06-17 武汉Tcl集团工业研究院有限公司 Video frame alignment method and device, terminal equipment and computer readable storage medium
CN115941939A (en) * 2022-11-03 2023-04-07 咪咕视讯科技有限公司 Video frame alignment method, device, equipment and storage medium
CN117037009A (en) * 2022-04-28 2023-11-10 腾讯科技(深圳)有限公司 Video identification method, device, computer equipment and storage medium
CN117152660A (en) * 2023-08-31 2023-12-01 维沃移动通信有限公司 Image display method and device
CN117156125A (en) * 2023-10-25 2023-12-01 帕科视讯科技(杭州)股份有限公司 IPTV live stream real-time monitoring method and server based on artificial intelligence
CN117201845A (en) * 2023-09-15 2023-12-08 海看网络科技(山东)股份有限公司 Live program head-cast and replay content consistency monitoring method based on frame comparison
CN117278776A (en) * 2023-04-23 2023-12-22 青岛尘元科技信息有限公司 Multichannel video content real-time comparison method and device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination