CN112330650A - Retrieval video quality evaluation method - Google Patents
- Publication number
- CN112330650A (application CN202011263356.6A)
- Authority
- CN
- China
- Prior art keywords
- video
- retrieval
- evaluation method
- calculating
- retrieved
- Prior art date
- Legal status
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20048—Transform domain processing
- G06T2207/20064—Wavelet transform [DWT]
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Quality & Reliability (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a retrieval video quality evaluation method, which comprises the following steps: calculating an environment index of the retrieval video; calculating target physical characteristics of the retrieval video; performing a discrete wavelet transform on the retrieval video; and inputting the retrieval video, the environment index, the target physical characteristics and the discrete-wavelet-transformed image into feature layers of different granularity in a neural network, finally forming, through learning on mass data, a model that can classify videos by quality. The invention combines traditional digital image processing with deep learning to analyze and classify the intrinsic characteristics of the video to be retrieved, so as to improve the quality of different types of retrieval videos and to provide parameter-tuning suggestions for a video retrieval and comparison system.
Description
Technical Field
The invention relates to video processing and analysis, and in particular to a method for evaluating the quality of videos used for retrieval.
Background
With the spread of urban video surveillance systems, the way public security departments investigate and solve crimes has changed greatly, and solving cases with on-scene video (video investigation) has been widely developed and applied. In video investigation, retrieving and comparing suspected targets and their behavior is an important requirement.
Research on video retrieval has made great progress: many retrieval models have been proposed and continuously improved and verified in practice, which to some extent makes it much easier for users to find satisfactory targets. However, most retrieval systems still have serious robustness problems. For some user queries the retrieval results are of high quality, while for others the results contain many targets unrelated to the query. Even systems generally recognized as having good average retrieval performance may return unsatisfactory results for certain queries. In short, there is often a large difference between the retrieval results for different queries.
The reason is that existing retrieval and comparison technologies study only the accuracy of retrieval and comparison; they neglect evaluating the environment of the input video and the physical characteristics of the targets in it, and they lack any analysis and classification of the quality of the video to be retrieved, or any means of improving that quality. If an arbitrary, uncharacterized video is fed into the retrieval and comparison system, an undesirable or unpredictable retrieval result is inevitable, and in many complex cases the retrieval effect is poor.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a retrieval video quality evaluation method that combines traditional digital image processing with deep learning to analyze and classify the characteristics of the video to be retrieved, so as to improve the quality of different types of retrieval videos and provide parameter-tuning suggestions for a video retrieval and comparison system.
The technical scheme adopted for realizing the aim of the invention is a retrieval video quality evaluation method, which comprises the following steps:
Calculating the environment index of the retrieval video. Because low sharpness, abnormal brightness, low contrast and similar defects of the video picture cause the retrieval system to output unsatisfactory results, the invention first uses digital image processing to compute environment quality indexes such as sharpness, brightness anomaly and contrast for a video segment, and combines the three indexes into an environment index score for the video.
Calculating the target physical characteristics of the retrieval video. Targets that are too small in the video frame, or that move too fast, are difficult for a retrieval system to detect. A general object detection algorithm based on a deep neural network is therefore used to detect general targets in the video sequence frames, and the target size in each frame is analyzed from the size of its bounding box; a multi-object tracking algorithm is then used to obtain physical-characteristic quality indexes such as target motion speed. The target size and motion speed are combined to give the physical-characteristic index score of the video targets.
Performing a discrete wavelet transform on the retrieval video. Video captured with a device such as a mobile phone by filming a display such as a computer monitor is recaptured video. For this special kind of video, environment indexes such as sharpness and physical-characteristic indexes such as target size are no different from those of normal video, but it may contain noise such as moiré patterns, and recaptured video often shakes, which greatly reduces the accuracy of the retrieval system. The wavelet transform supports multi-resolution analysis and can focus on arbitrary details of a signal for time-frequency analysis at multiple resolutions. The input image is converted with a discrete wavelet transform (DWT); the transformed image highlights noise such as moiré much better and thus yields more discriminative features.
Inputting the retrieval video, the environment index, the target physical characteristics and the discrete-wavelet-transformed image into feature layers of different granularity in a neural network, finally forming, through learning on mass data, a model that can classify videos by quality.
In the above technical solution, calculating the environment index of the retrieval video includes calculating sharpness, calculating brightness anomaly, and calculating contrast; the three indexes are calculated as follows:
the calculation of the sharpness includes:
where DR is the sharpness (a larger DR means a sharper image), x and y are the horizontal and vertical coordinates, p(x,y) is the pixel value at coordinate point (x, y), and w and h are the width and height of the image block, respectively.
The luminance abnormality calculation includes:
In the formula, CAST is the deviation value: CAST less than 1 indicates normal brightness and CAST greater than 1 indicates abnormal brightness; when CAST indicates an anomaly, DA greater than 0 means the picture is too bright and DA less than 0 means it is too dark. p(x,y) is the pixel value at coordinate point (x, y), w and h are the width and height of the image block, Mean is the mean brightness, and Hist is the gray-level histogram of the image block.
The calculation of the contrast includes:
where Contrast is the contrast value (a larger Contrast means better contrast in the image block), x and y are the horizontal and vertical coordinates, p(x,y) is the pixel value at coordinate point (x, y), and w and h are the width and height of the image block, respectively.
In the above technical solution, the discrete wavelet transform performed on the retrieval video expands the signal as

x(t) = Σk c0[k]·φ0,k(t) + Σj Σk dj[k]·ψj,k(t)

where x(t) is the transformed result, the c0[k] and dj[k] are the expansion coefficients, φ0,k(t) is the scale function, and the ψj,k(t) are the wavelet functions.
The method adopts a multi-input deep convolutional neural network. The original video-sequence image, the environment quality index, the physical-characteristic index and the DWT-transformed image are used as four inputs, which are fed or fused into feature layers of different granularity in the neural network; through learning on mass data, the network finally forms a model that can classify videos by quality.
With the surge in the amount of surveillance video, video scenes and types have become more and more complex; classifying and analyzing the quality of videos to be retrieved, and improving that quality, greatly helps to improve video retrieval quality. The invention innovatively proposes a multi-input convolutional neural network that combines traditional digital image processing with deep learning, analyzes and classifies the video to be retrieved from all angles, and solves the problem that a single network or method cannot accurately assess the quality of complex video. The invention classifies the videos to be retrieved; different classes have different quality attributes. After videos with different quality attributes are evaluated against the actual retrieval requirements, videos of the classes that meet the conditions are selected as input to the retrieval and comparison system, improving retrieval efficiency.
Drawings
Fig. 1 is a flowchart of a retrieval video quality evaluation method according to the present invention.
FIG. 2 is a schematic diagram of the structure of the multi-input deep convolutional neural network of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments.
S1, calculating the environment index of the retrieval video
So that the indexes can be fed into the deep convolutional neural network, when calculating the video environment indexes the original image is divided into 16 × 16 image blocks; each image block produces one index value, and the indexes are input into the model as a feature map (1/16 the size of the original image).
The sharpness is calculated from the differences between adjacent pixels in the X and Y directions of the video image. The formula is as follows:
where DR is the sharpness (a larger DR means a sharper image), x and y are the horizontal and vertical coordinates, p(x,y) is the pixel value at coordinate point (x, y), and w and h are the width and height of the image block, respectively.
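As a concrete illustration of the block-wise sharpness index, the sketch below computes one DR value per 16 × 16 block, yielding the 1/16-size feature map described above. The exact patent formula is only shown as an image, so the mean absolute difference of horizontally and vertically adjacent pixels is an assumed stand-in.

```python
import numpy as np

def block_sharpness(img, bs=16):
    """Per-block sharpness map from adjacent-pixel differences.

    img: 2-D grayscale array. Returns an (H/bs, W/bs) feature map with
    one DR value per bs x bs block (bs=16 as in the patent). Assumed
    form: mean absolute X- and Y-direction neighbour differences.
    """
    h, w = img.shape
    hb, wb = h // bs, w // bs
    dr = np.zeros((hb, wb))
    for i in range(hb):
        for j in range(wb):
            blk = img[i*bs:(i+1)*bs, j*bs:(j+1)*bs].astype(float)
            dx = np.abs(np.diff(blk, axis=1)).sum()   # X-direction differences
            dy = np.abs(np.diff(blk, axis=0)).sum()   # Y-direction differences
            dr[i, j] = (dx + dy) / (bs * bs)
    return dr
```

A flat block scores 0, while a block with strong edges scores high, matching "the larger DR is, the sharper the image".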
Brightness-anomaly calculation: whether the brightness is abnormal is determined from the gray-level histogram of the video image. The formula is as follows:
In the formula, CAST is the deviation value: CAST less than 1 indicates normal brightness and CAST greater than 1 indicates abnormal brightness; when CAST indicates an anomaly, DA greater than 0 means the picture is too bright and DA less than 0 means it is too dark. p(x,y) is the pixel value at coordinate point (x, y), w and h are the width and height of the image block, Mean is the mean brightness, and Hist is the gray-level histogram of the image block.
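A minimal sketch of the brightness-anomaly check, assuming the common mean-shift / histogram-deviation formulation around mid-gray (128); the patent's own formula is only shown as an image, so this exact form is an assumption.

```python
import numpy as np

def brightness_anomaly(block):
    """Brightness-anomaly indexes from a block's gray-level histogram.

    Returns (CAST, DA). As described: CAST > 1 flags abnormal
    brightness; then DA > 0 means too bright, DA < 0 means too dark.
    """
    blk = block.astype(float)
    n = blk.size
    da = (blk - 128.0).sum() / n                 # signed shift of mean from mid-gray
    hist, _ = np.histogram(blk, bins=256, range=(0, 256))
    # histogram-weighted spread around the shifted mean
    m = (hist * np.abs(np.arange(256) - 128.0 - da)).sum() / n
    if m == 0:
        cast = float('inf') if da != 0 else 0.0  # uniform block: anomalous iff shifted
    else:
        cast = abs(da) / m
    return cast, da
```

A uniformly bright block yields a large CAST with DA > 0; a block balanced around mid-gray yields CAST < 1.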
The contrast is calculated from the variance of adjacent pixels in the video image. The formula is as follows:
where Contrast is the contrast value (a larger Contrast means better contrast in the image block), x and y are the horizontal and vertical coordinates, p(x,y) is the pixel value at coordinate point (x, y), and w and h are the width and height of the image block, respectively.
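For illustration, the contrast index can be sketched as the pixel variance over a block; the patent's exact adjacent-pixel-variance formula is only shown as an image, so plain block variance is an assumed simplification.

```python
import numpy as np

def block_contrast(block):
    """Contrast of an image block as the variance of its pixel values.

    A larger value indicates better contrast, matching the description
    above. Assumed form: mean squared deviation from the block mean.
    """
    blk = block.astype(float)
    return float(((blk - blk.mean()) ** 2).mean())
```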
S2, calculating the target physical characteristics of the search video
An open-source general object detection and tracking algorithm is used to detect and track the general targets in the video sequence and obtain the target size and target motion speed.
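Given the per-frame bounding boxes that such a detector-plus-tracker pipeline produces, the two physical-characteristic indexes can be derived as below. The track format, units, and function name are illustrative assumptions, not the patent's specification.

```python
def target_physical_features(track, fps=25.0):
    """Target size and motion speed from one target's tracked boxes.

    track: list of (frame_idx, x, y, w, h) bounding boxes, as an
    open-source multi-object tracker would emit for one target.
    Returns (mean_area_px, mean_speed_px_per_s).
    """
    areas = [w * h for (_, _, _, w, h) in track]
    mean_area = sum(areas) / len(areas)
    speeds = []
    for (f0, x0, y0, w0, h0), (f1, x1, y1, w1, h1) in zip(track, track[1:]):
        # displacement of box centres between consecutive detections
        dx = (x1 + w1 / 2) - (x0 + w0 / 2)
        dy = (y1 + h1 / 2) - (y0 + h0 / 2)
        dt = (f1 - f0) / fps
        speeds.append((dx * dx + dy * dy) ** 0.5 / dt)
    mean_speed = sum(speeds) / len(speeds) if speeds else 0.0
    return mean_area, mean_speed
```

Small areas and high speeds would then lower the physical-characteristic index score described above.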
S3, discrete wavelet transform is carried out on the retrieval video
The original image is converted with a wavelet transform algorithm; the formula is as follows:
where x(t) is the transformed result, c0[k], d0[k], d1[k], etc. are the expansion coefficients, φ0,k(t) is the scale function, and ψ0,k(t), etc. are the wavelet functions.
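A one-level 2-D DWT can be sketched directly in NumPy. The patent does not name a wavelet, so Haar is an assumed minimal choice; the high-frequency subbands are where moiré-like noise in recaptured video stands out.

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2-D discrete wavelet transform (Haar basis).

    Returns the (LL, LH, HL, HH) subbands, each half the input size in
    both dimensions (H and W must be even).
    """
    a = img.astype(float)
    # rows: average / difference of horizontally adjacent pixel pairs
    lo_r = (a[:, 0::2] + a[:, 1::2]) / 2.0
    hi_r = (a[:, 0::2] - a[:, 1::2]) / 2.0
    # columns: the same split applied to both row outputs
    ll = (lo_r[0::2, :] + lo_r[1::2, :]) / 2.0
    lh = (lo_r[0::2, :] - lo_r[1::2, :]) / 2.0
    hl = (hi_r[0::2, :] + hi_r[1::2, :]) / 2.0
    hh = (hi_r[0::2, :] - hi_r[1::2, :]) / 2.0
    return ll, lh, hl, hh
```

High-frequency vertical stripes (a crude stand-in for moiré) land in the HL subband while LH and HH stay flat, illustrating how the transform separates such noise into discriminative channels.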
S4, multi-input convolutional neural network for video quality classification
The core of the invention is as follows: a multi-input deep convolutional neural network takes the original video-sequence image, the environment quality index, the physical-characteristic index and the discrete-wavelet-transformed image as four inputs, feeds or fuses them into feature layers of different granularity in the network, and finally forms, through learning on mass data, a model that can classify videos by quality.
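The fusion of the four inputs at different granularities can be sketched at the shape level: each auxiliary input is concatenated channel-wise at the feature layer whose spatial resolution it matches (the DWT image at 1/2 resolution, the 16 × 16-block index maps at 1/16 resolution). Average pooling stands in for the convolution stages here; the real network architecture is not specified in the patent, so this is purely an assumed illustration.

```python
import numpy as np

def avg_pool(x, k):
    """k x k average pooling on a channel-last (H, W, C) array."""
    h, w = x.shape[:2]
    return x.reshape(h // k, k, w // k, k, -1).mean(axis=(1, 3))

def fuse_multi_input(frame, dwt_hh, env_map, phys_map):
    """Shape-level sketch of the multi-granularity input fusion.

    frame:    (H, W) original grayscale frame
    dwt_hh:   (H/2, W/2) DWT subband image
    env_map:  (H/16, W/16) environment-index feature map
    phys_map: (H/16, W/16) physical-characteristic feature map
    Returns the fused 1/16-resolution feature tensor that would feed
    the classifier head (five quality classes).
    """
    f = frame[..., None]                                     # add channel dim
    feat2 = avg_pool(f, 2)                                   # 1/2-res "features"
    feat2 = np.concatenate([feat2, dwt_hh[..., None]], axis=-1)
    feat16 = avg_pool(feat2, 8)                              # 1/16-res "features"
    feat16 = np.concatenate(
        [feat16, env_map[..., None], phys_map[..., None]], axis=-1)
    return feat16
```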
In the invention, the video quality classification result is used as the video quality evaluation result. In the specific implementation, the output of the multi-input convolutional neural network (i.e. the video quality analysis result) falls into five classes, as follows:
1. Videos that cannot be analyzed, such as garbled video and black-screen video, are class 1; these videos cannot be retrieved or analyzed.
2. Recaptured video is class 2; such video has strong moiré noise and some of it shakes, and it can be retrieved and analyzed after optimization.
3. Videos with very poor illumination, contrast and similar conditions, in which the target can hardly be seen, are class 3; such videos can hardly be retrieved or analyzed.
4. Videos with moderate illumination, contrast and similar conditions are class 4; these videos are blurry and can be retrieved and analyzed after optimization.
5. Videos with good environmental conditions are class 5; these videos are clear and can be retrieved and analyzed directly.
These five classes correspond to different video qualities. A model that classifies videos by quality is finally formed; once the model assigns a video to one of the five classes, the video's quality has been both evaluated and classified.
Claims (6)
1. A retrieval video quality evaluation method is characterized by comprising the following steps:
calculating an environment index of the retrieval video;
calculating target physical characteristics of the retrieval video;
performing discrete wavelet transform on the retrieval video;
and inputting the retrieval video, the environment index, the target physical characteristics and the discrete-wavelet-transformed image into feature layers of different granularity in a neural network, and finally forming, through learning on mass data, a model capable of classifying videos by quality.
2. The retrieval video quality evaluation method according to claim 1, wherein calculating the environment index of the retrieval video comprises calculating sharpness, calculating brightness anomaly, and calculating contrast.
3. The retrieved video quality evaluation method according to claim 2, wherein the calculating of the sharpness comprises:
wherein DR is the sharpness, x and y are the horizontal and vertical coordinates, p(x,y) is the pixel value at coordinate point (x, y), and w and h are the width and height of the image block, respectively.
4. The retrieved video quality evaluation method according to claim 2, wherein the luminance anomaly calculation includes:
wherein CAST is the deviation value: less than 1 indicates normal brightness and greater than 1 indicates abnormal brightness; when CAST indicates an anomaly, DA greater than 0 means too bright and DA less than 0 means too dark; p(x,y) is the pixel value at coordinate point (x, y), w and h are the width and height of the image block, Mean is the mean brightness, and Hist is the gray-level histogram of the image block.
5. The retrieved video quality evaluation method according to claim 2, wherein the calculating of the contrast comprises:
wherein Contrast is the contrast value, x and y are the horizontal and vertical coordinates, p(x,y) is the pixel value at coordinate point (x, y), and w and h are the width and height of the image block, respectively.
6. The retrieved video quality evaluation method according to claim 1 or 2, wherein the performing discrete wavelet transform on the retrieved video comprises:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011263356.6A CN112330650A (en) | 2020-11-12 | 2020-11-12 | Retrieval video quality evaluation method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112330650A true CN112330650A (en) | 2021-02-05 |
Family
ID=74319069
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011263356.6A Pending CN112330650A (en) | 2020-11-12 | 2020-11-12 | Retrieval video quality evaluation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112330650A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102421008A (en) * | 2011-12-07 | 2012-04-18 | 浙江捷尚视觉科技有限公司 | Intelligent video quality detecting system |
CN107636690A (en) * | 2015-06-05 | 2018-01-26 | 索尼公司 | Full reference picture quality evaluation based on convolutional neural networks |
US20180032846A1 (en) * | 2016-08-01 | 2018-02-01 | Nvidia Corporation | Fusing multilayer and multimodal deep neural networks for video classification |
CN107679462A (en) * | 2017-09-13 | 2018-02-09 | 哈尔滨工业大学深圳研究生院 | A kind of depth multiple features fusion sorting technique based on small echo |
CN108055501A (en) * | 2017-11-22 | 2018-05-18 | 天津市亚安科技有限公司 | A kind of target detection and the video monitoring system and method for tracking |
CN109191437A (en) * | 2018-08-16 | 2019-01-11 | 南京理工大学 | Clarity evaluation method based on wavelet transformation |
US20200226740A1 (en) * | 2019-03-27 | 2020-07-16 | Sharif University Of Technology | Quality assessment of a video |
- 2020-11-12: CN202011263356.6A patent/CN112330650A/en, active, Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022099598A1 (en) | Video dynamic target detection method based on relative statistical features of image pixels | |
CN108764085B (en) | Crowd counting method based on generation of confrontation network | |
US20110087677A1 (en) | Apparatus for displaying result of analogous image retrieval and method for displaying result of analogous image retrieval | |
CN108564052A (en) | Multi-cam dynamic human face recognition system based on MTCNN and method | |
CN113011329A (en) | Pyramid network based on multi-scale features and dense crowd counting method | |
CN111723693A (en) | Crowd counting method based on small sample learning | |
CN113592911B (en) | Apparent enhanced depth target tracking method | |
CN112766218B (en) | Cross-domain pedestrian re-recognition method and device based on asymmetric combined teaching network | |
CN111738054A (en) | Behavior anomaly detection method based on space-time self-encoder network and space-time CNN | |
CN116402850A (en) | Multi-target tracking method for intelligent driving | |
CN108257148B (en) | Target suggestion window generation method of specific object and application of target suggestion window generation method in target tracking | |
Khan et al. | Foreground detection using motion histogram threshold algorithm in high-resolution large datasets | |
CN112330650A (en) | Retrieval video quality evaluation method | |
CN110830734B (en) | Abrupt change and gradual change lens switching identification method and system | |
Panchal et al. | Multiple forgery detection in digital video based on inconsistency in video quality assessment attributes | |
Xiang et al. | Quality-distinguishing and patch-comparing no-reference image quality assessment | |
Jöchl et al. | Deep Learning Image Age Approximation-What is More Relevant: Image Content or Age Information? | |
Yin et al. | Flue gas layer feature segmentation based on multi-channel pixel adaptive | |
Hou et al. | Improved Multi-sampling Kernelized Correlation Filter Target Tracking Algorithm | |
CN116912783B (en) | State monitoring method and system of nucleic acid detection platform | |
Chauhan et al. | Smart surveillance based on video summarization: a comprehensive review, issues, and challenges | |
Wang et al. | A Survey of Crowd Counting Algorithm Based on Domain Adaptation | |
CN116309350B (en) | Face detection method and system | |
Zhu et al. | Recaptured image detection through multi-resolution residual-based correlation coefficients | |
CN117315723B (en) | Digital management method and system for mold workshop based on artificial intelligence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||