CN112330650A - Retrieval video quality evaluation method - Google Patents

Retrieval video quality evaluation method Download PDF

Info

Publication number
CN112330650A
CN112330650A CN202011263356.6A CN202011263356A CN112330650A CN 112330650 A CN112330650 A CN 112330650A CN 202011263356 A CN202011263356 A CN 202011263356A CN 112330650 A CN112330650 A CN 112330650A
Authority
CN
China
Prior art keywords
video
retrieval
evaluation method
calculating
retrieved
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011263356.6A
Other languages
Chinese (zh)
Inventor
李庆春
严国建
李志强
王彬
曾璐
梁瑞凡
许璐
谢兰迟
晏于文
槐森
赵明磊
于晏平
潘培培
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WUHAN DAQIAN INFORMATION TECHNOLOGY CO LTD
Original Assignee
WUHAN DAQIAN INFORMATION TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WUHAN DAQIAN INFORMATION TECHNOLOGY CO LTD filed Critical WUHAN DAQIAN INFORMATION TECHNOLOGY CO LTD
Priority to CN202011263356.6A priority Critical patent/CN112330650A/en
Publication of CN112330650A publication Critical patent/CN112330650A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20048Transform domain processing
    • G06T2207/20064Wavelet transform [DWT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a retrieval video quality evaluation method, which comprises the following steps: calculating an environment index of the retrieval video; calculating target physical characteristics of the retrieval video; performing discrete wavelet transform on the retrieval video; and inputting the retrieved video, the environmental index, the target physical characteristic and the image after discrete wavelet transformation into different granularity feature layers of a neural network, and finally forming a model capable of classifying the video according to the video quality through mass data learning. The invention combines the traditional digital image processing technology and the deep learning method to analyze and classify the self characteristics of the video to be retrieved so as to improve the quality of the retrieved videos of different types and provide a parameter adjusting suggestion for a video retrieval comparison system.

Description

Retrieval video quality evaluation method
Technical Field
The invention relates to processing and analyzing of videos, in particular to a retrieval video quality evaluation method.
Background
With the popularization of urban video monitoring systems, the manner of criminal investigation and solution solving of the public security department is greatly changed, and investigation and solution solving (namely video investigation) by using field videos is greatly developed and applied. In video investigation applications, retrieval and comparison of suspected targets and behaviors thereof are important requirements.
Research on video retrieval technology has made great progress, many retrieval models are proposed one after another and are improved and verified continuously in practice, which greatly facilitates users to find satisfactory targets to some extent, but most retrieval systems still have serious robustness problems, for example, for some user queries, retrieval results are high in quality, and for other queries, retrieval results often contain many targets unrelated to queries; moreover, even those systems that are generally recognized to have good average search performance, their returned results may not be satisfactory for certain queries, and in short, there is often a large difference between the search results for different queries.
The reason is that the existing retrieval and comparison technologies only study the accuracy of retrieval and comparison, neglect to evaluate the environment of the input video and the physical characteristics of the target in the video, lack to analyze and classify the quality of the video to be retrieved, and lack to solve the problem of video quality and improve the quality of the video to be retrieved. If any unknown video is input into the retrieval comparison system, an undesirable or unpredictable retrieval result is inevitably obtained, and in many complex cases, the retrieval effect is undesirable.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a retrieval video quality evaluation method, and the method is combined with the traditional digital image processing technology and a deep learning method to analyze and classify the characteristics of the video to be retrieved so as to improve the quality of different types of retrieval videos and provide a parameter adjustment suggestion for a video retrieval comparison system.
The technical scheme adopted for realizing the aim of the invention is a retrieval video quality evaluation method, which comprises the following steps:
and calculating the environmental index of the retrieval video. Because the retrieval system can output unsatisfactory results due to low definition, abnormal brightness, low contrast and the like of a video picture, the invention firstly utilizes a digital image processing technology to calculate the environment quality indexes such as definition, abnormal brightness, contrast and the like of a video segment and comprehensively analyzes the three indexes to obtain the environment index score of the video.
And calculating the target physical characteristics of the retrieval video. The method comprises the steps that targets in a video frame are too small or the moving speed of the targets is too high, so that a retrieval system is difficult to detect the targets from the video, a general target detection algorithm based on a deep neural network is used for detecting the general targets in the video sequence frame, and the size of the targets in the frame is analyzed according to the size of a target frame; and then, acquiring physical characteristic quality indexes such as target motion speed and the like by using a multi-target tracking algorithm, and comprehensively analyzing the size and the motion speed of the target to obtain the physical characteristic index score of the video target.
And performing discrete wavelet transform on the retrieval video. The video that uses shooting equipment such as cell-phone, shoots and show in displays such as computer is the video of reprinting, and this kind of special video, environmental index such as its definition and physical characteristic index such as target size are as good as normal video is not different, but this kind of video can have noise such as mole line, and in addition the shake of reprinting the video, can make retrieval system rate of accuracy greatly reduced. The wavelet transform has the characteristic of multi-resolution analysis, can focus on any details of signals to perform multi-resolution time-frequency domain analysis, uses Discrete Wavelet Transform (DWT) to convert an input image, and the converted image can better highlight noises such as moire and the like so as to form more discriminative characteristics.
And inputting the retrieved video, the environmental index, the target physical characteristic and the image after discrete wavelet transformation into different granularity feature layers of a neural network, and finally forming a model capable of classifying the video according to the video quality through mass data learning.
In the above technical solution, the calculating of the environmental index of the retrieved video includes calculating a definition, calculating a brightness anomaly, and calculating a contrast, and the three indexes are calculated as follows:
the calculation of the sharpness includes:
Figure BDA0002775338820000031
wherein DR is definition, the larger DR is, the more clear the image is, x and y are horizontal and vertical coordinates, p(x,y)The pixel values at the coordinate point (x, y) are w, h are the width and height of the image block, respectively.
The luminance abnormality calculation includes:
Figure BDA0002775338820000032
Figure BDA0002775338820000033
in the formula, CAST represents a deviation value, less than 1 represents normal, and more than 1 represents abnormal brightness; when CAST is abnormal, DA is larger than 0 to indicate that the light is too bright, and DA is smaller than 0 to indicate that the light is too dark; p is a radical of(x,y)The pixel values at the coordinate point (x, y) are w and h are the width and height of the image block respectively, Mean is a brightness average value point, and Hist is a gray histogram of the image block.
The calculation of the contrast includes:
Figure BDA0002775338820000034
wherein, Contrast is definition, the larger Contrast represents the better Contrast of the image block, x and y are respectively horizontal and vertical coordinates, p(x,y)The pixel values at the coordinate point (x, y) are w, h are the width and height of the image block, respectively.
In the above technical solution, the performing discrete wavelet transform on the search video includes:
Figure BDA0002775338820000035
wherein x (t) is the result of the change, c0[k]、d0[k]The value of the coefficient of expansion is equal to,
Figure BDA0002775338820000036
as a function of scale, #0,k(t) etc. are wavelet functions.
The method adopts a multi-input deep convolutional neural network, takes a video sequence original image, an environmental quality index, a physical characteristic index and a DWT change image as 4 items of input, inputs or fuses the 4 items of input into different granularity characteristic layers of the neural network, and finally forms a model capable of classifying videos according to the video quality through mass data learning.
Due to the sudden increase of the amount of the monitoring videos, the video scenes and the video types are more and more complex, the quality of the videos to be retrieved is classified and analyzed, the quality of the videos to be retrieved is improved, and the improvement of the video retrieval quality is greatly facilitated. The invention innovatively provides a multi-input convolutional neural network combined with the traditional digital image processing technology and a deep learning method, carries out all-round analysis and classification on a video to be retrieved, and solves the problem that a single network or method cannot accurately analyze the quality of a complex video. The invention classifies the searched videos, the videos of different types have different quality attributes, and after the videos of different quality attributes are evaluated according to the actual searching requirement, the videos of the types which accord with the conditions are selected as the input of the searching comparison system, thereby improving the searching efficiency.
Drawings
Fig. 1 is a flowchart of a retrieval video quality evaluation method according to the present invention.
FIG. 2 is a schematic diagram of the structure of the multi-input deep convolutional neural network of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments.
S1, calculating the environmental index of the search video
In order to be able to be integrated into a deep convolutional neural network, when calculating the index of the video environment, the original image is divided into 16 × 16 image blocks, each image block generates an index value, and the index is input into the model in the form of a feature map (1/16 size of the original image).
And calculating the definition by using the X, Y direction adjacent pixel difference value in the video image. The formula is as follows:
Figure BDA0002775338820000041
wherein DR is definition, the larger DR is, the more clear the image is, x and y are horizontal and vertical coordinates, p(x,y)Is the pixel at coordinate point (x, y)The values w, h are the width and height of the image block, respectively.
And (4) calculating brightness abnormity, namely calculating whether the brightness is abnormal or not by using a gray histogram of the video image. The formula is as follows:
Figure BDA0002775338820000042
Figure BDA0002775338820000043
in the formula, CAST represents a deviation value, less than 1 represents normal, and more than 1 represents abnormal brightness; when CAST is abnormal, DA greater than 0 means too bright, and DA less than 0 means too dark. p is a radical of(x,y)The pixel values at the coordinate point (x, y) are w, h are the width and height of the image block respectively, Mean is the brightness average value point, and Hist is the gray histogram of the image block.
And calculating the contrast by using the variance of adjacent pixels in the video image. The formula is as follows:
Figure BDA0002775338820000051
wherein, Contrast is definition, the larger Contrast represents the better Contrast of the image block, x and y are respectively horizontal and vertical coordinates, p(x,y)The pixel value at coordinate point (x, y) is w, h are width and height of the image block, respectively.
S2, calculating the target physical characteristics of the search video
The universal target detection and tracking algorithm uses an open source algorithm, detects and tracks the universal target in the video sequence, and obtains the target size and the target motion speed information.
S3, discrete wavelet transform is carried out on the retrieval video
Converting the original image by using a wavelet transformation algorithm, wherein the formula is as follows:
Figure BDA0002775338820000052
wherein x (t) is the result of the change, c0[k]、d0[k]、d1[k]The value of the coefficient of expansion is equal to,
Figure BDA0002775338820000053
as a function of scale, #0,k(t) etc. are wavelet functions.
S4, multiple input convolution neural network for video quality classification
The core of the invention is as follows: the method adopts a multi-input deep convolutional neural network, takes a video sequence original image, an environmental quality index, a physical characteristic index and a discrete wavelet transform change image as four items of input, inputs or fuses the four items of input into different granularity characteristic layers of the neural network, and finally forms a model capable of classifying videos according to the video quality through mass data learning.
In the invention, the video quality classification result is used as the video quality evaluation result, and in the specific implementation, the output of the multi-input convolutional neural network (namely the video quality analysis result) is classified into five categories, specifically as follows:
1. videos such as video cannot be analyzed, video messy codes and black screens are taken as the type 1, and the videos cannot be searched and analyzed.
2. The reproduction video is taken as a type 2 video, the mole noise of the type of video is large, part of the video has jitter, and retrieval analysis can be carried out after optimization.
3. The condition of video illumination, contrast and the like is very poor, the video of the target in the video is hardly seen as the 3 rd class, and the retrieval and analysis of the class of video can hardly be carried out.
4. Videos with moderate conditions such as illumination, contrast and the like are taken as the 4 th class, the videos are fuzzy, and retrieval analysis can be performed after optimization.
5. And taking videos with good video environment conditions as a 5 th class, wherein the videos are clear and can be directly retrieved and analyzed.
The classification of the five categories corresponds to different video qualities, a model capable of classifying videos according to the video qualities is finally formed, and the classification of the videos is realized after the model is evaluated according to the five categories, namely, the evaluation and the classification of the video qualities are realized.

Claims (6)

1. A retrieval video quality evaluation method is characterized by comprising the following steps:
calculating an environment index of the retrieval video;
calculating target physical characteristics of the retrieval video;
performing discrete wavelet transform on the retrieval video;
and inputting the retrieved video, the environmental index, the target physical characteristic and the image after discrete wavelet transformation into different granularity feature layers of a neural network, and finally forming a model capable of classifying the video according to the video quality through mass data learning.
2. The retrieval video quality evaluation method according to claim 1, wherein: and calculating the environment indexes of the retrieval video, wherein the environment indexes comprise definition calculation, brightness abnormity calculation and contrast calculation.
3. The retrieved video quality evaluation method according to claim 2, wherein the calculating of the sharpness comprises:
Figure FDA0002775338810000011
wherein DR is definition, x and y are respectively abscissa and ordinate, and p(x,y)The pixel value at coordinate point (x, y) is w, h are width and height of the image block, respectively.
4. The retrieved video quality evaluation method according to claim 2, wherein the luminance anomaly calculation includes:
Figure FDA0002775338810000012
Figure FDA0002775338810000013
in the formula, CAST represents a deviation value, less than 1 represents normal, and more than 1 represents abnormal brightness; when CAST is abnormal, DA is larger than 0 to indicate that the light is too bright, and DA is smaller than 0 to indicate that the light is too dark; p is a radical of(x,y)The pixel values at the coordinate point (x, y) are w and h are the width and height of the image block respectively, Mean is a brightness average value point, and Hist is a gray histogram of the image block.
5. The retrieved video quality evaluation method according to claim 2, wherein the calculating of the contrast comprises:
Figure FDA0002775338810000014
wherein Contrast is definition, x and y are respectively abscissa and ordinate, and p(x,y)The pixel value at coordinate point (x, y) is w, h are width and height of the image block, respectively.
6. The retrieved video quality evaluation method according to claim 1 or 2, wherein the performing discrete wavelet transform on the retrieved video comprises:
Figure FDA0002775338810000021
wherein x (t) is the result of the change, c0[k]、d0[k]In order to be the coefficient of expansion,
Figure FDA0002775338810000022
as a function of scale, #0,k(t) etc. are wavelet functions.
CN202011263356.6A 2020-11-12 2020-11-12 Retrieval video quality evaluation method Pending CN112330650A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011263356.6A CN112330650A (en) 2020-11-12 2020-11-12 Retrieval video quality evaluation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011263356.6A CN112330650A (en) 2020-11-12 2020-11-12 Retrieval video quality evaluation method

Publications (1)

Publication Number Publication Date
CN112330650A true CN112330650A (en) 2021-02-05

Family

ID=74319069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011263356.6A Pending CN112330650A (en) 2020-11-12 2020-11-12 Retrieval video quality evaluation method

Country Status (1)

Country Link
CN (1) CN112330650A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102421008A (en) * 2011-12-07 2012-04-18 浙江捷尚视觉科技有限公司 Intelligent video quality detecting system
CN107636690A (en) * 2015-06-05 2018-01-26 索尼公司 Full reference picture quality evaluation based on convolutional neural networks
US20180032846A1 (en) * 2016-08-01 2018-02-01 Nvidia Corporation Fusing multilayer and multimodal deep neural networks for video classification
CN107679462A (en) * 2017-09-13 2018-02-09 哈尔滨工业大学深圳研究生院 A kind of depth multiple features fusion sorting technique based on small echo
CN108055501A (en) * 2017-11-22 2018-05-18 天津市亚安科技有限公司 A kind of target detection and the video monitoring system and method for tracking
CN109191437A (en) * 2018-08-16 2019-01-11 南京理工大学 Clarity evaluation method based on wavelet transformation
US20200226740A1 (en) * 2019-03-27 2020-07-16 Sharif University Of Technology Quality assessment of a video

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102421008A (en) * 2011-12-07 2012-04-18 浙江捷尚视觉科技有限公司 Intelligent video quality detecting system
CN107636690A (en) * 2015-06-05 2018-01-26 索尼公司 Full reference picture quality evaluation based on convolutional neural networks
US20180032846A1 (en) * 2016-08-01 2018-02-01 Nvidia Corporation Fusing multilayer and multimodal deep neural networks for video classification
CN107679462A (en) * 2017-09-13 2018-02-09 哈尔滨工业大学深圳研究生院 A kind of depth multiple features fusion sorting technique based on small echo
CN108055501A (en) * 2017-11-22 2018-05-18 天津市亚安科技有限公司 A kind of target detection and the video monitoring system and method for tracking
CN109191437A (en) * 2018-08-16 2019-01-11 南京理工大学 Clarity evaluation method based on wavelet transformation
US20200226740A1 (en) * 2019-03-27 2020-07-16 Sharif University Of Technology Quality assessment of a video

Similar Documents

Publication Publication Date Title
WO2022099598A1 (en) Video dynamic target detection method based on relative statistical features of image pixels
CN108764085B (en) Crowd counting method based on generation of confrontation network
US20110087677A1 (en) Apparatus for displaying result of analogous image retrieval and method for displaying result of analogous image retrieval
CN108564052A (en) Multi-cam dynamic human face recognition system based on MTCNN and method
CN113011329A (en) Pyramid network based on multi-scale features and dense crowd counting method
CN111723693A (en) Crowd counting method based on small sample learning
CN113592911B (en) Apparent enhanced depth target tracking method
CN112766218B (en) Cross-domain pedestrian re-recognition method and device based on asymmetric combined teaching network
CN111738054A (en) Behavior anomaly detection method based on space-time self-encoder network and space-time CNN
CN116402850A (en) Multi-target tracking method for intelligent driving
CN108257148B (en) Target suggestion window generation method of specific object and application of target suggestion window generation method in target tracking
Khan et al. Foreground detection using motion histogram threshold algorithm in high-resolution large datasets
CN112330650A (en) Retrieval video quality evaluation method
CN110830734B (en) Abrupt change and gradual change lens switching identification method and system
Panchal et al. Multiple forgery detection in digital video based on inconsistency in video quality assessment attributes
Xiang et al. Quality-distinguishing and patch-comparing no-reference image quality assessment
Jöchl et al. Deep Learning Image Age Approximation-What is More Relevant: Image Content or Age Information?
Yin et al. Flue gas layer feature segmentation based on multi-channel pixel adaptive
Hou et al. Improved Multi-sampling Kernelized Correlation Filter Target Tracking Algorithm
CN116912783B (en) State monitoring method and system of nucleic acid detection platform
Chauhan et al. Smart surveillance based on video summarization: a comprehensive review, issues, and challenges
Wang et al. A Survey of Crowd Counting Algorithm Based on Domain Adaptation
CN116309350B (en) Face detection method and system
Zhu et al. Recaptured image detection through multi-resolution residual-based correlation coefficients
CN117315723B (en) Digital management method and system for mold workshop based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination