CN105959685B - Compression bit rate prediction method based on video content and cluster analysis - Google Patents

Compression bit rate prediction method based on video content and cluster analysis

Info

Publication number
CN105959685B
Authority
CN
China
Prior art keywords
video
information
bit rate
cluster analysis
compression bit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610378960.0A
Other languages
Chinese (zh)
Other versions
CN105959685A (en)
Inventor
宋利
朱雨桐
解蓉
张文军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN201610378960.0A
Publication of CN105959685A
Application granted
Publication of CN105959685B
Legal status: Active
Anticipated expiration


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N17/00: Diagnosis, testing or measuring for television systems or their details

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention discloses a compression bit rate prediction method based on video content and cluster analysis. The method first applies Sobel filtering to each frame of a video to obtain spatial complexity information, then takes the difference between the luminance information of adjacent frames to obtain temporal complexity information. The spatial and temporal information are then clustered with the k-means method. Within each cluster, a coefficient regression is performed to obtain a prediction model, which is used to predict the compression bit rate. Performing k-means cluster analysis first and regression within each cluster afterwards clearly improves the prediction accuracy of the model; this "cluster first, then regress" approach achieves better results.

Description

Compression bit rate prediction method based on video content and cluster analysis
Technical field
The present invention relates to a method in the field of video quality assessment, and specifically to a compression bit rate prediction method that performs cluster analysis on video source sequences based on their spatial and temporal information and then applies a no-reference video quality evaluation model within each cluster of sequences with similar characteristics.
Background art
The rapid development of multimedia offers viewers a wide choice of terminals for watching video, including large-screen televisions, small smartphones, and tablet computers whose size falls in between. As viewers' demands on the quantity and quality of video gradually rise, so do the demands placed on device storage capacity and on compression bit rate. How to find the lowest possible compression bit rate that still achieves a given video quality is therefore the research focus of this patent, which proposes a compression bit rate prediction method based on video content and cluster analysis.
Video quality assessment methods fall broadly into two categories: subjective and objective. Compared with subjective methods, objective quality assessment is more flexible, faster, and easier to put into practice. Objective assessment is further divided into full-reference, reduced-reference, and no-reference methods. A no-reference method analyzes the video directly and then assesses its quality. One major family of no-reference methods is based on parameters derived from the video's own information. Because such a method does not need to compress the source sequence, its complexity is low and it is easy to implement, so it can be used in real-time systems and has significant practical value.
Existing research shows that subjective video quality is mainly affected by the following factors: coding scheme, video content, compression bit rate, video frame rate, and video resolution. The no-reference video quality assessment methods proposed so far that are based on video parameter models also rely largely on one or more of these five factors. For example, in "Optimized spatial and temporal resolution based on subjective quality estimation without encoding", published in the 2014 IEEE International Conference on Visual Communications and Image Processing, pp. 33-36, Motohiro Takagi et al. predict video quality from compression bit rate and video frame rate.
However, most existing no-reference video quality assessment methods extract video motion information or coding information and then predict video quality directly; few analyze the video content by category. The few existing methods that classify videos before prediction mostly do so by visual inspection of the content, for example into "news" or "cartoon" categories, and their accuracy is still barely satisfactory.
The present invention therefore proposes a method that predicts the compression bit rate from the video content's own information using cluster analysis, in order to improve the accuracy and practicality of model prediction.
Summary of the invention
Building on existing no-reference objective video quality assessment methods, the present invention provides a compression bit rate prediction method based on video content and cluster analysis, which classifies videos by their own information and thereby improves prediction accuracy.
To achieve the above object, the technical solution adopted by the present invention is as follows:
S1: Apply Sobel filtering to each frame of the video to obtain the spatial information SI; take the difference between the luminance information of adjacent frames to obtain the temporal information TI;
S2: Perform cluster analysis on the spatial information SI and temporal information TI obtained in S1 using the k-means method to obtain multiple clusters;
S3: Within each cluster from S2, perform coefficient regression to obtain a compression bit rate prediction model, and use the model to predict the compression bit rate. Performing the regression separately within each cluster improves prediction accuracy.
Preferably, in S1, the n-th frame of the original video sequence is processed with the following two formulas to obtain the spatial information SI (Spatial Information) and the temporal information TI (Temporal Information):
$SI = \max_{time}\{\mathrm{std}_{space}[\mathrm{Sobel}(F_n)]\}$
$TI = \max_{time}\{\mathrm{std}_{space}[F_n(i,j) - F_{n-1}(i,j)]\}$
where $F_n$ is the luminance information of the current frame, Sobel denotes the Sobel operator of classical image processing, $\mathrm{std}_{space}$ denotes the standard deviation taken over the pixels of a frame (of the Sobel-filtered frame for SI, of the frame difference for TI), and $\max_{time}$ denotes the maximum of these per-frame standard deviations over all frames.
Preferably, in S2, the SI and TI results from S1 are fed into the k-means algorithm for cluster analysis, with the squared Euclidean distance as the distance measure for clustering. The silhouette value from the k-means cluster analysis is used as the index for analyzing the clustering results, and the final number of clusters is determined by analyzing it. Finally, videos with similar SI and TI are grouped into one cluster.
Preferably, in S3, after S2 completes the cluster analysis, within each resulting cluster the spatial information SI and temporal information TI calculated in S1 are substituted into the following compression bit rate prediction model. For each video sequence, different subjective video quality MOS scores are substituted in to obtain the predicted compression bit rate, which realizes prediction of the bit rate required for compressing the video under a specific quality requirement:
$v_c = TI \cdot SI$ (2)
$\alpha(v_c) = c_1 + c_2 \cdot \log(v_c)$ (3)
$\gamma(v_c) = c_4 + c_5 \cdot \log(v_c)$ (5)
where $c_1$ to $c_6$ are model parameters and α, β, γ are intermediate parameters. MOS (Mean Opinion Score) is the subjective video test score, which takes different values under different test methods. The present invention uses the DSI Variant II method of ITU-R BT.500 with a 5-point scale: 1 means very poor quality, 2 poor, 3 fair, 4 good, and 5 very good. TI and SI denote the temporal information and spatial information respectively. $v_c$ denotes the video content and is determined by TI and SI. $BR_p$ denotes the predicted compression bit rate.
Further, the model parameters $c_1$, $c_2$, $c_3$, $c_4$, $c_5$, $c_6$ are determined as follows: provided that the encoder type, video resolution, and frame rate in the actual application match those of the subjective video quality test material, a least-squares regression of the proposed mathematical model is computed against the subjective quality assessment results, yielding the model parameters for the specific application.
The present invention takes into account the influence of video content on video quality. It uses the spatial and temporal information as video content features and performs cluster analysis on them, grouping videos with similar features into one cluster. After the parameter-based model is inverted, the compression bit rate can be predicted within each cluster from the video content and the desired video quality. The method can typically be used before encoding to determine roughly the compression bit rate needed to reach the required video quality.
Compared with the prior art, the present invention has the following beneficial effects:
The proposed method performs k-means cluster analysis first and then regression prediction within each cluster, which clearly improves the prediction accuracy of the model. This "cluster first, then regress" approach achieves better results.
Brief description of the drawings
Other features, objects, and advantages of the invention, as well as the overall method, will become more apparent from reading the following with reference to the drawings:
Fig. 1 is a flow block diagram of the compression bit rate prediction method based on video content and cluster analysis.
Fig. 2 shows the spatial information and temporal information of the video source sequences used for regressing the model parameters in one embodiment of the invention.
Fig. 3 shows the prediction results obtained with the method of the invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the invention, but do not limit it in any way. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, all of which fall within the scope of protection of the invention.
A specific embodiment is described below in the context of no-reference objective video quality assessment: the invention's approach of clustering on TI and SI and then performing regression prediction within each cluster is applied to quality assessment, with the flow shown in Fig. 1. Here the invention is applied to 4K ultra-high-definition video sequences compressed with HEVC at a frame rate of 30 fps. Note that the reported results (such as the Pearson correlation coefficient, PCC) apply only to HEVC-encoded 4K video at 30 fps; different results may be obtained in other scenarios. The overall method is nevertheless general, so this does not affect the essence of the invention.
The extraction of the temporal and spatial complexity of the video is introduced first. The k-means clustering method and the method for analyzing the number of clusters are then described in detail, and finally the no-reference video quality evaluation model that is built is presented.
1) Calculate the spatial and temporal information of the video.
$SI = \max_{time}\{\mathrm{std}_{space}[\mathrm{Sobel}(F_n)]\}$
$TI = \max_{time}\{\mathrm{std}_{space}[F_n(i,j) - F_{n-1}(i,j)]\}$
where $F_n$ is the luminance information of the current frame, Sobel denotes the Sobel operator of classical image processing, $\mathrm{std}_{space}$ denotes the standard deviation taken over the pixels of a frame (of the Sobel-filtered frame for SI, of the frame difference for TI), and $\max_{time}$ denotes the maximum of these per-frame standard deviations over all frames.
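As an illustration of step 1), the following minimal Python sketch computes SI and TI for a sequence of luminance frames given as nested lists. It is not taken from the patent: the function names, the use of the Sobel gradient magnitude, the skipping of border pixels, and the population standard deviation are all assumptions made for the example.

```python
import math

# 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def std(values):
    """Population standard deviation of a flat list of numbers (assumed variant)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def sobel_magnitude(frame):
    """Gradient magnitude at every interior pixel of a 2-D luminance frame."""
    h, w = len(frame), len(frame[0])
    out = []
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            gx = sum(SOBEL_X[a][b] * frame[i + a - 1][j + b - 1]
                     for a in range(3) for b in range(3))
            gy = sum(SOBEL_Y[a][b] * frame[i + a - 1][j + b - 1]
                     for a in range(3) for b in range(3))
            out.append(math.hypot(gx, gy))
    return out


def spatial_information(frames):
    """SI: maximum over frames of the std over pixels of the Sobel-filtered frame."""
    return max(std(sobel_magnitude(f)) for f in frames)


def temporal_information(frames):
    """TI: maximum over adjacent frame pairs of the std of F_n - F_{n-1}."""
    return max(
        std([c - p for row_p, row_c in zip(prev, cur)
             for p, c in zip(row_p, row_c)])
        for prev, cur in zip(frames, frames[1:])
    )
```

Each frame needs at least 3x3 pixels for the Sobel step, and TI needs at least two frames. Both functions return one scalar per sequence, so every video contributes a single (SI, TI) point to the cluster analysis that follows.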
2) Perform k-means cluster analysis on the SI and TI of the videos.
The present invention uses the k-means method for cluster analysis. Because k-means is an unsupervised learning method, only the number of clusters needs to be determined. The silhouette value is therefore chosen as the index for evaluating clustering results under different numbers of clusters. It ranges over [-1, 1]; in general, the larger the value, the farther the video sequence is from the other clusters and the better it fits within its own cluster.
When analyzing the silhouette results, the present invention uses the following four statistics: the minimum Silh_min, the maximum Silh_max, the mean Silh_mean, and the standard deviation Silh_dev. The analysis below uses Table 1 as an example, where K_ca denotes the number of clusters.
Table 1. Silhouette values of the cluster analysis for different numbers of clusters
| Statistic | K_ca = 2 | K_ca = 3 | K_ca = 4 | K_ca = 5 |
| --- | --- | --- | --- | --- |
| Silh_min | 0.3905 | 0.1383 | 0.5069 | 0.5069 |
| Silh_max | 0.9381 | 0.9793 | 0.9677 | 1 |
| Silh_mean | 0.839 | 0.7643 | 0.7410 | 0.7717 |
| Silh_dev | 0.1726 | 0.2305 | 0.1620 | 0.1911 |
When K_ca = 2, the mean is the highest and the standard deviation is the second smallest, but subsequent regression prediction within each cluster shows low accuracy and poor results. The root cause is that with only 2 clusters the number of clusters is too small, and the result differs very little from not clustering at all. In other words, 2 clusters look acceptable in terms of the silhouette data but have no practical value.
When K_ca = 3, the minimum is as low as 0.1383, which means the clustering quality is very poor and the separation between clusters is unclear. More clusters are therefore needed.
When K_ca = 5, the maximum is 1, which on the face of it indicates an extremely good clustering. In fact, however, the cluster concerned contains only one video sequence, which means there are too many clusters and the number should be reduced.
In summary, K_ca = 4 gives the best clustering.
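The four statistics in Table 1 can be computed along the following lines. This is a minimal sketch rather than the patent's implementation: the function names are hypothetical, the squared Euclidean distance is assumed for the silhouette as well as for the clustering, a single-member cluster is assigned a silhouette of 1 (which matches the K_ca = 5 observation above, although other conventions exist), and the population standard deviation is assumed.

```python
import math


def sq_dist(p, q):
    """Squared Euclidean distance between two feature points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def silhouette_values(points, labels):
    """Silhouette value s = (b - a) / max(a, b) for every point.

    a: mean distance to the other members of the point's own cluster
       (taken as 0 for a single-member cluster, an assumed convention);
    b: smallest mean distance to the members of any other cluster.
    Needs at least two clusters.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        a = sum(sq_dist(points[i], points[j]) for j in own) / len(own) if own else 0.0
        b = min(
            sum(sq_dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        values.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return values


def silhouette_summary(values):
    """Minimum, maximum, mean and standard deviation of the silhouette values."""
    mean = sum(values) / len(values)
    dev = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {"min": min(values), "max": max(values), "mean": mean, "dev": dev}
```

Running the summary once per candidate K_ca gives one column of Table 1. A maximum of exactly 1 is then a hint that some cluster holds a single sequence.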
Once the number of clusters is determined, the cluster analysis is carried out with the k-means algorithm. Finally, videos with similar spatial information SI and temporal information TI are grouped into one cluster.
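For illustration, a minimal k-means (Lloyd's algorithm) over (SI, TI) points with the squared Euclidean distance might look as follows. The function names and the naive initialization from the first k points are assumptions of this sketch; the patent does not state how the cluster centers are initialized.

```python
def sq_dist(p, q):
    """Squared Euclidean distance, the clustering distance named in the text."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Cluster feature points into k groups; returns (labels, centers).

    Initialization simply takes the first k points as centers (an assumption
    made to keep the example deterministic).
    """
    centers = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers
```

With the hypothetical feature points `[(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.2)]` and `k=2`, the first and third points share one label and the second and fourth share the other.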
3) Based on the cluster analysis results, perform regression on the videos within each cluster to improve prediction accuracy.
After the cluster analysis, the model parameters $c_1$ to $c_6$ are obtained within each cluster by least-squares regression, and the no-reference video quality evaluation model is then used to predict the compression bit rate.
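The "cluster first, then regress" idea can be sketched as below. Because the full prediction model is not reproduced in this text, the sketch fits a stand-in two-parameter line y = a + b·x in each cluster by ordinary least squares instead of the six-parameter model; the function names and the meaning of x and y are hypothetical placeholders.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def fit_per_cluster(xs, ys, labels):
    """Fit one independent line per cluster label; returns {label: (a, b)}."""
    groups = {}
    for x, y, lab in zip(xs, ys, labels):
        gx, gy = groups.setdefault(lab, ([], []))
        gx.append(x)
        gy.append(y)
    return {lab: fit_line(gx, gy) for lab, (gx, gy) in groups.items()}


def predict(models, label, x):
    """Predict y for a sample using the model of the cluster it belongs to."""
    a, b = models[label]
    return a + b * x
```

The point of the design is that each cluster gets its own coefficients, so sequences with very different content no longer pull a single global fit in opposite directions.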
Take as an example the 4K high-definition video database published by the Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University (http://medialab.sjtu.edu.cn/resources/resources.html). The database is based on 10 reference videos, each compressed at 6 bit rate points, with the corresponding subjective DMOS values provided. The Spearman rank-order correlation coefficient (SROCC) and the Pearson linear correlation coefficient (LCC) are used as the measures of prediction accuracy.
Table 2 lists the prediction results for each cluster after cluster analysis, together with the prediction results without cluster analysis. With cluster analysis performed first, the PCC improves by up to 28.76% and the RMSE falls by up to 68.98%, so the present invention does achieve better results.
Table 2. Prediction results
| Category | PCC | SCC | RMSE | MOS |
| --- | --- | --- | --- | --- |
| Category A | 0.972 | 0.986 | 0.102 | 3.945 |
| Category B | 0.953 | 0.951 | 0.087 | 3.818 |
| Category C | 0.901 | 0.865 | 0.274 | 4.124 |
| Category D | 0.961 | 0.969 | 0.177 | 4.041 |
| All sequences, no clustering | 0.672 | 0.753 | 1.174 | 4.002 |
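The accuracy measures reported in Table 2 can be computed as in the following sketch, which assumes the standard textbook definitions of the Pearson coefficient, the Spearman rank coefficient (with average ranks for ties), and the root-mean-square error. The function names are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation coefficient between two sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: the Pearson coefficient of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def rmse(predicted, actual):
    """Root-mean-square error between predictions and ground truth."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)
    )
```

Evaluating these three functions once per cluster, and once over all sequences without clustering, would yield rows of the same shape as Table 2.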
The above is only a preferred embodiment of the present invention. The scope of protection is not limited to the above embodiment; all technical solutions under the inventive concept fall within it. It should be noted that improvements and modifications made by those skilled in the art without departing from the principles of the invention are also to be regarded as within the scope of protection.

Claims (4)

1. A compression bit rate prediction method based on video content and cluster analysis, characterized by comprising the following steps:
S1: applying Sobel filtering to each frame of the video to obtain the spatial information SI, and taking the difference between the luminance information of adjacent frames to obtain the temporal information TI;
S2: performing cluster analysis on the spatial information SI and temporal information TI obtained in S1 using the k-means method to obtain multiple clusters;
S3: within each cluster from S2, performing coefficient regression to obtain a compression bit rate prediction model, and using the model to predict the compression bit rate;
wherein in S3, after S2 completes the cluster analysis, within each resulting cluster the spatial information SI and temporal information TI calculated in S1 are substituted into the following compression bit rate prediction model; for each video sequence, different subjective video quality MOS scores are substituted in to obtain the predicted compression bit rate, thereby realizing prediction of the bit rate required for compressing the video under a specific quality requirement:
$v_c = TI \cdot SI$ (2)
$\alpha(v_c) = c_1 + c_2 \cdot \log(v_c)$ (3)
$\gamma(v_c) = c_4 + c_5 \cdot \log(v_c)$ (5)
where $c_1$ to $c_6$ are model parameters; α, β, γ are intermediate parameters; MOS denotes the subjective video test score, obtained with the DSI Variant II method of ITU-R BT.500 on a 5-point scale, in which 1 means very poor quality, 2 poor, 3 fair, 4 good, and 5 very good; TI and SI denote the temporal information and spatial information respectively; $v_c$ denotes the video content and is determined by SI and TI; and $BR_p$ denotes the predicted compression bit rate.
2. The compression bit rate prediction method based on video content and cluster analysis according to claim 1, characterized in that in S1, the n-th frame of the original video sequence is processed with the following two formulas to obtain the spatial information SI and the temporal information TI:
$SI = \max_{time}\{\mathrm{std}_{space}[\mathrm{Sobel}(F_n)]\}$
$TI = \max_{time}\{\mathrm{std}_{space}[F_n(i,j) - F_{n-1}(i,j)]\}$
where $F_n$ is the luminance information of the current frame, Sobel denotes the Sobel operator of classical image processing, $\mathrm{std}_{space}$ denotes the standard deviation taken over the pixels of a frame (of the Sobel-filtered frame for SI, of the frame difference for TI), and $\max_{time}$ denotes the maximum of these per-frame standard deviations over all frames.
3. The compression bit rate prediction method based on video content and cluster analysis according to claim 1, characterized in that in S2, the SI and TI results from S1 are fed into the k-means algorithm for cluster analysis, with the squared Euclidean distance as the distance measure for clustering; the silhouette value from the k-means cluster analysis is used as the index for analyzing the clustering results, and the final number of clusters is determined by analyzing the silhouette value; finally, videos with similar spatial information SI and temporal information TI are grouped into one cluster.
4. The compression bit rate prediction method based on video content and cluster analysis according to any one of claims 1 to 3, characterized in that the model parameters $c_1$, $c_2$, $c_3$, $c_4$, $c_5$, $c_6$ are determined as follows: provided that the encoder type, video resolution, and frame rate in the actual application match those of the subjective video quality test material, a least-squares regression of the proposed mathematical model is computed against the subjective quality assessment results, yielding the model parameters for the specific application.
CN201610378960.0A 2016-05-31 2016-05-31 Compression bit rate prediction method based on video content and cluster analysis Active CN105959685B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610378960.0A CN105959685B (en) 2016-05-31 2016-05-31 A kind of compression bit rate Forecasting Methodology based on video content and cluster analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610378960.0A CN105959685B (en) 2016-05-31 2016-05-31 A kind of compression bit rate Forecasting Methodology based on video content and cluster analysis

Publications (2)

Publication Number Publication Date
CN105959685A CN105959685A (en) 2016-09-21
CN105959685B true CN105959685B (en) 2018-01-19

Family

ID=56907484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610378960.0A Active CN105959685B (en) 2016-05-31 2016-05-31 Compression bit rate prediction method based on video content and cluster analysis

Country Status (1)

Country Link
CN (1) CN105959685B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111447446B (en) * 2020-05-15 2022-08-23 西北民族大学 HEVC (high efficiency video coding) rate control method based on human eye visual region importance analysis
CN112861852A (en) * 2021-01-19 2021-05-28 北京金山云网络技术有限公司 Sample data screening method and device, electronic equipment and storage medium
CN113038142B (en) * 2021-03-25 2022-11-01 北京金山云网络技术有限公司 Video data screening method and device and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101715097A (en) * 2008-09-29 2010-05-26 索尼株式会社 Image processing apparatus and coefficient learning apparatus
CN101742355A (en) * 2009-12-24 2010-06-16 厦门大学 Method for partial reference evaluation of wireless videos based on space-time domain feature extraction
CN102118803A (en) * 2011-04-14 2011-07-06 北京邮电大学 Video cross-layer scheduling method of mobile communication system on basis of QoE prediction
CN103780901A (en) * 2014-01-22 2014-05-07 上海交通大学 Video quality and compressed code rate evaluating method based on space information and time information of video

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Optimized spatial and temporal resolution based on subjective quality estimation without encoding;Motohiro Takagi, Hiroshi Fujii, Atsushi Shimizu;《Visual Communications and Image Processing Conference, 2014 IEEE》;20150302;第33-36页 *
基于神经网络的IPTV视频质量评估模型;李蕊;《中国优秀硕士学位论文全文数据库-信息科技辑》;20130315;I136-952 *

Also Published As

Publication number Publication date
CN105959685A (en) 2016-09-21

Legal Events

Code Title
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant