CN104185023B

CN104185023B - Automatic detecting method and device for three-dimensional video format

Info

Publication number: CN104185023B
Application number: CN201410469369.7A
Authority: CN
Inventors: 王洪剑; 林江; 查毓水
Original assignee: SHANGHAI TONGTU SEMICONDUCTOR TECHNOLOGY Co Ltd
Current assignee: SHANGHAI TONGTU SEMICONDUCTOR TECHNOLOGY Co Ltd
Priority date: 2014-09-16
Filing date: 2014-09-16
Publication date: 2017-02-15
Anticipated expiration: 2034-09-16
Also published as: CN104185023A

Abstract

The invention discloses an automatic detection method and device for three-dimensional (3D) video formats. The method comprises the following steps: a frame of the input video is divided into image blocks, and dimension reduction is applied to the blocks to reduce the feature vector dimension; multiple features are extracted from the dimension-reduced image blocks, namely the block gradient magnitude feature, the block histogram feature, the frame histogram feature, the projection feature and the middle-line boundary feature; the extracted features are fused using a spatial-domain multi-feature fusion method for 3D format features; based on the fused features, fuzzy-feature format discrimination in the spatial domain and the time domain is performed to determine the image format of the current video frame; and the video format of the current frame is output, and the playback device is controlled to play the video according to the automatically detected format. The method and device can greatly reduce the computational complexity of video format detection and greatly improve its accuracy.

Description

Automatic detection method and device for 3D video format

Technical field

The present invention relates to the field of video display, and more particularly to an automatic 3D video format detection method and device based on multi-feature-fusion format discrimination in the spatial domain and format discrimination in the time domain.

Background technology

With the development of video and imaging technology, more and more two-dimensional (2D) and three-dimensional (3D) videos are uploaded and shared on video websites. As more 3D films are shown in cinemas and 3D digital television programs begin to be broadcast, the number of 3D videos is growing massively. The 3D video formats that can currently be transmitted over existing broadband include the left-right format (the left half of the image is the left-eye image and the right half is the right-eye image) and the top-bottom format (the top half of the image is the left-eye image and the bottom half is the right-eye image). With the number of 3D videos growing so rapidly, bringing 3D video into the home has become an urgent matter, but many challenges remain. Because a large amount of existing video is still 2D, one of the most pressing technical problems is to automatically detect the video image format of the various sources played on a television and to control the television so that each source plays correctly, so that the television can intelligently identify the format of the source being played. This requires a technical means of automatically detecting whether a video is 2D or 3D and, if it is 3D, of further detecting which specific 3D format it uses (left-right or top-bottom).

A complete 3D television system is shown in Fig. 1. Its main modules include a video signal input port, a power supply system, an audio processor, speakers, a video processor, a graphics controller and a 3D display. The module that detects the 3D video format is the 3D format detection module in the graphics controller.

Chinese patent publication No. CN101980545A provides a "method for automatically detecting the format of 3D television video programs", which discriminates the video format by first applying an image-entropy threshold test and then a threshold test on the covariance similarity feature of image block pixels. It has the following disadvantages: 1. its computational complexity is high; 2. its video format detection accuracy is low.

Summary of the invention

To overcome the above shortcomings of the prior art, the object of the present invention is to provide an automatic 3D video format detection method and device that not only greatly reduce the computational complexity of video format detection but also greatly improve its accuracy.

To achieve the above and other objects, the present invention proposes an automatic 3D video format detection method comprising the following steps:

Step 1: partition an input video frame into image blocks and perform dimension reduction to reduce the feature vector dimension;

Step 2: perform multi-feature extraction on the dimension-reduced image blocks to obtain first-class, second-class and third-class format sub-feature values, where the multiple features comprise the block gradient magnitude feature, block histogram feature, frame histogram feature, projection feature and middle-line boundary feature;

Step 3: fuse the first-class, second-class and third-class format sub-feature values using the spatial-domain multi-feature fusion method for 3D format features to obtain the first-class, second-class and third-class 3D format feature values;

Step 4: based on the fused features, perform format discrimination in the spatial domain and fuzzy-feature format discrimination in the time domain, and determine the image format of the current video frame;

Step 5: output the video format of the current frame and control the playback device to play according to the automatically detected video image format.

Further, in step 2, for an input video image f(x, y) of unknown format, the 3 classes of format sub-feature values are computed as follows:

Taking the left half of image f(x, y) as the candidate left-eye image and the right half as the candidate right-eye image, the first-class format sub-feature values are computed with the 3D format sub-feature value calculation method and denoted d^lr_hist, d^lr_{blk_hist}, d^lr_mag, d^lr_prj and d^lr_bry;

Taking the top half of image f(x, y) as the candidate left-eye image and the bottom half as the candidate right-eye image, the second-class format sub-feature values are computed with the 3D format sub-feature value calculation method and denoted d^tb_hist, d^tb_{blk_hist}, d^tb_mag, d^tb_prj and d^tb_bry;

Taking the top field of image f(x, y) as the candidate left-eye image and the bottom field as the candidate right-eye image, the third-class format sub-feature values are computed with the 3D format sub-feature value calculation method and denoted d^it_hist, d^it_{blk_hist}, d^it_mag and d^it_prj.

Further, the 3D format sub-feature value calculation method is as follows:

Compute the frame pixel histogram vector, block histogram vector, gradient magnitude vector and horizontal projection vector of the left-eye image and of the right-eye image, and compute the corresponding feature value for each pair of vectors. After normalization, the feature values of these 4 kinds of feature vectors are denoted d_hist, d_{blk_hist}, d_mag and d_prj respectively;

Detect whether a boundary line exists in the video image from the gradient magnitude and luminance information of a band-shaped region; the normalized boundary-line feature value is denoted d_bry;

Here d_hist, d_{blk_hist}, d_mag, d_prj and d_bry are called the 3D format sub-feature values.

Further, in step 3, the first-class format sub-feature values are fused using the following formula to obtain the first-class 3D format feature value:

$$
d^{lr} = w_{hist}\, d^{lr}_{hist} + w_{blk\_hist}\, d^{lr}_{blk\_hist} + w_{prj}\, d^{lr}_{prj} + w_{bry}\, d^{lr}_{bry} + w_{mag}\, d^{lr}_{mag}
$$

where w_hist, w_{blk_hist}, w_prj, w_bry and w_mag are the preset weight coefficients of the first-class format sub-feature values; each weight lies in [0,1] and the weights satisfy w_hist + w_{blk_hist} + w_prj + w_bry + w_mag = 1; d^lr ∈ [0,1] denotes the first-class 3D format feature value.

Further, in step 3, the second-class format sub-feature values are fused using the following formula to obtain the second-class 3D format feature value:

$$
d^{tb} = w_{hist}\, d^{tb}_{hist} + w_{blk\_hist}\, d^{tb}_{blk\_hist} + w_{prj}\, d^{tb}_{prj} + w_{bry}\, d^{tb}_{bry} + w_{mag}\, d^{tb}_{mag}
$$

where w_hist, w_{blk_hist}, w_prj, w_bry and w_mag are preset weight coefficients identical to those of the first-class format sub-feature values; d^tb ∈ [0,1] denotes the second-class 3D format feature value.

Further, in step 3, the third-class format sub-feature values are fused using the following formula to obtain the third-class 3D format feature value:

$$
d^{it} = w'_{hist}\, d^{it}_{hist} + w'_{blk\_hist}\, d^{it}_{blk\_hist} + w'_{prj}\, d^{it}_{prj} + w'_{mag}\, d^{it}_{mag}
$$

where w'_hist, w'_{blk_hist}, w'_prj and w'_mag are preset weight coefficients; each weight lies in [0,1] and the weights satisfy w'_hist + w'_{blk_hist} + w'_prj + w'_mag = 1; d^it ∈ [0,1] denotes the third-class 3D format feature value.

Further, in step 4, the video format discrimination in the spatial domain is performed according to the following expression:

$$
fmt =
\begin{cases}
0, & d \ge d_{th} \\
1, & d < d_{th} \text{ and } d^{lr} = d \\
2, & d < d_{th} \text{ and } d^{tb} = d \\
3, & d < d_{th} \text{ and } d^{it} = d
\end{cases}
$$

where fmt denotes the video format determined for image f(x, y); d denotes the format feature value of image f(x, y), d = min(d^tb, d^lr, d^it); d_th denotes the 3D format feature threshold; d^lr, d^tb and d^it denote the first-class, second-class and third-class 3D format feature values of image f(x, y). When fmt = 0, image f(x, y) is determined to be a 2D image; when fmt = 1, a 3D image in left-right format; when fmt = 2, a 3D image in top-bottom format; when fmt = 3, a 3D image in interlaced format.

Further, in step 4, the video format discrimination in the time domain is performed according to the following expression:

where fmt denotes the video format obtained for the current frame image f(x, y) by format discrimination in the spatial domain; fmt_n-1 denotes the video format of the previous frame, and for the initial first frame the spatial-domain format fmt is used to initialize the previous-frame format fmt_n-1; fmt_n denotes the video format obtained for the current frame after fusing the time-domain information; d_low is the lower threshold of the 3D format feature value d, and d_high is the upper threshold of the 3D format feature value d.
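The time-domain expression itself is not reproduced in this text. As an illustration only, the following Python sketch assumes the simplest rule consistent with the symbols defined above: when d lies between d_low and d_high the previous frame's format is kept, and otherwise the spatial-domain result is accepted. The function name `classify_temporal`, the use of `None` for "no previous frame", and the threshold values 0.3 and 0.7 are hypothetical and are not taken from the patent.

```python
def classify_temporal(fmt_spatial, d, fmt_prev, d_low=0.3, d_high=0.7):
    """Fuse the spatial-domain format with the previous frame's format.

    Assumed rule (not confirmed by the source text): inside the interval
    [d_low, d_high] the spatial result is considered unreliable, so the
    previous frame's format is kept; outside it the spatial result wins.
    """
    if fmt_prev is None:
        # First frame: the spatial-domain format initialises the history.
        return fmt_spatial
    if d_low <= d <= d_high:
        return fmt_prev
    return fmt_spatial


def classify_sequence(frames, d_low=0.3, d_high=0.7):
    """Run the assumed rule over (fmt_spatial, d) pairs, one per frame."""
    fmt_prev = None
    results = []
    for fmt_spatial, d in frames:
        fmt_prev = classify_temporal(fmt_spatial, d, fmt_prev, d_low, d_high)
        results.append(fmt_prev)
    return results


# Four frames: a clear left-right frame, two ambiguous frames, a clear 2D frame.
print(classify_sequence([(1, 0.1), (0, 0.55), (1, 0.45), (0, 0.9)]))  # [1, 1, 1, 0]
```

Under this assumption the two ambiguous frames inherit the left-right decision instead of flickering between 2D and 3D, which is the kind of stability the two thresholds are meant to provide.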

To achieve the above object, the present invention also provides an automatic 3D video format detection device, comprising at least:

An image blocking module, which partitions an input video frame into image blocks and performs dimension reduction to reduce the feature vector dimension;

A multi-feature extraction module, which feeds the dimension-reduced image blocks into a block gradient magnitude feature extraction submodule, a block histogram feature extraction submodule, a frame histogram feature extraction submodule, a projection feature extraction submodule and a middle-line boundary feature extraction submodule, each of which performs feature extraction, to obtain the first-class, second-class and third-class format sub-feature values;

A multi-feature fusion module, which fuses the features extracted by the submodules of the multi-feature extraction module using the spatial-domain multi-feature fusion method for 3D format features to obtain the first-class, second-class and third-class 3D format feature values;

A format judgment module, which sends the fused features to a spatial-domain format judgment module and a time-domain format judgment module to perform format discrimination in the spatial domain and fuzzy-feature format discrimination in the time domain, and determines the image format of the current video frame.

Further, the multi-feature fusion module fuses the first-class format sub-feature values extracted by the submodules using the following formula to obtain the first-class 3D format feature value:

$$
d^{lr} = w_{hist}\, d^{lr}_{hist} + w_{blk\_hist}\, d^{lr}_{blk\_hist} + w_{prj}\, d^{lr}_{prj} + w_{bry}\, d^{lr}_{bry} + w_{mag}\, d^{lr}_{mag}
$$

where w_hist, w_{blk_hist}, w_prj, w_bry and w_mag are the preset weight coefficients of the first-class format sub-feature values; each weight lies in [0,1] and the weights satisfy w_hist + w_{blk_hist} + w_prj + w_bry + w_mag = 1; d^lr ∈ [0,1] denotes the first-class 3D format feature value.

The second-class format sub-feature values extracted by the submodules are fused using the following formula to obtain the second-class 3D format feature value:

$$
d^{tb} = w_{hist}\, d^{tb}_{hist} + w_{blk\_hist}\, d^{tb}_{blk\_hist} + w_{prj}\, d^{tb}_{prj} + w_{bry}\, d^{tb}_{bry} + w_{mag}\, d^{tb}_{mag}
$$

The third-class format sub-feature values extracted by the submodules are fused using the following formula to obtain the third-class 3D format feature value:

$$
d^{it} = w'_{hist}\, d^{it}_{hist} + w'_{blk\_hist}\, d^{it}_{blk\_hist} + w'_{prj}\, d^{it}_{prj} + w'_{mag}\, d^{it}_{mag}
$$

Compared with the prior art, the automatic 3D video format detection method and device of the present invention first partition an input video frame into image blocks and perform dimension reduction to reduce the feature vector dimension and the computational complexity; then feed the blocks into 5 feature extraction submodules, each of which performs feature extraction; then send these features to the feature fusion module for fusion; then send the fused features to the format judgment module, which performs fuzzy-feature format discrimination in the spatial domain and the time domain to determine the image format of the current video frame f(x, y); and finally output the video format of the current frame and control the playback device to play the source according to the automatically detected video image format. This not only reduces the computational complexity of video format detection but also greatly improves its accuracy.

Brief description of the drawings

Fig. 1 is a system architecture diagram of a complete 3D television system;

Fig. 2 is a flow chart of the steps of the automatic 3D video format detection method of the present invention;

Fig. 3 is a calculation flow chart of the multi-feature fusion in a preferred embodiment of the present invention;

Fig. 4 is a calculation flow chart of the video format discrimination in the spatial domain in a preferred embodiment of the present invention;

Fig. 5 is a calculation flow chart of the video format discrimination in the time domain in a preferred embodiment of the present invention;

Fig. 6 is a system architecture diagram of the automatic 3D video format detection device of the present invention.

Detailed description of the embodiments

Embodiments of the present invention are described below through specific examples and with reference to the drawings. Those skilled in the art can readily understand further advantages and effects of the present invention from the content disclosed in this specification. The present invention can also be implemented or applied through other specific examples, and the details in this specification can be modified and changed in various ways, based on different viewpoints and applications, without departing from the spirit of the present invention.

A 3D video image is formed by combining a left-eye image and a right-eye image. For a 3D image in left-right format, the left-eye image is the left half of the 3D video image and the right-eye image is the right half; for a 3D image in top-bottom format, the left-eye image is the top half of the 3D video image and the right-eye image is the bottom half; for a 3D image in interlaced format, the left-eye image is the top field of the 3D video image and the right-eye image is the bottom field.

Whether a 3D video image is in left-right, top-bottom or interlaced format, a single frame always consists of a left-eye image and a right-eye image. Although the two views differ by parallax, they look very similar, and this symmetry is something 2D images do not have. The symmetric features mainly include the histogram, the gradient magnitude vector and the projection vector. Based on these characteristics of 3D video images, the present invention proposes a simple and effective automatic video format detection method, in which the 3D format feature is extracted by multi-feature fusion in the spatial domain and the format is determined by fuzzy-feature format discrimination in the spatial domain and the time domain.

Fig. 2 is a flow chart of the steps of the automatic 3D video format detection method of the present invention. As shown in Fig. 2, the method comprises the following steps:

Step 201: partition an input video frame into image blocks and perform dimension reduction to reduce the feature vector dimension and the computational complexity.

Because the resolution of 3D video images is often very high, extracting feature vectors pixel by pixel would be too computationally complex and too costly, so the feature vectors need dimension reduction. Moreover, because the left-eye and right-eye images of a 3D video image have parallax, corresponding left-eye and right-eye pixels are offset in spatial position. The present invention divides the left-eye image and the right-eye image into image blocks of B_h×B_w pixels. Extracting features per block effectively reduces the negative effect of the parallax between the two views, reduces the dimension of their feature vectors, makes the feature vectors more robust, and greatly reduces the computational complexity of the detection method.
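The block partition can be illustrated with a short Python sketch. The text does not say which statistic represents each B_h×B_w block, so this sketch assumes the block mean; the function name `block_means` and the choice to drop partial blocks at the right and bottom edges are likewise assumptions made for illustration.

```python
def block_means(image, block_h, block_w):
    """Reduce a 2-D luma image (a list of rows) to one mean value per block.

    Assumptions of this sketch: each block is summarised by its mean, and
    partial blocks at the right and bottom edges are ignored.
    """
    rows = len(image) // block_h
    cols = len(image[0]) // block_w
    reduced = []
    for by in range(rows):
        out_row = []
        for bx in range(cols):
            total = 0
            for y in range(by * block_h, (by + 1) * block_h):
                for x in range(bx * block_w, (bx + 1) * block_w):
                    total += image[y][x]
            out_row.append(total / (block_h * block_w))
        reduced.append(out_row)
    return reduced


frame = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
print(block_means(frame, 2, 2))  # [[10.0, 20.0], [30.0, 40.0]]
```

A 4×4 frame becomes a 2×2 grid here, so any feature vector built from the grid has a quarter of the entries, and a parallax shift smaller than the block size changes the block values only slightly.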

Step 202: perform multi-feature extraction on the dimension-reduced image blocks. The multiple features mainly comprise the block gradient magnitude feature, block histogram feature, frame histogram feature, projection feature and middle-line boundary feature.

Because the left-eye and right-eye images of a 3D video image are strongly similar (the symmetry of 3D video images), the frame pixel histogram vector, block histogram vector (over blocks of B_h×B_w pixels), gradient magnitude vector and horizontal projection vector are computed for the left-eye image and for the right-eye image. These 4 kinds of feature vectors should be similar between the two views, and for each a vector distance d ∈ [0,1] (called the feature value for short) can be computed. If the feature value equals 0, the feature vectors of the left-eye and right-eye images are almost identical; if it equals 1, they are entirely different. After normalization, the feature values of the 4 kinds of feature vectors are denoted d_hist, d_{blk_hist}, d_mag and d_prj respectively.
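As one possible reading of the normalized vector distance, the sketch below computes a frame histogram feature value in [0,1] for two candidate views. The text does not name the distance metric; half the L1 distance between normalized histograms is assumed here because it is 0 for identical histograms and 1 for non-overlapping ones. The function names, the 16-bin default and the flat pixel lists are all illustrative choices.

```python
def histogram(pixels, bins=16, max_value=256):
    """Normalised histogram of 8-bit pixel values; the entries sum to 1."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    total = len(pixels)
    return [c / total for c in counts]


def hist_distance(left_pixels, right_pixels, bins=16):
    """Assumed metric: half the L1 distance between the two histograms.

    The result is 0.0 when the histograms match and 1.0 when they share
    no occupied bin, which matches the range d in [0, 1] in the text.
    """
    h_left = histogram(left_pixels, bins)
    h_right = histogram(right_pixels, bins)
    return 0.5 * sum(abs(a - b) for a, b in zip(h_left, h_right))


print(hist_distance([0, 0, 255, 255], [255, 0, 0, 255]))  # 0.0
print(hist_distance([0, 0, 0, 0], [255, 255, 255, 255]))  # 1.0
```

The same pattern would apply to the block histogram, gradient magnitude and projection vectors, each with its own normalization.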

Because a 3D video image is formed by combining left-eye and right-eye images, a 3D image in left-right format usually has a distinct boundary line or dark band at the middle column, in the horizontal center of the frame, and a 3D image in top-bottom format usually has a distinct boundary line or dark band at the middle row, in the vertical center of the frame. Whether a boundary line exists can therefore be detected from the gradient magnitude and luminance information of the middle band-shaped region; the normalized boundary-line feature value is denoted d_bry ∈ [0,1]. For a 3D image in interlaced format there is no distinct boundary between the left-eye and right-eye images, so the interlaced 3D format features include no boundary-line feature.

Here d_hist, d_{blk_hist}, d_mag, d_prj and d_bry are called the 3D format sub-feature values.

For an input video image f(x, y) of unknown format, the 3 classes of format sub-feature values are computed as follows:

First-class format sub-feature values: taking the left half of image f(x, y) as the candidate left-eye image and the right half as the candidate right-eye image, the first-class format sub-feature values can be computed with the 3D format sub-feature value calculation method and are denoted d^lr_hist, d^lr_{blk_hist}, d^lr_mag, d^lr_prj and d^lr_bry.

Second-class format sub-feature values: taking the top half of image f(x, y) as the candidate left-eye image and the bottom half as the candidate right-eye image, the second-class format sub-feature values can be computed with the 3D format sub-feature value calculation method and are denoted d^tb_hist, d^tb_{blk_hist}, d^tb_mag, d^tb_prj and d^tb_bry.

Third-class format sub-feature values: taking the top field of image f(x, y) as the candidate left-eye image and the bottom field as the candidate right-eye image, the third-class format sub-feature values can be computed with the 3D format sub-feature value calculation method and are denoted d^it_hist, d^it_{blk_hist}, d^it_mag and d^it_prj.
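The three hypotheses amount to three ways of slicing one frame into a candidate left-eye and right-eye view. A minimal Python sketch follows; the function name `candidate_views`, the dictionary keys, and the convention that the top field consists of the even-numbered rows (counting from 0) are assumptions of this example, and an odd trailing row or column is simply dropped.

```python
def candidate_views(image):
    """Return the (left_eye, right_eye) pair for each of the 3 hypotheses.

    image is a list of rows. Keys: "lr" = left-right, "tb" = top-bottom,
    "it" = interlaced (top field / bottom field).
    """
    half_h = len(image) // 2
    half_w = len(image[0]) // 2
    left_right = ([row[:half_w] for row in image],
                  [row[half_w:2 * half_w] for row in image])
    top_bottom = (image[:half_h], image[half_h:2 * half_h])
    # Assumed field order: rows 0, 2, 4, ... form the top field.
    interlaced = (image[0:2 * half_h:2], image[1:2 * half_h:2])
    return {"lr": left_right, "tb": top_bottom, "it": interlaced}


frame = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
views = candidate_views(frame)
print(views["lr"][0])  # [[1, 2], [5, 6], [9, 10], [13, 14]]
print(views["it"][1])  # [[5, 6, 7, 8], [13, 14, 15, 16]]
```

Each pair would then be passed to the same sub-feature calculation, giving one set of sub-feature values per hypothesis.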

Step 203: fuse the extracted features using the spatial-domain multi-feature fusion method for 3D format features.

Image content is often complex and variable. In particular, some hard-to-distinguish 2D images also have strong symmetry (for example, pure white or black scenes), while 3D images with large parallax have weaker symmetry. If the video format is judged from a single feature alone, misjudgment is likely. The present invention proposes extracting the 3D format feature by multi-feature fusion in the spatial domain, that is, by weighted fusion of multiple 3D format sub-features. This yields a very stable 3D format feature and greatly improves the accuracy of video format detection. The calculation flow of the multi-feature fusion is shown in Fig. 3.

If image f(x, y) is a 3D image in left-right format, the distances between the feature vectors of the left half (left-eye image) and the right half (right-eye image) are clearly small, so the first-class format sub-feature values are all equal to 0 or small at the same time. If only a few of the 5 sub-feature values of image f(x, y) are small, one cannot conclude that f(x, y) is a 3D image, because many 2D images also have some small sub-feature values; for a pure white 2D scene image, for example, some sub-feature values are equal to 0. Therefore, to determine the format of image f(x, y), all sub-feature values must be considered together in order to truly reflect the video format characteristics; relying on an individual sub-feature value alone easily leads to format misjudgment. Based on this principle, the present invention proposes extracting the 3D format feature by multi-feature fusion, which further improves the accuracy and stability with which the 3D format feature is expressed.

First, the multi-feature fusion formula for the first-class format sub-feature values is as follows:

$$
d^{lr} = w_{hist}\, d^{lr}_{hist} + w_{blk\_hist}\, d^{lr}_{blk\_hist} + w_{prj}\, d^{lr}_{prj} + w_{bry}\, d^{lr}_{bry} + w_{mag}\, d^{lr}_{mag}
$$

where w_hist, w_{blk_hist}, w_prj, w_bry and w_mag are the preset weight coefficients of the first-class format sub-feature values; each weight lies in [0,1] and the weights satisfy w_hist + w_{blk_hist} + w_prj + w_bry + w_mag = 1; d^lr ∈ [0,1] denotes the first-class 3D format feature value.

Second, in a similar way, fusing the second-class format sub-feature values gives the second-class 3D format feature value:

$$
d^{tb} = w_{hist}\, d^{tb}_{hist} + w_{blk\_hist}\, d^{tb}_{blk\_hist} + w_{prj}\, d^{tb}_{prj} + w_{bry}\, d^{tb}_{bry} + w_{mag}\, d^{tb}_{mag}
$$

where w_hist, w_{blk_hist}, w_prj, w_bry and w_mag are preset weight coefficients identical to those of the first-class format sub-feature values; d^tb ∈ [0,1] denotes the second-class 3D format feature value.

Third, fusing the third-class format sub-feature values gives the third-class 3D format feature value:

$$
d^{it} = w'_{hist}\, d^{it}_{hist} + w'_{blk\_hist}\, d^{it}_{blk\_hist} + w'_{prj}\, d^{it}_{prj} + w'_{mag}\, d^{it}_{mag}
$$

where w'_hist, w'_{blk_hist}, w'_prj and w'_mag are preset weight coefficients; each weight lies in [0,1] and the weights satisfy w'_hist + w'_{blk_hist} + w'_prj + w'_mag = 1; d^it ∈ [0,1] denotes the third-class 3D format feature value.
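The weighted fusion can be sketched in a few lines of Python, assuming it is a plain weighted sum, which is consistent with weights in [0,1] that sum to 1 and a result in [0,1]. The function name `fuse`, the dictionary layout and the equal weights of 0.2 are hypothetical; the text only says the weights are preset.

```python
def fuse(sub_features, weights):
    """Weighted sum of sub-feature values (each in [0, 1]).

    Both arguments are dicts with the same keys. The weights must sum to 1
    so that the fused value also stays in [0, 1].
    """
    if set(sub_features) != set(weights):
        raise ValueError("sub-features and weights must use the same keys")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[k] * sub_features[k] for k in weights)


# Hypothetical equal weights for the five left-right sub-features.
weights = {"hist": 0.2, "blk_hist": 0.2, "prj": 0.2, "bry": 0.2, "mag": 0.2}
sub_lr = {"hist": 0.1, "blk_hist": 0.2, "prj": 0.1, "bry": 0.0, "mag": 0.1}
d_lr = fuse(sub_lr, weights)
print(round(d_lr, 3))  # 0.1
```

For the interlaced hypothesis the same function would be called with four keys (no "bry") and its own weight set, mirroring the primed weights in the text.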

Step 204: based on the fused features, perform format discrimination in the spatial domain and fuzzy-feature format discrimination in the time domain, and determine the image format of the current video frame f(x, y) (2D format, top-bottom format, left-right format or interlaced format).

1. Format discrimination in the spatial domain

A 3D video image can only be in one of the left-right, top-bottom and interlaced formats; it cannot belong to two 3D formats at once. If the input video image is in left-right format, the first-class 3D format feature value is clearly smaller than the second-class and third-class values; if it is in top-bottom format, the second-class 3D format feature value is clearly smaller than the first-class and third-class values; if it is in interlaced format, the third-class 3D format feature value is clearly smaller than the first-class and second-class values. The smallest of the first-class, second-class and third-class 3D format feature values can therefore express the 3D format feature value of the input image, which gives the expression:

d = min(d^tb, d^lr, d^it)

where d denotes the 3D format feature value of image f(x, y). Since d^tb ∈ [0,1], d^lr ∈ [0,1] and d^it ∈ [0,1], the format feature value d ∈ [0,1].

From the definition of the video image format feature value d, d is a distance measure between the left-eye and right-eye images of f(x, y). That is, the smaller d is, the stronger the 3D format characteristic and the more likely f(x, y) is a 3D image; conversely, the larger d is, the weaker the 3D format characteristic, that is, the stronger the 2D format characteristic, and the more likely f(x, y) is a 2D image.

For ease of description, the present invention introduces the symbol fmt to quantify the video format of image f(x, y): fmt = 0 means f(x, y) is a 2D image; fmt = 1 means f(x, y) is in left-right format; fmt = 2 means f(x, y) is in top-bottom format; fmt = 3 means f(x, y) is in interlaced format. To decide whether the input image is 2D or 3D, a threshold d_th ∈ (0,1) is set for the 3D format feature, for example d_th = 0.5. If d ≥ d_th, the 2D format characteristic of f(x, y) is the stronger one and its 3D format characteristic is weak, so f(x, y) is classified as 2D and fmt is set to 0. If d < d_th, the 2D format characteristic of f(x, y) is weak, that is, its 3D format characteristic is the stronger one, so f(x, y) is classified as 3D. If d < d_th and d^lr = d, the 3D format characteristic is strong and the first-class 3D format feature is stronger than the second-class and third-class ones, that is, the left-right characteristic of f(x, y) is the strongest, so f(x, y) is classified as left-right format and fmt is set to 1. If d < d_th and d^tb = d, the 3D format characteristic is strong and the second-class 3D format feature is stronger than the first-class and third-class ones, that is, the top-bottom characteristic is the strongest, so f(x, y) is classified as top-bottom format and fmt is set to 2. If d < d_th and d^it = d, the 3D format characteristic is strong and the third-class 3D format feature is stronger than the first-class and second-class ones, that is, the interlaced characteristic is the strongest, so f(x, y) is classified as interlaced format and fmt is set to 3.

Finally, the discrimination expression for the video format fmt in the spatial domain is as follows:

$$
fmt =
\begin{cases}
0, & d \ge d_{th} \\
1, & d < d_{th} \text{ and } d^{lr} = d \\
2, & d < d_{th} \text{ and } d^{tb} = d \\
3, & d < d_{th} \text{ and } d^{it} = d
\end{cases}
$$

where fmt denotes the video format determined for image f(x, y); d denotes the format feature value of image f(x, y); d^lr, d^tb and d^it denote the first-class, second-class and third-class 3D format feature values of image f(x, y). When fmt = 0, image f(x, y) is determined to be a 2D image; when fmt = 1, a 3D image in left-right format; when fmt = 2, a 3D image in top-bottom format; when fmt = 3, a 3D image in interlaced format.

The calculation flow of the format discrimination in the spatial domain is shown in Fig. 4 and proceeds as follows. Compute the smallest of the first-class, second-class and third-class 3D format feature values as the 3D format feature value d of the input image. Check whether d is less than the 3D format feature threshold d_th; if not, the video format is judged to be 2D; if so, the video format is judged to be 3D and the following checks continue. Check whether d equals the first-class 3D format feature value d^lr; if so, the video format is judged to be left-right format. Otherwise check whether d equals the second-class 3D format feature value d^tb; if so, the video format is judged to be top-bottom format, and otherwise it is judged to be interlaced format.
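The flow of Fig. 4 as described above maps directly onto a small Python function. The threshold default of 0.5 is the example value given in the text; the function name `classify_spatial` and the constant names are hypothetical.

```python
FMT_2D, FMT_LR, FMT_TB, FMT_IT = 0, 1, 2, 3


def classify_spatial(d_lr, d_tb, d_it, d_th=0.5):
    """Return (fmt, d) for one frame from its three fused feature values.

    Follows the order of checks in the text: threshold test first, then
    left-right, then top-bottom, with interlaced as the remaining case.
    """
    d = min(d_tb, d_lr, d_it)
    if d >= d_th:
        return FMT_2D, d
    if d == d_lr:
        return FMT_LR, d
    if d == d_tb:
        return FMT_TB, d
    return FMT_IT, d


print(classify_spatial(0.1, 0.6, 0.7))  # (1, 0.1)
print(classify_spatial(0.8, 0.9, 0.7))  # (0, 0.7)
```

Returning d alongside fmt keeps the value available for the time-domain stage, which needs both.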

2. Format discrimination in the time domain

In general, for 2D images with strong symmetry, and for 3D images of dark scenes that carry little information or have weak gradients, the computed format feature value d falls near the 3D format feature threshold d_th. When d is near d_th, neither the 3D nor the 2D format feature of image f(x, y) is pronounced, and it is difficult to tell whether f(x, y) is a 2D or a 3D image; using d_th directly as the decision criterion would easily cause format misjudgment. The present invention therefore proposes a 3D-format fuzzy feature value discrimination method. First, a lower threshold d_low (also called the strong 3D-format feature threshold) and an upper threshold d_high (also called the strong 2D-format feature threshold) are introduced for the 3D format feature value d, with 0 < d_low < d_th < d_high < 1. Clearly, when d > d_high, the 2D format feature of the image is strong, and the format fmt (value 0) obtained by spatial-domain discrimination is highly credible. When d < d_low, the 3D format feature of the image is strong, and the format fmt (value 1, 2 or 3) obtained by spatial-domain discrimination is highly credible. When d_low ≤ d ≤ d_high, neither the 2D nor the 3D format feature of the image is strong, and the format fmt (value 0, 1, 2 or 3) obtained by spatial-domain discrimination has relatively low credibility. Accordingly, the interval [d_low, d_high] is called the format-discrimination fuzzy feature zone, and a value d ∈ [d_low, d_high] is called a format fuzzy feature value; the interval [0, d_low) ∪ (d_high, 1] is called the format-discrimination strong feature zone, and a value d ∈ [0, d_low) ∪ (d_high, 1] is called a format strong feature value.

Because a video usually consists of consecutive frames, the format of the current frame fmt_n is usually correlated with the format of the previous frame fmt_n-1. To reduce the effect of misjudgments in the fuzzy feature zone on detection accuracy, the present invention introduces time-domain information about the video format. When d ∈ [d_low, d_high], the fmt obtained from the single-frame decision has low credibility, whereas the previous frame's format fmt_n-1 is more credible than the fmt computed for the current frame; the current frame is therefore judged to have the same video format as the previous frame, which is more credible than the format obtained in the spatial domain from a fuzzy feature value. When d ∈ [0, d_low) ∪ (d_high, 1], the format feature of the image is strong and the fmt (value 0, 1, 2 or 3) obtained by spatial-domain discrimination is reliable; the format obtained in the spatial domain from the strong feature value is therefore used as the video format of the current frame.

The format fmt_n of the current frame image f(x, y) in the time domain is expressed as follows:

$$
fmt_n = \begin{cases}
fmt_{n-1}, & d \in [d_{low}, d_{high}] \\
fmt, & d \in [0, d_{low}) \cup (d_{high}, 1]
\end{cases}
$$

Here fmt denotes the video format obtained for the current frame image f(x, y) by spatial-domain discrimination, fmt_n-1 denotes the video format of the previous frame (for the initial first frame, fmt_n-1 is initialized with the spatial-domain format fmt), and fmt_n denotes the video format of the current frame after fusion with time-domain information.

The flow chart of format discrimination in the time domain is shown in Fig. 5. Specifically: it is judged whether the 3D format feature value d lies in the format-discrimination fuzzy feature zone [d_low, d_high]. If it does, the video format is judged to be that of the previous frame; if it does not, the format obtained in the spatial domain from the strong feature value is used as the video format of the current frame.
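The time-domain rule of Fig. 5 can be illustrated with a small Python sketch. The function names, the thresholds 0.2 and 0.4, and the sample sequence are hypothetical; the sketch also assumes that fmt_n-1 is the fused result of the previous frame rather than its raw spatial-domain result.

```python
def temporal_format(fmt, d, fmt_prev, d_low, d_high):
    """Keep the previous frame's format when d is in the fuzzy zone."""
    if d_low <= d <= d_high:
        return fmt_prev    # spatial result is unreliable here
    return fmt             # strong feature value: trust spatial result


def detect_sequence(spatial_results, d_low, d_high):
    """Apply the temporal rule to a sequence of (fmt, d) pairs."""
    fused = []
    fmt_prev = None
    for fmt, d in spatial_results:
        if fmt_prev is None:
            fmt_prev = fmt  # first frame: initialise with spatial fmt
        fmt_n = temporal_format(fmt, d, fmt_prev, d_low, d_high)
        fused.append(fmt_n)
        fmt_prev = fmt_n    # assumption: propagate the fused result
    return fused


# Hypothetical per-frame spatial results (fmt, d).
frames = [(1, 0.05), (0, 0.32), (0, 0.90), (1, 0.28)]
print(detect_sequence(frames, d_low=0.2, d_high=0.4))  # [1, 1, 0, 0]
```

In this example the second and fourth frames fall inside the fuzzy zone, so they inherit the format of the frame before them instead of flipping on an unreliable single-frame decision.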

Step 205: output the video format of the current frame, and control the playback device to play the video source according to the automatically detected video image format.

Fig. 6 is a system architecture diagram of the three-dimensional video format automatic detection device of the present invention. As shown in Fig. 6, the device comprises at least: an image blocking module 61, a multi-feature extraction module 62, a multi-feature fusion module 63 and a format judgment module 64.

The image blocking module 61 divides an input video frame into image blocks and performs dimension reduction, so as to reduce the feature vector dimension and the computational complexity. In the preferred embodiment of the present invention, the left-eye image and the right-eye image are divided into image blocks with a pixel resolution of B_h×B_w. Extracting features in units of pixel blocks effectively reduces the negative effect of the parallax between the left-eye and right-eye images, reduces the dimension of their feature vectors, increases the robustness of the feature vectors, and greatly reduces the computational complexity of the video format detection method.
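As a rough illustration of block-based dimension reduction, the sketch below replaces each B_h×B_w block with a single value. The use of the block mean as that value, and the function name `block_means`, are assumptions made for this example; the text above does not state which per-block statistic is used.

```python
def block_means(img, bh, bw):
    """Reduce a 2-D image (list of pixel rows) to one mean per block.

    Rows and columns that do not fill a complete bh x bw block are
    ignored. The block mean is an assumed statistic, used here only
    to show how blocking shrinks the feature dimension.
    """
    n_rows = len(img) // bh
    n_cols = len(img[0]) // bw
    reduced = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            total = 0
            for y in range(r * bh, (r + 1) * bh):
                for x in range(c * bw, (c + 1) * bw):
                    total += img[y][x]
            row.append(total / (bh * bw))
        reduced.append(row)
    return reduced


# A 4x4 image reduced with 2x2 blocks becomes a 2x2 grid.
img = [
    [1, 1, 3, 3],
    [1, 1, 3, 3],
    [5, 5, 7, 7],
    [5, 5, 7, 7],
]
print(block_means(img, 2, 2))  # [[1.0, 3.0], [5.0, 7.0]]
```

A 16-element image becomes a 4-element grid here, which is the kind of reduction in feature vector dimension the module is meant to provide.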

The multi-feature extraction module 62 feeds the dimension-reduced image blocks into a block gradient-magnitude feature extraction submodule 621, a block histogram feature extraction submodule 622, a frame histogram feature extraction submodule 623, a projection feature extraction submodule 624 and a middle-line boundary feature extraction submodule 625, each of which performs its own feature extraction. The features mainly comprise: the block gradient-magnitude feature, the block histogram feature, the frame histogram feature, the projection feature and the middle-line boundary feature.

Specifically, because the left-eye image and the right-eye image of a three-dimensional (3D) video image are highly similar (the symmetry of 3D video images), the frame pixel histogram vector, the block histogram vector (pixel blocks of resolution B_h×B_w), the gradient-magnitude vector and the horizontal projection vector are computed separately for the left-eye image and the right-eye image. These four kinds of feature vectors should be similar between the two images, and the similarity can be measured by the corresponding vector distance d ∈ [0,1] (the vector distance is referred to as the feature value). If the feature value equals 0, the feature vectors of the left-eye and right-eye images are almost identical; if it equals 1, they are entirely different. After normalization, the feature values of the four kinds of feature vectors are denoted d_hist, d_{blk_hist}, d_mag and d_prj respectively.
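One way to obtain a feature value in [0,1] from two histogram vectors is sketched below for the frame histogram of a left-right split. The distance used here (half the L1 distance between normalized histograms) is an assumption chosen because it is bounded by 0 and 1; the text does not specify the distance measure. The helper names and the 16-bin setting are likewise hypothetical.

```python
def histogram(pixels, bins=16, max_val=256):
    """Normalized histogram of integer pixel values in [0, max_val)."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_val] += 1
    n = len(pixels)
    return [c / n for c in counts]


def hist_distance(h1, h2):
    """Half the L1 distance: 0 for identical, 1 for disjoint histograms."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def split_left_right(img):
    """Split a 2-D image into candidate left-eye and right-eye halves."""
    half = len(img[0]) // 2
    left = [row[:half] for row in img]
    right = [row[half:2 * half] for row in img]
    return left, right


def d_hist_left_right(img):
    """Frame-histogram feature value for the left-right hypothesis."""
    left, right = split_left_right(img)
    h_left = histogram([p for row in left for p in row])
    h_right = histogram([p for row in right for p in row])
    return hist_distance(h_left, h_right)


# Identical halves give 0; completely different halves give 1.
print(d_hist_left_right([[10, 200, 10, 200], [30, 40, 30, 40]]))  # 0.0
print(d_hist_left_right([[0, 0, 255, 255]]))                      # 1.0
```

The same pattern (compute a vector for each half, then a normalized distance) would apply to the block histogram, gradient-magnitude and projection vectors, each with its own vector definition.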

The multi-feature fusion module 63 fuses the features extracted by the submodules of the multi-feature extraction module 62 using the spatial-domain multi-feature fusion method for 3D format features.

Image content is often complex and variable. In some special cases that are hard to distinguish, 2D images also exhibit strong symmetry (for example, pure white or pure black scenes), while 3D images with large parallax exhibit weak symmetry. Judging the video image format from a single feature therefore easily produces format misjudgment. The present invention proposes to fuse the 3D format features with a spatial-domain multi-feature fusion method, i.e. a weighted fusion of multiple 3D format sub-features. This method extracts a very stable 3D format feature and can greatly improve the accuracy of video format detection.

If image f(x, y) is a 3D image in left-right format, the distances between the feature vectors of the left half image (left-eye image) and the right half image (right-eye image) are clearly small, so the first-class format sub-feature values will all be equal to 0 or small at the same time. If only a few of the 5 sub-feature values of image f(x, y) are small, it cannot be concluded that f(x, y) is a 3D image, because many 2D images also have individual sub-feature values that are small; for a pure white 2D scene image, for example, some of the sub-feature values are equal to 0. To judge the format of image f(x, y), all sub-feature values must therefore be considered together in order to reflect the true format characteristics of the video; relying on an individual sub-feature alone easily causes format misjudgment. Based on this principle, the present invention proposes the multi-feature fusion method for extracting the 3D format feature, which further improves the accuracy and stability with which the 3D format feature is expressed.
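A minimal sketch of the weighted fusion follows, assuming it is a linear weighted sum of the sub-feature values with weights in [0,1] that add up to 1. The function name `fuse`, the equal weights and the sample sub-feature values are all hypothetical and chosen only to show why a few small sub-feature values alone do not yield a small fused value.

```python
def fuse(sub_features, weights, tol=1e-9):
    """Linear weighted fusion of sub-feature values in [0, 1].

    sub_features and weights are dicts keyed by feature name.
    Weights must lie in [0, 1] and sum to 1, so the fused value
    also lies in [0, 1].
    """
    if set(sub_features) != set(weights):
        raise ValueError("sub-features and weights must share keys")
    if any(w < 0 or w > 1 for w in weights.values()):
        raise ValueError("weights must lie in [0, 1]")
    if abs(sum(weights.values()) - 1.0) > tol:
        raise ValueError("weights must sum to 1")
    return sum(weights[k] * sub_features[k] for k in weights)


# Hypothetical equal weights for the five sub-features.
weights = {"hist": 0.2, "blk_hist": 0.2, "prj": 0.2, "bry": 0.2, "mag": 0.2}

# Hypothetical image where four sub-features are 0 but one is large.
mixed = {"hist": 0.0, "blk_hist": 0.0, "prj": 0.0, "bry": 1.0, "mag": 0.0}
print(fuse(mixed, weights))

# Hypothetical image where all five sub-features are small.
strong_3d = {"hist": 0.02, "blk_hist": 0.03, "prj": 0.01, "bry": 0.0, "mag": 0.04}
print(fuse(strong_3d, weights))
```

Only the second image produces a fused value close to 0, which matches the reasoning that all sub-feature values have to be small together before the image is treated as 3D.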

The format judgment module 64 feeds the fused features into a spatial-domain format judgment module 641 and a time-domain format judgment module 642, which perform format discrimination in the spatial domain and fuzzy-feature format discrimination in the time domain, and judge the image format of the current video frame f(x, y) (2D format, top-bottom format, left-right format or interlaced format).

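1. Spatial-domain format judgment module 641

d = min(d^tb, d^lr, d^it)

2. Time-domain format judgment module 642

Therefore, the format fmt_n of the current frame image f(x, y) in the time domain is expressed as follows:

$$
fmt_n = \begin{cases}
fmt_{n-1}, & d \in [d_{low}, d_{high}] \\
fmt, & d \in [0, d_{low}) \cup (d_{high}, 1]
\end{cases}
$$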

In summary, the three-dimensional video format automatic detection method and device of the present invention first divide an input video frame into image blocks and perform dimension reduction to reduce the feature vector dimension and the computational complexity; next feed the blocks into 5 feature extraction submodules for multi-feature extraction; then feed these features into the feature fusion module for feature fusion; then feed the fused features into the format judgment module for format discrimination in the spatial domain and fuzzy-feature format discrimination in the time domain, so as to judge the image format of the current video frame f(x, y); and finally output the video format of the current frame and control the playback device to play the video source according to the automatically detected video image format. This both reduces the computational complexity of video format detection and greatly improves its accuracy.

The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Any person skilled in the art may modify and change the above embodiments without departing from the spirit and scope of the present invention. The scope of protection of the present invention shall therefore be as set forth in the claims.

Claims

1. A method for automatically detecting a three-dimensional video format, comprising the following steps:

Step one: dividing an input video frame into image blocks and performing dimension reduction to reduce the feature vector dimension;

Step two: performing multi-feature extraction on the dimension-reduced image blocks to obtain first-class format sub-feature values, second-class format sub-feature values and third-class format sub-feature values, the multiple features comprising a block gradient-magnitude feature, a block histogram feature, a frame histogram feature, a projection feature and a middle-line boundary feature;

Step three: performing multi-feature fusion separately on the obtained first-class, second-class and third-class format sub-feature values, using the spatial-domain multi-feature fusion method for 3D format features, to obtain a first-class 3D format feature value, a second-class 3D format feature value and a third-class 3D format feature value;

Step four: performing format discrimination in the spatial domain and fuzzy-feature format discrimination in the time domain according to the fused features, and judging the image format of the current video frame;

Step five: outputting the video format of the current frame, and controlling a playback device to play according to the automatically detected video image format;

wherein, in step three, multi-feature fusion is performed on the first-class format sub-feature values using the following formula to obtain the first-class 3D format feature value:

$$
d^{lr} = w_{hist}\, d^{lr}_{hist} + w_{blk\_hist}\, d^{lr}_{blk\_hist} + w_{prj}\, d^{lr}_{prj} + w_{bry}\, d^{lr}_{bry} + w_{mag}\, d^{lr}_{mag}
$$

where d^lr_hist, d^lr_{blk_hist}, d^lr_prj, d^lr_bry and d^lr_mag respectively denote the sub-feature values, among the first-class format sub-feature values, extracted from the frame histogram feature, the block histogram feature, the projection feature, the middle-line boundary feature and the block gradient-magnitude feature; w_hist, w_{blk_hist}, w_prj, w_bry and w_mag are the respective preset weight coefficients of these first-class format sub-feature values, the weight coefficients being defined on [0,1] and satisfying the equality 1 = w_hist + w_{blk_hist} + w_prj + w_bry + w_mag; and d^lr ∈ [0,1] denotes the first-class 3D format feature value;

multi-feature fusion is performed on the second-class format sub-feature values using the following formula to obtain the second-class 3D format feature value:

$$
d^{tb} = w_{hist}\, d^{tb}_{hist} + w_{blk\_hist}\, d^{tb}_{blk\_hist} + w_{prj}\, d^{tb}_{prj} + w_{bry}\, d^{tb}_{bry} + w_{mag}\, d^{tb}_{mag}
$$

where d^tb_hist, d^tb_{blk_hist}, d^tb_prj, d^tb_bry and d^tb_mag respectively denote the sub-feature values, among the second-class format sub-feature values, extracted from the frame histogram feature, the block histogram feature, the projection feature, the middle-line boundary feature and the block gradient-magnitude feature; w_hist, w_{blk_hist}, w_prj, w_bry and w_mag are the respective preset weight coefficients of these second-class format sub-feature values, identical to the weight coefficients of the first-class format sub-features; and d^tb ∈ [0,1] denotes the second-class 3D format feature value;

multi-feature fusion is performed on the third-class format sub-feature values using the following formula to obtain the third-class 3D format feature value:

$$
d^{it} = w'_{hist}\, d^{it}_{hist} + w'_{blk\_hist}\, d^{it}_{blk\_hist} + w'_{mag}\, d^{it}_{mag} + w'_{prj}\, d^{it}_{prj}
$$

where d^it_hist, d^it_{blk_hist}, d^it_mag and d^it_prj respectively denote the sub-feature values, among the third-class format sub-feature values, extracted from the frame histogram feature, the block histogram feature, the block gradient-magnitude feature and the projection feature; w'_hist, w'_{blk_hist}, w'_prj and w'_mag are the respective preset weight coefficients of these third-class format sub-feature values, the weight coefficients being defined on [0,1] and satisfying the equality 1 = w'_hist + w'_{blk_hist} + w'_prj + w'_mag; and d^it ∈ [0,1] denotes the third-class 3D format feature value.

2. The method for automatically detecting a three-dimensional video format according to claim 1, characterized in that, in step two, for an input video image f(x, y) of unknown format, the three classes of format sub-feature values are computed as follows:

taking the left half of image f(x, y) as the possible left-eye image and the right half of image f(x, y) as the possible right-eye image, and computing the first-class format sub-feature values according to the 3D format sub-feature value calculation method;

taking the top half of image f(x, y) as the possible left-eye image and the bottom half of image f(x, y) as the possible right-eye image, and computing the second-class format sub-feature values according to the 3D format sub-feature value calculation method;

taking the top field of image f(x, y) as the possible left-eye image and the bottom field of image f(x, y) as the possible right-eye image, and computing the third-class format sub-feature values according to the 3D format sub-feature value calculation method.

3. The method for automatically detecting a three-dimensional video format according to claim 2, characterized in that the 3D format sub-feature value calculation method is as follows:

computing, separately for the left-eye image and the right-eye image, the frame pixel histogram vector, the block histogram vector, the gradient-magnitude vector and the horizontal projection vector, and computing the corresponding feature values, the normalized feature values of the four kinds of feature vectors being denoted d_hist, d_{blk_hist}, d_mag and d_prj respectively;

detecting whether a boundary line exists in the video image from the gradient magnitude and luminance information of a band-shaped region, the normalized boundary-line feature value being denoted d_bry.

4. The method for automatically detecting a three-dimensional video format according to claim 3, characterized in that, in step four, video format discrimination in the spatial domain is performed according to the following expression:

$$
fmt = \begin{cases}
0, & d \ge d_{th} \\
1, & d < d_{th} \text{ and } d = d^{lr} \\
2, & d < d_{th} \text{ and } d = d^{tb} \\
3, & d < d_{th} \text{ and } d = d^{it}
\end{cases}
$$

where fmt denotes the video format determined for image f(x, y); d denotes the format feature value of image f(x, y), d = min(d^tb, d^lr, d^it); d_th denotes the 3D format feature threshold; d^lr denotes the first-class 3D format feature value of image f(x, y); d^tb denotes the second-class 3D format feature value of image f(x, y); d^it denotes the third-class 3D format feature value of image f(x, y); when fmt = 0, image f(x, y) is judged to be a 2D image; when fmt = 1, image f(x, y) is judged to be a 3D image in left-right format; when fmt = 2, image f(x, y) is judged to be a 3D image in top-bottom format; and when fmt = 3, image f(x, y) is judged to be a 3D image in interlaced format.

5. The method for automatically detecting a three-dimensional video format according to claim 4, characterized in that, in step four, video format discrimination in the time domain is performed according to the following expression:

$$
fmt_n = \begin{cases}
fmt_{n-1}, & d \in [d_{low}, d_{high}] \\
fmt, & d \in [0, d_{low}) \cup (d_{high}, 1]
\end{cases}
$$

where fmt denotes the video format obtained for the current frame image f(x, y) by format discrimination in the spatial domain; fmt_n-1 denotes the video format of the previous frame image, the spatial-domain format fmt being used as the initial value of fmt_n-1 for the initial first frame; fmt_n denotes the video format of the current frame image after fusion with time-domain information; d_low is the lower threshold of the 3D format feature value d; and d_high is the upper threshold of the 3D format feature value d.

6. A device for automatically detecting a three-dimensional video format, comprising at least:

an image blocking module, which divides an input video frame into image blocks and performs dimension reduction to reduce the feature vector dimension;

a multi-feature extraction module, which feeds the dimension-reduced image blocks into a block gradient-magnitude feature extraction submodule, a block histogram feature extraction submodule, a frame histogram feature extraction submodule, a projection feature extraction submodule and a middle-line boundary feature extraction submodule for respective multi-feature extraction, to obtain first-class format sub-feature values, second-class format sub-feature values and third-class format sub-feature values;

a multi-feature fusion module, which fuses the feature values extracted by the submodules of the multi-feature extraction module using the spatial-domain multi-feature fusion method for 3D format features, to obtain a first-class 3D format feature value, a second-class 3D format feature value and a third-class 3D format feature value;

a format judgment module, which feeds the fused features into a spatial-domain format judgment module and a time-domain format judgment module to perform format discrimination in the spatial domain and fuzzy-feature format discrimination in the time domain, and judges the image format of the current video frame,

wherein the multi-feature fusion module performs multi-feature fusion on the first-class format sub-feature values extracted by the submodules, using the following formula, to obtain the first-class 3D format feature value:

$$
d^{lr} = w_{hist}\, d^{lr}_{hist} + w_{blk\_hist}\, d^{lr}_{blk\_hist} + w_{prj}\, d^{lr}_{prj} + w_{bry}\, d^{lr}_{bry} + w_{mag}\, d^{lr}_{mag}
$$

where d^lr_hist, d^lr_{blk_hist}, d^lr_prj, d^lr_bry and d^lr_mag respectively denote the sub-feature values, among the first-class format sub-feature values, extracted from the frame histogram feature, the block histogram feature, the projection feature, the middle-line boundary feature and the block gradient-magnitude feature; w_hist, w_{blk_hist}, w_prj, w_bry and w_mag are the respective preset weight coefficients of these first-class format sub-feature values, the weight coefficients being defined on [0,1] and satisfying the equality 1 = w_hist + w_{blk_hist} + w_prj + w_bry + w_mag; and d^lr ∈ [0,1] denotes the first-class 3D format feature value; multi-feature fusion is performed on the second-class format sub-feature values extracted by the submodules, using the following formula, to obtain the second-class 3D format feature value:

$$
d^{tb} = w_{hist}\, d^{tb}_{hist} + w_{blk\_hist}\, d^{tb}_{blk\_hist} + w_{prj}\, d^{tb}_{prj} + w_{bry}\, d^{tb}_{bry} + w_{mag}\, d^{tb}_{mag}
$$

where d^tb_hist, d^tb_{blk_hist}, d^tb_prj, d^tb_bry and d^tb_mag respectively denote the sub-feature values, among the second-class format sub-feature values, extracted from the frame histogram feature, the block histogram feature, the projection feature, the middle-line boundary feature and the block gradient-magnitude feature; w_hist, w_{blk_hist}, w_prj, w_bry and w_mag are the respective preset weight coefficients of these second-class format sub-feature values, identical to the weight coefficients of the first-class format sub-features; and d^tb ∈ [0,1] denotes the second-class 3D format feature value,

multi-feature fusion is performed on the third-class format sub-feature values extracted by the submodules, using the following formula, to obtain the third-class 3D format feature value: