WO2011071473A1 - Method and apparatus for distinguishing a 3D image from a 2D image and for identifying the presence of a 3D image format by image difference determination

Method and apparatus for distinguishing a 3D image from a 2D image and for identifying the presence of a 3D image format by image difference determination

Info

Publication number
WO2011071473A1
WO2011071473A1 PCT/US2009/006469 US2009006469W WO2011071473A1 WO 2011071473 A1 WO2011071473 A1 WO 2011071473A1 US 2009006469 W US2009006469 W US 2009006469W WO 2011071473 A1 WO2011071473 A1 WO 2011071473A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
format
formats
sub
images
Prior art date
Application number
PCT/US2009/006469
Other languages
English (en)
Inventor
Tao Zhang
Original Assignee
Thomson Licensing
Priority date
Filing date
Publication date
Application filed by Thomson Licensing
Priority to US13/514,681 (US20120242792A1)
Priority to EP09799203A (EP2510702A1)
Priority to PCT/US2009/006469 (WO2011071473A1)
Priority to TW099142867A (TWI428008B)
Publication of WO2011071473A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N2213/00 Details of stereoscopic systems
    • H04N2213/007 Aspects relating to detection of stereoscopic image format, e.g. for adaptation to the display format

Definitions

  • This invention is related to a U.S. Patent Application Attorney Docket No. PU090183 entitled "Method For Distinguishing A 3D Image From A 2D Image And For Identifying The Presence Of A 3D Image Format By Feature Correspondence Determination", filed concurrently herewith and commonly assigned to the same assignee hereof, which is incorporated by reference in its entirety.
  • This invention relates to a method for identifying a three-dimensional (3D) image and, more particularly, for identifying a format associated with the 3D image wherein the identification is performed using an image difference determination.
  • Three-dimensional (3D) images exist today in many different digital formats.
  • The number of different formats, together with the apparent lack of standardization for formatting such 3D images, leads to many problems and further complexities in recognizing the presence of such 3D images and then in determining how the 3D image is formatted in order to process and display the image properly.
  • 3D content includes a pair of images or views initially generated as separate stereo images (or views).
  • The terms "stereo images" and "stereo views" and the terms "images" and "views" may each be used interchangeably without loss of meaning and without any intended limitation.
  • Each of these images may be encoded.
  • The contents of the two stereo images are combined into a single image frame, so each frame represents the entire 3D image instead of using two separate stereo images, each in its own frame or file.
  • Various formats for such a 3D image frame are depicted simplistically along the top row of Figure 1.
  • 3D image frame formats include a side-by-side format, a checkerboard pattern format, an interlaced format, a top-bottom format, and a color based format such as an anaglyph. All but the color based format are shown in simplified form in Figure 1.
  • One of the stereo images or stereo views of a 3D image is depicted in light shading, while the second image or view associated with that 3D image is depicted in dark shading.
  • The ability to support multiple frame formats for 3D images will be important for the success of 3D products in the marketplace.
  • One problem that arises from generating 3D image files in these single frame formats is that the resulting single image frame, without further analysis, may appear similar to an image frame used for a non-stereo or two-dimensional (2D) image. Moreover, a stream of such 3D image frames may initially appear indistinguishable from a stream of 2D image frames.
  • Image viewers, video players, set-top boxes, and the like are used for receiving, processing, and displaying the contents of the image frame stream.
  • The present inventive method identifies the presence of a three-dimensional (3D) image format for a received image through the use of image difference determination.
  • The received image is sampled using a candidate 3D format to generate two sub-images from the received image.
  • If the candidate 3D format is a non-blended 3D format, these sub-images are compared to determine whether they are similar with respect to structure. If the sub-images are not similar, a new 3D format is selected and the method is repeated. If the sub-images are found to be similar, or if the candidate 3D format is a blended 3D format, an image difference is computed between the two sub-images to form an edge map.
  • Thicknesses are computed for the edges in the edge map. The thickness and uniformity distribution of the edges are then used to determine whether the format is 2D or 3D and, if 3D, which of the 3D formats was used for the received image. When the format of the received image is determined, that format can be used to process and display the received image.
  • FIG. 1 depicts a plurality of exemplary 3D image formats.
  • FIG. 2 depicts a flow chart of a method for use in identifying the existence of a particular blended 3D image format, when present in an image under test, in accordance with an embodiment of the present invention.
  • FIG. 3 depicts a flow chart of a method for use in identifying the existence of a particular non-blended 3D image format, when present in an image frame under test, in accordance with an embodiment of the present invention.
  • FIG. 4 depicts a high level block diagram of an embodiment of a processing unit suitable for executing the inventive methods and processes of the various embodiments of the present invention.
  • The present invention advantageously provides a method for identifying a three-dimensional (3D) image and, more particularly, for identifying a format associated with the 3D image wherein the identification is performed using an image difference determination.
  • Although the present invention may be described primarily within the context of a video decoder and display environment, the specific embodiments of the present invention should not be treated as limiting the scope of the invention. It will be appreciated by those skilled in the art and informed by the teachings of the present invention that the concepts of the present invention can be advantageously applied in substantially any video-based environment such as, but not limited to, television, transcoding, video players, image viewers, set-top boxes, or any software-based and/or hardware-based implementations to identify 3D formats.
  • The terms "processor" and "controller" should not be construed to refer exclusively to hardware capable of executing software, and can implicitly include, without limitation, digital signal processor ("DSP") hardware, read-only memory ("ROM") for storing software, random access memory ("RAM"), and non-volatile storage.
  • FIG. 4 depicts a high level block diagram of an embodiment of a processing unit 400 suitable for executing the inventive methods and processes of the various embodiments of the present invention. More specifically, the processing unit 400 of FIG. 4 illustratively comprises a processor 410 as well as a memory 420 for storing control programs, algorithms, stored media and the like.
  • The processor 410 cooperates with conventional support circuitry 430 such as power supplies, clock circuits, cache memory and the like as well as circuits that assist in executing the software routines stored in the memory 420. As such, it is contemplated that some of the process steps discussed herein as software processes may be implemented within hardware, for example, as circuitry that cooperates with the processor 410 to perform various steps.
  • The processing unit 400 also contains input-output circuitry 440 that forms an interface between various functional elements communicating with the processing unit 400 such as displays and the like.
  • Although the processing unit 400 of FIG. 4 is depicted as a general purpose computer that is programmed to perform various control functions in accordance with the present invention, the invention can be implemented in hardware, for example, as an application-specific integrated circuit (ASIC). As such, the process steps described herein are intended to be broadly interpreted as being equivalently performed by software, hardware, or a combination thereof.
  • A method has been developed to determine whether an image is in a 3D format or even whether the image is 3D at all based on the use of image difference information generated from the image. Moreover, the method is capable of identifying which one of the plurality of 3D formats is exhibited by the image, when it has been determined that the image is a 3D image rather than a 2D image. It is understood that a single image in 3D format contains information from two similar, but different, images or views. These two images actually differ significantly because the images are taken from different points of reference and different viewing angles. In contrast, a single 2D image contains information from only a single reference point and viewing angle and, therefore, from only a single view. It has been determined herein that these differences can be exploited to show whether the image is or is not in a 3D format. Moreover, it is then possible to determine which particular 3D format has been applied to the image.
  • Figure 1 depicts a variety of different 3D formats across the top row.
  • The formats shown include an interlaced format, a top-bottom (also known as over-under) format, a side-by-side format, and a checkerboard pattern format.
  • The interlaced format shown is for horizontal interlacing. It will be appreciated that the orthogonal format to horizontal interlacing, namely, vertical interlacing, could be achieved by interlacing alternating columns from each image or view instead of the alternating rows.
  • The formats shown in this figure represent an exemplary listing rather than an exhaustive listing of all known 3D formats.
  • One of the stereo images or stereo views (S1) of a 3D image is depicted in light shading, while the second image or view (S2) associated with that 3D image is depicted in dark shading.
  • The 3D formats in Figure 1 can be classified into two groups according to the degree of blending at a pixel level between the left and right views, S1 and S2.
  • One group includes the blended 3D formats while the other group includes the non-blended 3D formats.
  • In blended 3D formats, each pixel tends to be surrounded by pixels from both the left and right views.
  • Examples of blended 3D formats are the interlaced (horizontal or vertical) and checkerboard pattern formats.
  • In non-blended 3D formats, each pixel tends to be surrounded by pixels from the same view, with the exception of pixels at view boundaries, as can be seen for the pixels at the S1/S2 boundary in the side-by-side and over-under formats.
  • It is assumed that a received image is sampled to generate two separate images, S1 and S2, as shown in Figure 1. Additionally, it is assumed that the sampling techniques that are being used include techniques for the blended 3D formats. Finally, it is assumed that the received single image is in fact a 2D image. If the received image is sampled to generate two separate images, S1 and S2, these two images will be almost identical in both content and depth. Any slight differences between the images S1 and S2 are caused by a very small uniform displacement due to the sampling. A simple image subtraction between these two images will produce an edge map that shows a so-called "edge" to indicate where an image difference occurs.
  • The edges in this edge map are thin with substantially uniform thickness.
  • For example, extraction may be performed by placing the odd numbered image rows of pixels from the received image into S1 while placing the even numbered image rows of pixels from the received image into S2. Since the received image was assumed to be 2D, such a blended 3D extraction technique would invariably create two almost identical images S1 and S2, since their corresponding rows would be displaced by only one pixel from each other. Since the images are substantially identical, it follows that a subtraction of the images will produce either no difference at all or a slight difference that will show up as sparse edges or thin edges. The thickness of edges in such an example would be expected to be at most several pixels wide.
  • Now assume that a received image is in a blended 3D format, denoted as F.
  • The resulting image difference, E, can be quite different depending on which sampling method is used to extract S1 and S2.
  • If the sampling method designed for format F is used, the image, E, from the image difference step will be an edge map having edges whose thicknesses are not uniform and which exhibit large differences. This is so because the image extraction results in the correct S1 and S2 being generated, wherein S1 and S2 are different views, by depth and point of reference, for example, of the same image.
  • If a sampling method not designed for format F is used, the image, E, resulting from the difference of S1 and S2 can be expected to be an edge map exhibiting uniform edge thickness in a manner quite similar to that shown for a 2D image.
  • It is assumed that a received image is sampled to generate two separate images, S1 and S2, as shown in Figure 1.
  • It is also assumed that the sampling techniques that are being used include techniques for the non-blended 3D formats.
  • Finally, it is assumed that the received single image is in fact a 2D image. If the received image is sampled to generate two separate images, S1 and S2, these two images will be different in structure because the images are taken from disparate parts of the 2D image. So when the received image is a 2D image and S1 and S2 are extracted using a non-blended 3D format technique, it is expected that a similarity test on the images will reject the tested sampling method and its corresponding non-blended 3D format.
  • When the received image is instead in a non-blended 3D format F, the resulting image E may again be quite different depending on which non-blended 3D sampling method is used. If the sampling method designed for the format F is used, image E will be an edge map exhibiting non-uniform edge thickness. On the other hand, if the sampling method is one that was not designed for format F, image E is not an edge map at all, since the two sampled images S1 and S2 are totally different. Thus, by using the methodology described immediately above, it is possible to determine whether a received image is a single 2D image or a single 3D image and, if the latter, whether it corresponds to a particular non-blended 3D format.
  • The resulting edge map E exhibits a non-uniform edge thickness for an image in a 3D format F, whether blended or non-blended, only if the image is sampled using a sampling method designed for and corresponding to the format F. Otherwise, it is understood that the resulting image difference, E, may be an edge map with uniform edge thickness (as in blended 3D formats) or not an edge map at all (as in non-blended 3D formats).
  • For a group that mixes blended and non-blended 3D formats, it is possible to combine the methods discussed for blended and non-blended 3D formats to determine whether the received image is a single 2D image or a single 3D image and, if the latter, whether it corresponds to a particular blended or non-blended 3D format in the group.
  • Similarity testing is performed for the non-blended 3D format based method. Similarity testing could be performed for the blended 3D format based technique since most sampling techniques on images formatted using a blended 3D format will extract two similar images. However, there is a possibility that, under certain conditions, a blended 3D formatted image could be processed in such a way that the two extracted views S1 and S2 are determined to be dissimilar. Thus, the image would be improperly rejected from further processing. So it would be preferable to perform similarity testing only for views from a non-blended 3D format sampling technique in order to avoid the problem of an improper rejection.
  • $G = \{(G_1, M_1), (G_2, M_2), \ldots, (G_{NF}, M_{NF})\}$,
  • where $G_i$ is a candidate 3D format,
  • $M_i$ is the sampling method corresponding to candidate 3D format $G_i$, and
  • $NF$ is the total number of 3D formats supported in the group of candidate formats.
  • The method for identifying a 3D image and its corresponding format, where the format is selected from the group of candidate blended 3D formats, is shown in Figure 2.
  • The method begins in step 200, during which an input is received as a single image input O.
  • The single image input O is expected to be in either a 3D format or a 2D format.
  • The method then proceeds to step 201.
  • In step 201, it is assumed that the input image O is formatted according to a candidate 3D format Gi from the group of candidate formats G.
  • Two images S1 and S2 are then generated from the input image O according to its predefined corresponding sampling method Mi. It should be understood that the input image or the resulting images S1 and S2 can also be subjected to a transformation such as from color to grayscale or the like. The method then proceeds to step 202.
  • In step 202, the image difference E of S1 and S2 is computed.
  • The edge map is $E = S_1 - S_2$.
  • The image difference computation is performed on a pixel-wise basis so that pixels from corresponding locations in the two images S1 and S2 are subtracted from each other.
  • The difference computation should be performed within the same channel for each image S1 and S2. In this case, a channel can be selected from the group of RGB channels or the group of YUV channels or even among different grayscale levels.
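  • As a minimal sketch of the pixel-wise difference in step 202, the following assumes S1 and S2 are same-sized single-channel images stored as lists of rows. The function names and the `tol` noise tolerance are illustrative assumptions and do not come from the source text.

```python
def image_difference(s1, s2):
    """Pixel-wise difference E = S1 - S2 for two same-sized single-channel images."""
    if len(s1) != len(s2) or any(len(a) != len(b) for a, b in zip(s1, s2)):
        raise ValueError("S1 and S2 must have the same dimensions")
    return [[a - b for a, b in zip(row1, row2)] for row1, row2 in zip(s1, s2)]


def binarize(e, tol=0):
    """Mark pixels whose absolute difference exceeds tol.

    tol is a hypothetical noise tolerance, not a value given in the source.
    """
    return [[1 if abs(v) > tol else 0 for v in row] for row in e]


# Two tiny views that differ only in the middle column.
s1 = [[10, 10, 200], [10, 10, 200]]
s2 = [[10, 200, 200], [10, 200, 200]]
e = image_difference(s1, s2)
edges = binarize(e)
```

  • The binary map makes the later thickness measurements independent of the sign of the difference.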
  • The method then proceeds to optional step 203 or, if the optional step is not performed, to step 204.
  • Image subtraction as shown in the formulas above is considered to be a simple method to compute edge maps between two very similar images. It is also contemplated that this step can be realized by computing two individual edge maps which are then subtracted to form the difference ED of edge maps.
  • One of the individual edge maps is computed for S1 and is denoted as ES1.
  • The other one of the individual edge maps is computed for S2 and is denoted as ES2.
  • The edge maps show obvious interlaced patterns (e.g., vertical or horizontal) which are relatively easy to filter out in the optional steps such as step 203 or step 304 in the methods described herein.
  • In optional step 203, it is possible to prune the edge map E by removing any edges having thickness smaller than a certain threshold.
  • This pruned edge map is denoted as E2.
  • The threshold is selected to remove any edges or artifacts whose thickness, either vertically or horizontally, falls below it.
  • Morphological filtering, including the operations of erosion and then dilation, can be applied to the image to eliminate noise and make regions of narrow edges more homogeneous.
  • Morphological filtering is a well-known process for image enhancement that tends to simplify the image and thereby facilitate the search for objects of interest. It generally involves modifying the spatial form or structure of objects within the image.
  • Dilation and erosion are two fundamental morphological filtering operations. Dilation allows objects to expand, thus potentially filling in small holes and connecting disjoint objects. Erosion is the complementary operation to dilation in that erosion shrinks objects by etching away (eroding) their boundaries. These operations can be customized for each application by the proper selection of a structuring element, which determines exactly how the objects will be dilated or eroded.
  • With structuring element se1, the morphological operation is performed in a horizontal direction.
  • With structuring element se2, the morphological operation is performed in a vertical direction. After performing an open operation using se1 and se2 sequentially, all edges with horizontal or vertical thickness less than or equal to the threshold value of 3 will be removed from E. Operations such as erosion with se1 and se2 followed by dilation with se1 and se2 could also be employed for such an edge removal.
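  • The pruning of step 203 can be sketched as below for a binary edge map. The structuring elements se1 and se2 are not reproduced in this text, so the sketch assumes flat line elements of length threshold + 1 (horizontal, then vertical). For binary data, opening with such a line element is equivalent to keeping only runs of 1s at least that long, which is how it is implemented here; pixels outside the image are treated as background. All function names are illustrative.

```python
def open_rows(mask, k):
    """Binary opening of each row with a 1 x k line element.

    Equivalent to keeping only horizontal runs of 1s that are at least k long.
    """
    out = []
    for row in mask:
        new_row = [0] * len(row)
        start = None
        for x, v in enumerate(list(row) + [0]):  # sentinel 0 closes a trailing run
            if v and start is None:
                start = x
            elif not v and start is not None:
                if x - start >= k:
                    new_row[start:x] = [1] * (x - start)
                start = None
        out.append(new_row)
    return out


def transpose(mask):
    return [list(col) for col in zip(*mask)]


def prune_edges(mask, threshold=3):
    """Remove edges whose horizontal or vertical thickness is <= threshold."""
    k = threshold + 1  # assumed element length; not stated in the source
    horizontal = open_rows(mask, k)
    return transpose(open_rows(transpose(horizontal), k))


# A 5x5 block survives; a one-pixel-wide vertical line in the last column is removed.
mask = [[1, 1, 1, 1, 1, 0, 0, 1] for _ in range(5)] + [[0, 0, 0, 0, 0, 0, 0, 1]]
pruned = prune_edges(mask)
```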
  • After optional step 203, the method proceeds to step 204.
  • In step 204, the thickness of each edge is computed over the edge map E, or over the optional edge map E2 from step 203, in the horizontal direction and/or in the vertical direction.
  • The method then proceeds to step 205. It should be understood that it is possible to perform this computation over either the horizontal direction alone or over the vertical direction alone or over a combination of both directions. In an example using the latter case, it is possible to compute a thickness in the vertical direction for lines that are substantially horizontal, that is, lines having an inclination between +45 degrees and -45 degrees. Similar variations of the techniques described above are contemplated herein.
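  • One simple way to measure thickness in step 204 is to take run lengths of edge pixels along rows (horizontal thickness) or columns (vertical thickness) of a binary edge map. The source does not prescribe this particular measure, so the sketch below is an assumption, as are the function names.

```python
def run_lengths(line):
    """Lengths of consecutive runs of nonzero values in a 1-D sequence."""
    runs, count = [], 0
    for v in line:
        if v:
            count += 1
        elif count:
            runs.append(count)
            count = 0
    if count:
        runs.append(count)
    return runs


def edge_thicknesses(mask, direction="horizontal"):
    """Collect edge thicknesses along rows ("horizontal") or columns ("vertical")."""
    if direction == "horizontal":
        lines = mask
    elif direction == "vertical":
        lines = [list(col) for col in zip(*mask)]
    else:
        raise ValueError("direction must be 'horizontal' or 'vertical'")
    return [n for line in lines for n in run_lengths(line)]


mask = [[0, 1, 1, 0, 1],
        [0, 1, 1, 0, 0]]
horizontal = edge_thicknesses(mask, "horizontal")
vertical = edge_thicknesses(mask, "vertical")
```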
  • In decision step 205, the distribution of the statistics of thickness in the edge map E or E2 is analyzed.
  • The distribution of statistics can include the horizontal statistics or the vertical statistics or a combination of both horizontal and vertical statistics. If the average thickness of an edge is small in comparison to a threshold thickness and if the distribution of the thickness is uniform so that there are no large changes in thickness along an edge, the process proceeds to step 206. Otherwise, the process flow is diverted to step 207 since it is determined by the analysis in step 205 that the input image is in a blended 3D format and the format is the currently tested blended 3D format Gi. At step 207, the process stops.
  • One exemplary technique suitable for use herein employs a heuristic threshold, a, where a is measured in pixels.
  • a is used while the optional step 203 is performed; otherwise, the threshold a would usually be a larger value since the optional step would not be performed.
  • max(abs(thickness))
  • The maximum value of the absolute value of the thickness, expressed as max(abs(thickness)), is compared to the threshold a. If this thickness value is less than or equal to the threshold, then the thickness is determined to be uniform and small, which is the "YES" branch from decision step 205. Otherwise, the thickness is determined to be neither uniform nor small, which is the "NO" branch from decision step 205.
  • The term "small" in this context should be understood to mean values that are less than the defined mean and standard variance values.
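  • The threshold comparison of decision step 205 can be sketched as follows. The function name is illustrative, and treating an empty list of thicknesses (no edges at all) as "uniform and small" is an assumption based on the earlier remark that a 2D image may produce no difference at all.

```python
def is_uniform_and_small(thicknesses, a):
    """Return True for the "YES" branch of step 205: max(abs(thickness)) <= a.

    a is the heuristic threshold in pixels; its value is not fixed here.
    """
    if not thicknesses:
        return True  # assumption: an empty edge map counts as uniform and small
    return max(abs(t) for t in thicknesses) <= a
```

  • A result of False corresponds to the "NO" branch, in which the currently tested candidate format is accepted.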
  • The method for identifying a 3D image and its corresponding format, where the format is selected from the group of candidate non-blended 3D formats, is shown in Figure 3.
  • The method begins in step 300, during which an input is received as a single image input O.
  • The single image input O is expected to be in either a 3D format or a 2D format.
  • The method then proceeds to step 301.
  • In step 301, it is assumed that the input image O is formatted according to a candidate 3D format Gi from the group of candidate formats G.
  • Two images S1 and S2 are then generated from the input image O according to its predefined corresponding sampling method Mi. It should be understood that the input image or the resulting images S1 and S2 can also be subjected to a transformation such as from color to grayscale or the like, as mentioned above with respect to the method in Figure 2. The method then proceeds to step 302.
  • In step 302, the method performs image processing operations on images S1 and S2 to determine if S1 and S2 are different images, that is, not similar images.
  • The concept of being "different images" is understood to mean that S1 and S2 are from different parts of a single image and that S1 and S2 are totally different in structure. If, in step 302, it is determined that S1 and S2 are different in structure, control is transferred to step 307. Otherwise, control of the method is transferred to step 303.
  • Two exemplary methods for determining whether the structures are similar or different are described below.
  • In one technique, feature points in S1 and S2 are compared. If most detected features such as point features in S1 are missing from S2 upon comparison, a determination can be made that the two images are different in structure. Conversely, if most detected features such as point features in S1 are found in S2 upon comparison, a determination can be made that the two images are similar in structure.
  • Another technique uses image differences. If S1 and S2 are similar in structure, their image difference (S1 minus S2, or vice versa) will be minimal, sparse, and substantially blank. On the other hand, if S1 and S2 are not similar in structure, that is, if they are different, the differences in image E are huge and the resulting image E is dense.
  • The sparseness or density of non-blank pixels can be used to make the similarity determination.
  • A ratio of the total number of non-blank pixels to the total number of pixels can be used to show substantial similarity and substantial difference with respect to structure.
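  • The ratio test described above might look like the following sketch. The source gives no numeric cut-off for the ratio, so `max_ratio` and the per-pixel tolerance `tol` are purely hypothetical parameters, as are the function names.

```python
def non_blank_ratio(s1, s2, tol=0):
    """Fraction of pixels whose absolute difference between S1 and S2 exceeds tol."""
    total = sum(len(row) for row in s1)
    if total == 0:
        return 0.0
    non_blank = sum(
        1
        for row1, row2 in zip(s1, s2)
        for a, b in zip(row1, row2)
        if abs(a - b) > tol
    )
    return non_blank / total


def similar_by_density(s1, s2, max_ratio=0.5, tol=0):
    """Treat the pair as similar in structure when the difference image is sparse.

    max_ratio is a hypothetical cut-off; the source does not specify one.
    """
    return non_blank_ratio(s1, s2, tol) <= max_ratio
```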
  • To account for intensity changes between left and right views (i.e., S1 and S2), histogram similarity can be used to characterize the structure similarity for step 302.
  • Although histogram similarity does not always correspond to or identify structure similarity with complete accuracy, it does typically identify image pairs that are not similar. Histogram similarity can be measured by a Bhattacharyya measure denoted by B. This measure is also referred to as the Bhattacharyya distance.
  • The Bhattacharyya measure or Bhattacharyya distance is well known in the field of statistics.
  • The original paper defining this measure was written by A. Bhattacharyya and is entitled "On a Measure of Divergence Between Two Statistical Populations Defined by their Probability Distributions", published in 1943 in Bull. Calcutta Math. Soc., Vol. 35, pp. 99-110.
  • A histogram is computed for an image. For a gray scale image with intensity between 0 and 255, the intensity range 0 - 255 is divided into N bins. When a pixel in the image is shown to have a value v, that pixel is identified as belonging to the bin v/N. The quantity in the bin is then incremented by 1. This is repeated for all the pixels in the image to create the actual image histogram. The histogram represents the intensity distribution of the image. Two histograms p and q are generated from the two images or views S1 and S2. Histogram similarity is then simply a determination of how close or similar these two histograms appear. If the two images are similar, the histograms will be similar. It should be appreciated that similarity in histograms does not always mean structure similarity.
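  • A minimal sketch of the histogram construction follows. It assumes 8-bit grayscale values and equal-width bins, with the bin index computed as floor(v * N / 256); this is one common reading of the binning described above, and the default of 16 bins is an arbitrary illustrative choice. The `normalize` helper turns counts into a distribution so that two histograms p and q can be compared.

```python
def histogram(image, n_bins=16, levels=256):
    """Count pixels per intensity bin for a single-channel image (list of rows).

    Assumes integer pixel values in the range 0 .. levels - 1.
    """
    counts = [0] * n_bins
    for row in image:
        for v in row:
            counts[v * n_bins // levels] += 1  # equal-width bins over 0..levels-1
    return counts


def normalize(counts):
    """Scale counts so that they sum to 1 (all zeros if the image is empty)."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


p = normalize(histogram([[0, 15, 16, 255]]))
```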
  • The similarity check in step 302 using the Bhattacharyya measure can be realized as a threshold comparison as follows: if B is less than the threshold, the images are similar in structure; otherwise, the images are not similar in structure.
  • The threshold has been set to 0.04. This threshold value was determined experimentally by trial and error. Other techniques may be useful for determining this threshold. At this time, the threshold value shown above has provided excellent results for substantially all images tested to date.
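  • The threshold comparison can be sketched as below. The formula for B is not reproduced in this text, so the sketch assumes the standard Bhattacharyya distance, the negative logarithm of the sum over bins of sqrt(p_i * q_i), applied to normalized histograms p and q. The 0.04 threshold comes from the text above, but it was tuned for the source's own definition of B and may not carry over unchanged to this assumed form. Function names are illustrative.

```python
import math


def bhattacharyya_distance(p, q):
    """Assumed form: D_B = -ln(sum_i sqrt(p_i * q_i)) for normalized histograms."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    bc = min(bc, 1.0)  # guard against rounding slightly above 1
    return float("inf") if bc == 0 else -math.log(bc)


def similar_in_structure(p, q, threshold=0.04):
    """Step 302 check: the images are treated as similar when B is below the threshold."""
    return bhattacharyya_distance(p, q) < threshold
```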
  • In step 303, the image difference E of S1 and S2 is computed.
  • The edge map is $E = S_1 - S_2$.
  • The image difference computation is performed on a pixel-wise basis so that pixels from corresponding locations in the two images S1 and S2 are subtracted from each other.
  • The difference computation should be performed within the same channel for each image S1 and S2.
  • A channel can be selected from the group of RGB channels or the group of YUV channels or even among different grayscale levels.
  • The method then proceeds either to step 304, when the optional step is performed, or to step 305, if the optional step is not performed.
  • In optional step 304, it is possible to prune the edge map, E, by removing any edges having thickness smaller than a certain threshold.
  • This pruned edge map is denoted as E2.
  • The threshold is selected to remove any edges or artifacts whose thickness, either vertically or horizontally, falls below it.
  • In step 305, the thickness of each edge is computed over the edge map, E, or over the optional edge map E2 from step 304, in the horizontal direction and/or in the vertical direction.
  • The techniques employed in this step can be similar to those used in step 204 as described above. The method then proceeds to step 306.
  • In decision step 306, the distribution of the statistics of thickness in the edge map, E or E2, is analyzed in a manner similar to that shown and described in Figure 2, step 205. If the average thickness of an edge is small in comparison to a threshold thickness and if the distribution of the thickness is uniform so that there are no large changes in thickness along an edge, the process proceeds to decision step 307. Otherwise, the process flow is diverted to step 308 since it is determined by the analysis in step 306 that the input image is in a non-blended 3D format and the format is the currently tested non-blended 3D format Gi. At step 308, the process stops.
  • The method for identifying a 3D image and its corresponding format, where the format is selected from a group of candidate formats representative of mixed blended and non-blended 3D formats, is identical to the process shown in Figure 3 and described above with respect to non-blended 3D formats.
  • For the horizontal interlaced format, the corresponding sampling method extracts one line (i.e., horizontal row of pixels) for image S1 and then the next line for S2, iteratively.
  • The order of the lines from the original single image is maintained in creating the two images S1 and S2.
  • In an alternative realization, the lines are grouped in pairs so that two consecutive lines are extracted for S1 and then the next two consecutive lines are extracted for image S2.
  • Other alternative realizations are contemplated for this sampling technique.
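  • The row-based sampling can be sketched as follows for an image stored as a list of rows. The `group` parameter is an illustrative way to cover both the single-line and the paired-line realizations; it is not a name used in the source.

```python
def sample_horizontal_interlaced(image, group=1):
    """Split rows alternately into S1 and S2, preserving their original order.

    group=1 alternates single rows; group=2 alternates pairs of consecutive rows.
    """
    s1 = [row for y, row in enumerate(image) if (y // group) % 2 == 0]
    s2 = [row for y, row in enumerate(image) if (y // group) % 2 == 1]
    return s1, s2
```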
  • For the vertical interlaced format, the corresponding sampling method extracts one line (i.e., vertical column of pixels) for image S1 and then the next line for S2, iteratively. The order of the lines from the original single image is maintained in creating the two images S1 and S2.
  • Alternative realizations are contemplated for this sampling technique in a manner similar to the alternatives mentioned for the horizontal interlaced technique.
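  • The column-based counterpart can be sketched in the same way. As before, the `group` parameter and the function name are illustrative assumptions.

```python
def sample_vertical_interlaced(image, group=1):
    """Split columns alternately into S1 and S2, preserving their original order."""
    s1 = [[v for x, v in enumerate(row) if (x // group) % 2 == 0] for row in image]
    s2 = [[v for x, v in enumerate(row) if (x // group) % 2 == 1] for row in image]
    return s1, s2
```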
  • For the checkerboard format, the corresponding sampling technique extracts the odd pixels from the odd rows together with the even pixels from the even rows for image S1, while it extracts the even pixels from the odd rows together with the odd pixels from the even rows for image S2.
  • This technique could also be realized to extract alternating groups of pixels instead of individual pixels.
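The odd/even pixel rule above can be sketched as follows, assuming rows and pixels are counted from one in the text and from zero in the code, so that a pixel belongs to S1 exactly when its zero-based row and column indices have an even sum. The function name `split_checkerboard` is illustrative.

```python
def split_checkerboard(image):
    """Separate the two views of a checkerboard-multiplexed image.

    With zero-based indices, pixels where (x + y) is even go to S1
    and pixels where (x + y) is odd go to S2.
    """
    s1, s2 = [], []
    for y, row in enumerate(image):
        s1.append([p for x, p in enumerate(row) if (x + y) % 2 == 0])
        s2.append([p for x, p in enumerate(row) if (x + y) % 2 == 1])
    return s1, s2


# Toy image: "A" marks pixels of one view, "b" pixels of the other.
image = [
    ["A", "b", "A", "b"],
    ["b", "A", "b", "A"],
]
s1, s2 = split_checkerboard(image)
```

Each output view has half the width of the input, because every row contributes alternate pixels to each view.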
  • Sampling for the non-blended 3D formats is simpler in that the sampler merely separates S1 and S2 at their interface in the single image.
  • For the side-by-side format, S1 can be taken from the left half of the single image while S2 is taken from the right half.
  • A similar approach can be taken for sampling the top-bottom format.
  • Sampling is performed in such a manner that the resulting image or view S1 contains only pixels from one view and image S2 contains only pixels from the other view. It is also contemplated that sampling is performed on the same channel, such as the Y channel in a YUV file or the G channel in an RGB file.
  • The method for identifying the 3D formats employs image difference and is therefore an intensity based method. This makes the method relatively sensitive to intensity changes. Unless other algorithms are used to compensate effectively for intensity changes on different channels, the same channel should be used in sampling. Sampling from different channels is generally expected to give substandard results.
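As a sketch of the non-blended case, the following assumes an image stored as rows of (Y, U, V) tuples with even width and height, and samples a single channel before splitting. The helper names `take_channel`, `split_side_by_side` and `split_top_bottom` are illustrative only.

```python
def take_channel(image, index):
    """Extract one channel (e.g. index 0 for Y) from rows of pixel tuples."""
    return [[pixel[index] for pixel in row] for row in image]


def split_side_by_side(channel):
    """S1 is the left half and S2 the right half of every row."""
    half = len(channel[0]) // 2
    return [row[:half] for row in channel], [row[half:] for row in channel]


def split_top_bottom(channel):
    """S1 is the top half and S2 the bottom half of the rows."""
    half = len(channel) // 2
    return channel[:half], channel[half:]


# Toy 2x4 YUV image; only the Y values differ.
image = [
    [(10, 0, 0), (20, 0, 0), (11, 0, 0), (21, 0, 0)],
    [(30, 0, 0), (40, 0, 0), (31, 0, 0), (41, 0, 0)],
]
y = take_channel(image, 0)
left, right = split_side_by_side(y)
top, bottom = split_top_bottom(y)
```

Both views are taken from the same channel, in line with the recommendation above.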
  • The image difference based methods described herein have been shown as individual methods in Figures 2 and 3. It should be understood that these two methods can be performed individually or sequentially. That is, the blended 3D format method may be performed on the single image and/or the non-blended 3D format method may be performed on the single image. The blended and non-blended 3D format methods may also be performed together so that one set of formats is tested before the other set. In this embodiment, it has been found preferable to test blended formats before non-blended formats. It is also contemplated that yet another embodiment of the methods allows for batch processing rather than iterative processing, so that the statistics for all the 3D formats are computed at the same time. In this latter embodiment, the method decisions (e.g., 3D vs. 2D and particular 3D format) can be determined from all the statistics computed.
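The difference between the sequential and batch embodiments can be sketched as below. Everything here is a hypothetical scaffold: the per-format tests and statistics are supplied by the caller as plain functions, the toy lambdas stand in for the real analysis, and the "2D" fallback when no candidate matches is an assumption of this sketch.

```python
def identify_sequential(image, blended_tests, non_blended_tests):
    """Test blended candidates first, then non-blended; stop at the first match."""
    for name, test in list(blended_tests) + list(non_blended_tests):
        if test(image):
            return name
    return "2D"  # assumption: no matching 3D format means a 2D image


def identify_batch(image, statistic_fns, decide):
    """Compute the statistics for every candidate first, then decide once."""
    stats = {name: fn(image) for name, fn in statistic_fns.items()}
    return decide(stats)


image = "toy image"  # placeholder; the toy tests below ignore its content

# Sequential: each candidate is paired with a yes/no test.
blended = [("checkerboard", lambda img: False)]
non_blended = [("side-by-side", lambda img: True)]
sequential_result = identify_sequential(image, blended, non_blended)

# Batch: one made-up score per candidate, lowest score wins.
scores = {"checkerboard": lambda img: 0.9, "side-by-side": lambda img: 0.1}
batch_result = identify_batch(image, scores, lambda s: min(s, key=s.get))
```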
  • The method disclosed employs a technique relying on feature correspondence.
  • This technique is fundamentally different from the techniques described herein that rely on image difference.
  • Feature correspondence based methods detect features and establish a one-to-one correspondence between detected features.
  • Image difference based methods do not rely on features for proper operation.
  • Edge maps could also be computed using absolute value differences.
  • The image difference is preferably computed pixel-wise and on the same channel. It is further contemplated that the image difference could be computed for one or more, or even all, channels in the image.
  • For example, one image difference could be computed for the Y channel, another for the U channel, and yet another for the V channel, all within a single iteration of the method for a particular 3D format. These image differences would then be recomputed as the candidate 3D format is changed. While YUV channels have been discussed above, this technique could be applied similarly to RGB channels and even to grayscale levels (channels).
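A minimal sketch of a pixel-wise, per-channel image difference using absolute values, assuming each view is held as a dictionary that maps a channel name to a two-dimensional list of intensities of equal size. The names `abs_difference` and `per_channel_difference` are illustrative.

```python
def abs_difference(s1, s2):
    """Pixel-wise absolute difference between two equally sized channels."""
    return [[abs(a - b) for a, b in zip(row1, row2)]
            for row1, row2 in zip(s1, s2)]


def per_channel_difference(view1, view2, channels=("Y", "U", "V")):
    """Compute one difference image per channel, always comparing like with like."""
    return {c: abs_difference(view1[c], view2[c]) for c in channels}


# Toy one-row views for a single candidate format.
view1 = {"Y": [[10, 20]], "U": [[5, 5]], "V": [[1, 2]]}
view2 = {"Y": [[12, 15]], "U": [[5, 7]], "V": [[4, 2]]}
diffs = per_channel_difference(view1, view2)
```

When the candidate 3D format changes, the two views are re-sampled and the differences recomputed; passing `channels=("R", "G", "B")` or a single grayscale key works the same way.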

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Processing (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)

Abstract

The invention relates to a method that identifies the presence of a three-dimensional (3D) image format in a received image by using image difference determination. The received image is sampled using a candidate 3D format to generate two sub-images from the received image. When the candidate 3D format is a non-blended 3D format, these sub-images are compared to determine whether or not they are similar in structure. If the sub-images are not similar, a new 3D format is selected and the method is repeated. If the sub-images are similar, an image difference is computed between the two sub-images to form an edge map. Thicknesses are computed for the edges in the edge map. The thickness and the uniformity of distribution of the edges are then used to determine whether the format is 2D or 3D and, if it is 3D, which of the 3D formats was used for the received image.
PCT/US2009/006469 2009-12-09 2009-12-09 Method and apparatus for distinguishing a 3D image from a 2D image and for identifying the presence of a 3D image format by image difference determination WO2011071473A1 (fr)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US13/514,681 US20120242792A1 (en) 2009-12-09 2009-12-09 Method and apparatus for distinguishing a 3d image from a 2d image and for identifying the presence of a 3d image format by image difference determination
EP09799203A EP2510702A1 (fr) 2009-12-09 2009-12-09 Method and apparatus for distinguishing a 3D image from a 2D image and for identifying the presence of a 3D image format by image difference determination
PCT/US2009/006469 WO2011071473A1 (fr) 2009-12-09 2009-12-09 Method and apparatus for distinguishing a 3D image from a 2D image and for identifying the presence of a 3D image format by image difference determination
TW099142867A TWI428008B (zh) 2009-12-09 2010-12-08 Method and apparatus for distinguishing a 3D image from a 2D image and for identifying the presence of a 3D image format by image difference determination

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2009/006469 WO2011071473A1 (fr) 2009-12-09 2009-12-09 Procédé et appareil pour distinguer une image 3d d'une image 2d et pour identifier la présence d'un format d'image 3d par détermination de différence d'image

Publications (1)

Publication Number Publication Date
WO2011071473A1 (fr) 2011-06-16

Family

ID=42545450

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/006469 WO2011071473A1 (fr) 2009-12-09 2009-12-09 Procédé et appareil pour distinguer une image 3d d'une image 2d et pour identifier la présence d'un format d'image 3d par détermination de différence d'image

Country Status (4)

Country Link
US (1) US20120242792A1 (fr)
EP (1) EP2510702A1 (fr)
TW (1) TWI428008B (fr)
WO (1) WO2011071473A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
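CN105376552A (zh) * 2015-10-30 2016-03-02 杭州立体世界科技有限公司 Automatic identification method and system for stereoscopic video based on a stereoscopic film and television playback device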

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008106185A (ja) * 2006-10-27 2008-05-08 Shin Etsu Chem Co Ltd Method for bonding a thermally conductive silicone composition, primer for bonding a thermally conductive silicone composition, and method for producing a bonded composite of a thermally conductive silicone composition
JP2012034138A (ja) * 2010-07-29 2012-02-16 Toshiba Corp Signal processing apparatus and signal processing method
EP2426931A1 (fr) * 2010-09-06 2012-03-07 Advanced Digital Broadcast S.A. Method and system for determining a video frame type
JP5817639B2 (ja) * 2012-05-15 2015-11-18 ソニー株式会社 Video format discrimination device, video format discrimination method, and video display device
TWI508523B (zh) * 2012-06-28 2015-11-11 Chunghwa Picture Tubes Ltd Three-dimensional image processing method
CN103338477B (zh) * 2013-05-27 2016-09-28 中国科学院信息工程研究所 Road-based intrusion target detection method and system
US9613575B2 (en) * 2014-03-17 2017-04-04 Shenzhen China Star Optoelectronics Technology Co., Ltd Liquid crystal display device and method for driving the liquid crystal display device
CN106028019B (zh) * 2016-05-31 2017-12-29 上海易维视科技股份有限公司 Fast detection method for video 3D format
CN106067966B (zh) * 2016-05-31 2018-08-28 上海易维视科技股份有限公司 Automatic detection method for video 3D format

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03295393A (ja) * 1990-04-13 1991-12-26 Hitachi Ltd Automatic stereoscopic image discrimination device
EP1024672A1 (fr) * 1997-03-07 2000-08-02 Sanyo Electric Co., Ltd. Digital broadcast receiver and display
EP1501316A1 (fr) * 2002-04-25 2005-01-26 Sharp Corporation Multimedia information generation method and multimedia information reproduction device
DE102005041249A1 (de) * 2005-08-29 2007-03-01 X3D Technologies Gmbh Method for generating spatially displayable images
KR20090025934A (ko) * 2007-09-07 2009-03-11 삼성전자주식회사 Apparatus and method for determining three-dimensional images

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8619121B2 (en) * 2005-11-17 2013-12-31 Nokia Corporation Method and devices for generating, transferring and processing three-dimensional image data
US8126260B2 (en) * 2007-05-29 2012-02-28 Cognex Corporation System and method for locating a three-dimensional object using machine vision

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Bull. Calcutta Math. Soc.", vol. 35, 1943, article "On a Measure of Divergence Between Two Statistical Populations Defined by their Probability Distributions", pages: 99 - 110

Also Published As

Publication number Publication date
TWI428008B (zh) 2014-02-21
TW201143359A (en) 2011-12-01
US20120242792A1 (en) 2012-09-27
EP2510702A1 (fr) 2012-10-17

Similar Documents

Publication Publication Date Title
US20120242792A1 (en) Method and apparatus for distinguishing a 3d image from a 2d image and for identifying the presence of a 3d image format by image difference determination
US8773430B2 (en) Method for distinguishing a 3D image from a 2D image and for identifying the presence of a 3D image format by feature correspondence determination
US6819796B2 (en) Method of and apparatus for segmenting a pixellated image
US9070042B2 (en) Image processing apparatus, image processing method, and program thereof
EP3311361B1 (fr) Method and apparatus for determining a depth map for an image
US20140270460A1 (en) Paper identifying method and related device
CN104144334B (zh) Subtitle detection for stereoscopic video content
EP3340075B1 (fr) Video summarization using signed foreground extraction and fusion
KR20110014067A (ko) Method and system for converting stereo content
KR101423835B1 (ko) Method for detecting a liver region in a medical image
US20180192021A1 (en) Stereo logo insertion
CN105791795B (zh) Stereoscopic image processing method and device, and stereoscopic video display equipment
US10096116B2 (en) Method and apparatus for segmentation of 3D image data
WO2017113735A1 (fr) Video format distinguishing method and system
Kim et al. Measurement of critical temporal inconsistency for quality assessment of synthesized video
Jorissen et al. Multi-view wide baseline depth estimation robust to sparse input sampling
KR101528683B1 (ko) Method for detecting objects with excessive disparity
WO2012031995A1 (fr) Method and system for determining a video frame type
Zhang 3d image format identification by image difference
WO2014025295A1 (fr) Two-dimensional (2D)/three-dimensional (3D) image format detection
CN106921856A (zh) Stereoscopic image processing method, detection and segmentation method, and related apparatus and device
RU2431939C1 (ru) Method of detecting a two-dimensional on-screen menu in a stereo video sequence
EP2658266A1 (fr) Text-aware virtual view rendering
Ekin Robust, Hardware-Oriented Overlaid Graphics Detection for TV Applications
Wang The research of subtitles regional location algorithm based on video caption frames

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 09799203; Country of ref document: EP; Kind code of ref document: A1
REEP Request for entry into the european phase. Ref document number: 2009799203; Country of ref document: EP
WWE Wipo information: entry into national phase. Ref document number: 2009799203; Country of ref document: EP
WWE Wipo information: entry into national phase. Ref document number: 13514681; Country of ref document: US
NENP Non-entry into the national phase. Ref country code: DE