WO2016210059A1 - Determining native resolutions of video sequences - Google Patents
- Publication number
- WO2016210059A1 (PCT/US2016/038907)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- scene
- resolution
- low frequency
- distribution function
- log
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23418—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234363—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the spatial resolution, e.g. for clients with a lower screen resolution
Definitions
- Embodiments of the present invention relate generally to computer science and, more specifically, to techniques for determining native resolutions of video sequences.
- Video sequences may be presented in any number of different resolutions.
- the chosen resolution represents tradeoffs between resources required to generate and operate on the video sequence (e.g., camera resolution, processing time, bandwidth, storage, etc.) and visual quality.
- For example, if the resolution of a video sequence is 1080p, then each frame includes 2,073,600 pixels arranged into 1080 rows and 1920 columns.
- If the resolution of a video sequence is 2160p, then each frame includes 8,294,400 pixels arranged into 2160 rows and 3840 columns. Since the 2160p video sequence includes four times as much data as the 1080p video sequence, the visual quality of the 2160p video sequence displayed at the full resolution of 2160p is typically higher than the visual quality of the 1080p video sequence.
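The pixel counts above can be checked with a few lines of arithmetic. This is only a sketch: the helper name `pixels_per_frame` is hypothetical, and the column counts 1920 and 3840 assume conventional 16:9 frames.

```python
def pixels_per_frame(rows: int, cols: int) -> int:
    """Number of pixels in one frame with the given dimensions."""
    return rows * cols

# Assumed 16:9 frame sizes for the two resolutions discussed above.
p1080 = pixels_per_frame(1080, 1920)
p2160 = pixels_per_frame(2160, 3840)
ratio = p2160 / p1080

print(p1080, p2160, ratio)  # prints 2073600 8294400 4.0
```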
- storing the video sequence requires more memory, and transferring the video sequence requires more bandwidth.
- generating and displaying the video sequence at a particular resolution
- a video sequence may undergo one or more down-sampling operations that reduce the amount of data included in the frames within the sequence.
- up-sampling operations may be applied to the video sequence for, among other things, compatibility with other video sequences and/or playback equipment.
- a video sequence may be up-sampled as part of splicing the video sequence with another video sequence that has been stored at a higher resolution to create a movie.
- an endpoint consumer device such as a laptop
- the movie may be viewed at the final, higher resolution.
- subsequent up-sampling operations produce only an approximate reconstruction of the original video
- the video sequence "A" could be down-sampled and then stored at a resolution of 1080p. Subsequently, to include the video sequence "A" in a 2160p movie, the video sequence "A" would need to be up-sampled to a resolution of 2160p.
- Because the down-sampling operations would have eliminated selected information in the video sequence "A," the subsequent up-sampling operations would produce only an approximate reconstruction of the original video sequence "A."
- Although the video sequence "A" included in the 2160p movie could be labeled as having a resolution of 2160p, the actual visual quality of the video sequence "A" included in the 2160p movie would be commensurate with an "effective resolution" of 1080p.
- the lowest resolution at which a video sequence has been stored determines the highest effective resolution with which the video sequence may be rendered and displayed. Consequently, this "native" resolution is more indicative of the visual quality of the video sequence than the "display" resolution at which the video sequence is delivered.
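The loss of information described above can be illustrated with a toy round trip. This is a minimal sketch under stated assumptions: 2x2 block averaging and pixel replication stand in for whatever re-sampling filters a real pipeline would use, and the function names are hypothetical.

```python
import numpy as np

def downsample_2x(frame: np.ndarray) -> np.ndarray:
    """Halve each dimension by averaging non-overlapping 2x2 blocks.

    Assumes both dimensions are even.
    """
    rows, cols = frame.shape
    return frame.reshape(rows // 2, 2, cols // 2, 2).mean(axis=(1, 3))

def upsample_2x(frame: np.ndarray) -> np.ndarray:
    """Double each dimension by pixel replication (nearest neighbor)."""
    return np.repeat(np.repeat(frame, 2, axis=0), 2, axis=1)

rng = np.random.default_rng(0)
original = rng.random((8, 8))  # stand-in for one color component of a frame
restored = upsample_2x(downsample_2x(original))

# The restored frame has the display size of the original,
# but it is only an approximation of the original content.
same_shape = restored.shape == original.shape
identical = bool(np.allclose(restored, original))
print(same_shape, identical)
```

Down-sampling the restored frame again yields the same low-resolution frame, which is one way to see that the up-sampled version carries no more information than the stored one.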
- various operations performed on a video sequence are optimized based on the resolution of the video sequence. For example, efficiently and accurately encoding source data is essential for real-time delivery of video
- encoders are usually configured to make tradeoffs between resources consumed during the encoding/decoding process and visual quality based on the resolution of the video sequence. If an encoder is designed to optimize tradeoffs for a resolution that is higher than the "native" resolution of a video sequence included in a movie having a higher resolution, then the tradeoffs that the encoder may implement for the higher resolution can dramatically increase resource burdens, such as storage and bandwidth usage, when encoding the video sequence without noticeably increasing the visual quality of the video sequence.
- One embodiment of the present invention sets forth a computer-implemented method for generating spectra for characterizing re-sampling operations that have been performed on a scene within a video sequence.
- the method includes performing a transform operation on a color component associated with a first frame included in the scene to generate a first frame spectrum; normalizing a plurality of magnitudes associated with the first frame spectrum to generate a normalized first frame spectrum; and performing at least one operation on the normalized first frame spectrum to generate a first log-magnitude frame spectrum.
- One advantage of the disclosed techniques for generating spectra is that native resolution engines may leverage these techniques to determine the lowest resolution at which a video sequence has been stored. Because this "native" resolution correlates better to the visual quality of the video sequence than the
- Figure 1 is a conceptual illustration of a system configured to implement one or more aspects of the present invention
- Figure 2 is a more detailed illustration of the native resolution analyzer of Figure 1 configured to process a video sequence, according to various embodiments of the present invention
- Figure 3 is an example of gray-scale scene spectra that the native resolution analyzer of Figure 2 is configured to generate based on a given scene within a video sequence, according to various embodiments of the present invention
- Figure 4 is an example of gray-scale scene spectra that the native resolution analyzer of Figure 2 is configured to generate based on multiple scenes within a video sequence, according to various other embodiments of the present invention
- Figure 5 is a flow diagram of method steps for deriving scene spectra from a video sequence, according to various embodiments of the present invention.
- Figure 6 shows examples of a horizontal knee point and a vertical knee point associated with a scene spectrum that may be computed by the resolution compute engine of Figure 2, according to various embodiments of the present invention.
- Figure 7 is an example of a native resolution associated with a scene spectrum that may be computed by the resolution compute engine of Figure 2, according to various embodiments of the present invention.
- Figure 8 is a flow diagram of method steps for computing the native resolution of a scene within a video sequence, according to various embodiments of the present invention.
- FIG. 1 is a conceptual illustration of a system 100 configured to implement one or more aspects of the present invention.
- the system 100 includes a cloud 130 (e.g., encapsulated shared resources, software, data, etc.) connected to a variety of consumer devices capable of displaying video sequences.
- consumer devices include, without limitation, a desktop computer 102, a laptop 110, a
- a video sequence refers to any item that includes video content. Video sequences may be manipulated (e.g., stored, encoded, compressed, transmitted, etc.) using any mechanisms known in the art. For example, one video sequence may be included in a movie that is stored as a compressed audio-video file, transmitted via the internet to a consumer device, and then decompressed for display purposes.
- the cloud 130 may include any number of compute instances 140 configured with any number (including zero) of central processing units (CPUs) 142, graphics processing units (GPUs) 144, memory 146, etc. In operation, the CPU 142 is the master processor of the compute instance 140, controlling and coordinating
- the CPU 142 issues commands that control the operation of the GPU 144.
- the GPU 144 incorporates circuitry optimized for graphics and video processing, including, for example, video output circuitry. In various embodiments, GPU 144 may be integrated with one or more of other elements of the compute instance 140.
- the memory 146 stores content, such as software applications and video sequences, for use by the CPU 142 and the GPU 144 of the compute instance 140.
- the cloud 130 receives input user information from an input device (e.g., the laptop 110), one or more of the compute instances 140 operate on the user information, and the cloud 130 transmits processed information to the user.
- the cloud 130 processes video streams and delivers video services associated with the video streams to the consumer devices over a network, such as the Internet, via a video distribution subsystem 160.
- the video distribution subsystem 160 includes any number of applications that operate on the video streams.
- the video distribution subsystem 160 may implement a user interface that enables users to select video sequences based on a variety of criteria. More specifically, for each video sequence, the user interface may provide information such as genre, actors, title, video length and resolution.
- the video distribution subsystem 160 may include applications, such as encoders, that are optimized for real-time delivery of video streams based on a variety of criteria, including the resolution of the video sequence.
- the resolution of a particular video sequence may dramatically impact, among other things, the visual quality of the video sequence and the efficiency with which applications operate on the video sequence.
- the ostensible resolution of a video sequence, referred to herein as the "display" resolution, may not reflect the lowest resolution with which the video sequence has been processed and stored. For example, to comply with resource constraints, such as memory
- applications may implement down-sampling techniques that eliminate selected information included in the video sequence. Subsequently, other
- the conventional, display resolution is not necessarily a reliable indication of the visual quality as perceived when viewing a video sequence at full resolution.
- the lowest resolution with which a video sequence has been stored is typically indicative of the highest effective resolution with which the video sequence may be rendered and displayed and, therefore, the perceived visual quality.
- Because this "native" resolution reflects the amount of unique information included in the video sequence, fine-tuning resolution-sensitive applications based on the native resolution instead of the display resolution may improve the efficiency of such applications.
- the system 100 includes a native resolution analyzer 125 that extracts information from video sequences that relates to the characteristics of the video sequences, including whether any sampling operations may have been performed on the video sequences. Further, in some embodiments, the native resolution analyzer 125 deterministically computes the native resolution of video sequences. Among other things, the information obtained via the native resolution analyzer 125 may be used to correctly set visual quality expectations and optimize resolution-sensitive applications.
- the cloud 130 may be replaced with any type of cloud computing environment.
- the system 100 may include any distributed computer system instead of the cloud 130.
- the system 100 does not include the cloud 130 and, instead, the system 100 includes a single computing unit that implements any number of processing units (e.g., central processing units and/or graphical processing units in any combination).
- the system 100 does not include the video distribution subsystem 160.
- the system 100 includes a single desktop computer 102 that stores the native resolution analyzer 125 in a memory device and a processing unit that executes the native resolution analyzer 125.
- the desktop computer 102 in such embodiments may or may not be connected to any external systems, such as the cloud 130, and may or may not implement any other video processing applications.
- the native resolutions computed by the native resolution analyzer 125 may be used to "label" video streams to correctly set visual quality expectations and optimize resolution-sensitive
- Figure 2 is a more detailed illustration of the native resolution analyzer 125 of Figure 1 configured to process a video sequence 205, according to various
- the native resolution analyzer 125 includes, without limitation, a video preprocessor 210, a frame processor 240, a scene spectrum generator 260, a presentation unit 272, and a resolution compute engine 280.
- the video preprocessor 210 receives the video sequence 205 and performs one or more operations designed to extract meaningful, homogeneous regions from the video sequence 205.
- the video preprocessor 210 includes, without limitation, a black bar detector 212 and a scene change detector 214. The black bar detector 212 detects and subsequently removes any black horizontal bars and/or vertical bars that may be included in the video sequence 205.
- the black bar detector 212 may implement any technically feasible algorithm to detect and subsequently remove any detected black bars.
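Since the text leaves the black bar algorithm open, the following is only one possible sketch, not the method of the black bar detector 212. It assumes a single 8-bit luma plane and treats a row or column as a black bar when its brightest pixel does not exceed a hypothetical threshold of 16; the function name `crop_black_bars` is also an assumption.

```python
import numpy as np

def crop_black_bars(luma: np.ndarray, threshold: float = 16.0) -> np.ndarray:
    """Crop outer rows and columns whose brightest pixel is at or below `threshold`."""
    active_rows = np.flatnonzero(luma.max(axis=1) > threshold)
    active_cols = np.flatnonzero(luma.max(axis=0) > threshold)
    if active_rows.size == 0 or active_cols.size == 0:
        return luma  # entirely dark frame: nothing sensible to crop
    return luma[active_rows[0]:active_rows[-1] + 1,
                active_cols[0]:active_cols[-1] + 1]

# Letterboxed toy frame: two black rows above and below the content.
frame = np.zeros((10, 12))
frame[2:8, :] = 100.0
cropped = crop_black_bars(frame)
print(cropped.shape)  # prints (6, 12)
```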
- the scene change detector 214 is configured to partition the video sequence 205 into one or more scenes 220.
- Each of the scenes 220 includes a sequence of one or more consecutive frames 230 that possess similar texture, luminance, and contrast characteristics. Because each of the scenes 220 may be generated and/or modified in a different fashion, the native resolution analyzer 125 determines the native resolution of each of the scenes 220 independently of the other scenes 220.
- Suppose that a video sequence 205 had a display resolution of 2160p and included the two scenes 220(1) and 220(2). Further, suppose that the scene 220(1) was recorded using a 1080p camera and then up-sampled to 2160p, while the scene 220(2) was recorded using a 2160p camera. In such a scenario, the scene change detector 214 would independently process the scenes 220(1) and 220(2), compute a native resolution of 1080p for the scene 220(1), and compute a native resolution of 2160p for the scene 220(2).
- the scene change detector 214 may implement any technically feasible algorithm for detecting and extracting the scenes 220 from the video sequence 205.
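The scene change algorithm is likewise left open. As a rough illustration only, the sketch below cuts a sequence wherever the mean absolute luma difference between neighboring frames exceeds a threshold. The function name, the difference measure, and the threshold value of 30 are all assumptions made for this example.

```python
import numpy as np

def split_into_scenes(frames, threshold=30.0):
    """Group consecutive frame indices into scenes.

    A new scene starts when the mean absolute difference between a frame
    and its predecessor exceeds `threshold`.
    """
    if len(frames) == 0:
        return []
    scenes, current = [], [0]
    for i in range(1, len(frames)):
        diff = np.mean(np.abs(frames[i].astype(float) - frames[i - 1].astype(float)))
        if diff > threshold:
            scenes.append(current)
            current = []
        current.append(i)
    scenes.append(current)
    return scenes

dark = [np.full((4, 4), 20.0) for _ in range(3)]
bright = [np.full((4, 4), 200.0) for _ in range(2)]
scenes = split_into_scenes(dark + bright)
print(scenes)  # prints [[0, 1, 2], [3, 4]]
```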
- the video preprocessor 210 may execute the black bar detector 212 and the scene change detector 214 in any order. Further, in alternate embodiments, the video preprocessor 210 may implement any number of additional preprocessing techniques designed to extract meaningful, homogeneous sequences of frames from the video sequence 205.
- each of the frames 230 includes three color components: a Y component 232, a Cb component 234, and a Cr component 236. Since each of the color components exhibits inherently different spatial resolutions, the native resolution analyzer 125 is configured to process each of the Y component 232, the Cb component 234, and the Cr component 236 independently. More specifically, the native resolution analyzer 125 is configured to discard the Cb component 234 and the Cr component 236, and then determine the native resolution based on the Y component 232.
- the native resolution analyzer 125 may discard any number, including zero, of the three color components and determine the native resolution based on the remaining components. Further, in various embodiments, the techniques described herein may be applied to any color format, including, without limitation, all Y/Cb/Cr formats (e.g., YUV420, YUV422, YUV444) as well as all RGB formats (e.g., RGB24).
- the frame processor 240 is configured to operate on each of the frames 230 separately. Upon receiving the Y component 232 associated with the frame 230, the frame processor generates a log-magnitude frame spectrum 250.
- the log-magnitude frame spectrum 250 is a two-dimensional spectrum derived from the Y component 232, expressing the magnitude at every frequency in decibels (dB). Note that the spectral component of the log-magnitude frame spectrum 250 corresponding to a horizontal digital frequency of 0 and a vertical digital frequency of 0 is referred to herein as the "DC component," and the remaining spectral components are collectively referred to herein as the "AC components."
- To generate the log-magnitude frame spectrum 250, the frame processor 240 first performs a Discrete Fourier Transform (DFT) on the Y component 232.
- the frame processor 240 may perform the DFT in any technically feasible fashion.
- the frame processor 240 may implement a fast version of the DFT, known as the Fast Fourier Transform (FFT). Because the resolution analysis is based on energies and, therefore, the phase information is irrelevant, the frame processor 240 retains the magnitudes of the DFT complex-value spectra and discards the phases.
- the frame processor 240 may perform the DFT to calculate complex coefficients for each frequency.
- the frame processor 240 may then convert each of the complex coefficients from Cartesian coordinates (real + j * imaginary) to polar coordinates (
- the physical size of surrounding objects e.g., trees, walls, mountains, etc.
- the spectra of the frames 230 exhibit a natural preference towards low frequencies. Such a preference is reflected in the normalized frame spectra. More specifically, in the normalized frame spectra, the magnitudes of the DFT spectra of low frequency components are oftentimes significantly larger than the magnitudes of the DFT spectra of high frequency components. To attenuate the magnitudes of the low frequency
- the frame processor 240 logarithmically scales the normalized magnitude spectrum, thereby generating the log-magnitude frame spectrum 250.
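The transform, normalization, and logarithmic scaling steps can be sketched with NumPy as follows. The normalization details are not fully recoverable from the text here, so this sketch assumes that magnitudes are divided by the DC magnitude (so that the DC component maps to 0 dB), that decibels are computed as 20·log10, and that magnitudes are floored at 1e-8 before the logarithm. The function name is hypothetical.

```python
import numpy as np

def log_magnitude_spectrum(luma: np.ndarray, floor: float = 1e-8) -> np.ndarray:
    """Return a 2D log-magnitude spectrum in dB with the DC component centered."""
    # 2D DFT computed with the FFT; fftshift moves the DC bin to the center.
    spectrum = np.fft.fftshift(np.fft.fft2(luma.astype(float)))
    magnitude = np.abs(spectrum)  # keep the magnitudes, discard the phases
    dc = magnitude[luma.shape[0] // 2, luma.shape[1] // 2]
    # Assumption: normalize so that the DC magnitude equals 1.
    normalized = magnitude / dc if dc > 0 else magnitude
    # Assumption: floor tiny magnitudes to avoid log(0), then convert to dB.
    return 20.0 * np.log10(np.maximum(normalized, floor))

rng = np.random.default_rng(1)
frame = rng.random((16, 16)) + 0.1  # strictly positive toy luma plane
spec = log_magnitude_spectrum(frame)
print(spec.shape, spec[8, 8])
```

For a non-negative frame the DC magnitude is the largest in the spectrum, so every AC component ends up at or below 0 dB under this normalization.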
- For each of the scenes 220, after generating the log-magnitude frame spectra 250 for all the frames 230 included in the scene 220, the scene spectrum generator 260 performs averaging operations that produce a scene spectrum 270. In operation, if the video sequence 205 includes "N" scenes 220, then the scene spectrum generator 260 generates N scene spectra 270(1)-270(N) corresponding to the Y components 232 associated with the N scenes 220. Similarly, in embodiments where the native resolution analyzer 125 retains and operates on the Y component 232, the Cb component 234, and the Cr component 236, the scene spectrum generator 260 generates 3xN scene spectra 270(1)-270(3xN). In general, each of the scene spectra 270 associated with each of the color components comprises a single, real-valued, two-dimensional array that represents the frequency components of the corresponding scene 220.
- the presentation unit 272, upon receiving the scene spectrum 270, converts the scene spectrum 270 to a gray-level scene spectrum 275.
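A minimal sketch of the averaging step, assuming that "averaging operations" means an element-wise arithmetic mean over same-sized frame spectra (the function name `scene_spectrum` is hypothetical):

```python
import numpy as np

def scene_spectrum(frame_spectra):
    """Average a list of same-sized 2D frame spectra element-wise."""
    return np.mean(np.stack(frame_spectra, axis=0), axis=0)

# Two toy frame spectra with constant values of -10 dB and -30 dB.
spectra = [np.full((4, 4), -10.0), np.full((4, 4), -30.0)]
avg = scene_spectrum(spectra)
print(avg[0, 0])  # prints -20.0
```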
- the scene spectrum generator 260 maps the values included in the scene spectrum 270 to a range of gray-scale values that ranges from 0 to 255, where 0 is the lowest gray-scale value and 255 is the highest gray-scale value.
- the gray-level scene spectrum 275 represents magnitudes of less than 10^-8 as 0, magnitudes of 1 as 255, and so forth. Accordingly, white pixel values in the gray-level scene spectrum 275 indicate high spectral components and black pixel values indicate low or zero spectral components.
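One way to realize such a mapping is sketched below. It assumes the scene spectrum is held in dB, that a magnitude of 1e-8 corresponds to -160 dB (20·log10), and that values in between are mapped linearly in the dB domain; the text does not state the interpolation, so the linear mapping and the function name are assumptions.

```python
import numpy as np

def to_gray_level(scene_spectrum_db: np.ndarray, floor_db: float = -160.0) -> np.ndarray:
    """Map dB values in [floor_db, 0] linearly onto gray levels 0..255."""
    clipped = np.clip(scene_spectrum_db, floor_db, 0.0)
    scaled = (clipped - floor_db) / (0.0 - floor_db) * 255.0
    return np.round(scaled).astype(np.uint8)

# Below the floor, at the floor, and at 0 dB (a magnitude of 1).
gray = to_gray_level(np.array([-200.0, -160.0, 0.0]))
print(gray.tolist())  # prints [0, 0, 255]
```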
- the gray-level scene spectrum 275 may be used as a frequency "signature" for the corresponding scene 220. Notably, if no re-sampling operations have been performed on the frames 230 included in the scene 220, then the gray-level scene spectrum 275 exhibits a "healthy," naturally decaying gray-scale gradient with no abrupt changes. If, however, re-sampling (e.g., up-sampling and/or down-sampling) operations have been performed on the frames 230, then the gray-level scene spectrum 275 typically demonstrates certain patterns that indicate the type of re-sampling operations performed. Consequently, among other things, the scene spectra may be used to analyze the quality and characteristics of the scenes 220 and/or the video sequence 205 that includes the scenes 220.
- pattern recognition operations may be used to extract valuable information related to the characteristics of the natural scene shot and captured, the camera used to shoot and capture the natural scene, and the down-sampling operations (if any) implemented between capturing the natural scene and rendering the final video sequence 205. Further, pattern recognition operations may identify camera-inherent resolution limitation due to Bayer pattern sampling, special effects overlaying in lower resolutions, and color interpolation implemented to achieve YUV422 format compliance.
- any technically feasible approach or techniques for classifying the scene spectra 270 may be implemented, and all such implementations fall within the scope of the present invention.
- computer vision-based or other similar techniques may be implemented to recognize "square" objects through edge detection (or other similar approach) using any known algorithm.
- pattern recognition operations may be performed manually by visually inspecting any number of the scene spectra 270.
- the native resolution analyzer 125 also includes the resolution compute engine 280.
- the resolution compute engine 280 receives the scene spectrum 270 for the scene 220 and automatically and deterministically computes a native resolution 290 of the scene 220. Notably, the resolution compute engine 280 relies on a sharp fall in the values included in the scene spectrum 270, which is characteristic of up-sampling operations, to determine the native resolution 290.
- Upon receiving the two-dimensional (2D) scene spectrum 270, the resolution compute engine 280 projects the scene spectrum 270 along rows and then folds the resulting one-dimensional (1D) spectrum to generate a 1D horizontal spectrum.
- the resolution compute engine 280 projects the scene spectrum along columns and then folds the resulting one-dimensional (1D) spectrum to generate a 1D vertical spectrum.
- the 1 D horizontal spectrum and the 1 D vertical spectrum indicate relative amounts of energy
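The projection and folding steps can be sketched as follows. This is one interpretation, not a confirmed implementation: it assumes a DC-centered spectrum of non-negative values, takes "projecting along rows" to mean summing over the vertical axis so that the result is indexed by horizontal frequency, and takes "folding" to mean adding each negative-frequency bin to its positive-frequency counterpart. The function names are hypothetical.

```python
import numpy as np

def fold(profile: np.ndarray) -> np.ndarray:
    """Fold a DC-centered 1D profile so that index k holds frequency |k|."""
    center = profile.size // 2          # index of the DC bin after fftshift
    folded = np.zeros(center + 1)
    positive = profile[center:]         # k = 0, 1, 2, ...
    negative = profile[:center][::-1]   # k = 1, 2, 3, ...
    folded[:positive.size] += positive
    folded[1:1 + negative.size] += negative
    return folded

def horizontal_and_vertical_spectra(scene_spectrum: np.ndarray):
    """Project a 2D spectrum onto each axis, then fold each projection."""
    horizontal = fold(scene_spectrum.sum(axis=0))  # indexed by horizontal frequency
    vertical = fold(scene_spectrum.sum(axis=1))    # indexed by vertical frequency
    return horizontal, vertical

toy = np.arange(12.0).reshape(3, 4)
h_spec, v_spec = horizontal_and_vertical_spectra(toy)
print(h_spec.tolist(), v_spec.tolist())  # prints [18.0, 36.0, 12.0] [22.0, 44.0]
```

Folding preserves the total: each folded spectrum sums to the same value as the 2D input.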
- the resolution compute engine 280 integrates the 1D horizontal spectrum to produce a cumulative distribution function (CDF) of energies, also referred to herein as the horizontal distribution function and the horizontal CDF.
- the resolution compute engine 280 integrates the 1D vertical spectrum to produce a cumulative distribution function (CDF) of energies, referred to herein as the vertical distribution function and the vertical CDF. For each of the CDFs, the resolution compute engine 280 performs a variety of curve fitting operations designed to produce a best-fitting two segment line.
- the resolution compute engine 280 may implement any technically feasible techniques to generate the two segment line. For example, in some embodiments, to generate a best-fitting two segment line for a particular CDF, the resolution compute engine 280 selects a one segment line that extends from one end point of the CDF to the other end point of the CDF.
- the resolution compute engine 280 then computes the area between the CDF and the one segment line, referred to herein as the "one segment area-under-curve." Subsequently, the resolution compute engine 280 creates a two segment line that extends from one end point of the CDF to the other end point of the CDF, where the two segments intersect at a "knee point." The resolution compute engine 280 optimizes the location of the knee point to minimize the area between the CDF and the two segment line, referred to herein as the "two segment area-under-curve." The resolution compute engine 280 then divides the one segment area-under-curve by the two segment area-under-curve, thereby computing a quality fit factor for the two segment line.
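The curve fitting described above can be sketched with an exhaustive search. Several choices here are assumptions rather than details from the text: the CDF is formed with a cumulative sum, the knee point is restricted to lie on the CDF at an interior sample, areas are approximated by summing absolute differences at unit bin spacing, and a small `eps` keeps the ratio finite when an area is zero. The function name is hypothetical.

```python
import numpy as np

def two_segment_fit(spectrum_1d, eps: float = 1e-12):
    """Return (knee_index, quality_fit_factor) for the CDF of a 1D spectrum."""
    cdf = np.cumsum(np.asarray(spectrum_1d, dtype=float))
    if cdf.size < 3:
        raise ValueError("need at least three samples to place a knee point")
    x = np.arange(cdf.size)

    # One segment line from the first to the last point of the CDF.
    one_segment = np.interp(x, [x[0], x[-1]], [cdf[0], cdf[-1]])
    one_area = float(np.sum(np.abs(cdf - one_segment)))

    # Try every interior sample as the knee and keep the best one.
    best_knee, best_area = None, np.inf
    for k in range(1, cdf.size - 1):
        line = np.interp(x, [x[0], x[k], x[-1]], [cdf[0], cdf[k], cdf[-1]])
        area = float(np.sum(np.abs(cdf - line)))
        if area < best_area:
            best_knee, best_area = k, area

    quality = (one_area + eps) / (best_area + eps)
    return best_knee, quality

# A spectrum with a sharp fall after bin 4, and a flat spectrum for comparison.
knee_sharp, q_sharp = two_segment_fit([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
knee_flat, q_flat = two_segment_fit([1] * 10)
print(knee_sharp, q_sharp > q_flat)
```

A large quality fit factor means the two segment line fits far better than the one segment line, which is the situation a sharp fall in the spectrum produces.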
- the resolution compute engine 280 computes a horizontal knee point and a horizontal quality fit factor.
- the resolution compute engine 280 computes a vertical knee point and a vertical quality fit factor.
- If the quality fit factors are less than a quality threshold, then the resolution compute engine 280 determines that the scene spectrum 270 represents a naturally decaying spectrum and the corresponding scene 220 has not been subjected to any up-sampling operations. Consequently, the resolution compute engine 280 sets the native resolution equal to the display resolution and ceases operation. By contrast, if the quality fit factors are not less than the quality threshold, then the resolution compute engine 280 determines that the corresponding scene 220 may have been subjected to one or more up-sampling operations. Accordingly, the resolution compute engine 280 continues to analyze the scene spectrum 270 in conjunction with the CDFs to determine the native resolution.
- the quality threshold may be set in any technically feasible fashion that is consistent with the
- Based on the knee points, the resolution compute engine 280 generates a "low frequency rectangle." More specifically, the resolution compute engine 280 identifies a low frequency rectangular region included in the scene spectrum 270 that is centered on the DC frequency. This low frequency rectangular region has a width equal to twice the spatial frequency of the horizontal knee point and a height equal to twice the spatial frequency of the vertical knee point. The resolution compute engine 280 also generates a bounding box that represents a larger rectangular region included in the scene spectrum 270. The bounding box is centered on the DC frequency with a width equal to the final horizontal resolution and a height equal to the final vertical resolution. Note that the resolution compute engine 280 adjusts the size of the bounding box to reflect the removal of any black bars by the video preprocessor 210.
- the resolution compute engine 280 computes a low frequency energy density as the AC energy associated with the low frequency rectangle divided by the area of the low frequency rectangle.
- the AC energy associated with the low frequency rectangle is the sum of the magnitudes of the AC components included in the scene spectrum 270 that lie within the low frequency rectangle.
- the resolution compute engine 280 defines a high frequency region as the region that lies outside the low frequency rectangle but within the bounding box.
- the resolution compute engine 280 computes a high frequency energy density as the AC energy associated with the high frequency region divided by the area of the high frequency region.
- the resolution compute engine 280 computes the ratio of the low frequency energy density to the high frequency energy density, and uses this ratio to determine the native resolution associated with the scene spectrum 270.
- the higher the "energy density ratio" between the low frequency energy density and the high frequency energy density, the more likely the energy of the frames 230 included in the scene 220 is concentrated in low-frequency components. Since such a concentration is indicative of up-sampling operations, the resolution compute engine 280 compares the energy density ratio to a predetermined energy density threshold to determine whether up-sampling operations have been performed on the scene 220.
- if the energy density ratio is less than the energy density threshold, then the resolution compute engine 280 determines that the scene spectrum 270 represents a naturally decaying spectrum and up-sampling operations have not been performed on the corresponding scene 220. Consequently, the resolution compute engine 280 sets the native resolution equal to the display resolution and ceases operation. By contrast, if the energy density ratio is not less than the energy density threshold, then the resolution compute engine 280 determines that one or more up-sampling operations have been performed on the corresponding scene 220. Further, the resolution compute engine 280 determines that the native resolution of the scene 220 is equal to the dimensions of the low frequency rectangle.
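The energy density comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the scene spectrum is assumed to be a small 2D list of non-negative magnitudes with the DC component at the center, the whole array stands in for the bounding box, and the names `energy_density_ratio`, `half_w`, and `half_h` are hypothetical.

```python
def energy_density_ratio(spectrum, half_w, half_h):
    """Ratio of low frequency to high frequency energy density.

    spectrum: 2D list of non-negative magnitudes, DC at the center (assumption).
    half_w, half_h: knee frequencies in bins; bins within these distances of
    DC form the low frequency rectangle. The full array is the bounding box.
    """
    rows, cols = len(spectrum), len(spectrum[0])
    cy, cx = rows // 2, cols // 2
    low_sum = high_sum = 0.0
    low_area = high_area = 0
    for y in range(rows):
        for x in range(cols):
            if (y, x) == (cy, cx):
                continue  # skip the DC component; only AC energy counts
            if abs(x - cx) <= half_w and abs(y - cy) <= half_h:
                low_sum += spectrum[y][x]
                low_area += 1
            else:
                high_sum += spectrum[y][x]
                high_area += 1
    if low_area == 0 or high_area == 0 or high_sum == 0:
        raise ValueError("both regions must be non-empty with non-zero high frequency energy")
    return (low_sum / low_area) / (high_sum / high_area)


def is_up_sampled(ratio, threshold):
    # A ratio that is not less than the threshold is treated as up-sampled.
    return ratio >= threshold
```

A spectrum whose energy is concentrated near DC yields a large ratio, which `is_up_sampled` then compares against the chosen threshold.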
- while the Y component is isolated and used in certain implementations, the Cb component or the Cr component may be isolated and used in other implementations. The same holds equally true for implementations involving RGB formats.
- the different thresholds discussed herein are exemplary only and are not intended to limit the scope of the present invention.
- the threshold used to detect black lines can be selected through any type of testing or machine-learning technique or can be user-programmable and based on any number of factors, including and without limitation, familiarity with the
- the quality and energy density thresholds for categorizing a given scene as "up-sampled" or not are tunable parameters that may be refined over time or determined through statistical analysis of the video sequence being analyzed.
- Figure 3 is an example of the gray-scale scene spectra 275 that the native resolution analyzer 125 of Figure 2 is configured to generate based on a given scene 220 within the video sequence 205, according to various embodiments of the present invention.
- the native resolution matches the display resolution of the scene 220 depicted in Figure 3.
- the scene 220 includes images of a red parrot 332 and a multicolored parrot 334 along with background surrounding images.
- for the scene 220, the native resolution analyzer 125 generates three separate gray-scale scene spectra 275: the gray-scale scene spectra 275 of the Y component 232, the gray-scale scene spectra 275 of the Cb component 234, and the gray-scale scene spectra 275 of the Cr component 236.
- the gray-scale scene spectra 275 include multiple identical regions, referred to herein as tiles.
- the gray-scale scene spectra 275 depicted in Figure 3 demonstrate gradually decaying gray-scale gradients across each tile as the frequencies increase both horizontally and vertically from the DC spectral frequency at the center of each tile.
- Such a "healthy" distribution indicates that no up-sampling operations have been performed on the scene 220.
- the gray-scale scene spectra 275 lack the sharp suppression of high frequencies that is characteristic of up-sampling operations.
- Figure 4 is an example of the gray-scale scene spectra 275 that the native resolution analyzer 125 of Figure 2 is configured to generate based on multiple scenes 220 within the video sequence 205, according to various other embodiments of the present invention. Notably, the native resolutions do not match the display resolutions of the scenes 220 depicted in Figure 4.
- the scene 220 that includes the red parrot 332 and the multicolored parrot 334 has a native resolution that equals the display resolution of 1920 x 1080.
- the scene 220 is referred to as the scene 220(a).
- the scene 220(a) is down-sampled to 960 x 540 and then up-sampled to 1920 x 1080 using seven different combinations of sampling techniques to generate the seven scenes 220(b)-220(h):
- each of the scenes 220(b)-(h) has a native resolution of 960 x 540, but a display resolution of 1920 x 1080.
- the native resolution analyzer 125 generates the gray-scale scene spectra 275 for the Y component 232 associated with the scenes 220(a)-(h), thereby generating eight different gray-scale scene spectra 275(a)-(h).
- the gray-scale scene spectrum 275(a) corresponding to the original scene 220(a) demonstrates a healthy distribution of spectral components across each tile.
- each of the gray-scale scene spectra 275(b)-275(h) resembles the gray-scale scene spectrum 275 at low
- each of the gray-scale scene spectra 275(b)-275(h) demonstrates high frequency suppression with attenuated spectral replication horizontally and/or vertically.
- the pattern of high frequency suppression correlates to the type of the up-sampling operation performed on the processed version of the scene 220.
- the size of the healthy region surrounding the DC component that does not exhibit pronounced attenuation is indicative of the native resolution of the processed version of the scene 220.
- Figure 5 is a flow diagram of method steps for deriving scene spectra from a video sequence, according to various embodiments of the present invention.
- a method 500 begins at step 502, where the native resolution analyzer 125 receives the video sequence 205.
- the video preprocessor 210 identifies and subsequently removes any black bars included in the video sequence 205.
- the video preprocessor 210 then partitions the video sequence 205 into the scenes 220.
- the native resolution analyzer 125 selects the first scene 220 and the first frame 230 included in the first scene 220.
- the frame processor 240 performs a Discrete Fourier Transform (DFT) on each of the color components included in the selected frame 230. More specifically, the frame processor 240 performs three DFTs: a DFT on the Y
- as part of step 508, because phase information is not relevant to the resolution analysis, the frame processor 240 discards the phases but retains the magnitudes of the DFT spectra.
- the frame processor 240 performs normalization and scaling operations on each of the three DFT spectra (corresponding to the three color components). More precisely, the frame processor 240 normalizes the magnitude of each of the DFT spectra such that the total AC energy is one. The frame processor 240 then logarithmically scales each of the normalized magnitude spectra, thereby generating the three log-magnitude frame spectra 250.
- the native resolution analyzer 125 determines whether the selected frame 230 is the last frame 230 included in the selected scene 220. If, at step 512, the native resolution analyzer 125 determines that the selected frame 230 is not the last frame 230 included in the selected scene 220, then the native resolution analyzer 125 proceeds to step 514. At step 514, the frame processor 240 selects the next frame 230 included in the selected scene 220. The method 500 then returns to step 508, where the frame processor 240 generates the three log-magnitude frame spectra 250 for the selected frame 230.
- the native resolution analyzer 125 continues to cycle through steps 508-514, generating three log-magnitude frame spectra 250 for each of the frames 230 included in the selected scene 220, until the frame processor 240 generates the log-magnitude frame spectra 250 for the last frame 230 included in the selected scene 220.
- if, at step 512, the native resolution analyzer 125 determines that the selected frame 230 is the last frame 230 included in the selected scene 220, then the method 500 proceeds directly to step 516.
- at step 516, the scene spectrum generator 260 performs averaging operations that produce three scene spectra 270 associated with the selected scene 220.
- the scene spectrum generator 260 averages the log-magnitude frame spectra 250 of the Y components 232 for all the frames 230 included in the selected scene 220 to create the scene spectrum 270 of the Y component 232 associated with the selected scene 220.
- the scene spectrum generator 260 averages the log-magnitude frame spectra 250 of the Cb components 234 for all the frames 230 included in the selected scene 220 to create the scene spectrum 270 of the Cb component 234 associated with the selected scene 220.
- the scene spectrum generator 260 averages the log-magnitude frame spectra 250 of the Cr components 236 for all the frames 230 included in the selected scene 220 to create the scene spectrum 270 of the Cr component 236 associated with the selected scene 220.
- the native resolution analyzer 125 determines whether the selected scene 220 is the last scene 220 included in the video sequence 205. If, at step 518, the native resolution analyzer 125 determines that the selected scene 220 is not the last scene 220 included in the video sequence 205, then the frame processor 240 proceeds to step 520. At step 520, the frame processor 240 selects the next scene 220 included in the video sequence 205 and the first frame 230 included in the next scene 220. The method 500 then returns to step 508, where the frame processor 240 generates the three log-magnitude frame spectra 250 for the selected frame 230.
- the native resolution analyzer 125 continues to cycle through steps 508-518, generating three scene spectra 270 for each of the scenes 220 included in the video sequence 205, until the frame processor 240 generates the scene spectra 270 for the last scene 220 included in the video sequence 205.
- the method 500 then terminates.
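The per-frame spectrum and per-scene averaging steps of the method 500 can be sketched as follows for a single color component. This is an illustrative sketch only: it uses a naive DFT that is practical only for tiny frames, it assumes a base-10 logarithm and a small offset `eps` to avoid taking the logarithm of zero, it leaves the DC component at index `[0][0]`, and it assumes every frame has some non-zero AC content. The function names are hypothetical.

```python
import cmath
import math


def dft2_magnitude(frame):
    """Magnitudes of the 2D DFT of a small 2D list (naive summation)."""
    rows, cols = len(frame), len(frame[0])
    mags = []
    for v in range(rows):
        row = []
        for u in range(cols):
            acc = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2 * math.pi * (v * y / rows + u * x / cols)
                    acc += frame[y][x] * cmath.exp(1j * angle)
            row.append(abs(acc))  # the phase is discarded, the magnitude kept
        mags.append(row)
    return mags


def log_magnitude_spectrum(frame, eps=1e-12):
    """Normalize so the AC magnitudes sum to one, then log-scale."""
    mags = dft2_magnitude(frame)
    ac_total = sum(map(sum, mags)) - mags[0][0]  # DC sits at index [0][0]
    return [[math.log10(m / ac_total + eps) for m in row] for row in mags]


def scene_spectrum(frames):
    """Average the log-magnitude spectra of all frames in one scene."""
    spectra = [log_magnitude_spectrum(f) for f in frames]
    n = len(spectra)
    rows, cols = len(spectra[0]), len(spectra[0][0])
    return [[sum(s[y][x] for s in spectra) / n for x in range(cols)]
            for y in range(rows)]
```

In practice a fast Fourier transform routine would replace `dft2_magnitude`, and the same three functions would be applied separately to the Y, Cb, and Cr components.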
- Figure 6 shows examples of a horizontal knee point 628 and a vertical knee point 638 associated with the scene spectrum 270 that may be computed by the resolution compute engine 280 of Figure 2, according to various embodiments of the present invention. As described in conjunction with Figure 2, as part of determining the native resolution of the scene 220, the resolution compute engine 280 determines the horizontal knee point 628 and the vertical knee point 638 based on the two-dimensional (2D) scene spectrum 270.
- the scene spectrum 270 may be derived from any one of the Y component 232, the Cb component 234, and the Cr component 236 of the frames 230 included in the scene 220. Further, the resolution compute engine 280 may be configured to independently determine the native resolution 290 based on any number of the Y component 232, the Cb component 234, and the Cr component 236.
- the resolution compute engine 280 projects the scene spectrum 270 along rows to generate a horizontal spectrum 610 that indicates relative amounts of energy (logarithmically scaled) in the spatial frequency domain. Subsequently, the resolution compute engine 280 integrates the horizontal spectrum 610 to produce a horizontal cumulative distribution function (CDF) 622. The resolution compute engine 280 then performs a horizontal knee point fit 620. More specifically, as shown, the resolution compute engine 280 generates a one segment line 624 that approximates the horizontal CDF 622 as well as a two segment line 626 that approximates the horizontal CDF 622.
- the resolution compute engine 280 sets the horizontal knee point 628 to the point at which the two segments included in the two segment line 626 meet. Notably, as shown, the spectral frequency of the horizontal knee point 628 is 640.
- the resolution compute engine 280 projects the scene spectrum 270 along columns to generate a vertical spectrum (not shown) that indicates relative amounts of energy (logarithmically scaled) in the spatial frequency domain. Subsequently, the resolution compute engine 280 integrates the vertical spectrum to produce a vertical cumulative distribution function (CDF) 632. The resolution compute engine 280 then performs a vertical knee point fit 630. More specifically, as shown, the resolution compute engine 280 generates a one segment line 634 that approximates the vertical CDF 632 as well as a two segment line 636 that approximates the vertical CDF 632.
- after determining the one segment line 634 and the two segment line 636 that approximate the vertical CDF 632, the resolution compute engine 280 sets the vertical knee point 638 to the point at which the two segments included in the two segment line 636 meet. Notably, as shown, the spectral frequency of the vertical knee point 638 is 370.
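The projection, integration, and knee point fit described above can be sketched as follows. The fitting method is an assumption chosen for simplicity, since the text allows any technically feasible curve fit: the two segment line is pinned to the first and last CDF samples, the knee is constrained to lie on the CDF, and the knee index that minimizes the summed absolute deviation is selected. Projecting "along rows" is taken here to mean summing over rows so that one value remains per horizontal frequency. All function names are hypothetical.

```python
def project_along_rows(spectrum):
    """Sum each column so one value remains per horizontal frequency."""
    return [sum(column) for column in zip(*spectrum)]


def cumulative(values):
    """Running sum, i.e. a discrete (unnormalized) CDF."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


def interpolate(x, x0, y0, x1, y1):
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def two_segment_error(cdf, k):
    """Summed absolute deviation of a two segment line with its knee at k."""
    last = len(cdf) - 1
    err = 0.0
    for i, c in enumerate(cdf):
        if i <= k:
            fit = interpolate(i, 0, cdf[0], k, cdf[k])
        else:
            fit = interpolate(i, k, cdf[k], last, cdf[last])
        err += abs(c - fit)
    return err


def knee_point(cdf):
    """Index where the two segments meet; needs at least three samples."""
    candidates = range(1, len(cdf) - 1)
    return min(candidates, key=lambda k: two_segment_error(cdf, k))
```

Applying the same functions to the transposed spectrum gives the vertical knee point.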
- Figure 7 is an example of the native resolution 290 associated with the scene spectrum 270 that may be computed by the resolution compute engine 280 of Figure 2, according to various embodiments of the present invention. For explanatory purposes, the context of Figure 7 is that the resolution compute engine 280 has computed the horizontal knee point 628 and the vertical knee point 638 as illustrated in Figure 6.
- the spectral frequency of the horizontal knee point 628 is 640 and the spectral frequency of the vertical knee point 638 is 370.
- the display resolution of the scene 220 associated with the scene spectrum 270 is 4096 x 2160.
- based on the horizontal knee point 628 and the vertical knee point 638, the resolution compute engine 280 performs resolution calculations 730. First, as described in conjunction with Figure 2, the resolution compute engine 280 uses the horizontal knee point 628 and the vertical knee point 638 to identify a low frequency rectangle 720.
- the low frequency rectangle 720 is centered at the DC frequency with a width equal to twice the spatial frequency of the horizontal knee point 628 and a height equal to twice the spatial frequency of the vertical knee point 638.
- the resolution compute engine 280 sets the dimensions of the low frequency rectangle 720 to 1280 x 720.
- the resolution compute engine 280 also generates a bounding box 710 that is centered at the DC frequency with a width equal to the final horizontal resolution (4096) and a height equal to the final vertical resolution (2160).
- the resolution compute engine 280 computes a low frequency energy density (LFD) as the sum of the magnitudes of the AC components included in the scene spectrum 270 that lie within the low frequency rectangle 720 divided by the area of the low frequency rectangle 720.
- the resolution compute engine 280 computes a high frequency energy density (HFD) based on the subset of the scene spectrum 270 that lies outside the low frequency rectangle but within the bounding box 710.
- the resolution compute engine 280 then divides the LFD by the HFD to generate an energy density ratio.
- the resolution compute engine 280 determines that the energy density ratio of the scene spectrum 270 is 5.4.
- the likelihood that up-sampling operations have been performed on the frames 230 included in the scene 220 correlates to the energy density ratio.
- the resolution compute engine 280 implements an energy threshold to determine whether up-sampling operations have been performed on the scene spectrum 270.
- the value of the energy threshold implemented in the resolution compute engine 280 is 3.
- a value of 3 for the predetermined energy threshold reflects experimental results that indicate that high frequencies that are more than three orders of magnitude smaller than low frequencies are indicative of up-sampling.
- the energy threshold may be determined in any technically feasible fashion, based on any type of information, and may be any value.
- the energy threshold may be set based on user input.
- because the energy density ratio of 5.4 exceeds the energy threshold of 3, the resolution compute engine 280 determines that the native resolution 290 of the scene 220 is equal to the dimensions of the low frequency rectangle 720. Consequently, the resolution compute engine 280 determines that the native resolution 290 associated with the scene spectrum 270 is 1280 x 720.
- Figure 8 is a flow diagram of method steps for computing the native resolution of a scene within a video sequence, according to various embodiments of the present invention. Although the method steps are described with reference to the systems of Figures 1-4 and 6-7, persons skilled in the art will understand that any system configured to implement the method steps, in any order, falls within the scope of the present invention.
- a method 800 begins at step 802, where the resolution compute engine 280 receives the two-dimensional (2D) scene spectrum 270 associated with the scene 220 within the video sequence 205.
- the scene spectrum 270 may be generated in any technically feasible fashion.
- the native resolution analyzer could implement the method steps of Figure 5 to generate the scene spectrum 270.
- the resolution compute engine 280 projects the scene spectrum 270 along rows and folds the resulting one-dimensional (1D) spectrum to generate the 1D horizontal spectrum 610.
- the resolution compute engine 280 projects the scene spectrum 270 along columns and folds the resulting one-dimensional (1D) spectrum to generate a 1D vertical spectrum.
- the horizontal spectrum 610 and the vertical spectrum indicate relative amounts of energy (logarithmically scaled) in the spatial frequency domain.
- the resolution compute engine 280 integrates the horizontal spectrum 610 and the vertical spectrum to produce, respectively, the horizontal CDF 622 and the vertical CDF 632.
- for each of the horizontal CDF 622 and the vertical CDF 632, the resolution compute engine 280 generates a best-fit one segment line and a best-fit two segment line.
- the resolution compute engine 280 may perform the curve-fitting operations to generate the one segment lines and the two segment lines in any technically feasible fashion.
- the intersection of the two segments of the two segment line that approximates the horizontal CDF 622 defines the horizontal knee point 628, while the intersection of the two segments of the two segment line that approximates the vertical CDF 632 defines the vertical knee point 638.
- the resolution compute engine 280 computes a quality fit factor.
- the resolution compute engine 280 may compute the quality fit factor in any technically feasible fashion. For example, in some embodiments, to compute the quality fit factor for a particular two segment line, the resolution compute engine 280 computes the area between the CDF and the best-fit one segment line, referred to herein as the "one segment area-under-curve." The resolution compute engine 280 then computes the area between the CDF and the two segment line, referred to herein as the "two segment area-under-curve." Finally, the resolution compute engine 280 divides the one segment area-under-curve by the two segment area-under-curve to compute the quality fit factor for the two segment line.
- the resolution compute engine 280 compares the quality fit factors for the two segment lines that approximate the horizontal CDF 622 and the vertical CDF 632 to a predetermined quality threshold.
- the predetermined quality threshold may be determined in any technically feasible fashion. If, at step 812, the resolution compute engine 280 determines that both of the quality fit factors exceed the quality threshold, then the method 800 proceeds to step 814.
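The quality fit factor described above, the one segment area-under-curve divided by the two segment area-under-curve, can be sketched as follows. The sketch makes simplifying assumptions that the text does not mandate: both lines are pinned to the CDF samples at their end points rather than fitted by least squares, the area is approximated by a sum of absolute differences at the sample positions, and a perfect two segment fit is reported as infinity. The function names are hypothetical.

```python
import math


def piecewise_line(cdf, knots):
    """Linear interpolation through the CDF samples at the given indices."""
    fit = []
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b):
            fit.append(cdf[a] + (cdf[b] - cdf[a]) * (i - a) / (b - a))
    fit.append(cdf[knots[-1]])
    return fit


def area_between(cdf, fit):
    """Discrete approximation of the area between the CDF and a fitted line."""
    return sum(abs(c - f) for c, f in zip(cdf, fit))


def quality_fit_factor(cdf, knee):
    """One segment area divided by two segment area for a given knee index."""
    last = len(cdf) - 1
    one_seg = area_between(cdf, piecewise_line(cdf, [0, last]))
    two_seg = area_between(cdf, piecewise_line(cdf, [0, knee, last]))
    return math.inf if two_seg == 0 else one_seg / two_seg
```

A large factor means the two segment line tracks the CDF far better than a single line does, which is the situation in which the analysis continues past the quality threshold check.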
- the resolution compute engine 280 generates the low frequency rectangle 720 based on the horizontal knee point 628 and the vertical knee point 638, and the bounding box 710 based on the display resolution of the scene 220 associated with the scene spectrum 270.
- the resolution compute engine 280 centers both the low frequency rectangle 720 and the bounding box 710 at the DC component of the scene spectrum 270.
- the resolution compute engine 280 computes a low frequency energy density (LFD) based on the area within the low frequency rectangle 720.
- the resolution compute engine 280 computes a high frequency energy density (HFD) based on the area that is outside the low frequency rectangle 720, but within the bounding box 710.
- the resolution compute engine 280 divides the LFD by the HFD to compute an energy density ratio and then compares this energy density ratio to an energy density threshold. If, at step 820, the energy density ratio exceeds the energy density threshold, then the resolution compute engine 280 proceeds to step 822.
- the resolution compute engine 280 sets the native resolution 290 to match the dimensions of the low frequency rectangle 720, and the method 800 terminates.
- if, at step 820, the resolution compute engine 280 determines that the energy density ratio does not exceed the energy density threshold, then the method 800 proceeds directly to step 824.
- if, at step 812, the resolution compute engine 280 determines that one or more of the quality fit factors do not exceed the quality threshold, then the method 800 proceeds directly to step 824.
- the resolution compute engine 280 determines that the scene spectrum 270 represents a naturally decaying spectrum and no up-sampling operations have been performed on the associated scene 220. Consequently, the resolution compute engine 280 sets the native resolution equal to the display resolution of the scene 220 and the method 800 terminates.
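The branching of the method 800 at steps 812 and 820 can be summarized in a short sketch. It assumes the knee points, quality fit factors, and energy density ratio have already been computed, and the function name, parameter names, and the quality threshold value used in the example are hypothetical.

```python
def native_resolution(display, knees, fit_factors, density_ratio,
                      quality_threshold, density_threshold):
    """Return the native resolution as (width, height).

    display: display resolution as (width, height).
    knees: (horizontal, vertical) knee point spatial frequencies.
    fit_factors: quality fit factors for the horizontal and vertical fits.
    """
    # Step 812: every quality fit factor must exceed the quality threshold.
    if any(f <= quality_threshold for f in fit_factors):
        return display  # step 824: naturally decaying spectrum
    # Step 820: the energy density ratio must exceed its threshold.
    if density_ratio <= density_threshold:
        return display  # step 824: naturally decaying spectrum
    # Step 822: native resolution matches the low frequency rectangle.
    return (2 * knees[0], 2 * knees[1])


# Illustrative values only: a 1920 x 1080 scene with knees at 480 and 270.
result = native_resolution((1920, 1080), (480, 270), (5.0, 5.0), 5.4, 2.0, 3.0)
```

With these illustrative inputs both checks pass, so the result is the rectangle dimensions, twice each knee frequency.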
- the disclosed techniques may be used to determine the lowest resolutions with which scenes within a video sequence have been processed.
- a video preprocessor divides the video sequence into scenes, where the frames included in each scene exhibit relatively uniform texture, luminance, and contrast characteristics. For each of the frames, a frame processor performs Fast Fourier Transforms on each of the color components (Y, Cb, and Cr) and normalizes the magnitude of the resulting spectrum such that the total AC energy equals one. The frame processor then logarithmically scales the normalized frame spectrum to generate log-magnitude spectra for the Y, Cb, and Cr components of each frame.
- a scene spectrum generator then performs averaging operations that coalesce the log-magnitude spectra for the frames included in the scene into scene spectra. More specifically, for each scene, the scene spectrum generator generates a Y component scene spectrum, a Cb component scene spectrum, and a Cr component scene spectrum. Notably, if the scene spectrum has undergone re-sampling operations, then the scene spectrum demonstrates certain distinct and recognizable patterns. Consequently, persons skilled in the art may apply any technically feasible pattern recognition technique (including visual inspection) to the scene spectrum to detect whether the native resolution of the scene is less than the display resolution. Further, in some embodiments, a resolution compute engine automatically and deterministically computes the native resolution of a scene based on the scene spectrum.
- the resolution compute engine projects a scene spectrum along rows to create a one-dimensional horizontal spectrum and along columns to create a one-dimensional vertical spectrum.
- the resolution compute engine then individually integrates the horizontal and vertical spectra to create cumulative distribution functions (CDFs) of energies. Subsequently, the resolution compute engine performs best fit operations that approximate each of the cumulative distribution functions with a two segment line, where the spatial frequency at the intersection of the two segments defines a "knee point." Based on a horizontal knee point associated with the horizontal CDF and a vertical knee point associated with the vertical CDF, the resolution compute engine creates a low frequency rectangle.
- the resolution compute engine determines the total AC energy in the low frequency rectangle and in a high frequency region that lies outside the low frequency rectangle but within the bounding box defined by the sampling frequency associated with the display resolution. If the ratio of the low-frequency AC energy to the high-frequency AC energy exceeds a predetermined energy density threshold, then the resolution compute engine determines that the native resolution is lower than the display resolution. More specifically, the resolution compute engine determines that the horizontal native resolution is less than or equal to the value of the horizontal knee point and the vertical native resolution is less than or equal to the value of the vertical knee point.
- any visual quality degradation attributable to discrepancies between the native resolution and the display resolution may be detected.
- any number of quality assurance procedures may be
- the provider may clearly specify that the video sequence includes scenes that have been processed at native resolutions that are lower than the display resolution.
- operations that are performed on the video sequences and optimized based on the resolution, such as encoding, may be fine-tuned based on the native resolution instead of the display resolution. Oftentimes, such an adjustment may dramatically decrease resource burdens, such as storage and bandwidth usage, without noticeably degrading visual quality.
- aspects of the present embodiments may be embodied as a system, method or computer program product. Accordingly, aspects of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "circuit," "module" or "system." Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon. Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium.
- a computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
- a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
- processors may be, without limitation, general purpose processors, special-purpose processors, application-specific processors, or field-programmable processors or gate arrays.
- each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
- the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
Priority Applications (9)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DK16738923.8T DK3314898T3 (en) | 2015-06-24 | 2016-06-23 | DETERMINATION OF INITIAL RESOLUTIONS OF VIDEO SEQUENCES |
JP2017565971A JP6542398B2 (en) | 2015-06-24 | 2016-06-23 | Determining the native resolution of video sequences |
CN201680048105.1A CN107925779B (en) | 2015-06-24 | 2016-06-23 | Determining a native resolution of a video sequence |
AU2016284583A AU2016284583B2 (en) | 2015-06-24 | 2016-06-23 | Determining native resolutions of video sequences |
KR1020187001835A KR102016273B1 (en) | 2015-06-24 | 2016-06-23 | Determination of Inherent Resolutions of Video Sequences |
MX2017016933A MX2017016933A (en) | 2015-06-24 | 2016-06-23 | Determining native resolutions of video sequences. |
EP16738923.8A EP3314898B1 (en) | 2015-06-24 | 2016-06-23 | Determining native resolutions of video sequences |
CA2989430A CA2989430C (en) | 2015-06-24 | 2016-06-23 | Determining native resolutions of video sequences |
AU2019200481A AU2019200481A1 (en) | 2015-06-24 | 2019-01-24 | Determining native resolutions of video sequences |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201562184183P | 2015-06-24 | 2015-06-24 | |
US62/184,183 | 2015-06-24 | ||
US14/879,045 US9824278B2 (en) | 2015-06-24 | 2015-10-08 | Determining native resolutions of video sequences |
US14/879,053 US9734409B2 (en) | 2015-06-24 | 2015-10-08 | Determining native resolutions of video sequences |
US14/879,045 | 2015-10-08 | ||
US14/879,053 | 2015-10-08 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016210059A1 true WO2016210059A1 (en) | 2016-12-29 |
Family
ID=56411893
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2016/038907 WO2016210059A1 (en) | 2015-06-24 | 2016-06-23 | Determining native resolutions of video sequences |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2016210059A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3541080A1 (en) * | 2018-03-15 | 2019-09-18 | Vestel Elektronik Sanayi ve Ticaret A.S. | Method for determining a native resolution of a video content, method for creating a look-up table, device and non-volatile data carrier |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060274204A1 (en) * | 2005-06-01 | 2006-12-07 | Hitachi, Ltd. | Picture display system for adjusting image quality of a picture signal having higher number of scanning lines |
JP2009044417A (en) * | 2007-08-08 | 2009-02-26 | Sony Corp | Image discrimination device, image discrimination method, and program |
US20110019096A1 (en) * | 2009-07-21 | 2011-01-27 | Louie Lee | Method and system for detection and enhancement of video images |
-
2016
- 2016-06-23 WO PCT/US2016/038907 patent/WO2016210059A1/en active Application Filing
Non-Patent Citations (1)
Title |
---|
IOANNIS KATSAVOUNIDIS ET AL: "Native Resolution Detection of Video Sequences", SMPTE 2015 ANNUAL TECHNICAL CONFERENCE AND EXHIBITION, 29 October 2015 (2015-10-29), pages 1 - 20, XP055297552, ISBN: 978-1-61482-956-0, DOI: 10.5594/M001673 * |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US10140520B2 (en) | Determining native resolutions of video sequences | |
Ma et al. | Constant time weighted median filtering for stereo matching and beyond | |
CA3112265C (en) | Method and system for performing object detection using a convolutional neural network | |
US20180253894A1 (en) | Hybrid foreground-background technique for 3d model reconstruction of dynamic scenes | |
US20200186783A1 (en) | Synthesis of transformed image views | |
US9639943B1 (en) | Scanning of a handheld object for 3-dimensional reconstruction | |
US20150326878A1 (en) | Selective perceptual masking via scale separation in the spatial and temporal domains using intrinsic images for use in data compression | |
AU2021201476A1 (en) | Techniques for synthesizing film grain | |
US20160241884A1 (en) | Selective perceptual masking via scale separation in the spatial and temporal domains for use in data compression with motion compensation | |
WO2016210059A1 (en) | Determining native resolutions of video sequences | |
US20100079448A1 (en) | 3D Depth Generation by Block-based Texel Density Analysis | |
US8897378B2 (en) | Selective perceptual masking via scale separation in the spatial and temporal domains using intrinsic images for use in data compression | |
Gibson et al. | Hazy image modeling using color ellipsoids | |
Storozhilova et al. | 2.5 D extension of neighborhood filters for noise reduction in 3D medical CT images | |
CN110140150B (en) | Image processing method and device and terminal equipment | |
AU2022212809B2 (en) | Banding artifact detection in images and videos | |
Kuo et al. | Automatic high dynamic range hallucination in inverse tone mapping | |
WO2022164795A1 (en) | Banding artifact detection in images and videos | |
WO2014163893A1 (en) | Selective perceptual masking via scale separation in the spatial and temporal domains using intrinsic images for use in data compression | |
Dong et al. | Joint visual attention and rendering complexity based sample rate estimation in selective rendering |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | EP: The EPO has been informed by WIPO that EP was designated in this application | Ref document number: 16738923; Country of ref document: EP; Kind code of ref document: A1 |
| ENP | Entry into the national phase | Ref document number: 2989430; Country of ref document: CA |
| WWE | WIPO information: entry into national phase | Ref document number: 11201710501T; Country of ref document: SG |
| ENP | Entry into the national phase | Ref document number: 2017565971; Country of ref document: JP; Kind code of ref document: A |
| WWE | WIPO information: entry into national phase | Ref document number: MX/A/2017/016933; Country of ref document: MX |
| NENP | Non-entry into the national phase | Ref country code: DE |
| ENP | Entry into the national phase | Ref document number: 2016284583; Country of ref document: AU; Date of ref document: 20160623; Kind code of ref document: A |
| ENP | Entry into the national phase | Ref document number: 20187001835; Country of ref document: KR; Kind code of ref document: A |
| WWE | WIPO information: entry into national phase | Ref document number: 2016738923; Country of ref document: EP |