WO2018116603A1 - Image processing apparatus, image processing method, and program - Google Patents
Image processing apparatus, image processing method, and program
- Publication number
- WO2018116603A1 (PCT/JP2017/037572)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- unit
- code amount
- transfer function
- encoding
- Prior art date
Classifications
- All classifications fall under H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION:
- H04N19/146—Data rate or code amount at the encoder output
- H04N9/68—Circuits for processing colour signals for controlling the amplitude of colour signals, e.g. automatic chroma control circuits
- H04N5/77—Interface circuits between a recording apparatus and a television camera
- H04N19/107—Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
- H04N19/109—Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
- H04N19/115—Selection of the code volume for a coding unit prior to coding
- H04N19/117—Filters, e.g. for pre-processing or post-processing
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/147—Data rate or code amount at the encoder output according to rate distortion criteria
- H04N19/162—User input
- H04N19/17—Adaptive coding in which the coding unit is an image region, e.g. an object
- H04N19/172—Adaptive coding in which the coding unit is a picture, frame or field
- H04N19/186—Adaptive coding in which the coding unit is a colour or a chrominance component
- H04N19/19—Adaptive coding using optimisation based on Lagrange multipliers
- H04N19/593—Predictive coding involving spatial prediction techniques
- H04N9/8042—Recording involving pulse code modulation of the colour picture signal components involving data reduction
- H04N9/87—Regeneration of colour television signals
- H04N19/61—Transform coding in combination with predictive coding
- H04N19/82—Filtering operations for video compression involving filtering within a prediction loop
Definitions
- the present disclosure relates to an image processing apparatus, an image processing method, and a program.
- H.264/AVC (Advanced Video Coding), developed jointly by ITU-T and ISO/IEC, encodes an image signal into a bitstream with improved coding efficiency using various elemental techniques such as prediction, orthogonal transform, quantization, and entropy coding.
- H.265/HEVC (High Efficiency Video Coding), a successor to H.264/AVC, approximately doubles the encoding efficiency (see Non-Patent Document 2).
- HDR (High Dynamic Range) video has a wider luminance dynamic range than conventional SDR (Standard Dynamic Range) video; examples of HDR signal formats include HLG (Hybrid Log-Gamma), ST 2084, and S-Log3 (for HLG, see Non-Patent Document 3).
- BT.2020, standardized by ITU-R, defines a color gamut that enables more vivid colors to be expressed than the BT.709 color gamut, which has been used in many applications.
- ITU-T, "H.264: Advanced video coding for generic audiovisual services", ITU-T Recommendation H.264, November 2007
- ITU-T, "H.265: High efficiency video coding", ITU-T Recommendation H.265, October 2014
- Association of Radio Industries and Businesses, "ESSENTIAL PARAMETER VALUES FOR THE EXTENDED IMAGE DYNAMIC RANGE TELEVISION (EIDRTV) SYSTEM FOR PROGRAMME PRODUCTION ARIB STANDARD", ARIB STD-B67 Version 1.0, Internet <URL: http://www.arib.or.jp/english/html/overview/doc/2-STD-B67v1_0.pdf>
- An image processing apparatus is provided that includes: an encoding unit that encodes an image acquired based on a transfer function related to conversion between light and an image signal; and a control unit that controls, based on the transfer function, a code amount allocated to each partial region of the image in the encoding unit.
- An image processing method is also provided that includes: encoding an image acquired based on a transfer function relating to conversion between light and an image signal; and controlling, based on the transfer function, the code amount allocated to each partial region of the image during the encoding.
- A program is also provided for causing a processor of an image processing apparatus to function as: an encoding unit that encodes an image acquired based on a transfer function related to conversion between light and an image signal; and a control unit that controls, based on the transfer function, a code amount allocated to each partial region of the image in the encoding unit.
- Also provided is an image processing apparatus including: an encoding unit that encodes an image acquired based on a transfer function related to conversion between light and an image signal for enabling display at a luminance higher than 100 nits; and a control unit that controls a code amount assigned to each partial region of the image in the encoding unit depending on at least one of a luminance component and a color difference component of the partial region.
- Also provided is an image processing method including: encoding an image acquired based on a transfer function relating to conversion between light and an image signal for enabling display at a luminance higher than 100 nits; and controlling a code amount assigned to each partial region of the image depending on at least one of a luminance component and a color difference component of the partial region.
- Also provided is a program for causing a processor of an image processing apparatus to function as: an encoding unit that encodes an image acquired based on a transfer function relating to conversion between light and an image signal for enabling display at a luminance higher than 100 nits; and a control unit that controls the code amount assigned to each partial region of the image in the encoding unit depending on at least one of a luminance component and a color difference component of the partial region.
- Also provided is an image processing apparatus including: an encoding unit that encodes an image acquired based on a transfer function related to conversion between light and an image signal; and a control unit that controls, based on the transfer function, a prediction residual code amount or a mode code amount for mode selection when the encoding unit encodes the image.
- Also provided is an image processing method including: encoding an image acquired based on a transfer function related to conversion between light and an image signal; and controlling, based on the transfer function, a prediction residual code amount or a mode code amount for mode selection when encoding the image.
- Also provided is a program for causing a processor of an image processing apparatus to function as: an encoding unit that encodes an image acquired based on a transfer function related to conversion between light and an image signal; and a control unit that controls, based on the transfer function, a prediction residual code amount or a mode code amount for mode selection when the image is encoded.
- FIG. 5 is an explanatory diagram for explaining the color gamuts defined by BT.709 and BT.2020.
- FIG. 6A is an explanatory diagram showing a first example of the configuration of an image processing system according to an embodiment, and FIG. 6B is an explanatory diagram showing a second example of the configuration of the image processing system according to the embodiment.
- Block diagrams show a first example and a second example of a schematic configuration of an image processing apparatus according to the first embodiment, and an example of the detailed configuration of the control unit and the encoding unit according to the first embodiment.
- Further figures show a display example of an operation screen on a centralized operation panel and an example of surgery to which an operating room system is applied, and a block diagram shows an example of the functional configuration of the camera head and the CCU shown in FIG. 24.
- FIG. 1A is an explanatory diagram for explaining the luminance dynamic range of the SDR video.
- the vertical axis in FIG. 1A represents luminance [nit].
- the maximum brightness in the natural world may reach 20000 nit, and the brightness of a general subject is, for example, about 12000 nit at the maximum.
- the upper limit of the dynamic range of the image sensor is lower than the maximum brightness in the natural world, and may be, for example, 4000 nits.
- An imaging apparatus such as a digital camera or a digital camcorder converts an electrical signal generated by photoelectrically converting incident light in an image sensor into, for example, a 10-bit digital image signal in a signal processing circuit subsequent to the image sensor.
- the digital image signal generated by the imaging device is encoded by a predetermined video encoding method (also referred to as a video codec) according to the purpose of an application such as transmission or recording, and converted into an encoded bit stream.
- a digital image signal acquired by decoding the encoded bit stream is provided to the display device, and the video is reproduced with a display luminance of 100 nits maximum.
- FIG. 1B is an explanatory diagram for explaining the luminance dynamic range of the HDR video.
- the imaging apparatus converts incident light to the image sensor into an analog electric signal, and further converts the analog electric signal into, for example, a 10-bit digital image signal.
- the HDR video signal format maintains the gradation of the high-luminance portion exceeding 100 nits during such conversion, and allows the video to be reproduced with a luminance up to the upper limit of several hundreds or thousands of nits.
- the digital image signal generated by the imaging device is also encoded by a predetermined video encoding method according to the purpose of the application, and converted into an encoded bit stream.
- a digital image signal obtained by decoding the encoded bitstream is provided to the display device, and the video is reproduced with a luminance dynamic range including display luminance higher than 100 nits.
- Regardless of whether the video is SDR video or HDR video, when an image signal is encoded by a video encoding method that includes irreversible compression, image quality deteriorates in the image reproduced from the decoded image signal. Such image quality degradation is referred to as codec distortion in this specification.
- The degree of codec distortion can be evaluated by an index called PSNR (Peak Signal-to-Noise Ratio).
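As a minimal sketch (not part of the embodiment) of how PSNR is computed for 10-bit image signals, where the peak code value is 1023:

```python
import numpy as np

def psnr(original: np.ndarray, decoded: np.ndarray, peak: float = 1023.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between an original image and
    its decoded counterpart; higher values mean less codec distortion."""
    diff = original.astype(np.float64) - decoded.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0.0:
        return float("inf")  # identical images: no distortion at all
    return 10.0 * np.log10(peak ** 2 / mse)

# A decoded image that differs from the original by 1 code value everywhere
orig = np.array([[100, 200], [300, 400]], dtype=np.uint16)
dec = orig + 1
print(round(psnr(orig, dec), 2))  # ≈ 60.2 dB
```

Note that this compares code values before and after the codec only; it says nothing by itself about how the distortion looks after the dynamic range is expanded for display, which is the issue discussed next.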
- When the bit rate is equal, the image quality of an image encoded/decoded by H.264/AVC is higher than that of an image encoded/decoded by MPEG-2, and the image quality of an image encoded/decoded by H.265/HEVC is higher than that of H.264/AVC.
- Evaluation of codec distortion is usually performed by comparing the original image input to the encoder with the decoded image output from the decoder. It is not well known how the signal conversion performed during HDR video capture or display, or the accompanying reduction and expansion of the dynamic range, affects codec distortion.
- The inventors converted a large number of sample videos into image signals in HDR signal formats and conducted an experiment to verify the image quality of the HDR video reproduced from the decoded image signals.
- The observed image quality degradation, mainly in the form of block noise or mosquito noise, was scattered throughout the images and was particularly noticeable in certain parts of the images.
- The degree of deterioration that occurs when the same 10-bit image signal is encoded by the same video encoding method should normally be the same. Nevertheless, the reason distortion that is not perceived (or is difficult to perceive) in SDR video becomes noticeable in HDR video is thought to be that the codec distortion is enlarged together with the signal when the dynamic range of the decoded image signal is expanded.
- FIG. 2A shows a state where codec distortion occurs in an image signal of an SDR video through encoding and decoding. Since the codec distortion is not enlarged when the SDR video is reproduced, the distortion is not perceived subjectively if the distortion is sufficiently small.
- FIG. 2B shows that the codec distortion still occurs in the image signal of the HDR video. When playing back an HDR video, the codec distortion is increased with the expansion of the dynamic range. As a result, the possibility of subjectively perceiving image quality degradation such as block noise or mosquito noise increases.
- Codec distortion can also be increased when format conversion from HDR to SDR is performed on an image signal expressed in the HDR signal format.
- FIG. 2C shows how the codec distortion is expanded through format conversion from HDR to SDR, that is, HDR-SDR conversion.
- The HDR-SDR conversion generally includes a process of restoring an original signal corresponding to the output of an image sensor by applying, to an image signal in an HDR signal format (obtained, for example, by decoding an encoded bitstream), the inverse function of the transfer function corresponding to that signal format, and a process of reconverting the restored original signal into an SDR image signal with the transfer function corresponding to the signal format for SDR.
- The codec distortion enlarged in the former of these processes is not reduced in the reconversion into the SDR signal format. Therefore, when SDR video is reproduced based on the image signal after the HDR-SDR conversion, the enlarged codec distortion can be subjectively perceived.
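The two-step structure of this conversion can be sketched as follows. The curves used here are simplified pure power functions standing in for the actual HDR and SDR transfer functions, so this only illustrates the structure of the conversion, not a real HLG or S-Log3 pipeline:

```python
def hdr_to_sdr(hdr_code: float, inverse_hdr_oetf, sdr_oetf) -> float:
    """HDR-SDR conversion: first restore the original (scene-linear)
    signal with the inverse of the HDR transfer function, then reconvert
    it into an SDR image signal with the SDR transfer function.
    Any codec distortion in `hdr_code` is carried along and may be
    enlarged by the first step."""
    linear = inverse_hdr_oetf(hdr_code)        # restore linear light
    linear = min(linear, 1.0)                  # clip to SDR's 100% range
    return sdr_oetf(linear)                    # reconvert for SDR

# Illustrative stand-in curves (NOT the real HLG / S-Log3 definitions):
inverse_hdr = lambda v: 10.0 * v ** 2.4   # code 0..1 -> light 0..10 (1000%)
sdr = lambda l: l ** (1.0 / 2.4)          # simple gamma for SDR

print(round(hdr_to_sdr(0.5, inverse_hdr, sdr), 3))  # bright input clips to 1.0
```

The key point is that the inverse HDR transfer function in the first step magnifies small code-value errors in bright regions, and nothing in the second step undoes that magnification.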
- If the codec distortion were merely enlarged, the distortion should occur uniformly across the image. In practice, however, the distortion is significant in characteristic partial regions as exemplified below:
- bright areas (e.g., clouds in the sky)
- brightly colored areas (e.g., lamps that glow red or blue)
- The cause of the remarkable distortion in these partial regions is considered to be related to the transfer function of the signal format for HDR.
- FIG. 3 illustrates examples of a typical OETF for an SDR signal format and a typical OETF for an HDR signal format.
- the horizontal axis represents the luminance dynamic range of light before conversion, and 100% corresponds to the luminance of 100 nits.
- the vertical axis represents the code value of the converted image signal. In the case of 10 bits, the code value can take values from 0 to 1023.
- The difference in the slope of the two transfer functions is significant particularly in the portion where the code value is relatively large. This means that in such a portion the image information is compressed at a higher compression ratio in the HDR case than in the SDR case; that is, the same change in code value expresses a larger change in luminance in the HDR case than in the SDR case. Even when the transfer functions of the red (R), green (G), and blue (B) components are analyzed in the RGB color system, differences in signal transfer characteristics between HDR and SDR similar to the graph shown in FIG. 3 were confirmed.
- FIG. 4 shows a graph indicating how much more strongly S-Log3 for HDR compresses image information than BT.709 for SDR.
- The horizontal axis in FIG. 4 represents the code value of a 10-bit image signal, and the vertical axis represents the ratio of the compression ratio of S-Log3 to the compression ratio of BT.709.
- At the maximum, the compression ratio of S-Log3 is about four times the compression ratio of BT.709, and the compression ratio of S-Log3 becomes relatively higher as the code value increases.
- That is, in the portion where the code value is relatively large, the image information is compressed more strongly in the HDR case than in the SDR case.
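This roughly fourfold relationship can be checked numerically from the publicly documented BT.709 and Sony S-Log3 OETFs (a flatter slope means stronger compression of gradation). The sketch below is an independent illustration under that assumption, not a reproduction of the figure's exact data:

```python
import math

def bt709_oetf(l: float) -> float:
    """BT.709 OETF: scene-linear light (0..1) to a nonlinear signal (0..1)."""
    return 4.5 * l if l < 0.018 else 1.099 * l ** 0.45 - 0.099

def slog3_oetf(l: float) -> float:
    """Sony S-Log3 OETF: scene-linear reflection to a 10-bit-normalized signal."""
    if l >= 0.01125:
        return (420.0 + math.log10((l + 0.01) / (0.18 + 0.01)) * 261.5) / 1023.0
    return (l * (171.2102946929 - 95.0) / 0.01125 + 95.0) / 1023.0

def slope(f, l: float, eps: float = 1e-6) -> float:
    """Numerical derivative of a transfer function at linear level l."""
    return (f(l + eps) - f(l - eps)) / (2.0 * eps)

# Around 90% reflectance the BT.709 curve is roughly four times steeper
# than S-Log3, i.e. S-Log3 compresses gradation about four times harder.
ratio = slope(bt709_oetf, 0.9) / slope(slog3_oetf, 0.9)
print(round(ratio, 1))  # ≈ 4.3
```

Because the same code-value quantization error therefore represents a roughly four times larger luminance change under S-Log3, this is consistent with distortion becoming visible in bright regions of HDR video.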
- Note that an EOTF (Electro-Optical Transfer Function) defines the conversion from an image signal into display light, in contrast to the OETF (Opto-Electronic Transfer Function), which defines the conversion from light into an image signal on the capture side.
- FIG. 5 is an explanatory diagram for explaining the color gamuts defined by BT.709 and BT.2020.
- Referring to FIG. 5, a color gamut graph is shown in which a three-dimensional color space is mapped onto a two-dimensional plane under predetermined constraint conditions. The cross mark in the graph indicates the position where white is mapped. The broken line in the graph indicates the range of colors that can be expressed according to BT.709. The solid line in the graph indicates the range of colors that can be expressed according to BT.2020. The dotted line in the graph indicates the range of colors that human vision can identify.
- As can be seen, BT.2020 can express a wider variety of colors than BT.709. It is said that BT.709 can express about 75% of the colors in the real world, whereas BT.2020 can express more than 99% of those colors.
- BT. 2020 may be used as the color gamut of the SDR video, or may be used as the color gamut of the HDR video.
- FIG. 6A is an explanatory diagram illustrating a first example of the configuration of the image processing system according to the present embodiment.
- the image processing system 10a illustrated in FIG. 6A includes an imaging device 11, a signal processing device 14, and a server device 15.
- the imaging device 11 may be, for example, a digital video camera or a digital still camera, or any type of device having a video shooting function (for example, a monitoring camera, a Web camera, or an information terminal).
- the imaging device 11 captures the state of the real world using an image sensor and generates a primitive image signal.
- the signal processing device 14 may be a BPU (Baseband Processing Unit), for example, and is connected to the imaging device 11.
- the signal processing device 14 performs AD conversion and digital signal processing on the primitive image signal generated by the imaging device 11, and generates an image signal in a predetermined signal format.
- Digital signal processing performed by the signal processing device 14 may include, for example, gamma correction and color conversion.
- the signal processing device 14 may be configured integrally with the imaging device 11.
- the characteristic of signal conversion from light incident on the imaging device 11 to an image signal generated by the signal processing device 14 is represented by OETF.
- the signal processing device 14 may generate an image signal with a transfer function (or signal format) selected from a plurality of candidates by a user via some user interface.
- The plurality of candidates may include, for example, one signal format for SDR (e.g., BT.709) and one signal format for HDR (e.g., a combination of BT.2020 and HLG or S-Log3).
- the plurality of candidates may include a plurality of signal formats for HDR.
- the signal processing device 14 may be capable of generating an image signal only with a single HDR signal format.
- the signal processing device 14 multiplexes an auxiliary signal including an audio signal and metadata as necessary on the image signal generated as a result of signal conversion, and outputs the multiplexed signals to the server device 15.
- The server device 15 is an image processing device connected to the signal processing device 14 via a signal line conforming to a transmission protocol such as SDI (Serial Digital Interface) or HD-SDI (High Definition Serial Digital Interface).
- the server device 15 acquires the image signal transmitted from the signal processing device 14, encodes the image with a predetermined video encoding method, and generates an encoded bitstream 17a.
- the encoded bit stream 17a may be stored in a storage device inside or outside the server device 15, or may be transmitted to another device (for example, a display device) connected to the server device 15.
- FIG. 6B is an explanatory diagram showing a second example of the configuration of the image processing system according to the present embodiment.
- the image processing system 10b illustrated in FIG. 6B includes an imaging device 12, a storage device 13, and a terminal device 16.
- the imaging device 12 may be, for example, a digital video camera, a digital camcorder or a digital still camera, or any type of device having a video shooting function.
- the imaging device 12 captures a real-world situation using an image sensor and generates a primitive image signal.
- the imaging device 12 performs AD conversion and digital signal processing as described above in connection with the signal processing device 14, and generates an image signal in a predetermined signal format. Similar to the signal processing device 14, the imaging device 12 may generate an image signal with a transfer function selected from a plurality of candidates by a user via some user interface, or may be capable of generating an image signal only with a single HDR transfer function.
- the imaging device 12 encodes an image by a predetermined video encoding method based on an image signal generated as a result of signal conversion, and generates an encoded bit stream 17b.
- the encoded bit stream 17b may be stored as a video file, for example, or may be provided to the storage device 13 or the terminal device 16 via a network.
- the storage device 13 is a data storage that stores various video data.
- the storage device 13 may store a video file 17c generated by encoding an image using a predetermined video encoding method.
- the video file may include parameters identifying the type of transfer function relating to conversion between light and image signal, the type of color gamut, and the video encoding method applied to the video content included in the file.
- the storage device 13 may store a RAW video file 18 that records an image signal before encoding (or before signal conversion) as RAW data.
- the storage device 13 provides a file that the user desires to reproduce or edit to the terminal device 16 via the network.
- the terminal device 16 is an image processing device having a function of reproducing or editing a video file generated by the imaging device 12 or stored by the storage device 13. For example, the terminal device 16 may decode a coded bitstream included in the video file 17b or 17c acquired from the imaging device 12 or the storage device 13 to generate a decoded image signal. Further, the terminal device 16 may perform dynamic range conversion (for example, HDR-SDR conversion or SDR-HDR conversion) on the decoded image generated as described above. Further, the terminal device 16 may encode the image signal included in the RAW video file 18 or the decoded image signal after dynamic range conversion by a predetermined video encoding method to generate the encoded bit stream 17d.
- in the image processing device (that is, the encoder) according to the present embodiment, the amount of code allocated to each partial region of the image is controlled based on the transfer function (for example, based on the type of transfer function or other attributes). By doing so, degradation of image quality when a signal format for HDR is used is reduced. From the next section, specific exemplary configurations of such an image processing apparatus are described in detail.
- FIG. 7A is a block diagram illustrating a first example of a schematic configuration of the image processing apparatus according to the present embodiment.
- the image processing apparatus 100a illustrated in FIG. 7A may be, for example, the server apparatus 15 in the example of FIG. 6A, or the imaging apparatus 12 or the terminal apparatus 16 in the example of FIG. 6B (or an image processing module mounted on any of these apparatuses).
- the image processing apparatus 100a includes a signal acquisition unit 101, an information acquisition unit 103, an encoding unit 110, and a control unit 140.
- the signal acquisition unit 101 acquires an input image signal generated based on a transfer function related to conversion between light and an image signal.
- the signal acquisition unit 101 may acquire an input image signal from an external device via a transmission interface, or may acquire an input image signal from an imaging module and a signal processing module (not shown) configured integrally with the image processing apparatus 100a.
- the information acquisition unit 103 acquires input information related to a transfer function applied to the image encoded by the encoding unit 110.
- the information acquisition unit 103 may acquire input information via a user interface included in the image processing apparatus 100a.
- the user interface may be provided by a physical input device such as a touch panel, a button, or a switch provided in the housing of the image processing apparatus 100a. Instead, the user interface may be provided as a GUI (Graphical User Interface) on a terminal device that is remotely connected via the communication interface.
- the input information includes at least a transfer function type indicating the type of transfer function applied to an image to be encoded.
- the user interface may cause the user to select one of the two options “SDR” and “HDR” to be applied to the image. In this case, it is determined that one predefined transfer function for SDR or one predefined transfer function for HDR is applied to the image.
- the user interface may allow the user to select a transfer function to be applied to the image from a plurality of transfer function candidates (for example, BT.709, HLG, ST2084, and S-Log3).
- the information acquisition unit 103 may acquire input information from an auxiliary signal multiplexed with an input image signal.
- the auxiliary signal is received by the signal acquisition unit 101 during a period in which no image signal is transmitted on the signal line (for example, a blanking period). Then, the information acquisition unit 103 can acquire input information including a transfer function type indicating the type of transfer function applied to the image from the auxiliary signal separated in the signal acquisition unit 101.
- the information acquisition unit 103 may acquire input information required by accessing an external data source.
- the encoding unit 110 encodes an image represented by the image signal acquired by the signal acquisition unit 101, and generates an encoded bit stream.
- the encoding unit 110 may execute the encoding process according to any video encoding method such as MPEG-2, H.264/AVC, or H.265/HEVC.
- the encoding process executed by the encoding unit 110 typically includes various arithmetic processes such as prediction, orthogonal transform, quantization, and entropy encoding. Among these, quantization is a lossy compression process used to achieve the required compression rate.
- the control unit 140 controls the amount of code allocated to each partial region of the image in the encoding unit 110 based on the transfer function indicated by the input information acquired by the information acquisition unit 103. More specifically, when the first transfer function corresponding to HDR (the transfer function for HDR), rather than the second transfer function corresponding to SDR (the transfer function for SDR), is applied, the control unit 140 enables a quantization control process for reducing degradation of the image quality of the HDR video.
- the quantization control process may include a process of correcting a processing parameter of the quantization process, which is executed regardless of the transfer function or signal format, so as to adjust the code amount allocation when an HDR transfer function is applied.
- the allocated code amount is controlled mainly based on the type of the transfer function, but the code amount may also be controlled based on other attributes of the transfer function, such as the upper limit value of the dynamic range associated with the transfer function.
- FIG. 7B is a block diagram illustrating a second example of a schematic configuration of the image processing apparatus according to the present embodiment.
- the image processing apparatus 100b illustrated in FIG. 7B may also be, for example, the server apparatus 15 in the example of FIG. 6A, or the imaging apparatus 12 or the terminal apparatus 16 in the example of FIG. 6B (or an image processing module mounted on any of these apparatuses).
- the image processing apparatus 100b includes a signal processing unit 102, an information acquisition unit 104, a storage unit 107, an encoding unit 110, and a control unit 140.
- the signal processing unit 102 acquires a primitive image signal input from the imaging device via some transmission interface or a signal line inside the device, or acquires an image signal from a video file stored in the storage unit 107. Then, the signal processing unit 102 performs digital signal processing, which can include, for example, gamma correction and color conversion, on the primitive image signal, and generates an image signal to be encoded in a predetermined signal format. The signal format applied to the image by the signal processing unit 102, and the corresponding transfer function, are determined based on the input information acquired by the information acquisition unit 104. Then, the signal processing unit 102 outputs the generated image signal to the encoding unit 110.
- the information acquisition unit 104 acquires input information related to a transfer function applied to an image encoded by the encoding unit 110.
- the information acquisition unit 104 may acquire input information via a user interface (provided by a physical input device or provided as a GUI) of the image processing apparatus 100b.
- the input information includes at least a transfer function type indicating the type of transfer function applied to an image to be encoded.
- the user interface may cause the user to select one of the two options “SDR” and “HDR” to be applied to the image.
- the user interface may allow the user to select a transfer function to be applied to the image from a plurality of transfer function candidates.
- the storage unit 107 is a storage device for storing various video data.
- the storage unit 107 may store, for example, a video file that records a digital image signal before signal conversion.
- the user may store the video file acquired from the external storage medium in the storage unit 107 via an input / output interface (not shown) included in the image processing apparatus 100b.
- the storage unit 107 may store a video file including an encoded bit stream generated as a result of the encoding process executed by the encoding unit 110.
- the video file may be output to an external device upon request.
- the encoding unit 110 encodes an image represented by the image signal acquired by the signal processing unit 102 to generate an encoded bitstream. Based on the type of transfer function indicated by the input information acquired by the information acquisition unit 104, the control unit 140 controls the amount of code allocated to each partial region of the image in the encoding unit 110.
- the encoded bit stream generated by the encoding unit 110 may be transmitted to a device external to the image processing apparatus 100b, or may be stored as a video file by the storage unit 107.
- FIG. 8 is a block diagram illustrating an example of a detailed configuration of the encoding unit and the control unit according to the first embodiment.
- the encoding unit 110 includes a rearrangement buffer 111, a block setting unit 112, a subtraction unit 113, an orthogonal transform unit 114, a quantization unit 115, a lossless encoding unit 116, an inverse quantization unit 121, an inverse orthogonal transform unit 122, an addition unit 123, a loop filter 124, a frame memory 126, a switch 127, a mode selection unit 128, an intra prediction unit 130, and an inter prediction unit 135.
- the rearrangement buffer 111 rearranges the image data of a series of images expressed by the image signal acquired by the signal acquisition unit 101 or the signal processing unit 102 according to a GOP (Group of Pictures) structure.
- the rearrangement buffer 111 outputs the rearranged image data to the block setting unit 112, the intra prediction unit 130, and the inter prediction unit 135.
- the block setting unit 112 divides each image corresponding to a picture into a plurality of blocks.
- in MPEG-2 and H.264/AVC, a picture is divided into a plurality of macroblocks having a fixed size in a grid pattern, and the encoding process is executed using each macroblock as a processing unit.
- the quantization process can be executed using a smaller sub-block set for each macroblock as a processing unit.
- in H.265/HEVC, a picture is divided into a plurality of coding units (CUs) having a variable size, and the encoding process is executed with each CU as a processing unit.
- the quantization process can be executed with a smaller transform unit (TU) set in each CU as a processing unit.
- the subtraction unit 113 calculates prediction residual data that is the difference between the image data input from the block setting unit 112 and the prediction image data, and outputs the prediction residual data to the orthogonal transformation unit 114.
- the orthogonal transform unit 114 transforms the prediction residual data input from the subtraction unit 113 from spatial domain image data to frequency domain transform coefficient data.
- the orthogonal transformation executed by the orthogonal transformation unit 114 may be, for example, discrete cosine transformation or discrete sine transformation. Then, orthogonal transform section 114 outputs transform coefficient data to quantization section 115.
- the quantization unit 115 quantizes the transform coefficient data input from the orthogonal transform unit 114 in a quantization step that is determined so that the required compression rate is achieved. For example, if the buffer or transmission path has a large free capacity relative to the size of the output encoded bit stream, the quantization step may be set to a small value; conversely, if the free capacity is small, the quantization step may be set to a large value.
- the quantization step is generally determined for each partial region in the image. Different quantization steps may be used for each of the three color components. The smaller the quantization step used for a partial region, the more finely the transform coefficients for that region are quantized, and the larger the amount of code allocated to it.
- the quantization unit 115 may apply different quantization steps to different frequency components of the transform coefficient using the quantization matrix. Then, the quantization unit 115 outputs the quantized transform coefficient data (hereinafter referred to as quantization data) to the lossless encoding unit 116 and the inverse quantization unit 121.
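- as a minimal sketch of the quantization described above (real codecs operate on integer coefficients with rounding offsets and dead zones; the coefficient values here are made up for illustration):

```python
# A minimal sketch of scalar quantization with a per-region quantization
# step. A smaller step quantizes coefficients more finely and so allocates
# more code amount to the region; quantization is lossy.

def quantize(coefficients, step):
    """Map transform coefficients to quantization levels."""
    return [round(c / step) for c in coefficients]

def dequantize(levels, step):
    """Inverse quantization: restore approximate coefficient values."""
    return [level * step for level in levels]

coeffs = [100.0, -52.0, 13.0, -4.0]
fine = quantize(coeffs, 4.0)      # smaller step: finer levels, more code
coarse = quantize(coeffs, 16.0)   # larger step: coarser levels, less code
restored = dequantize(fine, 4.0)  # lossy: 13.0 comes back as 12.0
```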
- the control unit 140 provides the quantization unit 115 with parameters for adjusting (scaling) the quantization step used for each partial region.
- the quantization unit 115 scales the quantization step by dividing (or multiplying) by this parameter provided from the control unit 140, and quantizes the transform coefficient data in the quantized step after scaling.
- in some video coding schemes, instead of directly encoding the quantization step as the control value required for inverse quantization on the decoder side, a quantization parameter (QP) having a logarithmic relationship with the quantization step is encoded.
- the scaling of the quantization step may therefore be achieved by adding (or subtracting) an offset to (from) the quantization parameter instead of dividing (or multiplying) the quantization step by a coefficient.
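- as a hedged sketch of this QP/step relationship: in H.264/AVC and H.265/HEVC the quantization step approximately doubles for every increase of 6 in QP, so dividing the step by a protection ratio corresponds to subtracting an offset from QP. The code below illustrates only that arithmetic, not the codecs' exact integer derivation:

```python
# Sketch of the logarithmic relationship between QP and the quantization
# step used in H.264/AVC and H.265/HEVC, where the step approximately
# doubles for every increase of 6 in QP. Dividing the step by a protection
# ratio is then equivalent to subtracting an offset from QP.
import math

def qp_to_step(qp):
    """Approximate quantization step for a given quantization parameter."""
    return 2.0 ** ((qp - 4) / 6.0)

def ratio_to_qp_offset(protection_ratio):
    """QP offset equivalent to dividing the step by `protection_ratio`."""
    return -6.0 * math.log2(protection_ratio)

nominal = qp_to_step(28)          # step of 16.0 at QP 28
offset = ratio_to_qp_offset(2.0)  # -6.0: halving the step means QP - 6
scaled = qp_to_step(28 + offset)  # equals nominal / 2.0
```

- the equivalence holds because both the step scaling and the QP offset act multiplicatively on the same exponential curve.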
- the lossless encoding unit 116 generates an encoded bitstream by encoding the quantized data input from the quantizing unit 115. Further, the lossless encoding unit 116 encodes various parameters referred to by the decoder, and inserts the encoded parameters into the encoded bitstream.
- the parameters encoded by the lossless encoding unit 116 may include information regarding transfer functions, information regarding color gamut, and information regarding the quantization parameters described above.
- the lossless encoding unit 116 outputs the generated encoded bit stream to an output destination according to the purpose of the application.
- the inverse quantization unit 121, the inverse orthogonal transform unit 122, and the addition unit 123 constitute a local decoder.
- the local decoder is responsible for reconstructing the original image from the encoded data.
- the inverse quantization unit 121 performs inverse quantization on the quantized data in the same quantization step as that used by the quantization unit 115, and restores transform coefficient data.
- for each partial region, a quantization step scaled using the parameters provided from the control unit 140 may be used. Then, the inverse quantization unit 121 outputs the restored transform coefficient data to the inverse orthogonal transform unit 122.
- the inverse orthogonal transform unit 122 restores the prediction residual data by performing an inverse orthogonal transform process on the transform coefficient data input from the inverse quantization unit 121. Then, the inverse orthogonal transform unit 122 outputs the restored prediction residual data to the addition unit 123.
- the adding unit 123 generates decoded image data by adding the restored prediction residual data input from the inverse orthogonal transform unit 122 to the predicted image data generated by the intra prediction unit 130 or the inter prediction unit 135. Then, the adding unit 123 outputs the generated decoded image data to the loop filter 124 and the frame memory 126.
- the loop filter 124 is an in-loop filter for the purpose of improving the image quality of the decoded image.
- the loop filter 124 may include, for example, a deblocking filter for reducing block distortion appearing in the decoded image.
- the loop filter 124 may include an adaptive offset filter for adding an edge offset or a band offset to the decoded image.
- the loop filter 124 outputs the decoded image data after filtering to the frame memory 126.
- the frame memory 126 stores the decoded image data before filtering input from the adder 123 and the decoded image data after application of the in-loop filter input from the loop filter 124.
- the switch 127 reads decoded image data before filtering used for intra prediction from the frame memory 126, and supplies the read decoded image data to the intra prediction unit 130 as reference image data. Further, the switch 127 reads out the decoded image data after filtering used for inter prediction from the frame memory 126 and supplies the read out decoded image data to the inter prediction unit 135 as reference image data.
- the mode selection unit 128 selects a prediction method for each block based on a comparison of the costs input from the intra prediction unit 130 and the inter prediction unit 135. For a block for which intra prediction is selected, the mode selection unit 128 outputs the predicted image data generated by the intra prediction unit 130 to the subtraction unit 113 and outputs information related to intra prediction to the lossless encoding unit 116. For a block for which inter prediction is selected, the mode selection unit 128 outputs the predicted image data generated by the inter prediction unit 135 to the subtraction unit 113 and outputs information related to inter prediction to the lossless encoding unit 116.
- the intra prediction unit 130 executes an intra prediction process based on the original image data and the decoded image data. For example, the intra prediction unit 130 evaluates a cost estimated to occur for each of a plurality of candidate modes included in the search range. Next, the intra prediction unit 130 selects the prediction mode that minimizes the cost as the best prediction mode. Further, the intra prediction unit 130 generates predicted image data according to the selected best prediction mode. Then, the intra prediction unit 130 outputs information related to intra prediction including prediction mode information indicating the best prediction mode, the corresponding cost, and prediction image data to the mode selection unit 128.
- the inter prediction unit 135 performs inter prediction processing (motion compensation) based on the original image data and the decoded image data. For example, the inter prediction unit 135 evaluates a cost estimated to occur for each of a plurality of candidate modes included in the search range. Next, the inter prediction unit 135 selects the prediction mode with the lowest cost as the best prediction mode. Further, the inter prediction unit 135 generates predicted image data according to the selected best prediction mode. Then, the inter prediction unit 135 outputs information related to inter prediction, corresponding costs, and predicted image data to the mode selection unit 128.
- inter prediction processing motion compensation
- control unit 140 includes a statistical calculation unit 141 and a code amount control unit 143.
- the statistical calculation unit 141 calculates statistics regarding the strength of at least one of the luminance component and the color difference component for each of the partial regions set in the image.
- the statistics calculated by the statistical calculation unit 141 may be a representative value (for example, the average, median, or mode) of the pixel values (code values) of one or more color components in the partial region, or a histogram of those values. Then, the statistical calculation unit 141 outputs the calculated statistics to the code amount control unit 143.
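- a simple sketch of such per-region statistics for one color component, using only the standard library; the bin width of the histogram is an illustrative assumption:

```python
# Per-region statistics for one color component: representative values
# (average and median) and a coarse histogram of code values keyed by bin
# index. The bin width is an illustrative assumption.
import statistics

def region_stats(code_values, bin_width=64):
    """Compute statistics for the pixel code values of one partial region."""
    histogram = {}
    for value in code_values:
        bin_index = value // bin_width
        histogram[bin_index] = histogram.get(bin_index, 0) + 1
    return {
        "mean": statistics.mean(code_values),
        "median": statistics.median(code_values),
        "histogram": histogram,
    }

stats = region_stats([100, 110, 120, 700, 710])
```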
- the partial region here can typically correspond to a block corresponding to a processing unit of quantization processing.
- statistics may be calculated for each partial region such as a macroblock or sub-block in MPEG-2 or H.264/AVC, or a CU or TU in H.265/HEVC, and the quantization step may be controlled for each such region by the code amount control unit described below.
- the present invention is not limited to this example, and the quantization control process described here may be executed for each partial region of another shape (which may be as small as one pixel).
- the code amount control unit 143 determines the type of transfer function applied to the image to be encoded based on the input information input from the information acquisition unit 103 or 104. The code amount control unit 143 may then skip the quantization control process described below when the transfer function for SDR, rather than the transfer function for HDR, is applied. On the other hand, when the transfer function for HDR is applied to the image to be encoded, the code amount control unit 143 controls the code amount allocated to each partial region depending on the intensity of at least one of the luminance component and the color difference component of that partial region.
- the code amount control unit 143 controls the code amount allocated to each partial region by scaling the quantization step used by the quantization unit 115 for each partial region depending on the strength of one or more color components (or by causing the quantization unit 115 to perform the scaling).
- below, control depending on the intensity of the luminance component is described as the first embodiment, control depending on the intensity of the color difference component as the second embodiment, and control depending on the intensity of both the luminance component and the color difference component as the third embodiment.
- the code amount control unit 143 scales the quantization step used for each partial region so that a larger amount of code is allocated to partial regions where the intensity of the luminance component is stronger (that is, high-luminance portions). The intensity of the luminance component of each partial region is grasped from the statistics for each partial region calculated by the statistical calculation unit 141.
- the code amount control unit 143 scales the quantization step by dividing the quantization step by a protection ratio that depends on the intensity of the luminance component of each partial region.
- the protection ratio is a parameter representing how strongly the image quality of a partial region is protected. The larger the protection ratio, the smaller the scaled quantization step, and the more strongly the image quality of the partial region to which it is applied is protected.
- the actual division by the protection ratio may be performed in the quantization unit 115 provided with the protection ratio.
- FIG. 9A is an explanatory diagram for describing a first example of a protection ratio for protecting a high-luminance portion.
- the horizontal axis in FIG. 9A represents the code value of the luminance component.
- the vertical axis represents the protection ratio.
- the protection ratio may be a parameter calculated using a predetermined function with the intensity of the luminance component of each partial region as an argument.
- a linear function is shown as an example in FIG. 9A, higher order functions or other types of functions such as logarithmic functions may be used.
- the stronger the luminance component of a partial region, the smaller the value to which its quantization step is scaled. This avoids excessively damaging the image information of high-luminance portions, which has already been strongly compressed during the conversion from light to an electrical signal.
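- a protection ratio computed by a linear function of the luminance code value, as in FIG. 9A, can be sketched as follows; the base ratio, maximum ratio, and 10-bit code range are illustrative assumptions, not values from this description:

```python
# Sketch of a linear protection-ratio function of the luminance code value
# (FIG. 9A). The base ratio, maximum ratio, and 10-bit code range are
# illustrative assumptions.

def protection_ratio(luma_code, max_code=1023, base=1.0, max_ratio=4.0):
    """Protection ratio growing linearly with the luminance code value."""
    return base + (max_ratio - base) * (luma_code / max_code)

def scaled_step(nominal_step, luma_code):
    """Divide the quantization step by the region's protection ratio."""
    return nominal_step / protection_ratio(luma_code)
```

- as noted in the text, a higher-order or logarithmic function could be substituted for the linear one without changing the scaling mechanism.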
- FIG. 9B is an explanatory diagram for describing a second example of the protection ratio for protecting the high luminance part.
- the protection ratio may be a parameter that is selectively determined depending on which sub-range the intensity of the luminance component of each partial region belongs to.
- the intensity of the luminance component of each partial area is classified into six sub-ranges of less than 200 nit, 200 nit or more and less than 300 nit, 300 nit or more and less than 400 nit, 400 nit or more and less than 500 nit, 500 nit or more and less than 600 nit, and 600 nit or more.
- the protection ratio corresponding to each sub-range is defined.
- the code amount control unit 143 may include a memory that stores in advance a mapping table for mapping such sub-ranges to the corresponding protection ratios. Also in the second example, as in the first example, the protection ratio of partial regions where the intensity of the luminance component is stronger is set higher, so that it is possible to avoid excessively damaging the image information of high-luminance portions, which has already been strongly compressed during the conversion from light to an electrical signal.
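- the sub-range mapping of FIG. 9B can be sketched as a lookup table over the six sub-ranges listed above; the protection ratio values themselves are illustrative assumptions:

```python
# Sketch of the sub-range mapping of FIG. 9B: the luminance intensity of a
# partial region, in nits, is classified into the six sub-ranges described
# in the text, and each sub-range maps to a protection ratio. The ratio
# values are illustrative assumptions.

SUBRANGES = [  # (exclusive upper bound in nits, protection ratio)
    (200, 1.0),
    (300, 1.5),
    (400, 2.0),
    (500, 2.5),
    (600, 3.0),
]
TOP_RATIO = 3.5  # 600 nit or more

def protection_ratio_for(nits):
    """Look up the protection ratio for a region's luminance intensity."""
    for upper_bound, ratio in SUBRANGES:
        if nits < upper_bound:
            return ratio
    return TOP_RATIO
```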
- in the example of FIG. 9B, the protection ratio (or quantization step) is finely controlled in the central portion of the dynamic range, while the protection ratio is fixed at the ends of the dynamic range.
- the code amount control unit 143 scales the quantization step used for each partial region so that a larger amount of code is allocated to partial regions where the intensity of the color difference component is stronger (that is, high color difference portions).
- FIG. 10 is an explanatory diagram for explaining a code value to be protected as a high color difference portion.
- the horizontal axis of FIG. 10 represents the code value of the Cb component that is one of the two color difference components.
- the vertical axis represents the code value of the Cr component, which is the other of the two color difference components.
- a point P1 in the figure indicates a corresponding position on the CbCr plane in the YCbCr space of a specific point corresponding to so-called “yellow” in which the code values of the R component and the G component exceed 1000 in the RGB space.
- a point P2 indicates a corresponding position on the CbCr plane in the YCbCr space of a specific point corresponding to so-called “cyan” in which the code values of the G component and the B component exceed 1000 in the RGB space.
- a point P3 indicates a corresponding position on the CbCr plane in the YCbCr space of a specific point corresponding to the so-called “green” in which the code value of the G component exceeds 1000 in the RGB space.
- a point P4 indicates a corresponding position on the CbCr plane in the YCbCr space of a specific point corresponding to so-called “magenta” in which the code values of the R component and the B component exceed 1000 in the RGB space.
- a point P5 indicates a corresponding position on the CbCr plane in the YCbCr space of a specific point corresponding to the so-called “red” in which the R component code value exceeds 1000 in the RGB space.
- a point P6 indicates a corresponding position on the CbCr plane in the YCbCr space of a specific point corresponding to so-called “blue” in which the code value of the B component exceeds 1000 in the RGB space.
- the points P1, P2 and P3 inside the broken line frame HL in the figure have a relatively high Y component value (for example, 700 or more) in the YCbCr space.
- the points P4, P5, and P6 outside the broken line frame HL have relatively low Y component values (for example, less than 700). This means that among brightly colored portions, the "yellow", "cyan", and "green" portions can be protected by considering the luminance component, whereas the "magenta", "red", and "blue" portions cannot. Therefore, it is also beneficial to increase the code amount allocation for high color difference portions.
- the intensity of the color difference component of each partial area is grasped from the statistics for each partial area calculated by the statistical calculation unit 141.
- the code amount control unit 143 scales the quantization step (common to the luminance component, or specific to the color difference components) by dividing it by a protection ratio that depends on the intensity of the color difference component of each partial region. The actual division may be performed in the quantization unit 115, provided with the protection ratio.
- the protection ratio for protecting the high color difference portion may be a parameter calculated using a predetermined function with the intensity of the color difference component of each partial region as an argument, as in the first example shown in FIG. 9A. Alternatively, it may be a parameter that is selectively determined depending on which sub-range the intensity of the color difference component of each partial region belongs to, as in the second example shown in FIG. 9B.
- the code amount control unit 143 may include a memory that stores in advance a mapping table for mapping the subrange of the color difference component and the corresponding protection ratio.
- FIG. 11 is an explanatory diagram for explaining an example of a protection ratio for protecting a high color difference portion.
- FIG. 11 shows the protection ratios (broken lines) of the color difference components corresponding to the same six subranges in addition to the protection ratios (solid lines) of the luminance components corresponding to the six subranges illustrated in FIG. 9B.
- since a higher protection ratio is set for partial regions where the intensity of the color difference component is stronger, it is possible to avoid excessively damaging the image information of high color difference portions, which has already been strongly compressed during the conversion from light to an electrical signal.
- by finely controlling the protection ratio in the middle of the dynamic range, the protection of the middle part of the dynamic range, which is likely to affect subjectively perceived image quality, can be efficiently strengthened while suppressing the sacrifice of encoding efficiency.
- the code amount control unit 143 classifies each partial region into two groups based on the histogram calculated for each color component by the statistical calculation unit 141 for each partial region. More specifically, for example, the code amount control unit 143 may classify partial regions in which the proportion of pixels whose Cb component exceeds a certain Cb reference value exceeds a threshold, or in which the proportion of pixels whose Cr component exceeds a certain Cr reference value exceeds a threshold, into the first group, and the remaining partial regions (in which both proportions are below the thresholds) into the second group.
- the first group includes a partial region with many pixels located outside the broken line frame HL in FIG. 10, and the second group includes a partial region with many pixels located inside the broken line frame HL.
- The code amount control unit 143 may apply the protection of the high color difference portion according to the second example to the partial regions belonging to the first group, and the protection of the high luminance portion according to the first example to the partial regions belonging to the second group.
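The two-group classification described above can be sketched as follows. The reference values `CB_REF`/`CR_REF`, the threshold, and the function name are hypothetical, not taken from the specification.

```python
# Illustrative sketch of the two-group classification: a partial region goes
# to the first group when the fraction of its pixels whose Cb (or Cr)
# component exceeds a reference value is above a threshold; otherwise it
# goes to the second group. All constants are hypothetical.
CB_REF, CR_REF = 640, 640   # hypothetical reference code values (10-bit)
THRESHOLD = 0.25            # hypothetical fraction of pixels

def classify_region(cb_values, cr_values):
    n = len(cb_values)
    cb_frac = sum(1 for v in cb_values if v > CB_REF) / n
    cr_frac = sum(1 for v in cr_values if v > CR_REF) / n
    # First group: high color difference -> chroma-dependent protection.
    # Second group: otherwise -> luminance-dependent protection.
    return "first" if cb_frac > THRESHOLD or cr_frac > THRESHOLD else "second"
```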
- FIG. 12 is a flowchart showing an example of the flow of the encoding control process according to this embodiment.
- the encoding control process described here may be repeated for each image constituting the video. Processing steps for obtaining or setting parameters that do not change across multiple images may be skipped in the second and subsequent iterations.
- description of processing steps not directly related to code amount control is omitted.
- the signal acquisition unit 101 or the signal processing unit 102 acquires an image signal generated based on a transfer function relating to conversion between light and an image signal (step S110).
- the image signal acquired here is output to the encoding unit 110.
- The information acquisition unit 103 or 104 acquires, via the user interface or from an auxiliary signal multiplexed with the input image signal, input information related to the transfer function applied to the image encoded by the encoding unit 110 (step S112).
- the input information acquired here is output to the control unit 140.
- The code amount control unit 143 sets, based on the type of the transfer function indicated by the input information described above, the protection ratio table or function used when setting the protection ratio for each partial region (step S114).
- The protection ratio table or function set here may be common across a plurality of transfer functions for HDR, or may differ depending on which of the plurality of transfer functions for HDR is applied.
- the subsequent processing is repeated for each of a plurality of partial areas set in the processing target image.
- the partial area to be processed in each iteration is referred to as a target partial area here.
- The quantization unit 115 of the encoding unit 110 determines the quantization step of the target partial region so that a required compression rate is achieved, regardless of which transfer function is applied (step S130).
- the code amount control unit 143 determines the type of the applied transfer function based on the input information (step S132). If it is determined that the HDR transfer function is applied to the image to be encoded, the code amount control unit 143 performs a quantization control process described in detail later (step S140). On the other hand, when it is determined that the transfer function for SDR is applied to the image to be encoded, the code amount control unit 143 skips the quantization control process.
- The quantization unit 115 quantizes the transform coefficient data of the target partial region input from the orthogonal transform unit 114 using the quantization step after scaling (or the unscaled step, in the case of SDR video) (step S160).
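Step S160 amounts to dividing each transform coefficient by the (possibly scaled) quantization step and rounding. A minimal model, ignoring the integer arithmetic and per-frequency scaling lists of real codecs:

```python
# Toy model of quantization step S160: not the actual codec arithmetic,
# only an illustration of the step-size/code-amount trade-off.
def quantize(coeffs, qstep):
    """Quantize transform coefficients with step qstep."""
    return [round(c / qstep) for c in coeffs]

def dequantize(levels, qstep):
    """Reconstruct coefficient approximations from quantized levels."""
    return [l * qstep for l in levels]
```

A smaller `qstep` yields more distinct levels (more code, finer gradation); a larger one coarsens the reconstruction.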
- the lossless encoding unit 116 encodes the quantized data and the quantization parameter input from the quantization unit 115 to generate an encoded bit stream (step S170).
- Steps S130 to S170 are repeated until processing is completed for all partial areas in the picture (step S180).
- the encoding control process shown in FIG. 12 ends (step S190).
- FIG. 13A is a flowchart showing a first example of the flow of the quantization control process that can be executed in step S140 of FIG.
- the first example shows an example of the flow of quantization control processing for protecting the gradation of a high-luminance portion in an image.
- the statistical calculation unit 141 calculates statistics regarding the intensity of the luminance component of the target partial region (step S141).
- the statistics calculated here may include, for example, the average, median value, or mode value of the pixel values in the partial region for the luminance component.
- the statistical calculation unit 141 outputs the calculated statistics to the code amount control unit 143.
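The statistics of step S141 can be computed with Python's standard `statistics` module; which of the mean, median, or mode is used is a design choice left open by the text above, and `luma_statistics` is a hypothetical name.

```python
# Sketch of the per-region luminance statistics (step S141). The choice
# among mean, median, and mode is left to the implementation.
import statistics

def luma_statistics(luma_values):
    return {
        "mean": statistics.mean(luma_values),
        "median": statistics.median(luma_values),
        "mode": statistics.mode(luma_values),
    }
```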
- The code amount control unit 143 determines the protection ratio corresponding to the luminance statistics of the target partial region input from the statistical calculation unit 141, by referring to the protection ratio table or by using a function for calculating the protection ratio (step S144). Then, the code amount control unit 143 outputs the determined protection ratio to the quantization unit 115.
- The quantization unit 115 scales the quantization step determined in step S130 of FIG. 12 according to the protection ratio input from the code amount control unit 143 (step S146). For example, the quantization unit 115 reduces the quantization step by dividing it by a protection ratio larger than 1 input from the code amount control unit 143, or enlarges the quantization step by dividing it by a protection ratio smaller than 1.
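The scaling rule of step S146 reduces to a single division; a ratio greater than 1 shrinks the step (finer quantization, more code), a ratio smaller than 1 enlarges it. A one-line sketch:

```python
# Scaling of the tentatively determined quantization step by the protection
# ratio, as described for step S146.
def scale_qstep(qstep, protection_ratio):
    return qstep / protection_ratio
```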
- In the example described above, the quantization step tentatively determined so as to achieve the required compression rate is scaled by the protection ratio; however, the quantization step may instead be determined by considering both the required compression rate and the protection ratio simultaneously. The same applies to the other examples described below.
- FIG. 13B is a flowchart showing a second example of the flow of the quantization control process that can be executed in step S140 of FIG.
- the second example shows an example of the flow of quantization control processing for protecting the gradation of the high color difference portion in the image.
- the statistical calculation unit 141 calculates statistics regarding the strength of the color difference component of the target partial region (step S142).
- the statistics calculated here may include, for example, the average, median value, or mode value of the pixel values in the partial region for the color difference component.
- the statistical calculation unit 141 outputs the calculated statistics to the code amount control unit 143.
- The code amount control unit 143 determines the protection ratio corresponding to the color difference statistics of the target partial region input from the statistical calculation unit 141, by referring to the protection ratio table or by using a function for calculating the protection ratio (step S145). Then, the code amount control unit 143 outputs the determined protection ratio to the quantization unit 115.
- The quantization unit 115 scales the quantization step determined in step S130 of FIG. 12 according to the protection ratio input from the code amount control unit 143 (step S147). For example, the quantization unit 115 reduces the quantization step by dividing it by a protection ratio larger than 1 input from the code amount control unit 143, or enlarges the quantization step by dividing it by a protection ratio smaller than 1.
- FIG. 13C is a flowchart showing a third example of the flow of the quantization control process that can be executed in step S140 of FIG.
- the third example shows an example of the flow of quantization control processing for protecting the gradations of both the high luminance part and the high color difference part in the image.
- the statistical calculation unit 141 calculates statistics regarding the intensity of the luminance component of the target partial region (step S141). Further, the statistical calculation unit 141 calculates statistics regarding the intensity of the color difference component of the target partial region (step S142). Then, the statistical calculation unit 141 outputs the calculated statistics to the code amount control unit 143.
- The code amount control unit 143 determines, based on the statistics of the target partial region (for example, a histogram of the color difference component) input from the statistical calculation unit 141, whether to apply luminance-dependent protection or color-difference-dependent protection to the target partial region (step S143).
- When the code amount control unit 143 determines that luminance-dependent protection is applied to the target partial region, it determines the protection ratio corresponding to the luminance statistics of the target partial region by referring to the protection ratio table or by using a function for calculating the protection ratio (step S144).
- the quantization unit 115 scales the quantization step according to the protection ratio input from the code amount control unit 143 based on the luminance statistics (step S148).
- When the code amount control unit 143 determines that color-difference-dependent protection is applied to the target partial region, it determines the protection ratio corresponding to the color difference statistics of the target partial region by referring to the protection ratio table or by using a function for calculating the protection ratio (step S145). Then, the quantization unit 115 scales the quantization step according to the protection ratio input from the code amount control unit 143 based on the color difference statistics (step S149).
- FIG. 14 is a block diagram illustrating a modification of the configuration of the image processing device according to the first embodiment.
- The image processing apparatus 100c illustrated in FIG. 14 is, for example, the server apparatus 15 in the example of FIG. 6A, or the image processing apparatus 12 or the terminal apparatus 16 in the other example.
- the image processing apparatus 100c includes a signal acquisition unit 101, an encoding unit 110, and a control unit 140c.
- the signal acquisition unit 101 acquires an input image signal generated based on a transfer function relating to conversion between light and an image signal.
- the input image signal acquired by the signal acquisition unit 101 is a signal that is converted from light by an HDR transfer function and is generated in an HDR signal format.
- The transfer function for HDR here may be, for example, a transfer function such as HLG, ST2084, or S-Log3 that enables display of video with a luminance higher than 100 nits.
- the encoding unit 110 encodes an image represented by the image signal input from the signal acquisition unit 101 to generate an encoded bit stream.
- the image processing apparatus 100c may include the signal processing unit 102 described with reference to FIG. 7B instead of the signal acquisition unit 101.
- The control unit 140c controls the code amount allocated by the encoding unit 110 to each partial region of the image, depending on at least one of the luminance component and the color difference component, on the premise that an HDR transfer function is applied to the image to be encoded. More specifically, the control unit 140c can control the quantization step used for each partial region depending on at least one of the luminance component and the color difference component, according to any of the examples described with reference to FIGS. 13A to 13C, without determining the type of the transfer function, thereby controlling the code amount allocated to each partial region.
- The scaling of the quantization step may be realized, for example, by multiplication or division by a parameter determined using a function having the code value of a color component as an argument (for example, division by the protection ratio), as described with reference to FIG. 9A. Instead, the scaling of the quantization step may be realized based on the sub-range to which the code value of the color component belongs, as previously described.
- FIG. 15 is a flowchart illustrating an example of the flow of the encoding control process according to the modification described with reference to FIG.
- the encoding control process described here may be repeated for each image constituting the video. Processing steps for obtaining or setting parameters that do not change across multiple images may be skipped in the second and subsequent iterations.
- description of processing steps not directly related to code amount control is omitted.
- the signal acquisition unit 101 or the signal processing unit 102 acquires an image signal to which an HDR transfer function relating to conversion between light and an image signal is applied (step S111).
- the image signal acquired here is output to the encoding unit 110.
- control unit 140c sets a protection ratio table or function used when setting the protection ratio for each partial area (step S115).
- The protection ratio table or function set here may be common across a plurality of transfer functions for HDR, or may differ depending on which of the plurality of transfer functions for HDR is applied.
- the quantization unit 115 of the encoding unit 110 determines a quantization step for the target partial region so that a required compression rate is achieved (step S130).
- control unit 140c executes one of the quantization control processes described with reference to FIGS. 13A to 13C (step S140). Thereby, the quantization step of the partial region of interest determined in step S130 is scaled.
- the quantization unit 115 quantizes the transform coefficient data of the target partial region input from the orthogonal transform unit 114 in the quantization step after scaling (step S160).
- the lossless encoding unit 116 encodes the quantized data and the quantization parameter input from the quantization unit 115 to generate an encoded bit stream (step S170).
- Steps S130 to S170 are repeated until processing is completed for all partial areas in the picture (step S180).
- the encoding control process illustrated in FIG. 15 ends (step S190).
- As described above, when the first transfer function, out of a first transfer function corresponding to a first dynamic range and a second transfer function corresponding to a second dynamic range narrower than the first dynamic range, is applied to an image, the code amount assigned to each partial region can be controlled depending on at least one of the luminance component and the color difference component of that partial region.
- When the transfer function corresponding to the wider dynamic range is applied, the allocated code amount determined regardless of the transfer function can be adjusted for each partial region depending on the strength of at least one color component.
- Even when an encoder configuration designed or tuned on the assumption of a specific dynamic range is utilized for an extended dynamic range, it is therefore possible to optimize the allocated code amount and reduce image quality degradation.
- The first dynamic range may be a dynamic range enabling display with a luminance higher than 100 nits, and the second dynamic range may be a dynamic range with an upper limit of 100 nits. Accordingly, an encoder designed for existing SDR video can be used to encode HDR video to which a transfer function such as HLG, ST2084, or S-Log3 is applied, while preventing deterioration of image quality.
- The code amount allocated to each partial region is controlled by scaling the quantization step depending on at least one of the luminance component and the color difference component of the partial region. For example, the gradation of an image can be better preserved by scaling the quantization step determined according to application requirements (such as the required compression rate) to a smaller value. Also, by scaling the quantization step to a larger value for a partial region to which a relatively excessive code amount would otherwise be allocated, a decrease in encoding efficiency can be compensated for.
- The quantization step used for each partial region is scaled so that a larger code amount is assigned to a partial region in which the intensity of at least one of the luminance component and the color difference component is stronger. As described above, in the HDR case, image information is compressed at a higher compression ratio than in the SDR case, particularly in portions where the code value is relatively large, and this has caused codec distortion to be magnified in the high luminance and high color difference portions of the image when HDR video is displayed.
- By reducing the quantization step, and thus raising the allocated code amount, in partial regions where the intensity of a color component is higher, codec distortion can be reduced and the gradation changes of the original image can be reproduced appropriately.
- the transfer function can be determined based on the input information related to the transfer function applied to the image.
- control based on the transfer function can be executed as desired by the user even when the transfer function cannot be determined from the input signal.
- control based on a transfer function can be automatically executed without requiring user input.
- The code amount allocated to each partial region of the image is controlled depending on at least one of the luminance component and the color difference component. As a result, it is possible to prevent codec distortion from becoming conspicuous in partial regions of the image due to a shortage of the code amount allocated for expressing the gradation of the original signal.
- When encoding an image, an encoder generally selects the best mode, from the viewpoint of encoding efficiency, from a plurality of selectable modes, encodes mode information indicating the selected mode, and transmits it to the decoder.
- Mode selection includes, for example, selection of a prediction mode in intra prediction (e.g., prediction direction and prediction block size), selection of a prediction mode in inter prediction (e.g., motion vector, prediction block size, and reference picture), and selection of a prediction method between the intra prediction mode and the inter prediction mode.
- Mode selection is usually performed by evaluating, across a plurality of candidate modes, a cost that can correspond to the sum of the code amount generated from the prediction residual remaining after subtracting the predicted image data from the original image data and the code amount generated as overhead from the mode information.
- However, a cost evaluation formula designed or tuned for SDR video is not necessarily optimal for HDR video. This is because the image information of HDR video is more strongly compressed than that of SDR video, and when the same evaluation formula is used, the code amount generated from the prediction residual tends to be underestimated relative to the code amount of the mode information.
- The inventors have recognized that when an image signal of sample video expressed in the HDR signal format is encoded with an existing encoder compliant with H.264/AVC, an unpredictable bias often occurs in the selected prediction modes. For example, when the prediction mode selected for each prediction block as a result of intra prediction for a certain image was analyzed, DC prediction (also referred to as average value prediction) was selected for an unnaturally large number of blocks over the entire image. Such a bias in the prediction mode deteriorates prediction accuracy, resulting in distortion scattered throughout the image under a required compression rate. The prediction mode is biased because a uniform cost evaluation formula for mode selection is not suitable for HDR video: in the HDR case, as a result of the strong compression of the image information, the contribution of the prediction residual to the cost evaluation formula becomes small, and it is presumed that the contribution of the mode information becomes excessively dominant.
- RD (Rate Distortion) optimization based on Lagrange's undetermined multiplier method is known as a method for selecting the best mode from a plurality of candidate modes.
- For example, the coding cost J_i for the i-th candidate mode may be described as: J_i = D_i + λ·R_i
- D_i represents the distortion generated in the image in the i-th candidate mode (hereinafter referred to as the prediction residual code amount), and usually corresponds to the sum of absolute differences (SAD) between the original image and the predicted image.
- R_i represents the code amount of overhead bits (for example, mode information indicating the prediction mode) generated in the i-th candidate mode.
- λ is a coefficient depending on the quantization parameter QP.
- Depending on the implementation, an offset value depending on QP may be added (or subtracted) instead of multiplying by the coefficient λ.
- As the prediction residual code amount D_i, a sum of absolute differences computed after a Hadamard transform may also be used.
- For the overhead bit code amount term R_i (hereinafter referred to as the mode code amount), it is useful to use a fixed value predefined for each candidate mode.
- However, the same gradation difference in the image before signal conversion is compressed to a smaller code value difference in the HDR case than in the SDR case. As a result, a mode code amount R_i optimized for SDR video is too large to be included in the cost evaluation formula together with the prediction residual code amount D_i generated for HDR video.
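The cost evaluation and mode selection discussed above can be sketched as follows, assuming SAD for D_i and fixed per-mode bit counts for R_i; all names and concrete values are illustrative, not taken from any codec specification.

```python
# Sketch of RD-style mode selection: J_i = D_i + lam * R_i, where D_i is
# the SAD between original and predicted blocks and R_i a fixed per-mode
# overhead (mode code amount).
def sad(original, predicted):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(o - p) for o, p in zip(original, predicted))

def select_mode(original, predictions, mode_bits, lam):
    """Return the index of the candidate mode minimizing J_i = D_i + lam*R_i."""
    costs = [sad(original, pred) + lam * bits
             for pred, bits in zip(predictions, mode_bits)]
    return min(range(len(costs)), key=costs.__getitem__)
```

Under this model, shrinking all residuals (as a strongly compressive transfer function does) while keeping `mode_bits` fixed can change which index is returned, which is exactly the bias discussed above.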
- FIG. 16A and FIG. 16B are explanatory diagrams for explaining the influence of the difference in transfer function on mode selection.
- the horizontal axis of the graph shown in the upper part of FIG. 16A represents the pixel position in the horizontal direction on one line of the image.
- the vertical axis represents the code value of a color component in the pixel column on the line.
- the solid line in the graph represents the code value of the original image. In the example shown, the code value takes a large value in the left half of the line, decreases in the center of the line, and takes a small value in the right half of the line.
- a broken line in the left graph represents a code value of a predicted image that can be generated according to DC prediction, which is one of intra prediction modes.
- a broken line in the right graph represents a code value of a prediction image that can be generated according to diagonal prediction, which is another prediction mode of intra prediction.
- The area of the portion surrounded by the trace of the code value of the original image (solid line) and the trace of the code value of the predicted image (broken line) (the shaded portion in the figure) corresponds to the prediction error when the respective prediction mode is selected.
- the prediction error of DC prediction is larger than the prediction error of diagonal direction prediction.
- On the other hand, a smaller mode code amount is given to DC prediction, which has a smaller mode number, than to diagonal prediction.
- When the sum of the prediction error code amount and the mode code amount, that is, the cost, is compared between the two prediction modes, diagonal direction prediction has a lower cost value than DC prediction. Therefore, in this case, diagonal direction prediction can be selected as the prediction mode for intra prediction.
- In the graph of FIG. 16B as well, the solid line represents the code value of the original image.
- However, whereas the transfer function for SDR was applied to the image in FIG. 16A, the transfer function for HDR was applied to the image in FIG. 16B, so that a gradation difference that was originally at the same level is compressed to a smaller code value difference. As a result, the area surrounded by the trajectory of the code value of the original image (solid line) and the trajectory of the code value of the predicted image (broken line), that is, the prediction error (the shaded portion in the figure), is smaller for both DC prediction and diagonal direction prediction, and the difference between these prediction errors is small.
- Consequently, when the mode code amount is taken into account, DC prediction can be selected as the prediction mode for intra prediction.
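A toy numeric illustration (all values hypothetical) of this effect: compressing the code value differences, as an HDR transfer function does, shrinks the gap between the residual costs of the two modes until the fixed mode code amount flips the selection toward DC prediction.

```python
# Hypothetical numbers illustrating the mode-selection flip described above.
def cost(residual_sad, mode_bits, lam=1.0):
    # J = D + lam * R, with a fixed per-mode overhead R.
    return residual_sad + lam * mode_bits

# SDR-like residuals: diagonal prediction fits far better, so it wins.
dc_sdr, diag_sdr = cost(40, 1), cost(10, 4)    # 41.0 vs 14.0
# HDR-like residuals: the same differences compressed 20x; the fixed mode
# code amount now dominates and DC prediction wins despite a worse fit.
dc_hdr, diag_hdr = cost(2.0, 1), cost(0.5, 4)  # 3.0 vs 4.5
```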
- Therefore, in the present embodiment, a method of controlling, based on the transfer function, one of the prediction residual code amount and the mode code amount included in the cost evaluation formula is proposed.
- the image processing system according to the present embodiment may be configured similarly to the image processing system 10a or 10b according to the first embodiment.
- An image processing device in the system, a server device or a terminal device, or an image processing module mounted on any of these devices has a function as an image processing apparatus (that is, an encoder) that encodes an image acquired based on a transfer function relating to conversion between light and an image signal.
- When such an encoder encodes an image, the prediction residual code amount or the mode code amount for mode selection is controlled based on the transfer function. Thereby, selection of an inappropriate mode when the signal format for HDR is used is avoided, and deterioration of image quality is reduced. From the next section, specific and exemplary configurations of such an image processing apparatus will be described in detail.
- FIG. 17A is a block diagram illustrating a first example of a schematic configuration of the image processing apparatus according to the present embodiment.
- the image processing apparatus 200a illustrated in FIG. 17A includes a signal acquisition unit 201, an information acquisition unit 203, an encoding unit 210, and a control unit 240.
- the signal acquisition unit 201 acquires an input image signal generated based on a transfer function related to conversion between light and an image signal.
- The signal acquisition unit 201 may acquire the input image signal from an external device via a transmission interface, or may acquire the input image signal from an imaging module and a signal processing module (not shown) configured integrally with the image processing apparatus 200a.
- the information acquisition unit 203 acquires input information related to the transfer function applied to the image encoded by the encoding unit 210.
- the information acquisition unit 203 may acquire input information via a user interface included in the image processing apparatus 200a.
- the user interface may be provided by a physical input device such as a touch panel, a button, or a switch provided in the casing of the image processing apparatus 200a.
- the user interface may be provided as a GUI on a terminal device that is remotely connected via a communication interface.
- the input information includes at least a transfer function type indicating the type of transfer function applied to an image to be encoded.
- the user interface may cause the user to select one of the two options “SDR” and “HDR” to be applied to the image. In this case, it is determined that one predefined transfer function for SDR or one predefined transfer function for HDR is applied to the image.
- the user interface may allow the user to select a transfer function to be applied to the image from a plurality of transfer function candidates.
- the information acquisition unit 203 may acquire input information from an auxiliary signal multiplexed with an input image signal.
- the auxiliary signal is received by the signal acquisition unit 201 during a period in which no image signal is transmitted on the signal line. Then, the information acquisition unit 203 can acquire input information including a transfer function type indicating the type of transfer function applied to the image from the auxiliary signal separated in the signal acquisition unit 201.
- the encoding unit 210 encodes an image represented by the image signal acquired by the signal acquisition unit 201 to generate an encoded bitstream.
- The encoding unit 210 may execute the encoding process according to any video encoding scheme such as MPEG-2, H.264/AVC, or H.265/HEVC.
- the encoding process executed by the encoding unit 210 typically includes various processes such as prediction, orthogonal transform, quantization, and entropy encoding, and various mode selections are executed in these processes.
- Although mode selection in intra prediction and inter prediction will be mainly described here, the idea of the present embodiment may be used for any type of mode selection, such as selection of the transform block size or the prediction mode of inter-layer prediction for scalable coding.
- The control unit 240 controls, based on the type of the transfer function indicated by the input information acquired by the information acquisition unit 203, the prediction residual code amount or the mode code amount for mode selection when the encoding unit 210 encodes an image.
- For example, the control unit 240 switches, based on the transfer function, at least one term included in the cost evaluation formula for mode selection, so that neither the prediction residual code amount nor the mode code amount is underestimated or overestimated relative to the other.
- Although an example in which the cost evaluation is controlled mainly based on the type of the transfer function is described here, the cost evaluation may be controlled based on other attributes of the transfer function, such as the upper limit of the dynamic range associated with the transfer function.
- FIG. 17B is a block diagram illustrating a second example of a schematic configuration of the image processing apparatus according to the present embodiment.
- the image processing apparatus 200b illustrated in FIG. 17B includes a signal processing unit 202, an information acquisition unit 204, a storage unit 207, an encoding unit 210, and a control unit 240.
- The signal processing unit 202 acquires a primitive image signal input from an imaging device via some transmission interface or a signal line inside the device, or acquires an image signal from a video file stored in the storage unit 207. Then, the signal processing unit 202 performs digital signal processing, which can include, for example, gamma correction and color conversion, on the primitive image signal, and generates an image signal to be encoded in a predetermined signal format. The signal format applied to the image by the signal processing unit 202 and the corresponding transfer function are determined based on the input information acquired by the information acquisition unit 204. Then, the signal processing unit 202 outputs the generated image signal to the encoding unit 210.
- the information acquisition unit 204 acquires input information related to a transfer function applied to the image encoded by the encoding unit 210.
- the information acquisition unit 204 may acquire input information via a user interface included in the image processing apparatus 200b.
- the input information includes at least a transfer function type indicating the type of transfer function applied to an image to be encoded.
- the user interface may cause the user to select one of the two options “SDR” and “HDR” to be applied to the image.
- the user interface may allow the user to select a transfer function to be applied to the image from a plurality of transfer function candidates.
- the storage unit 207 is a storage device for storing various video data.
- the storage unit 207 may store a video file that records a digital image signal before signal conversion.
- the user may store the video file stored in another storage medium in the storage unit 207 via an input / output interface (not shown) included in the image processing apparatus 200b.
- the storage unit 207 may store a video file including an encoded bit stream generated as a result of the encoding process executed by the encoding unit 210.
- the video file may be output to an external device upon request.
- the encoding unit 210 encodes an image represented by the image signal acquired by the signal processing unit 202 to generate an encoded bitstream. Based on the type of transfer function indicated by the input information acquired by the information acquisition unit 204, the control unit 240 uses a prediction residual code amount or mode for mode selection when encoding an image in the encoding unit 210. Control the amount of code.
- the encoded bit stream generated by the encoding unit 210 may be transmitted to a device external to the image processing device 200b, or may be stored as a video file by the storage unit 207.
- FIG. 18 is a block diagram illustrating an example of a detailed configuration of the encoding unit and the control unit according to the second embodiment.
- the encoding unit 210 includes a rearrangement buffer 211, a block setting unit 212, a subtraction unit 213, an orthogonal transform unit 214, a quantization unit 215, a lossless encoding unit 216, an inverse quantization unit 221, an inverse orthogonal transform unit 222, an addition unit 223, a loop filter 224, a frame memory 226, a switch 227, a mode selection unit 228, an intra prediction unit 230, and an inter prediction unit 235.
- the rearrangement buffer 211 rearranges the image data of a series of images expressed by the input image signal according to the GOP structure.
- the rearrangement buffer 211 outputs the rearranged image data to the block setting unit 212, the intra prediction unit 230, and the inter prediction unit 235.
- the block setting unit 212 divides each image corresponding to a picture into a plurality of blocks.
- In MPEG-2 and H.264/AVC, a picture is divided into a plurality of macroblocks of fixed size in a grid pattern.
- In H.265/HEVC, a picture is divided into a plurality of coding units of variable size in a quadtree shape. These blocks may be further divided into one or more prediction blocks in the prediction process.
- the subtraction unit 213 calculates prediction residual data that is the difference between the image data input from the block setting unit 212 and the predicted image data, and outputs the prediction residual data to the orthogonal transform unit 214.
- the orthogonal transform unit 214 transforms the prediction residual data input from the subtraction unit 213 from image data in the spatial domain to transform coefficient data in the frequency domain.
- the orthogonal transformation executed by the orthogonal transformation unit 214 may be, for example, discrete cosine transformation or discrete sine transformation. Then, orthogonal transform section 214 outputs transform coefficient data to quantization section 215.
- the quantization unit 215 quantizes the transform coefficient data input from the orthogonal transform unit 214 in a quantization step that is determined so that a required compression rate is achieved. Then, the quantization unit 215 outputs the quantized transform coefficient data (hereinafter referred to as quantization data) to the lossless encoding unit 216 and the inverse quantization unit 221.
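As an illustration of the quantization and inverse quantization described above, the following is a minimal Python sketch using a uniform quantization step; the coefficient values and the step size are hypothetical, and an actual encoder would determine the step from rate control to achieve the required compression rate:

```python
def quantize(coeffs, step):
    """Quantize transform coefficients with a uniform quantization step
    (cf. quantization unit 215). A larger step raises the compression
    rate but discards more information."""
    return [round(c / step) for c in coeffs]

def dequantize(levels, step):
    """Inverse quantization (cf. inverse quantization unit 221): scale
    the quantized levels back with the same step."""
    return [level * step for level in levels]

coeffs = [103.7, -21.4, 5.2, -0.8]   # hypothetical transform coefficients
step = 8.0
levels = quantize(coeffs, step)      # -> [13, -3, 1, 0]
restored = dequantize(levels, step)  # -> [104.0, -24.0, 8.0, 0.0]
```

The restored coefficients differ from the originals, which is why quantization is the lossy stage of the pipeline and why the local decoder (units 221 to 223) must reconstruct the image from the quantized data rather than from the originals.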
- the lossless encoding unit 216 generates an encoded bitstream by encoding the quantized data input from the quantization unit 215. Further, the lossless encoding unit 216 encodes various parameters referred to by the decoder, and inserts the encoded parameters into the encoded bitstream.
- the parameters encoded by the lossless encoding unit 216 may include information regarding transfer functions, information regarding color gamuts, information regarding intra prediction, and information regarding inter prediction.
- the lossless encoding unit 216 outputs the generated encoded bitstream to an output destination according to the purpose of the application.
- the inverse quantization unit 221, the inverse orthogonal transform unit 222, and the addition unit 223 constitute a local decoder.
- the local decoder is responsible for reconstructing the original image from the encoded data.
- the inverse quantization unit 221 dequantizes the quantized data in the same quantization step as that used by the quantization unit 215, and restores transform coefficient data. Then, the inverse quantization unit 221 outputs the restored transform coefficient data to the inverse orthogonal transform unit 222.
- the inverse orthogonal transform unit 222 restores the prediction residual data by executing an inverse orthogonal transform process on the transform coefficient data input from the inverse quantization unit 221. Then, the inverse orthogonal transform unit 222 outputs the restored prediction residual data to the addition unit 223.
- the adding unit 223 generates decoded image data by adding the restored prediction residual data input from the inverse orthogonal transform unit 222 and the predicted image data generated by the intra prediction unit 230 or the inter prediction unit 235. To do. Then, the adding unit 223 outputs the generated decoded image data to the loop filter 224 and the frame memory 226.
- the loop filter 224 is an in-loop filter for the purpose of improving the image quality of the decoded image.
- the loop filter 224 may include a deblocking filter for reducing block distortion appearing in the decoded image, for example.
- the loop filter 224 may include an adaptive offset filter for adding an edge offset or a band offset to the decoded image.
- the loop filter 224 outputs the decoded image data after filtering to the frame memory 226.
- the frame memory 226 stores the decoded image data before filtering input from the adder 223 and the decoded image data after application of the in-loop filter input from the loop filter 224.
- the switch 227 reads decoded image data before filtering used for intra prediction from the frame memory 226, and supplies the read decoded image data to the intra prediction unit 230 as reference image data. Further, the switch 227 reads out the decoded image data after filtering used for inter prediction from the frame memory 226, and supplies the read out decoded image data to the inter prediction unit 235 as reference image data.
- the mode selection unit 228 selects a prediction method for each block based on a comparison of the costs input from the intra prediction unit 230 and the inter prediction unit 235.
- the mode selection unit 228 outputs predicted image data generated by the intra prediction unit 230 to the subtraction unit 213 and outputs information related to the intra prediction to the lossless encoding unit 216 for the block for which the intra prediction has been selected.
- For the block for which inter prediction has been selected, the mode selection unit 228 outputs the predicted image data generated by the inter prediction unit 235 to the subtraction unit 213 and outputs information related to inter prediction to the lossless encoding unit 216.
- the intra prediction unit 230 executes an intra prediction process based on the original image data and the decoded image data. For example, the intra prediction unit 230 evaluates a cost estimated to occur for each of a plurality of candidate modes included in the search range. The cost is evaluated according to, for example, the cost evaluation formula (1) described above or a similar evaluation formula. Typically, the cost evaluation formula includes a prediction residual code amount term and a mode code amount term. In the present embodiment, at least one of the term of the prediction residual code amount and the term of the mode code amount is controlled based on the type of transfer function by the cost control unit 241 described later. The intra prediction unit 230 selects, as the best prediction mode, the prediction mode with the lowest cost based on the cost evaluation results over a plurality of candidate modes.
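The cost evaluation and mode selection described above can be sketched as follows, assuming the common rate-distortion form Cost_i = D_i + λ·R_i for Equation (1); the mode names, the code amounts, and the value of λ are hypothetical illustration values, not values from this embodiment:

```python
def select_best_mode(candidates, lam):
    """Evaluate Cost_i = D_i + lam * R_i for each candidate mode, where
    D_i is the prediction residual code amount term and R_i is the mode
    code amount term, and return the mode with the lowest cost."""
    best_mode, best_cost = None, float("inf")
    for mode, (d_i, r_i) in candidates.items():
        cost = d_i + lam * r_i
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost

# Hypothetical intra candidate modes: mode -> (D_i, R_i).
candidates = {
    "DC":        (1200.0, 2.0),
    "planar":    (1100.0, 3.0),
    "angular_7": ( 950.0, 6.0),
}
best_mode, best_cost = select_best_mode(candidates, lam=30.0)
# -> ("angular_7", 1130.0): the larger mode code amount is outweighed
#    by the smaller prediction residual.
```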
- the intra prediction unit 230 generates predicted image data according to the selected best prediction mode. Then, the intra prediction unit 230 outputs information related to intra prediction, including prediction mode information indicating the best prediction mode (prediction direction, prediction block size, and the like), the corresponding cost, and the predicted image data, to the mode selection unit 228.
- the inter prediction unit 235 performs inter prediction processing (motion compensation) based on the original image data and the decoded image data. For example, the inter prediction unit 235 evaluates the cost estimated to occur for each of a plurality of candidate modes included in the search range. Similar to the case of intra prediction, cost evaluation is typically performed according to a cost evaluation formula including a prediction residual code amount term and a mode code amount term. In the present embodiment, at least one of the term of the prediction residual code amount and the term of the mode code amount is controlled based on the type of transfer function by the cost control unit 241 described later. The inter prediction unit 235 selects the prediction mode with the lowest cost as the best prediction mode based on the cost evaluation results over the plurality of candidate modes.
- the inter prediction unit 235 generates predicted image data according to the selected best prediction mode. Then, the inter prediction unit 235 outputs information related to inter prediction, including prediction mode information indicating the best prediction mode (motion vector, reference picture, prediction block size, and the like), the corresponding cost, and the predicted image data, to the mode selection unit 228.
- the control unit 240 includes a cost control unit 241 and a set value storage unit 243.
- the cost control unit 241 determines the type of transfer function applied to the image to be encoded based on the input information input from the information acquisition unit 203 or 204. Then, the cost control unit 241 controls cost evaluation for mode selection in one or more sections of the encoding unit 210 based on the determined type of transfer function. More specifically, for example, the cost control unit 241 can adjust the balance between the contribution of the prediction residual and the contribution of the mode information in the cost evaluation by scaling one of the prediction residual code amount and the mode code amount included in the cost evaluation formula.
- When the variation of the prediction residual code amount across a plurality of candidate modes is small relative to the variation of the mode code amount, the contribution of the mode code amount to mode selection becomes excessive compared with the contribution of the prediction residual code amount. In that case, the optimum mode is determined in a state where the fluctuation of the prediction residual code amount is underestimated.
- Conversely, when the variation of the mode code amount is small relative to the variation of the prediction residual code amount across a plurality of candidate modes, the contribution of the mode code amount to mode selection becomes smaller than the contribution of the prediction residual code amount. In that case, the optimum mode is determined in a state where the fluctuation of the mode code amount is underestimated. For this reason, it is beneficial to appropriately adjust the contribution of these code amounts, optimize the balance between the two, and perform an appropriate cost evaluation.
- In a first example, the code amount scaling may be performed by selecting, depending on the type of transfer function, the set to be used in cost evaluation from a plurality of predefined cost value sets. Since the prediction residual depends on the image and cannot be defined in advance, in the first example it is the mode code amount (for example, the term R i in Equation (1)) that is defined in advance for each candidate mode.
- For example, a first set of cost values may be set when a first transfer function is applied to the image, and a second set of cost values may be set when a second transfer function is applied to the image.
- In the set of cost values set when a first transfer function corresponding to a first dynamic range (for example, a transfer function for HDR) is applied to the image, a smaller mode code amount is defined than in the set of cost values set when a second transfer function corresponding to a second, narrower dynamic range (for example, a transfer function for SDR) is applied to the image. Accordingly, the mode code amount can be reduced in accordance with the reduction of the estimated prediction residual, and an appropriate, well-balanced cost evaluation can be performed.
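The first example above can be sketched as follows; the set contents are hypothetical illustration values, and only the relationship matters: the set for HDR defines smaller mode code amounts than the set for SDR:

```python
# Predefined cost value sets (hypothetical values). The HDR set defines
# smaller mode code amounts than the SDR set, matching the smaller
# prediction residuals expected under an HDR transfer function.
COST_VALUE_SETS = {
    "SDR": {"DC": 4.0, "planar": 5.0, "angular_7": 9.0},
    "HDR": {"DC": 2.0, "planar": 2.5, "angular_7": 4.5},
}

def mode_code_amounts(transfer_function_type):
    """Select the set to be used in cost evaluation, depending on the
    type of transfer function indicated by the input information."""
    return COST_VALUE_SETS[transfer_function_type]

r_i = mode_code_amounts("HDR")["angular_7"]  # -> 4.5
```

Because the number of candidate modes is fixed by the coding scheme, this lookup can be precomputed once, which is why the document describes it as a low-processing-cost technique.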
- In a second example, the code amount scaling may be performed by adding (or subtracting) an offset corresponding to the type of transfer function, or by multiplying (or dividing) by a coefficient corresponding to the type of transfer function.
- the offset or coefficient may be applied to either the prediction residual code amount or the mode code amount.
- the cost control unit 241 may increase the prediction residual code amount or decrease the mode code amount when the transfer function for HDR is applied to an image. Conversely, the cost control unit 241 may decrease the prediction residual code amount or increase the mode code amount when the transfer function for SDR is applied to the image.
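The second example can be sketched as follows; the gain values are hypothetical tuning parameters, and this sketch applies both adjustments at once for illustration, whereas the text describes them as alternatives:

```python
def rebalance(residual_amount, mode_amount, transfer_function_type,
              residual_gain=1.5, mode_gain=0.6):
    """Scale the cost terms depending on the transfer function type.
    For HDR, the prediction residual code amount is increased and the
    mode code amount decreased; for SDR, both terms are left unchanged.
    The gains are hypothetical tuning parameters."""
    if transfer_function_type == "HDR":
        return residual_amount * residual_gain, mode_amount * mode_gain
    return residual_amount, mode_amount

d_i, r_i = rebalance(800.0, 5.0, "HDR")  # -> (1200.0, 3.0)
```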
- FIG. 19 is an explanatory diagram for describing an example of mode code amount switching based on the type of transfer function according to the first example described above.
- the set value storage unit 243 stores such cost value sets C1 and C2 defined in advance.
- the mode code amount Ri, HDR included in the cost value set C2 is smaller than the mode code amount Ri, SDR included in the cost value set C1.
- the set value storage unit 243 may store three or more cost value sets respectively associated with three or more transfer function types.
- The cost control unit 241 can thus set one of a plurality of cost value sets associated with a plurality of different transfer functions, corresponding to the type of the transfer function, in one or more sections X1 to Xn of the encoding unit 210 that perform mode selection.
- Without being limited to the example of FIG. 19, the set value storage unit 243 may store in advance parameters (for example, an offset or a coefficient) used when scaling the prediction residual code amount or the mode code amount, in association with one or more transfer functions.
- the encoding unit 210 includes the intra prediction unit 230 that performs intra prediction.
- the prediction residual code amount or mode code amount controlled by the cost control unit 241 may be used by the intra prediction unit 230 when selecting a mode from a plurality of candidate modes in intra prediction.
- the encoding unit 210 includes an inter prediction unit 235 that performs inter prediction.
- the prediction residual code amount or mode code amount controlled by the cost control unit 241 may be used by the inter prediction unit 235 when selecting a mode from a plurality of candidate modes in inter prediction.
- the encoding unit 210 includes a mode selection unit 228 that executes selection of a prediction method that is intra prediction or inter prediction.
- the prediction residual code amount or the mode code amount controlled by the cost control unit 241 may be used by the mode selection unit 228 when selecting such a prediction method.
- FIG. 20 is a flowchart illustrating an example of the flow of the encoding control process according to the present embodiment.
- the encoding control process described here may be repeated for each image constituting the video. Processing steps for obtaining or setting parameters that do not change across multiple images may be skipped in the second and subsequent iterations.
- description of processing steps not directly related to mode selection control is omitted.
- the signal acquisition unit 201 or the signal processing unit 202 acquires an image signal generated based on a transfer function related to conversion between light and an image signal (step S210).
- the image signal acquired here is output to the encoding unit 210.
- the information acquisition unit 203 or 204 acquires input information related to the transfer function applied to the image encoded by the encoding unit 210, via the user interface or from an auxiliary signal multiplexed with the input image signal (step S212).
- the input information acquired here is output to the control unit 240.
- the cost control unit 241 sets a parameter to be used when the mode is selected in the encoding unit 210 based on the type of transfer function indicated by the input information (step S214).
- the parameter set here may be a set of mode code amounts defined in advance for each candidate mode, or may be an offset or a coefficient applied to the prediction residual code amount or the mode code amount.
- a block to be processed in each iteration is referred to as a target block here.
- the intra prediction unit 230 of the encoding unit 210 evaluates the cost over the plurality of candidate modes for the block of interest, and selects the best intra prediction mode based on the cost evaluation of these candidate modes (step S220).
- the cost evaluation here can be performed using a cost evaluation formula that includes the prediction residual code amount and the mode code amount.
- the mode code amount is selected from a set of cost values set by the cost control unit 241.
- one of the prediction residual code amount and the mode code amount is scaled using a parameter set by the cost control unit 241.
- the inter prediction unit 235 evaluates the cost over a plurality of candidate modes for the block of interest, and selects the best inter prediction mode based on the cost evaluation of these candidate modes (step S230).
- the cost evaluation here can also be performed using a cost evaluation formula that includes the prediction residual code amount and the mode code amount.
- the mode code amount is selected from a set of cost values set by the cost control unit 241.
- one of the prediction residual code amount and the mode code amount is scaled using a parameter set by the cost control unit 241.
- the mode selection unit 228 selects a prediction method that realizes better coding efficiency among intra prediction and inter prediction for the block of interest (step S240).
- the selection of the prediction method here is also performed based on the cost evaluation.
- the mode selection unit 228 may reuse the cost evaluation derived in the intra prediction unit 230 and the inter prediction unit 235.
- the mode selection unit 228 may recalculate a cost value for comparison between intra prediction and inter prediction. Further, for recalculation of the cost value by the mode selection unit 228, a set of cost values different from those used in step S220 and step S230 may be adopted.
- Steps S220 to S240 are repeated until processing is completed for all blocks in the picture (step S280).
- the encoding control process shown in FIG. 20 ends (step S290).
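The mode-selection part of the encoding control flow described above (steps S214 to S280) can be sketched as follows; the block contents, code amounts, and the mode-code-amount gain are hypothetical illustration values:

```python
def encode_picture(blocks, transfer_function_type, lam=30.0):
    """Sketch of the mode-selection control flow of FIG. 20. Each block
    maps candidate modes to hypothetical pairs of (prediction residual
    code amount, mode code amount)."""
    # Step S214: set the parameter used for mode selection based on the
    # transfer function type (here, a hypothetical mode-code-amount gain).
    mode_gain = 0.6 if transfer_function_type == "HDR" else 1.0

    decisions = []
    for block in blocks:  # steps S220-S240 are repeated per block (S280)
        best_mode, best_cost = None, float("inf")
        for mode, (d_i, r_i) in block.items():  # evaluate candidate modes
            cost = d_i + lam * (r_i * mode_gain)
            if cost < best_cost:
                best_mode, best_cost = mode, cost
        decisions.append(best_mode)  # keep the lowest-cost prediction
    return decisions

blocks = [{"intra_DC": (1200.0, 2.0), "inter_skip": (900.0, 1.0)}]
print(encode_picture(blocks, "HDR"))  # prints ['inter_skip']
```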
- For example, when the transfer function for HDR is applied to an image, the mode code amount can be controlled so as to be smaller than when the transfer function for SDR is applied to an image. According to this configuration, a cost evaluation formula tuned on the assumption of a specific dynamic range can be easily reused for the extended dynamic range.
- the number of candidate modes is unique to the video coding scheme and does not change. Therefore, it is possible to adopt a technique with a low processing cost in which a plurality of sets of mode code amounts for each candidate mode are defined in advance and a set to be used is switched based on a transfer function.
- Further, the prediction residual code amount or the mode code amount can be controlled by scaling when the first transfer function (for example, the transfer function for HDR) corresponding to the first dynamic range is applied to the image. This makes it possible to reuse a process optimized for a second transfer function different from the first transfer function (for example, cost evaluation with an existing evaluation formula optimized for the transfer function for SDR).
- The first dynamic range may be a dynamic range enabling display at luminances higher than 100 nits, and the second dynamic range may be a dynamic range whose upper limit is a luminance of 100 nits. Accordingly, an encoder designed for existing SDR video can be used to encode HDR video to which a transfer function such as HLG, ST 2084, or S-Log3 is applied, while preventing degradation of image quality.
- Hardware configuration example: The embodiments described up to the previous section may be implemented using software, hardware, or a combination of software and hardware.
- a program constituting the software is stored in advance in, for example, a storage medium (non-transitory medium) provided inside or outside the apparatus. Each program is read into a RAM (Random Access Memory) at the time of execution and executed by a processor such as a CPU (Central Processing Unit).
- FIG. 21 is a block diagram illustrating an example of a hardware configuration of an apparatus to which one or more of the above-described embodiments can be applied.
- the image processing apparatus 900 includes a system bus 910, an image processing chip 920, and an off-chip memory 990.
- the image processing chip 920 includes n (n is 1 or more) processing circuits 930-1, 930-2,..., 930-n, a reference buffer 940, a system bus interface 950, and a local bus interface 960.
- the system bus 910 provides a communication path between the image processing chip 920 and an external module (for example, a central control function, an application function, a communication interface, or a user interface).
- the processing circuits 930-1, 930-2,..., 930-n are connected to the system bus 910 via the system bus interface 950 and to the off-chip memory 990 via the local bus interface 960.
- the processing circuits 930-1, 930-2,..., 930-n can also access a reference buffer 940, which can correspond to an on-chip memory (eg, SRAM).
- the off-chip memory 990 may be a frame memory that stores image data processed by the image processing chip 920, for example.
- the processing circuit 930-1 may be used for converting an image signal
- the processing circuit 930-2 may be used for encoding an image signal. Note that these processing circuits may be formed not on the same image processing chip 920 but on separate chips.
- the technology according to the present disclosure can be applied to various products.
- the technology according to the present disclosure may be applied to an operating room system as described in this section.
- FIG. 22 is a diagram schematically showing an overall configuration of an operating room system 5100 to which the technology according to the present disclosure can be applied.
- the operating room system 5100 is configured by connecting a group of devices installed in the operating room so as to cooperate with each other via an audio-visual controller (AV Controller) 5107 and an operating room control device 5109.
- As shown in FIG. 22, various devices can be installed in the operating room.
- In FIG. 22, as an example, a device group 5101 of various apparatuses for endoscopic surgery, a ceiling camera 5187 provided on the ceiling of the operating room to image the operator's hands, an operating field camera 5189 provided on the operating room ceiling to image the entire operating room, a plurality of display devices 5103A to 5103D, a recorder 5105, a patient bed 5183, and illumination 5191 are illustrated.
- the device group 5101 belongs to an endoscopic surgery system 5113 described later, and includes an endoscope, a display device that displays an image captured by the endoscope, and the like.
- Each device belonging to the endoscopic surgery system 5113 is also referred to as a medical device.
- the display devices 5103A to 5103D, the recorder 5105, the patient bed 5183, and the illumination 5191 are devices provided in an operating room, for example, separately from the endoscopic surgery system 5113.
- These devices that do not belong to the endoscopic surgery system 5113 are also referred to as non-medical devices.
- the audiovisual controller 5107 and / or the operating room control device 5109 controls the operations of these medical devices and non-medical devices in cooperation with each other.
- the audiovisual controller 5107 comprehensively controls processing related to image display in medical devices and non-medical devices.
- the device group 5101, the ceiling camera 5187, and the operating field camera 5189 may be devices (hereinafter also referred to as transmission source devices) having a function of transmitting information to be displayed during surgery (hereinafter also referred to as display information).
- Display devices 5103A to 5103D can be devices that output display information (hereinafter also referred to as output destination devices).
- the recorder 5105 may be a device that corresponds to both a transmission source device and an output destination device.
- the audiovisual controller 5107 controls the operation of the transmission source device and the output destination device, acquires display information from the transmission source device, and transmits the display information to the output destination device for display or recording.
- the display information includes various images captured during the operation, various types of information related to the operation (for example, patient physical information, past examination results, information on a surgical procedure, and the like).
- the audiovisual controller 5107 can transmit information about the image of the surgical site in the patient's body cavity captured by the endoscope from the device group 5101 as display information.
- information about the image at hand of the surgeon captured by the ceiling camera 5187 can be transmitted from the ceiling camera 5187 as display information.
- information about an image showing the entire operating room imaged by the operating field camera 5189 can be transmitted from the operating field camera 5189 as display information.
- the audiovisual controller 5107 may also acquire, as display information, information about an image captured by another device from that device.
- information about these images captured in the past is recorded by the audiovisual controller 5107 in the recorder 5105.
- the audiovisual controller 5107 can acquire information about the image captured in the past from the recorder 5105 as display information.
- the recorder 5105 may also record various types of information related to surgery in advance.
- the audiovisual controller 5107 displays the acquired display information (that is, images taken during the operation and various information related to the operation) on at least one of the display devices 5103A to 5103D that are output destination devices.
- the display device 5103A is a display device that is suspended from the ceiling of the operating room
- the display device 5103B is a display device that is installed on the wall surface of the operating room
- the display device 5103C is installed in the operating room.
- the display device 5103D is a mobile device (for example, a tablet PC (Personal Computer)) having a display function.
- the operating room system 5100 may include a device outside the operating room.
- the device outside the operating room can be, for example, a server connected to a network constructed inside or outside the hospital, a PC used by medical staff, a projector installed in a conference room of the hospital, or the like.
- the audio-visual controller 5107 can display the display information on a display device of another hospital via a video conference system or the like for telemedicine.
- the operating room control device 5109 comprehensively controls processing other than processing related to image display in non-medical devices.
- the operating room control device 5109 controls the driving of the patient bed 5183, the ceiling camera 5187, the operating field camera 5189, and the illumination 5191.
- In the operating room system 5100, a centralized operation panel 5111 is provided; via the centralized operation panel 5111, the user can give an instruction for image display to the audiovisual controller 5107, or give an instruction about the operation of the non-medical devices to the operating room control device 5109.
- the central operation panel 5111 is configured by providing a touch panel on the display surface of the display device.
- FIG. 23 is a diagram showing a display example of an operation screen on the centralized operation panel 5111.
- an operation screen corresponding to a case where the operating room system 5100 is provided with two display devices as output destination devices is shown.
- the operation screen 5193 is provided with a transmission source selection area 5195, a preview area 5197, and a control area 5201.
- In the transmission source selection area 5195, a transmission source device provided in the operating room system 5100 and a thumbnail screen representing the display information of that device are displayed in association with each other. The user can select the display information to be displayed on the display device from any of the transmission source devices displayed in the transmission source selection area 5195.
- the preview area 5197 displays a preview of the screen displayed on the two display devices (Monitor 1 and Monitor 2) that are output destination devices.
- four images are displayed as PinP on one display device.
- the four images correspond to display information transmitted from the transmission source device selected in the transmission source selection area 5195. Of the four images, one is displayed as a relatively large main image, and the remaining three are displayed as a relatively small sub image. The user can switch the main image and the sub image by appropriately selecting an area in which four images are displayed.
- a status display area 5199 is provided below the area where the four images are displayed, and the status relating to the surgery (for example, the elapsed time of the surgery, the patient's physical information, etc.) is appropriately displayed in the area. obtain.
- In the control area 5201, GUI (Graphical User Interface) parts for operating the transmission source device and GUI parts for operating the output destination device are displayed.
- the transmission source operation area 5203 is provided with GUI parts for performing various operations (panning, tilting, and zooming) on the camera of a transmission source device having an imaging function. The user can control the operation of the camera in the transmission source device by appropriately selecting these GUI components.
- When the transmission source device selected in the transmission source selection area 5195 is a recorder (that is, when images recorded in the past on the recorder are displayed in the preview area 5197), GUI parts for performing operations such as playback, stop, rewind, and fast-forward of the image can be provided in the transmission source operation area 5203.
- In the output destination operation area, GUI parts for performing various operations on the display of the output destination device are provided. The user can operate the display on the display device by appropriately selecting these GUI components.
- The operation screen displayed on the centralized operation panel 5111 is not limited to the illustrated example; the user may be able to input operations via the centralized operation panel 5111 for any device in the operating room system 5100 that can be controlled by the audiovisual controller 5107 and the operating room control device 5109.
- FIG. 24 is a diagram showing an example of a state of surgery to which the operating room system described above is applied.
- the ceiling camera 5187 and the operating field camera 5189 are provided on the ceiling of the operating room and can image the state of the operator (surgeon) 5181 performing treatment on the affected part of the patient 5185 on the patient bed 5183, as well as the entire operating room.
- the ceiling camera 5187 and the surgical field camera 5189 may be provided with a magnification adjustment function, a focal length adjustment function, a photographing direction adjustment function, and the like.
- the illumination 5191 is provided on the ceiling of the operating room and irradiates at least the hand of the operator 5181.
- the illumination 5191 may be capable of appropriately adjusting the irradiation light amount, the wavelength (color) of the irradiation light, the light irradiation direction, and the like.
- as shown in FIG. 22, the endoscopic surgery system 5113, the patient bed 5183, the ceiling camera 5187, the operative field camera 5189, and the illumination 5191 are connected to each other so as to be able to cooperate via the audiovisual controller 5107 and the operating room control device 5109 (not shown in FIG. 24).
- a centralized operation panel 5111 is provided in the operating room. As described above, the user can appropriately operate these devices existing in the operating room via the centralized operating panel 5111.
- an endoscopic surgery system 5113 includes an endoscope 5115, other surgical tools 5131, a support arm device 5141 that supports the endoscope 5115, and a cart 5151 on which various devices for endoscopic surgery are mounted.
- trocars 5139a to 5139d are punctured into the abdominal wall, and the lens barrel 5117 of the endoscope 5115 and the other surgical tools 5131 are inserted into the body cavity of the patient 5185 through the trocars 5139a to 5139d.
- an insufflation tube 5133, an energy treatment tool 5135, and forceps 5137 are inserted into the body cavity of the patient 5185.
- the energy treatment instrument 5135 is a treatment instrument that performs incision and peeling of tissue, sealing of blood vessels, and the like by means of high-frequency current or ultrasonic vibration.
- the illustrated surgical tools 5131 are merely an example; as the surgical tools 5131, various surgical tools generally used in endoscopic surgery, such as tweezers and a retractor, may be used.
- An image of the surgical site in the body cavity of the patient 5185 taken by the endoscope 5115 is displayed on the display device 5155.
- the surgeon 5181 performs a treatment such as excision of the affected part using the energy treatment tool 5135 and the forceps 5137 while viewing the image of the surgical part displayed on the display device 5155 in real time.
- the pneumoperitoneum tube 5133, the energy treatment tool 5135, and the forceps 5137 are supported by an operator 5181 or an assistant during surgery.
- the support arm device 5141 includes an arm portion 5145 extending from the base portion 5143.
- the arm portion 5145 includes joint portions 5147a, 5147b, and 5147c, and links 5149a and 5149b, and is driven by control from the arm control device 5159.
- the endoscope 5115 is supported by the arm unit 5145, and its position and posture are controlled. Thereby, the stable position fixing of the endoscope 5115 can be realized.
- the endoscope 5115 includes a lens barrel 5117 in which a region having a predetermined length from the distal end is inserted into the body cavity of the patient 5185, and a camera head 5119 connected to the proximal end of the lens barrel 5117.
- in the illustrated example, the endoscope 5115 is configured as a so-called rigid scope having a rigid lens barrel 5117, but the endoscope 5115 may instead be configured as a so-called flexible scope having a flexible lens barrel 5117.
- An opening into which an objective lens is fitted is provided at the tip of the lens barrel 5117.
- a light source device 5157 is connected to the endoscope 5115; light generated by the light source device 5157 is guided to the tip of the lens barrel by a light guide extending inside the lens barrel 5117, and is emitted toward the observation target in the body cavity of the patient 5185 through the objective lens.
- the endoscope 5115 may be a forward-viewing endoscope, an oblique-viewing endoscope, or a side-viewing endoscope.
- An optical system and an image sensor are provided inside the camera head 5119, and reflected light (observation light) from the observation target is condensed on the image sensor by the optical system. Observation light is photoelectrically converted by the imaging element, and an electrical signal corresponding to the observation light, that is, an image signal corresponding to the observation image is generated.
- the image signal is transmitted to a camera control unit (CCU) 5153 as RAW data.
- the camera head 5119 has a function of adjusting the magnification and the focal length by appropriately driving the optical system.
- a plurality of image sensors may be provided in the camera head 5119 in order to cope with, for example, stereoscopic viewing (3D display).
- in this case, a plurality of relay optical systems are provided inside the lens barrel 5117 in order to guide the observation light to each of the plurality of imaging elements.
- the CCU 5153 includes a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and the like, and comprehensively controls the operations of the endoscope 5115 and the display device 5155. Specifically, the CCU 5153 performs various image processing for displaying an image based on the image signal, such as development processing (demosaic processing), for example, on the image signal received from the camera head 5119. The CCU 5153 provides the display device 5155 with the image signal subjected to the image processing. Further, the audiovisual controller 5107 shown in FIG. 22 is connected to the CCU 5153. The CCU 5153 also provides an image signal subjected to image processing to the audiovisual controller 5107.
- the CCU 5153 transmits a control signal to the camera head 5119 to control the driving thereof.
- the control signal can include information regarding imaging conditions such as magnification and focal length. Information regarding the imaging conditions may be input via the input device 5161 or may be input via the above-described centralized operation panel 5111.
- the display device 5155 displays an image based on an image signal subjected to image processing by the CCU 5153 under the control of the CCU 5153.
- in the case where the endoscope 5115 is compatible with high-resolution imaging such as 4K (horizontal pixel count 3840 × vertical pixel count 2160) or 8K (horizontal pixel count 7680 × vertical pixel count 4320), and/or with 3D display, a display device 5155 capable of the corresponding high-resolution display and/or 3D display can be used.
- in the case of 4K or 8K high-resolution imaging, a more immersive feeling can be obtained by using a display device 5155 having a size of 55 inches or more.
- a plurality of display devices 5155 having different resolutions and sizes may be provided depending on applications.
- the light source device 5157 is composed of a light source such as an LED (light emitting diode), for example, and supplies the endoscope 5115 with irradiation light when photographing a surgical site.
- the arm control device 5159 is configured by a processor such as a CPU, for example, and operates according to a predetermined program to control driving of the arm portion 5145 of the support arm device 5141 according to a predetermined control method.
- the input device 5161 is an input interface to the endoscopic surgery system 5113.
- a user can input various information and instructions to the endoscopic surgery system 5113 via the input device 5161.
- the user inputs various types of information related to the operation, such as the patient's physical information and information about the surgical technique, via the input device 5161.
- in addition, for example, the user inputs, via the input device 5161, an instruction to drive the arm unit 5145, an instruction to change the imaging conditions (type of irradiation light, magnification, focal length, etc.) of the endoscope 5115, an instruction to drive the energy treatment instrument 5135, and the like.
- the type of the input device 5161 is not limited, and the input device 5161 may be various known input devices.
- as the input device 5161, for example, a mouse, a keyboard, a touch panel, a switch, a foot switch 5171, and/or a lever can be applied.
- the touch panel may be provided on the display surface of the display device 5155.
- alternatively, the input device 5161 is a device worn by the user, such as a glasses-type wearable device or an HMD (Head Mounted Display), and various inputs are performed according to the user's gestures and line of sight detected by these devices.
- the input device 5161 includes a camera capable of detecting a user's movement, and various inputs are performed according to a user's gesture and line of sight detected from an image captured by the camera.
- the input device 5161 includes a microphone that can pick up the voice of the user, and various inputs are performed by voice through the microphone.
- since the input device 5161 is configured to be able to accept input of various types of information without contact, a user belonging to the clean area (for example, the operator 5181) can operate a device belonging to the unclean area without contact.
- in addition, since the user can operate a device without releasing his or her hand from the surgical tool being held, the convenience for the user is improved.
- the treatment instrument control device 5163 controls driving of the energy treatment instrument 5135 for tissue cauterization, incision, blood vessel sealing, or the like.
- the pneumoperitoneum device 5165 sends gas into the body cavity of the patient 5185 via the pneumoperitoneum tube 5133 in order to inflate the body cavity.
- the recorder 5167 is an apparatus capable of recording various types of information related to surgery.
- the printer 5169 is a device that can print various types of information related to surgery in various formats such as text, images, or graphs.
- the support arm device 5141 includes a base portion 5143 which is a base, and an arm portion 5145 extending from the base portion 5143.
- the arm portion 5145 includes a plurality of joint portions 5147a, 5147b, and 5147c and a plurality of links 5149a and 5149b connected by the joint portion 5147b.
- in FIG. 24, the structure of the arm portion 5145 is shown in a simplified manner. In practice, the shapes, numbers, and arrangement of the joint portions 5147a to 5147c and the links 5149a and 5149b, the directions of the rotation axes of the joint portions 5147a to 5147c, and the like may be set appropriately so that the arm portion 5145 has a desired degree of freedom.
- for example, the arm portion 5145 is preferably configured to have six or more degrees of freedom. Accordingly, the endoscope 5115 can be moved freely within the movable range of the arm portion 5145, so that the lens barrel 5117 of the endoscope 5115 can be inserted into the body cavity of the patient 5185 from a desired direction.
- the joint portions 5147a to 5147c are provided with actuators, and the joint portions 5147a to 5147c are configured to be rotatable around a predetermined rotation axis by driving the actuators.
- the driving of the actuators is controlled by the arm control device 5159, whereby the rotation angles of the joint portions 5147a to 5147c are controlled and the driving of the arm portion 5145 is controlled. Thereby, control of the position and posture of the endoscope 5115 can be realized.
- the arm control device 5159 can control the driving of the arm unit 5145 by various known control methods such as force control or position control.
- the arm control device 5159 appropriately controls the driving of the arm portion 5145 in accordance with an operation input, whereby the position and posture of the endoscope 5115 may be controlled. With this control, the endoscope 5115 at the distal end of the arm portion 5145 can be moved from an arbitrary position to another arbitrary position and then fixedly supported at the position after the movement.
- the arm unit 5145 may be operated by a so-called master slave method. In this case, the arm unit 5145 can be remotely operated by the user via the input device 5161 installed at a location away from the operating room.
- when force control is applied, the arm control device 5159 may perform so-called power assist control, in which it receives an external force from the user and drives the actuators of the joint portions 5147a to 5147c so that the arm portion 5145 moves smoothly in accordance with that external force. Accordingly, when the user moves the arm portion 5145 while directly touching it, the arm portion 5145 can be moved with a relatively light force. Therefore, the endoscope 5115 can be moved more intuitively with a simpler operation, and the convenience for the user can be improved.
- here, in general endoscopic surgery, the endoscope 5115 is supported by a doctor called a scopist. In contrast, by using the support arm device 5141, the position of the endoscope 5115 can be fixed more reliably without relying on human hands, so that an image of the surgical site can be obtained stably and the surgery can be performed smoothly.
- the arm control device 5159 is not necessarily provided in the cart 5151. Further, the arm control device 5159 does not necessarily have to be one device. For example, the arm control device 5159 may be provided in each of the joint portions 5147a to 5147c of the arm portion 5145 of the support arm device 5141, and the plurality of arm control devices 5159 cooperate to drive the arm portion 5145. Control may be realized.
- the light source device 5157 supplies irradiation light for imaging the surgical site to the endoscope 5115.
- the light source device 5157 is constituted by a white light source constituted by, for example, an LED, a laser light source, or a combination thereof.
- when the white light source is configured by a combination of RGB laser light sources, the output intensity and output timing of each color (each wavelength) can be controlled with high accuracy, so that the white balance of the captured image can be adjusted in the light source device 5157.
- in addition, the laser light from each of the RGB laser light sources may be irradiated onto the observation target in a time-division manner, and the driving of the image sensor of the camera head 5119 may be controlled in synchronization with the irradiation timing, so that images corresponding to each of R, G, and B are captured in a time-division manner. According to this method, a color image can be obtained without providing a color filter in the image sensor.
- in addition, the driving of the light source device 5157 may be controlled so as to change the intensity of the output light at predetermined time intervals. By controlling the driving of the image sensor of the camera head 5119 in synchronization with the timing of the change in light intensity to acquire images in a time-division manner, and then combining those images, a high-dynamic-range image free of so-called blocked-up shadows and blown-out highlights can be generated.
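The synthesis step just described can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name, the saturation threshold, and the assumption of exactly two frames with a known illumination-intensity ratio are all ours.

```python
import numpy as np

def merge_hdr(low_frame, high_frame, gain_ratio, sat_level=0.95):
    """Merge two time-divided exposures into one high-dynamic-range frame.

    low_frame:  frame captured under low illumination intensity (values in 0..1)
    high_frame: frame captured under high illumination intensity (values in 0..1)
    gain_ratio: known intensity ratio (high / low) of the light source output
    """
    # Bring both frames onto a common radiance scale.
    low_lin = low_frame.astype(float) * gain_ratio
    high_lin = high_frame.astype(float)
    # Where the high-intensity frame is saturated (blown highlights),
    # trust the scaled low-intensity frame; elsewhere average both estimates.
    saturated = high_frame >= sat_level
    return np.where(saturated, low_lin, 0.5 * (low_lin + high_lin))
```

With `gain_ratio=2.0`, a pixel that saturates in the high-intensity frame keeps the gradation recovered from the low-intensity frame.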
- the light source device 5157 may be configured to be able to supply light of a predetermined wavelength band corresponding to special light observation.
- in special light observation, for example, so-called narrow band imaging is performed, in which the wavelength dependence of light absorption in body tissue is utilized and light in a narrower band than the irradiation light during normal observation (that is, white light) is irradiated, whereby a predetermined tissue, such as a blood vessel at the surface of the mucous membrane, is imaged with high contrast.
- alternatively, in special light observation, fluorescence observation may be performed, in which an image is obtained from fluorescence generated by irradiating excitation light. In fluorescence observation, the body tissue may be irradiated with excitation light and the fluorescence from the body tissue observed (autofluorescence observation), or a reagent such as indocyanine green (ICG) may be locally administered to the body tissue and the body tissue irradiated with excitation light corresponding to the fluorescence wavelength of the reagent to obtain a fluorescent image.
- the light source device 5157 can be configured to be able to supply narrowband light and / or excitation light corresponding to such special light observation.
- FIG. 25 is a block diagram illustrating an example of functional configurations of the camera head 5119 and the CCU 5153 illustrated in FIG.
- the camera head 5119 has a lens unit 5121, an imaging unit 5123, a drive unit 5125, a communication unit 5127, and a camera head control unit 5129 as its functions.
- the CCU 5153 includes a communication unit 5173, an image processing unit 5175, and a control unit 5177 as its functions.
- the camera head 5119 and the CCU 5153 are connected to each other via a transmission cable 5179 so that they can communicate with each other.
- the lens unit 5121 is an optical system provided at a connection portion with the lens barrel 5117. Observation light taken from the tip of the lens barrel 5117 is guided to the camera head 5119 and enters the lens unit 5121.
- the lens unit 5121 is configured by combining a plurality of lenses including a zoom lens and a focus lens. The optical characteristics of the lens unit 5121 are adjusted so that the observation light is condensed on the light receiving surface of the image sensor of the imaging unit 5123. Further, the zoom lens and the focus lens are configured such that their positions on the optical axis are movable in order to adjust the magnification and focus of the captured image.
- the imaging unit 5123 is configured by an imaging element, and is arranged at the rear stage of the lens unit 5121.
- the observation light that has passed through the lens unit 5121 is collected on the light receiving surface of the imaging element, and an image signal corresponding to the observation image is generated by photoelectric conversion.
- the image signal generated by the imaging unit 5123 is provided to the communication unit 5127.
- as the imaging element constituting the imaging unit 5123, for example, a CMOS (Complementary Metal Oxide Semiconductor) image sensor that has a Bayer array and is capable of color imaging is used.
- the imaging element for example, an element capable of capturing a high-resolution image of 4K or more may be used.
- in addition, the imaging element constituting the imaging unit 5123 may be configured to have a pair of imaging elements for acquiring image signals for the right eye and the left eye corresponding to 3D display. By performing 3D display, the operator 5181 can grasp the depth of the living tissue at the surgical site more accurately. Note that when the imaging unit 5123 is configured as a multi-plate type, a plurality of lens units 5121 are also provided corresponding to the respective imaging elements.
- the imaging unit 5123 is not necessarily provided in the camera head 5119.
- the imaging unit 5123 may be provided inside the lens barrel 5117 immediately after the objective lens.
- the driving unit 5125 includes an actuator, and moves the zoom lens and the focus lens of the lens unit 5121 by a predetermined distance along the optical axis under the control of the camera head control unit 5129. Thereby, the magnification and focus of the image captured by the imaging unit 5123 can be adjusted as appropriate.
- the communication unit 5127 includes a communication device for transmitting and receiving various types of information to and from the CCU 5153.
- the communication unit 5127 transmits the image signal obtained from the imaging unit 5123 to the CCU 5153 via the transmission cable 5179 as RAW data.
- the image signal is preferably transmitted by optical communication.
- this is because the surgeon 5181 performs the surgery while observing the state of the affected part from the captured image, and therefore, for safer and more reliable surgery, the moving image of the surgical site is required to be displayed in real time as much as possible.
- in the case where optical communication is performed, the communication unit 5127 is provided with a photoelectric conversion module that converts an electrical signal into an optical signal.
- the image signal is converted into an optical signal by the photoelectric conversion module, and then transmitted to the CCU 5153 via the transmission cable 5179.
- the communication unit 5127 receives a control signal for controlling the driving of the camera head 5119 from the CCU 5153.
- the control signal includes, for example, information specifying the frame rate of the captured image, information specifying the exposure value at the time of imaging, and/or information specifying the magnification and focus of the captured image, that is, information regarding the imaging conditions.
- the communication unit 5127 provides the received control signal to the camera head control unit 5129.
- the control signal from the CCU 5153 may also be transmitted by optical communication.
- the communication unit 5127 is provided with a photoelectric conversion module that converts an optical signal into an electrical signal.
- the control signal is converted into an electrical signal by the photoelectric conversion module and then provided to the camera head control unit 5129.
- note that imaging conditions such as the frame rate, exposure value, magnification, and focus are automatically set by the control unit 5177 of the CCU 5153 based on the acquired image signal. That is, so-called AE (Auto Exposure), AF (Auto Focus), and AWB (Auto White Balance) functions are mounted on the endoscope 5115.
- the camera head control unit 5129 controls the driving of the camera head 5119 based on the control signal received from the CCU 5153 via the communication unit 5127. For example, the camera head control unit 5129 controls the driving of the image sensor of the imaging unit 5123 based on the information specifying the frame rate of the captured image and/or the information specifying the exposure at the time of imaging. Also, for example, the camera head control unit 5129 appropriately moves the zoom lens and the focus lens of the lens unit 5121 via the drive unit 5125 based on the information specifying the magnification and focus of the captured image.
- the camera head control unit 5129 may further have a function of storing information for identifying the lens barrel 5117 and the camera head 5119.
- note that the camera head 5119 can be made resistant to autoclave sterilization by arranging the lens unit 5121, the imaging unit 5123, and the like in a sealed structure with high airtightness and waterproofness.
- the communication unit 5173 is configured by a communication device for transmitting and receiving various types of information to and from the camera head 5119.
- the communication unit 5173 receives an image signal transmitted from the camera head 5119 via the transmission cable 5179.
- the image signal can be suitably transmitted by optical communication.
- the communication unit 5173 is provided with a photoelectric conversion module that converts an optical signal into an electric signal.
- the communication unit 5173 provides the image processing unit 5175 with the image signal converted into the electrical signal.
- the communication unit 5173 transmits a control signal for controlling the driving of the camera head 5119 to the camera head 5119.
- the control signal may also be transmitted by optical communication.
- the image processing unit 5175 performs various types of image processing on the image signal, which is RAW data transmitted from the camera head 5119. Examples of the image processing include various known signal processing such as development processing, high image quality processing (band enhancement processing, super-resolution processing, NR (noise reduction) processing, and/or camera shake correction processing), and/or enlargement processing (electronic zoom processing). Further, the image processing unit 5175 performs detection processing on the image signal for performing AE, AF, and AWB.
- the image processing unit 5175 is configured by a processor such as a CPU or a GPU, and the above-described image processing and detection processing can be performed by the processor operating according to a predetermined program. Note that when the image processing unit 5175 includes a plurality of GPUs, the image processing unit 5175 appropriately divides information related to the image signal, and performs image processing in parallel with the plurality of GPUs.
- the control unit 5177 performs various controls relating to imaging of the surgical site by the endoscope 5115 and display of the captured image. For example, the control unit 5177 generates a control signal for controlling the driving of the camera head 5119. At this time, when imaging conditions have been input by the user, the control unit 5177 generates the control signal based on the user input. Alternatively, when the endoscope 5115 is equipped with the AE function, the AF function, and the AWB function, the control unit 5177 appropriately calculates the optimum exposure value, focal length, and white balance according to the result of the detection processing by the image processing unit 5175, and generates the control signal.
- control unit 5177 causes the display device 5155 to display an image of the surgical site based on the image signal subjected to image processing by the image processing unit 5175.
- in addition, the control unit 5177 recognizes various objects in the surgical site image using various image recognition technologies. For example, by detecting the shape, color, and the like of the edges of objects included in the surgical site image, the control unit 5177 can recognize surgical tools such as forceps, a specific body part, bleeding, mist during use of the energy treatment instrument 5135, and the like.
- when displaying the image of the surgical site on the display device 5155, the control unit 5177 causes various types of surgery support information to be superimposed on the image of the surgical site using the recognition result. Since the surgery support information is superimposed and presented to the operator 5181, the surgery can be performed more safely and reliably.
- the transmission cable 5179 connecting the camera head 5119 and the CCU 5153 is an electric signal cable corresponding to electric signal communication, an optical fiber corresponding to optical communication, or a composite cable thereof.
- communication is performed by wire using the transmission cable 5179, but communication between the camera head 5119 and the CCU 5153 may be performed wirelessly.
- if the communication between the two is performed wirelessly, there is no need to lay the transmission cable 5179 in the operating room, so that the situation in which the movement of medical staff in the operating room is hindered by the transmission cable 5179 can be eliminated.
- the operating room system 5100 to which the technology according to the present disclosure can be applied has been described.
- although the endoscopic surgery system 5113 has been described here as an example of the medical system to which the operating room system 5100 is applied, the configuration of the operating room system 5100 is not limited to such an example.
- for example, the operating room system 5100 may be applied to a flexible endoscope system for examination or a microscope surgery system instead of the endoscopic surgery system 5113.
- the technology according to the present disclosure can be preferably applied to the recorder 5105 among the configurations described above.
- for example, when the recorder 5105 encodes an image captured by any one of the cameras (for example, the ceiling camera 5187, the operative field camera 5189, or the camera head 5119), the code amount assigned to a region may be controlled, according to the technology of the present disclosure, based on a transfer function relating to conversion between light and an image signal. This makes it possible to prevent the code amount allocated for expressing the gradation of the original signal from being insufficient due to the applied transfer function, and to suppress codec distortion.
- also, when the recorder 5105 encodes an image captured by any camera, the recorder 5105 may control, according to the technology of the present disclosure, the prediction residual code amount or the mode code amount for mode selection based on the transfer function relating to conversion between light and an image signal. This makes it possible to prevent an unnatural bias in the prediction mode and to reduce image distortion. As a result, in either example, the accuracy of diagnosis or treatment using images can be improved.
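A minimal sketch of the mode-cost control described above. The function name, the set of transfer-function labels, and the 0.5 scaling factor are illustrative assumptions of ours, not values from this disclosure; a real encoder would tune the factor per codec.

```python
# Hypothetical labels for HDR transfer functions named in this disclosure.
HDR_TRANSFER_FUNCTIONS = {"HLG", "ST2084", "S-Log3"}

def scaled_mode_cost(distortion, mode_bits, lambda_mode, transfer_function,
                     hdr_scale=0.5):
    """Return a mode-selection cost J = D + lambda * R_mode.

    When an HDR transfer function is applied to the image, the mode code
    amount is evaluated smaller (scaled down), so that richer prediction
    modes are not unnaturally penalized despite the steeper signal
    compression of the HDR transfer function.
    """
    scale = hdr_scale if transfer_function in HDR_TRANSFER_FUNCTIONS else 1.0
    return distortion + lambda_mode * (mode_bits * scale)
```

For the same candidate mode, the cost evaluated under an HDR transfer function is lower than under an SDR one, biasing selection toward modes that preserve gradation.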
- the technology according to the present disclosure provides, through the mechanisms described in detail so far, an improvement to existing devices that are not necessarily sufficiently adapted to diversified signal representations, for example, a digital video camera, a digital camcorder, a video encoder, or any kind of device having an encoding function. According to the technology of the present disclosure, the codec distortion that grows as the dynamic range is expanded when reproducing HDR video is reduced, and HDR video can be reproduced with good image quality. The technology according to the present disclosure may also be applied to the encoding of still images.
- the terms luminance/luma and chrominance/chroma used in this specification may be replaced by other terms, such as brightness and saturation, depending on the color system used.
- (1) An image processing apparatus comprising: an encoding unit that encodes an image obtained based on a transfer function relating to conversion between light and an image signal; and a control unit that controls, based on the transfer function, a prediction residual code amount or a mode code amount for mode selection when the image is encoded in the encoding unit.
- (2) The image processing apparatus according to (1), wherein, when a first transfer function corresponding to a first dynamic range is applied to the image, the control unit controls the mode code amount so that the mode code amount becomes smaller than when a second transfer function corresponding to a second dynamic range narrower than the first dynamic range is applied to the image.
- when the first transfer function is applied to the image, the control unit likewise controls the prediction residual code amount relative to the case where the second transfer function corresponding to the second dynamic range narrower than the first dynamic range is applied to the image.
- (5) The image processing apparatus according to any one of (1) to (4), wherein the encoding process executed by the encoding unit includes intra prediction, and the prediction residual code amount or the mode code amount controlled by the control unit is used in mode selection from a plurality of candidate modes in the intra prediction.
- (6) The image processing apparatus according to any one of (1) to (5), wherein the encoding process executed by the encoding unit includes inter prediction, and the prediction residual code amount or the mode code amount controlled by the control unit is used in mode selection from a plurality of candidate modes in the inter prediction.
- (7) The image processing apparatus according to any one of (1) to (6), wherein the encoding process executed by the encoding unit includes intra prediction and inter prediction, and the prediction residual code amount or the mode code amount controlled by the control unit is used when selecting whether the prediction method is the intra prediction or the inter prediction.
- (8) The image processing apparatus according to any one of (1) to (7), wherein the control unit determines a type of the transfer function based on input information related to the transfer function applied to the image, and controls the prediction residual code amount or the mode code amount based on the determined type.
- (9) The image processing apparatus according to any one of (1) to (8), further comprising a mode selection unit that selects, from a plurality of candidate modes, the mode for which a cost including the prediction residual code amount and the mode code amount is minimum, wherein the encoding unit encodes the image according to the mode selected by the mode selection unit.
- (10) The image processing apparatus according to (8), wherein the input information is information acquired through a user interface.
- the image processing apparatus according to (8), wherein the input information is information acquired from an auxiliary signal multiplexed with an input image signal representing the image. (12)
- The image processing apparatus according to (8), further including a storage unit that stores a value of the mode code amount associated with a type of transfer function, or a parameter for controlling the prediction residual code amount or the mode code amount.
- An image processing method including: encoding an image obtained based on a transfer function relating to conversion between light and an image signal; and controlling, based on the transfer function, a prediction residual code amount or a mode code amount for mode selection when encoding the image. (14) A program for causing a processor of an image processing device to function as: an encoding unit that encodes an image obtained based on a transfer function relating to conversion between light and an image signal; and a control unit that controls, based on the transfer function, a prediction residual code amount or a mode code amount for mode selection when the image is encoded in the encoding unit.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
Note that the effects described above are not necessarily limiting; together with or instead of the above effects, any of the effects described in this specification, or other effects that may be grasped from this specification, may be achieved.
1. Description of related art
1-1. SDR and HDR
1-2. Codec distortion
1-3. Transfer functions
1-4. Color gamut
2. First embodiment
2-1. Introduction
2-2. System overview
2-3. Schematic configuration of the image processing apparatus
2-4. Detailed configuration of the encoding unit and the control unit
2-5. Process flow
2-6. Modification
2-7. Summary of the first embodiment
3. Second embodiment
3-1. Introduction
3-2. System overview
3-3. Schematic configuration of the image processing apparatus
3-4. Detailed configuration of the encoding unit and the control unit
3-5. Process flow
3-6. Summary of the second embodiment
4. Example hardware configuration
5. Application examples
6. Conclusion
[1-1. SDR and HDR]
In recent years, video signal representations have been extended to reproduce the real world more faithfully and to render video with richer brightness and color. HDR is a concept that aims to express images or video with a wider luminance dynamic range than SDR, the conventional standard dynamic range.
Regardless of whether the video is SDR or HDR, encoding an image signal with a video coding scheme that involves lossy compression degrades the quality of the image reproduced from the decoded image signal. This specification refers to such degradation as codec distortion. The degree of codec distortion can be evaluated with the PSNR (Peak Signal-to-Noise Ratio) metric. In general, at equal coding efficiency, the quality of an image encoded and decoded with H.264/AVC is higher than that of an image encoded and decoded with MPEG-2, and the quality with H.265/HEVC is higher than that with H.264/AVC. Usually, however, codec distortion is evaluated by comparing the original image input to the encoder with the decoded image output from the decoder. Little is known about how the signal conversion performed when capturing or displaying HDR video, or the reduction or extension of the dynamic range, affects codec distortion.
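The PSNR metric mentioned above can be sketched in a few lines. This is a generic illustration of the standard definition, not code from the patent; the pixel values and peak level below are hypothetical:

```python
import math

def psnr(original, decoded, peak=255.0):
    """Peak Signal-to-Noise Ratio between an original and a decoded image.

    Both arguments are flat sequences of pixel values; `peak` is the largest
    representable pixel value (255 for 8-bit signals, 1023 for 10-bit).
    """
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return float("inf")  # identical images: no codec distortion at all
    return 10.0 * math.log10(peak * peak / mse)

# Larger reconstruction errors lower the score (hypothetical 8-bit samples).
orig = [100, 120, 140, 160]
good = [101, 120, 139, 160]   # mild codec distortion
bad = [110, 130, 120, 150]    # severe codec distortion
```

Higher PSNR means less codec distortion, so `psnr(orig, good)` exceeds `psnr(orig, bad)` here.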
- Regions of high luminance (for example, clouds in the sky)
- Regions of vivid color (for example, a lamp glowing red or blue)
The reason distortion becomes conspicuous in these partial regions relates to the signal transfer function of the HDR signal format.
In general, the characteristics of the signal conversion from light to an image signal in an imaging apparatus are modeled by an OETF (Opto-Electronic Transfer Function). Fig. 3 shows examples of the OETF of a typical SDR signal format and the OETF of an HDR signal format. In Fig. 3, the horizontal axis represents the luminance dynamic range of the light before conversion, where 100% corresponds to a luminance of 100 nit. The vertical axis represents the code value of the image signal after conversion; in the 10-bit case, the code value can take values from 0 to 1023. Comparing the OETF of an SDR signal format (for example, BT.709), drawn as a dashed line in the figure, with an HDR OETF (for example, HLG, ST 2084, or S-Log3), drawn as a solid line, the difference in the slope of the transfer function is pronounced especially where the code value is relatively large. This means that, in those parts, image information is compressed at a higher ratio in the HDR case than in the SDR case; that is, the same change in code value represents a larger change in gradation in the HDR case than in the SDR case. When the transfer functions of the red (R), green (G), and blue (B) components of the RGB color system were analyzed individually, a difference in signal transfer characteristics between HDR and SDR similar to the graph shown in Fig. 3 was also confirmed.
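The slope difference described here can be checked numerically. The sketch below uses the published reference OETFs for BT.709 (SDR) and HLG (HDR, ARIB STD-B67); the constants are taken from those standards, not from this document, and scene light is normalized to [0, 1]:

```python
import math

def oetf_bt709(e):
    """Reference BT.709 OETF: linear near black, power law elsewhere."""
    return 4.5 * e if e < 0.018 else 1.099 * e ** 0.45 - 0.099

def oetf_hlg(e):
    """Reference HLG OETF (ARIB STD-B67): square root, then logarithmic."""
    a, b, c = 0.17883277, 0.28466892, 0.55991073
    if e <= 1.0 / 12.0:
        return math.sqrt(3.0 * e)
    return a * math.log(12.0 * e - b) + c

def slope(oetf, e, h=1e-4):
    """Numerical derivative of a transfer function at light level e."""
    return (oetf(e + h) - oetf(e - h)) / (2.0 * h)
```

Near the top of the range the HLG curve is markedly flatter than the BT.709 curve, so one code-value step spans a larger gradation step than under SDR, which is why distortion concentrates in those subranges.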
Along with HDR, color gamut is another important concept for reproducing the real world more faithfully and enabling richer video expression. BT.2020, standardized by ITU-R, defines a color gamut that can express more vivid colors than the BT.709 gamut used in many applications to date. Fig. 5 is an explanatory diagram for describing the color gamuts defined by BT.709 and BT.2020. Referring to Fig. 5, a gamut graph is shown in which the three-dimensional color space is mapped onto a two-dimensional plane under predetermined constraints. The cross mark in the graph indicates the position to which white is mapped. The dashed line indicates the range of colors that can be expressed according to BT.709. The solid line indicates the range of colors that can be expressed according to BT.2020. The dotted line indicates the range of colors that human vision can distinguish. As can be understood from Fig. 5, BT.2020 can express a more diverse set of colors than BT.709. It is said that while BT.709 can express about 75% of the colors that exist in the real world, BT.2020 can express 99% or more. BT.2020 may be used as the color gamut of SDR video or as the color gamut of HDR video.
[2-1. Introduction]
Some of the codec distortions described above, which become conspicuous in partial regions of an image when an HDR signal format is used, are caused by a shortage of the code amount allocated to express the gradation of the original signal, particularly in the subrange of each color component's dynamic range corresponding to relatively large code values. Encoders compliant with video coding schemes such as MPEG-2, H.264/AVC, or H.265/HEVC quantize the image signal in the frequency domain to achieve the required compression ratio. Typically, the transform coefficients obtained by orthogonally transforming the prediction residual, after applying a prediction technique such as intra prediction or inter prediction, are quantized. However, the quantization step used by those encoders, optimized for encoding SDR video, is often too large when an HDR signal format is used. This is because existing encoders do not account for the fact that, in the subrange corresponding to large code values, the gradation information has already been compressed more strongly (than in the SDR case) during signal conversion.
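The effect of an oversized quantization step on fine HDR gradation can be illustrated with a uniform scalar quantizer, a simplification of the transform-coefficient quantization described above; the coefficient values and step sizes are hypothetical:

```python
def quantize(coeffs, qstep):
    """Uniform scalar quantization of transform coefficients (simplified)."""
    return [round(c / qstep) for c in coeffs]

def dequantize(levels, qstep):
    """Reconstruction performed at the decoder side."""
    return [level * qstep for level in levels]

# Small residual coefficients carrying fine gradation in a bright region:
detail = [3.0, -2.0, 1.5, -1.0]

# A small step preserves some detail; a step tuned for SDR wipes it out.
preserved = dequantize(quantize(detail, 1.0), 1.0)
wiped = dequantize(quantize(detail, 8.0), 8.0)
```

Every coefficient quantized with the large step collapses to zero, which appears in the reconstructed image as the banding and blocking described in the preceding paragraphs.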
Fig. 6A is an explanatory diagram showing a first example of the configuration of the image processing system according to this embodiment. The image processing system 10a shown in Fig. 6A includes an imaging apparatus 11, a signal processing apparatus 14, and a server apparatus 15.
Fig. 7A is a block diagram showing a first example of the schematic configuration of the image processing apparatus according to this embodiment. The image processing apparatus 100a shown in Fig. 7A may be, for example, the server apparatus 15 in the example of Fig. 6A, or the imaging apparatus 12 or terminal apparatus 16 in the example of Fig. 6B (or an image processing module mounted in any of those apparatuses). The image processing apparatus 100a includes a signal acquisition unit 101, an information acquisition unit 103, an encoding unit 110, and a control unit 140.
This section describes in detail more specific configurations of the encoding unit 110 and the control unit 140 shown in Figs. 7A and 7B. Fig. 8 is a block diagram showing an example of the detailed configuration of the encoding unit and the control unit according to the first embodiment.
Referring to Fig. 8, the encoding unit 110 includes a reordering buffer 111, a block setting unit 112, a subtraction unit 113, an orthogonal transform unit 114, a quantization unit 115, a lossless encoding unit 116, an inverse quantization unit 121, an inverse orthogonal transform unit 122, an addition unit 123, a loop filter 124, a frame memory 126, a switch 127, a mode selection unit 128, an intra prediction unit 130, and an inter prediction unit 135.
Referring to Fig. 8, the control unit 140 includes a statistics calculation unit 141 and a code amount control unit 143.
In the first example, the code amount control unit 143 scales the quantization step used for each partial region so that more code amount is allocated to partial regions with a stronger luminance component (that is, high-luminance parts). The strength of the luminance component of each partial region is determined from the per-region statistics calculated by the statistics calculation unit 141. Here, the code amount control unit 143 scales the quantization step by dividing it by a protection ratio that depends on the strength of the luminance component of each partial region. The protection ratio is a parameter expressing to what extent the image quality of the partial region is to be protected. The larger the value of the protection ratio, the smaller the quantization step becomes, and the more strongly the image quality of the partial region to which that quantization step is applied is protected. The actual division by the protection ratio may be performed in the quantization unit 115, which is provided with the protection ratio.
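One way to read the protection-ratio mechanism is as a per-block divisor on the quantization step. The sketch below is a hedged rendering of that rule; the threshold, luma scale, and ratio are invented for illustration and are not values from the patent:

```python
def scaled_qstep(base_qstep, luma_mean, hdr=True,
                 luma_threshold=200.0, protection_ratio=2.0):
    """Scale one partial region's quantization step by a protection ratio.

    Hypothetical rule: when an HDR transfer function is in use and the
    region's mean luma (on an arbitrary 10-bit-like scale here) exceeds
    `luma_threshold`, the step is divided by `protection_ratio`, so the
    high-luminance part receives more code amount.  All numeric values
    are illustrative only.
    """
    if hdr and luma_mean > luma_threshold:
        return base_qstep / protection_ratio
    return base_qstep
```

A dark block keeps the base step, a bright block under HDR gets a halved step, and with the control switched off (for example, for SDR input) the step is left untouched.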
In the second example, the code amount control unit 143 scales the quantization step used for each partial region so that more code amount is allocated to partial regions with stronger chrominance components (that is, high-chrominance parts).
As described above, points P1, P2, and P3 in Fig. 10 ("yellow", "cyan", and "green") belong to regions of vivid color (where one or more of the R, G, and B components is strong) and also to high-luminance regions (where the Y component is strong). If a partial region with such a color were protected as a high-luminance part and additionally as a high-chrominance part, the allocated code amount could become inappropriately large. Therefore, for a partial region whose quantization step has been scaled according to the strength of one of the luminance and chrominance components, the code amount control unit 143 may refrain from scaling that quantization step again according to the strength of the other component.
(1) Encoding control process
Fig. 12 is a flowchart showing an example of the flow of the encoding control process according to this embodiment. The encoding control process described here may be repeated for each image constituting the video. Processing steps for acquiring or setting parameters that do not change across multiple images may be skipped in the second and subsequent iterations. For simplicity of explanation, processing steps not directly related to the control of the code amount are omitted here.
Fig. 13A is a flowchart showing a first example of the flow of the quantization control process that may be executed in step S140 of Fig. 12. The first example shows a flow for protecting the gradation of high-luminance parts in the image.
Fig. 13B is a flowchart showing a second example of the flow of the quantization control process that may be executed in step S140 of Fig. 12. The second example shows a flow for protecting the gradation of high-chrominance parts in the image.
Fig. 13C is a flowchart showing a third example of the flow of the quantization control process that may be executed in step S140 of Fig. 12. The third example shows a flow for protecting the gradation of both high-luminance and high-chrominance parts in the image.
So far, an example has been described in which an image processing apparatus having the function of encoding an image switches on and off, based on the type of transfer function applied to the image, the process of controlling the code amount allocated to each partial region. However, the idea of this embodiment is also applicable to cases where the per-region allocation of the code amount is controlled without determining the type of transfer function. This section describes one such modification.
Fig. 14 is a block diagram showing a modification of the configuration of the image processing apparatus according to the first embodiment. The image processing apparatus 100c shown in Fig. 14 may be, for example, the server apparatus 15 in the example of Fig. 6A, or the imaging apparatus 12 or terminal apparatus 16 in the example of Fig. 6B (or an image processing module mounted in any of those apparatuses). The image processing apparatus 100c includes a signal acquisition unit 101, an encoding unit 110, and a control unit 140c.
Fig. 15 is a flowchart showing an example of the flow of the encoding control process according to the modification described with reference to Fig. 14. The encoding control process described here may be repeated for each image constituting the video. Processing steps for acquiring or setting parameters that do not change across multiple images may be skipped in the second and subsequent iterations. For simplicity of explanation, processing steps not directly related to the control of the code amount are omitted here.
The first embodiment of the technology according to the present disclosure has been described above with reference to Figs. 6A to 15. In the embodiment described above, when encoding an image obtained based on a transfer function relating to conversion between light and an image signal, the code amount allocated to each partial region of the image is controlled based on the transfer function. With this configuration, the code amount allocated to each partial region can be varied depending on which transfer function is applied to the image. This makes it possible to prevent codec distortion from becoming conspicuous in partial regions of the image due to a shortage of the code amount for expressing the gradation of the original signal that results from the choice of transfer function.
[3-1. Introduction]
In many video coding schemes, when encoding an image the encoder selects, from a plurality of selectable modes, the mode that is best in terms of coding efficiency, then encodes mode information indicating the selected mode and transmits it to the decoder. Such mode selection may include, for example, selection of a prediction mode in intra prediction (for example, the prediction direction and prediction block size), selection of a prediction mode in inter prediction (for example, the motion vector, prediction block size, and reference picture), and selection of the prediction method, that is, between the intra prediction mode and the inter prediction mode. Mode selection is usually performed by evaluating, across a plurality of candidate modes, a cost that can correspond to the sum of the code amount generated from the prediction residual remaining after subtracting the predicted image data from the original image data and the code amount generated from the mode information as overhead. However, a cost evaluation formula designed or tuned for SDR video is not necessarily optimal for HDR video. This is because, in the image signal of HDR video, the image information is compressed more strongly than in SDR video, so when the same evaluation formula is used, the differences between modes in the code amount generated from the prediction residual tend to be underestimated.
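The cost evaluation described above, summing the bits spent on the prediction residual and on the mode information over each candidate, can be sketched as follows (the candidate names and bit counts are hypothetical):

```python
def select_mode(candidates):
    """Return the candidate minimizing cost = residual bits + mode bits,
    a simplified form of the sum described above."""
    return min(candidates, key=lambda m: m["residual_bits"] + m["mode_bits"])

# Hypothetical candidate modes for one block:
modes = [
    {"name": "intra_dc", "residual_bits": 120, "mode_bits": 2},
    {"name": "intra_angular", "residual_bits": 90, "mode_bits": 8},
    {"name": "inter_skip", "residual_bits": 150, "mode_bits": 1},
]
best = select_mode(modes)
```

Here `intra_angular` wins: its larger mode-information overhead is outweighed by the residual bits it saves relative to the cheaper-to-signal candidates.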
The image processing system according to this embodiment may be configured similarly to the image processing system 10a or 10b in the first embodiment. For example, an imaging apparatus, server apparatus, or terminal apparatus in the system, or an image processing module mounted in any of those apparatuses, has the function of an image processing apparatus (that is, an encoder) that encodes an image obtained based on a transfer function relating to conversion between light and an image signal. In this embodiment, when the encoder encodes an image, the prediction residual code amount or the mode code amount for mode selection is controlled based on the transfer function. This avoids the selection of inappropriate modes when an HDR signal format is used and mitigates degradation of image quality. The following sections describe specific, exemplary configurations of such an image processing apparatus in detail.
Fig. 17A is a block diagram showing a first example of the schematic configuration of the image processing apparatus according to this embodiment. The image processing apparatus 200a shown in Fig. 17A includes a signal acquisition unit 201, an information acquisition unit 203, an encoding unit 210, and a control unit 240.
This section describes in detail more specific configurations of the encoding unit 210 and the control unit 240 shown in Figs. 17A and 17B. Fig. 18 is a block diagram showing an example of the detailed configuration of the encoding unit and the control unit according to the second embodiment.
Referring to Fig. 18, the encoding unit 210 includes a reordering buffer 211, a block setting unit 212, a subtraction unit 213, an orthogonal transform unit 214, a quantization unit 215, a lossless encoding unit 216, an inverse quantization unit 221, an inverse orthogonal transform unit 222, an addition unit 223, a loop filter 224, a frame memory 226, a switch 227, a mode selection unit 228, an intra prediction unit 230, and an inter prediction unit 235.
Referring to Fig. 18, the control unit 240 includes a cost control unit 241 and a setting value storage unit 243. The cost control unit 241 determines the type of transfer function applied to the image to be encoded, based on input information supplied from the information acquisition unit 203 or 204. The cost control unit 241 then controls, based on the determined type of transfer function, the evaluation of the cost for mode selection in one or more sections of the encoding unit 210. More specifically, the cost control unit 241 can adjust the balance between the contribution of the prediction residual and the contribution of the mode information to the cost evaluation, for example by scaling one of the prediction residual code amount and the mode code amount included in the cost evaluation formula. If, during cost evaluation, the variation of the mode code amount across the candidate modes is large relative to the variation of the prediction residual code amount, the contribution of the mode code amount to mode selection becomes excessive compared with the contribution of the prediction residual code amount, and as a result the optimal mode is determined while the variation of the prediction residual code amount is underestimated. Conversely, if the variation of the mode code amount across the candidate modes is small relative to the variation of the prediction residual code amount, its contribution to mode selection becomes too small, and as a result the optimal mode is determined while the variation of the mode code amount is underestimated. It is therefore beneficial to adjust the contributions of these code amounts well, optimize the balance between them, and perform a proper cost evaluation.
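The rebalancing performed by the cost control unit can be illustrated by scaling one of the two terms of the cost. The scale factor below is hypothetical; the point is only that an SDR-tuned balance and an HDR-rebalanced one can rank the same two candidates differently:

```python
def mode_cost(residual_bits, mode_bits, mode_bits_scale=1.0):
    """Mode-decision cost: residual code amount plus (scaled) mode code amount.

    A `mode_bits_scale` below 1 weakens the mode-information term, one
    hedged reading of how the balance could be restored when an HDR
    transfer function compresses the residual differences between modes.
    """
    return residual_bits + mode_bits_scale * mode_bits

# Two hypothetical candidates: a cheap-to-signal coarse mode and a finer
# mode with a smaller residual but more mode-information overhead.
coarse = (40, 2)   # (residual_bits, mode_bits)
fine = (36, 7)

sdr_pick = "coarse" if mode_cost(*coarse) < mode_cost(*fine) else "fine"
hdr_pick = "coarse" if mode_cost(*coarse, 0.5) < mode_cost(*fine, 0.5) else "fine"
```

With the default balance the coarse mode wins; after scaling the mode-information term down, the finer prediction wins, which matches the stated goal of avoiding an unnatural bias toward cheap-to-signal modes.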
Fig. 20 is a flowchart showing an example of the flow of the encoding control process according to this embodiment. The encoding control process described here may be repeated for each image constituting the video. Processing steps for acquiring or setting parameters that do not change across multiple images may be skipped in the second and subsequent iterations. For simplicity of explanation, processing steps not directly related to the control of mode selection are omitted here.
The second embodiment of the technology according to the present disclosure has been described above with reference to Figs. 16A to 20. In the embodiment described above, when encoding an image obtained based on a transfer function relating to conversion between light and an image signal, the prediction residual code amount or the mode code amount for mode selection during encoding is controlled based on the transfer function. With this configuration, the cost evaluation for mode selection can be performed not uniformly but with different evaluation formulas depending on which transfer function is applied to the image. This makes it possible to prevent unnatural biases toward certain prediction modes, improve prediction accuracy, and reduce image distortion.
The embodiments described in the preceding sections may be realized using software, hardware, or a combination of software and hardware. When the image processing apparatus 100a, 100b, 100c, 200a, or 200b uses software, the program constituting the software is stored in advance in, for example, a storage medium (a non-transitory medium) provided inside or outside the apparatus. Each program is then loaded into a RAM (Random Access Memory) at execution time, for example, and executed by a processor such as a CPU (Central Processing Unit).
The technology according to the present disclosure can be applied to a variety of products. For example, the technology according to the present disclosure may be applied to an operating room system as described in this section.
The support arm device 5141 includes an arm portion 5145 extending from a base portion 5143. In the illustrated example, the arm portion 5145 is composed of joint portions 5147a, 5147b, and 5147c and links 5149a and 5149b, and is driven under the control of an arm control device 5159. The endoscope 5115 is supported by the arm portion 5145, and its position and posture are controlled. This can achieve stable fixation of the position of the endoscope 5115.
The endoscope 5115 is composed of a lens barrel 5117, a region of predetermined length from the distal end of which is inserted into the body cavity of a patient 5185, and a camera head 5119 connected to the proximal end of the lens barrel 5117. In the illustrated example, the endoscope 5115 is configured as a so-called rigid scope having a rigid lens barrel 5117, but the endoscope 5115 may instead be configured as a so-called flexible scope having a flexible lens barrel 5117.
The CCU 5153 is composed of a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and the like, and comprehensively controls the operation of the endoscope 5115 and the display device 5155. Specifically, the CCU 5153 applies to the image signal received from the camera head 5119 various kinds of image processing for displaying an image based on that image signal, such as development processing (demosaic processing). The CCU 5153 provides the image signal subjected to the image processing to the display device 5155. The audiovisual controller 5107 shown in Fig. 22 is also connected to the CCU 5153, and the CCU 5153 provides the processed image signal to the audiovisual controller 5107 as well. In addition, the CCU 5153 transmits control signals to the camera head 5119 to control its driving. Such a control signal may include information about imaging conditions, such as the magnification and focal length. The information about imaging conditions may be input via an input device 5161 or via the centralized operation panel 5111 described above.
The support arm device 5141 includes a base portion 5143 serving as a base and an arm portion 5145 extending from the base portion 5143. In the illustrated example, the arm portion 5145 is composed of a plurality of joint portions 5147a, 5147b, and 5147c and a plurality of links 5149a and 5149b connected by the joint portion 5147b, but in Fig. 24 the configuration of the arm portion 5145 is shown in simplified form for simplicity. In practice, the shapes, number, and arrangement of the joint portions 5147a to 5147c and the links 5149a and 5149b, the directions of the rotation axes of the joint portions 5147a to 5147c, and the like may be set as appropriate so that the arm portion 5145 has the desired degrees of freedom. For example, the arm portion 5145 may suitably be configured to have six or more degrees of freedom. This allows the endoscope 5115 to be moved freely within the movable range of the arm portion 5145, so that the lens barrel 5117 of the endoscope 5115 can be inserted into the body cavity of the patient 5185 from the desired direction.
The light source device 5157 supplies the endoscope 5115 with irradiation light for imaging the surgical site. The light source device 5157 is composed of a white light source composed of, for example, an LED, a laser light source, or a combination thereof. When the white light source is composed of a combination of RGB laser light sources, the output intensity and output timing of each color (each wavelength) can be controlled with high precision, so the white balance of the captured image can be adjusted in the light source device 5157. In this case, it is also possible to capture images corresponding to R, G, and B in a time-division manner by irradiating the observation target with laser light from each of the RGB laser light sources in a time-division manner and controlling the driving of the image sensor of the camera head 5119 in synchronization with the irradiation timing. With this method, a color image can be obtained without providing the image sensor with a color filter.
The functions of the camera head 5119 of the endoscope 5115 and of the CCU 5153 will be described in more detail with reference to Fig. 25. Fig. 25 is a block diagram showing an example of the functional configuration of the camera head 5119 and the CCU 5153 shown in Fig. 24.
Following the mechanisms described in detail above, the technology according to the present disclosure provides an improvement over, for example, digital video cameras, digital camcorders, video encoders, or any kind of existing apparatus with an encoding function that is not yet sufficiently adapted to the diversifying signal representations. According to the technology of the present disclosure, the codec distortion that is magnified along with the extension of the dynamic range when reproducing HDR video is reduced, and HDR video can be reproduced with good image quality. The technology according to the present disclosure may also be applied to the encoding of still images.
(1)
An image processing apparatus including:
an encoding unit that encodes an image obtained based on a transfer function relating to conversion between light and an image signal; and
a control unit that controls, based on the transfer function, a prediction residual code amount or a mode code amount for mode selection when the image is encoded in the encoding unit.
(2)
The image processing apparatus according to (1), wherein the control unit controls the mode code amount such that, when a first transfer function corresponding to a first dynamic range is applied to the image, the mode code amount is smaller than when a second transfer function corresponding to a second dynamic range narrower than the first dynamic range is applied to the image.
(3)
The image processing apparatus according to (1), wherein the control unit controls the prediction residual code amount such that, when a first transfer function corresponding to a first dynamic range is applied to the image, the prediction residual code amount is larger than when a second transfer function corresponding to a second dynamic range narrower than the first dynamic range is applied to the image.
(4)
The image processing apparatus according to (2) or (3), wherein the first dynamic range is a dynamic range for enabling display at a luminance higher than 100 nit.
(5)
The image processing apparatus according to any one of (1) to (4), wherein the encoding process executed by the encoding unit includes intra prediction, and the prediction residual code amount or the mode code amount controlled by the control unit is used in mode selection from a plurality of candidate modes in the intra prediction.
(6)
The image processing apparatus according to any one of (1) to (5), wherein the encoding process executed by the encoding unit includes inter prediction, and the prediction residual code amount or the mode code amount controlled by the control unit is used in mode selection from a plurality of candidate modes in the inter prediction.
(7)
The image processing apparatus according to any one of (1) to (6), wherein the encoding process executed by the encoding unit includes intra prediction and inter prediction, and the prediction residual code amount or the mode code amount controlled by the control unit is used when selecting the prediction method, that is, the intra prediction or the inter prediction.
(8)
The image processing apparatus according to any one of (1) to (7), wherein the control unit determines the type of the transfer function based on input information relating to the transfer function applied to the image, and controls the prediction residual code amount or the mode code amount based on the determined type.
(9)
The image processing apparatus according to any one of (1) to (8), further including a mode selection unit that selects, from a plurality of candidate modes, the mode for which a cost including the prediction residual code amount and the mode code amount is minimal, wherein the encoding unit encodes the image according to the mode selected by the mode selection unit.
(10)
The image processing apparatus according to (8), wherein the input information is information acquired through a user interface.
(11)
The image processing apparatus according to (8), wherein the input information is information acquired from an auxiliary signal multiplexed with an input image signal representing the image.
(12)
The image processing apparatus according to (8), further including a storage unit that stores a value of the mode code amount associated with a type of transfer function, or a parameter for controlling the prediction residual code amount or the mode code amount.
(13)
An image processing method including:
encoding an image obtained based on a transfer function relating to conversion between light and an image signal; and
controlling, based on the transfer function, a prediction residual code amount or a mode code amount for mode selection when encoding the image.
(14)
A program for causing a processor of an image processing apparatus to function as:
an encoding unit that encodes an image obtained based on a transfer function relating to conversion between light and an image signal; and
a control unit that controls, based on the transfer function, a prediction residual code amount or a mode code amount for mode selection when the image is encoded in the encoding unit.
101 Signal acquisition unit
102 Signal processing unit
103, 104 Information acquisition unit
107 Storage unit
110 Encoding unit
115 Quantization unit
140, 140c Control unit
200a, 200b Image processing apparatus
201 Signal acquisition unit
202 Signal processing unit
203, 204 Information acquisition unit
207 Storage unit
210 Encoding unit
228 Mode selection unit
230 Intra prediction unit
235 Inter prediction unit
240 Control unit
Claims (14)
- An image processing apparatus including: an encoding unit that encodes an image obtained based on a transfer function relating to conversion between light and an image signal; and a control unit that controls, based on the transfer function, a prediction residual code amount or a mode code amount for mode selection when the image is encoded in the encoding unit.
- The image processing apparatus according to claim 1, wherein the control unit controls the mode code amount such that, when a first transfer function corresponding to a first dynamic range is applied to the image, the mode code amount is smaller than when a second transfer function corresponding to a second dynamic range narrower than the first dynamic range is applied to the image.
- The image processing apparatus according to claim 1, wherein the control unit controls the prediction residual code amount such that, when a first transfer function corresponding to a first dynamic range is applied to the image, the prediction residual code amount is larger than when a second transfer function corresponding to a second dynamic range narrower than the first dynamic range is applied to the image.
- The image processing apparatus according to claim 2, wherein the first dynamic range is a dynamic range for enabling display at a luminance higher than 100 nit.
- The image processing apparatus according to claim 1, wherein the encoding process executed by the encoding unit includes intra prediction, and the prediction residual code amount or the mode code amount controlled by the control unit is used in mode selection from a plurality of candidate modes in the intra prediction.
- The image processing apparatus according to claim 1, wherein the encoding process executed by the encoding unit includes inter prediction, and the prediction residual code amount or the mode code amount controlled by the control unit is used in mode selection from a plurality of candidate modes in the inter prediction.
- The image processing apparatus according to claim 1, wherein the encoding process executed by the encoding unit includes intra prediction and inter prediction, and the prediction residual code amount or the mode code amount controlled by the control unit is used when selecting the prediction method, that is, the intra prediction or the inter prediction.
- The image processing apparatus according to claim 1, wherein the control unit determines the type of the transfer function based on input information relating to the transfer function applied to the image, and controls the prediction residual code amount or the mode code amount based on the determined type.
- The image processing apparatus according to claim 1, further including a mode selection unit that selects, from a plurality of candidate modes, the mode for which a cost including the prediction residual code amount and the mode code amount is minimal, wherein the encoding unit encodes the image according to the mode selected by the mode selection unit.
- The image processing apparatus according to claim 8, wherein the input information is information acquired through a user interface.
- The image processing apparatus according to claim 8, wherein the input information is information acquired from an auxiliary signal multiplexed with an input image signal representing the image.
- The image processing apparatus according to claim 8, further including a storage unit that stores a value of the mode code amount associated with a type of transfer function, or a parameter for controlling the prediction residual code amount or the mode code amount.
- An image processing method including: encoding an image obtained based on a transfer function relating to conversion between light and an image signal; and controlling, based on the transfer function, a prediction residual code amount or a mode code amount for mode selection when encoding the image.
- A program for causing a processor of an image processing apparatus to function as: an encoding unit that encodes an image obtained based on a transfer function relating to conversion between light and an image signal; and a control unit that controls, based on the transfer function, a prediction residual code amount or a mode code amount for mode selection when the image is encoded in the encoding unit.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
BR112019011920-5A BR112019011920A2 (pt) | 2016-12-19 | 2017-10-17 | image processing device and method, and program |
EP17885292.7A EP3557869B1 (en) | 2016-12-19 | 2017-10-17 | Image processing device, image processing method, and program |
CN201780076633.2A CN110050464B (zh) | 2016-12-19 | 2017-10-17 | 图像处理设备、图像处理方法和存储介质 |
US16/346,783 US11190744B2 (en) | 2016-12-19 | 2017-10-17 | Image processing device, image processing method, and program for determining a cost function for mode selection |
KR1020197016934A KR20190097011A (ko) | 2016-12-19 | 2017-10-17 | Image processing device, image processing method, and program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016-246000 | 2016-12-19 | ||
JP2016246000A JP6822123B2 (ja) | 2016-12-19 | 2016-12-19 | Image processing device, image processing method, and program |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018116603A1 true WO2018116603A1 (ja) | 2018-06-28 |
Family
ID=62627298
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2017/037572 WO2018116603A1 (ja) | 2016-12-19 | 2017-10-17 | Image processing device, image processing method, and program |
Country Status (7)
Country | Link |
---|---|
US (1) | US11190744B2 (ja) |
EP (1) | EP3557869B1 (ja) |
JP (1) | JP6822123B2 (ja) |
KR (1) | KR20190097011A (ja) |
CN (1) | CN110050464B (ja) |
BR (1) | BR112019011920A2 (ja) |
WO (1) | WO2018116603A1 (ja) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10453221B2 (en) * | 2017-04-10 | 2019-10-22 | Intel Corporation | Region based processing |
JP7147145B2 (ja) * | 2017-09-26 | 2022-10-05 | 富士通株式会社 | Video encoding device, video encoding method, and video encoding program |
WO2020171046A1 (en) * | 2019-02-20 | 2020-08-27 | Panasonic Intellectual Property Corporation Of America | Image encoder and image decoder |
US11775589B2 (en) * | 2019-08-26 | 2023-10-03 | Google Llc | Systems and methods for weighted quantization |
CN111291989B (zh) * | 2020-02-03 | 2023-03-24 | 重庆特斯联智慧科技股份有限公司 | Deep-learning-based crowd flow allocation system for large buildings and method therefor |
CN115361510A (zh) * | 2020-05-08 | 2022-11-18 | 华为技术有限公司 | Processing method, encoding device, and decoding device for high dynamic range (HDR) video |
US11638019B2 (en) * | 2020-07-29 | 2023-04-25 | Alibaba Group Holding Limited | Methods and systems for prediction from multiple cross-components |
US11270468B1 (en) * | 2020-12-18 | 2022-03-08 | Facebook Technologies, Llc. | Joint color image and texture data compression |
US11647193B2 (en) * | 2020-12-18 | 2023-05-09 | Meta Platforms Technologies, Llc | Adaptive range packing compression |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011509536A * | 2008-01-04 | 2011-03-24 | シャープ株式会社 | Method and apparatus for determining inter-layer image prediction parameters |
JP2014518030A * | 2011-04-28 | 2014-07-24 | コーニンクレッカ フィリップス エヌ ヴェ | Apparatuses and methods for encoding and decoding HDR images |
WO2015130797A1 (en) * | 2014-02-25 | 2015-09-03 | Apple Inc. | Adaptive transfer function for video encoding and decoding |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008252176A (ja) | 2007-03-29 | 2008-10-16 | Toshiba Corp | Moving image encoding apparatus and method |
BRPI0821229A2 (pt) | 2007-12-20 | 2017-01-10 | Basf Se | personal care composition, method for the conditioning treatment of mammalian keratin-containing fibers, and cationic terpolymer |
CN103369142B (zh) | 2013-07-09 | 2015-02-04 | 广东欧珀移动通信有限公司 | Method and system for preventing erroneous operation during a phone call |
JP6202330B2 (ja) * | 2013-10-15 | 2017-09-27 | ソニー株式会社 | Decoding device and decoding method, and encoding device and encoding method |
MX364028B (es) * | 2013-12-27 | 2019-04-11 | Sony Corp | Aparato y metodo de procesamiento de imagenes. |
US10136133B2 (en) * | 2014-11-11 | 2018-11-20 | Dolby Laboratories Licensing Corporation | Rate control adaptation for high-dynamic range images |
US20160309154A1 (en) * | 2015-04-17 | 2016-10-20 | Qualcomm Incorporated | Dynamic range adjustment for high dynamic range and wide color gamut video coding |
CN107852512A (zh) * | 2015-06-07 | 2018-03-27 | 夏普株式会社 | 基于亮度转换函数或视频色彩分量值的优化视频编码的系统及方法 |
US10244245B2 (en) * | 2015-06-08 | 2019-03-26 | Qualcomm Incorporated | Content-adaptive application of fixed transfer function to high dynamic range (HDR) and/or wide color gamut (WCG) video data |
US10200690B2 (en) * | 2015-09-22 | 2019-02-05 | Qualcomm Incorporated | Video decoder conformance for high dynamic range (HDR) video coding using a core video standard |
AU2015261734A1 (en) * | 2015-11-30 | 2017-06-15 | Canon Kabushiki Kaisha | Method, apparatus and system for encoding and decoding video data according to local luminance intensity |
WO2017123133A1 (en) * | 2016-01-12 | 2017-07-20 | Telefonaktiebolaget Lm Ericsson (Publ) | Video coding using hybrid intra prediction |
EP3301925A1 (en) | 2016-09-30 | 2018-04-04 | Thomson Licensing | Method for local inter-layer prediction intra based |
-
2016
- 2016-12-19 JP JP2016246000A patent/JP6822123B2/ja active Active
-
2017
- 2017-10-17 BR BR112019011920-5A patent/BR112019011920A2/pt not_active IP Right Cessation
- 2017-10-17 EP EP17885292.7A patent/EP3557869B1/en active Active
- 2017-10-17 WO PCT/JP2017/037572 patent/WO2018116603A1/ja active Application Filing
- 2017-10-17 US US16/346,783 patent/US11190744B2/en active Active
- 2017-10-17 KR KR1020197016934A patent/KR20190097011A/ko not_active Application Discontinuation
- 2017-10-17 CN CN201780076633.2A patent/CN110050464B/zh active Active
Non-Patent Citations (4)
Title |
---|
ASSOCIATION OF RADIO INDUSTRIES AND BUSINESSES: "ESSENTIAL PARAMETER VALUES FOR THE EXTENDED IMAGE DYNAMIC RANGE TELEVISION (EIDRTV) SYSTEM FOR PROGRAMME PRODUCTION ARIB STANDARD", ARIB STD-B67, 3 July 2015 (2015-07-03), Retrieved from the Internet <URL:http://www.arib.or.jp/english/html/overview/doc/2-STD-B67v1_0.pdf>
ITU-T: "H.264: Advanced video coding for generic audiovisual services", ITU-T RECOMMENDATION H.264, November 2007 (2007-11-01) |
ITU-T: "H.265: High efficiency video coding", ITU-T RECOMMENDATION H.265, October 2014 (2014-10-01) |
See also references of EP3557869A4 |
Also Published As
Publication number | Publication date |
---|---|
EP3557869A4 (en) | 2020-01-22 |
US20190281267A1 (en) | 2019-09-12 |
EP3557869A1 (en) | 2019-10-23 |
CN110050464B (zh) | 2022-02-11 |
JP2018101867A (ja) | 2018-06-28 |
EP3557869B1 (en) | 2023-03-29 |
JP6822123B2 (ja) | 2021-01-27 |
US11190744B2 (en) | 2021-11-30 |
KR20190097011A (ko) | 2019-08-20 |
CN110050464A (zh) | 2019-07-23 |
BR112019011920A2 (pt) | 2019-10-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018116605A1 (ja) | Image processing device, image processing method, and program | |
WO2018116603A1 (ja) | Image processing device, image processing method, and program | |
WO2018116604A1 (ja) | Image processing device, image processing method, and program | |
JP6844539B2 (ja) | Video signal processing device, video signal processing method, and display device | |
CN109863755B (zh) | 信号处理设备、方法和程序 | |
US10163196B2 (en) | Image processing device and imaging system | |
CN110168605B (zh) | 用于动态范围压缩的视频信号处理装置、视频信号处理方法和计算机可读介质 | |
US20170046836A1 (en) | Real-time endoscopic image enhancement | |
JP2021531883A (ja) | 手術室における分散型画像処理システム | |
CN109964487B (zh) | 管理装置和管理方法 | |
CN107847119B (zh) | 医疗信号处理装置、医疗显示装置和医疗观察系统 | |
CN116074538A (zh) | 图像编码设备及其控制方法和计算机可读存储介质 | |
WO2010079682A1 (ja) | Image compression method, image processing device, image display device, and image display system | |
US12034935B2 (en) | Reception apparatus, reception method, and image processing system | |
WO2011158562A1 (ja) | Multi-view image encoding device | |
WO2019003954A1 (ja) | Communication system and transmission device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 17885292 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 20197016934 Country of ref document: KR Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112019011920 Country of ref document: BR |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2017885292 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 112019011920 Country of ref document: BR Kind code of ref document: A2 Effective date: 20190612 |