WO2017183479A1 - Encoding device and encoding method, and decoding device and decoding method - Google Patents
- Publication number
- WO2017183479A1 (application PCT/JP2017/014454)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- unit
- texture
- component
- image
- encoding
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/117—Filters, e.g. for pre-processing or post-processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/40—Analysis of texture
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/14—Coding unit complexity, e.g. amount of activity or edge presence estimation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/147—Data rate or code amount at the encoder output according to rate distortion criteria
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/65—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using error resilience
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20024—Filtering details
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/94—Vector quantisation
Definitions
- The present technology relates to an encoding device, an encoding method, a decoding device, and a decoding method, and in particular to an encoding device, an encoding method, a decoding device, and a decoding method that can improve, for example, transmission efficiency and image quality.
- the texture is, for example, a fine pattern in the image, and is often a high frequency (high frequency band) signal.
- FFT Fast Fourier Transform
- AVC Advanced Video Coding
- a texture component is removed from an original image in an encoding device, and the original image after the removal of the texture component is encoded.
- In the encoding device, a small amount of texture components (images) and synthesis parameters are transmitted together with the encoded data obtained by encoding.
- In the decoding device, the encoded data is decoded, the texture component from the encoding device is synthesized with the resulting decoded image using the synthesis parameters, and a restored image that restores the original image is generated.
- Patent Document 1 proposes a technique for reducing and transmitting a texture (texture pattern) in an encoding apparatus and enlarging the reduced texture by a super-resolution technique in a decoding apparatus.
- When more texture components are transmitted, the image quality of the restored image can be improved, but the transmission efficiency degrades.
- the present technology has been made in view of such circumstances, and is intended to improve transmission efficiency and image quality.
- The encoding device of the present technology includes an encoding unit that encodes an input image using an irreversible encoding method, a database in which a plurality of texture components are registered, and a transmission unit that transmits identification information identifying a match component, which is the texture component that matches the input image among the plurality of texture components registered in the database, together with the encoded data obtained by encoding the input image.
- The encoding method of the present technology includes encoding an input image by an irreversible encoding method, and transmitting identification information identifying a match component, which is the texture component that matches the input image among a plurality of texture components registered in a database, together with the encoded data obtained by encoding the input image.
- In the present technology, an input image is encoded by an irreversible encoding method, and identification information for identifying a match component, which is the texture component that matches the input image among the plurality of texture components registered in a database, is transmitted together with the encoded data obtained by encoding the input image.
- The decoding device of the present technology includes: a receiving unit that receives encoded data obtained by encoding an input image using an irreversible encoding method, and identification information that identifies a match component, which is a texture component that matches the input image; a decoding unit that decodes the encoded data into a decoded image; a database in which a plurality of texture components are registered; and a synthesis unit that combines the decoded image with the texture component, among the plurality of texture components registered in the database, serving as the match component identified by the identification information.
- The decoding method of the present technology includes: receiving encoded data obtained by encoding an input image by an irreversible encoding method, and identification information for identifying a match component, which is a texture component that matches the input image; decoding the encoded data into a decoded image; and combining the decoded image with the texture component, among a plurality of texture components registered in a database, serving as the match component identified by the identification information.
- In the decoding device and decoding method of the present technology, encoded data obtained by encoding an input image using an irreversible (lossy) encoding method and identification information for identifying a match component, which is a texture component that matches the input image, are received. The encoded data is then decoded into a decoded image, and the decoded image is combined with the texture component, among the plurality of texture components registered in a database, serving as the match component identified by the identification information.
- the encoding device and the decoding device can be realized by causing a computer to execute a program.
- a program to be executed by a computer can be provided by being transmitted via a transmission medium or by being recorded on a recording medium.
- the encoding device and the decoding device may be independent devices, or may be internal blocks constituting one device.
- FIG. 3 is a flowchart illustrating an example of encoding processing of the encoding device 30.
- FIG. 4 is a flowchart illustrating an example of a decoding process of the decoding device 40.
- FIG. 5 is a block diagram showing a third configuration example of the codec that restores a texture lost in encoding.
- 12 is a flowchart illustrating an example of encoding processing of the encoding device 50.
- 12 is a flowchart illustrating an example of a decoding process of the decoding device 60.
- A block diagram showing a fifth configuration example of the codec that restores a texture lost in encoding.
- 12 is a flowchart illustrating an example of encoding processing of the encoding device 50.
- 12 is a flowchart illustrating an example of a decoding process of the decoding device 60.
- FIG. 12 is a flowchart illustrating an example of processing performed by the encoding device 50 and the decoding device 60 to update the DB data of the texture DB 63.
- A block diagram showing an eighth configuration example of the codec that restores a texture lost in encoding.
- 3 is a block diagram illustrating a configuration example of a registration unit 151.
- FIG. 12 is a flowchart illustrating an example of encoding processing of the encoding device 50.
- A diagram showing an example of a multi-view image encoding system.
- A diagram showing a main configuration example of a multi-view image encoding device to which the present technology is applied.
- A diagram showing a main configuration example of a multi-view image decoding device to which the present technology is applied.
- A diagram showing an example of a hierarchical image encoding system.
- A diagram showing a main configuration example of a hierarchical image encoding device to which the present technology is applied.
- A diagram showing a main configuration example of a hierarchical image decoding device to which the present technology is applied.
- FIG. 20 is a block diagram illustrating a main configuration example of a computer.
- FIG. 1 is a block diagram showing a first configuration example of a codec that restores a texture lost in encoding.
- the codec includes an encoding device 10 and a decoding device 20.
- the encoding device 10 includes a texture component extraction unit 11, a removal unit 12, and an encoding unit 13.
- the texture component extraction unit 11 is supplied with an original image (moving image, still image) as an input image input to the encoding device 10.
- the texture component extraction unit 11 extracts a texture component of the input image from the input image, supplies the texture component to the removal unit 12, and transmits it to the decoding device 20.
- the removal unit 12 is supplied not only with the texture component of the input image from the texture component extraction unit 11 but also with the input image.
- The removal unit 12 calculates the difference between the input image and the texture component of the input image from the texture component extraction unit 11, thereby removing the texture component from the input image.
- The input image from which the texture component has been removed is supplied to the encoding unit 13 as the encoding target image to be encoded by the encoding unit 13.
- Because the encoding target image obtained by the removal unit 12 is the input image with its texture components, that is, its high-frequency components, removed, it can be regarded as the low-frequency component of the input image.
- The encoding unit 13 encodes the encoding target image from the removal unit 12 by an irreversible (lossy) encoding method, for example MPEG (Moving Picture Experts Group), AVC, HEVC (High Efficiency Video Coding), or another hybrid method combining predictive coding and orthogonal transform, and transmits the resulting encoded data to the decoding device 20.
- MPEG Moving Picture Experts Group
- AVC Advanced Video Coding
- HEVC High Efficiency Video Coding
- In irreversible encoding methods such as AVC, an image is converted into a frequency-domain signal and the frequency-domain signal is quantized.
- As a result, high-frequency components contained in the image, such as texture components, are lost.
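The loss described above can be illustrated with a minimal sketch. It assumes a 1-D orthonormal DCT as a stand-in for the codec's orthogonal transform and an illustrative set of quantization steps (fine for low frequencies, coarse for high ones); neither the function names nor the step values come from the source.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a 1-D signal (stand-in for the codec's transform)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of dct() above."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            s += scale * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out

def quantize(coeffs, steps):
    """Round each coefficient to the nearest multiple of its step size."""
    return [round(c / q) * q for c, q in zip(coeffs, steps)]

# A flat signal (value 10) carrying a fine alternating "texture" of amplitude 1.
signal = [10 + (1 if i % 2 == 0 else -1) for i in range(8)]
# Hypothetical step sizes: coarse quantization of the four highest frequencies.
steps = [1, 1, 1, 1, 16, 16, 16, 16]

coeffs = dct(signal)
quantized = quantize(coeffs, steps)
restored = idct(quantized)  # most of the alternating pattern is gone
```

With these assumed steps, every high-frequency coefficient rounds to zero, so the restored signal no longer carries most of the fine alternating pattern.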
- the decoding device 20 includes a decoding unit 21, a texture component restoration unit 22, and a synthesis unit 23.
- The decoding unit 21 receives the encoded data transmitted from the encoding device 10 (the encoding unit 13 thereof) and decodes it by a method corresponding to the encoding method of the encoding unit 13.
- the decoding unit 21 supplies a decoded image obtained by decoding the encoded data to the synthesis unit 23.
- the decoded image obtained by the decoding unit 21 corresponds to an encoding target image, that is, a low-frequency component of the input image.
- The texture component restoration unit 22 receives the texture component transmitted from the encoding device 10 (the texture component extraction unit 11 thereof), performs necessary processing to restore the texture component of the input image, and supplies it to the synthesis unit 23.
- The synthesis unit 23 synthesizes the low-frequency component of the input image, given as the decoded image from the decoding unit 21, with the texture component of the input image from the texture component restoration unit 22, thereby generating a restored image that restores the input image (original image), and outputs it as the output image of the decoding device 20.
- In the codec of FIG. 1, the texture component of the input image is transmitted from the encoding device 10 to the decoding device 20, so the decoding device 20 can obtain an output image in which the texture of the input image is restored; that is, the image quality of the output image can be improved.
- Transmission efficiency can be improved by reducing the number of texture components (number of patterns) transmitted from the encoding device 10 to the decoding device 20, but reducing that number degrades the image quality of the output image.
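The removal and synthesis steps of the FIG. 1 codec can be sketched as a difference and a sum. The sketch below works on a 1-D integer signal and assumes, purely for illustration, that the texture is whatever remains after subtracting a 3-tap moving average; the source does not specify how the texture component extraction unit 11 works, and all function names here are hypothetical.

```python
def extract_texture(signal):
    """Assumed extractor: texture = signal minus a 3-tap moving average."""
    n = len(signal)
    texture = []
    for i in range(n):
        left = signal[max(i - 1, 0)]        # clamp at the borders
        right = signal[min(i + 1, n - 1)]
        low = (left + signal[i] + right) // 3
        texture.append(signal[i] - low)
    return texture

def remove_texture(signal, texture):
    """Removal unit: difference between the input and its texture component."""
    return [s - t for s, t in zip(signal, texture)]

def synthesize(decoded, texture):
    """Synthesis unit, assumed here to be a simple sample-wise sum."""
    return [d + t for d, t in zip(decoded, texture)]

signal = [10, 14, 10, 14, 10]
texture = extract_texture(signal)
target = remove_texture(signal, texture)   # low-frequency encoding target
restored = synthesize(target, texture)     # no coding loss in this sketch
```

Because the lossy encoder and decoder are left out, the restored signal equals the input exactly; in the actual codec the decoded image would only approximate the encoding target.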
- FIG. 2 is a block diagram illustrating a second configuration example of a codec that restores a texture lost in encoding.
- the codec includes an encoding device 30 and a decoding device 40.
- the encoding device 30 includes a texture DB (database) 31, a texture component acquisition unit 32, a removal unit 33, an encoding unit 34, and a transmission unit 35.
- In the texture DB 31, texture components of various patterns, that is, a plurality of (types of) texture components, are registered.
- the texture component acquisition unit 32 is supplied with an original image as an input image.
- the texture component acquisition unit 32 acquires, for each predetermined block of the input image, a match component that is the texture component that most closely matches the block from the texture components registered in the texture DB 31, and supplies it to the removal unit 33. That is, the texture component acquisition unit 32 acquires, as a match component, a texture component that minimizes the sum of square errors of pixel values with a predetermined block of the input image, for example, from the texture components registered in the texture DB 31. This is supplied to the removing unit 33.
- the texture component acquisition unit 32 supplies identification information for identifying the match component to the transmission unit 35.
- In the texture DB 31, each texture component is registered together with unique identification information for identifying that texture component.
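The match-component search described above (minimum sum of squared errors against a block) can be sketched as follows. The dictionary layout of the database, the function names, and the tiny 2x2 textures are illustrative assumptions, not details from the source.

```python
def sum_squared_error(block, texture):
    """Sum of squared pixel differences between two equally sized 2-D lists."""
    return sum(
        (b - t) ** 2
        for row_b, row_t in zip(block, texture)
        for b, t in zip(row_b, row_t)
    )

def find_match_component(block, texture_db):
    """Return (identification info, texture) of the entry with minimum SSE.

    texture_db is assumed to map a unique ID to a texture component.
    """
    best_id = min(texture_db, key=lambda tid: sum_squared_error(block, texture_db[tid]))
    return best_id, texture_db[best_id]

# Hypothetical database with three 2x2 texture components.
texture_db = {
    0: [[0, 0], [0, 0]],
    1: [[4, -4], [-4, 4]],
    2: [[-4, 4], [4, -4]],
}
block = [[3, -5], [-4, 4]]
match_id, match = find_match_component(block, texture_db)  # match_id is 1 here
```

Only `match_id` would need to be sent to the decoder, provided the decoder holds the same database.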
- the removal unit 33 is also supplied with an input image.
- The removal unit 33 calculates the difference between the input image and the match component of the input image from the texture component acquisition unit 32, thereby removing the texture component serving as the match component from the input image.
- The input image from which the texture component has been removed is supplied to the encoding unit 34 as the encoding target image to be encoded by the encoding unit 34.
- Because the encoding target image obtained by the removal unit 33 is the input image with its texture components, that is, its high-frequency components, removed, it can be regarded as the low-frequency component of the input image.
- The encoding unit 34 encodes the encoding target image from the removal unit 33 by, for example, MPEG, AVC, HEVC, or another lossy encoding method, and supplies the resulting encoded data to the transmission unit 35.
- the encoding target image of the encoding unit 34 is an input image from which the texture component is removed, and the encoding is performed more efficiently than when the input image itself is the encoding target image. That is, the amount of encoded data can be reduced.
- In irreversible encoding, the image is converted into a frequency-domain signal, and the frequency-domain signal is quantized.
- As a result, high-frequency components such as texture components are lost.
- the transmission unit 35 transmits the identification information from the texture component acquisition unit 32 and the encoded data from the encoding unit 34.
- The identification information and encoded data transmitted by the transmission unit 35 are supplied to the decoding device 40 via a transmission medium (not shown), or are recorded on a recording medium (not shown), read from the recording medium, and supplied to the decoding device 40.
- The transmission unit 35 can transmit the identification information and the encoded data separately, or can transmit them integrally, for example by multiplexing the identification information and the encoded data.
- Because the texture component acquisition unit 32 acquires, for each predetermined block of the input image, the match component that most closely matches the block, the match component, and thus the identification information, is obtained for each block.
- the transmission unit 35 can transmit the identification information for each block, or can transmit the information in a segmentation unit larger than the block, that is, for example, a frame unit.
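One way to transmit the identification information and the encoded data integrally is to multiplex them into a single byte stream. The container layout below (a block count, one 16-bit ID per block, a payload length, then the payload) is a hypothetical format invented for this sketch; the source does not define a bitstream syntax.

```python
import struct

def multiplex(id_list, encoded_data):
    """Pack per-block IDs and the encoded payload into one byte stream.

    Hypothetical layout: 4-byte block count, 2 bytes per ID,
    4-byte payload length, payload bytes (all big-endian).
    """
    header = struct.pack(">I", len(id_list))
    ids = struct.pack(f">{len(id_list)}H", *id_list)
    length = struct.pack(">I", len(encoded_data))
    return header + ids + length + encoded_data

def demultiplex(stream):
    """Inverse of multiplex(): recover the ID list and the payload."""
    (count,) = struct.unpack_from(">I", stream, 0)
    ids = list(struct.unpack_from(f">{count}H", stream, 4))
    offset = 4 + 2 * count
    (length,) = struct.unpack_from(">I", stream, offset)
    payload = stream[offset + 4: offset + 4 + length]
    return ids, payload

stream = multiplex([1, 0, 7], b"\x12\x34\x56")
ids, payload = demultiplex(stream)
```

Sending the IDs once per frame, as in this sketch, corresponds to transmitting the identification information in a unit larger than a block.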
- the decoding device 40 includes a receiving unit 41, a decoding unit 42, a texture DB 43, a texture component acquisition unit 44, and a synthesis unit 45.
- The receiving unit 41 receives the encoded data and the identification information transmitted from the transmission unit 35, supplies the encoded data to the decoding unit 42, and supplies the identification information to the texture component acquisition unit 44.
- the decoding unit 42 decodes the encoded data from the receiving unit 41 by a method corresponding to the encoding method of the encoding unit 34, and supplies a decoded image obtained as a result to the combining unit 45.
- the decoded image obtained by the decoding unit 42 corresponds to an encoding target image, that is, here, a low frequency component of the input image.
- In the texture DB 43, texture components of various patterns, that is, a plurality of texture components, are registered.
- In the texture DB 43, at least the same plurality of texture components as those registered in the texture DB 31 of the encoding device 30 are registered.
- the texture component acquisition unit 44 acquires the texture component as the match component identified by the identification information from the reception unit 41 from the texture component registered in the texture DB 43 and supplies the texture component to the synthesis unit 45.
- The synthesizing unit 45 synthesizes the low-frequency component of the input image, given as the decoded image from the decoding unit 42, with the texture component given as the match component from the texture component acquiring unit 44, thereby generating a restored image that restores the input image (original image), and outputs it as an output image.
- In the codec of FIG. 2, a plurality of texture components are registered (held) in the texture DBs 31 and 43 in advance, and identification information for identifying the texture component that, among those texture components, matches the input image as the match component is transmitted from the encoding device 30 to the decoding device 40.
- the decoding device 40 can obtain an output image obtained by restoring the texture of the input image using the texture component as the match component identified by the identification information, that is, the image quality of the output image can be improved.
- In the encoding device 30 and the decoding device 40, a plurality of texture components are registered in advance in the texture DBs 31 and 43, and the encoding device 30 transmits identification information for identifying the texture component that matches the input image instead of the texture component itself. Transmission efficiency (compression efficiency) can therefore be improved compared with the codec of FIG. 1, in which the texture component itself is transmitted.
- In addition, the image obtained by removing the texture component from the input image is used as the encoding target image and is encoded by the encoding unit 34, so the compression efficiency can be improved compared with the case where the encoding unit 34 encodes the input image itself, and as a result the transmission efficiency can be improved.
- the image quality of the output image can be improved and the transmission efficiency can be improved.
- the image quality of the output image can be further improved without reducing the transmission efficiency.
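The gain from sending identification information instead of the texture itself can be estimated with rough arithmetic. The sketch assumes uncompressed texture blocks and fixed-length identifiers with no entropy coding; the block size, bit depth, and database size are illustrative numbers that do not appear in the source.

```python
def bits_for_raw_texture(block_size, bit_depth):
    """Bits needed to send one uncompressed block_size x block_size texture."""
    return block_size * block_size * bit_depth

def bits_for_identifier(num_textures):
    """Bits of a fixed-length ID that can address num_textures DB entries."""
    return max(1, (num_textures - 1).bit_length())

raw_bits = bits_for_raw_texture(block_size=8, bit_depth=8)  # 512 bits per block
id_bits = bits_for_identifier(num_textures=1024)            # 10 bits per block
```

Under these assumptions, the per-block side information shrinks from 512 bits to 10 bits, at the cost of both sides holding the same database in advance.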
- FIG. 3 is a flowchart for explaining an example of the encoding process of the encoding device 30 of FIG. 2.
- the encoding device 30 performs the encoding process according to the flowchart of FIG. 3 with the frames of the input image supplied to the encoding device 30 as sequential frames of interest.
- The texture component acquisition unit 32 divides the target frame of the input image into blocks for detecting a match component that matches a texture component of the texture DB 31.
- The texture component acquisition unit 32 selects, as the target block, one block that has not yet been selected from the blocks of the target frame of the input image, and the process proceeds to step S12.
- In step S12, the texture component acquisition unit 32 acquires a match component, which is the texture component that most closely matches the target block of the input image, from the texture components registered in the texture DB 31.
- the texture component acquisition unit 32 acquires, for example, the texture component most similar to the texture of the target block among the texture components registered in the texture DB 31 as a match component.
- the texture component acquisition unit 32 supplies the match component acquired for the block of interest to the removal unit 33, and the process proceeds to step S13.
- In step S13, the texture component acquisition unit 32 acquires identification information for identifying the match component from the texture DB 31 and supplies the identification information to the transmission unit 35, and the process proceeds to step S14.
- In step S14, the texture component acquisition unit 32 determines whether all blocks of the target frame of the input image have been selected as the target block.
- If it is determined in step S14 that not all blocks of the target frame of the input image have yet been selected as the target block, the process returns to step S11, and the same process is repeated thereafter.
- If it is determined in step S14 that all blocks of the target frame of the input image have been selected as the target block, the process proceeds to step S15.
- In step S15, the transmission unit 35 creates an identification information map in which each block of the target frame of the input image is associated with the identification information of the match component acquired for that block (the identification information from the texture component acquisition unit 32), and the process proceeds to step S16.
- In step S16, the removal unit 33 removes the match component of each block, supplied from the texture component acquisition unit 32, from the corresponding block of the target frame of the input image, thereby generating the low-frequency component (of the target frame) of the input image, that is, the difference between the input image and the texture component as the match component, as the encoding target image. The removal unit 33 supplies it to the encoding unit 34, and the process proceeds to step S17.
- In step S17, the encoding unit 34 encodes the encoding target image from the removal unit 33 by the lossy encoding method and supplies the resulting encoded data to the transmission unit 35, and the process proceeds to step S18.
- In step S18, the transmission unit 35 transmits the identification information map and the encoded data from the encoding unit 34, and the encoding device 30 ends the process for the frame of interest of the input image.
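The per-frame flow of steps S11 to S18 can be sketched as below. This is a simplified illustration: the frame is a 2-D list whose dimensions are assumed to be multiples of the block size, the lossy encoder is passed in as a placeholder function, and all names (`encode_frame`, `lossy_encode`, the dictionary-based identification map) are hypothetical.

```python
def sse(a, b):
    """Sum of squared errors between two equally sized 2-D lists."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def encode_frame(frame, texture_db, size, lossy_encode):
    """Return (identification map, encoded data) for one frame."""
    id_map = {}
    target = [row[:] for row in frame]            # becomes the encoding target
    for top in range(0, len(frame), size):        # S11: pick the next block
        for left in range(0, len(frame[0]), size):
            block = [row[left:left + size] for row in frame[top:top + size]]
            # S12: match component = DB entry with minimum SSE to the block
            match_id = min(texture_db, key=lambda tid: sse(block, texture_db[tid]))
            id_map[(top, left)] = match_id        # S13/S15: record the ID
            texture = texture_db[match_id]
            for dy in range(size):                # S16: subtract the match
                for dx in range(size):
                    target[top + dy][left + dx] -= texture[dy][dx]
    encoded = lossy_encode(target)                # S17: placeholder encoder
    return id_map, encoded                        # S18: both are transmitted

texture_db = {0: [[0, 0], [0, 0]], 1: [[2, -2], [-2, 2]]}
frame = [[12, 8, 10, 10],
         [8, 12, 10, 10]]
# Identity function stands in for MPEG/AVC/HEVC, which are out of scope here.
id_map, encoded = encode_frame(frame, texture_db, 2, lambda image: image)
```

In this toy frame the left block matches the checkered texture and the right block matches the flat one, so the encoding target becomes a uniform image.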
- FIG. 4 is a flowchart for explaining an example of the decoding process of the decoding device 40 of FIG. 2.
- In step S21, the receiving unit 41 receives the encoded data and the identification information map for one frame transmitted from the encoding device 30. The receiving unit 41 then supplies the encoded data to the decoding unit 42 and the identification information map to the texture component acquisition unit 44, and the process proceeds from step S21 to step S22.
- In step S22, the decoding unit 42 decodes the encoded data from the receiving unit 41 and supplies the resulting decoded image (frame) to the synthesizing unit 45, and the process proceeds to step S23.
- In step S23, the texture component acquisition unit 44 selects, as the target block, one block of the identification information map from the reception unit 41 that has not yet been selected, and the process proceeds to step S24.
- In step S24, the texture component acquisition unit 44 acquires, from the texture components of the texture DB 43, the texture component identified by the identification information of the target block as the match component of the target block. The texture component acquisition unit 44 then supplies the match component of the target block to the synthesis unit 45, and the process proceeds from step S24 to step S25.
- In step S25, the texture component acquisition unit 44 determines whether all the blocks of the identification information map have been selected as the target block.
- If it is determined in step S25 that not all blocks of the identification information map have yet been selected as the target block, the process returns to step S23, and the same process is repeated thereafter.
- If it is determined in step S25 that all blocks of the identification information map have been selected as the target block, the process proceeds to step S26.
- In step S26, the synthesis unit 45 synthesizes the texture component serving as the match component of each block, supplied from the texture component acquisition unit 44, into the position of the corresponding block in the decoded image (frame) from the decoding unit 42. A restored image (frame) that restores the input image (original image) is thereby generated and output as an output image, and the process for the encoded data and the identification information map for one frame ends.
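The decoding flow of steps S21 to S26 can be sketched in the same simplified setting. Synthesis is assumed to be a pixel-wise sum (the inverse of the difference taken on the encoder side), the lossy decoder is a placeholder function, and the names and the dictionary-based identification map are hypothetical. Unlike the flowchart, the sketch adds each texture as soon as it is looked up rather than in a separate final step.

```python
def decode_frame(id_map, encoded, texture_db, size, lossy_decode):
    """Restore one frame from encoded data and an identification map."""
    decoded = lossy_decode(encoded)               # S22: placeholder decoder
    output = [row[:] for row in decoded]
    for (top, left), match_id in id_map.items():  # S23/S25: visit every block
        texture = texture_db[match_id]            # S24: look up by ID
        for dy in range(size):                    # S26: add at block position
            for dx in range(size):
                output[top + dy][left + dx] += texture[dy][dx]
    return output

# The decoder-side DB must hold the same entries as the encoder-side DB.
texture_db = {0: [[0, 0], [0, 0]], 1: [[2, -2], [-2, 2]]}
id_map = {(0, 0): 1, (0, 2): 0}
encoded = [[10, 10, 10, 10],
           [10, 10, 10, 10]]
# Identity function stands in for the real decoder in this sketch.
output = decode_frame(id_map, encoded, texture_db, 2, lambda data: data)
```

The checkered texture reappears in the left block even though only its identifier was transmitted.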
- FIG. 5 is a block diagram showing a third configuration example of the codec for restoring the texture lost in the encoding.
- In FIG. 5, the codec is the same as in FIG. 2 in that it includes an encoding device 30 and a decoding device 40.
- Further, in FIG. 5, the encoding device 30 is the same as in FIG. 2 in that it includes a texture DB 31, a texture component acquisition unit 32, an encoding unit 34, and a transmission unit 35, and the decoding device 40 is the same as in FIG. 2 in that it includes a receiving unit 41, a decoding unit 42, a texture DB 43, a texture component acquisition unit 44, and a synthesis unit 45.
- However, in FIG. 5, the encoding device 30 differs from the case of FIG. 2 in that the texture component is not removed from the input image before encoding.
- Therefore, in FIG. 5, the encoding unit 34 is supplied, as the encoding target image, with the input image itself rather than the input image from which the texture component has been removed, that is, the low-frequency component of the input image.
- The encoding unit 34 then encodes the input image itself as the encoding target image.
- In the encoding unit 34, the encoding target image is encoded by a lossy (irreversible) encoding method such as MPEG, AVC, or HEVC, as described above.
- With the lossy encoding of the encoding unit 34, high-frequency components of the input image serving as the encoding target image, for example at least a part of the texture component, are lost.
- In the decoding device 40, the synthesis unit 45 synthesizes the decoded image from the decoding unit 42, that is, the input image in which (at least a part of) the texture component is lost, with the texture component serving as the match component from the texture component acquisition unit 44.
- Except that the encoding unit 34 of the encoding device 30 performs lossy encoding with the input image itself, rather than the low-frequency component of the input image, as the encoding target image, the codec of FIG. 5 performs the same processing as the codec of FIG. 2.
- Therefore, the codec of FIG. 5 can also improve both the image quality of the output image and the transmission efficiency.
- FIG. 6 is a block diagram showing a fourth configuration example of the codec for restoring the texture lost in the encoding.
- In FIG. 6, the codec includes an encoding device 50 and a decoding device 60.
- The encoding device 50 includes a texture DB 51, a separation unit 52, a base synthesis unit 53, a match component determination unit 54, a removal unit 55, an encoding unit 56, and a transmission unit 57.
- In the texture DB 51, various types of texture components, that is, a plurality of texture components, are registered.
- In the texture DB 51, the texture components are converted into bases (by base learning) and registered in the form of bases.
- Here, the bases of a texture component are image components that can express the texture as a linear combination of a finite number of them.
- The original image is supplied to the separation unit 52 as the input image.
- The separation unit 52 filters the input image to separate its low-frequency component and supplies the low-frequency component to the base synthesis unit 53.
- For each of the plurality of texture components whose bases are registered in the texture DB 51, the base synthesis unit 53 performs base synthesis using the low-frequency component of the input image from the separation unit 52 and the bases registered in the texture DB 51.
- By base synthesis, the base synthesis unit 53 generates, for each of the plurality of texture components whose bases are registered in the texture DB 51, a texture component as a restoration component that restores the texture component of the input image, and supplies the restoration components to the match component determination unit 54.
- The match component determination unit 54 is supplied with the restoration components for each of the plurality of texture components from the base synthesis unit 53 and is also supplied with the input image.
- For each predetermined block of the input image, the match component determination unit 54 determines, from among the restoration components from the base synthesis unit 53, the match component, that is, the restoration component that best matches the block, and supplies it to the removal unit 55.
- That is, the match component determination unit 54 divides the input image (each frame) into blocks for determining the match component.
- As the block for determining the match component, a block of any size can be adopted, for example a block of 16 × 16 pixels (horizontal × vertical).
- For each block of the input image, the match component determination unit 54 determines, as the match component, the restoration component with the smallest error with respect to the block among the restoration components from the base synthesis unit 53.
- As the error of the restoration component with respect to the block, for example, the S/N, that is, the sum of squared differences between the pixel values of the restoration component and the block (sum of squared errors), or the difference between the restoration component and the block in a predetermined feature amount such as activity, can be adopted.
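As a rough illustration of these two error measures, the sketch below computes a sum of squared errors and an activity difference for flattened blocks and picks the candidate with the smallest error. All names are hypothetical, and the activity measure used here (total absolute difference between neighboring pixels) is an assumption, since the source does not define one.

```python
def sse(component, block):
    """Sum of squared differences between co-located pixel values."""
    return sum((c - b) ** 2 for c, b in zip(component, block))


def activity(pixels):
    """Hypothetical activity measure: total absolute difference between neighbors."""
    return sum(abs(a - b) for a, b in zip(pixels, pixels[1:]))


def activity_error(component, block):
    """Difference in activity between the restoration component and the block."""
    return abs(activity(component) - activity(block))


def pick_match(block, restorations, error=sse):
    """Return the id of the restoration component with the smallest error."""
    return min(restorations, key=lambda tex_id: error(restorations[tex_id], block))


block = [1, 5, 1, 5]                                  # flattened block, toy values
restorations = {"forest": [1, 4, 1, 4], "cloth": [3, 3, 3, 3]}
by_sse = pick_match(block, restorations)
by_activity = pick_match(block, restorations, error=activity_error)
```

Passing the error function as a parameter keeps the selection logic independent of which feature amount is compared.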
- Further, the match component determination unit 54 supplies identification information identifying the match component to the transmission unit 57.
- In the texture DB 51, the bases of each texture component are registered together with unique identification information identifying that texture component.
- The match component determination unit 54 acquires from the texture DB 51 the identification information of the texture component (its bases) determined as the match component for each block of the input image, and supplies it to the transmission unit 57.
- In addition to the match component from the match component determination unit 54, the removal unit 55 is supplied with the input image.
- The removal unit 55 calculates the difference between the input image and the match component (for each block) of the input image from the match component determination unit 54, thereby removing the texture component serving as the match component from the input image.
- The input image from which the texture component has been removed is supplied to the encoding unit 56 as the encoding target image to be encoded by the encoding unit 56.
- Since the encoding target image obtained by the removal unit 55 is an image obtained by removing the texture components, that is, high-frequency components, from the input image, it can be regarded as the low-frequency component of the input image.
- The encoding unit 56 encodes the encoding target image from the removal unit 55 by a lossy (irreversible) encoding method, for example a hybrid method combining predictive coding and orthogonal transform such as MPEG, AVC, or HEVC, and supplies the resulting encoded data to the transmission unit 57.
- In such lossy encoding, the image is converted into a frequency-domain signal, and the frequency-domain signal is quantized.
- As a result, high-frequency components such as texture components are lost.
- The transmission unit 57 transmits the identification information from the match component determination unit 54 and the encoded data from the encoding unit 56.
- The identification information and the encoded data transmitted by the transmission unit 57 are supplied to the decoding device 60 via a transmission medium (not shown), or are recorded on a recording medium (not shown), read from the recording medium, and supplied to the decoding device 60.
- Like the transmission unit 35 of the encoding device 30, the transmission unit 57 can transmit the identification information and the encoded data separately.
- Further, the transmission unit 57 can transmit the identification information block by block, or in units of a segment larger than a block, for example frame by frame.
- The decoding device 60 includes a receiving unit 61, a decoding unit 62, a texture DB 63, a base synthesis unit 64, a separation unit 65, and a synthesis unit 66.
- The receiving unit 61 receives the encoded data and the identification information transmitted from the transmission unit 57, supplies the encoded data to the decoding unit 62, and supplies the identification information to the base synthesis unit 64.
- The decoding unit 62 decodes the encoded data from the receiving unit 61 by a method corresponding to the encoding method of the encoding unit 56, and supplies the resulting decoded image to the separation unit 65 and the synthesis unit 66.
- The decoded image obtained by the decoding unit 62 corresponds to the encoding target image, that is, here, the low-frequency component of the input image.
- In the texture DB 63, various types of texture components, that is, a plurality of texture components, are registered. For example, at least the plurality of texture components registered in the texture DB 51 of the encoding device 50 are registered in the texture DB 63.
- As in the texture DB 51, the texture components in the texture DB 63 are converted into bases and registered in the form of bases.
- The base synthesis unit 64 is supplied with the low-frequency component of the decoded image from the separation unit 65.
- For each block of the decoded image corresponding to a block of the input image, the base synthesis unit 64 acquires, from the bases registered in the texture DB 63, the bases of the texture component serving as the match component identified by the identification information from the receiving unit 61.
- Like the base synthesis unit 53, the base synthesis unit 64 performs base synthesis for each block of the decoded image, using the low-frequency component of the decoded image supplied from the separation unit 65 and the bases acquired from the texture DB 63.
- By base synthesis, the base synthesis unit 64 generates, for each block of the decoded image, a texture component as a restoration component that restores the match component serving as the texture component of the input image, and supplies it to the synthesis unit 66.
- The separation unit 65 filters the decoded image from the decoding unit 62 to separate the low-frequency component of the decoded image, and supplies the low-frequency component to the base synthesis unit 64.
- The passbands of the filtering performed by the separation units 52 and 65 are the same.
- The filtering of the separation unit 65 removes distortion generated in the decoded image, and an image substantially similar to the decoded image is obtained as the low-frequency component of the decoded image.
- The separation unit 65 is therefore not essential, and the decoding device 60 can be configured without it.
- The synthesis unit 66 synthesizes the low-frequency component of the input image, given as the decoded image from the decoding unit 62, with the texture component serving as the match component from the base synthesis unit 64, thereby generating a restored image that restores the input image (original image), and outputs it as the output image.
- In the codec of FIG. 6, a plurality of texture components are registered (held) in advance in the texture DBs 51 and 63, and identification information identifying, among those texture components, the one that matches the input image as the match component is transmitted from the encoding device 50 to the decoding device 60.
- Therefore, the decoding device 60 can use the texture component identified by the identification information as the match component to obtain an output image in which the texture of the input image is restored; that is, the image quality of the output image can be improved.
- Further, because the encoding device 50 transmits to the decoding device 60 the identification information identifying the texture component, rather than the texture component itself, as the match component matching the input image, transmission efficiency (compression efficiency) can be improved compared with the codec of FIG. 1, in which the texture component itself is transmitted.
- In the encoding device 50, an image obtained by removing the texture component from the input image is used as the encoding target image and is encoded by the encoding unit 56. Compared with encoding the input image itself by the encoding unit 56, the compression efficiency, and consequently the transmission efficiency, can be improved.
- As described above, the codec of FIG. 6 can improve both the image quality of the output image and the transmission efficiency, in the same manner as the codec described above.
- Further, in the codec of FIG. 6, texture components are registered in the form of bases. The capacity required for the texture DBs 51 and 63 can therefore be reduced compared with registering the texture components directly in the form of images.
- With bases, texture components of various patterns can be generated for each type of texture.
- Here, texture types can be classified according to the object (target) that has the texture, such as forest, rock, water (surface), or cloth.
- From the bases of one such type of texture, texture components of various patterns belonging to that type can be generated.
- FIG. 7 is a diagram for explaining the outline of the base learning that bases the texture components.
- In base learning, images serving as texture components are prepared as learning images for each type of texture, such as forest, rock, water, and cloth, and the bases of each type of texture component are obtained using the learning images.
- That is, the learning image is first treated as a high-resolution image, and the high-resolution image is filtered (for example, with filtering similar to that performed by the separation units 52 and 65 in FIG. 6) to obtain its low-frequency component, that is, a low-resolution image.
- Base learning is then performed on pairs of a high-resolution image (the learning image) and the low-resolution image obtained from it, yielding a pair consisting of a set of high-resolution bases and a set of low-resolution bases.
- Each high-resolution basis corresponds to a low-resolution basis. That is, for each low-resolution basis there is a corresponding (paired) high-resolution basis.
- As the base learning method, for example, the K-SVD method or the K-means method can be adopted.
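One way to picture how paired bases could come out of K-means is to cluster concatenated (low-resolution, high-resolution) patch vectors and then split each centroid back into its two halves. The sketch below does this in plain Python. It is an assumption made for illustration, not the procedure of the source: the function names are hypothetical, centroids are seeded with the first `k` samples, and patches are short flat lists.

```python
def kmeans(samples, k, iters=10):
    """Plain Lloyd's K-means; the first k samples seed the centroids."""
    centroids = [list(s) for s in samples[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for s in samples:
            dists = [sum((a - b) ** 2 for a, b in zip(s, c)) for c in centroids]
            groups[dists.index(min(dists))].append(s)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def learn_paired_bases(pairs, k):
    """pairs: list of (low_res_patch, high_res_patch), each a flat list."""
    n_low = len(pairs[0][0])
    joint = [list(lo) + list(hi) for lo, hi in pairs]   # cluster both halves jointly
    centroids = kmeans(joint, k)
    low_bases = [c[:n_low] for c in centroids]
    high_bases = [c[n_low:] for c in centroids]         # high_bases[i] pairs with low_bases[i]
    return low_bases, high_bases


pairs = [([0, 0], [0, 0]), ([10, 10], [5, -5]), ([0, 2], [0, 2]), ([10, 12], [5, -3])]
low_bases, high_bases = learn_paired_bases(pairs, k=2)
```

Clustering the joint vector is what keeps index `i` of the low-resolution set aligned with index `i` of the high-resolution set.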
- FIG. 8 is a diagram for explaining the outline of the base synthesis for generating the texture component using the base.
- Base synthesis can be performed, for example, by the method of Matching Pursuits.
- Let a given image be the target image, and let the low-frequency component of the target image, obtained by removing part or all of the texture component from the target image, be the restoration target image whose texture is to be restored.
- In base synthesis, the blocks whose texture is to be restored are selected from the restoration target image one by one as the target block, for example in raster scan order.
- Then, a plurality of bases are selected as selected bases from the low-resolution bases of a certain type of texture, and a prediction block that predicts the target block (its image) is generated as a linear combination of the selected bases with coefficients w_i.
- The number of coefficients w_i equals the number of selected bases.
- Determining the coefficients w_i for every possible combination of selected bases may require an enormous computational cost, so the maximum number of bases chosen as selected bases can be limited to a predetermined value, and the coefficients w_i determined only for combinations of selected bases within that limit.
- Once the coefficients w_i have been determined for each combination of selected bases drawn from the low-resolution bases, the coefficients w_i that, among those for the respective combinations, minimize the error of the prediction block with respect to the target block are determined as the generation coefficients w_i for generating the texture component.
- Further, the high-resolution bases paired with the low-resolution bases that served as the selected bases when the prediction block with the smallest error was obtained are determined as the generation bases for generating the texture component.
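The idea of fitting coefficients w_i on the low-resolution bases and then reusing them with the paired high-resolution bases can be sketched with a greedy matching pursuit. This is a simplified stand-in written for illustration: the source describes evaluating combinations of selected bases, whereas this sketch picks one basis at a time, and the names `matching_pursuit`, `synthesize_texture`, and `max_bases` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(target, low_bases, max_bases=2):
    """Greedily pick up to max_bases low-resolution bases; return {index: w_i}."""
    residual = list(target)
    coeffs = {}
    for _ in range(max_bases):
        best, best_gain, best_w = None, 0.0, 0.0
        for i, basis in enumerate(low_bases):
            norm = dot(basis, basis)
            if norm == 0:
                continue
            w = dot(residual, basis) / norm        # least-squares weight for this basis
            gain = w * w * norm                    # residual energy it would remove
            if gain > best_gain:
                best, best_gain, best_w = i, gain, w
        if best is None:                           # residual already fully explained
            break
        coeffs[best] = coeffs.get(best, 0.0) + best_w
        residual = [r - best_w * x for r, x in zip(residual, low_bases[best])]
    return coeffs


def synthesize_texture(coeffs, high_bases):
    """Apply the generation coefficients to the paired high-resolution bases."""
    size = len(high_bases[0])
    return [sum(w * high_bases[i][j] for i, w in coeffs.items()) for j in range(size)]


low_bases = [[1, 0], [0, 1]]
high_bases = [[1, 1, 0], [0, 1, 1]]        # high_bases[i] is paired with low_bases[i]
coeffs = matching_pursuit([3, 4], low_bases)
texture = synthesize_texture(coeffs, high_bases)
```

The `max_bases` cap plays the role of the limit on the number of selected bases: it bounds the work per block at the cost of a coarser prediction.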
- Base synthesis can also be performed by methods other than Matching Pursuits, for example Iterative Reweighted Least Squares. Base synthesis is described in, for example, J. Yang, J. Wright, T. S. Huang, and Y. Ma (2010), "Image Super-Resolution via Sparse Representation," IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 2861-2873.
- FIG. 9 is a flowchart for explaining an example of the encoding process of the encoding device 50 of FIG.
- The encoding device 50 treats the frames of the input image supplied to it sequentially as the target frame and performs the encoding process according to the flowchart of FIG. 9.
- In step S41, the separation unit 52 separates the low-frequency component from the target frame of the input image and supplies the low-frequency component to the base synthesis unit 53.
- Further, the base synthesis unit 53 divides the target frame of the input image into blocks for determining the match component. The process then proceeds from step S41 to step S42, where the base synthesis unit 53 selects, as the target block, one block of the target frame of the input image that has not yet been selected as the target block, and the process proceeds to step S43.
- In step S43, the base synthesis unit 53 selects, as the target component, one texture component that has not yet been selected as the target component from among the plurality (of types) of texture components whose bases are registered (stored) in the texture DB 51, and the process proceeds to step S44.
- In step S44, the base synthesis unit 53 performs base synthesis using the low-frequency component of the target block, taken from the low-frequency component of the input image supplied by the separation unit 52, together with the bases of the target component, thereby obtaining a restoration component that restores the texture component of the target block.
- The base synthesis unit 53 then supplies the restoration component of the target block to the match component determination unit 54, and the process proceeds from step S44 to step S45.
- In step S45, the match component determination unit 54 calculates the error of the restoration component of the target block with respect to the target block of the input image, and the process proceeds to step S46.
- In step S46, the match component determination unit 54 determines whether the error of the restoration component of the target block is smaller than the minimum error for the target block.
- Here, the minimum error for the target block is the smallest of the errors of the restoration components of the target block obtained so far for those texture components, among the texture components whose bases are registered in the texture DB 51, that have already been selected as the target component. A predetermined large value is adopted as the initial value of the minimum error for the target block.
- If it is determined in step S46 that the error of the restoration component of the target block is not smaller than the minimum error for the target block, the process skips step S47 and proceeds to step S48.
- If it is determined in step S46 that the error of the restoration component of the target block is smaller than the minimum error for the target block, the process proceeds to step S47.
- In step S47, the match component determination unit 54 updates the minimum error for the target block to the error of the restoration component of the target block, that is, the latest error, and the process proceeds to step S48.
- In step S48, the match component determination unit 54 determines whether the error of the restoration component of the target block has been acquired for all texture components whose bases are registered in the texture DB 51.
- If it is determined in step S48 that the error has not yet been acquired for all of those texture components, that is, if some texture component whose bases are registered in the texture DB 51 has not yet been selected as the target component, the process returns from step S48 to step S43, and the same processing is repeated.
- If it is determined in step S48 that the error of the restoration component of the target block has been acquired for all texture components whose bases are registered in the texture DB 51, that is, if the smallest of those errors has been obtained as the minimum error for the target block, the process proceeds from step S48 to step S49.
- In step S49, the match component determination unit 54 determines, as the match component, the texture component for which the minimum error for the target block was obtained among the texture components whose bases are registered in the texture DB 51. The match component determination unit 54 then acquires the identification information of the match component and supplies it to the transmission unit 57, and the process proceeds from step S49 to step S50.
- In step S50, the base synthesis unit 53 determines whether all blocks of the target frame of the input image have been selected as the target block.
- If it is determined in step S50 that not all blocks of the target frame of the input image have yet been selected as the target block, the process returns to step S42, and the same processing is repeated.
- If it is determined in step S50 that all blocks of the target frame of the input image have been selected as the target block, the process proceeds to step S51.
- In step S51, the transmission unit 57 generates an identification information map in which each block of the target frame of the input image is associated with the identification information of the match component of that block (the identification information from the match component determination unit 54), and the process proceeds to step S52.
- In step S52, the removal unit 55 removes the match component of each block, supplied from the match component determination unit 54, from each block of the target frame of the input image, thereby generating, as the encoding target image, the low-frequency component (of the target frame) of the input image, that is, the difference between the input image and the texture component serving as the match component. The removal unit 55 supplies the encoding target image to the encoding unit 56, and the process proceeds to step S53.
- In step S53, the encoding unit 56 encodes the encoding target image from the removal unit 55 by the lossy encoding method and supplies the resulting encoded data to the transmission unit 57, and the process proceeds to step S54.
- In step S54, the transmission unit 57 transmits the identification information map and the encoded data from the encoding unit 56, and the encoding device 50 ends the processing for the target frame of the input image.
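The nested loops of steps S42 to S51 amount to a running-minimum search per block. The following sketch shows that control flow only. It rests on several assumptions: `build_id_map` and `toy_synthesis` are hypothetical names, base synthesis is passed in as a function so that any method can be plugged in, the error is taken to be the sum of squared errors, and the toy stand-in merely scales the stored bases.

```python
def build_id_map(input_blocks, low_blocks, texture_db, base_synthesis):
    """Return {block position: texture id}, the identification information map."""
    id_map = {}
    for pos, block in input_blocks.items():                    # S42: next target block
        min_error, match_id = float("inf"), None               # large initial minimum error
        for tex_id, bases in texture_db.items():               # S43: next target component
            restored = base_synthesis(low_blocks[pos], bases)  # S44: restoration component
            error = sum((r - p) ** 2 for r, p in zip(restored, block))  # S45
            if error < min_error:                              # S46
                min_error, match_id = error, tex_id            # S47 (id tracked alongside)
        id_map[pos] = match_id                                 # S49 and S51
    return id_map


def toy_synthesis(low_block, bases):
    """Stand-in for base synthesis: scale the bases by the first low-frequency value."""
    return [low_block[0] * b for b in bases]


texture_db = {"rock": [2, -2], "water": [3, 3]}
input_blocks = {(0, 0): [2, -2], (0, 1): [3, 3]}
low_blocks = {(0, 0): [1, 1], (0, 1): [1, 1]}
id_map = build_id_map(input_blocks, low_blocks, texture_db, toy_synthesis)
```

Unlike the flowchart, which updates only the minimum error in step S47 and resolves the match component in step S49, this sketch records the candidate identifier at the same time as the error.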
- FIG. 10 is a flowchart for explaining an example of the decoding process of the decoding device 60 of FIG.
- In step S61, the receiving unit 61 receives the encoded data and the identification information map for one frame transmitted from the encoding device 50. The receiving unit 61 then supplies the encoded data to the decoding unit 62 and the identification information map to the base synthesis unit 64, and the process proceeds from step S61 to step S62.
- In step S62, the decoding unit 62 decodes the encoded data from the receiving unit 61 and supplies the resulting decoded image (frame) to the separation unit 65 and the synthesis unit 66, and the process proceeds to step S63.
- In step S63, the separation unit 65 separates the low-frequency component of the decoded image from the decoded image supplied by the decoding unit 62 and supplies the low-frequency component to the base synthesis unit 64, and the process proceeds to step S64.
- In step S64, the base synthesis unit 64 selects, as the target block, one block of the identification information map from the receiving unit 61 that has not yet been selected as the target block, and the process proceeds to step S65.
- In step S65, the base synthesis unit 64 acquires, from the bases of the texture components in the texture DB 63, the bases of the texture component identified by the identification information of the target block as the bases of the target component, and the process proceeds to step S66.
- In step S66, the base synthesis unit 64 performs the same base synthesis as the base synthesis unit 53, using the low-frequency component of the target block, taken from the low-frequency component of the decoded image supplied by the separation unit 65, together with the bases of the target component, thereby restoring the texture component serving as the match component of the target block.
- The base synthesis unit 64 then supplies the match component of the target block to the synthesis unit 66, and the process proceeds from step S66 to step S67.
- In step S67, the base synthesis unit 64 determines whether all blocks of the identification information map have been selected as the target block.
- If it is determined in step S67 that not all blocks of the identification information map have yet been selected as the target block, the process returns to step S64, and the same processing is repeated.
- If it is determined in step S67 that all blocks of the identification information map have been selected as the target block, the process proceeds to step S68.
- In step S68, the synthesis unit 66 synthesizes the texture component serving as the match component of each block, supplied from the base synthesis unit 64, at the position of the corresponding block in the decoded image (frame) from the decoding unit 62.
- A restored image (frame) that restores the input image (original image) is thereby generated and output as the output image, and the processing of the encoded data and the identification information map for one frame ends.
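The decoding flow of steps S63 to S68 can be sketched in the same style. Again this is a hedged illustration rather than the source's implementation: `decode_frame` and `toy_synthesis` are hypothetical names, the low-pass step is applied per block for brevity (the flowchart filters the whole frame once in step S63) and is optional because the separation unit is described as not essential, and synthesis is assumed to be addition.

```python
def decode_frame(decoded_blocks, id_map, texture_db, base_synthesis, low_pass=None):
    """decoded_blocks: {pos: flat pixel list}. Returns the restored blocks."""
    restored = {}
    for pos, tex_id in id_map.items():                      # S64: next target block
        bases = texture_db[tex_id]                          # S65: bases of the target component
        decoded = decoded_blocks[pos]
        low = low_pass(decoded) if low_pass else decoded    # S63, optional separation
        texture = base_synthesis(low, bases)                # S66: restore the match component
        restored[pos] = [d + t for d, t in zip(decoded, texture)]  # S68: assumed addition
    return restored


def toy_synthesis(low_block, bases):
    """Stand-in for base synthesis: scale the bases by a value derived from the block."""
    return [low_block[0] / 10 * b for b in bases]


texture_db = {"rock": [2, -2]}
decoded_blocks = {(0, 0): [10, 10]}
id_map = {(0, 0): "rock"}
restored = decode_frame(decoded_blocks, id_map, texture_db, toy_synthesis)
```

Because the decoder runs the same base synthesis on the same stored bases as the encoder, the identifier alone is enough to regenerate the texture on the receiving side.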
- FIG. 11 is a block diagram illustrating a fifth configuration example of the codec that restores texture lost in encoding.
- In FIG. 11, the codec is the same as in FIG. 6 in that it includes an encoding device 50 and a decoding device 60.
- Further, in FIG. 11, the encoding device 50 is the same as in FIG. 6 in that it includes the texture DB 51 through the match component determination unit 54, the encoding unit 56, and the transmission unit 57, and the decoding device 60 is the same as in FIG. 6.
- However, in FIG. 11, the encoding device 50 differs from the case of FIG. 6 in that it does not have the removal unit 55.
- Therefore, in FIG. 11, the encoding unit 56 is supplied, as the encoding target image, with the input image itself rather than the input image from which the texture component has been removed, that is, the low-frequency component of the input image.
- The encoding unit 56 then encodes the input image itself as the encoding target image.
- In the encoding unit 56, the encoding target image is encoded by a lossy (irreversible) encoding method such as MPEG, AVC, or HEVC, as described above.
- With the lossy encoding of the encoding unit 56, at least a part of the texture component of the input image itself serving as the encoding target image is lost.
- In the decoding device 60, the synthesis unit 66 synthesizes the decoded image from the decoding unit 62, that is, the input image in which (at least a part of) the texture component is lost, with the texture component serving as the match component from the base synthesis unit 64.
- As a result, a restored image in which the texture component lost by the encoding of the encoding unit 56 is restored is generated as the output image.
- Except that the encoding unit 56 of the encoding device 50 performs lossy encoding with the input image itself, rather than the low-frequency component of the input image, as the encoding target image, the codec of FIG. 11 performs the same processing as the codec of FIG. 6.
- As described above, in the codec of FIG. 11, the decoding device 60 synthesizes the decoded image with the texture component serving as the match component, so that a restored image in which the texture lost by the encoding of the encoding unit 56 is restored is generated as the output image. That is, even if the encoding unit 56 performs high-compression encoding that removes many texture components, the decoding device 60 can restore the removed texture components.
- FIG. 12 is a block diagram illustrating a sixth configuration example of the codec that restores texture lost in encoding.
- In FIG. 12, the codec is the same as in FIG. 6 in that it includes an encoding device 50 and a decoding device 60.
- Further, in FIG. 12, the encoding device 50 is the same as in FIG. 6 in that it includes a texture DB 51, a base synthesis unit 53, a match component determination unit 54, an encoding unit 56, and a transmission unit 57.
- The decoding device 60 is the same as in FIG. 6 in that it includes the receiving unit 61 through the base synthesis unit 64 and the synthesis unit 66.
- However, in FIG. 12, the encoding device 50 differs from the case of FIG. 6 in that it does not have the separation unit 52 and the removal unit 55. Further, in FIG. 12, the encoding device 50 differs from the case of FIG. 6 in that a decoding unit 81 is newly provided.
- In FIG. 12, the decoding device 60 differs from the case of FIG. 6 in that it does not have the separation unit 65.
- Since the encoding device 50 of FIG. 12 does not include the removal unit 55, the encoding unit 56 encodes, as in the case of FIG. 11, the input image itself as the encoding target image rather than the input image from which the texture component has been removed, that is, the low-frequency component of the input image.
- The encoded data obtained by the encoding unit 56 encoding the input image itself as the encoding target image is supplied to the decoding unit 81 as well as to the transmission unit 57.
- The decoding unit 81 decodes the encoded data from the encoding unit 56 in the same manner as the decoding unit 62 and supplies the resulting decoded image to the base synthesis unit 53.
- The decoded image obtained by the decoding unit 81 is an image in which the texture component of the input image is lost, that is, an image corresponding to the low-frequency component of the input image.
- The base synthesis unit 53 performs base synthesis using this decoded image, which corresponds to the low-frequency component of the input image, instead of the low-frequency component of the input image itself.
- Because the base synthesis unit 53 of the encoding device 50 performs base synthesis using the decoded image, the base synthesis unit 64 of the decoding device 60 likewise performs base synthesis using the decoded image obtained by the decoding unit 62.
- The decoding device 60 is therefore configured without the separation unit 65 of FIG. 6 (and FIG. 11).
- FIG. 13 is a flowchart for explaining an example of the encoding process of the encoding device 50 of FIG.
- the encoding device 50 performs the encoding process according to the flowchart of FIG. 13 with the frames of the input image supplied to the encoding device 50 as sequential frames of interest.
- In step S71, the encoding unit 56 encodes the frame of interest of the input image, as the encoding target image, by a lossy encoding method, and supplies the resulting encoded data to the transmission unit 57 and the decoding unit 81.
- The process then proceeds to step S72.
- In step S72, the decoding unit 81 decodes the encoded data of the frame of interest and supplies the resulting decoded image, which corresponds to the low frequency component of the frame of interest of the input image, to the base synthesis unit 53.
- The base synthesis unit 53 divides the decoded image of the frame of interest into blocks for determining the match component. The process then proceeds from step S72 to step S73, where the base synthesis unit 53 selects, as the target block, one block of the decoded image of the frame of interest that has not yet been selected as the target block, and the process proceeds to step S74.
- In step S74, the base synthesis unit 53 selects, as the target component, one texture component that has not yet been selected as the target component from among the plurality of (types of) texture components whose bases are registered in the texture DB 51, and the process proceeds to step S75.
- In step S75, the base synthesis unit 53 performs base synthesis using the target block of the decoded image from the decoding unit 81 and the base of the target component, thereby obtaining a restoration component in which the texture component of the target block (more precisely, of the block of the input image at the same position as the target block of the decoded image) is restored.
- the base synthesis unit 53 supplies the restoration component of the target block of the input image to the match component determination unit 54, and the process proceeds from step S75 to step S76.
- In step S76, the match component determination unit 54 calculates the error of the restoration component of the target block of the input image with respect to the target block of the input image, and the process proceeds to step S77.
- In step S77, the match component determination unit 54 determines whether the error of the restoration component of the target block of the input image is smaller than the minimum error for the target block.
- Here, the minimum error for the target block is the smallest of the errors of the restoration component of the target block of the input image obtained so far, that is, obtained for those texture components, among the texture components whose bases are registered in the texture DB 51, that have already been selected as the target component. A predetermined large value is adopted as the initial value of the minimum error for the target block.
- If it is determined in step S77 that the error of the restoration component of the target block of the input image is not smaller than the minimum error for the target block, the process skips step S78 and proceeds to step S79.
- If it is determined in step S77 that the error of the restoration component of the target block is smaller than the minimum error for the target block, the process proceeds to step S78.
- In step S78, the match component determination unit 54 updates the minimum error for the target block to the error of the restoration component of the target block of the input image, that is, the latest error, and the process proceeds to step S79.
- In step S79, the match component determination unit 54 determines whether the error of the restoration component of the target block of the input image has been acquired for all texture components whose bases are registered in the texture DB 51.
- If it is determined in step S79 that the error of the restoration component of the target block of the input image has not yet been acquired for all texture components whose bases are registered in the texture DB 51, the process returns to step S74.
- That is, if there is a texture component that has not yet been selected as the target component among the texture components whose bases are registered in the texture DB 51, the process returns from step S79 to step S74, and the same processing is repeated.
- If it is determined in step S79 that the error of the restoration component of the target block of the input image has been acquired for all texture components whose bases are registered in the texture DB 51, the process proceeds to step S80.
- That is, when the error of the restoration component of the target block of the input image has been obtained for all the texture components whose bases are registered in the texture DB 51, and the smallest of those errors has been obtained as the minimum error for the target block, the process proceeds from step S79 to step S80.
- In step S80, the match component determination unit 54 determines, as the match component, the texture component for which the minimum error for the target block was obtained, from among the texture components for which restoration components of the target block of the input image were obtained. Further, the match component determination unit 54 acquires the identification information of the match component and supplies it to the transmission unit 57, and the process proceeds from step S80 to step S81.
- In step S81, the base synthesis unit 53 determines whether all blocks of the decoded image of the frame of interest have been selected as the target block.
- If it is determined in step S81 that not all blocks of the decoded image of the frame of interest have been selected as the target block yet, the process returns to step S73, and the same processing is repeated thereafter.
- If it is determined in step S81 that all blocks of the decoded image of the frame of interest have been selected as the target block, the process proceeds to step S82.
- In step S82, the transmission unit 57 generates an identification information map in which each block of the frame of interest of the input image is associated with the identification information of the match component of that block (the identification information supplied from the match component determination unit 54), and the process proceeds to step S83.
- In step S83, the transmission unit 57 transmits the identification information map and the encoded data from the encoding unit 56, and the encoding device 50 ends the processing for the frame of interest of the input image.
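- The block loop of steps S73 to S82 can be sketched as follows. This is a minimal illustration, not the patent's implementation: blocks are assumed to be flat lists of pixel values, the error is assumed to be a sum of squared differences, and `restore` is a hypothetical stand-in for base synthesis that is assumed to return a restored block directly comparable to the input block.

```python
from typing import Callable, Dict, List, Optional, Sequence

Block = List[float]  # assumption: a block flattened to a list of pixel values


def squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared pixel differences (one possible error measure, assumed here)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_identification_map(
    input_blocks: Sequence[Block],
    decoded_blocks: Sequence[Block],
    texture_db: Dict[int, object],
    restore: Callable[[Block, object], Block],
) -> List[Optional[int]]:
    """Return, per block, the id of the texture component with the smallest error."""
    id_map: List[Optional[int]] = []
    for src, dec in zip(input_blocks, decoded_blocks):   # S73 / S81: loop over blocks
        min_error = float("inf")                         # "predetermined large value"
        match_id: Optional[int] = None
        for comp_id, base in texture_db.items():         # S74 / S79: loop over components
            restored = restore(dec, base)                # S75: hypothetical base synthesis
            err = squared_error(restored, src)           # S76: error vs. the input block
            if err < min_error:                          # S77
                min_error, match_id = err, comp_id       # S78: keep the latest minimum
        id_map.append(match_id)                          # S80: match component of the block
    return id_map                                        # S82: identification information map


# Toy usage: each "base" is a scalar offset added to the decoded block.
toy_db = {0: 0.0, 1: 1.0, 2: 2.0}
toy_map = build_identification_map(
    [[10.0, 10.0], [20.0, 20.0]],
    [[8.0, 8.0], [19.0, 19.0]],
    toy_db,
    lambda dec, base: [p + base for p in dec],
)
```

- Using strict `<` in the comparison mirrors step S77, so when two components give the same error the one examined first is kept.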
- FIG. 14 is a flowchart for explaining an example of the decoding process of the decoding device 60 of FIG.
- In step S91, the receiving unit 61 receives the encoded data and the identification information map for one frame transmitted from the encoding device 50. Further, the receiving unit 61 supplies the encoded data to the decoding unit 62 and supplies the identification information map to the base synthesis unit 64, and the process proceeds from step S91 to step S92.
- In step S92, the decoding unit 62 decodes the encoded data from the receiving unit 61 and supplies the resulting decoded image (frame) to the base synthesis unit 64 and the synthesis unit 66, and the process proceeds to step S93.
- In step S93, the base synthesis unit 64 selects, as the target block, one block of the identification information map from the receiving unit 61 that has not yet been selected as the target block, and the process proceeds to step S94.
- In step S94, the base synthesis unit 64 acquires, as the base of the target component, the base of the texture component identified by (represented by) the identification information of the target block from among the texture component bases in the texture DB 63, and the process proceeds to step S95.
- In step S95, the base synthesis unit 64 performs base synthesis similar to that of the base synthesis unit 53, using the target block of the decoded image from the decoding unit 62 (more precisely, the block of the decoded image at the same position as the target block among the blocks of the identification information map) and the base of the target component, thereby restoring the texture component serving as the match component of the target block.
- Then, the base synthesis unit 64 supplies the match component of the target block to the synthesis unit 66, and the process proceeds from step S95 to step S96.
- In step S96, the base synthesis unit 64 determines whether all blocks of the identification information map have been selected as the target block.
- If it is determined in step S96 that not all blocks of the identification information map have been selected as the target block yet, the process returns to step S93, and the same processing is repeated thereafter.
- If it is determined in step S96 that all blocks of the identification information map have been selected as the target block, the process proceeds to step S97.
- In step S97, the synthesis unit 66 synthesizes the texture component serving as the match component of each block, supplied from the base synthesis unit 64, at the position of the corresponding block of the decoded image (frame) from the decoding unit 62.
- As a result, a restored image (frame) in which the input image (original image) is restored is generated and output as an output image, and the processing for one frame of encoded data and identification information map ends.
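- The decoder-side loop of steps S93 to S97 can be sketched in the same style. The names below are hypothetical: `restore_texture` stands in for the base synthesis of the base synthesis unit 64, and the synthesis of step S97 is assumed, for illustration only, to be a per-pixel addition of the restored texture to the decoded block.

```python
from typing import Callable, Dict, List, Sequence

Block = List[float]  # assumption: a block flattened to a list of pixel values


def restore_frame(
    decoded_blocks: Sequence[Block],
    id_map: Sequence[int],
    texture_db: Dict[int, object],
    restore_texture: Callable[[Block, object], Block],
) -> List[Block]:
    """Rebuild each block from its decoded pixels and the texture named in the map."""
    restored_frame: List[Block] = []
    for dec, comp_id in zip(decoded_blocks, id_map):     # S93 / S96: loop over map blocks
        base = texture_db[comp_id]                       # S94: look up the base by id
        texture = restore_texture(dec, base)             # S95: hypothetical base synthesis
        # S97: assumed additive synthesis of the texture into the decoded block
        restored_frame.append([p + t for p, t in zip(dec, texture)])
    return restored_frame


# Toy usage: each "base" is a constant texture level.
toy_frame = restore_frame(
    [[8.0, 8.0], [19.0, 19.0]],
    [2, 1],
    {1: 1.0, 2: 2.0},
    lambda dec, base: [base] * len(dec),
)
```

- Note that the decoder performs no search: the identification information map tells it directly which base to use for each block, so a `KeyError` here would indicate that the texture DB 63 lacks a base the encoder used.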
- FIG. 15 is a block diagram showing a seventh configuration example of the codec for restoring the texture lost in the encoding.
- The texture components (bases) of the texture DBs 51 and 63 can be updated as necessary.
- The texture component update function can be added not only to the codec of the fourth configuration example shown in FIG. 6 but also to other codecs, for example, the codecs of the second configuration example shown in FIG. 2, the third configuration example shown in FIG. 5, the fifth configuration example shown in FIG. 11, and the sixth configuration example shown in FIG. 12.
- FIG. 15 shows a configuration example of a codec that can update the base of the texture component of the texture DB 63.
- the codec is common to the case of FIG. 6 in that it includes an encoding device 50 and a decoding device 60.
- The encoding device 50 has the texture DB 51 through the transmission unit 57, and the decoding device 60 has the reception unit 61 through the synthesis unit 66, in common with the case of FIG. 6.
- the encoding device 50 is different from the case of FIG. 6 in that the data transmission unit 101 is newly provided. Further, in FIG. 15, the decoding device 60 is different from the case of FIG.
- the data transmission unit 101 transmits the texture component base (and identification information) as DB data registered in the texture DB 51 to the update unit 111 in response to a request from the update unit 111 of the decoding device 60.
- The update unit 111 refers, via the data transmission unit 101 of the encoding device 50, to the texture component bases registered as DB data in the texture DB 51, and checks whether, among the texture component bases registered in the texture DB 51, there is a base of a texture component that is not registered in the texture DB 63 of the decoding device 60 (hereinafter also referred to as an unregistered component).
- If there is such a base, the update unit 111 requests and acquires the base of the unregistered component from the data transmission unit 101 of the encoding device 50 and registers it in the texture DB 63.
- In this way, the DB data that is the registered content of the texture DB 63 is updated.
- The bases of unregistered components include not only bases of texture components of a different type but also, for texture components of the same type, new bases obtained by new base learning (so to speak, new versions of the bases).
- FIG. 16 is a flowchart illustrating an example of processing performed by the encoding device 50 and the decoding device 60 to update the DB data of the texture DB 63.
- In step S111, the data transmission unit 101 determines whether there is a request for DB data from the update unit 111 of the decoding device 60.
- If it is determined in step S111 that there is no request for DB data, the process returns to step S111.
- If it is determined in step S111 that there is a request for DB data, the process proceeds to step S112.
- In step S112, the data transmission unit 101 acquires, from the texture DB 51, the texture component base (and identification information) requested as DB data by the update unit 111 and transmits it to the update unit 111. The process then returns from step S112 to step S111, and the same processing is repeated thereafter.
- In step S121, the update unit 111 determines whether it is the update timing of the texture DB 63.
- As the update timing of the texture DB 63, it is possible to adopt, for example, the timing immediately before the decoding unit 62 starts decoding a certain content, the timing immediately before the start of block processing after decoding has started, a timing immediately before the start of processing, a regular or irregular timing, or any other timing.
- If it is determined in step S121 that it is not the update timing of the texture DB 63, the process returns to step S121.
- If it is determined in step S121 that it is the update timing of the texture DB 63, the process proceeds to step S122.
- In step S122, the update unit 111 determines whether there is unregistered DB data, that is, whether any of the bases of the texture components registered in the texture DB 51 is a base of an unregistered component not registered in the texture DB 63.
- If it is determined in step S122 that there is no base of an unregistered component, the process returns to step S121.
- If it is determined in step S122 that there is a base of an unregistered component, the process proceeds to step S123.
- In step S123, the update unit 111 requests the unregistered DB data, that is, the base of the unregistered component, from the data transmission unit 101 of the encoding device 50, and the process proceeds to step S124.
- In step S124, the update unit 111 waits for the base of the unregistered component to be transmitted from the data transmission unit 101 and receives it. Furthermore, the update unit 111 updates the DB data of the texture DB 63 by registering the base of the unregistered component from the data transmission unit 101 in the texture DB 63, and the process returns from step S124 to step S121.
- In this way, the decoding device 60 can obtain the bases of unregistered components from the encoding device 50 and register them in the texture DB 63, so that the DB data of the texture DB 63 of the decoding device 60 includes the DB data of the texture DB 51 of the encoding device 50.
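- The update of the texture DB 63 in steps S122 to S124 amounts to a set difference followed by a copy. The following sketch assumes, purely for illustration, that each DB is a dictionary keyed by a (type, version) pair, so that a new version of an existing type also counts as an unregistered component; the request and response exchange with the data transmission unit 101 is collapsed into a direct dictionary read.

```python
from typing import Dict, List, Tuple

# Assumption: a base is keyed by (texture type, version); the value is the base data.
Key = Tuple[str, int]
TextureDB = Dict[Key, List[float]]


def find_unregistered(encoder_db: TextureDB, decoder_db: TextureDB) -> List[Key]:
    """S122: keys registered in the encoder DB but missing from the decoder DB."""
    return [key for key in encoder_db if key not in decoder_db]


def update_decoder_db(encoder_db: TextureDB, decoder_db: TextureDB) -> List[Key]:
    """S123/S124: fetch every unregistered base and register it in the decoder DB."""
    missing = find_unregistered(encoder_db, decoder_db)
    for key in missing:
        decoder_db[key] = list(encoder_db[key])  # copy, as if received over the link
    return missing


# Toy usage with hypothetical texture types.
db51: TextureDB = {("grass", 1): [0.1], ("grass", 2): [0.2], ("sand", 1): [0.3]}
db63: TextureDB = {("grass", 1): [0.1]}
fetched = update_decoder_db(db51, db63)
```

- After the call, every key of `db51` is present in `db63`, which is the inclusion property stated above; bases that exist only in `db63` are left untouched.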
- FIG. 17 is a block diagram illustrating an eighth configuration example of a codec that restores texture lost in encoding.
- FIG. 17 shows a configuration example of a codec that can update the texture component bases of the texture DBs 51 and 63.
- the codec is common to the case of FIG. 6 in that it includes an encoding device 50 and a decoding device 60.
- The encoding device 50 has the texture DB 51 through the transmission unit 57, and the decoding device 60 has the reception unit 61 through the synthesis unit 66, in common with the case of FIG. 6.
- the encoding apparatus 50 is different from the case of FIG.
- the decoding device 60 is different from the case of FIG. 6 in that it newly includes an updating unit 131.
- the update unit 121 accesses an external server 141 such as a server on the Internet as necessary, and acquires a texture component base (and identification information) from the server 141.
- various types of texture component bases are appropriately uploaded to the server 141, and the update unit 121 acquires the bases of predetermined texture components from the server 141 as necessary.
- the update unit 121 updates the DB data of the texture DB 51 by registering the texture component base acquired from the server 141 in the texture DB 51.
- the basis of the texture acquired from the server 141 can be determined as necessary.
- For example, depending on the quality of the original image serving as the input image encoded by the encoding device 50, specifically, for example, the S/N of the image, its resolution (whether it is an SD (Standard Definition) image or an HD (High Definition) image), its frequency band, and so on, the bases of texture components capable of maintaining that quality (appropriate for maintaining that quality) can be acquired.
- Further, when a base of a new texture component is uploaded to the server 141, for example, a base of an existing type of texture component whose version is new (for example, a base of a texture component having a higher expression effect) or a base of a new type of texture component, the update unit 121 can acquire that base of the new texture component.
- the updating unit 121 can download (acquire) the texture component base from the server 141 and upload the texture component base registered in the texture DB 51 to the server 141 as necessary.
- the update unit 131 refers to the texture component base as the DB data registered in the texture DB 51 via the update unit 121 of the encoding device 50, and among the texture component bases registered in the texture DB 51, It is confirmed whether there is a base of unregistered components that are not registered in the texture DB 63 of the decoding device 60.
- The update unit 131 accesses an external server 142, such as a server on the Internet. Then, the update unit 131 requests and acquires the base of the unregistered component from the server 142 and registers it in the texture DB 63, thereby updating the DB data of the texture DB 63 so that it includes the DB data of the texture DB 51.
- the servers 141 and 142 are synchronized and thus have the same texture component base.
- In FIG. 17, the update unit 121 accesses the server 141 and the update unit 131 accesses the server 142, but the update unit 121 can access either of the servers 141 and 142.
- The same applies to the update unit 131.
- Further, in FIG. 17, the server 141 accessed by the update unit 121 and the server 142 accessed by the update unit 131 are prepared separately, but a single server, such as a server on the cloud, can be employed as both the server 141 accessed by the update unit 121 and the server 142 accessed by the update unit 131.
- Updating the DB data of the texture DB 51 by the update unit 121 of the encoding device 50 is not essential. However, when the DB data of the texture DB 51 is updated, for example, by updating the texture component bases registered in the texture DB 51 for each stream of the original image serving as the input image, the input image can be processed using bases of texture components suitable for that input image even if the capacity of the texture DB 51 is limited to some extent.
- The update of the DB data of the texture DB 51 by the update unit 121 of the encoding device 50 and the update of the DB data of the texture DB 63 by the update unit 131 of the decoding device 60 can, as in the case of the update unit 111, be performed at any timing.
- FIG. 18 is a block diagram illustrating a ninth configuration example of the codec that restores texture lost in encoding.
- FIG. 18 shows a configuration example of a codec that can update the texture component bases of the texture DBs 51 and 63.
- the codec is common to the case of FIG. 15 in that it includes an encoding device 50 and a decoding device 60.
- The encoding device 50 has the texture DB 51 through the transmission unit 57 and the data transmission unit 101, and the decoding device 60 has the reception unit 61 through the synthesis unit 66 and the update unit 111, in common with the case of FIG. 15.
- the encoding device 50 is different from the case of FIG. 15 in that it further includes a registration unit 151.
- the registration unit 151 registers a texture component base suitable for encoding an input image in the texture DB 51.
- The registration unit 151 is supplied with necessary information from other blocks, but the connection lines for supplying information to the registration unit 151 are omitted to keep the diagram from becoming complicated.
- FIG. 19 is a block diagram illustrating a tenth configuration example of the codec that restores texture lost in encoding.
- FIG. 19 shows a configuration example of a codec that can update the bases of the texture components of the texture DBs 51 and 63.
- the codec is common to the case of FIG. 17 in that it includes an encoding device 50 and a decoding device 60.
- The encoding device 50 has the texture DB 51 through the transmission unit 57 and the update unit 121, and the decoding device 60 has the reception unit 61 through the synthesis unit 66, in common with the case of FIG. 17.
- the encoding device 50 is different from the case of FIG. 17 in that it has a new registration unit 151 of FIG.
- FIG. 20 is a block diagram illustrating a configuration example of the registration unit 151 illustrated in FIGS. 18 and 19.
- the registration unit 151 includes a base learning unit 161 and a registration determination unit 162.
- the base learning unit 161 is supplied with the input image and is supplied with the low frequency component of the input image from the separation unit 52.
- The base learning unit 161 performs base learning using the input image and the low-frequency component of the input image as a pair of a high-resolution image and a low-resolution image, respectively, and thereby generates a base of the texture component (for example, the pair of the high-resolution base and the low-resolution base described in FIG. 7).
- the base learning unit 161 temporarily registers the texture component base generated by the base learning in the texture DB 51 together with identification information for identifying the new texture component as the base of the new texture component.
- the base learning in the base learning unit 161 and the temporary registration of the base (and identification information) of the new texture component obtained by the base learning to the texture DB 51 can be performed at an arbitrary timing.
- For example, the base learning in the base learning unit 161 and the temporary registration of the base of the new texture component can be performed for each frame, every time a frame of the input image is supplied to the encoding device 50.
- Alternatively, the base learning in the base learning unit 161 and the temporary registration of the base of the new texture component can be performed, for example, when the error, with respect to the input image, of the match component obtained while the base of the new texture component is not registered in the texture DB 51 exceeds a threshold value.
- The error of the match component with respect to the input image can be calculated in units of frames, for example. Further, as the error of the match component with respect to the input image, the difference between the pixel values of the match component and the input image, the difference between predetermined feature amounts, such as activity, of the match component and the input image, or the like can be employed.
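- The two kinds of error mentioned above, and the threshold test that triggers base learning, can be illustrated as follows. The concrete definitions are assumptions made for this sketch: the pixel-value error is taken to be a mean squared error, and "activity" is taken to be the mean absolute difference between neighboring pixels of a flattened block. The text above does not fix either definition.

```python
from typing import Callable, Sequence


def mean_squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Pixel-value error: mean of squared differences (assumed definition)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def activity(pixels: Sequence[float]) -> float:
    """Feature amount: mean absolute difference of neighboring pixels (assumed)."""
    if len(pixels) < 2:
        return 0.0
    diffs = [abs(q - p) for p, q in zip(pixels, pixels[1:])]
    return sum(diffs) / len(diffs)


def activity_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Feature-amount error: absolute difference of the two activities."""
    return abs(activity(a) - activity(b))


def needs_base_learning(
    match: Sequence[float],
    original: Sequence[float],
    threshold: float,
    error_fn: Callable[[Sequence[float], Sequence[float]], float] = mean_squared_error,
) -> bool:
    """True when the error of the match component exceeds the threshold."""
    return error_fn(match, original) > threshold
```

- A feature-based error such as `activity_error` can be small even when the pixel-wise error is large, which is why the choice between the two affects how often new bases are learned.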
- The registration determination unit 162 is supplied, from the match component determination unit 54, with the error with respect to the input image of the restoration component, that is, the component generated from the base of the texture component by restoring the texture of (the target block of) the input image, for each texture component whose base is registered in the texture DB 51.
- the registration determination unit 162 uses the restoration component error from the match component determination unit 54 and the like to determine whether to register the base of the new texture component.
- When the registration determination unit 162 determines that the base of the new texture component is to be main-registered, it main-registers, in the texture DB 51, the base of the new texture component temporarily registered in the texture DB 51.
- Using the errors of the restoration components from the match component determination unit 54, the registration determination unit 162 recognizes whether the error of the restoration component generated from the base of the new texture component is the smallest among the errors of the restoration components generated from the bases of the texture components registered in the texture DB 51.
- When the error of the restoration component generated from the base of the new texture component is the minimum error, that is, when the new texture component is the match component, the registration determination unit 162 determines whether a main registration condition, which is a predetermined condition for performing main registration, is satisfied. When the main registration condition is satisfied, the registration determination unit 162 determines that the base of the new texture component is to be main-registered, and main-registers the base of the new texture component.
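- As the main registration condition, it is possible to employ, for example, the condition that the S/N, with respect to the input image, of the match component obtained when the base of the new texture component is registered in the texture DB 51 is superior by a certain value or more to the S/N of the match component obtained when it is not registered.
- As the main registration condition, it is also possible to employ the condition that the RD (Rate-Distortion) curve obtained when the base of the new texture component is registered in the texture DB 51 is superior by a certain value or more to the RD curve obtained when the base of the new texture component is not registered in the texture DB 51.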
- Alternatively, regardless of the main registration condition, the base of the new texture component can be main-registered if the error of the restoration component generated from the base of the new texture component is the smallest among the errors of the restoration components generated from the bases of the texture components registered in the texture DB 51.
- Conversely, when the main registration condition is satisfied, the registration determination unit 162 can main-register the base of the new texture component even if the error of the restoration component generated from the base of the new texture component is not the smallest among the errors of the restoration components generated from the bases of the texture components registered in the texture DB 51.
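- The three decision variants described above (minimum error and condition, minimum error only, condition only) can be written as one small predicate. The sketch below is illustrative only: the names `Policy`, `snr_condition`, and `should_main_register` are hypothetical, and the main registration condition is modeled as a simple S/N gain of at least a fixed margin, which is one reading of the example condition rather than a definitive rule.

```python
from enum import Enum
from typing import Sequence


class Policy(Enum):
    MATCH_AND_CONDITION = "match_and_condition"  # minimum error and condition both required
    MATCH_ONLY = "match_only"                    # minimum error suffices
    CONDITION_ONLY = "condition_only"            # condition suffices


def snr_condition(snr_with_new: float, snr_without_new: float, margin: float) -> bool:
    """Assumed main registration condition: S/N improves by at least `margin`."""
    return snr_with_new - snr_without_new >= margin


def should_main_register(
    new_error: float,
    registered_errors: Sequence[float],
    condition_met: bool,
    policy: Policy = Policy.MATCH_AND_CONDITION,
) -> bool:
    """Decide whether the temporarily registered base should be main-registered."""
    # The new component is the match component when its error is the smallest.
    is_match = all(new_error < err for err in registered_errors)
    if policy is Policy.MATCH_ONLY:
        return is_match
    if policy is Policy.CONDITION_ONLY:
        return condition_met
    return is_match and condition_met
```

- For example, `should_main_register(1.0, [2.0, 3.0], snr_condition(32.0, 30.0, 1.5))` evaluates the default variant, in which both the minimum-error test and the condition must hold.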
- As methods of main registration of the base of the new texture component, there are, for example, a method of adding the base of the new texture component to the texture DB 51 and a method of overwriting (changing) an existing base with it.
- In the adding method, the base of the new texture component is registered in the form of being added to the bases of the texture components already registered in the texture DB 51.
- In the overwriting method, the base of the new texture component is registered in the form of overwriting the base of one of the texture components registered in the texture DB 51.
- In this case, the base of the new texture component can overwrite, for example, the base of a texture component that has never been determined to be the match component among the texture component bases registered in the texture DB 51, or the base of the texture component whose time of being determined to be the match component is furthest in the past.
- the data amount of the texture component base registered in the texture DB 51 differs depending on whether the new texture component base is added or overwritten.
- The RD curve differs when the data amount differs. Therefore, when the main registration condition is that the RD curve obtained when the base of the new texture component is registered in the texture DB 51 is better by a certain value or more than the RD curve obtained when it is not registered, the RD curve can be obtained for each of the case of adding the base of the new texture component and the case of overwriting with it, and whether to add or overwrite can be determined according to those RD curves.
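- The adding and overwriting methods, together with the choice of which base to overwrite, can be sketched as follows. This is a simplified illustration under stated assumptions: the choice between adding and overwriting is driven here by a hypothetical `capacity` limit rather than by comparing RD curves, and each entry records a hypothetical `last_matched` frame index (`None` if the base has never been the match component).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entry:
    base: List[float]
    last_matched: Optional[int] = None  # frame index of the latest match; None = never


def choose_overwrite_target(db: Dict[str, Entry]) -> str:
    """Prefer a base never chosen as match component, else the least recently matched."""
    never_matched = [key for key, entry in db.items() if entry.last_matched is None]
    if never_matched:
        return never_matched[0]
    return min(db, key=lambda key: db[key].last_matched)


def main_register(
    db: Dict[str, Entry], new_id: str, new_base: List[float], capacity: int
) -> Optional[str]:
    """Add the new base if there is room, otherwise overwrite; return the evicted id."""
    if len(db) < capacity:
        db[new_id] = Entry(new_base)       # adding method
        return None
    victim = choose_overwrite_target(db)   # overwriting method
    del db[victim]
    db[new_id] = Entry(new_base)
    return victim


# Toy usage with hypothetical ids.
toy_db = {"a": Entry([1.0], 5), "b": Entry([2.0], 2)}
first = main_register(toy_db, "c", [3.0], capacity=3)    # room left: added
second = main_register(toy_db, "d", [4.0], capacity=3)   # "c" was never matched
toy_db["d"].last_matched = 9
third = main_register(toy_db, "e", [5.0], capacity=3)    # "b" matched longest ago
```

- A fixed capacity makes the trade-off explicit: adding grows the data amount of the DB, while overwriting keeps it constant at the cost of discarding a base that may be needed again later.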
- FIG. 21 is a flowchart illustrating an example of the encoding process of the encoding device 50 in FIGS. 18 and 19.
- Here, it is assumed that the base learning in the base learning unit 161 and the temporary registration of the base of the new texture component are performed for each frame, every time a frame of the input image is supplied to the encoding device 50.
- the encoding device 50 performs the encoding process according to the flowchart of FIG. 21 with the frames of the input image supplied to the encoding device 50 as sequential frames of interest.
- In step S151, the separation unit 52 separates the low frequency component from the frame of interest of the input image and supplies the low frequency component to the base synthesis unit 53 and the registration unit 151.
- Further, the base learning unit 161 of the registration unit 151 divides the frame of interest of the input image into learning blocks, which are the units for performing base learning, and the process proceeds from step S151 to step S152.
- In step S152, the base learning unit 161 selects, as the attention learning block, one learning block of the frame of interest of the input image that has not yet been selected as the attention learning block, and the process proceeds to step S153.
- In step S153, the base learning unit 161 performs base learning that converts the texture of the attention learning block into a base, and the process proceeds to step S154.
- That is, the base learning unit 161 performs base learning using the attention learning block and the block at the same position as the attention learning block in the low frequency component of the input image from the separation unit 52 as a pair of a high-resolution image and a low-resolution image, respectively, and generates a texture component base (for example, the pair of the high-resolution base and the low-resolution base described in FIG. 7).
- In step S154, the base learning unit 161 treats the texture component base of the attention learning block obtained by the base learning as the base of a new texture component, and temporarily registers the base (and identification information) of the new texture component in the texture DB 51.
- The base synthesis unit 53 divides the frame of interest of the input image into blocks for determining the match component. The process then proceeds from step S154 to step S155, where the base synthesis unit 53 selects, as the target block, one block of the frame of interest of the input image that has not yet been selected as the target block, and the process proceeds to step S156.
- In step S156, the base synthesis unit 53 selects, as the target component, one texture component that has not yet been selected as the target component from among the plurality of (types of) texture components whose bases are registered (including temporarily registered) in the texture DB 51, and the process proceeds to step S157.
- In step S157, the base synthesis unit 53 performs base synthesis using the low-frequency component of the target block, taken from the low-frequency component of the input image from the separation unit 52, and the base of the target component, thereby obtaining a restoration component in which the texture component of the target block is restored.
- the base synthesis unit 53 supplies the restoration component of the block of interest to the match component determination unit 54, and the process proceeds from step S157 to step S158.
- step S158 the match component determination unit 54 calculates an error of the restoration component of the target block with respect to the target block of the input image, and the process proceeds to step S159.
- step S159 the match component determination unit 54 determines whether the error of the restoration component of the block of interest is smaller than the minimum error of the block of interest.
- the minimum error for the target block is the texture component whose base has been registered in the texture DB 51 with respect to the texture component that has been selected as the target component so far.
- a predetermined large value is employed as the initial value of the minimum error for the target block.
- step S159 If it is determined in step S159 that the error of the restoration component of the target block is not smaller than the minimum error for the target block, the process skips step S160 and proceeds to step S161.
- step S159 If it is determined in step S159 that the error of the restoration component of the target block is smaller than the minimum error for the target block, the process proceeds to step S160.
- step S160 the match component determination unit 54 updates the minimum error for the target block to the error of the recovery component of the target block, that is, the latest error, and the process proceeds to step S161.
- step S161 the match component determination unit 54 determines whether or not the error of the restoration component of the target block has been acquired for all texture components whose bases are registered in the texture DB 51.
- step S161 If it is determined in step S161 that the error of the restoration component of the target block has not yet been acquired for all texture components whose bases are registered in the texture DB 51, the process returns to step S156.
- step S161 if there is a texture component that is not selected as the target component among the texture components whose bases are registered in the texture DB 51, the process returns from step S161 to step S156. Repeated.
- step S161 If it is determined in step S161 that the error of the restoration component of the target block has been acquired for all texture components whose bases are registered in the texture DB 51, the process proceeds to step S162.
- step S161 when the error of the restoration component of the target block is obtained for all the texture components whose bases are registered in the texture DB 51, the process proceeds from step S161 to step S162.
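The minimum-error search of steps S156 to S161 can be sketched as follows. This is only an illustrative sketch: the error measure (a sum of squared differences), the function names, and the dictionary keyed by identification information are assumptions, and remembering the best identifier inside the loop is a convenience that the text defers to step S166.

```python
def squared_error(block, restored):
    """Sum of squared differences between two equally sized pixel lists (assumed error measure)."""
    return sum((a - b) ** 2 for a, b in zip(block, restored))


def find_match_component(block, restorations):
    """Return (identification info, minimum error) over all restoration components.

    `restorations` is a hypothetical mapping from identification information
    to the restoration component that base synthesis produced for this block.
    """
    min_error = float("inf")  # the "predetermined large value" used as the initial minimum
    best_id = None
    for ident, restored in restorations.items():  # loop back to S156 until S161 is satisfied
        error = squared_error(block, restored)    # S158: error against the input block
        if error < min_error:                     # S159: compare with the current minimum
            min_error = error                     # S160: update the minimum error
            best_id = ident
    return best_id, min_error


block = [10, 12, 9, 11]
restorations = {0: [0, 0, 0, 0], 1: [10, 11, 9, 12], 2: [8, 12, 9, 15]}
best_id, min_error = find_match_component(block, restorations)
print(best_id, min_error)
```

Starting from an infinite minimum guarantees that the first candidate examined always updates it, which is the role the predetermined large initial value plays in the flow.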
- In step S162, the registration determination unit 162 of the registration unit 151 (FIG. 20) determines whether to perform main registration of the base of the new texture component.
- If it is determined in step S162 that the base of the new texture component is not to be main-registered, the process skips step S163 and proceeds to step S164.
- That is, if the error of the restoration component obtained from the base of the new texture component is not the minimum error, or if it is the minimum error but the main registration condition is not satisfied, it is determined that the base of the new texture component is not to be main-registered, and main registration of that base is not performed.
- If it is determined in step S162 that the base of the new texture component is to be main-registered, the process proceeds to step S163.
- In step S163, the registration determination unit 162 performs main registration, in the texture DB 51, of the base of the new texture component that has been temporarily registered in the texture DB 51, and the process proceeds to step S164.
- That is, if, among the errors of the restoration components of the target block obtained by the match component determination unit 54, the error of the restoration component obtained from the base of the new texture component is the minimum error and the main registration condition is satisfied, it is determined that the base of the new texture component is to be main-registered, and main registration of that base is performed.
- In step S164, the base synthesis unit 53 determines whether all the blocks of the target frame of the input image have been selected as the target block.
- If it is determined in step S164 that not all the blocks of the target frame of the input image have been selected as the target block yet, the process returns to step S155, and the same processing is repeated thereafter.
- If it is determined in step S164 that all the blocks of the target frame of the input image have been selected as the target block, the process proceeds to step S165.
- In step S165, the base learning unit 161 of the registration unit 151 (FIG. 20) determines whether all the learning blocks of the target frame of the input image have been selected as the target learning block.
- If it is determined in step S165 that not all the learning blocks of the target frame of the input image have been selected as the target learning block yet, the process returns to step S152, and the same processing is repeated thereafter.
- If it is determined in step S165 that all the learning blocks of the target frame of the input image have been selected as the target learning block, the process proceeds to step S166.
- In step S166, the match component determination unit 54 determines, for each block of the target frame of the input image, the texture component for which the minimum error was obtained as the restoration component of the block to be the match component of that block. Further, the match component determination unit 54 acquires the identification information of the match component for each block of the target frame of the input image, and the process proceeds from step S166 to step S167.
- In step S167, the transmission unit 57 generates an identification information map in which each block of the target frame of the input image is associated with the identification information of the match component of that block (the identification information from the match component determination unit 54), and the process proceeds to step S168.
- The registration determination unit 162 of the registration unit 151 deletes from the texture DB 51 the bases (and identification information) of texture components that were temporarily registered but not main-registered in the texture DB 51.
- In step S168, the removal unit 55 removes, from each block of the target frame of the input image, the match component of that block supplied from the base synthesis unit 53, thereby generating, as the encoding target image, the low-frequency component of (the target frame of) the input image, that is, the difference between the input image and the texture component as the match component. The removal unit 55 supplies the encoding target image to the encoding unit 56, and the process proceeds to step S169.
- In step S169, the encoding unit 56 encodes the encoding target image from the removal unit 55 using the lossy encoding method and supplies the resulting encoded data to the transmission unit 57, and the process proceeds to step S170.
- In step S170, the transmission unit 57 transmits the identification information map and the encoded data from the encoding unit 56, and the encoding device 50 ends the process for the target frame of the input image.
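A minimal sketch of steps S167 and S168, building the identification information map and the encoding target image, is given below. The function name, the flat per-block pixel lists, and the use of a block index as the map key are assumptions made for illustration; the lossy encoder of step S169 is not modelled.

```python
def build_map_and_residual(blocks, match_ids, match_components):
    """Return the identification information map and the encoding target blocks.

    blocks:           pixel lists, one per block of the target frame
    match_ids:        identification info of the match component chosen for each block
    match_components: the matching texture component for each block (same shape as blocks)
    """
    id_map = {}
    residual_blocks = []
    for index, (block, ident, texture) in enumerate(
        zip(blocks, match_ids, match_components)
    ):
        id_map[index] = ident  # S167: associate the block with its identification info
        # S168: remove the match component, leaving the input minus the texture component
        residual_blocks.append([p - t for p, t in zip(block, texture)])
    return id_map, residual_blocks


blocks = [[100, 104], [50, 47]]
match_ids = [3, 7]
match_components = [[2, -2], [-1, 1]]
id_map, residual_blocks = build_map_and_residual(blocks, match_ids, match_components)
print(id_map, residual_blocks)
```

Only `id_map` and the encoded form of `residual_blocks` would need to be transmitted in this sketch, since the decoder is assumed to hold the same texture components.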
- The information amount of the identification information transmitted from the encoding device 50 (or 30) to the decoding device 60 (or 40) can be controlled to either a fixed value or a variable value. In either case, the larger the information amount of the identification information, the lower the transmission efficiency, but the better the image quality of the decoded image (output image) can be.
- Hereinafter, both the block of interest in the input image for which the match component is determined and the block of interest in the decoded image with which the match component is synthesized are referred to as target blocks.
- a block (area) of texture components generated by base synthesis is also referred to as a texture block.
- As the information amount of the identification information, for example, the data amount of the identification information transmitted for one frame (picture) of the input image is adopted.
- In this case, the information amount of the identification information is expressed as (the number of bits of one piece of identification information) × (the number of target blocks constituting one frame of the input image).
- The target blocks that make up one frame of the input image need not all have the same size, but here, for simplicity, all target blocks constituting one frame of the input image are assumed to have the same size.
- For example, suppose that one frame of the input image is a 1920 × 1080 pixel HD (High Definition) image and that the target block is a 192 × 108 pixel block obtained by dividing the frame into 10 equal parts both horizontally and vertically.
- In this case, one frame is composed of 100 target blocks.
- The number of pieces of identification information for one frame is therefore 100, the same as the number of target blocks.
- Likewise, when one frame is composed of 1000 target blocks, the number of pieces of identification information for one frame is 1000, the same as the number of target blocks.
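The arithmetic of this example can be checked with a short sketch. The function name and the 8-bit width of one piece of identification information are assumptions chosen for illustration, and the frame is assumed to divide evenly into blocks.

```python
def identification_info_amount(frame_w, frame_h, block_w, block_h, bits_per_id):
    """Return (number of target blocks, total identification bits per frame).

    Assumes the frame divides evenly into equally sized target blocks.
    """
    num_blocks = (frame_w // block_w) * (frame_h // block_h)
    return num_blocks, num_blocks * bits_per_id


# 1920 x 1080 HD frame, 192 x 108 target blocks, hypothetical 8-bit identification info
num_blocks, total_bits = identification_info_amount(1920, 1080, 192, 108, 8)
print(num_blocks, total_bits)
```

Halving the block size in both directions would quadruple the number of target blocks and hence the identification information amount, which is the trade-off against transmission efficiency described above.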
- In the above description, the target block and the texture block generated by base synthesis have the same size, but they need not be the same size. That is, a block whose size is equal to or smaller than the size of the texture block can be employed as the target block.
- a partial texture such as the central portion of the texture block can be adopted as the texture of the target block.
- the target block is limited to a size equal to or smaller than the size of the texture block. Therefore, the input image cannot be divided into target blocks having a size exceeding the texture block, and the number of target blocks when the input image is divided into target blocks is limited by the size of the texture block.
- In the codec, when the information amount of the identification information is controlled to a fixed value, the number of bits of one piece of identification information and the number of target blocks in one frame are each controlled to a fixed value.
- In the relation (information amount of the identification information) = (number of bits of one piece of identification information) × (number of target blocks in one frame), each of the two factors can also be controlled to a variable value.
- When 2^N texture components (bases) are stored in the texture DB 51, those 2^N texture components can be identified by N-bit identification information.
- By associating the 2^(N-1) pieces of identification information that can be represented with fewer than N bits, for example N-1 bits, with 2^(N-1) of the 2^N texture components stored in the texture DB 51, the number of bits of the identification information can be controlled to N bits or N-1 bits.
- The number of target blocks in one frame can be controlled within the range in which the target block is no larger than the texture block.
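The relationship between the number of stored texture components and the bit width of the identification information can be sketched as below. The function names are hypothetical, and keeping the first 2^(N-1) components is only one possible selection rule; the text does not say which components remain associated.

```python
def id_bits(num_components):
    """Number of bits needed to distinguish `num_components` texture components."""
    return (num_components - 1).bit_length()


def restrict_components(component_ids, bits):
    """Keep only as many components as `bits`-bit identification info can address.

    Keeping the first 2**bits entries is an assumed selection rule.
    """
    return component_ids[: 2 ** bits]


n = 4
all_ids = list(range(2 ** n))                 # 2^N texture components in the texture DB
reduced = restrict_components(all_ids, n - 1)  # the 2^(N-1) components that stay associated
print(id_bits(len(all_ids)), id_bits(len(reduced)))
```

Dropping from N to N-1 bits saves one bit per target block at the cost of halving the set of texture components that can be signalled.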
- In the codec, when the information amount of the identification information is controlled to a variable value, one or both of the number of bits of one piece of identification information and the number of target blocks in one frame are adaptively controlled to variable values.
- When the information amount of the identification information is controlled to a variable value, it can be adaptively controlled according to, for example, the image quality (bit rate) of the input image, the size (number of pixels) of one frame, and the genre (for example, sports, animation, or movies).
- Additional information that can be used to generate a texture component contributing to improved image quality of the decoded image can be generated by the encoding device 50 (or 30) and transmitted to the decoding device 60 (or 40).
- the additional information includes, for example, gain information that determines the amplitude of the texture component as a match component, parameters that can be used to generate the texture component, and image feature quantities of the input image.
- the gain information as additional information is used for controlling the gain of the texture component obtained by the base synthesis. That is, in the base synthesis, a texture component obtained by normalizing the amplitude of the texture component is obtained as necessary.
- the gain information can be used to determine the amplitude of the texture component whose amplitude is normalized.
- The parameters as additional information include, for example, band information of the texture of the input image (such as the result of an FFT (Fast Fourier Transform) of the texture of the input image).
- The parameters may also include a threshold, that is, the degree to which the texture component serving as the match component that best matches the input image (target block) matches the input image.
- the encoding device 50 (or 30) can transmit band information as additional information to the decoding device 60 (or 40).
- the decoding device 60 filters the texture component as the match component so as to be a band similar to the band represented by the band information as the additional information, and synthesizes the filtered texture component with the decoded image.
- the image quality of the decoded image can be improved.
- The image feature quantities as additional information include, for example, the DR (dynamic range), the variance, and the adjacent pixel difference of the input image.
- The encoding device 50 can transmit such an image feature quantity to the decoding device 60 as additional information.
- The decoding device 60 processes the texture component as the match component so that it becomes a texture component having an image feature quantity similar to the one given as additional information, and synthesizes the processed texture component with the decoded image, whereby the image quality of the decoded image can be improved.
- The number of pieces of additional information transmitted from the encoding device 50 to the decoding device 60 can be increased or decreased according to conditions such as the processing performance of the codec (or of the device in which the codec is mounted) and the operating cost (power consumption, heat generation, and so on).
- a parameter or image feature amount as additional information can be additionally transmitted to improve the image quality of the decoded image.
- gain information as additional information can be transmitted from the encoding device 50 to the decoding device 60 in units of one pixel.
- the decoding device 60 can use the gain information for each pixel as it is for controlling the amplitude of the texture component.
- gain information as additional information is transmitted from the encoding device 50 to the decoding device 60 in units of a plurality of pixels, thereby reducing the amount of additional information transmitted.
- gain information in units of a plurality of pixels can be interpolated so as to be gain information in units of one pixel, and used for controlling the amplitude of the texture component.
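The interpolation of coarse gain information to per-pixel gain can be sketched in one dimension as follows. Linear interpolation, the function names, and the sample spacing `step` are assumptions; the text does not specify the interpolation method.

```python
def interpolate_gain(coarse_gain, step):
    """Linearly interpolate gain values given every `step` pixels to per-pixel gain."""
    fine = []
    for i in range(len(coarse_gain) - 1):
        g0, g1 = coarse_gain[i], coarse_gain[i + 1]
        for k in range(step):
            fine.append(g0 + (g1 - g0) * k / step)
    fine.append(coarse_gain[-1])  # the last transmitted sample closes the row
    return fine


def apply_gain(normalized_texture, gain):
    """Scale an amplitude-normalized texture component by per-pixel gain."""
    return [t * g for t, g in zip(normalized_texture, gain)]


coarse = [1.0, 2.0, 0.0]              # gain transmitted once every 2 pixels
fine = interpolate_gain(coarse, 2)    # per-pixel gain for 5 pixels
texture = [1, -1, 1, -1, 1]           # amplitude-normalized texture component
scaled = apply_gain(texture, fine)
print(fine, scaled)
```

Here three transmitted gain values cover five pixels, so the additional information is reduced while the decoder still controls the amplitude pixel by pixel.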
- In a conventional lossy encoding method, the texture of an image is subjected to DCT (Discrete Cosine Transform), then quantized and transmitted.
- In the encoding device 50, by contrast, the texture is transmitted by means of the identification information; in this respect the encoding device 50 differs from the conventional lossy encoding method.
- In class classification adaptive processing, a first image is converted into a second image.
- Pixels serving as prediction taps, which are used in the prediction calculation for obtaining the pixel value of the corresponding pixel of the second image that corresponds to the target pixel of the first image, are extracted from the first image.
- The target pixel is classified into one of a plurality of classes according to a certain rule.
- Tap coefficients used in the prediction calculation are obtained for each of the plurality of classes by learning that minimizes the sum of squared errors, as the statistical error, between the result of the prediction calculation using a student image corresponding to the first image and a teacher image corresponding to the second image.
- From these tap coefficients, the tap coefficient of the class of the target pixel is acquired, and the pixel value of the corresponding pixel is obtained by performing the prediction calculation using the tap coefficient of the class of the target pixel and the prediction tap of the target pixel.
- That is, in the tap coefficient learning of class classification adaptive processing, the tap coefficients are obtained using, as the criterion for determining them, minimization of the sum of squared errors between the result of the prediction calculation using the student image corresponding to the first image and the teacher image corresponding to the second image.
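The prediction calculation and its learning criterion can be written out as follows. This is a sketch under an assumption: the text only says "prediction calculation", and a linear first-order prediction is assumed here. The symbols (N prediction taps x_n, tap coefficients w_n^{(c)} of class c, teacher pixels y_k, and K training samples) are introduced for illustration.

```latex
% Prediction for one target pixel of class c, using N prediction taps x_1..x_N
\hat{y} = \sum_{n=1}^{N} w^{(c)}_{n} \, x_{n}

% Learning criterion for class c over training samples k = 1..K
% (x_{n,k}: taps from the student image, y_k: pixel of the teacher image)
E^{(c)} = \sum_{k=1}^{K} \Bigl( y_{k} - \sum_{n=1}^{N} w^{(c)}_{n} \, x_{n,k} \Bigr)^{2}
\;\rightarrow\; \min

% Setting \partial E^{(c)} / \partial w^{(c)}_{n} = 0 gives the normal equations
\sum_{m=1}^{N} \Bigl( \sum_{k=1}^{K} x_{n,k} \, x_{m,k} \Bigr) w^{(c)}_{m}
= \sum_{k=1}^{K} x_{n,k} \, y_{k}, \qquad n = 1, \dots, N
```

Solving the normal equations separately for each class yields one set of tap coefficients per class, which is what the conversion step then looks up by the class of the target pixel.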
- As the match component determination criterion for determining, from the texture components stored in the texture DB 51 (or 31), the match component that matches the input image, the encoding device 50 can also employ, as in class classification adaptive processing, minimization of the sum of squared errors between the input image and the texture component (hereinafter also referred to as the minimum square error criterion).
- The encoding device 50 can also employ a criterion other than the minimum square error criterion as the match component determination criterion, and can thereby control the image quality (mainly texture) and subjective performance of the decoded image.
- As a match component determination criterion other than the minimum square error criterion, for example, SSIM (Structural Similarity), an index close to subjective (qualitative) evaluation, can be used.
- The present technology differs from class classification adaptive processing in that it can adopt a criterion other than the minimum square error criterion as the match component determination criterion and thereby control the image quality and subjective performance of the decoded image.
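The effect of swapping the match component determination criterion can be illustrated with the sketch below. It is a simplification under stated assumptions: `global_ssim` evaluates the SSIM formula once over the whole block (practical SSIM implementations average it over local windows), the constants follow the common choice of 0.01 and 0.03 times the dynamic range, and the function and candidate names are hypothetical.

```python
def sse(x, y):
    """Sum of squared errors (minimum square error criterion)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def global_ssim(x, y, dynamic_range=255):
    """SSIM formula evaluated once over the whole block (simplified, single window)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )


def select_match(block, candidates, criterion="sse"):
    """Pick the candidate key that is best under the chosen criterion."""
    if criterion == "sse":
        return min(candidates, key=lambda k: sse(block, candidates[k]))
    return max(candidates, key=lambda k: global_ssim(block, candidates[k]))


block = [0, 100, 0, 100]
candidates = {"flat": [50, 50, 50, 50], "textured": [60, 180, 60, 180]}
by_sse = select_match(block, candidates, "sse")
by_ssim = select_match(block, candidates, "ssim")
print(by_sse, by_ssim)
```

In this toy case the flat candidate has the smaller squared error, while the structural index prefers the candidate that preserves the alternating pattern, which is the kind of difference in subjective performance that a non-squared-error criterion is meant to control.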
- the subjective performance means, for example, the performance of image characteristics such as fineness, sharpness, resolution, and contrast that affect the evaluator's qualitative image quality evaluation and impression.
- A match component determination criterion for obtaining a decoded image with desired subjective performance or image quality (for example, a decoded image with a sense of fineness or sharpness) can be created (designed) by a codec adjuster, a user, or the like by specifying input images with a certain subjective performance or image quality and repeating qualitative evaluation experiments on input images of that subjective performance and image quality.
- For this purpose, the subjective performance and image quality of the input images used for creating the criterion need to be known.
- FIG. 22 is a diagram illustrating an example of a multi-view image encoding method.
- the multi-viewpoint image includes images of a plurality of viewpoints (views).
- The multiple views of this multi-view image consist of a base view, which is encoded and decoded using only the image of its own view without using information of other views, and non-base views, which are encoded and decoded using information of other views.
- Non-base view encoding / decoding may use base view information or other non-base view information.
- When a multi-view image is encoded, it is encoded viewpoint by viewpoint.
- When the resulting encoded data is decoded, the encoded data of each viewpoint is decoded (that is, viewpoint by viewpoint).
- the method described in the above embodiment may be applied to such encoding / decoding of each viewpoint. By doing so, transmission efficiency and image quality can be improved. In other words, transmission efficiency and image quality can be improved in the case of multi-viewpoint images as well.
- FIG. 23 is a diagram illustrating a multi-view image encoding apparatus of the multi-view image encoding / decoding system that performs the multi-view image encoding / decoding described above.
- the multi-view image encoding apparatus 1000 includes an encoding unit 1001, an encoding unit 1002, and a multiplexing unit 1003.
- the encoding unit 1001 encodes the base view image and generates a base view image encoded stream.
- the encoding unit 1002 encodes the non-base view image and generates a non-base view image encoded stream.
- The multiplexing unit 1003 multiplexes the base view image encoded stream generated by the encoding unit 1001 and the non-base view image encoded stream generated by the encoding unit 1002 to generate a multi-view image encoded stream.
- FIG. 24 is a diagram illustrating a multi-view image decoding apparatus that performs the above-described multi-view image decoding.
- the multi-view image decoding apparatus 1010 includes a demultiplexing unit 1011, a decoding unit 1012, and a decoding unit 1013.
- The demultiplexing unit 1011 demultiplexes the multi-view image encoded stream, in which the base view image encoded stream and the non-base view image encoded stream are multiplexed, and extracts the base view image encoded stream and the non-base view image encoded stream.
- the decoding unit 1012 decodes the base view image encoded stream extracted by the demultiplexing unit 1011 to obtain a base view image.
- the decoding unit 1013 decodes the non-base view image encoded stream extracted by the demultiplexing unit 1011 to obtain a non-base view image.
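The multiplexing and demultiplexing of the two encoded streams can be sketched as below. The container layout (each stream prefixed with a 4-byte big-endian length) and the function names are purely hypothetical; the text does not define the multiplexed stream format.

```python
import struct


def multiplex(base_stream: bytes, non_base_stream: bytes) -> bytes:
    """Concatenate the streams, each prefixed with a 4-byte big-endian length (assumed layout)."""
    return b"".join(
        struct.pack(">I", len(s)) + s for s in (base_stream, non_base_stream)
    )


def demultiplex(muxed: bytes):
    """Split a multiplexed stream back into its length-prefixed component streams."""
    streams = []
    offset = 0
    while offset < len(muxed):
        (length,) = struct.unpack_from(">I", muxed, offset)
        offset += 4
        streams.append(muxed[offset:offset + length])
        offset += length
    return streams


base = b"base-view"
non_base = b"non-base"
muxed = multiplex(base, non_base)
extracted = demultiplex(muxed)
print(len(muxed), extracted)
```

In this sketch the two extracted streams would then be handed to separate decoders, mirroring the roles of the decoding unit 1012 and the decoding unit 1013.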
- The encoding device 10 described in the above embodiment may be applied as the encoding unit 1001 and the encoding unit 1002 of the multi-view image encoding device 1000. By doing so, the method described in the above embodiment can be applied to the encoding of multi-viewpoint images. That is, transmission efficiency and image quality can be improved.
- the decoding device 20 described in the above embodiment may be applied as the decoding unit 1012 and the decoding unit 1013 of the multi-viewpoint image decoding device 1010. By doing so, the methods described in the above embodiments can be applied to decoding of encoded data of multi-viewpoint images. That is, transmission efficiency and image quality can be improved.
- FIG. 25 is a diagram illustrating an example of a hierarchical image encoding method.
- Hierarchical image coding is a method in which image data is divided into a plurality of layers (hierarchization) so as to have a scalability function with respect to a predetermined parameter, and is encoded for each layer.
- Hierarchical image decoding is decoding corresponding to the hierarchical image encoding.
- the hierarchized image includes images of a plurality of hierarchies (layers) having different predetermined parameter values.
- The plurality of layers of this hierarchical image consist of a base layer, which is encoded and decoded using only the image of its own layer without using images of other layers, and non-base layers (also called enhancement layers), which are encoded and decoded using images of other layers.
- A non-base layer may use an image of the base layer or an image of another non-base layer.
- The non-base layer is composed of difference image data (difference data) between its own image and an image of another layer so that redundancy is reduced.
- For example, when one image is hierarchized into a base layer and a non-base layer, an image with lower quality than the original image can be obtained using only the base layer data, and the original image (that is, a high-quality image) can be obtained by combining the base layer data with the non-base layer data.
- By hierarchizing an image in this way, for example, image compression information of only the base layer can be transmitted so that a moving image with low spatiotemporal resolution or poor image quality is reproduced, or image compression information of the enhancement layer can be transmitted in addition to that of the base layer so that a moving image with high image quality is reproduced. In this way, image compression information corresponding to the capabilities of the terminal and the network can be transmitted from the server without performing transcoding processing.
- When a hierarchical image is encoded, it is encoded layer by layer.
- When the resulting encoded data is decoded, the encoded data of each layer is decoded (that is, layer by layer).
- the method described in the above embodiment may be applied to such encoding / decoding of each layer. By doing so, transmission efficiency and image quality can be improved. That is, in the case of hierarchical images, similarly, transmission efficiency and image quality can be improved.
- parameters having a scalability function are arbitrary.
- spatial resolution may be used as the parameter (spatial scalability).
- In the case of spatial scalability, the resolution of the image differs from layer to layer.
- Alternatively, temporal resolution may be applied as the parameter providing such scalability (temporal scalability).
- In the case of temporal scalability, the frame rate differs from layer to layer.
- Further, a signal-to-noise ratio (SNR) may be applied as the parameter providing such scalability (SNR scalability).
- In the case of SNR scalability, the SN ratio differs from layer to layer.
- the parameters for providing scalability may be other than the examples described above.
- For example, there is bit-depth scalability, in which the base layer consists of 8-bit images and adding an enhancement layer to it yields 10-bit images.
- There is also chroma scalability, in which the base layer consists of component images in 4:2:0 format and an enhancement layer is added to it.
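Bit-depth scalability with difference data can be illustrated by the sketch below. The split rule (the base layer keeps the upper 8 bits and the enhancement layer carries the remaining difference) and the function names are assumptions for illustration; an actual scalable codec would predict and encode the layers rather than store raw values.

```python
def split_layers(pixels_10bit):
    """Split 10-bit pixels into an 8-bit base layer and difference data."""
    base = [p >> 2 for p in pixels_10bit]                              # 8-bit base layer
    enhancement = [p - (b << 2) for p, b in zip(pixels_10bit, base)]   # difference data
    return base, enhancement


def merge_layers(base, enhancement):
    """Recover the 10-bit pixels by adding the enhancement layer to the base layer."""
    return [(b << 2) + e for b, e in zip(base, enhancement)]


original = [0, 513, 1023]
base, enhancement = split_layers(original)
restored = merge_layers(base, enhancement)
print(base, enhancement, restored)
```

A decoder that receives only `base` can still display an 8-bit image, while one that also receives `enhancement` recovers the 10-bit original.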
- FIG. 26 is a diagram illustrating a hierarchical image encoding apparatus of the hierarchical image encoding / decoding system that performs the hierarchical image encoding / decoding described above.
- the hierarchical image encoding device 1020 includes an encoding unit 1021, an encoding unit 1022, and a multiplexing unit 1023.
- the encoding unit 1021 encodes the base layer image and generates a base layer image encoded stream.
- the encoding unit 1022 encodes the non-base layer image and generates a non-base layer image encoded stream.
- The multiplexing unit 1023 multiplexes the base layer image encoded stream generated by the encoding unit 1021 and the non-base layer image encoded stream generated by the encoding unit 1022 to generate a hierarchical image encoded stream.
- FIG. 27 is a diagram illustrating a hierarchical image decoding apparatus that performs the hierarchical image decoding described above.
- the hierarchical image decoding device 1030 includes a demultiplexing unit 1031, a decoding unit 1032 and a decoding unit 1033.
- The demultiplexing unit 1031 demultiplexes the hierarchical image encoded stream, in which the base layer image encoded stream and the non-base layer image encoded stream are multiplexed, and extracts the base layer image encoded stream and the non-base layer image encoded stream.
- the decoding unit 1032 decodes the base layer image encoded stream extracted by the demultiplexing unit 1031 to obtain a base layer image.
- the decoding unit 1033 decodes the non-base layer image encoded stream extracted by the demultiplexing unit 1031 to obtain a non-base layer image.
- The encoding device 10 described in the above embodiment may be applied as the encoding unit 1021 and the encoding unit 1022 of the hierarchical image encoding device 1020.
- By doing so, the method described in the above embodiment can be applied to the encoding of the hierarchical image. That is, transmission efficiency and image quality can be improved.
- the decoding device 20 described in the above embodiment may be applied as the decoding unit 1032 and the decoding unit 1033 of the hierarchical image decoding device 1030.
- the method described in the above embodiment can be applied to decoding of the encoded data of the hierarchical image. That is, transmission efficiency and image quality can be improved.
- the series of processes described above can be executed by hardware or software.
- When the series of processes is executed by software, a program constituting the software is installed in a computer.
- Here, the computer includes a computer incorporated in dedicated hardware and, for example, a general-purpose personal computer capable of executing various functions by installing various programs.
- FIG. 28 is a block diagram showing an example of the hardware configuration of a computer that executes the above-described series of processing by a program.
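- In the computer shown in FIG. 28, a CPU (Central Processing Unit) 1101, a ROM (Read Only Memory) 1102, and a RAM (Random Access Memory) 1103 are connected to one another via a bus 1104.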
- An input / output interface 1110 is also connected to the bus 1104.
- An input unit 1111, an output unit 1112, a storage unit 1113, a communication unit 1114, and a drive 1115 are connected to the input / output interface 1110.
- the input unit 1111 includes, for example, a keyboard, a mouse, a microphone, a touch panel, an input terminal, and the like.
- the output unit 1112 includes, for example, a display, a speaker, an output terminal, and the like.
- the storage unit 1113 includes, for example, a hard disk, a RAM disk, a nonvolatile memory, and the like.
- the communication unit 1114 is composed of a network interface, for example.
- the drive 1115 drives a removable medium 821 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
- In the computer configured as described above, the CPU 1101 loads, for example, the program stored in the storage unit 1113 into the RAM 1103 via the input / output interface 1110 and the bus 1104 and executes it, whereby the above-described series of processes is performed.
- the RAM 1103 also appropriately stores data necessary for the CPU 1101 to execute various processes.
- The program executed by the computer (CPU 1101) can be provided by being recorded on, for example, the removable medium 821 as a package medium or the like.
- the program can be installed in the storage unit 1113 via the input / output interface 1110 by attaching the removable medium 821 to the drive 1115.
- This program can also be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting. In that case, the program can be received by the communication unit 1114 and installed in the storage unit 1113.
- this program can be installed in the ROM 1102 or the storage unit 1113 in advance.
- The encoding device 10 and the decoding device 20 according to the embodiments described above can be applied to various electronic devices, for example, transmitters and receivers for satellite broadcasting, wired broadcasting such as cable TV, distribution on the Internet, and distribution to terminals by cellular communication; recording apparatuses that record images on media such as optical disks, magnetic disks, and flash memory; and reproducing apparatuses that reproduce images from such storage media.
- FIG. 29 is a diagram illustrating an example of a schematic configuration of a television device to which the above-described embodiment is applied.
- The television device 1200 includes an antenna 1201, a tuner 1202, a demultiplexer 1203, a decoder 1204, a video signal processing unit 1205, a display unit 1206, an audio signal processing unit 1207, a speaker 1208, an external interface (I / F) unit 1209, a control unit 1210, a user interface (I / F) unit 1211, and a bus 1212.
- Tuner 1202 extracts a signal of a desired channel from a broadcast signal received via antenna 1201, and demodulates the extracted signal. Then, tuner 1202 outputs the encoded bit stream obtained by demodulation to demultiplexer 1203. That is, the tuner 1202 serves as a transmission unit in the television apparatus 1200 that receives an encoded stream in which an image is encoded.
- the demultiplexer 1203 separates the video stream and audio stream of the viewing target program from the encoded bit stream, and outputs the separated streams to the decoder 1204. Further, the demultiplexer 1203 extracts auxiliary data such as EPG (Electronic Program Guide) from the encoded bit stream, and supplies the extracted data to the control unit 1210. Note that the demultiplexer 1203 may perform descrambling when the encoded bit stream is scrambled.
- the decoder 1204 decodes the video stream and audio stream input from the demultiplexer 1203. Then, the decoder 1204 outputs the video data generated by the decoding process to the video signal processing unit 1205. In addition, the decoder 1204 outputs the audio data generated by the decoding process to the audio signal processing unit 1207.
- the video signal processing unit 1205 reproduces the video data input from the decoder 1204 and causes the display unit 1206 to display the video.
- the video signal processing unit 1205 may cause the display unit 1206 to display an application screen supplied via the network.
- the video signal processing unit 1205 may perform additional processing such as noise removal on the video data according to the setting.
- the video signal processing unit 1205 may generate a GUI (Graphical User Interface) image such as a menu, a button, or a cursor, and superimpose the generated image on the output image.
- The display unit 1206 is driven by a drive signal supplied from the video signal processing unit 1205, and displays a video or an image on the video screen of a display device (for example, a liquid crystal display, a plasma display, or an OELD (Organic ElectroLuminescence Display), that is, an organic EL display).
- the audio signal processing unit 1207 performs reproduction processing such as D / A conversion and amplification on the audio data input from the decoder 1204, and outputs audio from the speaker 1208.
- the audio signal processing unit 1207 may perform additional processing such as noise removal on the audio data.
- the external interface unit 1209 is an interface for connecting the television apparatus 1200 to an external device or a network.
- a video stream or an audio stream received via the external interface unit 1209 may be decoded by the decoder 1204. That is, the external interface unit 1209 also has a role as a transmission unit in the television apparatus 1200 that receives an encoded stream in which an image is encoded.
- the control unit 1210 includes a processor such as a CPU and memories such as a RAM and a ROM.
- the memory stores a program executed by the CPU, program data, EPG data, data acquired via a network, and the like.
- the program stored in the memory is read and executed by the CPU when the television apparatus 1200 is started.
- the CPU controls the operation of the television apparatus 1200 according to an operation signal input from the user interface unit 1211 by executing the program.
- the user interface unit 1211 is connected to the control unit 1210.
- the user interface unit 1211 includes, for example, buttons and switches for the user to operate the television device 1200, a remote control signal receiving unit, and the like.
- the user interface unit 1211 detects an operation by the user via these components, generates an operation signal, and outputs the generated operation signal to the control unit 1210.
- the bus 1212 interconnects the tuner 1202, the demultiplexer 1203, the decoder 1204, the video signal processing unit 1205, the audio signal processing unit 1207, the external interface unit 1209, and the control unit 1210.
- the decoder 1204 may have the function of the decoding apparatus 20 described above. That is, the decoder 1204 may decode the encoded data by the method described in the above embodiments.
- By doing so, the television device 1200 can improve transmission efficiency and image quality.
- The video signal processing unit 1205 may, for example, encode the image data supplied from the decoder 1204 so that the obtained encoded data can be output to the outside of the television apparatus 1200 via the external interface unit 1209.
- the video signal processing unit 1205 may have the function of the encoding device 10 described above. That is, the video signal processing unit 1205 may encode the image data supplied from the decoder 1204 by the method described in the above embodiments.
- By doing so, the television device 1200 can improve transmission efficiency and image quality.
- FIG. 30 is a diagram illustrating an example of a schematic configuration of a mobile phone to which the above-described embodiment is applied.
- A mobile phone 1220 includes an antenna 1221, a communication unit 1222, an audio codec 1223, a speaker 1224, a microphone 1225, a camera unit 1226, an image processing unit 1227, a demultiplexing unit 1228, a recording / reproducing unit 1229, a display unit 1230, a control unit 1231, an operation unit 1232, and a bus 1233.
- the antenna 1221 is connected to the communication unit 1222.
- the speaker 1224 and the microphone 1225 are connected to the audio codec 1223.
- the operation unit 1232 is connected to the control unit 1231.
- the bus 1233 connects the communication unit 1222, the audio codec 1223, the camera unit 1226, the image processing unit 1227, the demultiplexing unit 1228, the recording / reproducing unit 1229, the display unit 1230, and the control unit 1231 to each other.
- The mobile phone 1220 performs operations such as transmitting and receiving audio signals, transmitting and receiving e-mail or image data, capturing images, and recording data in various operation modes including a voice call mode, a data communication mode, a shooting mode, and a videophone mode.
- The analog audio signal generated by the microphone 1225 is supplied to the audio codec 1223.
- The audio codec 1223 converts the analog audio signal into audio data by A / D conversion and compresses the converted audio data. Then, the audio codec 1223 outputs the compressed audio data to the communication unit 1222.
- the communication unit 1222 encodes and modulates audio data, and generates a transmission signal. Then, the communication unit 1222 transmits the generated transmission signal to a base station (not shown) via the antenna 1221. In addition, the communication unit 1222 amplifies a radio signal received via the antenna 1221 and performs frequency conversion to obtain a received signal.
- the communication unit 1222 demodulates and decodes the received signal to generate audio data, and outputs the generated audio data to the audio codec 1223.
- the audio codec 1223 decompresses and D / A converts the audio data to generate an analog audio signal. Then, the audio codec 1223 supplies the generated audio signal to the speaker 1224 to output audio.
- the control unit 1231 generates character data constituting the e-mail in response to an operation by the user via the operation unit 1232.
- the control unit 1231 displays characters on the display unit 1230.
- the control unit 1231 generates e-mail data in response to a transmission instruction from the user via the operation unit 1232, and outputs the generated e-mail data to the communication unit 1222.
- the communication unit 1222 encodes and modulates the e-mail data, and generates a transmission signal. Then, the communication unit 1222 transmits the generated transmission signal to a base station (not shown) via the antenna 1221.
- the communication unit 1222 amplifies a radio signal received via the antenna 1221 and performs frequency conversion to obtain a received signal. Then, the communication unit 1222 demodulates and decodes the received signal to restore the email data, and outputs the restored email data to the control unit 1231.
- the control unit 1231 displays the contents of the e-mail on the display unit 1230, supplies the e-mail data to the recording / reproducing unit 1229, and writes the data in the storage medium.
- the recording / reproducing unit 1229 has an arbitrary readable / writable storage medium.
- The storage medium may be a built-in storage medium such as a RAM or a flash memory, or may be an externally mounted storage medium such as a hard disk, a magnetic disk, a magneto-optical disk, an optical disk, a USB (Universal Serial Bus) memory, or a memory card.
- the camera unit 1226 captures an image of a subject to generate image data, and outputs the generated image data to the image processing unit 1227.
- the image processing unit 1227 encodes the image data input from the camera unit 1226, supplies the encoded stream to the recording / reproducing unit 1229, and writes the encoded stream in the storage medium.
- the recording / reproducing unit 1229 reads out the encoded stream recorded in the storage medium and outputs it to the image processing unit 1227.
- The image processing unit 1227 decodes the encoded stream input from the recording / reproducing unit 1229, supplies the image data to the display unit 1230, and causes the image to be displayed.
- The demultiplexing unit 1228 multiplexes the video stream encoded by the image processing unit 1227 and the audio stream input from the audio codec 1223, and outputs the multiplexed stream to the communication unit 1222.
- the communication unit 1222 encodes and modulates the stream and generates a transmission signal. Then, the communication unit 1222 transmits the generated transmission signal to a base station (not shown) via the antenna 1221.
- the communication unit 1222 amplifies a radio signal received via the antenna 1221 and performs frequency conversion to obtain a received signal.
- These transmission and received signals may include an encoded bit stream.
- The communication unit 1222 then demodulates and decodes the received signal to restore the stream, and outputs the restored stream to the demultiplexing unit 1228.
- the demultiplexing unit 1228 separates the video stream and the audio stream from the input stream, and outputs the video stream to the image processing unit 1227 and the audio stream to the audio codec 1223.
- the image processing unit 1227 decodes the video stream and generates video data.
- the video data is supplied to the display unit 1230, and a series of images is displayed on the display unit 1230.
- the audio codec 1223 expands the audio stream and performs D / A conversion to generate an analog audio signal. Then, the audio codec 1223 supplies the generated audio signal to the speaker 1224 to output audio.
- the image processing unit 1227 may have the function of the encoding device 10 described above. That is, the image processing unit 1227 may encode the image data by the method described in the above embodiments.
- By doing so, the mobile phone 1220 can improve transmission efficiency and image quality.
- the image processing unit 1227 may have the function of the decoding device 20 described above. That is, the image processing unit 1227 may decode the encoded data by the method described in the above embodiment.
- By doing so, the mobile phone 1220 can improve transmission efficiency and image quality.
- FIG. 31 is a diagram showing an example of a schematic configuration of a recording / reproducing apparatus to which the above-described embodiment is applied.
- The recording / reproducing apparatus 1240 encodes, for example, the audio data and video data of a received broadcast program and records them on a recording medium. The recording / reproducing apparatus 1240 may also encode audio data and video data acquired from another apparatus and record them on a recording medium. Further, the recording / reproducing apparatus 1240 reproduces data recorded on the recording medium on a monitor and a speaker, for example, in accordance with a user instruction. At this time, the recording / reproducing apparatus 1240 decodes the audio data and the video data.
- The recording / reproducing apparatus 1240 includes a tuner 1241, an external interface (I / F) unit 1242, an encoder 1243, an HDD (Hard Disk Drive) unit 1244, a disk drive 1245, a selector 1246, a decoder 1247, an OSD (On-Screen Display) unit 1248, a control unit 1249, and a user interface (I / F) unit 1250.
- Tuner 1241 extracts a signal of a desired channel from a broadcast signal received via an antenna (not shown), and demodulates the extracted signal. Then, tuner 1241 outputs the encoded bit stream obtained by demodulation to selector 1246. That is, the tuner 1241 has a role as a transmission unit in the recording / reproducing apparatus 1240.
- the external interface unit 1242 is an interface for connecting the recording / reproducing device 1240 to an external device or a network.
- The external interface unit 1242 may be, for example, an IEEE (Institute of Electrical and Electronics Engineers) 1394 interface, a network interface, a USB interface, or a flash memory interface.
- video data and audio data received via the external interface unit 1242 are input to the encoder 1243. That is, the external interface unit 1242 has a role as a transmission unit in the recording / reproducing apparatus 1240.
- the encoder 1243 encodes video data and audio data when the video data and audio data input from the external interface unit 1242 are not encoded. Then, the encoder 1243 outputs the encoded bit stream to the selector 1246.
- The HDD unit 1244 records an encoded bit stream in which content data such as video and audio is compressed, various programs, and other data on an internal hard disk. Further, the HDD unit 1244 reads out these data from the hard disk when reproducing video and audio.
- the disk drive 1245 performs recording and reading of data to and from the mounted recording medium.
- Recording media mounted on the disk drive 1245 include, for example, DVD (Digital Versatile Disc) discs (DVD-Video, DVD-RAM (DVD-Random Access Memory), DVD-R (DVD-Recordable), DVD-RW (DVD-Rewritable), DVD+R (DVD+Recordable), DVD+RW (DVD+Rewritable), and the like) and Blu-ray (registered trademark) discs.
- The selector 1246 selects an encoded bit stream input from the tuner 1241 or the encoder 1243 during video and audio recording, and outputs the selected encoded bit stream to the HDD unit 1244 or the disk drive 1245. Further, the selector 1246 outputs the encoded bit stream input from the HDD unit 1244 or the disk drive 1245 to the decoder 1247 when reproducing video and audio.
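The routing rule of the selector 1246 can be summarized in a short sketch. The string identifiers, the mode names `"record"` and `"playback"`, and the function `select_route` are hypothetical names introduced here; the text only states which units feed the selector and which units it outputs to in each case.

```python
# Hypothetical identifiers for the units named in the description.
RECORD_SOURCES = {"tuner_1241", "encoder_1243"}
STORAGE = {"hdd_1244", "disk_drive_1245"}
DECODER = "decoder_1247"


def select_route(mode, source, destination=None):
    """Return the (source, destination) pair the selector connects.

    Raises ValueError for any combination the description does not mention.
    """
    if mode == "record":
        # Recording: tuner or encoder -> HDD unit or disk drive.
        if source in RECORD_SOURCES and destination in STORAGE:
            return (source, destination)
    elif mode == "playback":
        # Playback: HDD unit or disk drive -> decoder (the only destination).
        if source in STORAGE and destination in (None, DECODER):
            return (source, DECODER)
    raise ValueError(f"unsupported route: {mode} {source} -> {destination}")


record_route = select_route("record", "tuner_1241", "hdd_1244")
playback_route = select_route("playback", "disk_drive_1245")
```

Rejecting unlisted combinations is a design choice of this sketch, not something the description requires.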
- the decoder 1247 decodes the encoded bit stream and generates video data and audio data. Then, the decoder 1247 outputs the generated video data to the OSD unit 1248. The decoder 1247 outputs the generated audio data to an external speaker.
- the OSD unit 1248 reproduces the video data input from the decoder 1247 and displays the video.
- the OSD unit 1248 may superimpose a GUI image such as a menu, a button, or a cursor on the video to be displayed.
- the control unit 1249 includes a processor such as a CPU and memories such as a RAM and a ROM.
- the memory stores a program executed by the CPU, program data, and the like.
- the program stored in the memory is read and executed by the CPU when the recording / reproducing apparatus 1240 is activated, for example.
- the CPU controls the operation of the recording / reproducing device 1240 according to an operation signal input from the user interface unit 1250, for example, by executing the program.
- the user interface unit 1250 is connected to the control unit 1249.
- the user interface unit 1250 includes, for example, buttons and switches for the user to operate the recording / reproducing device 1240, a remote control signal receiving unit, and the like.
- the user interface unit 1250 detects an operation by the user via these components, generates an operation signal, and outputs the generated operation signal to the control unit 1249.
- the encoder 1243 may have the function of the encoding apparatus 10 described above. That is, the encoder 1243 may encode the image data by the method described in the above embodiments. By doing so, the recording / reproducing apparatus 1240 can improve transmission efficiency and image quality.
- the decoder 1247 may have the function of the decoding device 20 described above. That is, the decoder 1247 may decode the encoded data by the method described in the above embodiments. By doing so, the recording / reproducing apparatus 1240 can improve transmission efficiency and image quality.
- FIG. 32 is a diagram illustrating an example of a schematic configuration of an imaging apparatus to which the above-described embodiment is applied.
- the imaging device 1260 images a subject to generate an image, encodes the image data, and records the image data on a recording medium.
- The imaging device 1260 includes an optical block 1261, an imaging unit 1262, a signal processing unit 1263, an image processing unit 1264, a display unit 1265, an external interface (I / F) unit 1266, a memory unit 1267, a media drive 1268, an OSD unit 1269, a control unit 1270, a user interface (I / F) unit 1271, and a bus 1272.
- the optical block 1261 is connected to the imaging unit 1262.
- the imaging unit 1262 is connected to the signal processing unit 1263.
- the display unit 1265 is connected to the image processing unit 1264.
- the user interface unit 1271 is connected to the control unit 1270.
- the bus 1272 connects the image processing unit 1264, the external interface unit 1266, the memory unit 1267, the media drive 1268, the OSD unit 1269, and the control unit 1270 to each other.
- the optical block 1261 has a focus lens, a diaphragm mechanism, and the like.
- the optical block 1261 forms an optical image of the subject on the imaging surface of the imaging unit 1262.
- the imaging unit 1262 includes an image sensor such as a CCD (Charge-Coupled Device) or a CMOS (Complementary Metal-Oxide Semiconductor), and converts an optical image formed on the imaging surface into an image signal as an electrical signal by photoelectric conversion. Then, the imaging unit 1262 outputs the image signal to the signal processing unit 1263.
- the signal processing unit 1263 performs various camera signal processes such as knee correction, gamma correction, and color correction on the image signal input from the imaging unit 1262.
- the signal processing unit 1263 outputs the image data after camera signal processing to the image processing unit 1264.
- the image processing unit 1264 encodes the image data input from the signal processing unit 1263 to generate encoded data. Then, the image processing unit 1264 outputs the generated encoded data to the external interface unit 1266 or the media drive 1268.
- the image processing unit 1264 decodes encoded data input from the external interface unit 1266 or the media drive 1268, and generates image data. Then, the image processing unit 1264 outputs the generated image data to the display unit 1265. Further, the image processing unit 1264 may display the image by outputting the image data input from the signal processing unit 1263 to the display unit 1265. In addition, the image processing unit 1264 may superimpose display data acquired from the OSD unit 1269 on an image output to the display unit 1265.
- the OSD unit 1269 generates a GUI image such as a menu, a button, or a cursor, and outputs the generated image to the image processing unit 1264.
- the external interface unit 1266 is configured as a USB input / output terminal, for example.
- the external interface unit 1266 connects the imaging device 1260 and a printer, for example, when printing an image.
- a drive is connected to the external interface unit 1266 as necessary.
- a removable medium such as a magnetic disk or an optical disk is attached to the drive, and a program read from the removable medium can be installed in the imaging apparatus 1260.
- the external interface unit 1266 may be configured as a network interface connected to a network such as a LAN or the Internet. That is, the external interface unit 1266 has a role as a transmission unit in the imaging device 1260.
- the recording medium attached to the media drive 1268 may be any readable / writable removable medium such as a magnetic disk, a magneto-optical disk, an optical disk, or a semiconductor memory.
- A recording medium may also be fixedly mounted on the media drive 1268 to configure a non-portable storage unit such as an internal hard disk drive or an SSD (Solid State Drive).
- the control unit 1270 includes a processor such as a CPU, and memories such as a RAM and a ROM.
- the memory stores a program executed by the CPU, program data, and the like.
- the program stored in the memory is read and executed by the CPU when the imaging device 1260 is activated, for example.
- the CPU controls the operation of the imaging device 1260 according to an operation signal input from the user interface unit 1271, for example, by executing the program.
- the user interface unit 1271 is connected to the control unit 1270.
- the user interface unit 1271 includes, for example, buttons and switches for the user to operate the imaging device 1260.
- the user interface unit 1271 detects an operation by the user via these components, generates an operation signal, and outputs the generated operation signal to the control unit 1270.
- the image processing unit 1264 may have the function of the encoding device 10 described above. That is, the image processing unit 1264 may encode the image data by the method described in the above embodiments. Thus, the imaging device 1260 can improve transmission efficiency and image quality.
- the image processing unit 1264 may have the function of the decoding device 20 described above. That is, the image processing unit 1264 may decode the encoded data by the method described in the above embodiment.
- By doing so, the imaging device 1260 can improve transmission efficiency and image quality.
- The present technology can also be applied to HTTP streaming such as MPEG-DASH, in which an appropriate piece of encoded data is selected and used from a plurality of pieces of encoded data with different resolutions and the like prepared in advance. That is, information regarding encoding and decoding can be shared among a plurality of such pieces of encoded data.
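As an illustration of selecting one piece of encoded data from several prepared in advance, the following sketch picks a representation by available bandwidth. The selection criterion (highest bit rate that fits the measured bandwidth), the field names, and the bit-rate figures are all assumptions for illustration; the text does not state how the appropriate data is chosen.

```python
def select_representation(representations, available_kbps):
    """Pick the highest-bitrate representation that fits the available bandwidth.

    Falls back to the lowest-bitrate representation when none fits.
    The bandwidth-based rule is an assumption, not taken from the description.
    """
    fitting = [r for r in representations if r["kbps"] <= available_kbps]
    if fitting:
        return max(fitting, key=lambda r: r["kbps"])
    return min(representations, key=lambda r: r["kbps"])


# Hypothetical set of encoded data prepared in advance at different resolutions.
representations = [
    {"resolution": (640, 360), "kbps": 800},
    {"resolution": (1280, 720), "kbps": 2500},
    {"resolution": (1920, 1080), "kbps": 5000},
]
choice = select_representation(representations, available_kbps=3000)
```

With 3000 kbps available, the sketch selects the 1280x720 entry, since the 1920x1080 entry would exceed the available bandwidth.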
- The present technology is not limited to these examples. It can also be implemented as any configuration mounted on such a device or on a device constituting such a system, for example, a processor such as a system LSI (Large Scale Integration), a module using a plurality of processors, a unit using a plurality of modules, or a set in which other functions are added to a unit (that is, a partial configuration of a device).
- FIG. 33 is a diagram illustrating an example of a schematic configuration of a video set to which the present technology is applied.
- The video set 1300 shown in FIG. 33 has such a multi-functional configuration, and is a combination of a device having a function related to image encoding and decoding (either one or both) and devices having other functions related to that function.
- The video set 1300 includes a module group such as a video module 1311, an external memory 1312, a power management module 1313, and a front end module 1314, and devices having related functions such as connectivity 1321, a camera 1322, and a sensor 1323.
- A module is a component in which several mutually related component functions are brought together to provide a coherent function.
- The specific physical configuration is arbitrary. For example, a module may consist of a plurality of processors each having a function, electronic circuit elements such as resistors and capacitors, and other devices arranged on a wiring board or the like. It is also possible to combine a module with another module, a processor, or the like to form a new module.
- the video module 1311 is a combination of configurations having functions related to image processing, and includes an application processor 1331, a video processor 1332, a broadband modem 1333, and an RF module 1334.
- A processor is a configuration in which components having predetermined functions are integrated on a semiconductor chip as an SoC (System On a Chip); some are called, for example, a system LSI (Large Scale Integration).
- The configuration having the predetermined function may be a logic circuit (hardware configuration); a CPU, a ROM, a RAM, and the like together with a program executed using them (software configuration); or a combination of both.
- For example, a processor may have a logic circuit as well as a CPU, a ROM, a RAM, and the like, with some functions realized by the logic circuit (hardware configuration) and the other functions realized by a program executed on the CPU (software configuration).
- The application processor 1331 in FIG. 33 is a processor that executes an application related to image processing.
- The application executed in the application processor 1331 not only performs arithmetic processing to realize a predetermined function, but can also control configurations inside and outside the video module 1311, such as the video processor 1332, as necessary.
- the video processor 1332 is a processor having a function related to image encoding / decoding (one or both of them).
- The broadband modem 1333 digitally modulates data (a digital signal) to be transmitted by wired or wireless (or both) broadband communication performed via a broadband line such as the Internet or a public telephone network, thereby converting it into an analog signal.
- The broadband modem 1333 also demodulates an analog signal received by the broadband communication and converts it into data (a digital signal).
- the broadband modem 1333 processes arbitrary information such as image data processed by the video processor 1332, a stream obtained by encoding the image data, an application program, setting data, and the like.
- the RF module 1334 is a module that performs frequency conversion, modulation / demodulation, amplification, filter processing, and the like on an RF (Radio Frequency) signal transmitted / received via an antenna. For example, the RF module 1334 generates an RF signal by performing frequency conversion or the like on the baseband signal generated by the broadband modem 1333. Further, for example, the RF module 1334 generates a baseband signal by performing frequency conversion or the like on the RF signal received via the front end module 1314.
- the application processor 1331 and the video processor 1332 may be integrated into a single processor.
- the external memory 1312 is a module that is provided outside the video module 1311 and has a storage device used by the video module 1311.
- The storage device of the external memory 1312 may be realized by any physical configuration. However, since it is generally used for storing a large amount of data such as image data in units of frames, it is desirable to realize it with a relatively inexpensive, large-capacity semiconductor memory such as a DRAM (Dynamic Random Access Memory).
- the power management module 1313 manages and controls power supply to the video module 1311 (each component in the video module 1311).
- the front-end module 1314 is a module that provides the RF module 1334 with a front-end function (circuit on the transmitting / receiving end on the antenna side). As illustrated in FIG. 33, the front end module 1314 includes, for example, an antenna unit 1351, a filter 1352, and an amplification unit 1353.
- the antenna unit 1351 has an antenna for transmitting and receiving a radio signal and its peripheral configuration.
- the antenna unit 1351 transmits the signal supplied from the amplification unit 1353 as a radio signal, and supplies the received radio signal to the filter 1352 as an electric signal (RF signal).
- the filter 1352 performs a filtering process on the RF signal received via the antenna unit 1351 and supplies the processed RF signal to the RF module 1334.
- the amplifying unit 1353 amplifies the RF signal supplied from the RF module 1334 and supplies the amplified RF signal to the antenna unit 1351.
- Connectivity 1321 is a module having a function related to connection with the outside.
- the physical configuration of the connectivity 1321 is arbitrary.
- The connectivity 1321 has a configuration having a communication function based on a standard other than the communication standard supported by the broadband modem 1333, an external input / output terminal, and the like.
- The connectivity 1321 may include a module having a communication function compliant with a wireless communication standard such as Bluetooth (registered trademark), IEEE 802.11 (for example, Wi-Fi (Wireless Fidelity, registered trademark)), NFC (Near Field Communication), or IrDA (InfraRed Data Association), and an antenna or the like that transmits and receives signals compliant with that standard.
- The connectivity 1321 may include a module having a communication function compliant with a wired communication standard such as USB (Universal Serial Bus) or HDMI (registered trademark) (High-Definition Multimedia Interface), and a terminal compliant with that standard.
- the connectivity 1321 may have other data (signal) transmission functions such as analog input / output terminals.
- the connectivity 1321 may include a data (signal) transmission destination device.
- The connectivity 1321 may include a drive that reads and writes data from and to a recording medium such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory (including not only a drive for removable media but also a hard disk, an SSD (Solid State Drive), a NAS (Network Attached Storage), and the like).
- the connectivity 1321 may include an image or audio output device (a monitor, a speaker, or the like).
- the camera 1322 is a module having a function of capturing a subject and obtaining image data of the subject.
- Image data obtained by imaging by the camera 1322 is supplied to, for example, a video processor 1332 and encoded.
- The sensor 1323 is a module having an arbitrary sensor function, such as a voice sensor, an ultrasonic sensor, an optical sensor, an illuminance sensor, an infrared sensor, an image sensor, a rotation sensor, an angle sensor, an angular velocity sensor, a velocity sensor, an acceleration sensor, an inclination sensor, a magnetic identification sensor, an impact sensor, or a temperature sensor.
- the data detected by the sensor 1323 is supplied to the application processor 1331 and used by an application or the like.
- the configuration described as a module in the above may be realized as a processor, or conversely, the configuration described as a processor may be realized as a module.
- the present technology can be applied to the video processor 1332 as described later. Therefore, the video set 1300 can be implemented as a set to which the present technology is applied.
- FIG. 34 is a diagram illustrating an example of a schematic configuration of a video processor 1332 (FIG. 33) to which the present technology is applied.
- The video processor 1332 has a function of receiving a video signal and an audio signal and encoding them by a predetermined method, and a function of decoding encoded video data and audio data and reproducing and outputting a video signal and an audio signal.
- The video processor 1332 includes a video input processing unit 1401, a first image enlargement / reduction unit 1402, a second image enlargement / reduction unit 1403, a video output processing unit 1404, a frame memory 1405, and a memory control unit 1406.
- The video processor 1332 includes an encoding / decoding engine 1407, video ES (Elementary Stream) buffers 1408A and 1408B, and audio ES buffers 1409A and 1409B.
- the video processor 1332 includes an audio encoder 1410, an audio decoder 1411, a multiplexing unit (MUX (Multiplexer)) 1412, a demultiplexing unit (DMUX (Demultiplexer)) 1413, and a stream buffer 1414.
- the video input processing unit 1401 acquires a video signal input from, for example, the connectivity 1321 (FIG. 33) and converts it into digital image data.
- the first image enlargement / reduction unit 1402 performs format conversion, image enlargement / reduction processing, and the like on the image data.
- The second image enlargement / reduction unit 1403 performs image enlargement / reduction processing on the image data in accordance with the format of the output destination via the video output processing unit 1404, or performs format conversion and image enlargement / reduction processing similar to those of the first image enlargement / reduction unit 1402.
- the video output processing unit 1404 performs format conversion, conversion to an analog signal, and the like on the image data and outputs the reproduced video signal to, for example, the connectivity 1321 or the like.
- The frame memory 1405 is a memory for image data shared by the video input processing unit 1401, the first image enlargement / reduction unit 1402, the second image enlargement / reduction unit 1403, the video output processing unit 1404, and the encoding / decoding engine 1407.
- the frame memory 1405 is realized as a semiconductor memory such as a DRAM, for example.
- the memory control unit 1406 receives the synchronization signal from the encoding / decoding engine 1407, and controls the write / read access to the frame memory 1405 according to the access schedule to the frame memory 1405 written in the access management table 1406A.
- the access management table 1406A is updated by the memory control unit 1406 in accordance with processing executed by the encoding / decoding engine 1407, the first image enlargement / reduction unit 1402, the second image enlargement / reduction unit 1403, and the like.
- the encoding / decoding engine 1407 performs encoding processing of image data and decoding processing of a video stream that is data obtained by encoding the image data. For example, the encoding / decoding engine 1407 encodes the image data read from the frame memory 1405 and sequentially writes the data as a video stream in the video ES buffer 1408A. Further, for example, the video stream is sequentially read from the video ES buffer 1408B, decoded, and sequentially written in the frame memory 1405 as image data.
- the encoding / decoding engine 1407 uses the frame memory 1405 as a work area in the encoding and decoding. Also, the encoding / decoding engine 1407 outputs a synchronization signal to the memory control unit 1406, for example, at a timing at which processing for each macroblock is started.
- the video ES buffer 1408A buffers the video stream generated by the encoding / decoding engine 1407 and supplies the buffered video stream to the multiplexing unit (MUX) 1412.
- the video ES buffer 1408B buffers the video stream supplied from the demultiplexer (DMUX) 1413 and supplies the buffered video stream to the encoding / decoding engine 1407.
- the audio ES buffer 1409A buffers the audio stream generated by the audio encoder 1410 and supplies the buffered audio stream to the multiplexing unit (MUX) 1412.
- the audio ES buffer 1409B buffers the audio stream supplied from the demultiplexer (DMUX) 1413 and supplies the buffered audio stream to the audio decoder 1411.
- the audio encoder 1410 converts an audio signal input from, for example, the connectivity 1321 into a digital format and encodes it using a predetermined method such as the MPEG audio method or the AC3 (Audio Code number 3) method.
- the audio encoder 1410 sequentially writes an audio stream, which is data obtained by encoding an audio signal, in the audio ES buffer 1409A.
- the audio decoder 1411 decodes the audio stream supplied from the audio ES buffer 1409B, performs conversion to an analog signal, for example, and supplies the reproduced audio signal to, for example, the connectivity 1321 or the like.
- the multiplexing unit (MUX) 1412 multiplexes the video stream and the audio stream.
- the multiplexing method (that is, the format of the bit stream generated by multiplexing) is arbitrary.
- the multiplexing unit (MUX) 1412 can also add predetermined header information or the like to the bit stream. That is, the multiplexing unit (MUX) 1412 can convert the stream format by multiplexing. For example, the multiplexing unit (MUX) 1412 multiplexes the video stream and the audio stream to convert it into a transport stream that is a bit stream in a transfer format. Further, for example, the multiplexing unit (MUX) 1412 multiplexes the video stream and the audio stream, thereby converting the data into file format data (file data) for recording.
- the demultiplexing unit (DMUX) 1413 demultiplexes the bit stream in which the video stream and the audio stream are multiplexed by a method corresponding to the multiplexing by the multiplexing unit (MUX) 1412. That is, the demultiplexer (DMUX) 1413 extracts the video stream and the audio stream from the bit stream read from the stream buffer 1414 (separates the video stream and the audio stream). That is, the demultiplexer (DMUX) 1413 can convert the stream format by demultiplexing (inverse conversion of the conversion by the multiplexer (MUX) 1412).
- the demultiplexing unit (DMUX) 1413 can, for example, obtain via the stream buffer 1414 a transport stream supplied from the connectivity 1321, the broadband modem 1333, or the like, and demultiplex it to convert it into a video stream and an audio stream. Further, for example, the demultiplexing unit (DMUX) 1413 can obtain via the stream buffer 1414 file data read from various recording media by the connectivity 1321, and demultiplex it to convert it into a video stream and an audio stream.
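The multiplexing and demultiplexing described above can be illustrated with a minimal Python sketch. The packet layout (a one-byte stream tag followed by a two-byte payload length) and the function names `mux` and `demux` are assumptions made for illustration only. The text leaves the multiplexing method arbitrary, and this is not the transport stream or file format that the video processor 1332 would actually produce.

```python
import struct
from typing import List, Tuple

# Hypothetical header: 1-byte stream tag ("V" or "A") + 2-byte big-endian length.
HEADER = ">cH"
HEADER_SIZE = struct.calcsize(HEADER)  # 3 bytes


def mux(video: List[bytes], audio: List[bytes]) -> bytes:
    """Interleave video and audio packets into one bit stream, adding a header to each."""
    out = bytearray()
    for i in range(max(len(video), len(audio))):
        for tag, stream in ((b"V", video), (b"A", audio)):
            if i < len(stream):
                payload = stream[i]  # payloads must be shorter than 65536 bytes
                out += struct.pack(HEADER, tag, len(payload)) + payload
    return bytes(out)


def demux(bitstream: bytes) -> Tuple[List[bytes], List[bytes]]:
    """Inverse of mux: separate the bit stream into video and audio packets."""
    video: List[bytes] = []
    audio: List[bytes] = []
    pos = 0
    while pos < len(bitstream):
        tag, length = struct.unpack_from(HEADER, bitstream, pos)
        pos += HEADER_SIZE
        payload = bitstream[pos:pos + length]
        pos += length
        (video if tag == b"V" else audio).append(payload)
    return video, audio
```

Because `demux` reads back exactly the headers that `mux` wrote, a round trip returns the original video and audio packet lists, which mirrors the statement that demultiplexing is the inverse conversion of multiplexing.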
- Stream buffer 1414 buffers the bit stream.
- the stream buffer 1414 buffers the transport stream supplied from the multiplexing unit (MUX) 1412 and supplies it to, for example, the connectivity 1321 or the broadband modem 1333 at a predetermined timing or based on an external request or the like.
- the stream buffer 1414 buffers the file data supplied from the multiplexing unit (MUX) 1412 and supplies it to, for example, the connectivity 1321 at a predetermined timing or based on an external request or the like, so that it is recorded on various recording media.
- the stream buffer 1414 buffers a transport stream acquired via, for example, the connectivity 1321 or the broadband modem 1333, and supplies it to the demultiplexing unit (DMUX) 1413 at a predetermined timing or based on an external request or the like.
- the stream buffer 1414 buffers file data read from various recording media by, for example, the connectivity 1321, and supplies it to the demultiplexing unit (DMUX) 1413 at a predetermined timing or based on an external request or the like.
- a video signal input to the video processor 1332 from the connectivity 1321 or the like is converted by the video input processing unit 1401 into digital image data of a predetermined format such as the 4:2:2 Y/Cb/Cr format and stored in the frame memory 1405.
- This digital image data is read by the first image enlargement / reduction unit 1402 or the second image enlargement / reduction unit 1403, subjected to format conversion to a predetermined format such as the 4:2:0 Y/Cb/Cr format and to enlargement / reduction processing, and written to the frame memory 1405 again.
- This image data is encoded by the encoding / decoding engine 1407 and written as a video stream in the video ES buffer 1408A.
- an audio signal input from the connectivity 1321 or the like to the video processor 1332 is encoded by the audio encoder 1410 and written as an audio stream in the audio ES buffer 1409A.
- the video stream of the video ES buffer 1408A and the audio stream of the audio ES buffer 1409A are read and multiplexed by the multiplexing unit (MUX) 1412 and converted into a transport stream, file data, or the like.
- the transport stream generated by the multiplexing unit (MUX) 1412 is buffered in the stream buffer 1414 and then output to the external network via, for example, the connectivity 1321 or the broadband modem 1333.
- the file data generated by the multiplexing unit (MUX) 1412 is buffered in the stream buffer 1414, and then output to, for example, the connectivity 1321 and recorded on various recording media.
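The 4:2:2 to 4:2:0 format conversion in the encoding flow above amounts to halving the vertical resolution of the chroma (Cb and Cr) planes, because 4:2:2 already halves them horizontally. The following is a hypothetical Python sketch that averages each pair of chroma rows. The text does not specify which filter the image enlargement / reduction units use, so the function name `chroma_422_to_420` and the two-row averaging filter are assumptions.

```python
from typing import List

Plane = List[List[int]]  # one chroma plane as rows of sample values


def chroma_422_to_420(plane: Plane) -> Plane:
    """Halve the vertical chroma resolution by averaging each pair of rows.

    The luma (Y) plane is left untouched by this conversion, so only the
    Cb and Cr planes need to pass through this function.
    """
    if len(plane) % 2 != 0:
        raise ValueError("chroma plane must have an even number of rows")
    return [
        # Rounded integer average of vertically adjacent samples.
        [(top + bottom + 1) // 2 for top, bottom in zip(top_row, bottom_row)]
        for top_row, bottom_row in zip(plane[0::2], plane[1::2])
    ]
```

For a four-row chroma plane the function returns two rows, so the converted image carries one chroma sample pair per 2x2 block of luma samples.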
- a transport stream input from an external network to the video processor 1332 via the connectivity 1321 or the broadband modem 1333 is buffered in the stream buffer 1414 and then demultiplexed by the demultiplexer (DMUX) 1413.
- file data read from various recording media by the connectivity 1321 and input to the video processor 1332 is buffered by the stream buffer 1414 and then demultiplexed by the demultiplexer (DMUX) 1413. That is, the transport stream or file data input to the video processor 1332 is separated into a video stream and an audio stream by the demultiplexer (DMUX) 1413.
- the audio stream is supplied to the audio decoder 1411 via the audio ES buffer 1409B and decoded to reproduce the audio signal.
- the video stream is written to the video ES buffer 1408B, and then sequentially read and decoded by the encoding / decoding engine 1407, and written to the frame memory 1405.
- the decoded image data is enlarged / reduced by the second image enlargement / reduction unit 1403 and written to the frame memory 1405.
- the decoded image data is read by the video output processing unit 1404, converted to a predetermined format such as the 4:2:2 Y/Cb/Cr format, further converted into an analog signal, and reproduced and output as a video signal.
- when the present technology is applied to the video processor 1332 configured as described above, the present technology according to the above-described embodiment may be applied to the encoding / decoding engine 1407. That is, for example, the encoding / decoding engine 1407 may have the functions of the encoding device 10 and / or the decoding device 20 described above. In this way, the video processor 1332 can obtain the same effects as those of the encoding device 10 and the decoding device 20 according to the above-described embodiment.
- the present technology (that is, the function of the encoding device 10 and / or the function of the decoding device 20) may be realized by hardware such as a logic circuit, by software such as an embedded program, or by both.
- FIG. 35 is a diagram illustrating another example of a schematic configuration of the video processor 1332 to which the present technology is applied.
- the video processor 1332 has a function of encoding and decoding video data by a predetermined method.
- the video processor 1332 includes a control unit 1511, a display interface 1512, a display engine 1513, an image processing engine 1514, and an internal memory 1515.
- the video processor 1332 includes a codec engine 1516, a memory interface 1517, a multiplexing / demultiplexing unit (MUX DMUX) 1518, a network interface 1519, and a video interface 1520.
- the control unit 1511 controls the operation of each processing unit in the video processor 1332 such as the display interface 1512, the display engine 1513, the image processing engine 1514, and the codec engine 1516.
- the control unit 1511 includes, for example, a main CPU 1531, a sub CPU 1532, and a system controller 1533.
- the main CPU 1531 executes a program and the like for controlling the operation of each processing unit in the video processor 1332.
- the main CPU 1531 generates a control signal according to the program and supplies it to each processing unit (that is, controls the operation of each processing unit).
- the sub CPU 1532 plays an auxiliary role of the main CPU 1531.
- the sub CPU 1532 executes child processes, subroutines, and the like of the program executed by the main CPU 1531.
- the system controller 1533 controls operations of the main CPU 1531 and the sub CPU 1532 such as designating a program to be executed by the main CPU 1531 and the sub CPU 1532.
- the display interface 1512 outputs the image data to, for example, the connectivity 1321 under the control of the control unit 1511.
- the display interface 1512 converts digital image data into an analog signal and outputs it as a reproduced video signal, or outputs the image data as digital data without conversion, to the monitor device or the like of the connectivity 1321.
- Under the control of the control unit 1511, the display engine 1513 performs various conversion processes on the image data, such as format conversion, size conversion, and color gamut conversion, so that the image data matches the hardware specifications of the monitor device or the like that displays the image.
- the image processing engine 1514 performs predetermined image processing such as filter processing for improving image quality on the image data under the control of the control unit 1511.
- the internal memory 1515 is a memory provided inside the video processor 1332 that is shared by the display engine 1513, the image processing engine 1514, and the codec engine 1516.
- the internal memory 1515 is used, for example, for data exchange performed between the display engine 1513, the image processing engine 1514, and the codec engine 1516.
- the internal memory 1515 stores data supplied from the display engine 1513, the image processing engine 1514, or the codec engine 1516, and supplies the data to the image processing engine 1514 or the codec engine 1516 as needed (e.g., upon request).
- the internal memory 1515 may be realized by any storage device. However, because it is generally used to store small amounts of data such as block-unit image data and parameters, it is desirable to realize it with a semiconductor memory that has a relatively small capacity (for example, compared with the external memory 1312) but a high response speed, such as an SRAM (Static Random Access Memory).
- the codec engine 1516 performs processing related to encoding and decoding of image data.
- the encoding / decoding scheme supported by the codec engine 1516 is arbitrary, and the number thereof may be one or plural.
- the codec engine 1516 may be provided with codec functions of a plurality of encoding / decoding schemes, and may be configured to perform encoding of image data or decoding of encoded data using one selected from them.
- the codec engine 1516 includes, for example, MPEG-2 Video 1541, AVC / H.264 1542, HEVC / H.265 1543, HEVC / H.265 (Scalable) 1544, HEVC / H.265 (Multi-view) 1545, and MPEG-DASH 1551 as functional blocks for codec-related processing.
- MPEG-2 Video 1541 is a functional block that encodes and decodes image data in the MPEG-2 format.
- AVC / H.264 1542 is a functional block that encodes and decodes image data using the AVC method.
- HEVC / H.265 1543 is a functional block that encodes and decodes image data using the HEVC method.
- HEVC / H.265 (Scalable) 1544 is a functional block that performs scalable encoding and scalable decoding of image data using the HEVC method.
- HEVC / H.265 (Multi-view) 1545 is a functional block that multi-view encodes or multi-view decodes image data using the HEVC method.
- MPEG-DASH 1551 is a functional block that transmits and receives image data using the MPEG-DASH (MPEG-Dynamic Adaptive Streaming over HTTP) method.
- MPEG-DASH is a technology for streaming video using HTTP (HyperText Transfer Protocol). One of its features is that appropriate encoded data is selected, in units of segments, from multiple pieces of encoded data prepared in advance with different resolutions and the like, and is transmitted.
- MPEG-DASH 1551 generates a stream compliant with the standard, controls transmission of the stream, and the like, and uses MPEG-2 Video 1541 to HEVC / H.265 (Multi-view) 1545 described above for encoding and decoding image data.
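The per-segment selection that characterizes MPEG-DASH can be sketched as follows. This is an illustrative Python example only: the `Representation` class, the bit-rate figures, and the rule "pick the highest bit rate that fits the measured throughput" are assumptions, since the text does not prescribe an adaptation algorithm or describe how MPEG-DASH 1551 makes its choice.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Representation:
    """One pre-encoded version of the content (hypothetical fields)."""
    name: str           # label such as "720p"
    bandwidth_bps: int  # bit rate needed to stream this version


def select_representation(reps: Sequence[Representation],
                          throughput_bps: int) -> Representation:
    """Pick the highest-bit-rate version that fits the available throughput.

    Falls back to the lowest-bit-rate version when nothing fits.
    """
    affordable = [r for r in reps if r.bandwidth_bps <= throughput_bps]
    if affordable:
        return max(affordable, key=lambda r: r.bandwidth_bps)
    return min(reps, key=lambda r: r.bandwidth_bps)


def plan_segments(reps: Sequence[Representation],
                  throughput_per_segment: Sequence[int]) -> List[str]:
    """Make the selection independently for every segment."""
    return [select_representation(reps, t).name for t in throughput_per_segment]
```

Because the choice is repeated for every segment, the delivered resolution can follow changes in network conditions during playback.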
- the memory interface 1517 is an interface for the external memory 1312. Data supplied from the image processing engine 1514 or the codec engine 1516 is supplied to the external memory 1312 via the memory interface 1517. The data read from the external memory 1312 is supplied to the video processor 1332 (the image processing engine 1514 or the codec engine 1516) via the memory interface 1517.
- a multiplexing / demultiplexing unit (MUX DMUX) 1518 performs multiplexing and demultiplexing of various data related to images such as a bit stream of encoded data, image data, and a video signal.
- This multiplexing / demultiplexing method is arbitrary.
- the multiplexing / demultiplexing unit (MUX DMUX) 1518 can not only combine a plurality of pieces of data into one but also add predetermined header information or the like to the data.
- the multiplexing / demultiplexing unit (MUX DMUX) 1518 can not only divide one piece of data into a plurality of pieces but also add predetermined header information or the like to each divided piece.
- the multiplexing / demultiplexing unit (MUX DMUX) 1518 can convert the data format by multiplexing / demultiplexing.
- the multiplexing / demultiplexing unit (MUX DMUX) 1518 can, for example, multiplex a bit stream to convert it into a transport stream, which is a bit stream in a transfer format, or into data in a file format for recording (file data).
- the network interface 1519 is an interface for a broadband modem 1333, connectivity 1321, etc., for example.
- the video interface 1520 is an interface for the connectivity 1321, the camera 1322, and the like, for example.
- the transport stream is supplied to the multiplexing / demultiplexing unit (MUX DMUX) 1518 via the network interface 1519.
- the image data obtained by decoding by the codec engine 1516 is subjected to predetermined image processing by the image processing engine 1514, subjected to predetermined conversion by the display engine 1513, and supplied to, for example, the connectivity 1321 through the display interface 1512, and the image is displayed on the monitor.
- image data obtained by decoding by the codec engine 1516 is re-encoded by the codec engine 1516, multiplexed by a multiplexing / demultiplexing unit (MUX DMUX) 1518, converted into file data, and video
- file data of encoded data obtained by encoding image data, read from a recording medium (not shown) by the connectivity 1321 or the like, is supplied through the video interface 1520 to the multiplexing / demultiplexing unit (MUX DMUX) 1518, demultiplexed, and decoded by the codec engine 1516.
- Image data obtained by decoding by the codec engine 1516 is subjected to predetermined image processing by the image processing engine 1514, subjected to predetermined conversion by the display engine 1513, and supplied to, for example, the connectivity 1321 through the display interface 1512. The image is displayed on the monitor.
- image data obtained by decoding by the codec engine 1516 is re-encoded by the codec engine 1516, multiplexed by the multiplexing / demultiplexing unit (MUX DMUX) 1518 and converted into a transport stream, supplied to, for example, the connectivity 1321 or the broadband modem 1333 via the network interface 1519, and transmitted to another device (not shown).
- image data and other data are exchanged between the processing units in the video processor 1332 using, for example, the internal memory 1515 or the external memory 1312.
- the power management module 1313 controls power supply to the control unit 1511, for example.
- when the present technology is applied to the video processor 1332 configured as described above, the present technology according to the above-described embodiment may be applied to the codec engine 1516. That is, for example, the codec engine 1516 may have the functions of the encoding device 10 and / or the decoding device 20 described above. In this way, the video processor 1332 can obtain the same effects as those of the encoding device 10 and the decoding device 20 described above.
- the present technology (that is, the functions of the encoding device 10 and the decoding device 20) may be realized by hardware such as a logic circuit, by software such as an embedded program, or by both.
- the configuration of the video processor 1332 is arbitrary and may be other than the two examples described above.
- the video processor 1332 may be configured as one semiconductor chip or as a plurality of semiconductor chips. For example, a three-dimensional stacked LSI in which a plurality of semiconductors are stacked may be used. Further, it may be realized by a plurality of LSIs.
- Video set 1300 can be incorporated into various devices that process image data.
- the video set 1300 can be incorporated in the television device 1200 (FIG. 29), the mobile phone 1220 (FIG. 30), the recording / playback device 1240 (FIG. 31), the imaging device 1260 (FIG. 32), or the like.
- the device can obtain the same effects as those of the encoding device 10 and the decoding device 20 described above.
- a configuration that includes the video processor 1332 can be implemented as a configuration to which the present technology is applied.
- the video processor 1332 alone can be implemented as a video processor to which the present technology is applied.
- the processor indicated by the dotted line 1341, the video module 1311, or the like can be implemented as a processor, a module, or the like to which the present technology is applied.
- the video module 1311, the external memory 1312, the power management module 1313, and the front end module 1314 can be combined and implemented as a video unit 1361 to which the present technology is applied. In any case, the same effects as those of the encoding device 10 and the decoding device 20 described above can be obtained.
- any configuration including the video processor 1332 can be incorporated into various devices that process image data, as in the case of the video set 1300.
- the video processor 1332, the processor indicated by the dotted line 1341, the video module 1311, or the video unit 1361 can be incorporated in the television device 1200 (FIG. 29), the mobile phone 1220 (FIG. 30), the recording / playback device 1240 (FIG. 31), the imaging device 1260 (FIG. 32), or the like.
- the apparatus can obtain the same effects as those of the encoding apparatus 10 and the decoding apparatus 20 described above, as in the case of the video set 1300.
- the system means a set of a plurality of components (devices, modules (parts), etc.), and it does not matter whether all the components are in the same housing. Accordingly, a plurality of devices housed in separate housings and connected via a network, and a single device housing a plurality of modules in one housing, are both systems.
- the configuration described as one device (or processing unit) may be divided and configured as a plurality of devices (or processing units).
- the configurations described above as a plurality of devices (or processing units) may be combined into a single device (or processing unit).
- a configuration other than those described above may be added to the configuration of each device (or each processing unit).
- a part of the configuration of a certain device (or processing unit) may be included in the configuration of another device (or another processing unit).
- the present technology can take a configuration of cloud computing in which one function is shared and processed by a plurality of devices via a network.
- the above-described program can be executed in an arbitrary device.
- the device may have necessary functions (functional blocks and the like) so that necessary information can be obtained.
- each step described in the above flowchart can be executed by one device or can be executed by a plurality of devices. Further, when a plurality of processes are included in one step, the plurality of processes included in the one step can be executed by being shared by a plurality of apparatuses in addition to being executed by one apparatus.
- the program executed by the computer may execute the processing of the steps describing the program in time series in the order described in this specification, or may execute it in parallel or individually at a necessary timing, such as when a call is made. That is, as long as no contradiction occurs, the processing of each step may be executed in an order different from the order described above. Furthermore, the processing of the steps describing this program may be executed in parallel with the processing of other programs, or may be executed in combination with the processing of other programs.
- the present technology can take the following configurations.
- <1> An encoding device including: an encoding unit that encodes an input image using a lossy encoding method; a database in which a plurality of texture components are registered; and a transmission unit that transmits identification information identifying a match component, which is the texture component that matches the input image among the plurality of texture components registered in the database, together with encoded data obtained by encoding the input image.
- <2> The encoding device according to <1>, wherein bases of the texture components, obtained by converting the texture components into bases, are registered in the database.
- <3> The encoding device according to <2>, further including: a separation unit that separates a low-frequency component of the input image from the input image; a basis synthesis unit that, for each of the plurality of texture components registered in the database, generates a restored component in which the texture component of the input image is restored by basis synthesis using the low-frequency component of the input image and the basis of the texture component; and a match component determination unit that determines, as the match component, the restored component having the smallest error with respect to the input image among the restored components generated for the respective texture components registered in the database.
- <4> The encoding device according to <2>, further including: a decoding unit that decodes the encoded data into a decoded image; a basis synthesis unit that, for each of the plurality of texture components registered in the database, generates a restored component in which the texture component of the input image is restored by basis synthesis using the decoded image and the basis of the texture component; and a match component determination unit that determines, as the match component, the restored component having the smallest error with respect to the input image among the restored components generated for the respective texture components registered in the database.
- <5> The encoding device according to any one of <2> to <4>, further including a data transmission unit that transmits data of the database in response to a request from a decoding device that decodes the encoded data.
- <6> The encoding device according to any one of <2> to <4>, further including an update unit that acquires data from a server and updates the database.
- <7> The encoding device according to any one of <2> to <6>, further including a registration unit that registers the basis of the texture component in the database.
- <8> The encoding device according to <7>, wherein the registration unit provisionally registers the basis of the texture component of the input image in the database as the basis of a new texture component, and fully registers the basis of the new texture component in the database when a predetermined condition is satisfied.
- <9> The encoding device according to <8>, wherein the registration unit fully registers the basis of the new texture component in the database, using as the predetermined condition that the S/N (signal-to-noise ratio), with respect to the input image, of the match component obtained when the basis of the new texture component is registered in the database is better by a certain value or more than the S/N, with respect to the input image, of the match component obtained when the basis of the new texture component is not registered in the database, or that the RD (rate-distortion) curve obtained when the basis of the new texture component is registered in the database is better by a certain value or more than the RD curve obtained when the basis of the new texture component is not registered in the database.
- <10> The encoding device according to <8> or <9>, wherein the registration unit provisionally registers the basis of the texture component of the input image in the database as the basis of the new texture component when the error, with respect to the input image, of the match component obtained when the basis of the new texture component is not registered in the database is greater than or equal to a threshold.
- <11> The encoding device according to any one of <1> to <10>, wherein the encoding unit encodes a difference between the input image and the match component.
- <12> An encoding method including: encoding an input image using a lossy encoding method; and transmitting identification information identifying a match component, which is the texture component that matches the input image among a plurality of texture components registered in a database, together with encoded data obtained by encoding the input image.
- <13> A decoding device including: a receiving unit that receives encoded data obtained by encoding an input image using a lossy encoding method and identification information identifying a match component that is a texture component matching the input image; a decoding unit that decodes the encoded data into a decoded image; a database in which a plurality of texture components are registered; and a combining unit that combines the decoded image with the texture component serving as the match component identified by the identification information among the plurality of texture components registered in the database.
- <14> The decoding device according to <13>, wherein bases of the texture components, obtained by converting the texture components into bases, are registered in the database.
- <15> The decoding device according to <14>, further including a basis synthesis unit that generates a restored component in which the match component is restored by basis synthesis using the decoded image, or a low-frequency component of the decoded image, and the basis of the match component, wherein the combining unit combines the restored component with the decoded image.
- <16> The decoding device according to <14> or <15>, further including an update unit that acquires data to be registered in the database by requesting it from an encoding device that encodes the input image, and updates the database.
- <17> The decoding device according to <14> or <15>, further including an update unit that acquires data from a server and updates the database.
- <18> A decoding method including: receiving encoded data obtained by encoding an input image using a lossy encoding method and identification information identifying a match component that is a texture component matching the input image; decoding the encoded data into a decoded image; and combining the decoded image with the texture component serving as the match component identified by the identification information among a plurality of texture components registered in a database.
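The match component determination on the encoding side and the combining step on the decoding side can be sketched together in Python. This is a minimal illustration under stated assumptions, not the method of the embodiment: the basis synthesis itself is not described in this section, so `toy_synthesize` is a hypothetical placeholder, images are flat lists of floats, the error is a sum of squared differences between the input image and the low-frequency component plus the restored component, and "combining" is assumed to be simple addition.

```python
from typing import Callable, Dict, List, Tuple

Signal = List[float]
Synthesize = Callable[[Signal, Signal], Signal]


def sum_squared_error(a: Signal, b: Signal) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def toy_synthesize(reference: Signal, basis: Signal) -> Signal:
    """Hypothetical stand-in for basis synthesis: scale the basis by the mean of the reference."""
    gain = sum(reference) / len(reference)
    return [gain * b for b in basis]


def choose_match(input_image: Signal, low_freq: Signal,
                 database: Dict[int, Signal],
                 synthesize: Synthesize = toy_synthesize) -> Tuple[int, float]:
    """Encoder side: return (identification info, error) of the best-matching entry."""
    if not database:
        raise ValueError("database is empty")
    best_id, best_error = -1, float("inf")
    for ident, basis in database.items():
        restored = synthesize(low_freq, basis)  # restored texture component
        candidate = [l + r for l, r in zip(low_freq, restored)]
        error = sum_squared_error(candidate, input_image)
        if error < best_error:
            best_id, best_error = ident, error
    return best_id, best_error


def combine(decoded_image: Signal, ident: int,
            database: Dict[int, Signal],
            synthesize: Synthesize = toy_synthesize) -> Signal:
    """Decoder side: restore the identified match component and add it to the decoded image."""
    restored = synthesize(decoded_image, database[ident])
    return [d + r for d, r in zip(decoded_image, restored)]
```

Only the identification information and the encoded data need to be transmitted, because the decoder holds the same database and can regenerate the texture component from the identifier.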
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
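In such hierarchical image encoding and hierarchical image decoding (scalable encoding and scalable decoding), the parameter that has the scalability function is arbitrary. For example, spatial resolution may be used as that parameter (spatial scalability). In the case of spatial scalability, the image resolution differs for each layer.
The video set 1300 can be incorporated into various devices that process image data. For example, the video set 1300 can be incorporated in the television device 1200 (FIG. 29), the mobile phone 1220 (FIG. 30), the recording / playback device 1240 (FIG. 31), the imaging device 1260 (FIG. 32), or the like. By incorporating the video set 1300, the device can obtain the same effects as those of the encoding device 10 and the decoding device 20 described above.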
入力画像を非可逆符号化方式で符号化する符号化部と、
複数のテクスチャ成分が登録されているデータベースと、
前記データベースに登録されている前記複数のテクスチャ成分のうちの、前記入力画像にマッチする前記テクスチャ成分であるマッチ成分を識別する識別情報と、前記入力画像を符号化した符号化データとを伝送する伝送部と
を備える符号化装置。
<2>
前記データベースには、前記テクスチャ成分を基底化して得られる、前記テクスチャ成分の基底が登録されている
<1>に記載の符号化装置。
<3>
前記入力画像から、前記入力画像の低域成分を分離する分離部と、
前記データベースに登録されている前記複数のテクスチャ成分それぞれについて、前記入力画像の低域成分と、前記テクスチャ成分の基底とを用いた基底合成により、前記入力画像のテクスチャ成分を復元した復元成分を生成する基底合成部と、
前記データベースに登録されている前記複数のテクスチャ成分それぞれについて生成された前記復元成分の中で、前記入力画像に対する誤差が最小の前記復元成分を、前記マッチ成分に決定するマッチ成分決定部と
をさらに備える<2>に記載の符号化装置。
<4>
前記符号化データを復号画像に復号する復号部と、
前記データベースに登録されている前記複数のテクスチャ成分それぞれについて、前記復号画像と、前記テクスチャ成分の基底とを用いた基底合成により、前記入力画像のテクスチャ成分を復元した復元成分を生成する基底合成部と、
前記データベースに登録されている前記複数のテクスチャ成分それぞれについて生成された前記復元成分の中で、前記入力画像に対する誤差が最小の前記復元成分を、前記マッチ成分に決定するマッチ成分決定部と
をさらに備える<2>に記載の符号化装置。
<5>
前記符号化データを復号する復号装置からの要求に応じて、前記データベースのデータを伝送するデータ伝送部をさらに備える
<2>ないし<4>のいずれかに記載の符号化装置。
<6>
サーバからデータを取得し、前記データベースを更新する更新部をさらに備える
<2>ないし<4>のいずれかに記載の符号化装置。
<7>
前記データベースに前記テクスチャ成分の基底を登録する登録部をさらに備える
<2>ないし<6>のいずれかに記載の符号化装置。
<8>
前記登録部は、
前記入力画像のテクスチャ成分の基底を、新テクスチャ成分の基底として、前記データベースに仮登録し、
所定の条件が満たされる場合に、前記新テクスチャ成分の基底を、前記データベースに本登録する
<7>に記載の符号化装置。
<9>
前記登録部は、
前記新テクスチャ成分の基底が前記データベースに登録されている場合の前記マッチ成分の、前記入力画像に対するS/N(Signal to Noise ratio)が、前記新テクスチャ成分の基底が前記データベースに登録されていない場合の前記マッチ成分の、前記入力画像に対するS/Nよりも一定値以上優れていること、
又は、前記新テクスチャ成分の基底が前記データベースに登録されている場合のRD(Rate-Distotion)曲線が、前記新テクスチャ成分の基底が前記データベースに登録されていない場合のRD曲線よりも一定値以上優れていること
を、前記所定の条件として、前記新テクスチャ成分の基底を、前記データベースに本登録する
<8>に記載の符号化装置。
<10>
前記登録部は、新テクスチャ成分の基底が前記データベースに登録されていない場合の前記マッチ成分の、前記入力画像に対する誤差が、閾値以上である場合に、前記入力画像のテクスチャ成分の基底を、前記新テクスチャ成分の基底として、前記データベースに仮登録する
<8>又は<9>に記載の符号化装置。
<11>
前記符号化部は、前記入力画像と前記マッチ成分との差分を符号化する
<1>ないし<10>のいずれかに記載の符号化装置。
<12>
入力画像を非可逆符号化方式で符号化することと、
複数のテクスチャ成分が登録されているデータベースに登録されている前記複数のテクスチャ成分のうちの、前記入力画像にマッチする前記テクスチャ成分であるマッチ成分を識別する識別情報と、前記入力画像を符号化した符号化データとを伝送することと
を含む符号化方法。
<13>
入力画像を非可逆符号化方式で符号化した符号化データと、前記入力画像にマッチするテクスチャ成分であるマッチ成分を識別する識別情報とを受け取る受け取り部と、
前記符号化データを、復号画像に復号する復号部と、
複数のテクスチャ成分が登録されているデータベースと、
前記データベースに登録されている前記複数のテクスチャ成分のうちの、前記識別情報により識別される前記マッチ成分としての前記テクスチャ成分と、前記復号画像とを合成する合成部と
を備える復号装置。
<14>
前記データベースには、前記テクスチャ成分を基底化して得られる、前記テクスチャ成分の基底が登録されている
<13>に記載の復号装置。
<15>
前記復号画像、又は、前記復号画像の低域成分と、前記マッチ成分の基底とを用いた基底合成により、前記マッチ成分を復元した復元成分を生成する基底合成部をさらに備え、
前記合成部は、前記復元成分と、前記復号画像とを合成する
<14>に記載の復号装置。
<16>
前記入力画像を符号化する符号化装置に、前記データベースに登録するデータを要求することにより取得し、前記データベースを更新する更新部をさらに備える
<14>又は<15>に記載の復号装置。
<17>
サーバからデータを取得し、前記データベースを更新する更新部をさらに備える
<14>又は<15>に記載の復号装置。
<18>
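A decoding method including:
receiving encoded data obtained by encoding an input image by a lossy encoding scheme, and identification information that identifies a match component, which is a texture component matching the input image;
decoding the encoded data into a decoded image; and
synthesizing the decoded image with the texture component that, among a plurality of texture components registered in a database, is the match component identified by the identification information.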
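The match component selection of <4> and the decoder-side synthesis of <13> and <15> can be illustrated with a minimal Python sketch. Everything named here is hypothetical: images are flattened lists of floats, `toy_basis_synthesis` is only a placeholder for the basis synthesis (the configurations above do not spell out that algorithm), the error is taken to be a sum of squared differences, and "synthesizing" is assumed to mean sample-wise addition.

```python
from typing import Callable, List, Sequence, Tuple

Block = List[float]  # toy stand-in for an image block (flattened pixel values)


def squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared differences between two equally sized blocks."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_match_component(
    input_image: Block,
    base_image: Block,
    database: Sequence[Block],
    basis_synthesis: Callable[[Block, Block], Block],
) -> Tuple[int, Block]:
    """Try every registered basis and keep the one with the smallest error.

    base_image is the decoded image (or a low-frequency component).
    Returns (identification info, restored component); the index into the
    database plays the role of the identification information.
    """
    best_id, best_restored, best_error = -1, [], float("inf")
    for component_id, basis in enumerate(database):
        restored = basis_synthesis(base_image, basis)
        # Assumption: error is measured after adding the restored texture
        # back onto the base image and comparing with the input image.
        candidate = [p + t for p, t in zip(base_image, restored)]
        error = squared_error(input_image, candidate)
        if error < best_error:
            best_id, best_restored, best_error = component_id, restored, error
    return best_id, best_restored


def synthesize(decoded_image: Block, restored: Block) -> Block:
    """Decoder side: combine the decoded image with the restored texture."""
    return [p + t for p, t in zip(decoded_image, restored)]


def toy_basis_synthesis(base_image: Block, basis: Block) -> Block:
    """Placeholder only: treats the stored basis as the texture itself."""
    return list(basis)


# Both sides are assumed to hold the same database contents.
database = [[0.0, 0.0, 0.0, 0.0], [-1.0, 1.0, -1.0, 1.0], [1.0, -1.0, 1.0, -1.0]]
input_image = [10.0, 12.0, 10.0, 12.0]
decoded_image = [11.0, 11.0, 11.0, 11.0]  # lossy coding flattened the texture

match_id, _ = select_match_component(
    input_image, decoded_image, database, toy_basis_synthesis
)
restored = toy_basis_synthesis(decoded_image, database[match_id])
print(match_id, synthesize(decoded_image, restored))
```

Only `match_id` has to travel with the encoded data in this sketch; the decoder regenerates the texture from its own copy of the database, which is why both sides need matching database contents.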
Claims (18)
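- An encoding apparatus including:
  an encoding unit that encodes an input image by a lossy encoding scheme;
  a database in which a plurality of texture components are registered; and
  a transmission unit that transmits identification information that identifies a match component, which is the texture component matching the input image among the plurality of texture components registered in the database, together with encoded data obtained by encoding the input image.
- The encoding apparatus according to claim 1, in which a basis of the texture component, obtained by converting the texture component into a basis, is registered in the database.
- The encoding apparatus according to claim 2, further including:
  a separation unit that separates, from the input image, a low-frequency component of the input image;
  a basis synthesis unit that, for each of the plurality of texture components registered in the database, generates a restored component in which the texture component of the input image is restored, by basis synthesis using the low-frequency component of the input image and the basis of the texture component; and
  a match component determination unit that determines, as the match component, the restored component having the smallest error with respect to the input image among the restored components generated for the respective texture components registered in the database.
- The encoding apparatus according to claim 2, further including:
  a decoding unit that decodes the encoded data into a decoded image;
  a basis synthesis unit that, for each of the plurality of texture components registered in the database, generates a restored component in which the texture component of the input image is restored, by basis synthesis using the decoded image and the basis of the texture component; and
  a match component determination unit that determines, as the match component, the restored component having the smallest error with respect to the input image among the restored components generated for the respective texture components registered in the database.
- The encoding apparatus according to claim 2, further including a data transmission unit that transmits data of the database in response to a request from a decoding apparatus that decodes the encoded data.
- The encoding apparatus according to claim 2, further including an update unit that acquires data from a server and updates the database.
- The encoding apparatus according to claim 2, further including a registration unit that registers the basis of the texture component in the database.
- The encoding apparatus according to claim 7, in which the registration unit
  provisionally registers a basis of the texture component of the input image in the database as a basis of a new texture component, and
  formally registers the basis of the new texture component in the database when a predetermined condition is satisfied.
- The encoding apparatus according to claim 8, in which the registration unit formally registers the basis of the new texture component in the database, using either of the following as the predetermined condition:
  the S/N (signal-to-noise ratio), with respect to the input image, of the match component obtained when the basis of the new texture component is registered in the database is better, by a fixed value or more, than the S/N, with respect to the input image, of the match component obtained when the basis of the new texture component is not registered in the database; or
  the RD (rate-distortion) curve obtained when the basis of the new texture component is registered in the database is better, by a fixed value or more, than the RD curve obtained when the basis of the new texture component is not registered in the database.
- The encoding apparatus according to claim 8, in which the registration unit provisionally registers the basis of the texture component of the input image in the database as the basis of the new texture component when the error, with respect to the input image, of the match component obtained when the basis of the new texture component is not registered in the database is equal to or greater than a threshold.
- The encoding apparatus according to claim 1, in which the encoding unit encodes a difference between the input image and the match component.
- An encoding method including:
  encoding an input image by a lossy encoding scheme; and
  transmitting identification information that identifies a match component, which is the texture component matching the input image among a plurality of texture components registered in a database, together with encoded data obtained by encoding the input image.
- A decoding apparatus including:
  a receiving unit that receives encoded data obtained by encoding an input image by a lossy encoding scheme, and identification information that identifies a match component, which is a texture component matching the input image;
  a decoding unit that decodes the encoded data into a decoded image;
  a database in which a plurality of texture components are registered; and
  a synthesis unit that synthesizes the decoded image with the texture component that, among the plurality of texture components registered in the database, is the match component identified by the identification information.
- The decoding apparatus according to claim 13, in which a basis of the texture component, obtained by converting the texture component into a basis, is registered in the database.
- The decoding apparatus according to claim 14, further including a basis synthesis unit that generates a restored component in which the match component is restored, by basis synthesis using the basis of the match component and either the decoded image or a low-frequency component of the decoded image,
  in which the synthesis unit synthesizes the restored component and the decoded image.
- The decoding apparatus according to claim 14, further including an update unit that acquires data to be registered in the database by requesting it from an encoding apparatus that encodes the input image, and updates the database.
- The decoding apparatus according to claim 14, further including an update unit that acquires data from a server and updates the database.
- A decoding method including:
  receiving encoded data obtained by encoding an input image by a lossy encoding scheme, and identification information that identifies a match component, which is a texture component matching the input image;
  decoding the encoded data into a decoded image; and
  synthesizing the decoded image with the texture component that, among a plurality of texture components registered in a database, is the match component identified by the identification information.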
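The two-stage registration in claims 8 to 10 (provisional registration when the current best match is poor, formal registration once the new basis improves S/N by a fixed margin) might look roughly like the following Python sketch. The class and function names, the dB-based S/N formula, and the choice to drop a provisional basis that fails the test are all assumptions made for illustration; the claims state only the conditions. The RD-curve alternative of claim 9 is not modeled.

```python
import math
from typing import List, Sequence

Basis = List[float]  # toy stand-in for a texture basis


def snr_db(reference: Sequence[float], approximation: Sequence[float]) -> float:
    """S/N in dB of an approximation relative to a reference signal."""
    signal = sum(x * x for x in reference)
    noise = sum((x - y) ** 2 for x, y in zip(reference, approximation))
    if noise == 0.0:
        return math.inf   # perfect reconstruction
    if signal == 0.0:
        return -math.inf  # no signal energy to compare against
    return 10.0 * math.log10(signal / noise)


class TextureDatabase:
    """Holds formally registered bases and provisional candidates."""

    def __init__(self) -> None:
        self.formal: List[Basis] = []
        self.provisional: List[Basis] = []

    def maybe_register_provisionally(
        self, new_basis: Basis, match_error: float, error_threshold: float
    ) -> bool:
        """Claim 10: register provisionally when the existing best match
        (found without the new basis) has error >= threshold."""
        if match_error >= error_threshold:
            self.provisional.append(new_basis)
            return True
        return False

    def settle(
        self,
        new_basis: Basis,
        snr_with_new: float,
        snr_without_new: float,
        min_gain_db: float,
    ) -> bool:
        """Claim 9 (S/N branch): promote to formal registration when S/N
        improves by at least min_gain_db. Dropping the basis otherwise is
        an assumption of this sketch. Raises ValueError if new_basis was
        never provisionally registered."""
        self.provisional.remove(new_basis)
        if snr_with_new - snr_without_new >= min_gain_db:
            self.formal.append(new_basis)
            return True
        return False


db = TextureDatabase()
basis = [-1.0, 1.0]
db.maybe_register_provisionally(basis, match_error=4.0, error_threshold=2.0)

reference = [10.0, 0.0]
snr_without = snr_db(reference, [9.0, 0.0])  # match found without the new basis
snr_with = snr_db(reference, [9.9, 0.0])     # match found with the new basis
promoted = db.settle(basis, snr_with, snr_without, min_gain_db=3.0)
print(promoted, len(db.formal), len(db.provisional))
```

Separating the cheap error test from the S/N comparison keeps rarely useful bases from being formally registered, at the cost of evaluating the match component twice, once with and once without the candidate.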
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP17785821.4A EP3448023A4 (en) | 2016-04-22 | 2017-04-07 | ENCODING DEVICE AND ENCODING METHOD, AND DECODING DEVICE AND DECODING METHOD |
JP2018513110A JP6883219B2 (ja) | 2016-04-22 | 2017-04-07 | 符号化装置及び符号化方法、並びに、システム |
US16/094,084 US10715804B2 (en) | 2016-04-22 | 2017-04-07 | Encoding apparatus and encoding method as well as decoding apparatus and decoding method |
US16/880,650 US11259018B2 (en) | 2016-04-22 | 2020-05-21 | Encoding apparatus and encoding method as well as decoding apparatus and decoding method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016-086215 | 2016-04-22 | ||
JP2016086215 | 2016-04-22 |
Related Child Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/094,084 A-371-Of-International US10715804B2 (en) | 2016-04-22 | 2017-04-07 | Encoding apparatus and encoding method as well as decoding apparatus and decoding method |
US16/880,650 Continuation US11259018B2 (en) | 2016-04-22 | 2020-05-21 | Encoding apparatus and encoding method as well as decoding apparatus and decoding method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017183479A1 (ja) | 2017-10-26 |
Family
ID=60116811
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2017/014454 WO2017183479A1 (ja) | 2016-04-22 | 2017-04-07 | 符号化装置及び符号化方法、並びに、復号装置及び復号方法 |
Country Status (4)
Country | Link |
---|---|
US (2) | US10715804B2 (ja) |
EP (1) | EP3448023A4 (ja) |
JP (1) | JP6883219B2 (ja) |
WO (1) | WO2017183479A1 (ja) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019208258A1 (ja) * | 2018-04-26 | 2019-10-31 | ソニー株式会社 | 符号化装置、符号化方法、復号装置、及び、復号方法 |
WO2019225344A1 (ja) * | 2018-05-21 | 2019-11-28 | 日本電信電話株式会社 | 符号化装置、画像補間システム及び符号化プログラム |
WO2019225337A1 (ja) * | 2018-05-21 | 2019-11-28 | 日本電信電話株式会社 | 符号化装置、復号装置、符号化方法、復号方法、符号化プログラム及び復号プログラム |
JPWO2021140652A1 (ja) * | 2020-01-10 | 2021-07-15 |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6883219B2 (ja) * | 2016-04-22 | 2021-06-09 | ソニーグループ株式会社 | 符号化装置及び符号化方法、並びに、システム |
US11412260B2 (en) * | 2018-10-29 | 2022-08-09 | Google Llc | Geometric transforms for image compression |
US10839565B1 (en) | 2019-08-19 | 2020-11-17 | Samsung Electronics Co., Ltd. | Decoding apparatus and operating method of the same, and artificial intelligence (AI) up-scaling apparatus and operating method of the same |
CN113663328B (zh) * | 2021-08-25 | 2023-09-19 | 腾讯科技(深圳)有限公司 | 画面录制方法、装置、计算机设备及存储介质 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH07250247A (ja) * | 1994-03-12 | 1995-09-26 | Victor Co Of Japan Ltd | 多次元画像圧縮伸張方法 |
WO2011090798A1 (en) | 2010-01-22 | 2011-07-28 | Thomson Licensing | Data pruning for video compression using example-based super-resolution |
JP2013524597A (ja) * | 2010-04-02 | 2013-06-17 | トムソン ライセンシング | 画像シーケンスのブロックを符号化する方法および再構成する方法 |
JP2014503885A (ja) * | 2010-11-29 | 2014-02-13 | トムソン ライセンシング | 画像の自己相似テクスチャ領域の再構成方法及び装置 |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1127562A (zh) * | 1994-04-22 | 1996-07-24 | 索尼公司 | 视频信号编码方法及设备和视频信号译码设备 |
JPH08116448A (ja) * | 1994-10-13 | 1996-05-07 | Fuji Xerox Co Ltd | 画像信号の符号化装置及び復号装置 |
DE10310023A1 (de) | 2003-02-28 | 2004-09-16 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Verfahren und Anordnung zur Videocodierung, wobei die Videocodierung Texturanalyse und Textursynthese umfasst, sowie ein entsprechendes Computerprogramm und ein entsprechendes computerlesbares Speichermedium |
US8345053B2 (en) * | 2006-09-21 | 2013-01-01 | Qualcomm Incorporated | Graphics processors with parallel scheduling and execution of threads |
EP1926321A1 (en) | 2006-11-27 | 2008-05-28 | Matsushita Electric Industrial Co., Ltd. | Hybrid texture representation |
EP2118852B1 (en) | 2007-03-07 | 2011-11-30 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Concept for synthesizing texture in a video sequence |
US20090067495A1 (en) * | 2007-09-11 | 2009-03-12 | The Hong Kong University Of Science And Technology | Rate distortion optimization for inter mode generation for error resilient video coding |
JP4524711B2 (ja) * | 2008-08-04 | 2010-08-18 | ソニー株式会社 | ビデオ信号処理装置、ビデオ信号処理方法、プログラム |
KR101498206B1 (ko) * | 2008-09-30 | 2015-03-06 | 삼성전자주식회사 | 고해상도 영상 획득 장치 및 그 방법 |
KR101797673B1 (ko) * | 2010-07-13 | 2017-12-13 | 삼성전자주식회사 | 대역별 공간적 변조를 통한 영상 질감 향상 방법 및 그 장치 |
TWI511547B (zh) | 2012-04-10 | 2015-12-01 | Acer Inc | 利用旋轉操作輔助視訊壓縮的方法及其影像擷取裝置 |
JP6883219B2 (ja) * | 2016-04-22 | 2021-06-09 | ソニーグループ株式会社 | 符号化装置及び符号化方法、並びに、システム |
2017
- 2017-04-07 JP JP2018513110A patent/JP6883219B2/ja active Active
- 2017-04-07 EP EP17785821.4A patent/EP3448023A4/en not_active Withdrawn
- 2017-04-07 WO PCT/JP2017/014454 patent/WO2017183479A1/ja active Application Filing
- 2017-04-07 US US16/094,084 patent/US10715804B2/en active Active
2020
- 2020-05-21 US US16/880,650 patent/US11259018B2/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH07250247A (ja) * | 1994-03-12 | 1995-09-26 | Victor Co Of Japan Ltd | 多次元画像圧縮伸張方法 |
WO2011090798A1 (en) | 2010-01-22 | 2011-07-28 | Thomson Licensing | Data pruning for video compression using example-based super-resolution |
JP2013518464A (ja) * | 2010-01-22 | 2013-05-20 | トムソン ライセンシング | Example−based超解像を用いたビデオ圧縮のためのデータプルーニング |
JP2013524597A (ja) * | 2010-04-02 | 2013-06-17 | トムソン ライセンシング | 画像シーケンスのブロックを符号化する方法および再構成する方法 |
JP2014503885A (ja) * | 2010-11-29 | 2014-02-13 | トムソン ライセンシング | 画像の自己相似テクスチャ領域の再構成方法及び装置 |
Non-Patent Citations (4)
Title |
---|
A. DUMITRAS; B. HASKELL: "An Encoder-Decoder Texture Replacement Method with Application to Content-Based Movie Coding", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, vol. 14, no. 6, June 2004 (2004-06-01), XP001196870, DOI: doi:10.1109/TCSVT.2004.828336 |
ADRIANA DUMITRAS ET AL.: "AN ENCODER-DECODER TEXTURE REPLACEMENT METHOD WITH APPLICATION TO CONTENT-BASED MOVIE CODING", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, vol. 14, no. 6, 1 June 2004 (2004-06-01), pages 825 - 840, XP001196870 * |
JIANCHAO YANG; J. WRIGHT; T. S. HUANG; YI MA: "Image Super-Resolution via Sparse Representation", IEEE TRANSACTIONS ON IMAGE PROCESSING, vol. 19, no. 11, 2010, pages 2861 - 2873, XP011328631, DOI: doi:10.1109/TIP.2010.2050625 |
See also references of EP3448023A4 |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019208258A1 (ja) * | 2018-04-26 | 2019-10-31 | ソニー株式会社 | 符号化装置、符号化方法、復号装置、及び、復号方法 |
WO2019225344A1 (ja) * | 2018-05-21 | 2019-11-28 | 日本電信電話株式会社 | 符号化装置、画像補間システム及び符号化プログラム |
JP2019205010A (ja) * | 2018-05-21 | 2019-11-28 | 日本電信電話株式会社 | 符号化装置、画像補間システム及び符号化プログラム |
WO2019225337A1 (ja) * | 2018-05-21 | 2019-11-28 | 日本電信電話株式会社 | 符号化装置、復号装置、符号化方法、復号方法、符号化プログラム及び復号プログラム |
US11350134B2 (en) | 2018-05-21 | 2022-05-31 | Nippon Telegraph And Telephone Corporation | Encoding apparatus, image interpolating apparatus and encoding program |
JPWO2021140652A1 (ja) * | 2020-01-10 | 2021-07-15 | ||
JP7388453B2 (ja) | 2020-01-10 | 2023-11-29 | 日本電信電話株式会社 | データ処理装置、データ処理方法、及びデータ処理プログラム |
US12032425B2 (en) | 2020-01-10 | 2024-07-09 | Nippon Telegraph And Telephone Corporation | Data processing apparatus, data processing method, and data processing program |
Also Published As
Publication number | Publication date |
---|---|
EP3448023A1 (en) | 2019-02-27 |
JPWO2017183479A1 (ja) | 2019-02-28 |
EP3448023A4 (en) | 2019-02-27 |
JP6883219B2 (ja) | 2021-06-09 |
US20200288128A1 (en) | 2020-09-10 |
US11259018B2 (en) | 2022-02-22 |
US20190132589A1 (en) | 2019-05-02 |
US10715804B2 (en) | 2020-07-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6883219B2 (ja) | 符号化装置及び符号化方法、並びに、システム | |
US10779009B2 (en) | Image decoding device and method | |
WO2018037737A1 (ja) | 画像処理装置、画像処理方法、及びプログラム | |
US20190215534A1 (en) | Image processing apparatus and image processing method | |
JP6977719B2 (ja) | 符号化装置及び符号化方法、並びに、復号装置及び復号方法 | |
WO2018070267A1 (ja) | 画像処理装置および画像処理方法 | |
WO2017191749A1 (ja) | 画像処理装置及び画像処理方法 | |
WO2018131515A1 (ja) | 画像処理装置及び画像処理方法 | |
WO2019039283A1 (ja) | 画像処理装置及び画像処理方法 | |
US10595021B2 (en) | Image processing device and method | |
US10298927B2 (en) | Image decoding device and method | |
US20200288123A1 (en) | Image processing apparatus and image processing method | |
WO2018131524A1 (ja) | 画像処理装置及び画像処理方法 | |
WO2018173873A1 (ja) | 符号化装置及び符号化方法、並びに、復号装置及び復号方法 | |
WO2018168484A1 (ja) | 符号化装置、符号化方法、復号装置、及び、復号方法 | |
WO2018047480A1 (ja) | 画像処理装置、画像処理方法、及びプログラム | |
CN111133757B (zh) | 编码设备、编码方法、解码设备和解码方法 | |
WO2017169722A1 (ja) | 画像処理装置および方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| WWE | WIPO information: entry into national phase | Ref document number: 2018513110; Country of ref document: JP |
| NENP | Non-entry into the national phase | Ref country code: DE |
| WWE | WIPO information: entry into national phase | Ref document number: 2017785821; Country of ref document: EP |
| ENP | Entry into the national phase | Ref document number: 2017785821; Country of ref document: EP; Effective date: 20181122 |
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 17785821; Country of ref document: EP; Kind code of ref document: A1 |