JP2022129735A

JP2022129735A - Image encoding device, image encoding method, image decoding device, and image decoding method

Info

Publication number: JP2022129735A
Application number: JP2021028534A
Authority: JP
Inventors: 昂深堀; Akira Fukabori; ケビン梶谷; Kevin Kajiya; 雅博筒; Masahiro Tsutsu; チャリスラサンサフェルナンド; Lasantha Fernando Charith; 哲孫; Ze Son; 信吉澤; Makoto Yoshizawa; 隆士道川; Takashi Michikawa; 秀夫横田; Hideo Yokota; 茂穂野田; Shigeo Noda
Original assignee: RIKEN Institute of Physical and Chemical Research; Avatarin Inc
Current assignee: RIKEN Institute of Physical and Chemical Research; Avatarin Inc
Priority date: 2021-02-25
Filing date: 2021-02-25
Publication date: 2022-09-06

Abstract

PROBLEM TO BE SOLVED: To provide an image encoding device capable of compressing an input image without degrading its image quality. SOLUTION: An image encoding device 10 includes: a saliency map generation unit 12 that generates a saliency map of an input image from the input image; a gradient intensity map generation unit 13 that generates a gradient intensity map of the input image from the input image; an importance map generation unit 14 that generates an importance map of the input image from the saliency map and the gradient intensity map; a binarization unit 15 that binarizes the importance map, by dithering, into a first grayscale value and a second grayscale value lighter than the first grayscale value; a color information extraction unit 16 that extracts, from the input image, pixel information including position information and color information of each pixel, among the plurality of pixels of the input image, whose value in the importance map is the first grayscale value; and an encoding unit 17 that encodes the pixel information. SELECTED DRAWING: Figure 1

Description

The present invention relates to an image encoding device, an image encoding method, an image decoding device, and an image decoding method.

As a compression encoding technology for moving images, the MPEG standard (H.265/HEVC) is known, for example, as a standard by ISO/IEC (International Organization for Standardization/International Electrotechnical Commission). HEVC defines, among other things, encoding schemes for 4K (3840×2160 pixels) images and 8K (7680×4320 pixels) images.

Demand for playback of high-definition video such as 8K images is expected to grow, and there is a need for technology that compresses large-volume images without degrading their quality and transmits them with low latency.

Accordingly, an object of the present invention is to propose an image encoding device and an image encoding method capable of compressing an input image without degrading its image quality.

To solve the above problem, an image encoding device according to the present invention includes: a saliency map generation unit that generates a saliency map of an input image from the input image; a gradient intensity map generation unit that generates a gradient intensity map of the input image from the input image; an importance map generation unit that generates an importance map of the input image from the saliency map and the gradient intensity map; a binarization unit that binarizes the importance map, by dithering, into a first grayscale value and a second grayscale value lighter than the first grayscale value; a color information extraction unit that extracts, from the input image, pixel information including position information and color information of each pixel, among the plurality of pixels of the input image, whose value in the importance map is the first grayscale value; and an encoding unit that encodes the pixel information. Because the device encodes this pixel information rather than the input image itself, it can compress the image efficiently while retaining the important information of the input image (for example, information on the portions that are highly salient to human vision).

An image decoding device according to the present invention restores the input image from the pixel information encoded by the image encoding device according to the present invention, and includes: a decoding unit that decodes the encoded pixel information; and an image restoration unit that restores the input image from the decoded pixel information by an image interpolation method. Because the pixel information retains the information needed to restore the input image with accuracy sufficient for practical use, a restored image can be generated from the pixel information by image interpolation without degrading the image quality.

As the image interpolation method, for example, an image interpolation method based on a low-dimensional manifold model can be used. With this image interpolation method, the input image can be restored accurately from a small amount of information (pixel information that includes neither the position information nor the color information of pixels whose value in the importance map is the second grayscale value).

An image encoding method according to the present invention includes the steps of: generating a saliency map of an input image from the input image; generating a gradient intensity map of the input image from the input image; generating an importance map of the input image from the saliency map and the gradient intensity map; binarizing the importance map, by dithering, into a first grayscale value and a second grayscale value lighter than the first grayscale value; extracting, from the input image, pixel information including position information and color information of each pixel, among the plurality of pixels of the input image, whose value in the importance map is the first grayscale value; and encoding the pixel information. Because the method encodes this pixel information rather than the input image itself, it can compress the image efficiently while retaining the important information of the input image (for example, information on the portions that are highly salient to human vision).

An image decoding method according to the present invention restores the input image from the pixel information encoded by the image encoding method according to the present invention, and includes the steps of: decoding the encoded pixel information; and restoring the input image from the decoded pixel information by an image interpolation method. Because the pixel information retains the information needed to restore the input image with accuracy sufficient for practical use, a restored image can be generated from the pixel information by image interpolation without degrading the image quality.

According to the present invention, an input image can be compressed efficiently without degrading its image quality.

FIG. 1 is an explanatory diagram showing the hardware configuration of an image encoding device according to an embodiment of the present invention.
FIG. 2 is a functional block diagram of the image encoding device according to the embodiment of the present invention.
FIG. 3 is an explanatory diagram showing the hardware configuration of an image decoding device according to the embodiment of the present invention.
FIG. 4 is a functional block diagram of the image decoding device according to the embodiment of the present invention.
FIG. 5 is a diagram showing an example of an input image input to the image input unit according to the embodiment of the present invention.
FIG. 6 is a diagram showing an example of a saliency map generated by the saliency map generation unit according to the embodiment of the present invention.
FIG. 7 is a diagram showing an example of a gradient intensity map generated by the gradient intensity map generation unit according to the embodiment of the present invention.
FIG. 8 is a diagram showing an example of an importance map generated by the importance map generation unit according to the embodiment of the present invention.
FIG. 9 is a diagram showing an example of an importance map binarized by the binarization unit according to the embodiment of the present invention.
FIG. 10 is a diagram showing an example of pixel information decoded by the decoding unit according to the embodiment of the present invention.
FIG. 11 is a diagram showing an example of an input image restored by the image restoration unit according to the embodiment of the present invention.

Embodiments of the present invention are described below with reference to the drawings. The same reference numerals denote the same components, and duplicate descriptions are omitted.

FIG. 1 is an explanatory diagram showing the hardware configuration of an image encoding device 10 according to an embodiment of the present invention. The image encoding device 10 includes a processor 101, a memory 102, a storage device 103, a camera 104, and a communication device 105. The storage device 103 stores software resources such as an operating system 106 and an image processing program 107. These software resources are loaded into the memory 102 and executed by the processor 101.

FIG. 2 is a functional block diagram of the image encoding device 10 according to the embodiment of the present invention. Hardware resources such as the processor 101, the memory 102, the storage device 103, the camera 104, and the communication device 105 cooperate with software resources such as the operating system 106 and the image processing program 107 to realize the functions of an image input unit 11, a saliency map generation unit 12, a gradient intensity map generation unit 13, an importance map generation unit 14, a binarization unit 15, a color information extraction unit 16, an encoding unit 17, and a communication unit 18.

The image input unit 11 takes in the input images that make up a moving image captured by the camera 104.

The saliency map generation unit 12 generates a saliency map of the input image from the input image. The saliency map indicates how conspicuous each region of the input image is, that is, the spatial distribution of saliency in human vision. The saliency map can be generated by calculating, for points and regions in the input image, a saliency value that quantifies the degree to which a person momentarily pays attention to them, that is, the level of saliency. For example, retinal ganglion cells in the retina of the eye have a region called a receptive field, and when the receptive field is stimulated by light, that information is transmitted to the brain. The receptive field consists of two parts: a central circular portion and its surrounding region. Exploiting this mechanism, a model that quantifies the locations where stimulation of the central circular portion and its surrounding region strengthens the signal (locations that attract attention) can be used as the saliency map. Specifically, a technique is known that generates a saliency map by creating a pyramid of images from the input image and extracting features with a Gaussian filter. A technique is also known that generates a saliency map by analyzing how the output of a convolutional neural network changes in response to perturbations of the input.
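The center-surround idea described above can be illustrated with a minimal sketch. It is not the method of the embodiment: it assumes a single-scale difference between a small and a large box filter as a stand-in for the image pyramid and Gaussian filter mentioned in the text, and the function names and radii are hypothetical.

```python
def box_blur(image, radius):
    """Mean filter over a (2*radius+1)^2 window, clipped at the image border."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total, count = 0.0, 0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < rows and 0 <= x < cols:
                        total += image[y][x]
                        count += 1
            out[i][j] = total / count
    return out


def center_surround_saliency(image, center_radius=1, surround_radius=3):
    """Assumed stand-in for a saliency model: |center mean - surround mean|.

    The result is scaled to [0, 1] by its maximum; a flat image yields zeros.
    """
    center = box_blur(image, center_radius)
    surround = box_blur(image, surround_radius)
    diff = [[abs(c - s) for c, s in zip(row_c, row_s)]
            for row_c, row_s in zip(center, surround)]
    peak = max(max(row) for row in diff)
    if peak == 0:
        return diff
    return [[v / peak for v in row] for row in diff]
```

A pixel that differs from its wider neighborhood receives a high value, while uniform regions receive values near zero, which mirrors the receptive-field behavior described in the text.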

The gradient intensity map generation unit 13 generates a gradient intensity map of the input image from the input image. The gradient intensity map indicates the spatial distribution of the gradient intensity value (that is, the difference in luminance between pixels) of each pixel of the input image. For example, let the x direction and the y direction be two mutually orthogonal directions, let Ex be the gradient intensity value of a pixel in the x direction, and let Ey be its gradient intensity value in the y direction; the gradient intensity value E of the pixel is then calculated as the square root of the sum of the squares of Ex and Ey. The gradient intensity map indicates the spatial distribution of the edges of the input image.
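The gradient intensity calculation can be sketched as follows. The sketch assumes a luminance image stored as a list of rows, central differences for Ex and Ey, and border pixels clamped to the nearest valid neighbor; the text specifies neither the difference operator nor the border handling, and the function name is hypothetical.

```python
import math


def gradient_intensity_map(image):
    """Return E(i, j) = sqrt(Ex^2 + Ey^2) for a 2-D list of luminance values.

    Assumption: Ex and Ey are central differences, with indices clamped
    at the image border. The source does not fix either choice.
    """
    rows, cols = len(image), len(image[0])
    result = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            left = image[i][max(j - 1, 0)]
            right = image[i][min(j + 1, cols - 1)]
            up = image[max(i - 1, 0)][j]
            down = image[min(i + 1, rows - 1)][j]
            ex = (right - left) / 2.0   # horizontal luminance difference
            ey = (down - up) / 2.0      # vertical luminance difference
            result[i][j] = math.hypot(ex, ey)
    return result
```

For an image with a vertical step edge, the values are non-zero only in the columns adjacent to the step, which is the sense in which the map shows the spatial distribution of edges.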

The importance map generation unit 14 generates an importance map of the input image from the saliency map and the gradient intensity map. The importance map indicates the spatial distribution of the importance value of each pixel of the input image. The importance map generation unit 14 generates the importance map by combining the saliency map, which indicates the spatial distribution of saliency in the input image, with the gradient intensity map, which indicates the spatial distribution of edges in the input image. The importance map generated in this way indicates the spatial distribution of the importance value of each pixel such that the highly salient portions and the edge portions of the input image are treated as visually more important than the other portions (portions of low saliency or portions that are not edges).

For example, suppose that the input image is represented as a matrix I of pixel values with l rows and w columns, the saliency map of the input image as a matrix S of saliency values with l rows and w columns, the gradient intensity map of the input image as a matrix G of gradient intensity values with l rows and w columns, and the importance map of the input image as a matrix IM of importance values with l rows and w columns. The element I(i, j) in row i and column j of the matrix I indicates the value of the pixel in row i and column j. The element S(i, j) of the matrix S indicates the saliency value of that pixel, the element G(i, j) of the matrix G indicates its gradient intensity value, and the element IM(i, j) of the matrix IM indicates its importance value. Here, 1 ≤ i ≤ l and 1 ≤ j ≤ w.

The importance map generation unit 14 may calculate IM(i, j) using, for example, any one of equations (1) to (3) below.

IM(i, j) = {S(i, j) + G(i, j)}^{n} / 2^{n} ... (1)

IM(i, j) = {S(i, j)}^{n} (when S(i, j) > G(i, j))
IM(i, j) = {G(i, j)}^{n} (when G(i, j) > S(i, j)) ... (2)

IM(i, j) = α{S(i, j)}^{n1} + (1 - α){G(i, j)}^{n2} ... (3)

Here, the relationships 0 < n < 1, 0 < n1 < 1, 0 < n2 < 1, and 0 < α < 1 are satisfied.
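Equations (1) to (3) can be written out per pixel as in the following sketch. It assumes that the saliency and gradient intensity values have already been scaled to the range 0 to 1, which the text does not state; the function names and the default exponent are hypothetical. Equation (2) does not define the case S(i, j) = G(i, j); the sketch uses the larger of the two values, which gives the same result from either branch when they are equal.

```python
def importance_eq1(s, g, n):
    """Equation (1): {S + G}^n / 2^n."""
    return ((s + g) ** n) / (2 ** n)


def importance_eq2(s, g, n):
    """Equation (2): S^n if S > G, G^n if G > S.

    The tie S == G is not covered by the source; max() returns the common
    value in that case (an assumption of this sketch).
    """
    return max(s, g) ** n


def importance_eq3(s, g, alpha, n1, n2):
    """Equation (3): alpha * S^n1 + (1 - alpha) * G^n2."""
    return alpha * (s ** n1) + (1 - alpha) * (g ** n2)


def importance_map(S, G, n=0.5):
    """Apply equation (1) element-wise to two equally sized 2-D lists."""
    return [[importance_eq1(s, g, n) for s, g in zip(row_s, row_g)]
            for row_s, row_g in zip(S, G)]
```

Because every exponent is between 0 and 1, small saliency or gradient values are raised relative to large ones, so moderately important pixels are not suppressed as strongly as they would be by a linear combination.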

The importance map generation unit 14 may normalize the importance value of each pixel.

The binarization unit 15 binarizes the importance map, by dithering, into a first grayscale value and a second grayscale value lighter than the first grayscale value. The first grayscale value is, for example, the grayscale value "1" indicating "black", and the second grayscale value is, for example, the grayscale value "0" indicating "white". For this binarization, for example, an error diffusion method can be applied in which the error produced by binarizing each pixel is added to the surrounding pixels with weights that depend on the distance between the pixels.
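The error diffusion step can be sketched as follows. The sketch assumes importance values in the range 0 to 1, a threshold of 0.5, and the Floyd-Steinberg weights (7/16, 3/16, 5/16, 1/16) as one common choice of distance-dependent weights; the text does not name a specific kernel, and the function name is hypothetical.

```python
def binarize_by_error_diffusion(importance, threshold=0.5):
    """Binarize a 2-D list of values in [0, 1] into 1 ("black") and 0 ("white").

    Assumption: Floyd-Steinberg weights are used to spread each pixel's
    quantization error to its unprocessed neighbors.
    """
    rows, cols = len(importance), len(importance[0])
    work = [[float(v) for v in row] for row in importance]
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            old = work[i][j]
            new = 1 if old >= threshold else 0
            out[i][j] = new
            err = old - new
            if j + 1 < cols:
                work[i][j + 1] += err * 7 / 16          # right
            if i + 1 < rows:
                if j > 0:
                    work[i + 1][j - 1] += err * 3 / 16  # lower left
                work[i + 1][j] += err * 5 / 16          # below
                if j + 1 < cols:
                    work[i + 1][j + 1] += err * 1 / 16  # lower right
    return out
```

With this scheme the local density of pixels set to 1 roughly follows the importance value, so regions of high importance keep more sample pixels than regions of low importance.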

The color information extraction unit 16 extracts, from the input image, pixel information including the position information and color information of each pixel, among the plurality of pixels of the input image, whose value in the importance map is the first grayscale value. For example, if the position information of a pixel whose value in the importance map is the first grayscale value is (X, Y) and its color information is (R, G, B), the pixel information can be expressed as (X, Y, R, G, B). Here, R, G, and B indicate the red, green, and blue color information, respectively. For example, if the value in the importance map of the pixel in row 1 and column 2 of the input image is the first grayscale value and the color information of that pixel is (20, 0, 10), the pixel information can be expressed as (001, 002, 020, 000, 010). The pixel information includes neither the position information nor the color information of pixels whose value in the importance map is the second grayscale value.
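The extraction of (X, Y, R, G, B) records can be sketched as follows. The sketch assumes 1-based positions with X as the row and Y as the column, following the example in the text, and a fixed field width of three digits as in that example; the function names are hypothetical, and the width is left as a parameter because three digits cannot index images wider or taller than 999 pixels.

```python
def extract_pixel_info(image_rgb, binary_map, first_value=1):
    """Collect (X, Y, R, G, B) for every pixel whose map value is first_value.

    image_rgb: 2-D list of (R, G, B) tuples; binary_map: 2-D list of 0/1.
    Pixels with the second grayscale value contribute no record at all.
    """
    records = []
    for x, (img_row, map_row) in enumerate(zip(image_rgb, binary_map), start=1):
        for y, (rgb, flag) in enumerate(zip(img_row, map_row), start=1):
            if flag == first_value:
                r, g, b = rgb
                records.append((x, y, r, g, b))
    return records


def format_record(record, width=3):
    """Render one record with zero-padded fields (width is an assumption)."""
    return ",".join(str(v).zfill(width) for v in record)
```

For the example in the text, `format_record((1, 2, 20, 0, 10))` yields `001,002,020,000,010`.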

The encoding unit 17 encodes the pixel information output from the color information extraction unit 16. As the encoding scheme for the pixel information, for example, Huffman coding can be used. Because the pixel information including the position information and color information of each pixel whose value in the importance map is the first grayscale value is encoded rather than the input image itself, the image can be compressed efficiently while retaining the important information of the input image (for example, information on the portions that are highly salient to human vision).
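Huffman coding of the pixel information can be sketched as follows. The sketch assumes that the pixel information has been serialized into a sequence of symbols (here, the characters of the formatted record) and that the code table is shared with the decoder; the text does not describe the symbol alphabet or how the table is conveyed, and the function names are hypothetical.

```python
import heapq
from collections import Counter


def build_huffman_code(symbols):
    """Return {symbol: bit string} built from the symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, k, {sym: ""}) for k, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_encode(symbols, code):
    """Concatenate the code words of the symbols into one bit string."""
    return "".join(code[s] for s in symbols)


def huffman_decode(bits, code):
    """Invert huffman_encode using the same (prefix-free) code table."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out
```

Frequent symbols, such as the zero padding in the formatted records, receive short code words, which is where the reduction in size comes from in this sketch.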

The communication unit 18 transmits the pixel information encoded by the encoding unit 17 to the outside through a communication network. The communication network may be a wireless network, a wired network, or a network in which wireless and wired networks are mixed.

FIG. 3 is an explanatory diagram showing the hardware configuration of an image decoding device 20 according to the embodiment of the present invention. The image decoding device 20 includes a processor 201, a memory 202, a storage device 203, an image display device 204, and a communication device 205. The storage device 203 stores software resources such as an operating system 206 and an image processing program 207. These software resources are loaded into the memory 202 and executed by the processor 201.

FIG. 4 is a functional block diagram of the image decoding device 20 according to the embodiment of the present invention. Hardware resources such as the processor 201, the memory 202, the storage device 203, the image display device 204, and the communication device 205 cooperate with software resources such as the operating system 206 and the image processing program 207 to realize the functions of a communication unit 21, a decoding unit 22, an image restoration unit 23, and an image output unit 24.

The communication unit 21 receives, through the communication network, the encoded pixel information transmitted from the image encoding device 10.

The decoding unit 22 decodes the encoded pixel information. As the decoding scheme for the encoded pixel information, for example, Huffman decoding can be used.

The image restoration unit 23 restores the input image from the pixel information decoded by the decoding unit 22 by an image interpolation method. The pixel information includes the position information and color information of each pixel whose value in the importance map is the first grayscale value, but includes neither the position information nor the color information of pixels whose value in the importance map is the second grayscale value. For pixels whose value in the importance map is the second grayscale value, the image restoration unit 23 performs the interpolation processing treating their color information as zero. Because the pixel information retains the information needed to restore the input image with accuracy sufficient for practical use, a restored image can be generated from the pixel information by image interpolation without degrading the image quality.
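The restoration step can be illustrated with a deliberately simple stand-in. The sketch below fills the missing pixels by inverse-distance weighting of the decoded pixels; this is not one of the interpolation methods named in the text, it is used here only because it is short, and the function name, the record layout (1-based row X, column Y), and the power parameter are assumptions.

```python
def restore_image(records, rows, cols, power=2.0):
    """Fill a rows x cols RGB image from sparse (X, Y, R, G, B) records.

    Stand-in technique: inverse-distance weighting over all known pixels.
    Known pixels are copied unchanged; every other pixel is a weighted mean.
    """
    known = {(x - 1, y - 1): (r, g, b) for x, y, r, g, b in records}
    if not known:
        raise ValueError("no pixel information to interpolate from")
    image = [[(0, 0, 0)] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if (i, j) in known:
                image[i][j] = known[(i, j)]
                continue
            weight_sum = 0.0
            acc = [0.0, 0.0, 0.0]
            for (ki, kj), color in known.items():
                dist_sq = (i - ki) ** 2 + (j - kj) ** 2
                w = 1.0 / dist_sq ** (power / 2.0)
                weight_sum += w
                for c in range(3):
                    acc[c] += w * color[c]
            image[i][j] = tuple(round(a / weight_sum) for a in acc)
    return image
```

The cost grows with the product of the image size and the number of known pixels, so this form is only suitable for small examples; the methods listed in the text are the ones the embodiment refers to for accurate restoration.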

Examples of the image interpolation method include (1) a method based on tensor reconstruction, (2) a method based on a low-dimensional manifold model, (3) a method based on rank minimization of a structured matrix, (4) a method based on singular value decomposition, (5) a method based on deep learning, and (6) a method based on the discrete Fourier transform.

References that mention the method based on tensor reconstruction include, for example, (A) Zhao Q, Zhang L, Cichocki A. Bayesian CP factorization of incomplete tensors with automatic rank determination. IEEE transactions on pattern analysis and machine intelligence, 2015, 37(9): 1751-1763, and (B) Chen Y L, Hsu C T, Liao H Y M. Simultaneous tensor decomposition and completion using factor priors. IEEE transactions on pattern analysis and machine intelligence, 2013, 36(3): 577-591.

References that mention the method based on a low-dimensional manifold model include, for example, Yokota T, Hontani H, Zhao Q, et al. Manifold Modeling in Embedded Space: An Interpretable Alternative to Deep Image Prior. IEEE Transactions on Neural Networks and Learning Systems, 2020.

References that mention the method based on rank minimization of a structured matrix include, for example, Takahashi T, Konishi K, Furukawa T. Structured matrix rank minimization approach to image inpainting, IEEE 55th International Midwest Symposium on Circuits and Systems (MWSCAS). IEEE, 2012: 860-863.

References that mention the method based on singular value decomposition include, for example, Song L, Du B, Zhang L, et al. Nonlocal patch based T-SVD for image inpainting: Algorithm and error analysis, Proceedings of the AAAI Conference on Artificial Intelligence. 2018, 32(1).

References that mention the method based on deep learning include, for example, Ulyanov D, Vedaldi A, Lempitsky V. Deep image prior, Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 9446-9454.

References that mention the method based on the discrete Fourier transform include, for example, Sridevi G, Kumar S S. Image inpainting based on fractional-order nonlinear diffusion for image reconstruction. Circuits, Systems, and Signal Processing, 2019, 38(8): 3802-3817.

The image output unit 24 displays the restored image. The restored image that is displayed may be a moving image or a still image.

FIG. 5 shows an example of an input image input to the image input unit 11. FIG. 6 shows an example of a saliency map generated by the saliency map generation unit 12. FIG. 7 shows an example of a gradient intensity map generated by the gradient intensity map generation unit 13. FIG. 8 shows an example of an importance map generated by the importance map generation unit 14. FIG. 9 shows an example of an importance map binarized by the binarization unit 15. FIG. 10 shows an example of pixel information decoded by the decoding unit 22. FIG. 11 shows an example of an input image (restored image) restored by the image restoration unit 23.

According to the embodiment of the present invention, pixel information including the position information and color information of each pixel whose value in the importance map is the first grayscale value is encoded rather than the input image itself, so that the image can be compressed efficiently while retaining the important information of the input image (for example, information on the portions that are highly salient to human vision). In addition, because the pixel information retains the information needed to restore the input image with accuracy sufficient for practical use, a restored image can be generated from the pixel information by image interpolation without degrading the image quality. Furthermore, by using an image interpolation method based on a low-dimensional manifold model, the input image can be restored accurately from a small amount of information (pixel information that includes neither the position information nor the color information of pixels whose value in the importance map is the second grayscale value).

According to the embodiment of the present invention, because images can be compressed efficiently, moving images can be transmitted with low latency. The embodiment is suitable, for example, for moving image transmission in telemedicine systems, remote control systems for robots, teleconferencing systems, and virtual reality systems.

The embodiment described above is intended to facilitate understanding of the present invention and is not to be construed as limiting it. The present invention may be modified or improved without departing from its spirit, and the present invention also includes equivalents thereof. That is, embodiments to which those skilled in the art have made appropriate design changes are also included in the scope of the present invention as long as they have the features of the present invention. In addition, the elements of the embodiment can be combined to the extent technically possible, and such combinations are also included in the scope of the present invention as long as they include the features of the present invention.

Description of symbols
10: image encoding device
11: image input unit
12: saliency map generation unit
13: gradient intensity map generation unit
14: importance map generation unit
15: binarization unit
16: color information extraction unit
17: encoding unit
18: communication unit
20: image decoding device
21: communication unit
22: decoding unit
23: image restoration unit
24: image output unit

Claims

1. An image encoding device comprising:
a saliency map generation unit that generates a saliency map of an input image from the input image;
a gradient intensity map generation unit that generates a gradient intensity map of the input image from the input image;
an importance map generation unit that generates an importance map of the input image from the saliency map and the gradient intensity map;
a binarization unit that binarizes the importance map, by dithering, into a first grayscale value and a second grayscale value lighter than the first grayscale value;
a color information extraction unit that extracts, from the input image, pixel information including position information and color information of each pixel, among a plurality of pixels of the input image, whose value in the importance map is the first grayscale value; and
an encoding unit that encodes the pixel information.

2. An image decoding device that restores the input image from the pixel information encoded by the image encoding device according to claim 1, the image decoding device comprising:
a decoding unit that decodes the encoded pixel information; and
an image restoration unit that restores the input image from the decoded pixel information by an image interpolation method.

3. The image decoding device according to claim 2, wherein the image interpolation method is an image interpolation method based on a low-dimensional manifold model.

4. An image encoding method comprising:
generating a saliency map of an input image from the input image;
generating a gradient intensity map of the input image from the input image;
generating an importance map of the input image from the saliency map and the gradient intensity map;
binarizing the importance map, by dithering, into a first grayscale value and a second grayscale value lighter than the first grayscale value;
extracting, from the input image, pixel information including position information and color information of each pixel, among a plurality of pixels of the input image, whose value in the importance map is the first grayscale value; and
encoding the pixel information.

5. An image decoding method for restoring the input image from the pixel information encoded by the image encoding method according to claim 4, the image decoding method comprising:
decoding the encoded pixel information; and
restoring the input image from the decoded pixel information by an image interpolation method.

6. The image decoding method according to claim 5, wherein the image interpolation method is an image interpolation method based on a low-dimensional manifold model.